How to use bzdiff to find difference between 2 bzipped files with diff -I option?
Posted
by
englebip
on Server Fault
See other posts from Server Fault
or by englebip
Published on 2011-03-11T11:53:15Z
Indexed on
2011/03/12
0:12 UTC
Read the original article
Hit count: 604
I'm trying to do a diff on MySQL dumps (created with mysqldump and piped to bzip2), to see if there are changes between consecutive dumps. The followings are the tails of 2 dumps:
tmp1:
/*!40101 SET SQL_MODE=@OLD_SQL_MODE */;
/*!40014 SET FOREIGN_KEY_CHECKS=@OLD_FOREIGN_KEY_CHECKS */;
/*!40014 SET UNIQUE_CHECKS=@OLD_UNIQUE_CHECKS */;
/*!40101 SET CHARACTER_SET_CLIENT=@OLD_CHARACTER_SET_CLIENT */;
/*!40101 SET CHARACTER_SET_RESULTS=@OLD_CHARACTER_SET_RESULTS */;
/*!40101 SET COLLATION_CONNECTION=@OLD_COLLATION_CONNECTION */;
/*!40111 SET SQL_NOTES=@OLD_SQL_NOTES */;
-- Dump completed on 2011-03-11 1:06:50
tmp2:
/*!40101 SET SQL_MODE=@OLD_SQL_MODE */;
/*!40014 SET FOREIGN_KEY_CHECKS=@OLD_FOREIGN_KEY_CHECKS */;
/*!40014 SET UNIQUE_CHECKS=@OLD_UNIQUE_CHECKS */;
/*!40101 SET CHARACTER_SET_CLIENT=@OLD_CHARACTER_SET_CLIENT */;
/*!40101 SET CHARACTER_SET_RESULTS=@OLD_CHARACTER_SET_RESULTS */;
/*!40101 SET COLLATION_CONNECTION=@OLD_COLLATION_CONNECTION */;
/*!40111 SET SQL_NOTES=@OLD_SQL_NOTES */;
-- Dump completed on 2011-03-11 0:40:11
When I bzdiff their bzipped version:
$ bzdiff tmp?.bz2
10c10
< -- Dump completed on 2011-03-11 1:06:50
---
> -- Dump completed on 2011-03-11 0:40:11
According to the manual of bzdiff, any option passed on to bzdiff is passed on to diff. I therefore looked at the -I option that allows to define a regexp; lines matching it are ignored in the diff. When I then try:
$ bzdiff -I'Dump' tmp1.bz2 tmp2.bz2
I get an empty diff. I would like to match as much as possible of the "Dump completed" line, though, but when I then try:
$ bzdiff -I'Dump completed' tmp1.bz2 tmp2.bz2
diff: extra operand `/tmp/bzdiff.miCJEvX9E8'
diff: Try `diff --help' for more information.
Same thing happens for some variations:
$ bzdiff '-IDump completed' tmp1.bz2 tmp2.bz2
$ bzdiff '-I"Dump completed"' tmp1.bz2 tmp2.bz2
$ bzdiff -'"IDump completed"' tmp1.bz2 tmp2.bz2
If I diff the un-bzipped files there is no problem:
$diff -I'^[-][-] Dump completed on' tmp1 tmp2
gives also an empty diff.
bzdiff is a shell script usually placed in /bin/bzdiff. Essentially, it parses the command options and passes them on to diff as follows:
OPTIONS=
FILES=
for ARG
do
case "$ARG" in
-*) OPTIONS="$OPTIONS $ARG";;
*) if test -f "$ARG"; then
FILES="$FILES $ARG"
else
echo "${prog}: $ARG not found or not a regular file"
exit 1
fi ;;
esac
done
[...]
bzip2 -cdfq "$1" | $comp $OPTIONS - "$tmp"
I think the problem stems from escaping the spaces in the passing of $OPTIONS
to diff, but I couldn't figure out how to get it interpreted correctly.
Any ideas?
EDIT
@DerfK: Good point with the .
, I had forgotten about them... I tried the suggestion with the multiple level of quotes, but that is still not recognized:
$ bzdiff "-I'\"Dump.completed.on\"'" tmp1.bz2 tmp2.bz2
diff: extra operand `/tmp/bzdiff.Di7RtihGGL'
© Server Fault or respective owner