Diff 2 files while ignoring parts of lines

Posted by Millianz on Super User
Published on 2012-04-05T17:17:17Z Indexed on 2012/04/05 23:33 UTC

I would like to diff a file system. Currently my bash script prints out the file system recursively into a file (ls -l -R) and diffs it with an expected output.

An example of a line in this file would be: drw---- 100000f3 00000400 0 ./foo/

My current diff command is diff "$TEMP_LOG" "$DIFF_FILE_OUT" --strip-trailing-cr --changed-group-format='%>' --unchanged-group-format='' >> "$SubLog"

As you can see, I ignore additional lines in the current output file; I only care about lines that match the master output.

My problem now is that some files may differ in size, or a folder might even have a different name, but from its location I know what access rights it should have.

For example:

Output:

------- 00000000 00000000      528 ./foo/bar.txt

Master:

------- 00000000 00000000      200 ./foo/bar.txt

Only the size differs here, and that doesn't matter. I would like to ignore certain parts of a line in the diff, marked with something like an ANSI C comment.

Master:
------- 00000000 00000000      /*200*/ ./foo/bar.txt

-- OR --

Master:
d------ 00000000 00000000        /*10*/ ./foo//*123123*///*76456546*//bar.txt

Output:
d------ 00000000 00000000        0 ./foo/asd/sdf/bar.txt

And still have it diff correctly.
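One way to get this behaviour without plain diff might be to turn each master line into a regular expression and look for a matching line in the output. The following is only a sketch under several assumptions: the names check_against_master and to_regex are made up for illustration, each /*...*/ marker is taken to stand for one run of characters containing neither a space nor a slash, and any run of spaces is treated as equal to any other. In the spirit of the diff command above, it prints the master lines that have no counterpart in the output.

```shell
#!/bin/bash
# Sketch: print every master line that no line of the current output matches.
# A /*...*/ marker in the master means "anything here" (assumed syntax, taken from the examples).
check_against_master() {
    # $1 = current output listing, $2 = master listing containing /*...*/ markers
    awk '
    # Build an anchored regular expression from one master line.
    function to_regex(line,    out, i, n, c, stop) {
        out = ""; n = length(line); i = 1
        while (i <= n) {
            c = substr(line, i, 1)
            if (substr(line, i, 2) == "/*") {
                stop = index(substr(line, i + 2), "*/")
                if (stop > 0) {            # complete marker: match one field or path component
                    out = out "[^/ ]+"
                    i = i + 2 + stop + 1
                    continue
                }
            }
            if (c == " ") {                # a run of spaces matches any run of spaces
                while (substr(line, i, 1) == " ") i++
                out = out " +"
                continue
            }
            if (index(".[]\\^$*+?(){}|", c) > 0) out = out "\\" c   # escape regex metacharacters
            else out = out c
            i++
        }
        return "^" out "$"
    }
    { sub(/\r$/, "") }                     # tolerate CRLF line endings
    NF == 0 { next }                       # skip blank lines in both files
    FILENAME == ARGV[1] { current[++count] = $0; next }
    {
        re = to_regex($0); found = 0
        for (k = 1; k <= count; k++) if (current[k] ~ re) { found = 1; break }
        if (!found) print
    }' "$1" "$2"
}
```

Called as check_against_master "$TEMP_LOG" master.txt (master.txt being a placeholder name), it prints nothing when every master line is matched. Every master line is compared against every output line, so this is only suitable for listings of moderate size.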

Is this even possible with diff, or will I have to write a custom script for it? Since I'm fairly new to Cygwin, I might be using the completely wrong tool altogether. I'm happy for any suggestions.
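If only the size column has to be ignored, one option that stays with diff is to overwrite that column in both files before comparing them. This is a minimal sketch, not a definitive answer: mask_size and diff_ignoring_size are hypothetical helper names, the size is assumed to be the 4th whitespace-separated field as in the sample lines, and the group-format options are the GNU diff ones already used in the command above.

```shell
#!/bin/bash
# Sketch: blank out the size column, then run the same kind of diff as before.
mask_size() {
    # Assumes the size is the 4th whitespace-separated field, as in the sample lines;
    # other listing formats need a different field number.
    awk '{ sub(/\r$/, "") } NF >= 4 { $4 = "SIZE" } { print }' "$1"
}

diff_ignoring_size() {
    # $1 = current listing, $2 = master listing
    masked_current=$(mktemp) || return 2
    masked_master=$(mktemp) || { rm -f "$masked_current"; return 2; }
    mask_size "$1" > "$masked_current"
    mask_size "$2" > "$masked_master"
    diff --changed-group-format='%>' --unchanged-group-format='' \
        "$masked_current" "$masked_master"
    status=$?
    rm -f "$masked_current" "$masked_master"
    return "$status"
}
```

A call such as diff_ignoring_size "$TEMP_LOG" "$DIFF_FILE_OUT" >> "$SubLog" would then replace the original command. Assigning to a field makes awk rejoin the line with single spaces, so column padding no longer matters, but runs of spaces inside file names are collapsed as well.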

Update:

Taking a step back, here is the general task that I want to achieve. I want to write a script that checks the file system to see if the read/write permissions are set up correctly. The structure of the file system is under my control, so I don't have to worry about it changing too much. Sometimes folders or files might not be present, but if they are, their permissions must be checked.
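For a check like this, a listing with one line per entry (permissions followed by path) may be easier to compare than the output of ls -l -R. As a rough sketch, and assuming GNU find is available (its -printf action is not part of POSIX), such a snapshot could be produced as follows. The function name snapshot_permissions is made up for illustration, and the mode strings it prints have the full ls form (for example drwxr-xr-x), not the shortened form used in the listings of this post.

```shell
#!/bin/bash
# Sketch: list every entry below a directory as "<permissions> <path>".
# Relies on the -printf action of GNU find: %M is the ls-style mode string, %p the path.
snapshot_permissions() {
    # $1 = directory to scan (hypothetical helper, not part of the original script)
    ( cd "$1" && find . -printf '%M %p\n' | LC_ALL=C sort -k 2 )
}
```

Running snapshot_permissions /some/dir > snapshot.txt (both names are placeholders) yields paths relative to the scanned directory, sorted by path so that two snapshots line up.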

For example, assume that the following is a snapshot of the current file system structure:

drw ./foo
drw ./foo/bar
-rw ./foo/bar/bar.txt
drw ./foo/baz
-rw ./foo/baz/baz.txt

And this is what the expected file system structure might dictate; that is, if these folders or files are present, their permissions must match:

drw ./foo
drw ./foo/bar
-rw ./foo/bar/bar.txt
--- ./foo/bar/foobar.txt
drw ./foo/baz
-rw ./foo/baz/foobaz.txt

In this case the file system checks out OK, since all files that are present match their expected values. The situation becomes more complicated as soon as certain folders may have an arbitrary name, and only their location tells me what their permissions should be. Assume that the directory ./foo/bar in the example above is such a case: instead of bar the folder could have any name, but it must still match the expected permissions (drw in the listing above).

This seems like a very complicated situation, and I'm not even sure if I can solve it with bash scripting alone. I might have to write an actual application.
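The check described in this update can probably still be done from bash with a little awk. The sketch below makes a few assumptions that are not in the original setup: both files contain lines of the form "permissions path", paths contain no spaces, and a path component written as * in the expected listing stands for a folder or file with an arbitrary name. The function name check_permissions and the MISMATCH message are illustrative only. Entries that appear in the snapshot but not in the expected listing are ignored, and expected entries that are absent from the snapshot are not reported.

```shell
#!/bin/bash
# Sketch: verify that every entry present in a snapshot has the permissions the spec demands.
# Both files hold "<perms> <path>" lines; a path component written as * in the spec
# (an assumed convention) matches any single directory or file name.
check_permissions() {
    # $1 = spec file, $2 = snapshot file
    awk '
    function glob_to_regex(path,    out, i, c) {
        out = ""
        for (i = 1; i <= length(path); i++) {
            c = substr(path, i, 1)
            if (c == "*") out = out "[^/]+"                            # one path component
            else if (index(".[]\\^$+?(){}|", c) > 0) out = out "\\" c  # escape metacharacters
            else out = out c
        }
        return "^" out "$"
    }
    { sub(/\r$/, "") }
    NF < 2 { next }
    FILENAME == ARGV[1] {                  # spec: remember exact paths and wildcard patterns
        if (index($2, "*") > 0) { n++; pattern[n] = glob_to_regex($2); pattern_perm[n] = $1 }
        else { exact[$2] = $1 }
        next
    }
    {                                      # snapshot: exact entries win over wildcard entries
        want = ""
        if ($2 in exact) { want = exact[$2] }
        else {
            for (k = 1; k <= n; k++) if ($2 ~ pattern[k]) { want = pattern_perm[k]; break }
        }
        if (want != "" && want != $1) {
            print "MISMATCH " $2 ": expected " want ", found " $1
            bad = 1
        }
    }
    END { exit bad + 0 }
    ' "$1" "$2"
}
```

With an expected listing containing drw ./foo/* and -rw ./foo/*/bar.txt, a snapshot entry drw ./foo/anything passes, while -rw ./foo/anything is reported and the function returns a non-zero status. Exact paths take precedence over wildcard entries so that a specific rule can override a general one.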

© Super User or respective owner
