Line-length-tolerant XML diff

Posted by Jon Skeet on Super User
Published on 2010-04-08T19:57:51Z Indexed on 2010/04/08 20:03 UTC

I've looked at the answers to this question, and unfortunately none of them has helped me so far.

Not to beat about the bush, the second edition of C# in Depth is now in copy edit. I want to be able to see what the copy editor's done really easily, so I can reject or accept his changes.

We're using a modified form of DocBook, but I'm happy enough looking at the raw XML source. All fine so far, except that when the copy editor makes a change, it can also change the line wrapping. So something that used to read:

<para>Foo bar baz
 second line</para>

now reads:

<para>Foo bar grontle
 baz second line</para>

Now the real change here is the insertion of "grontle". I don't care that "baz" has moved from the first line to the second line... but all the diff tools I've seen do.

I realise that one option would be to reformat the whole document (or possibly just whole paragraphs) into single lines... but that's then really hard to read, because diff tools don't wrap when they're displaying.
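To make the desired behaviour concrete, here is a minimal sketch of a word-level comparison, assuming Python and its standard difflib module. It is only an illustration of the idea, not an existing tool: the function name word_diff is hypothetical, and splitting on whitespace treats a tag glued to a word (such as "<para>Foo") as a single token, which a real XML-aware diff would handle more carefully.

```python
import difflib


def word_diff(old: str, new: str) -> list[tuple[str, str]]:
    """Compare two texts word by word, ignoring where the lines wrap."""
    # str.split() with no arguments splits on any run of whitespace,
    # so newlines and indentation never show up as differences.
    old_words = old.split()
    new_words = new.split()

    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)

    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            changes.append(("-", " ".join(old_words[i1:i2])))
        if tag in ("insert", "replace"):
            changes.append(("+", " ".join(new_words[j1:j2])))
    return changes


old = "<para>Foo bar baz\n second line</para>"
new = "<para>Foo bar grontle\n baz second line</para>"

# Only the inserted word is reported; the move of "baz" to the
# second line is not treated as a change.
print(word_diff(old, new))  # [('+', 'grontle')]
```

Because the comparison works on words rather than lines, the source files can keep their wrapping and stay readable; only the report of changes is word-based.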

I'm sure I can manage with the tools I've got, but if anyone knows of anything better, I'd be really glad to hear about it. I suspect my publishers would too :)

(I've included the Windows tag here because I'd really need it to be available on Windows. I'd like to hear about any non-Windows software too, but only in case I could help to build it on Windows :)

© Super User or respective owner
