Suggestions on how to build an HTML diff tool?
- by Danimal
In this post I asked if there were any tools that compare the structure (not actual content) of 2 HTML pages. I ask because I receive HTML templates from our designers, and frequently miss minor formatting changes in my implementation. I then waste a few hours of designer time sifting through my pages to find my mistakes.
The thread offered some good suggestions, but nothing quite fit the bill. "Fine, then," thought I, "I'll just crank one out myself. I'm a halfway-decent developer, right?"
Well, once I started to think about it, I couldn't quite figure out how to go about it. I can crank out a data-driven website easily enough, or do a CMS implementation, or throw documents in and out of BizTalk all day. But I can't begin to figure out how to compare HTML docs.
Well, sure, I have to read the DOM, and iterate through the nodes. I have to map the structure to some data structure (how??), and then compare them (how??). It's a development task like none I've ever attempted.
So now that I've identified a weakness in my knowledge, I'm even more challenged to figure this out. Any suggestions on how to get started?
Clarification: the actual content isn't what I want to compare -- the creative guys fill their pages with lorem ipsum, and I use real content. Instead, I want to compare structure:
<div class="foo">lorem ipsum</div>
is different than
<div class="foo"><p>lorem ipsum</p></div>
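To make the "read the DOM, map it to a data structure, compare" idea concrete, here is a rough sketch of one possible approach, not a finished tool. It assumes Python and only its standard library (html.parser and difflib). The names StructureExtractor, structure_of and structure_diff are made up for illustration, and keeping only the id and class attributes is an assumption about what counts as "structure". Each element becomes one indented line, all text is thrown away, and the two lists of lines are compared with an ordinary text diff.

```python
import difflib
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not change the depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class StructureExtractor(HTMLParser):
    """Hypothetical helper: records one indented line per element, no text."""

    def __init__(self):
        super().__init__()
        self.lines = []
        self.depth = 0

    def handle_starttag(self, tag, attrs):
        # Assumption: only id and class matter when comparing structure.
        kept = sorted((k, v or "") for k, v in attrs if k in ("id", "class"))
        attr_text = "".join(f' {k}="{v}"' for k, v in kept)
        self.lines.append("  " * self.depth + f"<{tag}{attr_text}>")
        if tag not in VOID_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.depth > 0:
            self.depth -= 1

    # handle_data is deliberately not overridden, so text content is ignored.


def structure_of(html):
    """Return the structural outline of an HTML string as a list of lines."""
    parser = StructureExtractor()
    parser.feed(html)
    parser.close()
    return parser.lines


def structure_diff(template_html, implementation_html):
    """Return unified-diff lines between the two structural outlines."""
    return list(difflib.unified_diff(
        structure_of(template_html), structure_of(implementation_html),
        fromfile="template", tofile="implementation", lineterm=""))


if __name__ == "__main__":
    template = '<div class="foo">lorem ipsum</div>'
    mine = '<div class="foo"><p>real content</p></div>'
    for line in structure_diff(template, mine):
        print(line)
```

Two caveats with this sketch: html.parser does not repair malformed markup, so an unclosed tag will skew the indentation from that point on, and a line-based diff reports a moved subtree as a delete plus an add rather than as a move. A proper tree diff would handle both better, at the cost of a lot more code.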