Repairing broken XML file - removing extra less-than/greater-than signs
- by peku
I have a large XML file that, somewhere in the middle, contains the following:
<ArticleName>Article 1 <START </ArticleName>
Obviously libxml and other XML libraries can't read this, because the stray less-than sign opens a new tag that is never closed. Is there a way to fix issues like this automatically (preferably in Ruby)? The solution should of course work for any field that has an error like this. Someone suggested that SAX parsing could do the trick, but I'm not sure how that would work.
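
One possible direction is a text-level pre-pass that runs before the file is handed to libxml. It is only a rough sketch, not a proper parser-based fix. The idea is to keep anything that looks like real markup and to escape every other less-than sign as `&lt;`. The names `MARKUP` and `escape_stray_lt` are made up for illustration, and the pattern is a heuristic. It assumes tags with simple quoted attributes and no DOCTYPE internal subset, and it will still treat text such as `<b>` as a tag if it happens to look like one. Stray greater-than signs are left alone, since a bare `>` is generally legal in XML character data.

```ruby
# Heuristic pattern for constructs that may legitimately start with "<".
# Anything else beginning with "<" is treated as stray text.
MARKUP = /
  <!--.*?-->              |   # comment
  <!\[CDATA\[.*?\]\]>     |   # CDATA section
  <\?.*?\?>               |   # processing instruction or XML declaration
  <![A-Za-z][^<>]*>       |   # simple DOCTYPE-style declaration (no internal subset)
  <\/?[A-Za-z_][\w:.\-]*(?:\s+[\w:.\-]+\s*=\s*(?:"[^"<]*"|'[^'<]*'))*\s*\/?>   # open, close, or empty tag
/mx

# Returns a copy of +xml+ in which every "<" that does not begin
# one of the constructs above is replaced by "&lt;".
def escape_stray_lt(xml)
  xml.gsub(/#{MARKUP}|</) { |m| m == "<" ? "&lt;" : m }
end

broken = "<ArticleName>Article 1 <START </ArticleName>"
puts escape_stray_lt(broken)
# prints: <ArticleName>Article 1 &lt;START </ArticleName>
```

For a very large file, the same function could be applied in chunks or line by line, as long as no tag, comment, or CDATA section is split across a chunk boundary. The repaired output should then be parsed normally to confirm it is well-formed.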