PHP - Processing Invalid XML
Posted by Paul on Stack Overflow
Published on 2010-05-22T23:16:30Z
Indexed on 2010/05/22 23:20 UTC
I'm using SimpleXML to load some XML files (which I didn't write or provide, and whose format I can't really change).
Occasionally (e.g. one or two files out of every 50 or so) they don't escape special characters (mostly &, but sometimes other random invalid things too). This creates an issue because SimpleXML in PHP just fails on them, and I don't know of any good way to handle parsing invalid XML.
My first idea was to preprocess the XML as a string and put ALL fields in as CDATA so it would work, but for some ungodly reason the XML I need to process puts all of its data in the attribute fields. Thus I can't use the CDATA idea. An example of the XML being:
<Author v="By Someone & Someone" />
What's the best way to preprocess this so that all the invalid characters in the XML are replaced before I load it with SimpleXML?
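For illustration, here is a minimal sketch of the kind of preprocessing I have in mind, covering only the most common case (unescaped ampersands). The function names `escapeBareAmpersands` and `loadLenientXml` are made up for this example. It assumes that any `&` not already starting something that looks like an entity reference should become `&amp;`; it does not repair other invalid characters, and an undefined named entity such as `&nbsp;` would still fail to parse.

```php
<?php
// Hypothetical helper: escape every "&" that does not already begin
// something shaped like an entity reference (&name;, &#123;, &#x1F;).
function escapeBareAmpersands(string $xml): string
{
    return preg_replace(
        '/&(?!(?:[A-Za-z_][A-Za-z0-9_.-]*|#[0-9]+|#x[0-9A-Fa-f]+);)/',
        '&amp;',
        $xml
    );
}

// Hypothetical helper: try a normal parse first, and only fall back to
// the repaired string if SimpleXML rejects the original input.
function loadLenientXml(string $xml)
{
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $doc = simplexml_load_string($xml);
    if ($doc === false) {
        libxml_clear_errors();
        $doc = simplexml_load_string(escapeBareAmpersands($xml));
    }

    // Restore the previous error-handling mode.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $doc; // SimpleXMLElement on success, false if still invalid
}

$doc = loadLenientXml('<Author v="By Someone & Someone" />');
if ($doc !== false) {
    echo $doc['v'], "\n";
}
```

Trying the unmodified string first means well-formed files are never touched by the regex; only the handful of broken ones take the repair path.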