How to deal with unknown entity references?

Posted by Chris on Stack Overflow
Published on 2010-03-12T16:19:44Z Indexed on 2010/03/13 9:35 UTC

I'm parsing a lot of XML files that contain entity references which I don't know in advance (and I can't change that fact).

For example:

final String xml = "<tag>I'm content with &funny; &entity; &references;.</tag>";

When I try to parse this using the following code:

final DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
final DocumentBuilder db = dbf.newDocumentBuilder();
final InputSource is = new InputSource(new StringReader(xml));
final Document d = db.parse(is);

I get the following exception:

org.xml.sax.SAXParseException: The entity "funny" was referenced, but not declared.

What I want instead is for the parser to replace every entity that is not declared (that is, unknown to the parser) with an empty string "". Or, even better, is there a way to pass a map to the parser, like this:

Map<String,String> entityMapping = ...
entityMapping.put("funny","very");
entityMapping.put("entity","important");
entityMapping.put("references","stuff");

so that I could do the following:

final DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
final DocumentBuilder db = dbf.newDocumentBuilder();
final InputSource is = new InputSource(new StringReader(xml));

db.setEntityResolver(entityMapping);
final Document d = db.parse(is);

If I then obtained the text from the resulting document, I should receive:

I'm content with very important stuff.

Any suggestions? Of course, I would already be happy to just replace the unknown entities with empty strings.
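To make the desired behaviour concrete, here is a minimal sketch of one possible workaround, not a built-in parser feature: scan the input for named entity references and prepend a DOCTYPE whose internal subset declares each of them, using the mapped value or an empty string. The class name EntityDefaults, the helper withDeclaredEntities, and the placeholder DOCTYPE name "doc" are all hypothetical. It assumes the input has no XML declaration or DOCTYPE of its own (as in the example above), that the parser is non-validating, and that the mapped values are plain text without quotes or markup characters.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class EntityDefaults {
    // Matches named references such as &funny; (numeric ones like &#38; do not match).
    private static final Pattern ENTITY_REF =
            Pattern.compile("&([A-Za-z_][A-Za-z0-9_.-]*);");

    // The five entities every XML parser already knows; these are left alone.
    private static final Set<String> PREDEFINED =
            Set.of("amp", "lt", "gt", "quot", "apos");

    // Hypothetical helper: prepends a DOCTYPE declaring every referenced entity.
    // Assumes the input carries no XML declaration or DOCTYPE of its own.
    static String withDeclaredEntities(String xml, Map<String, String> mapping) {
        Set<String> names = new LinkedHashSet<>();
        Matcher m = ENTITY_REF.matcher(xml);
        while (m.find()) {
            if (!PREDEFINED.contains(m.group(1))) {
                names.add(m.group(1));
            }
        }
        StringBuilder doctype = new StringBuilder("<!DOCTYPE doc [");
        for (String name : names) {
            // Entities missing from the mapping fall back to an empty string.
            String value = mapping.getOrDefault(name, "");
            doctype.append("<!ENTITY ").append(name)
                   .append(" \"").append(value).append("\">");
        }
        return doctype.append("]>").append(xml).toString();
    }

    static Document parse(String xml, Map<String, String> mapping) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        String patched = withDeclaredEntities(xml, mapping);
        return db.parse(new InputSource(new StringReader(patched)));
    }

    public static void main(String[] args) throws Exception {
        String xml = "<tag>I'm content with &funny; &entity; &references;.</tag>";
        Map<String, String> entityMapping = new HashMap<>();
        entityMapping.put("funny", "very");
        entityMapping.put("entity", "important");
        entityMapping.put("references", "stuff");
        Document d = parse(xml, entityMapping);
        System.out.println(d.getDocumentElement().getTextContent());
    }
}
```

Passing an empty map to parse covers the simpler case, since every unknown entity is then declared as an empty string. Because the regular expression also matches text inside comments or CDATA sections, a few unused declarations may be emitted, which should be harmless.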

Thanks,

© Stack Overflow or respective owner
