Fastest XML parser for small, simple documents in Java
- by Varkhan
I have to objectify very small, simple XML documents (under 1 KB, and almost SGML: no namespaces, plain UTF-8, and so on), read from a stream, in Java.
I am using JAXP to process the data from my stream into a Document object. I tried Xerces, but it is far too big and slow. I am now using dom4j, but I am still spending far too much time in org.dom4j.io.SAXReader.
Does anybody have a suggestion for a faster, more efficient implementation, keeping in mind that I have very tight CPU and memory constraints?
[Edit 1] Keep in mind that my documents are very small, so the overhead of starting the parser can be significant. For instance, I am spending as much time in org.xml.sax.helpers.XMLReaderFactory.createXMLReader as in org.dom4j.io.SAXReader.read
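To make the setup-versus-parse split concrete, here is a minimal sketch that times the two phases separately. It is only an illustration under assumptions: it uses the JDK's built-in JAXP DocumentBuilder rather than dom4j so that it stays self-contained, and the class name ParserSetupCost, the SAMPLE document and the run count are all hypothetical. The numbers it prints depend on the JVM and on warm-up, so it is not a rigorous benchmark.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

final class ParserSetupCost {
    // Hypothetical sample document standing in for the real stream content.
    static final String SAMPLE =
        "<order id=\"42\"><item>a</item><item>b</item></order>";

    /** Setup phase: factory lookup plus parser construction. */
    static DocumentBuilder newBuilder() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(false); // the documents use no namespaces
            return factory.newDocumentBuilder();
        } catch (Exception e) {
            throw new IllegalStateException("could not create parser", e);
        }
    }

    /** Parse phase: read one small UTF-8 document into a DOM tree. */
    static Document parse(DocumentBuilder builder, String xml) {
        try (InputStream in =
                 new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8))) {
            return builder.parse(in);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
    }

    public static void main(String[] args) {
        int runs = 1000; // arbitrary run count, chosen for illustration
        long setupNanos = 0;
        long parseNanos = 0;
        for (int i = 0; i < runs; i++) {
            long t0 = System.nanoTime();
            DocumentBuilder builder = newBuilder(); // paid once per document here
            long t1 = System.nanoTime();
            parse(builder, SAMPLE);
            long t2 = System.nanoTime();
            setupNanos += t1 - t0;
            parseNanos += t2 - t1;
        }
        System.out.printf("setup: %d us/doc, parse: %d us/doc%n",
            setupNanos / runs / 1000, parseNanos / runs / 1000);
    }
}
```

Running main prints the average time per document for each phase, which shows how much of the total goes to creating the parser rather than to reading the document.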
[Edit 2] The result has to be in DOM format, as I pass the document to decision tools that do arbitrary processing on it, such as switching code based on the value of arbitrary XPaths, but also extracting lists of values packed as children of a predefined node.
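As an illustration of the kind of downstream processing described, here is a minimal sketch using only the JDK's DOM and XPath APIs. Everything specific in it is hypothetical: the class name DomDecisionSketch, the sample document, the element names (request, type, ids, id) and the decide method are invented for the example and are not the actual decision tools.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class DomDecisionSketch {
    // Hypothetical document shape, invented for this example.
    static final String SAMPLE =
        "<request><type>batch</type><ids><id>1</id><id>2</id><id>3</id></ids></request>";

    /** Parse a small document into a W3C DOM tree. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
    }

    /** String value of the first node matching an arbitrary XPath ("" if none). */
    static String valueAt(Document doc, String xpathExpr) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(xpathExpr, doc);
        } catch (Exception e) {
            throw new IllegalStateException("bad XPath: " + xpathExpr, e);
        }
    }

    /** Text of every element child of the node selected by parentExpr. */
    static List<String> childValues(Document doc, String parentExpr) {
        List<String> values = new ArrayList<>();
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            Node parent = (Node) xpath.evaluate(parentExpr, doc, XPathConstants.NODE);
            if (parent == null) {
                return values; // predefined node is absent: empty list
            }
            NodeList children = parent.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    values.add(child.getTextContent());
                }
            }
            return values;
        } catch (Exception e) {
            throw new IllegalStateException("bad XPath: " + parentExpr, e);
        }
    }

    /** Switch on an XPath value, then unpack a list of child values. */
    static String decide(Document doc) {
        switch (valueAt(doc, "/request/type")) {
            case "batch":
                return "batch:" + childValues(doc, "/request/ids");
            default:
                return "single";
        }
    }

    public static void main(String[] args) {
        System.out.println(decide(parse(SAMPLE))); // prints "batch:[1, 2, 3]"
    }
}
```

Because the XPath expressions are arbitrary and the child lists are walked in full, this style of access needs the whole tree in memory, which is why a streaming-only result would not fit.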
[Edit 3] In any case I eventually need to load/parse the complete document, since all the information it contains is going to be used at some point.
(This question is related to, but different from, http://stackoverflow.com/questions/373833/best-xml-parser-for-java )