Trouble parsing quotes with SAX parser (javax.xml.parsers.SAXParser)
Posted by johnrock on Stack Overflow
Published on 2010-04-04T04:16:52Z
When I use a SAX parser, parsing goes wrong whenever there is a " character in the node content. How can I resolve this? Do I need to convert all " characters?
In other words, anytime I have a quote in a node:
<node>characters in node containing "quotes"</node>
That node gets butchered into multiple character arrays when the Handler is parsing it. Is this normal behaviour? Why should quotes cause such a problem?
Here is the code I am using:
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
...
// Request the XML document for this question from the gateway.
HttpGet httpget = new HttpGet(GATEWAY_URL + "/" + question.getId());
httpget.setHeader("User-Agent", PayloadService.userAgent);
httpget.setHeader("Content-Type", "application/xml");
HttpResponse response = PayloadService.getHttpclient().execute(httpget);
HttpEntity entity = response.getEntity();
if (entity != null)
{
    // Build a SAX parser and attach the custom content handler.
    SAXParserFactory spf = SAXParserFactory.newInstance();
    SAXParser sp = spf.newSAXParser();
    XMLReader xr = sp.getXMLReader();
    ConvoHandler convoHandler = new ConvoHandler();
    xr.setContentHandler(convoHandler);

    // Stream the response body through the parser, then release the connection.
    xr.parse(new InputSource(entity.getContent()));
    entity.consumeContent();
    messageList = convoHandler.getMessageList();
}
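ConvoHandler itself is not shown here, so the following is only a sketch under that assumption, not the actual handler. The SAX ContentHandler contract allows a parser to deliver one run of text through several characters() calls, and some parsers appear to split at entity references such as &quot;. A handler that treats each call as the complete node text would see the fragments described above. One common way to cope is to append every chunk to a buffer and read the buffer only in endElement(). The class name NodeTextHandler, the parse() helper, and the element name "node" are hypothetical and used for illustration only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler (not the ConvoHandler from the question):
// collects the full text of every <node> element.
public class NodeTextHandler extends DefaultHandler {
    private final List<String> nodeTexts = new ArrayList<>();
    private final StringBuilder buffer = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        // Start each <node> with an empty buffer.
        if ("node".equals(qName)) {
            buffer.setLength(0);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times for one element, so append
        // each chunk instead of treating it as the whole text.
        buffer.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Only here is the text of the element known to be complete.
        if ("node".equals(qName)) {
            nodeTexts.add(buffer.toString());
        }
    }

    public List<String> getNodeTexts() {
        return nodeTexts;
    }

    // Convenience helper for trying the handler on an in-memory string.
    public static List<String> parse(String xml) throws Exception {
        SAXParser sp = SAXParserFactory.newInstance().newSAXParser();
        NodeTextHandler handler = new NodeTextHandler();
        sp.parse(new InputSource(new StringReader(xml)), handler);
        return handler.getNodeTexts();
    }
}
```

With this approach it should not matter how many pieces the parser splits the text into, and the quotes would not need to be converted beforehand. The same buffering could be applied inside ConvoHandler if it currently reads the text directly in characters().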
© Stack Overflow or respective owner