Replacing characters in a non well-formed XML body
Posted by ryanprayogo on Stack Overflow
Published on 2010-06-09T18:13:29Z
Indexed on 2010/06/09 18:22 UTC
In some Java code that I'm working on, I sometimes have to deal with XML that is not well-formed (represented as a Java String), such as:
<root>
<foo>
bar & baz < quux
</foo>
</root>
Since this XML will eventually need to be unmarshalled (using JAXB), it will obviously throw an exception upon unmarshalling as is.
What's the best way to replace the & and the < with their character entities? For &, it's as easy as:
xml.replaceAll("&", "&amp;")
However, the < symbol is a bit trickier, since obviously I don't want to replace the < that opens an XML tag.
Other than scanning the string and manually replacing < in the XML body with &lt;, what other options can you suggest?
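One possible direction, as a minimal sketch rather than a robust solution, is a pair of regular expressions with negative lookaheads. The class name XmlBodyEscaper and the method escape are hypothetical names chosen for illustration. The sketch assumes that a < in body text is never immediately followed by a letter, underscore, /, ! or ?, so something like a<b would still be mistaken for a tag start. It also leaves anything that already looks like an entity (such as &amp;) untouched, unlike the plain replaceAll call above.

```java
import java.util.regex.Pattern;

public class XmlBodyEscaper {
    // '&' that does not already start an entity such as &amp; &#38; or &#x26;
    private static final Pattern BARE_AMP =
        Pattern.compile("&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)");

    // '<' that is not followed by a character that could begin a tag,
    // a closing tag, a comment/CDATA section, or a processing instruction.
    // This is a heuristic (an assumption of this sketch), not a real parser.
    private static final Pattern BARE_LT =
        Pattern.compile("<(?![A-Za-z_/!?])");

    public static String escape(String xml) {
        // Escape ampersands first so the '&' in "&lt;" is not escaped again.
        String result = BARE_AMP.matcher(xml).replaceAll("&amp;");
        return BARE_LT.matcher(result).replaceAll("&lt;");
    }

    public static void main(String[] args) {
        String xml = "<root><foo>bar & baz < quux</foo></root>";
        // prints <root><foo>bar &amp; baz &lt; quux</foo></root>
        System.out.println(escape(xml));
    }
}
```

The cleaned string could then be handed to the JAXB unmarshaller. Input that breaks the stated assumption would still need a manual scan or a lenient parser.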