Java remove HTML from String without regular expressions
- by behrk2
Hello,
I am trying to remove all HTML elements from a String. Unfortunately, I cannot use regular expressions because I am developing on the Blackberry platform and regular expressions are not yet supported.
Is there any other way that I can remove HTML from a string? I read somewhere that you can use a DOM Parser, but I couldn't find much on it.
Text with HTML:
<![CDATA[As a massive asteroid hurtles toward Earth, NASA head honcho Dan Truman (<a href="http://www.netflix.com/RoleDisplay/Billy_Bob_Thornton/20000303">Billy Bob Thornton</a>) hatches a plan to split the deadly rock in two before it annihilates the entire planet, calling on Harry Stamper (<a href="http://www.netflix.com/RoleDisplay/Bruce_Willis/99786">Bruce Willis</a>) -- the world's finest oil driller -- to head up the mission. With time rapidly running out, Stamper assembles a crack team and blasts off into space to attempt the treacherous task. <a href="http://www.netflix.com/RoleDisplay/Ben_Affleck/20000016">Ben Affleck</a> and <a href="http://www.netflix.com/RoleDisplay/Liv_Tyler/162745">Liv Tyler</a> co-star.]]>
Text without HTML:
As a massive asteroid hurtles toward Earth, NASA head honcho Dan Truman (Billy Bob Thornton) hatches a plan to split the deadly rock in two before it annihilates the entire planet, calling on Harry Stamper (Bruce Willis) -- the world's finest oil driller -- to head up the mission. With time rapidly running out, Stamper assembles a crack team and blasts off into space to attempt the treacherous task.Ben Affleck and Liv Tyler co-star.
Thanks!