Removing HTML from a Java String

Posted by Mason on Stack Overflow
Published on 2008-10-27T16:39:29Z, indexed on 2010/04/23 21:23 UTC

Is there a good way to remove HTML from a Java string? A simple regex like

 replaceAll("\\<.*?>","")

will work, but things like

&amp;

won't be converted correctly, and any non-HTML text that happens to sit between two angle brackets will be removed as well (i.e. whatever the .*? in the regex matches will disappear).
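
To make the two problems concrete, here is a small sketch that applies the regex from above to two sample inputs. The class name RegexStripDemo, the helper stripTags and the sample strings are made up for illustration; only the replaceAll call comes from the question.

```java
public class RegexStripDemo {

    // Hypothetical helper wrapping the regex from the question:
    // reluctantly match everything from a '<' up to the next '>'.
    static String stripTags(String input) {
        return input.replaceAll("\\<.*?>", "");
    }

    public static void main(String[] args) {
        // Tags are removed, but the entity is left as-is instead of becoming '&'.
        System.out.println(stripTags("<p>Fish &amp; chips</p>"));
        // prints: Fish &amp; chips

        // Plain text that merely contains '<' and '>' loses everything in between.
        System.out.println(stripTags("if a < b and c > d"));
        // prints: if a  d
    }
}
```

So the regex alone does not decode entities, and it cannot tell a real tag from ordinary text that contains angle brackets.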

