Cleaning a string consisting of HTML/server-side tags in Java
- by Denzil
I have some text like this:
I've got a date with this fellow tomorrow. Well me and thousands of others. <br /><br /><img src="http://www.newwest.net/images/thumbnails_feature/barack_obama_westerners.jpg"><br /><br />Tomorrow morning I will be getting up at stupid o'clock and driving up to Manchester, NH to see Barak Obama speak. <br /><br />You all should come too!<br /><br /><a href="http://nh.barackobama.com/manchesterchange">RSVP for the event</a>
I would like to clean it to:
I've got a date with this fellow
tomorrow. Well me and thousands of
others
http://www.newwest.net/images/thumbnails_feature/barack_obama_westerners.jpg
Tomorrow morning I
will be getting up at stupid
o'clock and driving up to
Manchester, NH to see Barak Obama
speak.You all
should come too!
h**p://nh.barackobama.com/manchesterchange RSVP
for the event
I would like to write a Java program that does this. Any pointers or suggestions would be appreciated. The tags aren't limited to those in the post above; it is just an example.
Thanks!
PS: Replace the *'s with t's in the second hyperlink, as Stack Overflow doesn't allow me to post more than one link.
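
To make the desired transformation concrete, here is a rough sketch of one possible approach using only java.util.regex. The class name HtmlCleaner and the method clean are made up for illustration, and the rules are assumptions read off the example: keep the URL from an img src or an a href, turn br tags into line breaks, and drop every other tag. It does not decode entities such as &amp; and it will not cope with malformed markup, so a real HTML parser would probably be more robust for arbitrary input.

```java
import java.util.regex.Pattern;

// Hypothetical helper; the name and the cleaning rules are assumptions based on the example.
class HtmlCleaner {
    // <img ... src="URL" ...> : capture the URL
    private static final Pattern IMG = Pattern.compile(
            "<img\\b[^>]*?\\bsrc\\s*=\\s*[\"']([^\"']*)[\"'][^>]*>",
            Pattern.CASE_INSENSITIVE);
    // <a ... href="URL" ...> : capture the URL (the link text stays in place after it)
    private static final Pattern ANCHOR = Pattern.compile(
            "<a\\b[^>]*?\\bhref\\s*=\\s*[\"']([^\"']*)[\"'][^>]*>",
            Pattern.CASE_INSENSITIVE);
    // <br>, <br/>, <br />
    private static final Pattern BR = Pattern.compile("<br\\s*/?>", Pattern.CASE_INSENSITIVE);
    // Anything else that looks like a tag
    private static final Pattern ANY_TAG = Pattern.compile("<[^>]+>");

    static String clean(String html) {
        // Put each image URL on its own line
        String s = IMG.matcher(html).replaceAll("\n$1\n");
        // Replace the opening anchor tag with its URL followed by a space
        s = ANCHOR.matcher(s).replaceAll("$1 ");
        // Line breaks become newlines
        s = BR.matcher(s).replaceAll("\n");
        // Remove all remaining tags (closing </a>, unknown tags, and so on)
        s = ANY_TAG.matcher(s).replaceAll("");
        // Collapse runs of blank lines and the whitespace around them into one newline
        s = s.replaceAll("[ \\t]*\\n\\s*", "\n");
        return s.trim();
    }
}
```

Calling HtmlCleaner.clean(text) on the post above should give the text, the image URL, and the link URL followed by "RSVP for the event", each on its own line. The exact line breaks may differ slightly from the sample output I wrote by hand.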