Retrieving well formed HTML using Jericho HTML parser in Java
- by Raj
Hello,
I've looked at jTidy for converting a snippet of malformed/real-world HTML into well-formed HTML/XHTML. However, there's a bug in the latest version that prevents me from using it. I'm looking at Jericho instead, since it has a lot of positive reviews around the net.
However, it's not immediately obvious to me how one would go about implementing a method like:
public String getValidHTML(String messedUpHTML)
For instance, if it was passed <div>bar, it would return <div>bar</div>
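To make the expected behaviour concrete, here is a rough sketch of the contract I have in mind. It does not use Jericho (or jTidy) at all: it is a naive, standard-library-only tag balancer, and the class name TagBalancer, the regular expression, and the short list of void elements are all my own assumptions for illustration. It only appends missing end tags at the end of the input, and it will get confused by comments, scripts, or a ">" inside an attribute value, which is exactly why I'd like a real parser to do this.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Jericho: illustrates the desired behaviour only.
final class TagBalancer {

    // Group 1: "/" for an end tag, group 2: tag name, group 3: attributes and an optional trailing "/".
    private static final Pattern TAG =
            Pattern.compile("<(/?)([A-Za-z][A-Za-z0-9]*)([^>]*)>");

    // Elements that never take an end tag (deliberately incomplete list).
    private static final Set<String> VOID_ELEMENTS =
            new HashSet<>(Arrays.asList("br", "hr", "img", "input", "meta", "link"));

    static String getValidHTML(String messedUpHTML) {
        Deque<String> open = new ArrayDeque<>(); // most recently opened tag is first
        Matcher m = TAG.matcher(messedUpHTML);
        while (m.find()) {
            String name = m.group(2).toLowerCase();
            boolean isEndTag = !m.group(1).isEmpty();
            boolean isSelfClosing = m.group(3).endsWith("/");
            if (isEndTag) {
                // Close the most recently opened element with this name, if any.
                open.removeFirstOccurrence(name);
            } else if (!isSelfClosing && !VOID_ELEMENTS.contains(name)) {
                open.push(name);
            }
        }
        // Append end tags for whatever is still open, innermost first.
        StringBuilder out = new StringBuilder(messedUpHTML);
        for (String name : open) {
            out.append("</").append(name).append('>');
        }
        return out.toString();
    }
}
```

With this sketch, TagBalancer.getValidHTML("<div>bar") gives <div>bar</div>, and input that is already balanced comes back unchanged. What I'm after is the Jericho equivalent that handles real-world markup properly.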
Any pointers would be helpful.
Thanks in advance!