Java HTML parser/validator
Posted on Stack Overflow
Published on 2010-12-24T01:40:28Z
We allow people to enter HTML on our wiki-like site, but only a limited subset of it, so that it cannot affect our styling or introduce malicious JavaScript. Is there a good server-side Java library for ensuring that the submitted HTML is valid?
We tried writing an XML Schema document to validate against. The only issue is that the libraries we used for validation returned cryptic error messages. What I want is for the validation library to actually fix the problem (for example, if a style="" attribute was added to an element, remove it). If fixing it is not easy, it should at least let me report a message to the user with the location of the error (an error code that I can map to a friendly message is fine, probably even preferable).
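To make the desired behavior concrete, here is a rough sketch of what I mean, using only the JDK (Java 16 or later, because of the records). It is not from any existing library: the class name HtmlSubsetFilter, the allowed tag list, and the error codes TAG_NOT_ALLOWED and ATTRIBUTES_REMOVED are all made up for illustration. It drops tags outside the allowed subset, strips every attribute (including style="") from the tags it keeps, and records an error code plus the character offset of each fix. The regex-based scanning is a simplification and should not be treated as a secure sanitizer; for example, a ">" inside a quoted attribute value ends the tag early.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative allow-list filter; every name in this class is hypothetical. */
final class HtmlSubsetFilter {

    /** One fix applied to the input: an error code, the character offset, and a detail string. */
    record Issue(String code, int offset, String detail) {}

    /** The cleaned markup together with the list of fixes that were applied. */
    record Result(String html, List<Issue> issues) {}

    // Assumed subset; replace with whatever the site actually permits.
    private static final Set<String> ALLOWED =
            Set.of("b", "i", "em", "strong", "p", "ul", "ol", "li", "br");

    // An opening or closing tag: optional slash, tag name, then everything up to the next '>'.
    private static final Pattern TAG =
            Pattern.compile("<(/?)([a-zA-Z][a-zA-Z0-9]*)([^>]*)>");

    static Result clean(String input) {
        List<Issue> issues = new ArrayList<>();
        StringBuilder out = new StringBuilder();
        Matcher m = TAG.matcher(input);
        int last = 0;
        while (m.find()) {
            // Text between tags is kept, with stray angle brackets escaped.
            out.append(escape(input.substring(last, m.start())));
            last = m.end();

            String name = m.group(2).toLowerCase(Locale.ROOT);
            if (!ALLOWED.contains(name)) {
                // Drop the tag itself; the text it wrapped is still emitted as plain text.
                issues.add(new Issue("TAG_NOT_ALLOWED", m.start(), name));
                continue;
            }

            String rest = m.group(3).trim();
            if (rest.endsWith("/")) {
                // Ignore the slash of a self-closing tag such as <br/>.
                rest = rest.substring(0, rest.length() - 1).trim();
            }
            if (!rest.isEmpty()) {
                // Anything left after the tag name is attribute text, which is removed.
                issues.add(new Issue("ATTRIBUTES_REMOVED", m.start(), rest));
            }
            out.append('<').append(m.group(1)).append(name).append('>');
        }
        out.append(escape(input.substring(last)));
        return new Result(out.toString(), issues);
    }

    // Escapes only angle brackets so that existing entities such as &amp; are left alone.
    private static String escape(String text) {
        return text.replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

Calling HtmlSubsetFilter.clean(userInput) would give back both the repaired HTML and a list of issues, so the site could either store the cleaned version silently or map each code and offset to a friendly message for the user.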
© Stack Overflow or respective owner