How to subString a block of user generated HTML while preserving formatting?

Posted by Chad on Stack Overflow
Published on 2009-07-22T22:57:57Z Indexed on 2010/04/18 15:03 UTC

Filed under: c# | substring

I'd like to create the typical preview paragraph with a [read more] link. The problem is that the content I'd like to Substring() contains both text and HTML, written by a user in a WYSIWYG editor.

Of course, I check that the string is not null or empty and then Substring() it. The problem is that the cut can land in the middle of a tag or leave elements unclosed, which breaks the HTML and throws off the rendering of the entire site.

The WYSIWYG editor doesn't seem to produce well-formed HTML, and it often uses <br /> tags instead of <p></p>. Basically, I can't rely on the markup being well-formed.

My workaround was to strip out all of the HTML and take a substring of the leftover text. This works, but it loses all of the formatting that was in the HTML.
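For reference, the strip-then-truncate workaround can be sketched roughly as follows. This is a minimal illustration, not the actual code in use: the class name `PreviewText`, the method name `PlainTextPreview`, and the trailing "..." are assumptions made for the example, and the regex-based tag removal is deliberately crude.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class PreviewText
{
    // Hypothetical helper: removes tags, then cuts the remaining text at maxChars.
    public static string PlainTextPreview(string html, int maxChars)
    {
        if (string.IsNullOrEmpty(html) || maxChars <= 0) return string.Empty;

        // Crude tag removal; adequate for a preview, not a sanitizer.
        string text = Regex.Replace(html, "<[^>]*>", " ");

        // Decode entities such as &amp; so each counts as one character.
        text = WebUtility.HtmlDecode(text);

        // Collapse the whitespace left behind by the removed tags.
        text = Regex.Replace(text, @"\s+", " ").Trim();

        if (text.Length <= maxChars) return text;
        return text.Substring(0, maxChars).TrimEnd() + "...";
    }
}
```

With this sketch, `PreviewText.PlainTextPreview("<p>Hello <b>world</b></p>", 8)` returns `Hello wo...`. The result is plain text, so it needs to be HTML-encoded again before being written into the page, and the bold formatting is gone, which is exactly the drawback described above.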

What's the best way to truncate a block of HTML that isn't well-formed, while still producing HTML that won't break the rendering of the site?
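One possible direction, sketched here under stated assumptions rather than offered as the definitive answer, is to walk the markup token by token, count only the visible characters, remember which elements are still open, and close them when the character budget runs out. The names `HtmlPreview` and `Truncate` are hypothetical. The regex tokenizer is a simplification: it does not understand comments, script or style blocks, or attribute values that contain `>`, so a real HTML parser would be more robust for hostile input.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

public static class HtmlPreview
{
    // Elements that never take a closing tag, so they are not tracked.
    private static readonly HashSet<string> VoidTags = new HashSet<string>(
        new[] { "area", "base", "br", "col", "embed", "hr", "img",
                "input", "link", "meta", "source", "track", "wbr" },
        StringComparer.OrdinalIgnoreCase);

    // One token is a tag, a character entity, or a single character of text.
    private static readonly Regex Token = new Regex(
        @"<(?<close>/)?(?<name>[A-Za-z][A-Za-z0-9]*)[^<>]*>|&#?\w+;|.",
        RegexOptions.Singleline);

    public static string Truncate(string html, int maxChars)
    {
        if (string.IsNullOrEmpty(html)) return string.Empty;

        var output = new StringBuilder();
        var open = new List<string>(); // used as a stack of open element names
        int visible = 0;
        bool cut = false;

        foreach (Match m in Token.Matches(html))
        {
            if (!m.Groups["name"].Success)
            {
                // Text character or entity: counts as one visible character.
                if (visible >= maxChars) { cut = true; break; }
                output.Append(m.Value);
                visible++;
                continue;
            }

            string name = m.Groups["name"].Value.ToLowerInvariant();
            if (!m.Groups["close"].Success)
            {
                // Opening tag: copy it and remember it unless it cannot have content.
                output.Append(m.Value);
                bool selfClosed = m.Value.EndsWith("/>", StringComparison.Ordinal);
                if (!selfClosed && !VoidTags.Contains(name)) open.Add(name);
            }
            else
            {
                int index = open.LastIndexOf(name);
                if (index < 0) continue; // stray closing tag: drop it

                // Close anything left open inside this element, then the element itself.
                for (int i = open.Count - 1; i >= index; i--)
                    output.Append("</").Append(open[i]).Append('>');
                open.RemoveRange(index, open.Count - index);
            }
        }

        if (cut) output.Append("...");

        // Close whatever is still open, innermost first.
        for (int i = open.Count - 1; i >= 0; i--)
            output.Append("</").Append(open[i]).Append('>');
        return output.ToString();
    }
}
```

For example, `HtmlPreview.Truncate("<p>Hello <b>world</b> and more</p>", 8)` returns `<p>Hello <b>wo...</b></p>`. Because the closing tags are regenerated from the stack instead of being copied from the input, mis-nested or unclosed elements from the editor come out balanced in the preview, and `<br />` tags pass through without being counted as text.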

© Stack Overflow or respective owner
