HTML parsing with libxml

Posted by zajcev on Stack Overflow
Published on 2009-04-28T22:32:43Z Indexed on 2010/05/02 8:17 UTC

In another thread I was convinced to use an HTML parser instead of regexps for HTML parsing (I thought regexps would work fine, but they didn't ;) ).

I thought of using libxml (it has an HTML parser built in), but failed to find any useful tutorial. I also found this site, which says it should cope even with severely broken HTML.

Could you give me some examples of HTML parsing with libxml, or recommend a different free library for Linux? I'm using C++.

I just thought someone would have some example code, so that I don't have to analyze the headers ;)

© Stack Overflow or respective owner
