HTML parsing with libxml
Posted by zajcev on Stack Overflow
Published on 2009-04-28T22:32:43Z
In another thread I was convinced to use an HTML parser instead of regexps for HTML parsing (I thought regexps would work fine, but they didn't ;) ).
I thought of using libxml (it has an HTML parser built in), but failed to find any useful tutorial. I also found this site, which says it should cope even with severely broken HTML.
Could you give me some examples of HTML parsing with libxml, or recommend a different free library for Linux? I'm using C++.
I just thought someone might have some example code, so that I don't have to dig through the headers ;)
© Stack Overflow or respective owner