Best library to parse HTML with Python 3 and example?

Posted by TMC on Stack Overflow
Published on 2010-03-24T02:54:14Z Indexed on 2010/03/24 3:13 UTC

I'm completely new to Python and am using Python 3.1 on Windows (pywin). I need to parse some HTML, essentially to extract values between specific HTML tags, and I'm confused by the array of options; everything I find is suited to Python 2.x. I've read raves about Beautiful Soup, html5lib, and lxml, but I cannot figure out how to install any of these on Windows.

Questions:

  1. What HTML parser do you recommend?
  2. How do I install it?
  3. Do you have a simple example of how to use the recommended library to fetch HTML from a specific URL and return a value from something like this:

    <a href="/blahblah">fooLink</a>

(say we want to return "/blahblah")
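
For what it's worth, here is a minimal sketch of the kind of example being asked for. It needs no installation, since it only uses `html.parser` and `urllib.request` from the Python 3 standard library. It assumes the snippet above is an anchor tag whose text is fooLink and whose href is "/blahblah". The names `LinkHrefParser`, `find_hrefs`, and `find_hrefs_at` are made up for illustration, and this is not one of the libraries named in the question, just a fallback that avoids the install problem.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class LinkHrefParser(HTMLParser):
    """Collect the href of every <a> element whose text equals link_text.

    Hypothetical helper for illustration; not part of any library.
    """

    def __init__(self, link_text):
        super().__init__()
        self.link_text = link_text
        self.matches = []
        self._href = None   # href of the <a> currently open, if it has one
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # attrs is a list of (name, value) pairs
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            if "".join(self._text).strip() == self.link_text:
                self.matches.append(self._href)
            self._href = None


def find_hrefs(html, link_text):
    """Return the hrefs of all links in html whose text is link_text."""
    parser = LinkHrefParser(link_text)
    parser.feed(html)
    parser.close()
    return parser.matches


def find_hrefs_at(url, link_text):
    """Download the page at url and run find_hrefs on it."""
    with urlopen(url) as response:
        # Fall back to UTF-8 if the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return find_hrefs(html, link_text)


if __name__ == "__main__":
    # Assumed markup; the real page may wrap the link differently.
    sample = '<div><a href="/blahblah">fooLink</a></div>'
    print(find_hrefs(sample, "fooLink"))  # ['/blahblah']
```

To run it against a live page, call `find_hrefs_at` with the page's URL and the link text. For badly broken markup, a more forgiving parser such as Beautiful Soup or lxml is likely the better choice once it can be installed.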

© Stack Overflow or respective owner
