BeautifulSoup HTMLParseError. What's wrong with this?

Posted by user1915496 on Stack Overflow
Published on 2012-12-20T05:00:55Z

This is my code:

from bs4 import BeautifulSoup as BS
import urllib2
url = "http://services.runescape.com/m=news/recruit-a-friend-for-free-membership-and-xp"
res = urllib2.urlopen(url)
soup = BS(res.read())
other_content = soup.find_all('div',{'class':'Content'})[0]
print other_content

Yet an error comes up:

/Library/Python/2.7/site-packages/bs4/builder/_htmlparser.py:149: RuntimeWarning: Python's built-in HTMLParser cannot parse the given document. This is not a bug in Beautiful Soup. The best solution is to install an external parser (lxml or html5lib), and use Beautiful Soup with that parser. See http://www.crummy.com/software/BeautifulSoup/bs4/doc/#installing-a-parser for help.
  "Python's built-in HTMLParser cannot parse the given document. This is not a bug in Beautiful Soup. The best solution is to install an external parser (lxml or html5lib), and use Beautiful Soup with that parser. See http://www.crummy.com/software/BeautifulSoup/bs4/doc/#installing-a-parser for help."))
Traceback (most recent call last):
  File "web.py", line 5, in <module>
    soup = BS(res.read())
  File "/Library/Python/2.7/site-packages/bs4/__init__.py", line 172, in __init__
    self._feed()
  File "/Library/Python/2.7/site-packages/bs4/__init__.py", line 185, in _feed
    self.builder.feed(self.markup)
  File "/Library/Python/2.7/site-packages/bs4/builder/_htmlparser.py", line 150, in feed
    raise e

Two other people have run this exact code, and it works perfectly fine for them. Why doesn't it work for me? I do have bs4 installed.
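
The warning above suggests installing an external parser (lxml or html5lib) and telling Beautiful Soup to use it. A possible, unconfirmed reason the code works elsewhere is that those machines already have one of these parsers installed, since Beautiful Soup picks the best installed parser when none is named. Below is a minimal sketch of naming the parser explicitly. It assumes Python 3 (not the Python 2.7 shown in the traceback), and `pick_parser` and `parse` are hypothetical helper names, not part of bs4.

```python
import importlib.util


def pick_parser(preferred=("lxml", "html5lib")):
    """Return the name of the first installed external parser.

    Falls back to "html.parser", the parser built into Python,
    when none of the preferred packages can be found.
    """
    for name in preferred:
        # find_spec returns None when a top-level package is not installed
        if importlib.util.find_spec(name) is not None:
            return name
    return "html.parser"


def parse(markup):
    """Parse markup with an explicitly named parser (hypothetical helper)."""
    # Imported here so pick_parser() is usable even without bs4 installed.
    from bs4 import BeautifulSoup

    # The second argument names the parser instead of letting bs4 guess.
    return BeautifulSoup(markup, pick_parser())
```

With lxml or html5lib installed (for example via pip), `parse(res.read())` would stand in for `BS(res.read())` in the code above. Passing the parser name also keeps the behaviour the same from one machine to the next, as long as the same parser is installed on each.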
