Python Beautiful Soup .contents Property

Posted by Robert Birch on Stack Overflow
Published on 2013-10-26T03:13:53Z

What does BeautifulSoup's .contents do? I am working through crummy.com's tutorial and I don't really understand what .contents does. I have looked at the forums and have not found an answer. Consider the code below:

from BeautifulSoup import BeautifulSoup
import re



doc = ['<html><head><title>Page title</title></head>',
       '<body><p id="firstpara" align="center">This is paragraph <b>one</b>.',
        '<p id="secondpara" align="blah">This is paragraph <b>two</b>.',
        '</html>']

soup = BeautifulSoup(''.join(doc))
print soup.contents[0].contents[0].contents[0].contents[0].name

I would expect the last line of the code to print 'body', but instead I get:

  File "pe_ratio.py", line 29, in <module>
    print soup.contents[0].contents[0].contents[0].contents[0].name
  File "C:\Python27\lib\BeautifulSoup.py", line 473, in __getattr__
    raise AttributeError, "'%s' object has no attribute '%s'" % (self.__class__.__name__, attr)
AttributeError: 'NavigableString' object has no attribute 'name'

Is .contents only concerned with html, head and title? If so, why is that?

Thanks for the help in advance.
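
A note on the traceback: .contents is the list of a node's direct children, and those children can be tags or plain text. The sketch below is not Beautiful Soup itself. It is a stdlib-only toy (Node, TreeBuilder and first_child_chain are hypothetical names made up for this illustration) that mimics the idea of .contents, assuming a very naive tree builder, to show where a chain of .contents[0] ends up for the document in the question.

```python
from html.parser import HTMLParser


class Node:
    """Hypothetical stand-in for a tag: a name plus a list of children."""

    def __init__(self, name):
        self.name = name
        self.contents = []  # child Nodes and plain strings, in document order


class TreeBuilder(HTMLParser):
    """Toy tree builder for illustration; not Beautiful Soup's real parser."""

    def __init__(self):
        super().__init__()
        self.root = Node("[document]")
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag)
        self.stack[-1].contents.append(node)
        self.stack.append(node)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, if there is one.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].name == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        # Text is stored as a plain str, which has no .name attribute.
        self.stack[-1].contents.append(data)


def first_child_chain(node):
    """Follow .contents[0] repeatedly and record what is found at each step."""
    steps = []
    while isinstance(node, Node) and node.contents:
        node = node.contents[0]
        steps.append(node.name if isinstance(node, Node) else repr(node))
    return steps


doc = ['<html><head><title>Page title</title></head>',
       '<body><p id="firstpara" align="center">This is paragraph <b>one</b>.',
       '<p id="secondpara" align="blah">This is paragraph <b>two</b>.',
       '</html>']

builder = TreeBuilder()
builder.feed(''.join(doc))
builder.close()

chain = first_child_chain(builder.root)
print(chain)

# body is a sibling of head, so it is the second child of html.
html_node = builder.root.contents[0]
print(html_node.contents[1].name)
```

In this toy model the chain of first children runs html, head, title and then lands on the text inside title, which is a string rather than a tag. That lines up with the NavigableString in the traceback: the fourth .contents[0] is text, and text has no .name. body appears as the second child of html, so it sits at index 1 rather than index 0. Real Beautiful Soup may build a somewhat different tree for the unclosed p tags, so treat this as an illustration only.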

© Stack Overflow or respective owner
