Python lxml problem: saving an element selected by XPath

Posted by David on Stack Overflow
Published on 2011-03-16T23:54:29Z

I'm trying to print or save the HTML of a particular element from a web page.
I copied the element's XPath from Firebug.

All I want is to save this element to a file, but I haven't managed to do so.
(I tried the XPath both with and without /text() at the end.)

I would appreciate any help or pointers from past experience.
Thanks, David

import StringIO
import urllib2

from lxml import etree

# Download the page
url = 'http://www.tutiempo.net/en/Climate/Londres_Heathrow_Airport/12-2009/37720.htm'
seite = urllib2.urlopen(url)
html = seite.read()
seite.close()

# Parse it with lxml's HTML parser
parser = etree.HTMLParser()
tree = etree.parse(StringIO.StringIO(html), parser)

# XPath copied from Firebug
xpath = "/html/body/table/tbody/tr/td[2]/div/table/tbody/tr[6]/td/table/tbody/tr/td[3]/table/tbody/tr[3]/td/table/tbody/tr/td/table/tbody/tr/td/table/tbody/text()"
elem = tree.xpath(xpath)

print elem[0].strip().encode("utf-8")
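
One possible cause, offered as a sketch rather than a confirmed diagnosis of this page: Firebug shows the browser's DOM, where <tbody> elements are added automatically, while lxml parses the markup as it was served, which may contain no <tbody> at all. A path containing tbody steps would then match nothing. Separately, a trailing /text() selects text nodes, not the element, so the element's markup cannot be recovered from the result. The example below uses a small inline document (SAMPLE, a hypothetical stand-in for the real page), a hypothetical output file name, and Python 3 in place of the Python 2 code above.

```python
import os
import tempfile

from lxml import etree, html

# Hypothetical stand-in for the downloaded page. The markup has no <tbody>,
# although a browser would add one to the DOM that Firebug displays.
SAMPLE = """
<html><body>
  <table>
    <tr><td>Day</td><td>T</td></tr>
    <tr><td>1</td><td>4.2</td></tr>
  </table>
</body></html>
"""

tree = html.fromstring(SAMPLE)

# The browser-style path finds nothing, because the parsed tree has no <tbody>.
with_tbody = tree.xpath("/html/body/table/tbody/tr")

# The same path with the tbody step removed matches both rows.
without_tbody = tree.xpath("/html/body/table/tr")

# Select the element itself (no trailing /text()) and serialize it to markup.
table = tree.xpath("/html/body/table")[0]
markup = etree.tostring(table, method="html", encoding="unicode")

# Save the element's HTML; the file name here is only an illustration.
out_path = os.path.join(tempfile.gettempdir(), "element.html")
with open(out_path, "w", encoding="utf-8") as fh:
    fh.write(markup)

print(len(with_tbody), len(without_tbody))
```

If the same holds for the real page, removing every tbody step from the Firebug path and dropping /text() would be the first thing to try. Checking that the list returned by xpath() is non-empty before indexing it avoids an IndexError when nothing matches.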
