feedparser fails during script run, but can't reproduce in interactive python console

Posted by Rhubarb on Stack Overflow
Published on 2010-05-18T13:01:51Z Indexed on 2010/05/18 13:30 UTC

It fails with this error when I run my script from Eclipse or in IPython:

'ascii' codec can't decode byte 0xe2 in position 32: ordinal not in range(128) 
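For readers unfamiliar with this message, here is a minimal sketch of what it describes. It is written for Python 3 (the traceback in this post is from Python 2.6), and the sample bytes and the helper name `decode_error_message` are made up for illustration. Byte 0xe2 is the first byte of the UTF-8 encoding of many typographic characters, such as the right single quotation mark U+2019, so the error typically means UTF-8 encoded bytes were decoded with the ASCII codec.

```python
# Build 32 plain ASCII bytes followed by the UTF-8 encoding of U+2019
# (right single quotation mark), whose first byte is 0xe2.
# The data is illustrative; it is not taken from the feed in question.
raw = b"x" * 32 + u"\u2019".encode("utf-8")


def decode_error_message(data, codec):
    """Return the UnicodeDecodeError text for data under codec, or None."""
    try:
        data.decode(codec)
    except UnicodeDecodeError as exc:
        return str(exc)
    return None


# The ASCII codec rejects the byte at index 32; UTF-8 decodes it fine.
print(decode_error_message(raw, "ascii"))
print(decode_error_message(raw, "utf-8"))
```

The first call reports the same text as the error above, and the second returns `None` because the bytes are valid UTF-8.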

I don't know why, but when I simply execute the feedparser.parse(url) statement in the interactive console with the same url, no error is thrown. This is stumping me big time.
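One way to narrow down a difference like this is to print the interpreter and encoding settings in each environment and compare them. The sketch below assumes, without confirmation from the post, that the two environments might differ in one of these values. The function name `describe_environment` is hypothetical; the `sys` and `locale` calls are standard library.

```python
import locale
import sys


def describe_environment():
    """Collect interpreter settings that can differ between launchers."""
    return {
        "executable": sys.executable,
        "version": sys.version,
        "default_encoding": sys.getdefaultencoding(),
        # Some IDE consoles replace sys.stdout with an object that
        # has no encoding attribute, hence the getattr fallback.
        "stdout_encoding": getattr(sys.stdout, "encoding", None),
        "preferred_encoding": locale.getpreferredencoding(),
    }


for name, value in sorted(describe_environment().items()):
    print("%s: %s" % (name, value))
```

Running this once from the script launcher and once from the interactive console gives two lists that can be compared line by line.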

The code is as simple as:

    try:
        d = feedparser.parse(url)
    except Exception, e:
        logging.error('Error while retrieving feed.')
        logging.error(e)
        logging.error(formatExceptionInfo(None))
        logging.error(formatExceptionInfo1())
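As an aside on capturing the traceback: the standard library's `logging.exception` logs a message at ERROR level and appends the current traceback, which may make custom formatting helpers unnecessary. The sketch below is written for Python 3; `fetch_feed` is a hypothetical stand-in for `feedparser.parse` that always fails, and the logger name and URL are placeholders.

```python
import logging

logging.basicConfig(level=logging.ERROR)
log = logging.getLogger("feeds")


def fetch_feed(url):
    # Hypothetical stand-in for feedparser.parse(url); always fails
    # so that the error path below is exercised.
    raise ValueError("could not parse %s" % url)


def safe_fetch(url):
    """Return the parsed feed, or None after logging the full traceback."""
    try:
        return fetch_feed(url)
    except Exception:
        # logging.exception must be called from an except block; it
        # records the message plus the active exception's traceback.
        log.exception("Error while retrieving feed %s", url)
        return None


safe_fetch("http://example.com/feed")
```

The logged record then contains both the message and the complete stack trace in one entry.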

Here is the stack trace:

    d = feedparser.parse(url)
  File "C:\Python26\lib\site-packages\feedparser.py", line 2623, in parse
    feedparser.feed(data)
  File "C:\Python26\lib\site-packages\feedparser.py", line 1441, in feed
    sgmllib.SGMLParser.feed(self, data)
  File "C:\Python26\lib\sgmllib.py", line 104, in feed
    self.goahead(0)
  File "C:\Python26\lib\sgmllib.py", line 143, in goahead
    k = self.parse_endtag(i)
  File "C:\Python26\lib\sgmllib.py", line 320, in parse_endtag
    self.finish_endtag(tag)
  File "C:\Python26\lib\sgmllib.py", line 360, in finish_endtag
    self.unknown_endtag(tag)
  File "C:\Python26\lib\site-packages\feedparser.py", line 476, in unknown_endtag
    method()
  File "C:\Python26\lib\site-packages\feedparser.py", line 1318, in _end_content
    value = self.popContent('content')
  File "C:\Python26\lib\site-packages\feedparser.py", line 700, in popContent
    value = self.pop(tag)
  File "C:\Python26\lib\site-packages\feedparser.py", line 641, in pop
    output = _resolveRelativeURIs(output, self.baseuri, self.encoding)
  File "C:\Python26\lib\site-packages\feedparser.py", line 1594, in _resolveRelativeURIs
    p.feed(htmlSource)
  File "C:\Python26\lib\site-packages\feedparser.py", line 1441, in feed
    sgmllib.SGMLParser.feed(self, data)
  File "C:\Python26\lib\sgmllib.py", line 104, in feed
    self.goahead(0)
  File "C:\Python26\lib\sgmllib.py", line 138, in goahead
    k = self.parse_starttag(i)
  File "C:\Python26\lib\sgmllib.py", line 296, in parse_starttag
    self.finish_starttag(tag, attrs)
  File "C:\Python26\lib\sgmllib.py", line 338, in finish_starttag
    self.unknown_starttag(tag, attrs)
  File "C:\Python26\lib\site-packages\feedparser.py", line 1588, in unknown_starttag
    attrs = [(key, ((tag, key) in self.relative_uris) and self.resolveURI(value) or value) for key, value in attrs]
  File "C:\Python26\lib\site-packages\feedparser.py", line 1584, in resolveURI
    return _urljoin(self.baseuri, uri)
  File "C:\Python26\lib\site-packages\feedparser.py", line 286, in _urljoin
    return urlparse.urljoin(base, uri)
  File "C:\Python26\lib\urlparse.py", line 215, in urljoin
    params, query, fragment))
  File "C:\Python26\lib\urlparse.py", line 184, in urlunparse
    return urlunsplit((scheme, netloc, url, query, fragment))
  File "C:\Python26\lib\urlparse.py", line 192, in urlunsplit
    url = scheme + ':' + url
  File "C:\Python26\lib\encodings\cp1252.py", line 15, in decode
    return codecs.charmap_decode(input,errors,decoding_table)
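The last frames show a string concatenation in `urlunsplit` followed by a codec `decode` call, which suggests (this is an inference from the trace, not something the post confirms) that a byte string and a unicode string were being joined and one of them was decoded implicitly. A possible way to avoid relying on implicit decoding is to convert both parts to text explicitly before joining. The sketch below is for Python 3, where the module is `urllib.parse` instead of `urlparse`; the helper names `to_text` and `join_as_text` and the UTF-8 default are assumptions, and none of this is feedparser's own API.

```python
from urllib.parse import urljoin


def to_text(value, encoding="utf-8"):
    """Decode bytes to text with an explicit codec; pass text through."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def join_as_text(base, uri, encoding="utf-8"):
    """Join a base URL and a relative URI after normalising both to text."""
    return urljoin(to_text(base, encoding), to_text(uri, encoding))


# The relative part arrives as UTF-8 bytes containing a non-ASCII character.
print(join_as_text("http://example.com/feed/", b"caf\xc3\xa9.html"))
```

Choosing the codec explicitly keeps the result independent of whatever default encoding the running environment happens to use.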

© Stack Overflow or respective owner
