Trouble with encoding and urllib
Posted by Ockonal on Stack Overflow
Published on 2010-05-14T14:05:56Z
Indexed on 2010/05/14 14:14 UTC
Hello, I'm loading a web page using urllib. The page contains Russian characters, and its encoding is 'utf-8'. I tried two approaches, and both fail:
1
pageData = unicode(requestHandler.read()).decode('utf-8')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 262: ordinal not in range(128)
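A note on why this first attempt fails, assuming the code runs under Python 2 (the `print` statement and `unicode` suggest so): `unicode(some_bytes)` with no encoding argument decodes using the default ASCII codec, so the error is raised before `.decode('utf-8')` is ever reached. Byte 0xd0 is the usual first byte of a UTF-8 encoded Cyrillic letter. A minimal sketch of decoding the raw bytes directly is shown here; `raw_bytes` is a stand-in for `requestHandler.read()` and `decode_page` is a hypothetical helper name, neither is from the original post.

```python
# Stand-in for requestHandler.read(): UTF-8 bytes of a short Russian word.
# (Assumption: the real response body is UTF-8, as the question states.)
raw_bytes = u'\u041f\u0440\u0438\u0432\u0435\u0442'.encode('utf-8')


def decode_page(raw, encoding='utf-8'):
    """Decode raw response bytes into a unicode string.

    Hypothetical helper: it calls .decode() on the bytes directly and
    never passes them through unicode(), which would try ASCII first.
    """
    return raw.decode(encoding)


page_text = decode_page(raw_bytes)
```

In the original snippet this would correspond to `pageData = requestHandler.read().decode('utf-8')`, with the `unicode(...)` wrapper dropped.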
2
pageData = requestHandler.read()
soupHandler = BeautifulSoup(pageData)
print soupHandler.findAll(...)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 340-345: ordinal not in range(128)
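The second error goes in the opposite direction: the text is already unicode, and it is most likely the `print` that fails, because the output stream is treated as ASCII (a common situation when output is piped or the locale is not set). One possible workaround, sketched here under that assumption, is to encode the text explicitly before writing it out. The helper name `to_output_bytes` and the sample string are hypothetical and not from the original post.

```python
def to_output_bytes(text, encoding='utf-8', errors='strict'):
    """Encode unicode text explicitly instead of relying on the default codec.

    Hypothetical helper: pass errors='replace' with encoding='ascii' when the
    destination really cannot show non-ASCII characters.
    """
    return text.encode(encoding, errors)


sample = u'\u041f\u0440\u0438\u0432\u0435\u0442'  # stand-in for text taken from the soup

utf8_bytes = to_output_bytes(sample)                      # safe to write to a UTF-8 terminal or file
ascii_bytes = to_output_bytes(sample, 'ascii', 'replace')  # lossy: non-ASCII characters become '?'
```

Under Python 2, `print utf8_bytes` would then write the encoded bytes without going through the ASCII codec. Whether the terminal displays them correctly depends on its own encoding, which this sketch does not detect.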
© Stack Overflow or respective owner