Python file input string: how to handle escaped unicode characters?

Posted by Michi on Stack Overflow
Published on 2010-05-11T13:44:18Z

In a text file (test.txt), my string looks like this:

Gro\u00DFbritannien

When I read it, Python shows the backslash escaped:

>>> file = open('test.txt', 'r')
>>> input = file.readline()
>>> input
'Gro\\u00DFbritannien'
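As a side note, the doubled backslash above is most likely just how the interactive prompt displays the string (its repr), not a change to the data. A minimal sketch, assuming the line holds exactly the characters shown in test.txt (the name `raw` is only illustrative):

```python
# Illustrative only: `raw` stands in for the line read from test.txt.
raw = 'Gro\\u00DFbritannien'  # one literal backslash followed by u00DF

print(repr(raw))        # the repr doubles the backslash for display
print(raw)              # printing shows the single backslash stored in the string
print(len(raw))         # 19: the backslash counts as one character
print(raw.count('\\'))  # 1
```

So the string still contains the six-character escape sequence as plain text; nothing has been interpreted yet.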

How can I have this interpreted as Unicode? Neither decode() nor unicode() does the job.

The following code writes Gro\u00DFbritannien back to the file, but I want it to be Großbritannien:

>>> import codecs
>>> input.decode('latin-1')
u'Gro\\u00DFbritannien'
>>> out = codecs.open('out.txt', 'w', 'utf-8')
>>> out.write(input)
>>> out.close()
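One approach that may fit here is Python's built-in `unicode_escape` codec, which interprets `\uXXXX` sequences in a byte string. The sketch below is not from the original post: the helper name `decode_escapes` and the temporary output path are assumptions made for illustration, and it assumes the input line is otherwise pure ASCII.

```python
import io
import os
import tempfile


def decode_escapes(text):
    """Hypothetical helper: turn literal \\uXXXX sequences into real characters.

    Assumes `text` is pure ASCII apart from the escape sequences themselves;
    encode('ascii') raises an error otherwise.
    """
    return text.encode('ascii').decode('unicode_escape')


# Stand-in for the line read from test.txt (literal backslash + u00DF).
raw = 'Gro\\u00DFbritannien'
decoded = decode_escapes(raw)

# Write the decoded text as UTF-8; a temporary directory is used so the
# sketch does not overwrite any real out.txt.
path = os.path.join(tempfile.mkdtemp(), 'out.txt')
with io.open(path, 'w', encoding='utf-8') as out:
    out.write(decoded)

# Read it back to check what actually landed in the file.
with io.open(path, 'r', encoding='utf-8') as check:
    written = check.read()
```

If the file can also contain real non-ASCII characters next to the escape sequences, this simple version will not be enough, since `unicode_escape` treats the remaining bytes as Latin-1.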

© Stack Overflow or respective owner
