What if I put two kinds of encoded strings, say utf-8 and utf-16, in one file?

Posted by jonny on Stack Overflow
Published on 2012-06-20T07:24:16Z

In Python, for example:

f = open('test','w')
f.write('this is a test\n'.encode('utf-16'))
f.write('another test\n'.encode('utf-8'))
f.close()

That file gets messy when I re-open it:

f = open("test")
print f.readline().decode('utf-16')  # raises UnicodeDecodeError
print f.readline().decode('utf-8')   # it works fine

However, if I keep all the text in one encoding (say UTF-16 only), it reads back fine. So I'm guessing that mixing two encodings in the same file is wrong and that the file can't be decoded afterwards, even if I know which encoding was used for each specific string? Any suggestions are welcome, thank you!
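
One likely explanation for the error above: UTF-16 encodes '\n' as two bytes, so a byte-oriented readline() cuts the first line in the middle of a code unit. Below is a minimal sketch of that effect, followed by one possible way to keep both encodings in a single file by storing each string with an explicit length instead of relying on newlines. The record layout and the names write_record and read_records are made up for illustration (this is not a standard format), and io.BytesIO stands in for a file opened in binary mode ('wb' / 'rb').

```python
import io
import struct

# UTF-16 encodes "\n" as two bytes (0x0A 0x00 in little-endian order), so a
# byte-oriented readline() stops right after 0x0A and leaves the 0x00 behind.
encoded = u'this is a test\n'.encode('utf-16-le')
first_line = io.BytesIO(encoded).readline()
print(len(encoded), len(first_line))  # the line read back is one byte short


def write_record(stream, text, encoding):
    """Hypothetical layout: 1-byte name length, encoding name,
    4-byte payload length, then the encoded payload."""
    name = encoding.encode('ascii')
    payload = text.encode(encoding)
    stream.write(struct.pack('>B', len(name)) + name)
    stream.write(struct.pack('>I', len(payload)) + payload)


def read_records(stream):
    """Yield each decoded string, using the stored lengths instead of newlines."""
    while True:
        header = stream.read(1)
        if not header:  # end of data
            break
        (name_len,) = struct.unpack('>B', header)
        encoding = stream.read(name_len).decode('ascii')
        (payload_len,) = struct.unpack('>I', stream.read(4))
        yield stream.read(payload_len).decode(encoding)


buf = io.BytesIO()  # assumption: stands in for open('test', 'wb') / open('test', 'rb')
write_record(buf, u'this is a test\n', 'utf-16')
write_record(buf, u'another test\n', 'utf-8')
buf.seek(0)
records = list(read_records(buf))
print(records)
```

Under these assumptions the mix of encodings is not the problem in itself. What matters is that the reader can find the byte boundaries of each string and knows which codec to apply to it.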

© Stack Overflow or respective owner
