On Windows, I have the following problem:
>>> string = "Don´t Forget To Breathe"
>>> import json,os,codecs
>>> f = codecs.open("C:\\temp.txt","w","UTF-8")
>>> json.dump(string,f)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python26\lib\json\__init__.py", line 180, in dump
    for chunk in iterable:
  File "C:\Python26\lib\json\encoder.py", line 294, in _iterencode
    yield encoder(o)
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 3-5: invalid data
(Note the non-ASCII apostrophe in the string: it is ´, U+00B4, not a plain ASCII '.)
However, my friend, on his Mac (also using Python 2.6), can run the same code without any error:
>>> string = "Don´t Forget To Breathe"
>>> import json,os,codecs
>>> f = codecs.open("/tmp/temp.txt","w","UTF-8")
>>> json.dump(string,f)
>>> f.close(); open('/tmp/temp.txt').read()
'"Don\\u00b4t Forget To Breathe"'
Why does this fail on Windows but work on the Mac? I've also tried using UTF-16 and UTF-32 with json and codecs, but to no avail.
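One way to narrow this down is to compare the raw bytes each terminal might hand to Python for the same typed text. The sketch below is only an illustration under assumptions, not a confirmed diagnosis: it assumes the Mac terminal sends UTF-8 and the Windows console uses a legacy code page (cp850 is a guess; the real one may differ). It is written for Python 3 so the bytes can be built explicitly, and `decodes_as_utf8` is a hypothetical helper name. In Python 2, a plain "..." literal is a byte string, and json appears to decode such strings as UTF-8 by default.

```python
import json

# The same text as in the question, with U+00B4 written as an escape.
text = u"Don\u00b4t Forget To Breathe"

# Bytes a UTF-8 terminal would produce (assumed for the Mac).
utf8_bytes = text.encode("utf-8")
# Bytes a legacy console code page would produce (cp850 is an assumption).
cp850_bytes = text.encode("cp850")


def decodes_as_utf8(raw):
    """Return True if the byte string is valid UTF-8, False otherwise."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Show each byte string and whether it survives a UTF-8 decode.
print(repr(utf8_bytes), decodes_as_utf8(utf8_bytes))
print(repr(cp850_bytes), decodes_as_utf8(cp850_bytes))

# With a real unicode string there is nothing to guess about the encoding.
print(json.dumps(text))
```

If the second byte string fails to decode, that would be consistent with the traceback above, and passing a unicode object (for example `u"..."` or `string.decode(<console encoding>)` in Python 2) to json.dump would be the thing to try next.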