Windows cmd encoding change causes Python crash.
- by Alex
First I chage Windows CMD encoding to utf-8 and run Python interpreter:
chcp 65001
python
Then I try to print a unicode sting inside it and when i do this Python crashes in a peculiar way (I just get a cmd prompt in the same window).
>>> import sys
>>> print u'ëèæîð'.encode(sys.stdin.encoding)
Any ideas why it happens and how to make it work?
UPD: sys.stdin.encoding returns 'cp65001'
UPD2: It just came to me that the issue might be connected with the fact that utf-8 uses multi-byte character set (kcwu made a good point on that). I tried running the whole example with 'windows-1250' and got 'ëeaî?'. Windows-1250 uses single-character set so it worked for those characters it understands. However I still have no idea how to make 'utf-8' work here.
UPD3: Oh, I found out it is a known Python bug. I guess what happens is that Python copies the cmd encoding as 'cp65001 to sys.stdin.encoding and tries to apply it to all the input. Since it fails to understand 'cp65001' it crushes on any input that contains non-ascii characters.