Why does Python sometimes upgrade a string to unicode and sometimes not?
- by samtregar
I'm confused. Consider this code working the way I expect:
>>> foo = u'Émilie and Juañ are turncoats.'
>>> bar = "foo is %s" % foo
>>> bar
u'foo is \xc3\x89milie and Jua\xc3\xb1 are turncoats.'
And this code not at all working the way I expect:
>>> try:
... raise Exception(foo)
... except Exception as e:
... foo2 = e
...
>>> bar = "foo2 is %s" % foo2
------------------------------------------------------------
Traceback (most recent call last):
File "<ipython console>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)
Can someone explain what's going on here? Why does it matter whether the unicode data is in a plain unicode string or stored in an Exception object? And why does this fix it:
>>> bar = u"foo2 is %s" % foo2
>>> bar
u'foo2 is \xc3\x89milie and Jua\xc3\xb1 are turncoats.'
I am quite confused! Thanks for the help!
UPDATE: My coding buddy Randall has added to my confusion in an attempt to help me! Send in the reinforcements to explain how this is supposed to make sense:
>>> class A:
... def __str__(self): return "string"
... def __unicode__(self): return "unicode"
...
>>> "%s %s" % (u'niño', A())
u'ni\xc3\xb1o unicode'
>>> "%s %s" % (A(), u'niño')
u'string ni\xc3\xb1o'
Note that the order of the arguments here determines which method is called!