Dealing with wacky encodings in Python
Posted by Tyson on Stack Overflow
        
        
        
        
        Published on 2010-06-07T05:42:59Z
        
I have a Python script that pulls in data from many sources (databases, files, etc.). Supposedly, all the strings are unicode, but what I end up getting is any variation on the following theme (as returned by repr()):
u'D\\xc3\\xa9cor'
u'D\xc3\xa9cor'
'D\\xc3\\xa9cor'
'D\xc3\xa9cor'
Is there a reliable way to take any of the four strings above and return the proper unicode string?
u'D\xe9cor' # --> Décor
The only way I can think of right now uses eval(), replace(), and a deep, burning shame that will never wash away.
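For reference, here is one possible eval()-free approach, offered as a rough sketch rather than a definitive answer. It is written for Python 3, where `str` is the unicode type and `bytes` is the byte-string type; the question's `u'...'` and `'...'` literals map onto those two types. The name `fix_text` and the regex are made up for illustration. The idea: map everything to text in which each character stands for one byte, expand any literal `\xNN` sequences, then try to reinterpret the result as UTF-8.

```python
import re

# Matches a literal backslash followed by "x" and two hex digits, e.g. \xc3
_HEX_ESCAPE = re.compile(r'\\x([0-9a-fA-F]{2})')


def fix_text(value):
    """Best-effort repair of the four variants shown in the question.

    Hypothetical helper: it assumes the underlying data is UTF-8 that was
    mis-decoded and/or escaped somewhere upstream.
    """
    # Step 1: byte strings become text, one character per byte (Latin-1
    # maps bytes 0-255 to code points 0-255, so this never fails).
    text = value.decode('latin-1') if isinstance(value, bytes) else value

    # Step 2: turn literal "\xNN" sequences into the character they name.
    text = _HEX_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)

    # Step 3: if every character fits in one byte and those bytes form
    # valid UTF-8, assume the text was mis-decoded UTF-8 and decode it.
    try:
        return text.encode('latin-1').decode('utf-8')
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not byte-sized characters, or not valid UTF-8: leave as is.
        return text


if __name__ == '__main__':
    samples = [
        'D\\xc3\\xa9cor',    # text with literal backslash escapes
        'D\xc3\xa9cor',      # text holding UTF-8 bytes as characters
        b'D\\xc3\\xa9cor',   # bytes with literal backslash escapes
        b'D\xc3\xa9cor',     # raw UTF-8 bytes
    ]
    for sample in samples:
        print(repr(sample), '->', fix_text(sample))
```

All four samples come out as 'Décor', and a string that is already correct, such as 'D\xe9cor', passes through unchanged because its Latin-1 bytes are not valid UTF-8. This is a heuristic, not a guarantee: text that legitimately contains a sequence like "Ã©", or a literal "\x41" that was meant to stay escaped, would be rewritten. On Python 2 the same idea should carry over with `unicode`/`str` in place of `str`/`bytes` and `unichr` in place of `chr`, though that variant is untested here.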