What Character Encoding Is This?
Posted by Canoehead on Stack Overflow
Published on 2010-04-23T17:53:03Z
character-encoding | utf-7
I need to clean up some files containing French text. The problem is that each file erroneously contains multiple encodings.
I think some sections are ISO 8859-1 (Latin-1), but other parts have text encoded in single-byte characters that look like 'extended' ASCII. In other words, it looks like plain 7-bit ASCII plus the following byte values:
- 0x82 for é (e acute)
- 0x8a for è (e grave)
- 0x88 for ê (e circumflex)
- 0x85 for à (a grave)
- 0x87 for ç (c cedilla)
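
One way to narrow this down is to decode each of the bytes listed above with a few candidate codecs and keep only the codecs that reproduce every expected character. The following is a minimal sketch in Python, not a definitive detection method: the `candidates` shortlist and the helper name `matching_codecs` are assumptions made for illustration, and a match only shows that a codec is consistent with these five bytes.

```python
# Byte -> expected character, taken from the list above.
observed = {
    0x82: "\u00e9",  # é (e acute)
    0x8A: "\u00e8",  # è (e grave)
    0x88: "\u00ea",  # ê (e circumflex)
    0x85: "\u00e0",  # à (a grave)
    0x87: "\u00e7",  # ç (c cedilla)
}

# Assumed shortlist of single-byte codecs to test; extend as needed.
candidates = ["cp437", "cp850", "cp1252", "latin-1", "mac_roman"]


def matching_codecs(observed, candidates):
    """Return the codecs that decode every observed byte to its expected character."""
    matches = []
    for name in candidates:
        try:
            ok = all(bytes([b]).decode(name) == ch for b, ch in observed.items())
        except UnicodeDecodeError:
            # The codec has no mapping for at least one of the bytes.
            ok = False
        if ok:
            matches.append(name)
    return matches


print(matching_codecs(observed, candidates))
```

With this shortlist, only the DOS code pages (`cp437` and `cp850`) agree with all five bytes, which suggests the non-Latin-1 sections were written by a DOS-era tool. The five bytes alone cannot tell those two code pages apart, since they share these positions.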
 
What encoding is this?
© Stack Overflow or respective owner