Convert ISO/Windows charsets to UTF-8 in JavaScript

Posted by Amir on Stack Overflow
Published on 2010-04-20T10:52:33Z Indexed on 2010/04/20 11:53 UTC

I'm developing a Firefox plugin that fetches web pages to do some analysis for the user. The problem is that when I fetch (via XMLHttpRequest) pages that are not UTF-8 encoded, the string I get back is garbled. For example, Hebrew pages encoded as windows-1255 or Chinese pages encoded as gb2312.

I already tried the following:

// Attempt 1: scriptable Unicode converter
var uDecoder = Components.classes["@mozilla.org/intl/scriptableunicodeconverter"].getService(Components.interfaces.nsIScriptableUnicodeConverter);
uDecoder.charset = "windows-1255";
alert(xhr.responseText);

// Attempt 2: UTF-8 converter service
var decoder = Components.classes["@mozilla.org/intl/utf8converterservice;1"].getService(Components.interfaces.nsIUTF8ConverterService);

alert(decoder.convertStringToUTF8(xhr.responseText, "WINDOWS-1255", true));

I also tried escape/unescape/encodeURIComponent.

Any ideas?
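
One approach not among the attempts listed above is to tell XMLHttpRequest which charset to use before the request is sent, via its standard overrideMimeType() method, so that responseText is decoded with that charset in the first place. The following is only a minimal sketch under assumptions: fetchWithCharset, onDone and the XhrCtor parameter are hypothetical names introduced for illustration, the charset is assumed to be known in advance, and whether this resolves the case described here depends on the Firefox version and the server's headers.

```javascript
// Sketch only: fetchWithCharset, onDone and XhrCtor are hypothetical names.
// XhrCtor lets a caller inject a stand-in object; it falls back to the
// browser's XMLHttpRequest when omitted.
function fetchWithCharset(url, charset, onDone, XhrCtor) {
  var Ctor = XhrCtor || XMLHttpRequest;
  var xhr = new Ctor();
  xhr.open("GET", url, true);
  // Has to be called before send(); it overrides the MIME type and charset
  // that would otherwise be used to build responseText.
  xhr.overrideMimeType("text/html; charset=" + charset);
  xhr.onreadystatechange = function () {
    // readyState 4 means the request has finished.
    if (xhr.readyState === 4) {
      onDone(xhr.status, xhr.responseText);
    }
  };
  xhr.send(null);
  return xhr;
}
```

It might be called as fetchWithCharset(pageUrl, "windows-1255", function (status, text) { alert(text); }), where pageUrl is a placeholder for the page being analysed. The charset still has to come from somewhere, for example the Content-Type header or a meta tag of the page, and this sketch does not attempt to detect it.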

© Stack Overflow or respective owner
