Convert UCS-2 characters to UTF-8 Using C#

Posted by quanticle on Stack Overflow
Published on 2010-12-29T15:04:03Z

I'm pulling some internationalized text from an MS SQL Server 2005 database. By default, that database stores the characters as UCS-2. However, I need to output the data as UTF-8, since I'm sending it out over the web. Currently, I use the following code to convert it:

SqlString dbString = resultReader.GetSqlString(0);
byte[] dbBytes = dbString.GetUnicodeBytes();
byte[] utf8Bytes = System.Text.Encoding.Convert(System.Text.Encoding.Unicode, 
    System.Text.Encoding.UTF8, dbBytes);
System.Text.UTF8Encoding encoder = new System.Text.UTF8Encoding();
string outputString = encoder.GetString(utf8Bytes);

However, when I examine the output in the browser, it appears to be garbage, no matter what I set the encoding to.

What am I missing?
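
One thing that may be worth checking first is whether the conversion changes anything at all. A .NET `string` is held as UTF-16 internally, so decoding the UTF-8 bytes back into a `string` should simply return the original text. The sketch below mirrors the snippet above but starts from a plain `string` instead of a `SqlString`, an assumption made so it runs without a database. The names `RoundTripSketch` and `ViaUtf8` are hypothetical.

```csharp
using System;
using System.Text;

public static class RoundTripSketch
{
    // Same steps as the snippet above: UTF-16 bytes -> UTF-8 bytes -> string.
    // A plain string stands in for the SqlString read from the database (assumption).
    public static string ViaUtf8(string source)
    {
        // Encoding.Unicode is UTF-16 little-endian, the same form GetUnicodeBytes() returns.
        byte[] utf16Bytes = Encoding.Unicode.GetBytes(source);
        byte[] utf8Bytes = Encoding.Convert(Encoding.Unicode, Encoding.UTF8, utf16Bytes);

        // Decoding produces a .NET string again, which is UTF-16 in memory.
        return Encoding.UTF8.GetString(utf8Bytes);
    }

    public static void Main()
    {
        string sample = "\u65E5\u672C\u8A9E caf\u00E9";
        Console.WriteLine(ViaUtf8(sample) == sample); // prints "True"
    }
}
```

If the same holds for the real data, `outputString` carries the same text as `dbString.Value`, which would suggest the mangling happens either before this code (in the stored data) or after it (when the response is encoded), rather than in the conversion itself.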

EDIT: In response to the answers below, the reason I thought I had to perform a conversion is that I can output literal multibyte strings just fine. For example:

OutputControl.Text = "????????????????????????????????????????????????????????????????";

works. Here, OutputControl is an ASP.NET Literal control. However,

OutputControl.Text = outputString; //Output from above snippet

results in mangled output as described above. My hypothesis was that the database's output was somehow getting mangled by ASP.NET. If that's not the case, then what are some other possibilities?
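
One way to narrow this down, offered as a rough diagnostic sketch rather than a fix, is to look at the UTF-16 code units of the string before it ever reaches the page. The helper below is hypothetical (`CodeUnitDump.Describe` is not part of any library). It only formats each `char` as `U+XXXX` so the values can be compared with the characters expected from the database.

```csharp
using System;
using System.Text;

public static class CodeUnitDump
{
    // Formats every UTF-16 code unit of the input as U+XXXX, separated by spaces.
    public static string Describe(string text)
    {
        var sb = new StringBuilder();
        foreach (char c in text)
        {
            if (sb.Length > 0)
            {
                sb.Append(' ');
            }
            sb.Append("U+").Append(((int)c).ToString("X4"));
        }
        return sb.ToString();
    }
}
```

Calling `CodeUnitDump.Describe(outputString)` and logging the result would show whether the string already holds unexpected values, such as runs of `U+003F` (a literal question mark), before ASP.NET renders it. If the code units look right at that point, the problem is more likely in how the response is encoded or declared to the browser.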

© Stack Overflow or respective owner