Convert UCS-2 characters to UTF-8 Using C#
Posted
by
quanticle
on Stack Overflow
See other posts from Stack Overflow
or by quanticle
Published on 2010-12-29T15:04:03Z
Indexed on
2010/12/29
16:54 UTC
Read the original article
Hit count: 302
I'm pulling some internationalized text from a MS SQL Server 2005 database. As per the defaults for that DB, the characters are stored as UCS-2. However, I need to output the data in UTF-8 format, as I'm sending it out over the web. Currently, I have the following code to convert:
SqlString dbString = resultReader.GetSqlString(0);
byte[] dbBytes = dbString.GetUnicodeBytes();
byte[] utf8Bytes = System.Text.Encoding.Convert(System.Text.Encoding.Unicode,
System.Text.Encoding.UTF8, dbBytes);
System.Text.UTF8Encoding encoder = new System.Text.UTF8Encoding();
string outputString = encoder.GetString(utf8Bytes);
However, when I examine the output in the browser, it appears to be garbage, no matter what I set the encoding to.
What am I missing?
EDIT: In response to the answers below, the reason I thought I had to perform a conversion is because I can output literal multibyte strings just fine. For example:
OutputControl.Text = "????????????????????????????????????????????????????????????????";
works. Here, OutputControl
is an ASP.Net Literal. However,
OutputControl.Text = outputString; //Output from above snippet
results in mangled output as described above. My hypothesis was that the database's output was somehow getting mangled by ASP.Net. If that's not the case, then what are some other possibilities?
© Stack Overflow or respective owner