How do I convert a C# string from one character encoding to another?

Posted by Handleman on Stack Overflow
Published on 2011-02-25T06:26:49Z Indexed on 2011/02/25 7:25 UTC

According to Spolsky I can't call myself a developer, so there is a lot of shame behind this question...

Scenario: From a C# application, I would like to take a string value from a SQL db and use it as the name of a directory. I have a secure (SSL) FTP server on which I want to set the current directory using the string value from the DB.
Problem: Everything is working fine until I hit a string value with a "special" character - I seem unable to encode the directory name correctly to satisfy the FTP server.

The code example below:

  • uses the "special" character é as an example
  • uses WinSCP as an external application for the FTPS communication
  • does not show all the code required to set up the Process "_winscp"
  • sends commands to the WinSCP exe by writing to the process's standard input
  • for simplicity, does not get the info from the DB, but instead simply declares a string (I did do a .Equals to confirm that the value from the DB is the same as the declared string)
  • makes three attempts to set the current directory on the FTP server using different string encodings, all of which fail
  • makes an attempt to set the directory using a string that was created from a hand-crafted byte array, which works

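// Process setup (StartInfo, redirected standard input, Start) is not shown.
Process _winscp = new Process();
byte[] buffer;

// Attempt 1: send the string as-is (fails).
string nameFromString = "Sinéad O'Connor";
_winscp.StandardInput.WriteLine("cd \"" + nameFromString + "\"");

// Attempt 2: round-trip the string through UTF-8 (fails).
buffer = Encoding.UTF8.GetBytes(nameFromString);
_winscp.StandardInput.WriteLine("cd \"" + Encoding.UTF8.GetString(buffer) + "\"");

// Attempt 3: round-trip the string through ASCII (fails).
buffer = Encoding.ASCII.GetBytes(nameFromString);
_winscp.StandardInput.WriteLine("cd \"" + Encoding.ASCII.GetString(buffer) + "\"");

// Attempt 4: build the string from hand-crafted bytes, with 130 in place of é (works).
byte[] nameFromBytes = new byte[] { 83, 105, 110, 130, 97, 100, 32, 79, 39, 67, 111, 110, 110, 111, 114 };
_winscp.StandardInput.WriteLine("cd \"" + Encoding.Default.GetString(nameFromBytes) + "\"");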

The UTF8 encoding changes é to 101 (decimal) but the FTP server doesn't like it.

The ASCII encoding changes é to 63 (decimal) but the FTP server doesn't like it.

When I represent é as value 130 (decimal) the FTP server is happy, except I can't find a method that will do this for me (I had to manually construct the string from explicit bytes).
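As a sanity check on the numbers above, here is a minimal sketch that prints the bytes each encoding produces for é on its own. It assumes a modern .NET runtime, where legacy code pages have to be registered through CodePagesEncodingProvider first (on the older .NET Framework the code pages are built in and that line can be dropped). Code page 437 is chosen purely as an assumption, because é is byte 130 there. The class and method names are hypothetical.

```csharp
using System;
using System.Text;

public static class EncodingBytesDemo
{
    // Looks up a legacy code page, registering the provider that
    // modern .NET needs before GetEncoding(437) and similar will work.
    public static Encoding CodePage(int codePage)
    {
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(codePage);
    }

    // Returns the encoded bytes of the text as a comma-separated decimal list.
    public static string BytesOf(string text, Encoding encoding)
    {
        return string.Join(",", encoding.GetBytes(text));
    }

    public static void Main()
    {
        string eAcute = "\u00E9"; // é, escaped so the source file encoding does not matter

        Console.WriteLine(BytesOf(eAcute, Encoding.UTF8));   // 195,169
        Console.WriteLine(BytesOf(eAcute, Encoding.ASCII));  // 63 (the '?' replacement)
        Console.WriteLine(BytesOf(eAcute, CodePage(1252)));  // 233
        Console.WriteLine(BytesOf(eAcute, CodePage(437)));   // 130 (assumed OEM code page)
    }
}
```

If that reading is right, 130 is neither UTF-8 nor ASCII but an OEM code page value (code pages 437 and 850 both map é to 130), and UTF-8 itself yields the two bytes 195 and 169 rather than 101.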

Does anyone know what I should do to my string to encode the é as 130, make the FTP server happy, and finally elevate me to level 1 developer by explaining the one thing every developer should understand?
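One way this might be approached, sketched under assumptions rather than confirmed against WinSCP: encode the whole command with an OEM code page and write the raw bytes to the stream underneath the process's standard input, so the writer's own encoding never touches them. Code page 437 is an assumption (the post only establishes that byte 130 works), the OemCommandWriter class and its methods are hypothetical, and a MemoryStream stands in for _winscp.StandardInput so the sketch runs without WinSCP.

```csharp
using System;
using System.IO;
using System.Text;

public static class OemCommandWriter
{
    // Encodes one command line, terminated by CRLF, using the given code page.
    public static byte[] EncodeCommand(string command, int codePage)
    {
        // Required on modern .NET before legacy code pages can be looked up.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(codePage).GetBytes(command + "\r\n");
    }

    // Writes the encoded bytes directly to the writer's underlying stream,
    // bypassing whatever encoding the StreamWriter itself was created with.
    public static void WriteCommand(StreamWriter writer, string command, int codePage)
    {
        writer.Flush(); // push out anything already buffered, to keep ordering intact
        byte[] bytes = EncodeCommand(command, codePage);
        writer.BaseStream.Write(bytes, 0, bytes.Length);
        writer.BaseStream.Flush();
    }

    public static void Main()
    {
        // Stand-in for _winscp.StandardInput (an assumption for this demo).
        using (var memory = new MemoryStream())
        using (var writer = new StreamWriter(memory))
        {
            WriteCommand(writer, "cd \"Sin\u00E9ad O'Connor\"", 437);
            Console.WriteLine(memory.ToArray()[7]); // 130, the byte written for é
        }
    }
}
```

With a real process, the call would look like OemCommandWriter.WriteCommand(_winscp.StandardInput, "cd \"" + nameFromString + "\"", 437). Whether 437, 850, or another code page is the right one depends on what the receiving side expects, which the post does not establish.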

© Stack Overflow or respective owner
