How do I ignore the UTF-8 Byte Order Marker in String comparisons?
- by Skrud
I'm having a problem comparing strings in a unit test in C# 4.0 using Visual Studio 2010. The same test case works properly in Visual Studio 2008 (with C# 3.0 on .NET 3.5).
Here's the relevant code snippet:
byte[] rawData = GetData();
string data = Encoding.UTF8.GetString(rawData);
Assert.AreEqual("Constant", data, false, CultureInfo.InvariantCulture);
While debugging this test, the data string appears to the naked eye to contain exactly the same text as the literal. When I called data.ToCharArray(), I noticed that the first character of data has the value 65279 (U+FEFF), which is the Byte Order Mark; in UTF-8 it is encoded as the bytes EF BB BF. What I don't understand is why Encoding.UTF8.GetString() keeps this character around.
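The behavior can be reproduced with a minimal sketch along these lines. It assumes that GetData() returns UTF-8 bytes beginning with the BOM sequence EF BB BF; the hard-coded byte array and the class name BomRepro are hypothetical stand-ins, not the real test data.

```csharp
using System;
using System.Text;

class BomRepro
{
    static void Main()
    {
        // Hypothetical stand-in for GetData(): the UTF-8 BOM bytes (EF BB BF)
        // followed by the ASCII bytes of "Constant".
        byte[] rawData =
        {
            0xEF, 0xBB, 0xBF,
            0x43, 0x6F, 0x6E, 0x73, 0x74, 0x61, 0x6E, 0x74
        };

        // GetString decodes the BOM bytes like any other character,
        // so the result starts with U+FEFF.
        string data = Encoding.UTF8.GetString(rawData);

        Console.WriteLine(data.Length);   // one more than "Constant".Length
        Console.WriteLine((int)data[0]);  // 65279, i.e. U+FEFF

        // An ordinal comparison sees the extra leading character.
        Console.WriteLine(string.Equals("Constant", data, StringComparison.Ordinal));
    }
}
```

The leading U+FEFF is zero-width, which would explain why the two strings look identical in the debugger even though an ordinal comparison reports them as different.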
How do I get Encoding.UTF8.GetString() to not put the Byte Order Marker in the resulting string?