Characters in string changed after downloading HTML from the internet.
- by Callum Rogers
Using the following code, I can download the HTML of a file from the internet:
WebClient wc = new WebClient();
// ....
string downloadedFile = wc.DownloadString("http://www.myurl.com/");
However, sometimes the file contains "interesting" characters like é to é, ? to ↠and ????? to フシギダãƒ.
I think it may be something to do with different unicode types or something, as each character gets changed into 2 new ones, perhaps each character being split in half but I have very little knowledge in this area. What do you think is wrong?