Characters in string changed after downloading HTML from the internet.

Posted by Callum Rogers on Stack Overflow See other posts from Stack Overflow or by Callum Rogers
Published on 2010-04-23T17:30:33Z Indexed on 2010/04/23 17:33 UTC
Read the original article Hit count: 366

Filed under:

c#

|

.NET

|

webclient

|

unicode

|

string

Using the following code, I can download the HTML of a file from the internet:

WebClient wc = new WebClient();

// ....

string downloadedFile = wc.DownloadString("http://www.myurl.com/");

However, sometimes the file contains "interesting" characters like é to Ã©, ? to â† and ????? to ãƒ•ã‚·ã‚®ãƒ€ãƒ.

I think it may be something to do with different unicode types or something, as each character gets changed into 2 new ones, perhaps each character being split in half but I have very little knowledge in this area. What do you think is wrong?

© Stack Overflow or respective owner

Related posts about c#

.NET WebRequest.PreAuthenticate not quite what it sounds like

as seen on West-Wind - Search for 'West-Wind'
I’ve run into the problem a few times now: How to pre-authenticate .NET WebRequest calls doing an HTTP call to the server – essentially send authentication credentials on the very first request instead of waiting for a server challenge first? At first glance this sound like it should be easy:… >>> More
HttpWebRequest and Ignoring SSL Certificate Errors

as seen on West-Wind - Search for 'West-Wind'
Man I can't believe this. I'm still mucking around with OFX servers and it drives me absolutely crazy how some these servers are just so unbelievably misconfigured. I've recently hit three different 3 major brokerages which fail HTTP validation with bad or corrupt certificates at least according to… >>> More
The dynamic Type in C# Simplifies COM Member Access from Visual FoxPro

as seen on West-Wind - Search for 'West-Wind'
I’ve written quite a bit about Visual FoxPro interoperating with .NET in the past both for ASP.NET interacting with Visual FoxPro COM objects as well as Visual FoxPro calling into .NET code via COM Interop. COM Interop with Visual FoxPro has a number of problems but one of them at least got a lot… >>> More
Dynamic Type to do away with Reflection

as seen on West-Wind - Search for 'West-Wind'
The dynamic type in C# 4.0 is a welcome addition to the language. One thing I’ve been doing a lot with it is to remove explicit Reflection code that’s often necessary when you ‘dynamically’ need to walk and object hierarchy. In the past I’ve had a number of ReflectionUtils that used string based expressions… >>> More
Finding a Relative Path in .NET

as seen on West-Wind - Search for 'West-Wind'
Here’s a nice and simple path utility that I’ve needed in a number of applications: I need to find a relative path based on a base path. So if I’m working in a folder called c:\temp\templates\ and I want to find a relative path for c:\temp\templates\subdir\test.txt I want to receive back subdir\test… >>> More

Related posts about .NET

Apt-Get Update: failure to fetch; can't connect to any sources

as seen on Ask Ubuntu - Search for 'Ask Ubuntu'
I realize there are dozens of "apt-get update: failure to fetch" questions (I read through all I could find), but my present circumstance is unique to 12.04 and it affects all sources; not just launchpad. Additionally, I've tried several different servers in Europe and the U.S. as well as the "main… >>> More
12.04: Apt-Get Update: failure to fetch; can't connect to any sources

as seen on Ask Ubuntu - Search for 'Ask Ubuntu'
I realize there are dozens of "apt-get update: failure to fetch" questions (I read through all I could find), but my present circumstance is unique to 12.04 and it affects all sources; not just launchpad. Additionally, I've tried several different servers in Europe and the U.S. as well as the "main… >>> More
What's New in ASP.NET 4

as seen on Geeks with Blogs - Search for 'Geeks with Blogs'
The .NET Framework version 4 includes enhancements for ASP.NET 4 in targeted areas. Visual Studio 2010 and Microsoft Visual Web Developer Express also include enhancements and new features for improved Web development. This document provides an overview of many of the new features that are included… >>> More
.NET Reflector 6, .NET Reflector Pro, TestDriven.NET, .NET 4.0 and Mono

as seen on Simple Talk - Search for 'Simple Talk'
By now you may well have noticed that .NET Reflector 6 and .NET Reflector Pro are out in the wild. The official launch happened today, although we actually put the software out last Thursday as part of a phased release plan to ensure that everything went smoothly today which, so far, it seems to have… >>> More
Redmine on Apache2 with Passenger issue

as seen on Server Fault - Search for 'Server Fault'
I installed Redmine and run it in Apache2 with the Passenger module. Apache2 boots, Passenger module gets loaded and the Redmine welcome page is shown, however when trying to login or navigate to other parts of the Redmine site, the browser keeps loading and loading and loading forever, although the… >>> More