I'm developing an application (winforms C# .NET 4.0) where I access a lookup functionality from a 3rd party through a simple HTTP request. I call an url with a parameter, and in return I get a small string with the result of the lookup. Simple enough.
The challenge is however, that I have to do lots of these lookups (a couple of thousands), and I would like to limit the time needed. Therefore I would like to run requests in parallel (say 10-20). I use a ThreadPool to do this, and the short version of my code looks like this:
public void startAsyncLookup(Action<LookupResult> returnLookupResult)
{
this.returnLookupResult = returnLookupResult;
foreach (string number in numbersToLookup)
{
ThreadPool.QueueUserWorkItem(lookupNumber, number);
}
}
public void lookupNumber(Object threadContext)
{
string numberToLookup = (string)threadContext;
string url = @"http://some.url.com/?number=" + numberToLookup;
WebClient webClient = new WebClient();
Stream responseData = webClient.OpenRead(url);
LookupResult lookupResult = parseLookupResult(responseData);
returnLookupResult(lookupResult);
}
I fill up numbersToLookup (a List<String>) from another place, call startAsyncLookup and provide it with a call-back function returnLookupResult to return each result. This works, but I found that I'm not getting the throughput I want.
Initially I thought it might be the 3rd party having a poor system on their end, but I excluded this by trying to run the same code from two different machines at the same time. Each of the two took as long as one did alone, so I could rule out that one.
A colleague then tipped me that this might be a limitation in Windows. I googled a bit, and found amongst others this post saying that by default Windows limits the number of simultaneous request to the same web server to 4 for HTTP 1.0 and to 2 for HTTP 1.1 (for HTTP 1.1 this is actually according to the specification (RFC2068)).
The same post referred to above also provided a way to increase these limits. By adding two registry values to [HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Internet Settings] (MaxConnectionsPerServer and MaxConnectionsPer1_0Server), I could control this myself.
So, I tried this (sat both to 20), restarted my computer, and tried to run my program again. Sadly though, it didn't seem to help any. I also kept an eye on the Resource Monitor (see screen shot) while running my batch lookup, and I noticed that my application (the one with the title blacked out) still only was using two TCP connections.
So, the question is, why isn't this working? Is the post I linked to using the wrong registry values? Is this perhaps not possible to "hack" in Windows any longer (I'm on Windows 7)?
Any ideas would be highly appreciated :)
And just in case anyone should wonder, I have also tried with different settings for MaxThreads on ThreadPool (everyting from 10 to 100), and this didn't seem to affect my throughput at all, so the problem shouldn't be there either.