C# Network Programming - HttpWebRequest Scraping
Posted by masterguru on Stack Overflow
Published on 2010-05-01T00:10:58Z
Indexed on 2010/05/01 0:17 UTC
Hi,
I am building a web scraping application. It should scrape a complex web site using concurrent HttpWebRequests from a single host to a single target web server.
The application should run on Windows Server 2008.
A single HttpWebRequest for data can take from 1 to 4 minutes to complete (because of long-running database operations).
I need at least 100 parallel requests to the target web server, but I have noticed that when I use more than 2-3 long-running requests I get serious performance problems (requests time out or hang).
How many concurrent requests can I have in this scenario from a single host to a single target web server? Can I use thread pools in the application to run parallel HttpWebRequests to the server? Will I have any issues with the default outbound HTTP connection/request limits? What about request timeouts when I reach the outbound connection limits? What would be the best setup for my scenario?
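For reference, here is a minimal sketch of the settings these questions touch on. It is not a tested setup. It assumes the .NET Framework, where ServicePointManager.DefaultConnectionLimit defaults to 2 connections per host for client applications and HttpWebRequest.Timeout defaults to 100 seconds, both of which would likely matter with requests that run for 1 to 4 minutes. The class name ScraperSetup, the 100-connection limit, and the 5-minute timeout are illustrative values derived from the numbers above, not recommendations.

```csharp
using System;
using System.Net;

// Hypothetical helper; names and values are assumptions for illustration only.
static class ScraperSetup
{
    // Taken from the question: at least 100 parallel requests.
    public const int MaxParallelRequests = 100;

    // Assumed value: a little above the 4-minute worst case from the question.
    public static readonly TimeSpan RequestTimeout = TimeSpan.FromMinutes(5);

    // Call once at startup, before the first request to the target host is created.
    public static void ConfigureConnectionLimit()
    {
        // Raises the per-host limit on concurrent outbound connections.
        // With the default of 2, additional requests wait in a queue.
        ServicePointManager.DefaultConnectionLimit = MaxParallelRequests;
    }

    // Creates a request whose timeouts allow for long-running responses.
    // No network I/O happens here; the connection is opened on GetResponse().
    public static HttpWebRequest CreateRequest(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        int timeoutMs = (int)RequestTimeout.TotalMilliseconds;
        request.Timeout = timeoutMs;          // applies to synchronous GetResponse()
        request.ReadWriteTimeout = timeoutMs; // applies to reads/writes on the response stream
        return request;
    }
}
```

A caller would run ScraperSetup.ConfigureConnectionLimit() once and then build each request with ScraperSetup.CreateRequest(url). Whether the target server can actually sustain 100 simultaneous long-running requests is a separate question that this sketch does not address.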
Any help would be appreciated.
Thanks
© Stack Overflow or respective owner