c# Network Programming - HTTPWebRequest Scraping
- by masterguru
Hi,
I am building a web scraping application. It should scrape a complex web site with concurrent HttpWebRequests from a single host to a single target web server.
The application should run on Windows server 2008.
One single HttpWebRequest for data could take from 1 minute to 4 minutes to complete (because of long running db operations)
I should have at least 100 parallel requests to the target web server, but i have noticed that when i use more then 2-3 long-running requests i have big performance issues (request timeouts/hanging).
How many concurrent requests can i have in this scenario from a single host to a single target web server? can i use Thread Pools in the application to run parallel HttpWebRequests to the server? will i have any issues with the default outbound HTTP connection/requests limits? what about Request timeouts when i reach outbound connection limits? what would be the best setup for my scenario?
Any help would be appreciated.
Thanks