Design for fastest page download
Posted
by mexxican
on Stack Overflow
See other posts from Stack Overflow
or by mexxican
Published on 2010-05-28T13:56:19Z
Indexed on
2010/05/28
14:42 UTC
Read the original article
Hit count: 203
I have a file with millions of URLs/IPs and have to write a program to download the pages really fast. The connection rate should be at least 6000/s and file download speed at least 2000 with avg. 15kb file size. The network bandwidth is 1 Gbps.
My approach so far has been: Creating 600 socket threads with each having 60 sockets and using WSAEventSelect to wait for data to read. As soon as a file download is complete, add that memory address(of the downloaded file) to a pipeline( a simple vector ) and fire another request. When the total download is more than 50Mb among all socket threads, write all the files downloaded to the disk and free the memory. So far, this approach has been not very successful with the rate at which I could hit not shooting beyond 2900 connections/s and downloaded data rate even less.
Can somebody suggest an alternative approach which could give me better stats. Also I am working windows server 2008 machine with 8 Gig of memory. Also, do we need to hack the kernel so as we could use more threads and memory. Currently I can create a max. of 1500 threads and memory usage not going beyond 2 gigs [ which technically should be much more as this is a 64-bit machine ]. And IOCP is out of question as I have no experience in that so far and have to fix this application today.
Thanks Guys!
© Stack Overflow or respective owner