What's the requests/second standard for scraping websites?
Posted by feydr on Stack Overflow
Published on 2010-05-29T22:24:43Z
Indexed on 2010/05/29 22:32 UTC
screen-scraping | etiquette
This is the closest existing question to mine, and imo it wasn't answered very well:
http://stackoverflow.com/questions/2022030/web-scraping-etiquette
I'm looking for the answer to #1:
How many requests/second should you be doing to scrape?
Right now I pull from a queue of links. Every site that gets scraped has its own thread, which sleeps for 1 second between requests. I ask for gzip compression to save bandwidth.
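For concreteness, here is a minimal sketch of that setup, assuming Python and only its standard library. The names fetch, site_worker and scrape are hypothetical, the 1-second delay is just the value from this post (not a recommended rate), and grouping links by hostname is my assumption about what "every site" means.

```python
import gzip
import queue
import threading
import time
import urllib.request
from urllib.parse import urlsplit

DELAY_SECONDS = 1.0  # value from the post; an assumption, not a standard


def fetch(url):
    """Fetch a URL, asking for gzip and decompressing if the server used it."""
    request = urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})
    with urllib.request.urlopen(request, timeout=30) as response:
        body = response.read()
        if response.headers.get("Content-Encoding") == "gzip":
            body = gzip.decompress(body)
    return body


def site_worker(links, results, fetch_func=fetch, delay=DELAY_SECONDS):
    """Drain one site's queue, sleeping after each request."""
    while True:
        try:
            url = links.get_nowait()
        except queue.Empty:
            return
        try:
            results.append((url, fetch_func(url)))
        except OSError as exc:  # urllib.error.URLError subclasses OSError
            results.append((url, exc))
        time.sleep(delay)


def scrape(urls, fetch_func=fetch, delay=DELAY_SECONDS):
    """Group URLs by host and run one throttled thread per host."""
    per_site = {}
    for url in urls:
        per_site.setdefault(urlsplit(url).netloc, queue.Queue()).put(url)
    results = []  # list.append is thread-safe in CPython
    threads = [
        threading.Thread(target=site_worker, args=(q, results, fetch_func, delay))
        for q in per_site.values()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling scrape(list_of_urls) returns (url, body) pairs, or (url, exception) when a request fails. The fetch_func parameter exists only so the throttling can be exercised with a stub instead of real network traffic.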
Are there standards for this? Surely all the big search engines follow some set of guidelines for request rates.
© Stack Overflow or respective owner