MS Bing web crawler out of control causing our site to go down
- by akaDanPaul
Here is a weird one that I am not sure how to handle. Today our company's e-commerce site went down. I tailed the production log and saw that we were receiving a ton of requests from the IP range 157.55.98.0/157.55.100.0. I googled around and found out that it belongs to an MSN web crawler.
So essentially the MS web crawler overloaded our site until it stopped responding, even though our robots.txt file contains the following:
Crawl-delay: 10
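For reference, here is a minimal sketch of how a crawler that honors the directive might read it, using Python's standard `urllib.robotparser`. The `User-agent: *` line and the `bingbot` agent name are assumptions for illustration; only the `Crawl-delay: 10` line is from our actual file.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt: only the Crawl-delay line is from the real file;
# the "User-agent: *" line is a guess at the group it belongs to.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# "bingbot" is used here only as an example user-agent token.
delay = parser.crawl_delay("bingbot")

# A crawler waiting `delay` seconds between requests can send
# at most 1 / delay requests per second.
max_rate = 1 / delay

print(f"crawl delay: {delay} s, implied ceiling: {max_rate} requests/second")
```

If the directive were honored, a single crawler would stay around one request every 10 seconds.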
As a stopgap, I banned the IP range in iptables.
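Firewall rules are usually written against CIDR blocks rather than a start/end pair, so here is a rough sketch of converting the range with Python's `ipaddress` module. It assumes the range runs from 157.55.98.0 through 157.55.100.255; the exact upper bound is my guess, not something confirmed anywhere.

```python
import ipaddress

first = ipaddress.IPv4Address("157.55.98.0")
# Assumption: the range is taken to include all of 157.55.100.x.
last = ipaddress.IPv4Address("157.55.100.255")

# summarize_address_range yields the smallest list of CIDR networks
# that exactly covers every address from `first` to `last`.
blocks = list(ipaddress.summarize_address_range(first, last))

for block in blocks:
    print(block)
```

Each printed block can then be used as the source address of one firewall rule.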
What I am not sure about is how to follow up. I can't find any way to contact Bing about this issue, and I don't want to keep those IPs blocked because I am sure we will eventually get de-indexed from Bing. It also doesn't seem like this has happened to anyone else before.
Any suggestions?
Update: My Server / Web Stats
Our web server runs Nginx, Rails 3, and 5 Unicorn workers. We have 4 GB of memory and 2 virtual cores. We have been running this setup for over 9 months and have never had an issue; 95% of the time the system is under very little load. On average we receive 800,000 page views a month, and this never comes close to slowing down, let alone bringing down, our web server.
Looking at the logs, we were receiving anywhere from 5 to 40 requests per second from this IP range.
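A minimal sketch of how a per-second figure like that can be pulled from an access log. It assumes the default Nginx "combined" log format and the same 157.55.98.0 through 157.55.100.255 range as an assumption; the sample lines are made up for illustration and are not from our real log.

```python
import ipaddress
import re
from collections import Counter

# Assumed bounds of the crawler range (the upper bound is a guess).
FIRST = ipaddress.IPv4Address("157.55.98.0")
LAST = ipaddress.IPv4Address("157.55.100.255")

# Captures the client IP and the timestamp (to the second) from a
# line in Nginx "combined" format: IP - user [time zone] "request" ...
LINE_RE = re.compile(r"^(\d+\.\d+\.\d+\.\d+) \S+ \S+ \[([^\]\s]+)")


def requests_per_second(lines):
    """Count requests from the crawler range, keyed by timestamp."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue
        try:
            ip = ipaddress.IPv4Address(match.group(1))
        except ValueError:
            continue  # skip malformed addresses such as 999.1.1.1
        if FIRST <= ip <= LAST:
            counts[match.group(2)] += 1
    return counts


# Made-up sample lines, for illustration only.
SAMPLE_LOG = [
    '157.55.98.12 - - [01/Jan/2013:10:00:00 +0000] "GET /products/1 HTTP/1.1" 200 512',
    '157.55.99.7 - - [01/Jan/2013:10:00:00 +0000] "GET /products/2 HTTP/1.1" 200 512',
    '203.0.113.5 - - [01/Jan/2013:10:00:00 +0000] "GET / HTTP/1.1" 200 512',
    '157.55.100.3 - - [01/Jan/2013:10:00:01 +0000] "GET /products/3 HTTP/1.1" 200 512',
]

counts = requests_per_second(SAMPLE_LOG)
for second, hits in sorted(counts.items()):
    print(second, hits)
```

For comparison, a 10-second crawl delay works out to 0.1 requests per second, so 5 to 40 requests per second is 50 to 400 times that rate.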
In all my years of web development I have never seen a crawler hit a website so many times.
Is this new with Bing?