Massive 404 attack with non-existent URLs. How to prevent this?
- by tattvamasi
The problem is a whole load of 404 errors, as reported by Google Webmaster Tools, for pages and queries that have never existed on the site. One of them is viewtopic.php, and I've also noticed a scary number of attempts to check whether the site is a WordPress site (wp_admin) and to find the cPanel login. I already block TRACE, and the server is equipped with some defense against scanning/hacking. However, this doesn't seem to stop. The referrer is, according to Google Webmaster Tools, totally.me.
I have looked for a solution to stop this, because it certainly isn't good for real users, let alone the SEO concerns.
I am using the Perishable Press mini black list (found here), a standard referrer blocker (for porn, herbal, casino sites), and even some software to protect the site (XSS blocking, SQL injection blocking, etc.). The server is using other measures as well, so one would assume that the site is safe (hopefully), but the requests keep coming.
Does anybody else have the same problem, or am I the only one seeing this? Is it what I think, i.e., some sort of attack? Is there a way to fix it, or better, prevent this useless resource waste?
EDIT
I've never used a question to thank people for their answers, and I hope this is acceptable. Thank you all for your insightful replies, which helped me find my way out of this. I have followed everyone's suggestions and implemented the following:
a honeypot
a script that watches for suspect URLs in the 404 page and sends me an email with the user agent and IP, while still returning a standard 404 header
a script, in the same custom 404 page, that helps legitimate users in case they end up clicking on one of those URLs
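For illustration only, the logic of the second and third items in the list above could be sketched in Python roughly as follows. This is not the actual script: the pattern list, the function names `is_suspect` and `handle_not_found`, and the `notify` callback (which stands in for sending an email) are all assumptions made for the example.

```python
import re
from urllib.parse import urlsplit

# Hypothetical probe patterns, based only on the paths mentioned in the
# question; a real list would be built from the site's own logs.
SUSPECT_PATTERNS = [
    re.compile(r"viewtopic\.php", re.IGNORECASE),
    re.compile(r"wp[-_]admin", re.IGNORECASE),
    re.compile(r"cpanel", re.IGNORECASE),
]


def is_suspect(url: str) -> bool:
    """Return True if the path of a missing URL looks like a known probe."""
    path = urlsplit(url).path
    return any(pattern.search(path) for pattern in SUSPECT_PATTERNS)


def handle_not_found(url, ip, user_agent, notify):
    """Build the response for a missing page; the status is always 404.

    `notify` is a placeholder callable (for example, a function that
    sends an email) that receives one line of text per suspect request.
    """
    if is_suspect(url):
        notify(f"Suspect 404: {url} from {ip} ({user_agent})")
        body = "Not Found"
    else:
        # A real visitor who followed a dead link gets something useful.
        body = "Not Found. Try the home page or the site search."
    return 404, body
```

Keeping the status at 404 in both branches means search engines and scanners see the same standard response, while only the page body and the notification differ.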
In less than 24 hours I have been able to isolate some suspect IPs, all listed in Spamhaus. All the IPs logged so far belong to spam VPS hosting companies.
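A logged IP can be checked against a DNS blocklist by reversing its octets and resolving the result under the list's zone. The sketch below assumes the `zen.spamhaus.org` zone and IPv4 addresses only; the helper names `dnsbl_query_name` and `is_listed` are made up for the example, and Spamhaus may refuse or rate-limit queries sent through public resolvers, so treat it as a starting point rather than a finished checker.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str = "zen.spamhaus.org") -> str:
    """Build the DNSBL lookup name: reversed IPv4 octets plus the zone."""
    addr = ipaddress.ip_address(ip)
    if addr.version != 4:
        raise ValueError("this sketch handles IPv4 addresses only")
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str = "zen.spamhaus.org",
              resolve=socket.gethostbyname) -> bool:
    """Return True if the lookup resolves to a 127.0.0.x listing code.

    `resolve` is injectable so the function can be exercised without
    network access. A failed lookup means "not listed"; answers outside
    127.0.0.x (such as 127.255.255.x error codes) are also ignored.
    """
    try:
        answer = resolve(dnsbl_query_name(ip, zone))
    except OSError:
        return False
    return answer.startswith("127.0.0.")
```

For example, `dnsbl_query_name("192.0.2.99")` produces `99.2.0.192.zen.spamhaus.org`, which is the name that would actually be resolved.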
Thank you all again. I would have accepted all the answers if I could.