Why do people crawl sites without downloading pictures?
- by Michael
Let me show you what I mean:
IP Pages Hits Bandwidth
85.xx.xx.xxx 236 236 735.00 KB
195.xx.xxx.xx 164 164 533.74 KB
95.xxx.xxx.xxx 90 90 293.47 KB
It's very clear that these person are crawling my site with bots. There's no way that you could visit my site and use <1MB bandwidth. You might say that there's the possibility that they could be browsing the site using some browser or plug-in that does not download images, js/css files, etc., but the simple fact of the matter is that there are not 90-236 pages that are linked from the home page (outside of WP files), even if you visited every page twice.
I could understand if these people were crawling the site for pictures, but once again, the bandwidth indicates that this isn't what is happening. Why, then, would they crawl the site to simply view the HTML/txt/js/etc. files?
The only thing that I can come up with is that they are scanning for outdated versions of WordPress, SQL injection vulnerabilities, etc., which makes me inclined to outright ban the IPs, but I'm curious, is it possible that this person is a legitimate user, or at the very least, not intending to be harmful?