Detecting well behaved / well known bots
- by Simon_Weaver
I found this question very interesting: Programmatic Bot Detection
I have a very similar question, but I'm not bothered about 'badly behaved bots'.
I am tracking (in addition to Google Analytics) the following per visit:
Entry URL
Referer
UserAgent
AdWords (by means of query string)
Whether or not the user made a purchase
etc.
The problem is that to calculate any kind of conversion rate I'm ending up with lots of 'bot' visits that are greatly skewing my results.
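To make the skew concrete, here is a minimal sketch with made-up numbers. The `Visit` record, its field names, and the `is_bot` flag are hypothetical stand-ins for the fields listed above, not my actual schema; how to set `is_bot` reliably is exactly what I'm asking about.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    # Hypothetical record mirroring the per-visit fields listed above.
    entry_url: str
    referer: str
    user_agent: str
    purchased: bool
    is_bot: bool  # hypothetical flag; setting it is the open question


def conversion_rate(visits):
    """Fraction of visits that ended in a purchase (0.0 if there are none)."""
    visits = list(visits)
    if not visits:
        return 0.0
    return sum(v.purchased for v in visits) / len(visits)


# Made-up data: 10 human visits (2 of them purchases) plus 30 bot visits.
humans = [Visit("/", "", "Mozilla/5.0", i < 2, False) for i in range(10)]
bots = [Visit("/", "", "Googlebot/2.1", False, True) for _ in range(30)]
all_visits = humans + bots

raw_rate = conversion_rate(all_visits)  # 2 purchases over 40 visits
human_rate = conversion_rate(v for v in all_visits if not v.is_bot)  # 2 over 10
```

With these invented numbers the bot traffic drags the apparent rate from 20% down to 5%, which is the kind of distortion I'm seeing.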
I'd like to ignore as many bot visits as possible, but I want a solution that I don't need to monitor too closely, that won't itself be a performance hog, and that preferably still works if someone has JavaScript disabled.
Are there good published lists of the top 100 bots or so? I did find a list at http://www.user-agents.org/ but that appears to contain hundreds if not thousands of bots. I don't want to check every user agent against thousands of entries.
Here is the current Googlebot UserAgent. How often does it change?
Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
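
The kind of check I have in mind would look roughly like the sketch below. It assumes a short, hand-picked tuple of product tokens (`KNOWN_BOT_TOKENS` and `is_known_bot` are names I made up, and the tokens are illustrative examples of well-known crawlers, not a published or complete list). It matches on the token rather than the full string, so a version bump such as a change to the `2.1` part would not break it.

```python
# Illustrative tokens only: an assumption, not an authoritative list.
KNOWN_BOT_TOKENS = (
    "googlebot",
    "bingbot",
    "slurp",
    "duckduckbot",
    "baiduspider",
    "yandexbot",
)


def is_known_bot(user_agent):
    """Return True if the User-Agent contains one of the known bot tokens.

    Case-insensitive substring match; a missing User-Agent counts as not a bot.
    """
    ua = (user_agent or "").lower()
    return any(token in ua for token in KNOWN_BOT_TOKENS)


googlebot_ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
browser_ua = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"

googlebot_flagged = is_known_bot(googlebot_ua)
browser_flagged = is_known_bot(browser_ua)
```

This runs server-side on the UserAgent I already record, so it needs no JavaScript, and a handful of substring checks per visit should be cheap. It only catches bots that identify themselves honestly, which is all I'm after here.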