How should I deal with user agent parsing in logs?
Posted by Mr. Jefferson on Pro Webmasters
Published on 2012-02-17T19:41:05Z
logging | user-agent
My web app project includes logging so we can see where visitors come from (referrer URL), which user agents are most common, which pages are most popular, and so on. The log is stored in SQL Server, and when I query the user agents I use a large (almost 100 lines) and growing CASE statement to classify them by string matching (e.g. if the user agent contains the string "Firefox/9", it's Firefox 9). Is there a better way to do this, so I don't have to keep adding to that CASE statement for every new browser release?
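For illustration only, here is a minimal sketch of one alternative to per-version string matching: capture the major version with a pattern, so that "Firefox/9" and "Firefox/10" fall under the same rule. It is written in Python rather than T-SQL, and the rule list, the name `BROWSER_PATTERNS`, and the function `classify_user_agent` are assumptions made for this example, not part of the original setup. The list is deliberately incomplete and would need more rules for real traffic.

```python
import re

# Hypothetical, incomplete rule list. Order matters: Chrome user agents
# also contain "Safari/", so Chrome must be tested before Safari.
BROWSER_PATTERNS = [
    ("Chrome", re.compile(r"Chrome/(\d+)")),
    ("Firefox", re.compile(r"Firefox/(\d+)")),
    ("Internet Explorer", re.compile(r"MSIE (\d+)")),
    ("Safari", re.compile(r"Version/(\d+).*Safari/")),
]


def classify_user_agent(ua):
    """Return (browser_name, major_version), or ("Unknown", None)."""
    for name, pattern in BROWSER_PATTERNS:
        match = pattern.search(ua)
        if match:
            # The captured group is the major version, so a new release
            # is picked up without adding a new rule.
            return name, int(match.group(1))
    return "Unknown", None
```

With rules like these, a new browser release only changes the captured number; a new rule is needed only when a new browser family appears.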
Also, how should I deal with less common, weird/unknown user agents? I've seen the following in the logs and been unable to find good information online about what they are:
WordPress/3.3.1; http://www.facecolony.org
Mozilla/4.0 ( http://www.hairirons.org redips; <a href=http://hairirons.org/>chi hair iron</a>)
I'd guess they're bots/crawlers, but the sites they point to don't appear to mention web crawlers (and sometimes aren't even available). I've seen other user agents that aren't familiar to me, but I know they're bots because they include "bot", "spider", or something similar.
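The keyword heuristic described above could be sketched roughly as follows. This is an assumption-laden illustration, not a reliable detector: the category labels, the patterns, and the function name `bucket_user_agent` are all made up for this example, and a substring match on "bot" will misfire on any legitimate user agent that happens to contain those letters.

```python
import re

# Hypothetical keyword list based on the hints mentioned in the post.
BOT_KEYWORDS = re.compile(r"bot|spider|crawl", re.IGNORECASE)
# User agents that carry a URL or HTML anchor markup, like the two examples.
EMBEDDED_LINK = re.compile(r"https?://|<a\s+href", re.IGNORECASE)


def bucket_user_agent(ua):
    """Sort a user agent string into a coarse, heuristic bucket."""
    if BOT_KEYWORDS.search(ua):
        return "declared-bot"
    if EMBEDDED_LINK.search(ua):
        # No bot keyword, but the string advertises a site.
        return "link-bearing"
    return "other"
```

Grouping the leftovers this way at least separates self-declared crawlers from strings that only advertise a site, which can then be reviewed by hand.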
© Pro Webmasters or respective owner