Why do some user agents have spam urls in them (and why are they always Opera/Presto User-Agents)?
- by Erx_VB.NExT.Coder
If you go to (say) the last 100 entries (visits) to the botsvsbrowsers.com website (exact link, feel free to take a look: http://www.botsvsbrowsers.com/recent/listings/index.html ), you'd notice that almost every User Agent that has the keywords "Opera" and "Presto" inside them, will almost certainly have a web link (URL/Web Address) inside it, and it won't just be a normal web address, but a HTML anchor tag/link to that address. Why is this so, I could not even find a single discussion about it on the internet, nowhere, I tried varying my search terms many times.
If the user agent contains the words "Opera" and "Presto" it doesnt mean it will have this weblink, but it means there is about an 80% change that it will. A typical anchor tag/link inside a user agent will look like this:
Mozilla/4.0 <a
href="http://osis-uk.co.uk/disabled-equipment">disability
equipment</a> (Windows NT 5.1; U; en) Presto/2.10.229
Version/11.60
If you check it out at the website, http://www.botsvsbrowsers.com/recent/listings/index.html you will notice that the back and forward arrows are in there unescaped format.
This isn't just true for botsvsbrowsers, but several other user agent listing sites. I'm really confused and feel line I'm in a room full of 10,000 people and am the only one seeing this ghost :).
If I'm doing statistical analysis, should I include or exclude this type of user agent from my listing (ie: are these just normal users who've set their user agents to attempt to drive some traffic to their sites as they browser the web), or is there something else going on? The fact that it is so consistent in terms of its format leads me to believe that it is an automated process (the setting or alteration of the user agent) so I cannot decide or understand the process by which this change is made (I know how to change a user agent), but unsure which program or facility is doing this, especially since it is exclusive to Opera (Presto) user agents that are beyond I think an 8 or 9 point something browser version.
I've run some statistical tests, parsing entries from all over the place, writing custom programs, to get a better understanding of this. Keep in mind that I see normal URL's in user agents infrequently, they are just text such as +http://www.someSite.com appended to a user agent normally, especially if its a crawler or bot it provided its service URL, this is normal and isnt done with an embedded link (A HREF=) etc, so I'm not talking about "those".