Which token from a long User-Agent should I use in robots.txt?

Posted by Gaia on Pro Webmasters
Published on 2013-10-28T11:18:45Z Indexed on 2013/10/28 22:13 UTC

The HTTP definition of the User-Agent header allows a client to include several tokens, as it sees fit.

I want to block certain bots via robots.txt and I am confused as to which part of the User-Agent string to use, especially for more obscure bots. For example:

Mozilla/5.0 (compatible; uMBot-LN/1.0; mailto: [email protected])
JS-Kit URL Resolver, http://js-kit.com/
Mozilla/5.0 (compatible; SEOkicks-Robot +http://www.seokicks.de/robot.html

Do I use the second token? Can tokens contain spaces, or did the SEOkicks folks forget a semicolon after SEOkicks-Robot? I don't intend to make this question specific to a couple of bots. I want to know the general guideline: which part of the UA do I put in robots.txt for these exotic bots whose UA strings are as long as a haiku? For instance, is this right?

User-agent: uMBot-LN/1.0
Disallow: /
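To make the "which token" question concrete, here is a rough Python sketch that lists the candidate name tokens in each of the UA strings above. It is only a heuristic under my own assumptions: `candidate_names` and the `GENERIC` word list are made up for illustration, the version suffix (`/1.0`) is stripped, and only the first word of each part is kept. It does not settle which name a given bot actually looks for in robots.txt; that is up to the bot.

```python
import re

# Assumption: words that show up in many UA strings and do not identify a bot.
GENERIC = {"mozilla", "compatible"}

def candidate_names(user_agent):
    """Heuristically list name tokens in a User-Agent string, version stripped."""
    names = []
    # Treat parentheses, semicolons and commas as separators between parts.
    for part in re.split(r"[();,]", user_agent):
        words = part.split()
        if not words:
            continue
        first = words[0]
        # Skip contact details such as URLs and mailto: addresses.
        if "://" in first or first.lower().startswith(("mailto:", "+")):
            continue
        name = first.split("/")[0]  # drop a "/1.0" style version suffix
        if name.lower() not in GENERIC:
            names.append(name)
    return names

examples = [
    "Mozilla/5.0 (compatible; uMBot-LN/1.0; mailto: [email protected])",
    "JS-Kit URL Resolver, http://js-kit.com/",
    "Mozilla/5.0 (compatible; SEOkicks-Robot +http://www.seokicks.de/robot.html",
]
for ua in examples:
    print(candidate_names(ua))
# ['uMBot-LN']
# ['JS-Kit']
# ['SEOkicks-Robot']
```

Note that for the JS-Kit string the sketch keeps only the first word, so it cannot tell whether the bot thinks of itself as "JS-Kit" or "JS-Kit URL Resolver".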

PS: Thank you, but I do not need to hear that undesirable bots are better blocked with mod_security. I already have commercial mod_security rules in place.

© Pro Webmasters or respective owner