The definition of User-Agent states that several tokens can be included, as deemed necessary by the client.
I want to block certain bots via robots.txt and I am confused as to which part of the User-Agent string to use, especially for more obscure bots. For example:
Mozilla/5.0 (compatible; uMBot-LN/1.0; mailto:[email protected])
JS-Kit URL Resolver, http://js-kit.com/
Mozilla/5.0 (compatible; SEOkicks-Robot +http://www.seokicks.de/robot.html
Do I use the second token? Can tokens contain spaces, or did the SEOkicks folks forget a semicolon after SEOkicks-Robot? I don't intend to make my question specific to a couple of bots. I want to know the general guideline: which part of the UA do I place in robots.txt for these exotic bots with a UA as long as a haiku? For example, is this correct?
User-agent: uMBot-LN/1.0
Disallow: /
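To sanity-check a candidate rule locally, I put together the rough sketch below. It assumes Python's standard urllib.robotparser is a fair stand-in for a well-behaved crawler, which may not hold for these obscure bots, since each one decides for itself what name it looks up in robots.txt. The helper name is_blocked and the example.com URL are made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def is_blocked(rule_token, crawler_name, url="http://example.com/page"):
    """Return True if 'User-agent: <rule_token>' + 'Disallow: /' blocks crawler_name.

    crawler_name is the name the bot is ASSUMED to present to its own
    robots.txt matcher; real bots may use something else entirely.
    """
    parser = RobotFileParser()
    # Feed the parser an in-memory robots.txt instead of fetching one.
    parser.parse([f"User-agent: {rule_token}", "Disallow: /"])
    return not parser.can_fetch(crawler_name, url)


if __name__ == "__main__":
    # Compare the bare product token against the token with its version suffix.
    for token in ("uMBot-LN", "uMBot-LN/1.0"):
        print(token, "->", is_blocked(token, "uMBot-LN/1.0"))
```

This only shows how one parser treats the two spellings of the rule. It does not tell me which one the bot itself honours, which is the part I am asking about.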
PS: Thank you, but I do not need to hear that undesirable bots are better blocked with mod_security. I already have commercial mod_security rules in place.