robots.txt file with more restrictive rules for certain user agents
Posted by Carson63000 on Server Fault
Published on 2010-06-08T06:04:57Z
robots.txt
Hi,
I'm a bit vague on the precise syntax of robots.txt, but what I'm trying to achieve is:
- Tell all user agents not to crawl certain pages
- Tell certain user agents not to crawl anything
(basically, some pages with enormous amounts of data should never be crawled; and some voracious but useless search engines, e.g. Cuil, should never crawl anything)
If I do something like this:
User-agent: *
Disallow: /path/page1.aspx
Disallow: /path/page2.aspx
Disallow: /path/page3.aspx

User-agent: twiceler
Disallow: /
…will it flow through as expected, with all user agents matching the first rule and skipping page1, page2, and page3, and twiceler matching the second rule and skipping everything?
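One way to check behaviour like this without deploying anything is to feed the rules to Python's standard-library `urllib.robotparser` and query it for different user agents. This is a sketch, not an answer about any particular crawler's behaviour (real crawlers may interpret group matching differently); `example.com` is a placeholder domain:

```python
from urllib import robotparser

# The rules from the question, as a parser would fetch them.
rules = """\
User-agent: *
Disallow: /path/page1.aspx
Disallow: /path/page2.aspx
Disallow: /path/page3.aspx

User-agent: twiceler
Disallow: /
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# A generic crawler matches only the "*" group: blocked from the
# listed pages, allowed everywhere else.
print(rp.can_fetch("Googlebot", "http://example.com/path/page1.aspx"))  # False
print(rp.can_fetch("Googlebot", "http://example.com/other.aspx"))       # True

# twiceler matches its own, more specific group and is blocked everywhere.
print(rp.can_fetch("twiceler", "http://example.com/other.aspx"))        # False
```

Note the caveat this exposes: under the usual interpretation of the protocol, a crawler obeys only the most specific matching group, so twiceler follows its own `Disallow: /` group rather than combining it with the `*` group. Here that still gives the intended result, since twiceler is blocked from everything anyway.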