Restricting crawler activity to certain directories with robots.txt
Posted by neimad on Pro Webmasters
Published on 2011-11-05T13:29:00Z
ruby-on-rails | robots.txt
I would like to use robots.txt to keep parts of my website out of search results: search engines should index only the / directory, not the pages served by my controllers.
In my robots.txt, I have this:
User-Agent: *
Disallow: /compagnies/
Disallow: /floors/
Disallow: /spaces/
Disallow: /buildings/
Disallow: /users/
Disallow: /
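As I understand it, a crawler treats each Disallow value as a URL-path prefix, so the final Disallow: / line on its own blocks every URL on the site, including the / page I want indexed. A minimal Ruby sketch of that prefix matching (the blocked? helper is mine, just for illustration):

```ruby
# Minimal sketch of robots.txt Disallow matching: each rule is a URL-path
# prefix, and a path is blocked when any rule is a prefix of it.
DISALLOW = ["/compagnies/", "/floors/", "/spaces/", "/buildings/", "/users/", "/"].freeze

def blocked?(path, rules = DISALLOW)
  rules.any? { |rule| path.start_with?(rule) }
end

puts blocked?("/floors/12")  # the /floors/ rule matches this route
puts blocked?("/")           # the bare "Disallow: /" rule blocks the root too
```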
I put this file in /mysite/public. I tested it with a robots.txt validator and got no errors.
However, Google still returns results from my site. For testing, I even added Disallow: /, but Google indexed all the pages anyway.
floors, spaces, buildings, etc. are not physical directories. Is this a bug? How can I work around it?
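For what it's worth, robots.txt rules are matched against the URL path of each request, never against the server's filesystem, so it should make no difference that floors and the others are routes rather than directories on disk. If the goal really is to allow only the root page, one option would be a file like the following (a sketch relying on Allow and the $ end-anchor, which are Googlebot extensions rather than part of the original robots.txt standard):

```
User-agent: *
Allow: /$
Disallow: /
```

With those extensions, /$ matches only the bare root URL, while Disallow: / blocks everything else.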
© Pro Webmasters or respective owner