Restricting crawler activity to certain directories with robots.txt
- by neimad
I would like to use robots.txt to prevent some parts of my website from being indexed. I want search engines to index only the root (/) and not crawl the paths handled by my controllers.
In my robots.txt, I have this:
User-Agent: *
Disallow: /compagnies/
Disallow: /floors/
Disallow: /spaces/
Disallow: /buildings/
Disallow: /users/
Disallow: /
I put this file in /mysite/public. I tested the file with a robots.txt validator and got no errors.
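To double-check how crawlers interpret the rules, here is a minimal sketch (https://www.example.com is a placeholder for my actual domain) that fetches the robots.txt actually being served and reports which paths a generic crawler is allowed to visit:

# Minimal sketch: fetch the deployed robots.txt and check a few sample paths.
# https://www.example.com is a placeholder for the real site.
from urllib.robotparser import RobotFileParser

parser = RobotFileParser("https://www.example.com/robots.txt")
parser.read()  # downloads and parses the file that is actually served

for path in ["/", "/compagnies/", "/floors/1", "/spaces/2", "/users/42"]:
    print(path, "allowed" if parser.can_fetch("*", path) else "blocked")

If the deployed file matches the one above, every path, including /, should print as blocked because of the Disallow: / test line.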
However, Google still returns results for my site. As a test, I even added Disallow: /, but Google indexed all of the pages anyway.
floors, spaces, buildings, etc. are not physical directories. Is this a bug? How can I work around it?