Disallow robots.txt from being accessed in a browser but still accessible by spiders?
Posted by Michael Irigoyen on Pro Webmasters
Published on 2011-02-14T22:06:08Z
We use a robots.txt file to prevent Google (and other search spiders) from crawling certain pages and directories on our domain. Some of these directories and files are secret, meaning they aren't linked from anywhere (except perhaps from other pages that are also covered by robots.txt). Others aren't secret; we just don't want them indexed.
If somebody browses directly to www.mydomain.com/robots.txt, they can see the contents of the robots.txt file. From a security standpoint, we don't want this list publicly available to just anybody. Any directories that contain secure information are behind authentication, but we still don't want them to be discoverable unless the user already knows about them.
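To illustrate the concern, here is a minimal sketch in Python of how little effort it takes to pull the disallowed paths out of a robots.txt body. The file contents and paths below are invented for the example and are not our real rules; `disallowed_paths` is a hypothetical helper name.

```python
# Hypothetical robots.txt contents; every path here is made up for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /internal-reports/
Disallow: /staging/
Disallow: /old-site/private.html
"""


def disallowed_paths(robots_txt):
    """Return every non-empty Disallow path listed in a robots.txt body."""
    paths = []
    for line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        # Field names are case-insensitive; an empty Disallow blocks nothing.
        if sep and field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


if __name__ == "__main__":
    for path in disallowed_paths(ROBOTS_TXT):
        print(path)
```

Anyone who can fetch the file can do the same, which is why the list effectively acts as a directory of the places we would rather not advertise.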
Is there a way to provide a robots.txt file but mask its presence from John Doe accessing it in his browser? Perhaps by using PHP to generate the document based on certain criteria? Perhaps something I'm not thinking of? We'd prefer a way to do it centrally (meaning a <meta> tag solution is less than ideal).
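As a rough sketch of the "generate the document based on certain criteria" idea, the logic might look something like the following. It is written in Python as a stand-in for the PHP we would actually use. It assumes the criterion is the User-Agent request header; the crawler tokens, the rule bodies, and the `robots_body` function name are all hypothetical. Since the User-Agent header is supplied by the client and can be set to anything, this would only hide the real rules from casual visitors, not from someone determined.

```python
# Illustrative crawler tokens only; a real list would need to be maintained.
CRAWLER_TOKENS = ("googlebot", "bingbot", "slurp")

# Hypothetical rule sets: the real one names a made-up path,
# the public one disallows nothing.
REAL_RULES = "User-agent: *\nDisallow: /staging/\n"
PUBLIC_RULES = "User-agent: *\nDisallow:\n"


def robots_body(user_agent):
    """Pick the robots.txt body to serve for a given User-Agent header value."""
    ua = (user_agent or "").lower()
    # Substring match against known crawler tokens (assumption: this is the
    # only criterion; the header is trivially spoofable).
    if any(token in ua for token in CRAWLER_TOKENS):
        return REAL_RULES
    return PUBLIC_RULES


if __name__ == "__main__":
    print(robots_body("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"))
    print(robots_body("Mozilla/5.0 (Windows NT 6.1; rv:2.0) Gecko/20100101 Firefox/4.0"))
```

The server would route requests for /robots.txt to a handler like this, which keeps the rules in one central place instead of spreading them across per-page tags.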
© Pro Webmasters or respective owner