How to secure a robots.txt file?
Posted by CompilingCyborg on Pro Webmasters, 2012-08-10
I would like user agents to index only my relative pages, without accessing any directory on my server.
As an initial thought, I had this version in mind:
User-agent: *
Disallow: */*
Sitemap: http://www.mydomain.com/sitemap.xml
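
One way to sanity-check rules like these before publishing them is Python's standard urllib.robotparser module. It implements the original robots.txt standard, which has no wildcard matching, so the sketch below shows how a strictly standard parser reads the proposed file (the domain is the placeholder from above). Google's parser supports * as a nonstandard extension and would likely treat */* as matching every path instead.

from urllib import robotparser

# Parse the proposed rules directly instead of fetching them from a server.
rp = robotparser.RobotFileParser()
rp.parse("""\
User-agent: *
Disallow: */*
Sitemap: http://www.mydomain.com/sitemap.xml
""".splitlines())

# Under the original standard, "*" has no special meaning inside a path,
# so "Disallow: */*" is a literal prefix that matches no real URL.
# Every check below therefore prints True; nothing is actually blocked.
for url in ("http://www.mydomain.com/",
            "http://www.mydomain.com/page.html",
            "http://www.mydomain.com/somedir/file.html"):
    print(url, rp.can_fetch("*", url))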
My Questions:
- Is it correct to block all directories like that, with Disallow: */*?
- Would search engines still be able to see and index my sitemap if I disallowed all directories? (A sketch of the standard form follows this list.)
- What are the best practices for securing the robots.txt file?
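
On the second question, a comparison point: the standard way to block every path is a bare Disallow: /, and the Sitemap directive is parsed independently of user-agent groups, so parsers still surface it even when all crawling is disallowed. A minimal sketch, assuming the same placeholder domain (RobotFileParser.site_maps() needs Python 3.8+):

from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.parse("""\
User-agent: *
Disallow: /

Sitemap: http://www.mydomain.com/sitemap.xml
""".splitlines())

# "/" is a prefix of every path, so all crawling is disallowed ...
print(rp.can_fetch("*", "http://www.mydomain.com/page.html"))  # False

# ... yet the Sitemap reference is still readable, because the directive
# sits outside any user-agent group.
print(rp.site_maps())  # ['http://www.mydomain.com/sitemap.xml']

That said, a page robots.txt blocks is not crawled, so its content cannot be indexed from the sitemap alone; if the goal is to have pages indexed, disallowing everything defeats it.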
For reference, here is a good tutorial for robots.txt:
#Add this if you want to stop Alexa from indexing your site.
User-agent: ia_archiver
Disallow: /
#Add this to stop duggmirror
User-agent: duggmirror
Disallow: /
#Add this to allow specific agents
User-agent: Googlebot
Disallow:
#Add this to allow all agents while blocking specific directories
User-agent: *
Disallow: /cgi-bin/
Disallow: /*?*
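
The same urllib.robotparser approach can confirm how these per-agent groups resolve. A minimal sketch with the placeholder domain again; the /*?* line is left out because it relies on the nonstandard wildcard extension, which this parser would treat as a literal path, and "SomeBot" is a hypothetical agent name for the fall-through case:

from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.parse("""\
User-agent: ia_archiver
Disallow: /

User-agent: duggmirror
Disallow: /

User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /cgi-bin/
""".splitlines())

# ia_archiver matches its own group and is blocked from everything;
# Googlebot matches its group, whose empty Disallow permits everything;
# any other agent falls through to "*" and is blocked only from /cgi-bin/.
print(rp.can_fetch("ia_archiver", "http://www.mydomain.com/"))         # False
print(rp.can_fetch("Googlebot", "http://www.mydomain.com/cgi-bin/x"))  # True
print(rp.can_fetch("SomeBot", "http://www.mydomain.com/cgi-bin/x"))    # False
print(rp.can_fetch("SomeBot", "http://www.mydomain.com/page.html"))    # True

As for "securing" robots.txt itself: the file is purely advisory. Well-behaved crawlers honor it, but it blocks nothing on its own, and it publicly lists the paths you would rather hide, so sensitive directories belong behind authentication or server access rules, not behind a Disallow line.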