Strange robots.txt - how and why did it get there?
- by Mick
I recently created a very simple, pure HTML website, which I host with "hostmonster". Hostmonster had very good reviews on a comparison website, and so far they have appeared to be perfectly good in every way... At least I thought so until just now...
I have been making lots of edits to my site on an almost daily basis. My site now appears on the first page (7th on the list) for my most important keyphrase in a Google search. But I did notice a problem with the snippet chosen by Google. I asked a question on this site about snippets and got some great answers. I then made some modifications to my metadata, and within 48 hours the Google snippet for my search was perfect.
The odd thing, though, was that the "cached" version Google had was still very old, about three weeks. How could the Google robots have read my new metadata without updating the cache? This puzzled me greatly.
Just now it occurred to me that maybe I had some goofy setting in my robots.txt file. I didn't remember ever making one, but I thought I'd have a look just in case. Much to my horror, I saw that there was a robots.txt and it contained the disturbing text below:
sitemap: http://cdn.attracta.com/sitemap/728687.xml.gz
Intuitively this looks like some kind of junk, spam trick, and I had indeed been getting some spam from "attracta".
So my questions are:
1. Should I simply delete this robots.txt?
2. Was the file there all along, placed there because of some commercial tie-in between attracta and hostmonster?
3. Does the attracta robots file explain the lack of re-caching?
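For question 3, one way to check what the file actually tells a crawler is to feed its exact contents to a robots.txt parser. The following is only a minimal sketch, assuming Python 3.8 or later (for `site_maps()`); the `http://example.com/index.html` URL is a placeholder, since the real site address is not given here, and the result reflects how Python's standard-library parser reads the file, not a statement about how Google itself behaves.

```python
from urllib.robotparser import RobotFileParser

# The exact one-line content found in the robots.txt above.
ROBOTS_TXT = "sitemap: http://cdn.attracta.com/sitemap/728687.xml.gz\n"


def parse_robots(text):
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = parse_robots(ROBOTS_TXT)

# Placeholder URL (assumption): substitute a real page from the site.
allowed = parser.can_fetch("Googlebot", "http://example.com/index.html")

# Sitemap URLs declared in the file (site_maps() needs Python 3.8+).
sitemaps = parser.site_maps()

print("Googlebot allowed:", allowed)
print("Sitemaps declared:", sitemaps)
```

Since the file contains no `User-agent` or `Disallow` lines, this parser reports no crawl restrictions; the only thing the file declares is the sitemap location.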