wget not respecting my robots.txt. Is there an interceptor?
- by Jane Wilkie
I run a website where I post CSV files as a free service. Recently I've noticed that wget and libwww have been scraping it pretty hard, and I was wondering how to curb that, even if only a little.
I have implemented a robots.txt policy, which I've posted below:
User-agent: wget
Disallow: /
User-agent: libwww
Disallow: /
User-agent: *
Disallow: /…
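
For reference, here's a quick sanity check that the rules parse the way I intend (a minimal sketch using Python's standard urllib.robotparser; the file path in the example is made up):

from urllib.robotparser import RobotFileParser

# The first two stanzas of my robots.txt, inlined for the test.
rules = """\
User-agent: wget
Disallow: /

User-agent: libwww
Disallow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# A client that honors robots.txt and identifies as wget or libwww
# should be refused everywhere on the site.
print(parser.can_fetch("wget", "/files/example.csv"))    # expect: False
print(parser.can_fetch("libwww", "/files/example.csv"))  # expect: False

Of course, this only tells me what a compliant client would do; wget can be told to ignore robots.txt entirely (e.g. with -e robots=off), which may be why I'm still seeing the traffic.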