Screen scraping: getting around "HTTP Error 403: request disallowed by robots.txt"
Posted
by Diego
on Stack Overflow
Published on 2010-05-17T00:35:43Z
Is there a way to get around the following?

httperror_seek_wrapper: HTTP Error 403: request disallowed by robots.txt

Is the only way around this to contact the site owner (barnesandnoble.com)? I'm building a site that would bring them more sales, so I'm not sure why they would deny access at a certain depth.
I'm using mechanize and BeautifulSoup on Python 2.6, and I'm hoping for a workaround.
© Stack Overflow or respective owner