Screen scraping: getting around "HTTP Error 403: request disallowed by robots.txt"

Posted by Diego on Stack Overflow
Published on 2010-05-17T00:35:43Z

Is there a way to get around the following?

httperror_seek_wrapper: HTTP Error 403: request disallowed by robots.txt

Is the only way around this to contact the site owner (barnesandnoble.com)? I'm building a site that would bring them more sales, so I'm not sure why they would deny access at a certain depth.

I'm using mechanize and BeautifulSoup on Python 2.6.

I'm hoping for a work-around.
