Downloading a web page and all of its resource files in Python

Posted by Mark on Stack Overflow
Published on 2009-05-09T21:28:26Z Indexed on 2010/05/14 21:24 UTC

I want to download a page and all of its associated resources (images, style sheets, script files, etc.) using Python. I am somewhat familiar with urllib2 and know how to download individual URLs, but before I start hacking something together with BeautifulSoup + urllib2, I want to be sure there isn't already a Python equivalent to "wget --page-requisites http://www.google.com".

Specifically, I am interested in gathering statistics on how long it takes to download an entire web page, including all of its resources.
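
To make the goal concrete, here is a rough sketch of the hand-rolled approach I would otherwise fall back on. It is not a known library equivalent of wget, just an illustration under several assumptions: it uses Python 3's standard library (urllib.request and html.parser) in place of urllib2 and BeautifulSoup, it only looks at img, script, and stylesheet link tags, it downloads everything sequentially, and the names ResourceCollector, fetch, and time_page are made up for this example.

```python
import time
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class ResourceCollector(HTMLParser):
    """Collect absolute URLs of images, scripts, and linked stylesheets."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        url = None
        if tag in ("img", "script"):
            url = attrs.get("src")
        elif tag == "link":
            # Only follow <link> tags whose rel includes "stylesheet".
            rel = (attrs.get("rel") or "").lower().split()
            if "stylesheet" in rel:
                url = attrs.get("href")
        if url:
            absolute = urljoin(self.base_url, url)
            if absolute not in self.resources:
                self.resources.append(absolute)


def fetch(url):
    """Download one URL and return its body as bytes."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def time_page(url, fetcher=fetch):
    """Return (total_seconds, {url: seconds}) for a page and its resources.

    `fetcher` is a hypothetical hook so the download step can be swapped out.
    """
    timings = {}
    start = time.perf_counter()
    html = fetcher(url)
    timings[url] = time.perf_counter() - start

    collector = ResourceCollector(url)
    collector.feed(html.decode("utf-8", errors="replace"))

    for resource in collector.resources:
        t0 = time.perf_counter()
        try:
            fetcher(resource)
        except OSError:
            pass  # a failed request still counts toward the elapsed time
        timings[resource] = time.perf_counter() - t0

    return time.perf_counter() - start, timings
```

Calling time_page("http://www.google.com") would return the total elapsed time plus a per-URL breakdown. Because this fetches resources one at a time and ignores things like images referenced from CSS, the numbers would probably not match what wget or a browser sees, which is part of why I am hoping something more complete already exists.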

Thanks, Mark

© Stack Overflow or respective owner
