Is it possible to extract all PDFs from a site?
Posted by deming on Stack Overflow
Published on 2010-04-08T14:55:16Z
Tags: web-crawler | python
Given a URL like www.mysampleurl.com, is it possible to crawl the site and extract links to all the PDFs that might exist?
I've gotten the impression that Python is well suited to this kind of task, but is it actually feasible? How would one go about implementing something like this?
Also, assume that the site does not let you browse a directory listing like www.mysampleurl.com/files/, so the PDF links have to be discovered by following pages.
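One possible approach, sketched here with only the Python standard library: do a breadth-first crawl that stays on the starting host, parse each page for `<a href>` links, and collect any link whose path ends in `.pdf`. The function names (`crawl_for_pdfs`, `LinkParser`) and the injected `fetch` callable are illustrative choices, not part of any particular library; in practice `fetch` would wrap something like `urllib.request.urlopen`.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from collections import deque


class LinkParser(HTMLParser):
    """Collects href attribute values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl_for_pdfs(start_url, fetch, max_pages=100):
    """Breadth-first crawl from start_url, staying on the same host,
    returning every discovered link whose path ends in .pdf.

    `fetch` is a callable url -> HTML string; it is injected so the
    crawl logic can be exercised without network access."""
    host = urlparse(start_url).netloc
    seen = {start_url}          # pages already queued or visited
    queue = deque([start_url])
    pdfs = set()
    while queue and len(seen) <= max_pages:
        url = queue.popleft()
        try:
            html = fetch(url)
        except Exception:
            continue  # skip pages that fail to load
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)  # resolve relative links
            if urlparse(absolute).path.lower().endswith(".pdf"):
                pdfs.add(absolute)
            elif urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return sorted(pdfs)
```

With a real fetcher this might be called as `crawl_for_pdfs("http://www.mysampleurl.com/", lambda u: urllib.request.urlopen(u).read().decode("utf-8", "replace"))`. A production crawler would also want to respect robots.txt, rate-limit requests, and check Content-Type headers rather than relying on the `.pdf` extension alone.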