Is it possible to extract all PDFs from a site?
- by deming
Given a URL like www.mysampleurl.com, is it possible to crawl through the site and extract links to all the PDFs that might exist?
I've gotten the impression that Python is good for this kind of thing, but is this feasible? How would one go about implementing something like this?
Also, assume that the site does not let you visit a directory such as www.mysampleurl.com/files/ directly.
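
For what it's worth, here is a rough sketch of how the crawl could look, using only the Python standard library. Since the directory can't be listed, it can only find PDFs that are linked from pages reachable from the start URL. The names LinkParser, fetch_html and find_pdf_links, and the max_pages cap, are made up for illustration. It assumes plain HTML links (no JavaScript-generated ones), treats any URL whose path ends in .pdf as a PDF, and ignores robots.txt and rate limiting, which a real crawler should handle.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_html(url):
    """Return the page body as text, or '' if it is not HTML or cannot be fetched."""
    try:
        with urlopen(url, timeout=10) as resp:
            if "text/html" not in resp.headers.get("Content-Type", ""):
                return ""
            return resp.read().decode("utf-8", errors="replace")
    except (OSError, ValueError):
        return ""


def find_pdf_links(start_url, fetch=fetch_html, max_pages=200):
    """Breadth-first crawl of the start URL's host; returns a sorted list of PDF URLs."""
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    pdfs = set()
    pages = 0
    while queue and pages < max_pages:  # max_pages is an arbitrary safety cap
        page = queue.popleft()
        pages += 1
        parser = LinkParser()
        parser.feed(fetch(page))
        for href in parser.links:
            # Resolve relative links and drop any #fragment.
            url, _ = urldefrag(urljoin(page, href))
            parts = urlparse(url)
            if parts.scheme not in ("http", "https"):
                continue  # skip mailto:, javascript:, etc.
            if parts.path.lower().endswith(".pdf"):
                pdfs.add(url)
            elif parts.netloc == host and url not in seen:
                # Only follow links that stay on the same host.
                seen.add(url)
                queue.append(url)
    return sorted(pdfs)
```

Calling find_pdf_links("http://www.mysampleurl.com/") would then return the PDF links it managed to reach. The fetch parameter is there so the crawl logic can be tried against a fake in-memory site without touching the network. PDFs served from URLs that don't end in .pdf would be missed; checking the Content-Type header of each link would catch those, at the cost of many more requests.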