How to crawl a file hosting website with Scrapy in Python?
Posted by Veryel Hua on Stack Overflow
Published on 2012-08-28T03:17:28Z
Indexed on 2012/08/29 3:38 UTC
Can anyone help me figure out how to crawl a file hosting website like filefactory.com? I don't want to download the hosted files, just index all the available files with Scrapy.
I have read the tutorial and the docs on Scrapy's spider classes. If I give only the site's main page as the start URL, I won't crawl the whole site, because crawling depends on following links and the start page doesn't seem to link to any file pages. That's the problem as I see it, and any help would be appreciated!
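To make the problem concrete, here is a minimal sketch in plain Python (not Scrapy itself) of what a link-following crawl can and cannot reach. The link graph and every path in it are made up for illustration and are not real filefactory.com URLs. The only assumption is that the file pages exist but nothing reachable from the main page links to them.

```python
from collections import deque

# Hypothetical link graph standing in for a site: page -> pages it links to.
# These paths are invented for illustration only.
SITE_LINKS = {
    "/": ["/about", "/upload"],
    "/about": ["/"],
    "/upload": ["/"],
    "/file/abc123": ["/"],  # a file page that no other page links to
    "/file/def456": ["/"],  # another unlinked file page
}


def reachable_pages(start, links):
    """Breadth-first traversal: every page reachable from `start` by following links."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


crawled = reachable_pages("/", SITE_LINKS)
missed = sorted(set(SITE_LINKS) - crawled)

print(sorted(crawled))  # ['/', '/about', '/upload']
print(missed)           # ['/file/abc123', '/file/def456']
```

In this toy graph the file pages are never visited when the crawl starts from "/". So a spider that only follows links would presumably need extra start URLs that do lead to file pages (for example a listing or sitemap page, if the site exposes one).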
© Stack Overflow or respective owner