wget recursively download from pages with lots of links
Posted by Shadow on Super User, 2010-04-13
When using wget with the recursive option turned on, I get an error message when it tries to download a file. It treats the link as a downloadable file, when in reality it should just follow it to reach the page that actually contains the files (or more links to follow) that I want.
wget -r -l 16 --accept=jpg website.com
The error message is: .... since it should be rejected. This usually happens when the link wget is trying to fetch ends in a SQL statement. The problem doesn't occur, however, when I run the very same wget command directly on that link. I want to know exactly how wget tries to fetch the pages. I could always poke around the source, though I don't know how messy that project is. I might also be misunderstanding what "recursive" means in the context of wget. I thought it would travel through each link, grabbing the files with the extension I requested.
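In case it helps, here is roughly how I have been poking at it. This is only a sketch: example.com stands in for the real site, wget-debug.log is just an arbitrary log name, and the extra accept pattern in the last command is a guess on my part, not something I know wget requires.

# Log wget's accept/reject decision for every link it encounters
wget -r -l 16 -A jpg -d -o wget-debug.log http://example.com/

# Search the log for the URLs that were dropped and why
grep -i "rejected" wget-debug.log

# If the pages holding the links end in something other than .html (e.g. .php?id=1),
# adding that pattern to the accept list might let wget fetch and parse them first
wget -r -l 16 -A "jpg,*.php*" http://example.com/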
I posted this over at Stack Overflow, but they sent me over here. :) Hoping you guys can help.