Web spidering/crawling: can I do it myself, or is it only for search engines?

Posted by bboyreason on Super User
Published on 2011-03-07T07:35:26Z Indexed on 2011/03/07 8:12 UTC

I already had a question answered about web scraping with wget, but after reading a little more I realize I may be looking for a web-crawling program instead, particularly since web crawlers can pull out specific data such as links or, in my case, products.
All of the products on my site follow the same naming convention: website.com/uniqueAlphaNumericID.html
As far as I know, no dynamic content generation is being used, and there is exactly one page per item in the above format.
Should I just be thinking about something like:
wget -qO- website.com | grep -oE '[A-Za-z0-9]+\.html'
or should I be looking into spiders/crawlers?
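
For comparison, here is a rough sketch of what a very small single-site crawler could look like, using only the Python standard library. It is not a definitive implementation: the names PRODUCT_PATH, LinkCollector, extract_links, fetch and crawl_products are all made up for illustration, and it assumes product pages sit at the site root as /<alphanumeric ID>.html and are reachable by following ordinary <a href> links from the start page.

```python
import re
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen

# Assumption: product pages look like /<alphanumeric ID>.html at the site root.
PRODUCT_PATH = re.compile(r"^/[A-Za-z0-9]+\.html$")


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(html, base_url):
    """Return absolute URLs for all links found in html."""
    parser = LinkCollector()
    parser.feed(html)
    return [urljoin(base_url, href) for href in parser.hrefs]


def fetch(url):
    """Download a page and return its text (hypothetical default fetcher)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl_products(start_url, fetch_page=fetch, max_pages=100):
    """Breadth-first crawl of one host; returns the set of product URLs seen."""
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    visited = set()
    products = set()
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        if url in visited:
            continue
        visited.add(url)
        try:
            html = fetch_page(url)
        except OSError:
            # Skip pages that fail to download (URLError is an OSError).
            continue
        for link in extract_links(html, url):
            link = link.split("#", 1)[0]  # drop any #fragment
            parsed = urlparse(link)
            if parsed.netloc != host:
                continue  # stay on the starting site
            if PRODUCT_PATH.match(parsed.path):
                products.add(link)
            if link not in visited:
                queue.append(link)
    return products
```

Something like crawl_products("http://website.com/") would then return the set of product URLs it found. The main difference from the wget | grep one-liner is that this follows links from page to page, so it can find products that are not linked directly from the front page; max_pages is just a safety cap so it does not run forever.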

© Super User or respective owner