Web spidering/crawling: can I do it myself, or is it just for search engines?
Posted by bboyreason on Super User
Published on 2011-03-07T07:35:26Z
I already had a question answered about web scraping with wget, but as I read a little more, I realize I may actually be looking for a web-crawling program, particularly since web crawlers can extract specific data like links or, in my case, products.
All of the products on my site follow this naming convention: website.com/uniqueAlphaNumericID.html
As far as I know, the site uses no dynamic content generation, and each item has exactly one page in the above format.
Should I just be thinking about something like:
wget -qO- website.com | grep -oE '[[:alnum:]]+\.html'
Or should I be looking into spiders/crawlers?
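To make the wget-plus-grep idea concrete, here is a minimal sketch of the link-extraction step on its own. The function name `extract_product_links` is made up for illustration, and it assumes links appear as double-quoted `href="..."` attributes and that product pages are the only ones named `<alphanumeric>.html` (a page like `index.html` would also match and would need filtering out).

```shell
# Hypothetical helper: read HTML on stdin, print the unique page names
# that match the uniqueAlphaNumericID.html convention, one per line.
extract_product_links() {
    grep -oE 'href="[^"]*"' \
        | sed -E 's/^href="//; s/"$//' \
        | grep -E '(^|/)[[:alnum:]]+\.html$' \
        | sed -E 's#.*/##' \
        | LC_ALL=C sort -u
}
```

It could be fed from wget, for example `wget -qO- http://website.com/ | extract_product_links`. This only looks at the single page it is given; if the product links are spread over several pages, each of those pages would have to be fetched and passed through it too, which is the point where a crawler starts to make sense.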
© Super User or respective owner