Building an automatic web crawler
Posted by Sakin on Stack Overflow
Published on 2009-08-11T11:25:20Z
Indexed on 2010/03/17 6:41 UTC
web-crawler | webcrawling
I am building a web application crawler that is meant not only to find all the links and pages in a web application, but also to perform all the allowed actions in the app (such as pushing buttons, filling in forms, and noticing changes in the DOM even if they did not trigger a request).
Basically, this is a kind of "browser simulator".
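To make the goal more concrete, here is a minimal sketch of only the first step: listing the links, forms, and buttons on a single page that such a crawler could then act on. It uses Python's standard `html.parser` module. The names `ActionCollector`, `collect_actions`, and `SAMPLE_PAGE` are made up for illustration. It does not execute JavaScript, so it cannot notice DOM changes; that part is exactly what would need a real browser engine.

```python
from html.parser import HTMLParser


class ActionCollector(HTMLParser):
    """Hypothetical helper: collects the static 'actions' found on one page."""

    def __init__(self):
        super().__init__()
        self.links = []    # href values of <a> tags
        self.forms = []    # one dict per <form>: action, method, field names
        self.buttons = []  # id or name of each <button> (None if neither is set)
        self._current_form = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "form":
            self._current_form = {
                "action": attrs.get("action") or "",
                "method": (attrs.get("method") or "get").lower(),
                "fields": [],
            }
            self.forms.append(self._current_form)
        elif tag in ("input", "select", "textarea"):
            # Only named fields inside a form are recorded as fillable.
            if self._current_form is not None and attrs.get("name"):
                self._current_form["fields"].append(attrs["name"])
        elif tag == "button":
            self.buttons.append(attrs.get("id") or attrs.get("name"))

    def handle_endtag(self, tag):
        if tag == "form":
            self._current_form = None


def collect_actions(html):
    """Parse an HTML string and return the populated collector."""
    collector = ActionCollector()
    collector.feed(html)
    collector.close()
    return collector


# Made-up page used only to show the output shape.
SAMPLE_PAGE = """
<a href="/home">Home</a>
<form action="/login" method="POST">
  <input name="user"><input name="password" type="password">
  <button id="submit-btn">Log in</button>
</form>
<button id="toggle">Show details</button>
"""

page = collect_actions(SAMPLE_PAGE)
print(page.links)
print(page.forms)
print(page.buttons)
```

A button such as `toggle` above is the hard case: clicking it may change the DOM without sending any request, which a static parser like this one cannot observe.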
WebKit looks like a good option for implementing my crawler, since it has all the needed technology (JavaScript engine, parsers, DOM manipulation, etc.), but it seems like overkill because it is a fully featured browser.
Do you know of any toolkit that provides the above functionality?
© Stack Overflow or respective owner