Best way to store data for a Greasemonkey-based crawler?

Posted by Björn on Stack Overflow
Published on 2009-01-28T14:23:05Z Indexed on 2010/05/16 13:00 UTC

I want to crawl a site with Greasemonkey and wonder if there is a better way to temporarily store values than with GM_setValue.

What I want to do is crawl my contacts in a social network and extract the Twitter URLs from their profile pages.

My current plan is to open each profile in its own tab, so that it looks more like a normal person browsing (i.e. CSS, scripts and images will be loaded by the browser). Then store the Twitter URL with GM_setValue. Once all profile pages have been crawled, create a page using the stored values.
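To make the plan above concrete, here is a minimal sketch of how the collected URLs might be kept under a single GM_setValue key as a JSON string. The key name and the helper names (`KEY`, `makeStore`, `recordTwitterUrl`, `loadResults`) are assumptions for illustration, not part of Greasemonkey; only `GM_setValue(name, value)` and `GM_getValue(name, default)` are real API calls. An in-memory fallback is included so the logic can be tried outside the browser.

```javascript
// Hypothetical storage key; any unique string would do.
const KEY = "crawler.twitterUrls";

// Use Greasemonkey storage when it is available, otherwise fall back to
// an in-memory stand-in (an assumption, only useful for local testing).
function makeStore() {
  if (typeof GM_setValue === "function" && typeof GM_getValue === "function") {
    return {
      get: (name, fallback) => GM_getValue(name, fallback),
      set: (name, value) => GM_setValue(name, value),
    };
  }
  const memory = new Map();
  return {
    get: (name, fallback) => (memory.has(name) ? memory.get(name) : fallback),
    set: (name, value) => { memory.set(name, value); },
  };
}

// Read the whole result map (profile URL -> Twitter URL) from storage.
function loadResults(store) {
  return JSON.parse(store.get(KEY, "{}"));
}

// Add one result and write the whole map back as a JSON string.
function recordTwitterUrl(store, profileUrl, twitterUrl) {
  const results = loadResults(store);
  results[profileUrl] = twitterUrl;
  store.set(KEY, JSON.stringify(results));
}
```

Once the last profile has been visited, the summary page could be built from `loadResults(makeStore())`. One caveat with this design: if several tabs run the read-modify-write in `recordTwitterUrl` at the same moment, one write may overwrite another, so storing one key per profile might be the safer variant.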

I am not so happy with the storage option, though. Maybe there is a better way?

I have considered inserting the user profiles into the current page so that I could process them all with the same script instance, but I am not sure whether XMLHttpRequest requests are indistinguishable from normal user-initiated requests.
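The single-instance alternative could look roughly like the sketch below: fetch each profile's HTML with XMLHttpRequest and pull the Twitter links out of the response text, collecting everything in one script run so that no cross-page storage is needed. The function names and the URL pattern are assumptions for illustration; the real profile markup may need a different pattern. Note that such a request retrieves only the HTML document, without the CSS, scripts and images a normal page view would load.

```javascript
// Assumed pattern: http(s)://twitter.com/<name> with up to 15 name characters.
const TWITTER_URL = /https?:\/\/(?:www\.)?twitter\.com\/[A-Za-z0-9_]{1,15}/g;

// Return the distinct Twitter URLs found in an HTML string, in order of appearance.
function extractTwitterUrls(html) {
  const matches = html.match(TWITTER_URL) || [];
  return Array.from(new Set(matches));
}

// Fetch one profile page and hand the extracted URLs to a callback.
// Browser only: XMLHttpRequest is not defined outside of it.
function fetchProfile(profileUrl, onDone) {
  const xhr = new XMLHttpRequest();
  xhr.open("GET", profileUrl, true);
  xhr.onload = () => {
    onDone(xhr.status === 200 ? extractTwitterUrls(xhr.responseText) : []);
  };
  xhr.onerror = () => onDone([]);
  xhr.send();
}
```

With this approach the results can simply be accumulated in an ordinary array inside the script, and GM_setValue would only be needed if the crawl has to survive a page reload.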

© Stack Overflow or respective owner
