How to know if the website being scraped has changed?

Posted by Lost_in_code on Stack Overflow
Published on 2010-03-27T17:52:13Z Indexed on 2010/03/27 17:53 UTC

I'm using PHP to scrape a website and collect some data. It's all done without regex: I use PHP's explode() function to find particular HTML tags instead.
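
To illustrate the kind of extraction meant here, this is a minimal sketch, not the actual scraper. The helper name `extractBetween` and the sample markup are hypothetical, and the type declarations assume PHP 7.1 or later.

```php
<?php
// Hypothetical helper: returns the text between $open and $close,
// or null when either marker is missing from $html.
function extractBetween(string $html, string $open, string $close): ?string
{
    // Split once on the opening marker; a single part means it was not found.
    $parts = explode($open, $html, 2);
    if (count($parts) < 2) {
        return null;
    }

    // Split the remainder once on the closing marker.
    $rest = explode($close, $parts[1], 2);
    if (count($rest) < 2) {
        return null;
    }

    return trim($rest[0]);
}

// Made-up sample markup, for illustration only.
$html = '<div class="price"> 19.99 </div>';
var_dump(extractBetween($html, '<div class="price">', '</div>'));
```

Returning null instead of an empty string lets the caller tell "marker not found" apart from "marker found but empty".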

If the structure of the website changes (CSS, HTML), the scraper may collect the wrong data. So the question is: how do I know if the HTML structure has changed? How can I detect this before storing anything in my database, so that wrong data never gets saved?
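
One possible direction, sketched under assumptions: before saving anything, check that the markers the scraper depends on still occur the expected number of times, and sanity-check each extracted value. The function names, the marker list and the "price" rule below are all hypothetical, and the type declarations assume PHP 7.1 or later.

```php
<?php
// Hypothetical structure check: every marker the scraper relies on must
// occur exactly the expected number of times, otherwise the layout is suspect.
function structureLooksUnchanged(string $html, array $expectedCounts): bool
{
    foreach ($expectedCounts as $marker => $count) {
        if (substr_count($html, (string) $marker) !== $count) {
            return false;
        }
    }
    return true;
}

// Hypothetical sanity check on an extracted value before it is stored:
// here a price must be numeric and positive.
function isValidPrice(?string $value): bool
{
    return $value !== null && is_numeric($value) && (float) $value > 0;
}

// Made-up page and expectations, for illustration only.
$page = '<div class="price">19.99</div>';
$expected = ['<div class="price">' => 1];

if (structureLooksUnchanged($page, $expected) && isValidPrice('19.99')) {
    // Safe to store the value.
} else {
    // Skip the insert and log or alert so the scraper can be reviewed.
}
```

A check like this only catches changes that affect the chosen markers or make values implausible. It will miss a redesign that keeps the same tags but moves different data into them.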

© Stack Overflow or respective owner