Hi lads,
After obtaining the URLs for various Blogspot, Tumblr and WordPress pages, I ran into problems processing the HTML. The thing is, I want to distinguish between the content, title and date of each blog post. I might be able to get the date through regex, but people use so many custom scripts and themes now that the HTML classes and structure differ a lot from one blog to the next.
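For the date part, this is roughly the kind of regex I have in mind. It is only a rough sketch: the pattern, the `MONTHS` and `DATE_RE` names and the `find_dates` helper are my own assumptions, it only covers a few common English date formats, and it does nothing for the title or content.

```python
import re

# Assumed month names/abbreviations (English only, not exhaustive).
MONTHS = r"(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*"

# Matches "2013-05-17", "17 May 2013" or "May 17, 2013".
DATE_RE = re.compile(
    r"\b(\d{4}-\d{2}-\d{2}"
    rf"|\d{{1,2}}\s+{MONTHS}\s+\d{{4}}"
    rf"|{MONTHS}\s+\d{{1,2}},?\s+\d{{4}})\b",
    re.IGNORECASE,
)


def find_dates(html):
    """Return every date-looking string found in the raw HTML text."""
    return DATE_RE.findall(html)


if __name__ == "__main__":
    sample = '<span class="date">May 17, 2013</span>'
    print(find_dates(sample))
```

The obvious weakness is that it also picks up dates mentioned inside the post body or in comments, so it cannot tell on its own which one is the publication date.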
Does anyone have a solution that might help?