Scrapy - Follow RSS links
Posted by Tupak Goliam on Stack Overflow
Published on 2010-05-30T14:40:51Z
Hello,
Has anyone tried to extract and follow RSS links using SgmlLinkExtractor with a CrawlSpider? I can't get it to work.
I am using the following rule:
    rules = (
        Rule(SgmlLinkExtractor(tags=('link',), attrs=False),
             follow=True, callback='parse_article'),
    )
(bearing in mind that in an RSS feed the URL is the text content of the link tag, not one of its attributes).
I am not sure how to tell SgmlLinkExtractor to extract the text() of the link element instead of searching its attributes.
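To make the goal concrete, here is a minimal sketch of the extraction I am after, written outside of Scrapy with only the Python standard library. It assumes a well-formed RSS 2.0 feed in which each item has a link child; the sample feed and the function name extract_rss_links are made up for illustration and are not part of Scrapy.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed, made up for this example.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <link>http://example.com/</link>
    <item>
      <title>First article</title>
      <link>http://example.com/articles/1</link>
    </item>
    <item>
      <title>Second article</title>
      <link> http://example.com/articles/2 </link>
    </item>
  </channel>
</rss>"""


def extract_rss_links(xml_text):
    """Return the text of the <link> child of every <item> in an RSS 2.0 document.

    Assumption: the URL is the element's text, not an attribute such as href.
    """
    root = ET.fromstring(xml_text)
    links = []
    for item in root.iter("item"):
        link = item.find("link")
        # Skip items with no <link> child or with an empty one.
        if link is not None and link.text:
            links.append(link.text.strip())
    return links


if __name__ == "__main__":
    for url in extract_rss_links(SAMPLE_RSS):
        print(url)
```

Inside a spider, each returned URL could then be turned into a request whose callback is parse_article. The channel-level link is deliberately skipped here because only item links point to articles.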
Any help is welcome. Thanks in advance.