Select only items in a specific DIV using HtmlAgilityPack
Posted by Adam Haile on Stack Overflow
Published on 2010-05-20T15:38:42Z
Indexed on 2010/05/20 15:40 UTC
Tags: c#, htmlagilitypack
I'm trying to use HtmlAgilityPack to pull all of the links on a page that are contained within a div declared as <div class='content'>.
However, the code below returns every link on the entire page. This doesn't make sense to me, since I am calling SelectNodes on the sub-node I selected earlier (which, when viewed in the debugger, contains only the HTML from that specific div). It's as if the search goes back to the root node every time I call SelectNodes. The code I'm using:
    HtmlWeb hw = new HtmlWeb();
    HtmlDocument doc = hw.Load(@"http://example.com");
    HtmlNode node = doc.DocumentNode.SelectSingleNode("//div[@class='content']");
    foreach (HtmlNode link in node.SelectNodes("//a[@href]"))
    {
        // HtmlNode has no Value property, so print the href attribute instead
        Console.WriteLine(link.GetAttributeValue("href", ""));
    }
Is this the expected behavior? And if so, how do I get it to do what I'm expecting?
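For what it's worth, this looks like standard XPath behavior rather than anything specific to HtmlAgilityPack: an expression that starts with `//` is evaluated from the document root no matter which node SelectNodes is called on, while `.//` restricts the search to descendants of the current node. Below is a minimal sketch of that difference. The class name `ContentLinks`, the helper `HrefsInContentDiv`, and the inline sample HTML are all hypothetical and exist only for illustration; the sketch parses a string with `LoadHtml` instead of fetching a page so it runs without network access.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class ContentLinks
{
    // Hypothetical helper: returns the href of every <a> inside <div class='content'>.
    public static List<string> HrefsInContentDiv(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var hrefs = new List<string>();
        HtmlNode node = doc.DocumentNode.SelectSingleNode("//div[@class='content']");
        if (node == null)
        {
            return hrefs; // no matching div on the page
        }

        // ".//" searches descendants of 'node' only; a bare "//" would restart at the document root.
        HtmlNodeCollection links = node.SelectNodes(".//a[@href]");
        if (links == null)
        {
            return hrefs; // SelectNodes returns null when nothing matches
        }

        foreach (HtmlNode link in links)
        {
            hrefs.Add(link.GetAttributeValue("href", ""));
        }
        return hrefs;
    }

    public static void Main()
    {
        // Sample markup (an assumption for this sketch): one link outside the div, one inside.
        string html = "<html><body><a href='/outside'>out</a>"
                    + "<div class='content'><a href='/inside'>in</a></div></body></html>";

        foreach (string href in HrefsInContentDiv(html))
        {
            Console.WriteLine(href);
        }
    }
}
```

With the sample markup above, only the link inside the div should be printed. The null checks are there because both SelectSingleNode and SelectNodes can return null when nothing matches, which would otherwise throw in the foreach.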
© Stack Overflow or respective owner