Repeating an object that occurs only a couple of times and has different values, with HtmlAgilityPack in C#
Posted by dtd on Stack Overflow
Published on 2012-10-07T02:33:14Z
Indexed on 2012/10/07 3:38 UTC
html-agility-pack
I have a problem I can't seem to solve. Say I have some HTML like the snippet below that I want to parse. All of it sits inside a single list on the page, and the class names repeat as in this example:
<li class = "seperator"> a date </li>
<li class = "lol"> some text </li>
<li class = "lol"> some text </li>
<li class = "lol"> some text </li>
<li class = "seperator"> a new date </li>
<li class = "lol"> some text </li>
<li class = "seperator"> a nother new date </li>
<li class = "lol"> some text </li>
<li class = "lol"> some text </li>
I managed to use HtmlAgilityPack to parse every li element separately, and I almost have the formatting I want. For the HTML above, my output currently looks something like this:
"a date" "some text"
"some text"
"some text"
"a new date" "some text"
"a nother new date" "some text"
"some text"
What I want to achieve:
"a date" "some text"
"a date" "some text"
"a date" "some text"
"a new date" "some text"
"a nother new date" "some text"
"a nother new date" "some text"
The problem is that the number of lol items beneath each seperator varies. One day the page may have one lol item beneath the first date, and the next day it may have ten. So I am wondering whether there is a smart or easy way to count the lol items between two seperators, or some other way to solve this, for example within HtmlAgilityPack. And yes, I need the correct date in front of every lol item, not just in front of the first one. This would have been a piece of cake if the seperator element had stayed open until after its last lol item, but sadly that is not the case.
I don't think I need to paste my code here, but basically I parse the page, extract the seperator and lol items into one list, and then split that into seperators and lol items. Then I print everything to a file, and since the seperator only occurs 3 times (in the example), I only get 3 separate dates.
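One way this could be approached, without counting anything, is to walk the li elements once in document order and remember the most recent seperator text, attaching it to every lol item that follows. The sketch below is only an illustration under some assumptions: the class name DateRows, the method Pair, and the sample markup (wrapped in a ul) are made up for the example and are not the code described above. It also assumes the class attribute holds exactly "seperator" or "lol", and it skips any lol item that appears before the first seperator.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

static class DateRows
{
    // Hypothetical sample markup, modelled on the list in the question.
    const string Sample = @"<ul>
<li class=""seperator""> a date </li>
<li class=""lol""> some text </li>
<li class=""lol""> some text </li>
<li class=""seperator""> a new date </li>
<li class=""lol""> some text </li>
</ul>";

    // Pairs every "lol" item with the text of the nearest "seperator" above it.
    public static List<(string Date, string Text)> Pair(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var rows = new List<(string Date, string Text)>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var items = doc.DocumentNode.SelectNodes("//li");
        if (items == null) return rows;

        string currentDate = "";   // last seperator text seen so far
        bool haveDate = false;     // false until the first seperator is reached

        foreach (var li in items)  // nodes arrive in document order
        {
            string cls = li.GetAttributeValue("class", "");
            string text = li.InnerText.Trim();

            if (cls == "seperator")
            {
                currentDate = text;
                haveDate = true;
            }
            else if (cls == "lol" && haveDate)
            {
                rows.Add((currentDate, text));
            }
        }
        return rows;
    }

    static void Main()
    {
        // Print one line per lol item, each prefixed with its date.
        foreach (var (date, text) in Pair(Sample))
            Console.WriteLine($"\"{date}\" \"{text}\"");
    }
}
```

Because the date is carried forward in a variable, it does not matter whether one or ten lol items follow a seperator. If a lookup per item is preferred instead of a single pass, an XPath expression such as preceding-sibling::li[@class='seperator'][1] evaluated on each lol node should select the same seperator, at the cost of rescanning siblings for every item.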
© Stack Overflow or respective owner