JSoup - Select only one list item

Posted by Zyril on Stack Overflow
Published on 2012-06-10T10:25:50Z

I'm trying to extract certain data from a website using JSoup and Java. So far I've been successful in what I'm trying to achieve.

<ul class="beverageFacts">
<li><span>Årgång</span><strong>**2009**&nbsp;</strong></li>

I want to extract the value marked with ** in the HTML above (here, 2009). I can do this with the following JSoup code:

doc.select("ul.beverageFacts li:lt(1) strong");

I'm using :lt(1) because several more list items follow that I want to omit.
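As a minimal sketch of the selector above (not the poster's actual program): the markup is the snippet from the post with assumed closing tags and one hypothetical extra list item, and the class and method names are made up for illustration. In JSoup, :lt(1) matches elements whose sibling index is less than 1, so only the first li of its parent list is kept.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

public class FirstFactExample {
    // Assumed markup: the snippet from the post, closed off, plus one
    // hypothetical extra <li>. Swedish letters are written as HTML entities
    // so the Java source stays ASCII-only.
    static final String HTML =
        "<ul class=\"beverageFacts\">"
        + "<li><span>&Aring;rg&aring;ng</span><strong>2009&nbsp;</strong></li>"
        + "<li><span>Other</span><strong>ignored</strong></li>"
        + "</ul>";

    static String firstFact(String html) {
        Document doc = Jsoup.parse(html);
        // Only the first <li> of the list has a sibling index below 1.
        Elements hits = doc.select("ul.beverageFacts li:lt(1) strong");
        // Turn the non-breaking space from &nbsp; into a plain space, then trim.
        return hits.text().replace('\u00a0', ' ').trim();
    }

    public static void main(String[] args) {
        System.out.println(firstFact(HTML)); // prints 2009
    }
}
```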

Now to my problem: there's an optional information tab on the site I'm extracting data from, and it also contains a list with the class "beverageFacts". My code currently extracts that data too, which I don't want.

That markup is further down in the source of the website. I've tried to use the :lt(1) index there as well, but it doesn't work.

<div id="beverageMoreFacts" style="display: block">
<ul class="beverageFacts"><li class="half">
<span> Färg</span><strong> Ljusgul färg.</strong>

My overall result is that I extract "2009 Ljusgul färg." instead of only "2009". How can I write my code so that it extracts only the first part, which it already does successfully, and omits the rest?

EDIT: I get the same result using:

 doc.select("ul.beverageFacts li:eq(0) strong");

Thanks, Z
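The behaviour described in the post can be reproduced with a small sketch. It assumes both snippets are closed off as shown, and the class and method names are hypothetical. Because :lt(1) and :eq(0) test an element's index among its own siblings, the first li of every ul.beverageFacts matches, including the one inside #beverageMoreFacts. One possible way to narrow the result, offered as an assumption rather than the only fix, is to take the first matching list in document order and select inside it.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class ScopedSelectExample {
    // Assumed markup: both snippets from the post, with closing tags added.
    // Swedish letters are written as HTML entities to keep the source ASCII-only.
    static final String HTML =
        "<ul class=\"beverageFacts\">"
        + "<li><span>&Aring;rg&aring;ng</span><strong>2009&nbsp;</strong></li>"
        + "</ul>"
        + "<div id=\"beverageMoreFacts\" style=\"display: block\">"
        + "<ul class=\"beverageFacts\"><li class=\"half\">"
        + "<span> F&auml;rg</span><strong> Ljusgul f&auml;rg.</strong>"
        + "</li></ul></div>";

    // Counts how many <strong> elements the selector from the post matches.
    static int unscopedMatches(String html) {
        return Jsoup.parse(html).select("ul.beverageFacts li:eq(0) strong").size();
    }

    // Restricts the search to the first ul.beverageFacts in document order.
    static String firstListOnly(String html) {
        Document doc = Jsoup.parse(html);
        Element firstList = doc.select("ul.beverageFacts").first();
        if (firstList == null) {
            return ""; // no such list on the page
        }
        String text = firstList.select("li:eq(0) strong").text();
        // Turn the non-breaking space from &nbsp; into a plain space, then trim.
        return text.replace('\u00a0', ' ').trim();
    }

    public static void main(String[] args) {
        System.out.println(unscopedMatches(HTML)); // prints 2
        System.out.println(firstListOnly(HTML));   // prints 2009
    }
}
```

This relies on the wanted list appearing before the optional tab in the page source, as the post describes. If that order is not guaranteed, the scope would need to be chosen some other way.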

© Stack Overflow or respective owner
