Webpage data scraping using Java

Posted by Gemma on Stack Overflow
Published on 2010-04-11T01:37:49Z Indexed on 2010/04/11 4:13 UTC

I am trying to implement a simple HTML webpage scraper in Java, and I have run into a small problem. Suppose I have the following HTML fragment:

<div id="sr-h-left" class="sr-comp">
    <a class="link-gray-underline" id="compare_header"  rel="nofollow" href="javascript:i18nCompareProd('/serv/main/buyer/ProductCompare.jsp?nxtg=41980a1c051f-0942A6ADCF43B802');">
        <span style="cursor: pointer;" class="sr-h-o">Compare</span>
    </a>
</div>
<div id="sr-h-right" class="sr-summary">
    <div id="sr-num-results">
        <div class="sr-h-o-r">Showing 1 - 30 of 1,439 matches, 

The data I am interested in is the integer 1,439 shown at the bottom. How can I get that integer out of the HTML? I am considering using a regular expression with java.util.regex.Pattern to extract it, but I am still not clear on the process. I would be grateful for any hints or ideas on this kind of data scraping. Thanks a lot.
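
For reference, here is a minimal sketch of the regular-expression idea described above. It assumes the count always appears in the form "of N matches", with optional thousands separators. The class name MatchCountScraper, the method extractTotal, and the pattern itself are illustrative assumptions and not part of any existing code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchCountScraper {
    // Assumed text shape: "... of 1,439 matches". Group 1 captures the
    // number, which must start with a digit and may contain commas.
    private static final Pattern TOTAL =
            Pattern.compile("of\\s+(\\d[\\d,]*)\\s+matches");

    /** Returns the total match count, or -1 if no count is found in the text. */
    static int extractTotal(String html) {
        Matcher m = TOTAL.matcher(html);
        if (!m.find()) {
            return -1;
        }
        // Remove the thousands separators before parsing: "1,439" becomes "1439".
        return Integer.parseInt(m.group(1).replace(",", ""));
    }

    public static void main(String[] args) {
        String html = "<div class=\"sr-h-o-r\">Showing 1 - 30 of 1,439 matches, ";
        System.out.println(extractTotal(html)); // prints 1439
    }
}
```

The pattern is matched against the visible text only, so it does not depend on the surrounding div structure, but it would need adjusting if the site changes the wording of that line.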

© Stack Overflow or respective owner
