A question on webpage data scraping using Java
- by Gemma
Hi there.
I am trying to implement a simple HTML webpage scraper in Java, and I have a small problem.
Suppose I have the following HTML fragment.
<div id="sr-h-left" class="sr-comp">
<a class="link-gray-underline" id="compare_header" rel="nofollow" href="javascript:i18nCompareProd('/serv/main/buyer/ProductCompare.jsp?nxtg=41980a1c051f-0942A6ADCF43B802');
"
Compare
Showing 1 - 30 of 1,439 matches,
The data I am interested in is the integer 1,439 shown at the bottom. I am wondering how I can get that integer out of the HTML.
I am considering using a regular expression with java.util.regex.Pattern to extract the data, but I am still not very clear about the process.
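To make the idea concrete, here is a minimal sketch of the regular expression approach. It assumes the page has already been downloaded into a String and that the count always appears in the form "of N matches". The class name MatchCountExtractor and the method extractTotal are hypothetical names chosen only for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any existing library.
class MatchCountExtractor {
    // Matches text such as "of 1,439 matches" and captures the number,
    // including any thousands separators (assumed to be commas).
    private static final Pattern TOTAL =
            Pattern.compile("of\\s+(\\d[\\d,]*)\\s+matches");

    /** Returns the total count found in the text, or -1 if the pattern is absent. */
    static int extractTotal(String html) {
        Matcher m = TOTAL.matcher(html);
        if (!m.find()) {
            return -1;
        }
        // Strip the thousands separators before parsing: "1,439" becomes "1439".
        return Integer.parseInt(m.group(1).replace(",", ""));
    }

    public static void main(String[] args) {
        String fragment = "Compare\nShowing 1 - 30 of 1,439 matches,";
        System.out.println(extractTotal(fragment)); // prints 1439
    }
}
```

The capturing group keeps only the digits and commas, so the surrounding words act as an anchor without ending up in the result. Returning -1 when nothing matches is just one possible convention; throwing an exception or returning an OptionalInt would work as well.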
I would be grateful if you could give me some hints or ideas on this kind of data scraping.
Thanks a lot.