Grabbing a substring while scraping with Python 2.6
- by Diego
Hey, can someone help with the following?
I'm trying to scrape a site that has the following information. I need to pull just the number after the </strong> tag.
[<li><strong>ISBN-13:</strong> 9780375853401</li>, <li><strong>Pub. Date: </strong> 05/11/2010</li>]
[<li><strong>UPC:</strong> 490355000372</li>, <li><strong>Catalog No:</strong> 15024/25</li>, <li><strong>Label:</strong> CAMERATA</li>]
Here's a piece of the code I've been using to grab the above data with mechanize and BeautifulSoup. I'm stuck here, as it won't let me use the find() function on a list (findAll() returns a list of tags, and .contents is a list too).
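import mechanize
from BeautifulSoup import BeautifulSoup  # BeautifulSoup 3 on Python 2.6

br_results = mechanize.urlopen(br_results)
html = br_results.read()
soup = BeautifulSoup(html)
local_links = soup.findAll("a", {"class" : "down-arrow csa"})
# findAll() returns a list of matching <ul> tags
upc_code = soup.findAll("ul", {"class" : "bc-meta3"})
for upc in upc_code:
    # this is the line that fails: upc.contents is a plain list,
    # and a list has no .contents (or find()) of its own
    upc_text = upc.contents.contents
    print upc_text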
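
One possible way to get just the value after each </strong> tag, as a rough sketch only: it uses the standard library's re module on the markup string instead of BeautifulSoup navigation, and it assumes every item looks like <li><strong>Label:</strong> value</li> as in the output above. The names SAMPLE, FIELD_RE and extract_fields are made up for illustration; in the real script the input would presumably be the string form of each <ul> found by findAll().

```python
import re

# Sample markup modeled on the output in the question (illustrative only).
SAMPLE = ('<ul class="bc-meta3"><li><strong>UPC:</strong> 490355000372</li>'
          '<li><strong>Catalog No:</strong> 15024/25</li>'
          '<li><strong>Label:</strong> CAMERATA</li></ul>')

# Group 1: the label inside <strong>; group 2: the text between </strong> and </li>.
FIELD_RE = re.compile(r'<li>\s*<strong>(.*?)</strong>(.*?)</li>', re.S)

def extract_fields(html):
    """Return a dict mapping each label (without the trailing colon) to its value."""
    fields = {}
    for label, value in FIELD_RE.findall(html):
        fields[label.strip().rstrip(':').strip()] = value.strip()
    return fields

fields = extract_fields(SAMPLE)
print(fields['UPC'])
```

Keying the values by label means the UPC or ISBN-13 can be picked out by name rather than by position, which may matter since the two lists above carry different fields. A regex like this is fragile if the markup varies, so treat it as a starting point.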