grabbing a substring while scraping with Python2.6

Posted by Diego on Stack Overflow See other posts from Stack Overflow or by Diego
Published on 2010-05-16T22:11:53Z Indexed on 2010/05/16 22:20 UTC
Read the original article Hit count: 359

Filed under:

python

|

mechanize

|

beautifulsoup

|

substring

|

list

Hey can someone help with the following?

I'm trying to scrape a site that has the following information.. I need to pull just the number after the </strong> tag..

[<li><strong>ISBN-13:</strong> 9780375853401</li>, <li><strong>Pub. Date: </strong> 05/11/2010</li>]
[<li><strong>UPC:</strong> 490355000372</li>, <li><strong>Catalog No:</strong> 15024/25</li>, <li><strong>Label:</strong> CAMERATA</li>]

here's a piece of the code I've been using to grab the above data using mechanize and BeautifulSoup. I'm stuck here as it won't let me use the find() function for a list

br_results = mechanize.urlopen(br_results)
html = br_results.read()
soup = BeautifulSoup(html)
local_links = soup.findAll("a", {"class" : "down-arrow csa"})
upc_code = soup.findAll("ul", {"class" : "bc-meta3"})
for upc in upc_code:
    upc_text = upc.contents.contents
    print upc_text

© Stack Overflow or respective owner

Related posts about python

unmet dependencies in Ubuntu 12.04

as seen on Ask Ubuntu - Search for 'Ask Ubuntu'
I tried today to install a dvb-card on my Ubuntu 12.04 (Linux blauhai-linux 3.2.0-25-generic #40-Ubuntu SMP Wed May 23 20:30:51 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux ). The installation failed with an error. After that, i tried to install python (it was already installed but i got this error): linux:~$… >>> More
How can I get sikuli-ide to work?

as seen on Ask Ubuntu - Search for 'Ask Ubuntu'
I installed sikuli-ide with sudo apt-get install sikuli-ide Everything was fine until I tried to start it from the terminal. I typed sikuli-ide But the only response I got was [info] locale: en_US The application was not started, furthermore there is no desktop file and sikuli-ide does not… >>> More
Getting PATH right for python after MacPorts install

as seen on Super User - Search for 'Super User'
I can't import some python libraries (PIL, psycopg2) that I just installed with MacPorts. I looked through these forums, and tried to adjust my PATH variable in $HOME/.bash_profile in order to fix this but it did not work. I added the location of PIL and psycopg2 to PATH. I know that Terminal is… >>> More
call python with system() in R to run a python script emulating the python console

as seen on Stack Overflow - Search for 'Stack Overflow'
I want to pass a chunk of Python code to Python in R with something like system('python ...'), and I'm wondering if there is an easy way to emulate the python console in this case. For example, suppose the code is "print 'hello world'", how can I get the output like this in R? >>> print… >>> More
Python - Calling a non python program from python?

as seen on Stack Overflow - Search for 'Stack Overflow'
Hi, I am currently struggling to call a non python program from a python script. I have a ~1000 files that when passed through this C++ program will generate ~1000 outputs. Each output file must have a distinct name. The command I wish to run is of the form: program_name -input -output -o1 -o2… >>> More

Related posts about mechanize

python mechanize.browser submit() related problem

as seen on Stack Overflow - Search for 'Stack Overflow'
Hello All im making some script with mechanize.browser module. one of problem is all other thing is ok, but when submit() form,it not working, so i was found some suspicion source part. in the html source i was found such like following. <form method="post" onsubmit="return loginCheck(this)"… >>> More
How to set the mechanize page encoding?

as seen on Stack Overflow - Search for 'Stack Overflow'
Hi, I'm trying to get a page with an ISO-8859-1 encoding clicking on a link, so the code is similar to this: page_result = page.link_with( :text => 'link_text' ).click So far I get the result with a wrong encoding, so I see characters like: 'T?tulo:' instead of 'Título:' I've tried several… >>> More
WWW::Mechanize-Question

as seen on Stack Overflow - Search for 'Stack Overflow'
Hello! Are both of these versions OK or is one of them to prefer? #!/usr/bin/env perl use strict; use warnings; use WWW::Mechanize; my $mech = WWW::Mechanize->new(); my $content; # 1 $mech->get( 'http://www.kernel.org' ); $content = $mech->content; say $content; # 2 my $res = $mech->get(… >>> More
WWW::Mechanize Perl login only works after relaunch

as seen on Stack Overflow - Search for 'Stack Overflow'
Hello, I'm trying to login automatically in a website using Perl WWW::Mechanize. What I do is: $bot = WWW::Mechanize->new(); $bot->cookie_jar( HTTP::Cookies->new( file => "cookies.txt", autosave => 1, ignore_discard =>… >>> More
Python mechanize to follow image links?

as seen on Stack Overflow - Search for 'Stack Overflow'
mechanize's Browser class is great and it's follow_link() function is great too. But what to do with this kind of links: <a href="http://example.com"><img src="…"></a> Is there any way to follow such links? The text attribute of this type of links is simply '[IMG]', so AFAIK,… >>> More