Regular expressions - finding and comparing the first instance of a word

Posted by Dan on Stack Overflow
Published on 2010-05-28T13:59:30Z Indexed on 2010/05/28 14:02 UTC

Filed under: html | regular-language

Hi there,

I am trying to write a regular expression to pull links out of a page. The catch is that a link should only be pulled out if the item it belongs to is in stock. This is an outline of the markup I have:

<td class="prd-details">
   <a href="somepage">
   ...
   <span class="collect unavailable">
</td>

<td class="prd-details">
   <a href="somepage">
   ...
   <span class="collect available">
</td>

I would like to pull out a link only if 'collect available' appears in the same cell. I have tried to do this with the regular expression:

(?s)prd-details[^=]+="([^"]+)" .+?collect{1}[^\s]+ available

However, when I run it, the match starts at the first 'prd-details' class and keeps going until it finds 'collect available', so it returns the wrong link. I thought that putting {1} after the word collect would make it use only the first instance of the word it finds, but apparently I'm wrong. I have also tried positive and negative lookaheads, but I can't get anything to work.

Might anyone be able to help me with this issue?

Thanks,

Dan
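
The behaviour described in the question can be reproduced with a short Python sketch. It is only an illustration under stated assumptions: the href values are made up so the two links can be told apart, the pattern is a simplified form of the one in the question, and the "tempered" lookahead shown is one possible approach, not necessarily the best one for the real page. An HTML parser is usually more robust than a regular expression for this kind of task.

```python
import re

# Sample markup modeled on the outline in the question. The href values
# are hypothetical, chosen so the two links are distinguishable.
HTML = '''
<td class="prd-details">
   <a href="page-unavailable">
   ...
   <span class="collect unavailable">
</td>

<td class="prd-details">
   <a href="page-available">
   ...
   <span class="collect available">
</td>
'''

# {1} binds only to the preceding token (the letter t), so collect{1}
# is the same pattern as collect; it does not mean "first occurrence".
assert re.fullmatch(r'collect{1}', 'collect')
assert re.fullmatch(r'collect{2}', 'collectt')

# Simplified form of the pattern from the question. The lazy .+? still
# expands past the end of the first cell until "collect available" appears,
# so the captured link belongs to the wrong cell.
NAIVE = re.compile(r'prd-details[^=]+="([^"]+)".+?collect available', re.S)

# Tempered version: every character consumed must not be the start of
# another "prd-details", so a match cannot spill into the next cell.
TEMPERED = re.compile(
    r'prd-details[^=]+="([^"]+)"(?:(?!prd-details).)+?collect available',
    re.S,
)

naive_links = NAIVE.findall(HTML)
tempered_links = TEMPERED.findall(HTML)

print(naive_links)     # ['page-unavailable']
print(tempered_links)  # ['page-available']
```

A lazy quantifier only prefers the shortest match from a given starting point. It does not stop the engine from extending that match when the shorter attempts fail, which is why the first pattern captures the link from the unavailable cell.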

© Stack Overflow or respective owner
