PHP Regex: How to match anything except a pattern between two tags

Posted by Ryan on Stack Overflow
Published on 2010-04-30T07:58:05Z Indexed on 2010/04/30 8:07 UTC

Hello, I am trying to match part of a string of HTML. It is an image gallery, so the markup is very repetitive. The string contains many <dl> tags, but I want to match only the last <dl>(.?)+</dl> pair that comes before a </div>.

My approach is to make sure there is no other <dl inside the <dl></dl> pair I'm matching. I don't care what else is in there, including other tags and line breaks.

I decided I had to do it with regular expressions because I can't predict how long this substring will be or anything that's inside it.

Here is my current regex, which only returns an array with two NULL indices:

preg_match_all('/<dl((?!<dl).)+<\/dl>(?=<\/div>)/', $foo, $bar)

As you can see, I use a negative lookahead to check whether there is another <dl> within this one. I've also tried a negative lookbehind here, with the same results, and I've tried using +? instead of just +, to no avail. Keep in mind that the markup never nests these tags (there is no <dl><dl></dl> or anything like it); my regex either matches from the first <dl> to the last </dl> or matches nothing at all.

Now, I realize that . won't match line breaks, but I've tried everything I could think of there, and it still gives me either the NULL indices or nearly the whole string (from the very first occurrence of <dl to </dl></div>, which includes several other occurrences of <dl>, exactly what I didn't want). I honestly don't know what I'm doing wrong.
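
For reference, here is a minimal sketch of the lookahead approach described above, with two adjustments: the s modifier, so that . also matches line breaks, and \s* before the final lookahead, in case whitespace separates </dl> from </div>. The sample markup in $foo is hypothetical, since the real gallery HTML is not shown here, and the sketch assumes that <dl> elements are never nested and that nothing but whitespace sits between the last </dl> and the </div>.

```php
<?php
// Hypothetical sample markup; the real gallery HTML is not part of the post.
$foo = <<<HTML
<div class="gallery">
<dl class="item">
  <dt><img src="a.jpg"></dt>
</dl>
<dl class="item">
  <dt><img src="b.jpg"></dt>
</dl>
</div>
HTML;

// <dl\b            the opening tag name, followed by a word boundary
// (?:(?!<dl\b).)*  any character, as long as it does not begin another <dl
// <\/dl>           the closing tag
// \s*(?=<\/div>)   optional whitespace, then </div> must follow (not consumed)
// s modifier       lets . match line breaks as well
$pattern = '/<dl\b(?:(?!<dl\b).)*<\/dl>\s*(?=<\/div>)/s';

$count = preg_match_all($pattern, $foo, $bar);

// $bar[0] holds the full matches; rtrim() drops the whitespace that \s* consumed.
$lastDl = $count ? rtrim($bar[0][0]) : null;

echo $lastDl, PHP_EOL;
```

The group is non-capturing because a repeated capturing group such as ((?!<dl).)+ only keeps the last character it matched, which is rarely useful. If the real markup puts anything other than whitespace between </dl> and </div>, the lookahead would need to be loosened to suit.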

Thanks for your help! I've spent over an hour just trying to straighten out this one problem and it's about driven me to pulling my hair out.

© Stack Overflow or respective owner
