PHP Regex: How to match anything except a pattern between two tags
- by Ryan
Hello,
I am attempting to match a string which is composed of HTML. Basically it is an image gallery so there is a lot of similarity in the string. There are a lot of <dl> tags in the string, but I am looking to match the last <dl>(.?)+</dl> combo that comes before a </div>.
The way I've devised to do this is to make sure that there aren't any <dl's inside the <dl></dl> combo I'm matching. I don't care what else is there, including other tags and line breaks.
I decided I had to do it with regular expressions because I can't predict how long this substring will be or anything that's inside it.
Here is my current regex that only returns me an array with two NULL indicies:
preg_match_all('/<dl((?!<dl).)+<\/dl>(?=<\/div>)/', $foo, $bar)
As you can see I use negative lookahead to try and see if there is another <dl> within this one. I've also tried negative lookbehind here with the same results. I've also tried using +? instead of just + to no avail. Keep in mind that there's no pattern <dl><dl></dl> or anything, but that my regex is either matching the first <dl> and the last </dl> or nothing at all.
Now I realize . won't match line breaks but I've tried anything I could imagine there and it still either provides me with the NULL indicies or nearly the whole string (from the very first occurance of <dl to </dl></div>, which includes several other occurances of <dl>, exactly what I didn't want). I honestly don't know what I'm doing incorrectly.
Thanks for your help! I've spent over an hour just trying to straighten out this one problem and it's about driven me to pulling my hair out.