Match Anything Except a Sub-pattern
- by Tim Lytle
I'd like to accomplish what this (invalid, I believe) regular expression tries to do:
<p><a>([^(<\/a>)]+?)<\/a></p>uniquestring
Essentially, I want to match anything except a closing anchor tag. A simple non-greedy match doesn't help here because 'uniquestring' may well come after another, more distant closing anchor tag:
<p><a>text I don't <tag>want</tag> to match</a></p>random
data<p><a>text I do <tag>want to</tag> match</a></p>uniquestring more
matches <p><a>of <tag>text I do</tag> want to match</a></p>uniquestring
So I have more tags in between the anchor tags, and I'm using the presence of uniquestring to decide whether I want to match the data. A simple non-greedy match ends up matching everything from the start of the data I don't want to the end of the data I do want.
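To make that over-match concrete, here is a minimal sketch. It assumes Python's re module, since no regex flavor is named here, and the names `data`, `lazy` and `first` are only illustrative. The sample data is joined into one string, with the line breaks replaced by spaces.

```python
import re

# Sample data from above, joined into a single string (line breaks become spaces).
data = (
    "<p><a>text I don't <tag>want</tag> to match</a></p>random "
    "data<p><a>text I do <tag>want to</tag> match</a></p>uniquestring more "
    "matches <p><a>of <tag>text I do</tag> want to match</a></p>uniquestring"
)

# The "simple non-greedy" attempt: '.' is allowed to consume anything,
# including the characters of an intermediate </a>.
lazy = re.compile(r"<p><a>(.+?)</a></p>uniquestring", re.DOTALL)

# The first match starts at the first <p><a> and only stops at the first
# "</a></p>uniquestring", so it swallows the unwanted anchor too.
first = lazy.search(data).group(1)
print(first)
```

The captured group starts at "text I don't" and runs through the wanted text of the second anchor, which is the behavior described above.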
I know I'm edging close to the kind of problem regular expressions (or at least my knowledge of them) aren't good at solving. I could just throw the data at an HTML/XML parser, but it is just one simple(ish) search.
Is there some easy way to do this that I'm just missing?
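One candidate worth trying is a negative lookahead applied before each character, so the match can never step across a closing anchor tag. This is a sketch under assumptions, not a confirmed solution: it assumes Python's re module (other flavors with lookahead support should behave similarly), and the names `data`, `tempered` and `wanted` are hypothetical. For context, a character class such as `[^(<\/a>)]` excludes the individual characters `(`, `<`, `/`, `a`, `>` and `)` rather than the sequence `</a>`, which is why it can't express this.

```python
import re

# Sample data from above, joined into a single string (line breaks become spaces).
data = (
    "<p><a>text I don't <tag>want</tag> to match</a></p>random "
    "data<p><a>text I do <tag>want to</tag> match</a></p>uniquestring more "
    "matches <p><a>of <tag>text I do</tag> want to match</a></p>uniquestring"
)

# (?:(?!</a>).)+? consumes one character at a time, but only where a
# closing </a> does not begin. The group therefore cannot span past the
# first </a> it meets, and that </a> must be followed by </p>uniquestring.
tempered = re.compile(
    r"<p><a>((?:(?!</a>).)+?)</a></p>uniquestring",
    re.DOTALL,
)

wanted = tempered.findall(data)
for text in wanted:
    print(text)
```

On the sample data this skips the first anchor, because its `</a></p>` is followed by "random" rather than uniquestring, and captures the other two. It still treats the markup as plain text, so nested anchors or attributes on the tags would need a different pattern or a real parser.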