Regular expression: who's greedier?

Posted by polygenelubricants on Stack Overflow See other posts from Stack Overflow or by polygenelubricants
Published on 2010-04-02T09:28:02Z Indexed on 2010/04/02 9:33 UTC
Read the original article Hit count: 242

Filed under:
|

My primary concern is with the Java flavor, but I'd also appreciate information regarding others.

Let's say you have a subpattern like this:

(.*)(.*)

Not very useful as is, but let's say these two capture groups (say, \1 and \2) are part of a bigger pattern that matches with backreferences to these groups, etc.

So both are greedy, in that they try to capture as much as possible, only taking less when they have to.

My question is: who's greedier? Does \1 get first priority, giving \2 its share only if it has to?

What about:

(.*)(.*)(.*)

Let's assume that \1 does get first priority. Let's say it got too greedy, and then spit out a character. Who gets it first? Is it always \2 or can it be \3?

Let's assume it's \2 that gets \1's rejection. If this still doesn't work, who spits out now? Does \2 spit to \3, or does \1 spit out another to \2 first?

© Stack Overflow or respective owner

Related posts about regex

Related posts about java