Backreferences in lookbehind
Posted by polygenelubricants on Stack Overflow
Published on 2010-04-29T05:34:38Z
Can you use backreferences in a lookbehind?
Let's say I want to split a string at every position that is immediately preceded by the same character twice in a row.
String REGEX1 = "(?<=(.)\\1)"; // DOESN'T WORK!
String REGEX2 = "(?<=(?=(.)\\1)..)"; // WORKS!
System.out.println(java.util.Arrays.toString(
"Bazooka killed the poor aardvark (yummy!)"
.split(REGEX2)
)); // prints "[Bazoo, ka kill, ed the poo, r aa, rdvark (yumm, y!)]"
Using REGEX2 (where the backreference is in a lookahead nested inside a lookbehind) works, but REGEX1 gives this error at run-time:
Look-behind group does not have an obvious maximum length near index 8
(?<=(.)\1)
^
This sort of makes sense, I suppose, because in general a backreference can match a string of any length (a slightly smarter regex compiler, though, could determine that \1 refers to (.) in this case, and therefore has a finite length).
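For anyone who wants to reproduce this, here is a minimal sketch that compiles both patterns and reports what the engine says. The class name LookbehindBackrefCheck and the helper compileError are made up for illustration. The exact behaviour of REGEX1 may depend on the Java version, so the program prints whatever it observes instead of assuming a failure.

```java
import java.util.Arrays;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical helper class, not part of any library.
class LookbehindBackrefCheck {
    // Backreference directly inside the lookbehind (rejected in the question).
    static final String REGEX1 = "(?<=(.)\\1)";
    // Backreference inside a lookahead nested in a fixed-width lookbehind.
    static final String REGEX2 = "(?<=(?=(.)\\1)..)";

    // Returns null if the regex compiles, otherwise the engine's error description.
    static String compileError(String regex) {
        try {
            Pattern.compile(regex);
            return null;
        } catch (PatternSyntaxException e) {
            return e.getDescription();
        }
    }

    public static void main(String[] args) {
        System.out.println("REGEX1 error: " + compileError(REGEX1));
        System.out.println("REGEX2 error: " + compileError(REGEX2));
        // Same split as in the question, using the pattern that compiles.
        System.out.println(Arrays.toString(
            "Bazooka killed the poor aardvark (yummy!)".split(REGEX2)));
    }
}
```

Catching PatternSyntaxException around Pattern.compile is also a cheap way to probe whether a given engine version accepts a lookbehind before relying on it.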
So is there a way to use a backreference in a lookbehind?
And if there isn't, can you always work around it using this nested lookahead? Are there other commonly-used techniques?
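One possible alternative, sketched here under my own assumptions and not as an established idiom, is to skip the lookbehind entirely: scan for the doubled character with a zero-width lookahead and cut the string by hand just after each hit. The class DoubledCharSplitter and the method splitAfterDoubled are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any library.
class DoubledCharSplitter {
    // Zero-width lookahead so overlapping runs such as "aaa" are all found;
    // group 1 spans the doubled character, group 2 is the single character.
    private static final Pattern DOUBLED = Pattern.compile("(?=((.)\\2))");

    // Splits input just after every doubled character.
    // Unlike String.split, an empty input yields an empty list.
    static List<String> splitAfterDoubled(String input) {
        List<String> pieces = new ArrayList<>();
        Matcher m = DOUBLED.matcher(input);
        int pieceStart = 0;
        while (m.find()) {
            int cut = m.end(1); // position right after the doubled character
            pieces.add(input.substring(pieceStart, cut));
            pieceStart = cut;
        }
        // Keep the remainder, but drop a trailing empty piece as split() does.
        if (pieceStart < input.length()) {
            pieces.add(input.substring(pieceStart));
        }
        return pieces;
    }

    public static void main(String[] args) {
        System.out.println(splitAfterDoubled("Bazooka killed the poor aardvark (yummy!)"));
    }
}
```

This trades the one-line split call for an explicit loop, but the pattern itself has no lookbehind, so the maximum-length restriction never comes into play.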