Hello,
I'm on to some SQL where clause parsing and designed a working RegEx to find a column outside string literals using "Rad Software Regular Expression Desginer" which is using the .NET API. To make sure the designed RegEx works with Java too, I tested it by using the API of course (1.5 and 1.6). But guess what, it won't work. I got the message
"Look-behind group does not have an obvious maximum length near index 28".
The string that I'm trying to get parsed is
Column_1='test''the''stuff''all''day''long' AND Column_2='000' AND TheVeryColumnIWantToFind = 'Column_1=''test''''the''''stuff''''all''''day''''long'' AND Column_2=''000'' AND TheVeryColumnIWantToFind = '' TheVeryColumnIWantToFind = '' AND (Column_3 is null or Column_3 = ''Not interesting'') AND ''1'' = ''1''' AND (Column_3 is null or Column_3 = 'Still not interesting') AND '1' = '1'
As you may have guessed, I tried to create some kind of worst case to ensure the RegEx won't fail on more complicated SQL where clauses.
The RegEx itself looks like this
(?i:(?<!=\s*'(?:[^']|(?:''))*)((?<=\s*)TheVeryColumnIWantToFind(?=(?:\s+|=))))
I'm not sure if there is a more elegant RegEx (there'll most likely be one), but that's not important right now as it does the trick.
To explain the RegEx in a few words:
If it finds the column I'm after, it does a negative look-behind to figure out if the column name is used in a string literal. If so, it won't match. If not, it'll match.
Back to the question. As I mentioned before, it won't work with Java. What will work and result in what I want?
I found out, that Java does not seem to support unlimited look-behinds but still I couldn't get it to work.
Isn't it right that a look-behind is always putting a limit up on itself from the search offset to the current search position? So it would result in something like "position - offset"?