Lookahead regex produces unexpected group

Posted by Ivan Yatskevich on Stack Overflow
Published on 2010-06-03T14:01:34Z Indexed on 2010/06/03 14:04 UTC

I'm trying to extract a page name and query string from a URL that should not contain .html.

Here is some example code in Java:

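import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TestRegex {
    public static void main(String[] args) {
        // Page name (must not contain ".html"), then "?", then the query string
        Pattern pattern = Pattern.compile("/test/(((?!\\.html).)+)\\?(.+)");
        Matcher matcher = pattern.matcher("/test/page?param=value");
        System.out.println(matcher.matches());
        System.out.println(matcher.group(1)); // expected: page name
        System.out.println(matcher.group(2)); // expected: query string
    }
}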

Running this code produces the following output:

true
page
e

What's wrong with my regex? Why does the second group contain the letter e instead of param=value?
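
For anyone reproducing this, here is a minimal sketch of what seems to be going on, assuming the cause is how Java numbers capturing groups (by the position of their opening parenthesis). Under that reading the pattern has three groups, not two: the inner ((?!\\.html).) is group 2 and keeps only the last character it matched, while the query string ends up in group 3. The class GroupNumberingDemo and the helper groupOf are hypothetical names used only for illustration, and the non-capturing variant is one possible adjustment, not necessarily the only one.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original question.
class GroupNumberingDemo {
    // Returns the requested group after a full match, or null if the input does not match.
    static String groupOf(String regex, String input, int group) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.matches() ? m.group(group) : null;
    }

    public static void main(String[] args) {
        String url = "/test/page?param=value";

        // Pattern from the question: three capturing groups, numbered by opening parenthesis.
        String original = "/test/(((?!\\.html).)+)\\?(.+)";
        System.out.println(groupOf(original, url, 1)); // page
        System.out.println(groupOf(original, url, 2)); // e (last iteration of the repeated inner group)
        System.out.println(groupOf(original, url, 3)); // param=value

        // Variant (an assumption, not the author's code): make the inner group non-capturing with (?: ... )
        String variant = "/test/((?:(?!\\.html).)+)\\?(.+)";
        System.out.println(groupOf(variant, url, 1)); // page
        System.out.println(groupOf(variant, url, 2)); // param=value
    }
}
```

With the non-capturing inner group, group(2) in the original main method would refer to the query string, so the rest of the code could stay unchanged.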

© Stack Overflow or respective owner
