generalizing the pumping lemma for UNIX-style regular expressions

Posted by Avi on Stack Overflow See other posts from Stack Overflow or by Avi
Published on 2010-04-13T02:29:43Z Indexed on 2010/04/13 2:33 UTC
Read the original article Hit count: 510

Most UNIX regular expressions have, besides the usual *,+,? operators a backslash operator where \1,\2,... match whatever's in the last parentheses, so for example L=(a)b\1* matches the (non regular) language a^n b a^n

On one hand, this seems to be pretty powerful since you can create (a*)b\1b\1 to match the language a^n b a^n b a^n which can't even be recognized by a stack automaton. On the other hand, I'm pretty sure a^n b^n cannot be expressed this way.

Two questions: 1. Is there any literature on this family of languages (UNIX-y regular). In particular, is there a version of the pumping lemma for these? 2. Can someone prove (or perhaps disprove) that a^n b^n cannot be expressed this way?

Thanks

© Stack Overflow or respective owner

Related posts about regular-language

Related posts about pumping-lemma