lexer skips a token

Posted by Eugene Strizhok on Stack Overflow
Published on 2013-07-02T05:02:11Z Indexed on 2013/07/02 5:05 UTC

Filed under: antlr3 | lexer

I am trying to do basic ANTLR-based scanning, but the lexer does not match the tokens I expect.

lexer grammar DefaultLexer;

ALPHANUM    :   (LETTER | DIGIT)+;
ACRONYM     :   LETTER '.' (LETTER '.')+;
HOST        :   ALPHANUM (('.' | '-') ALPHANUM)+;

fragment
LETTER  :   UNICODE_CLASS_LL | UNICODE_CLASS_LM | UNICODE_CLASS_LO | UNICODE_CLASS_LT | UNICODE_CLASS_LU;

fragment
DIGIT   :   UNICODE_CLASS_ND | UNICODE_CLASS_NL;

For the grammar above, the input string "hello. world" produces only a single token, "world", whereas I would expect to get both "hello" and "world". What am I missing? Thanks.
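One possible reading of this behaviour, offered as an assumption rather than a confirmed diagnosis: once the lexer sees "hello" followed by '.', it may commit to HOST, fail on the space, and discard the text it consumed instead of falling back to ALPHANUM. The toy model below is written in Python so it can be run on its own. It is not ANTLR's actual algorithm. The function names are hypothetical, `str.isalnum` is only a rough stand-in for LETTER | DIGIT, and ACRONYM is left out for brevity.

```python
def scan_alphanum(text, i):
    """Return the index just past a run of alphanumerics starting at i."""
    while i < len(text) and text[i].isalnum():  # rough stand-in for LETTER | DIGIT
        i += 1
    return i


def match_host_tail(text, end):
    """Consume ('.' | '-') ALPHANUM repetitions starting at `end`.

    Returns (last_good_index, failed_index). failed_index is None when the
    scan stopped cleanly, or the index of a separator that was not followed
    by an alphanumeric run.
    """
    j = end
    while j < len(text) and text[j] in ".-":
        nxt = scan_alphanum(text, j + 1)
        if nxt == j + 1:
            return j, j  # separator with nothing usable after it
        j = nxt
    return j, None


def committing_lexer(text):
    """Toy model: commit to HOST as soon as a separator follows ALPHANUM."""
    tokens, i = [], 0
    while i < len(text):
        end = scan_alphanum(text, i)
        if end == i:
            i += 1  # no token can start here, skip one character
            continue
        if end < len(text) and text[end] in ".-":
            j, failed = match_host_tail(text, end)
            if failed is None:
                tokens.append(("HOST", text[i:j]))
                i = j
            else:
                i = failed + 1  # no fallback: the consumed text is discarded
        else:
            tokens.append(("ALPHANUM", text[i:end]))
            i = end
    return tokens


def backtracking_lexer(text):
    """Toy model: try HOST, but fall back to the longest prefix that matched."""
    tokens, i = [], 0
    while i < len(text):
        end = scan_alphanum(text, i)
        if end == i:
            i += 1
            continue
        j, _ = match_host_tail(text, end)
        tokens.append(("HOST" if j > end else "ALPHANUM", text[i:j]))
        i = j
    return tokens


if __name__ == "__main__":
    print(committing_lexer("hello. world"))    # [('ALPHANUM', 'world')]
    print(backtracking_lexer("hello. world"))  # [('ALPHANUM', 'hello'), ('ALPHANUM', 'world')]
```

Under these assumptions the committing variant reproduces the reported output (only "world"), while the backtracking variant yields the expected two tokens. If that reading is right, the grammar would need to decide between ALPHANUM and HOST only after checking that an alphanumeric actually follows the separator.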
