Lexer skips a token

Posted by Eugene Strizhok on Stack Overflow
Published on 2013-07-02T05:02:11Z, indexed on 2013/07/02 5:05 UTC

Filed under: antlr3 | lexer

I am trying to do basic ANTLR-based scanning, but my lexer does not produce the tokens I expect.

lexer grammar DefaultLexer;

ALPHANUM    :   (LETTER | DIGIT)+;
ACRONYM     :   LETTER '.' (LETTER '.')+;
HOST        :   ALPHANUM (('.' | '-') ALPHANUM)+;

fragment
LETTER  :   UNICODE_CLASS_LL | UNICODE_CLASS_LM | UNICODE_CLASS_LO | UNICODE_CLASS_LT | UNICODE_CLASS_LU;

fragment
DIGIT   :   UNICODE_CLASS_ND | UNICODE_CLASS_NL;

For the grammar above, the input string "hello. world" produces only the token world, whereas I would expect to get both hello and world. What am I missing? Thanks.
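One possible explanation, offered as an assumption rather than a confirmed diagnosis: after matching hello and seeing the '.', the lexer may commit to HOST, fail because no ALPHANUM follows the separator, report an error, and discard the consumed text instead of falling back to the shorter ALPHANUM match. The Python sketch below is a toy model of that behaviour, not ANTLR code. The function name `lex`, the `fallback` flag, and the regex approximation of LETTER | DIGIT are all hypothetical, and ACRONYM is left out to keep it short.

```python
import re

# Rough stand-in for (LETTER | DIGIT)+: Unicode word characters minus '_'.
# This is an approximation, not the grammar's exact Unicode classes.
ALPHANUM = re.compile(r"[^\W_]+")


def lex(text, fallback):
    """Toy lexer for ALPHANUM and HOST only (hypothetical, not ANTLR).

    fallback=False: once a '.' or '-' follows an ALPHANUM, the lexer is
    committed to HOST; if the next segment is missing, the consumed text
    is dropped.
    fallback=True: the text matched so far is emitted as a token instead.
    """
    tokens, pos = [], 0
    while pos < len(text):
        first = ALPHANUM.match(text, pos)
        if first is None:
            pos += 1  # no rule starts here, skip one character
            continue
        end, is_host, failed = first.end(), False, False
        # Try to extend to HOST: ALPHANUM (('.' | '-') ALPHANUM)+
        while end < len(text) and text[end] in ".-":
            nxt = ALPHANUM.match(text, end + 1)
            if nxt is None:
                failed = True  # separator seen, but no segment after it
                break
            end, is_host = nxt.end(), True
        if failed and not fallback:
            pos = end + 1  # discard the match and the dangling separator
            continue
        tokens.append(("HOST" if is_host else "ALPHANUM", text[pos:end]))
        pos = end + 1 if failed else end
    return tokens


print(lex("hello. world", fallback=False))  # [('ALPHANUM', 'world')]
print(lex("hello. world", fallback=True))   # [('ALPHANUM', 'hello'), ('ALPHANUM', 'world')]
```

If this model matches what the generated lexer does, the output with fallback=False corresponds to the observed result, and the output with fallback=True is the result I expected.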

© Stack Overflow or respective owner
