Python/YACC Lexer: Token priority?

Posted by Rosarch on Stack Overflow
Published on 2010-05-26T05:45:59Z Indexed on 2010/05/26 5:51 UTC

I'm trying to use reserved words in my grammar:

reserved = {
   'if' : 'IF',
   'then' : 'THEN',
   'else' : 'ELSE',
   'while' : 'WHILE',
}

tokens = [
 'DEPT_CODE',
 'COURSE_NUMBER',
 'OR_CONJ',
 'ID',
] + list(reserved.values())

t_DEPT_CODE = r'[A-Z]{2,}'
t_COURSE_NUMBER  = r'[0-9]{4}'
t_OR_CONJ = r'or'

t_ignore = ' \t'

def t_ID(t):
 r'[a-zA-Z_][a-zA-Z_0-9]*'
 # Look up reserved words by their lexeme; any other match stays an ID.
 t.type = reserved.get(t.value, 'ID')
 return t

However, the t_ID rule somehow swallows up DEPT_CODE and OR_CONJ. How can I get around this? I'd like those two to take higher precedence than the reserved words.
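
For context on why this happens: PLY adds rules defined as functions to its master regular expression first, in the order they appear in the file, and only then adds rules defined as strings, sorted by decreasing regex length. Since t_ID is the only function rule here, it is tried before t_DEPT_CODE and t_OR_CONJ. The sketch below is not PLY itself; it mimics that ordering with the standard `re` module to show the effect of listing the specific rules ahead of the generic one. The names `RULES`, `MASTER`, and `tokenize` are hypothetical, and the trailing `\b` anchors are an added assumption so that `or` does not match the start of a word like `order`.

```python
import re

reserved = {'if': 'IF', 'then': 'THEN', 'else': 'ELSE', 'while': 'WHILE'}

# Rules in priority order: earlier alternatives win, which mimics
# function rules being tried in the order they are defined.
RULES = [
    ('DEPT_CODE', r'[A-Z]{2,}\b'),      # \b is an assumption, not in the original
    ('COURSE_NUMBER', r'[0-9]{4}\b'),
    ('OR_CONJ', r'or\b'),
    ('ID', r'[a-zA-Z_][a-zA-Z_0-9]*'),  # generic rule goes last
]
MASTER = re.compile('|'.join(f'(?P<{name}>{pattern})' for name, pattern in RULES))

def tokenize(text):
    """Return a list of (type, value) pairs for the given input string."""
    pos = 0
    tokens = []
    while pos < len(text):
        if text[pos] in ' \t':          # same characters as t_ignore
            pos += 1
            continue
        m = MASTER.match(text, pos)
        if m is None:
            raise SyntaxError(f'Illegal character {text[pos]!r} at {pos}')
        kind, value = m.lastgroup, m.group()
        if kind == 'ID':
            # Reserved words are matched as IDs, then retyped by lookup.
            kind = reserved.get(value, 'ID')
        tokens.append((kind, value))
        pos = m.end()
    return tokens

print(tokenize('if CS 1234 or MATH 2000 then order'))
```

In PLY, the equivalent change would be to write t_DEPT_CODE and t_OR_CONJ as functions (each with its regex as the docstring and a `return t`) placed above t_ID, so that they are tried first.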
