How to create an AST with ANTLR from a hierarchical key-value syntax

Posted by Brabster on Stack Overflow
Published on 2012-11-22T10:58:47Z Indexed on 2012/11/22 10:59 UTC

I've been looking at parsing a key-value data format with ANTLR. The format itself is pretty straightforward, but the keys represent a hierarchy.

A simplified example of my input syntax:

/a/b/c=2
/a/b/d/e=3
/a/b/d/f=4

In my mind, this represents a tree structured as follows:

(a (b (= c 2) (d (= e 3) (= f 4))))
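
To make the intended mapping concrete, here is a minimal sketch outside ANTLR that folds the key paths into a nested structure and prints it in the same notation. It is written in Python purely for illustration; `build_tree` and `to_sexpr` are hypothetical helper names, and it assumes no key is used both as a leaf and as a parent.

```python
def build_tree(lines):
    """Fold '/a/b/c=2' style lines into nested dicts; leaves hold the value string."""
    root = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        path, value = line.split("=", 1)
        # Everything before the last path component is a parent key.
        *parents, leaf = path.strip("/").split("/")
        node = root
        for key in parents:
            # Reuse an existing child for this key, or create it on first sight.
            node = node.setdefault(key, {})
        node[leaf] = value
    return root


def to_sexpr(tree):
    """Render the nested dicts in the (parent child...) notation used above."""
    parts = []
    for key, child in tree.items():
        if isinstance(child, dict):
            parts.append("(" + key + " " + to_sexpr(child) + ")")
        else:
            parts.append("(= " + key + " " + child + ")")
    return " ".join(parts)


print(to_sexpr(build_tree(["/a/b/c=2", "/a/b/d/e=3", "/a/b/d/f=4"])))
# prints: (a (b (= c 2) (d (= e 3) (= f 4))))
```

The point of the sketch is that shared prefixes are merged across lines, which is exactly the step a per-line parser rule does not perform on its own.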

The nearest I can get is to use the following grammar:

/* Parser Rules */
start: (component NEWLINE?)* EOF -> (component)*;

component: FORWARD_SLASH ALPHA_STRING component -> ^(ALPHA_STRING component)
  | FORWARD_SLASH ALPHA_STRING EQUALS value -> ^(EQUALS ALPHA_STRING value);

value: ALPHA_STRING;

/* Lexer Rules */
NEWLINE : '\r'? '\n';
ALPHA_STRING : ('a'..'z'|'A'..'Z'|'0'..'9')+;
EQUALS : '=';
FORWARD_SLASH : '/';

Which produces:

(a (b (= c 2))) (a (b (d (= e 3)))) (a (b (d (= f 4))))

I'm not sure whether I'm asking too much of a generic tool such as ANTLR here, and whether this is as close as I can get with this approach. That is, from here I would consume the parts of the tree and build the data structure I want by hand.
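
The "by hand" step could look roughly like the following sketch, which merges the per-line trees after parsing. It is only an illustration under stated assumptions: `Node` is a hypothetical stand-in for an AST node (token text plus children), not ANTLR's own tree class, and `merge_siblings` is an invented name. Python is used just to keep the example short.

```python
class Node:
    """Hypothetical stand-in for an AST node: a token text plus ordered children."""

    def __init__(self, text, children=None):
        self.text = text
        self.children = list(children or [])

    def __repr__(self):
        if not self.children:
            return self.text
        return "(" + " ".join([self.text] + [repr(c) for c in self.children]) + ")"


def merge_siblings(nodes):
    """Merge sibling key nodes that share the same text, recursively."""
    merged = []
    by_text = {}
    for node in nodes:
        # '=' nodes (and bare leaves) are assignments, never merge targets.
        if node.text == "=" or not node.children:
            merged.append(node)
            continue
        target = by_text.get(node.text)
        if target is None:
            target = Node(node.text)
            by_text[node.text] = target
            merged.append(target)
        target.children.extend(node.children)
    for node in merged:
        if node.text != "=":
            node.children = merge_siblings(node.children)
    return merged


# The three per-line trees from the output above, built by hand.
flat = [
    Node("a", [Node("b", [Node("=", [Node("c"), Node("2")])])]),
    Node("a", [Node("b", [Node("d", [Node("=", [Node("e"), Node("3")])])])]),
    Node("a", [Node("b", [Node("d", [Node("=", [Node("f"), Node("4")])])])]),
]
print(" ".join(repr(n) for n in merge_siblings(flat)))
# prints: (a (b (= c 2) (d (= e 3) (= f 4))))
```

Keeping the merge as a separate pass leaves the grammar unchanged; whether the same effect can be achieved inside the grammar itself is the open question.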

So - can I produce the tree structure I want directly from a grammar? If so, how? If not, why not - is it a technical limitation in ANTLR or is it something more CS-y to do with the type of language involved?

© Stack Overflow or respective owner
