Treetop basic parsing and regular expression usage
- by ucint
I'm developing a script using the Ruby Treetop library and am having trouble with its syntax for regexes. For a start, many regular expressions that work in other settings don't work the same way in Treetop.
This is my grammar: (myline.treetop)
grammar MyLine
  rule line
    string whitespace condition
  end
  rule string
    [\S]*
  end
  rule whitespace
    [\s]*
  end
  rule condition
    "new" / "old" / "used"
  end
end
This is my usage: (usage.rb)
require 'rubygems'
require 'treetop'
require 'polyglot'
require 'myline'
parser = MyLineParser.new
p parser.parse("randomstring new")
This finds the word new, as expected. Now I want to extend it so that it can find new when the input string becomes "randomstring anotherstring new yetanother andanother",
with any number of strings separated by whitespace (tabs included) before and after the match for rule condition. In other words, if I pass it any sentence containing the word "new" (or "old" or "used"), it should match.
So let's say I change my grammar to:
  rule line
    string whitespace condition whitespace string
  end
Then, it should be able to find a match for:
p parser.parse("randomstring new anotherstring")
So, what do I have to do to allow the string whitespace pair to be repeated before and after condition? I tried writing this:
  rule line
    (string whitespace)* condition (whitespace string)*
  end
With that rule, parsing goes into an infinite loop. If I replace the parentheses above with [], parse returns nil.
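A likely cause of the loop, offered as a sketch rather than a confirmed diagnosis: both [\S]* and [\s]* can match the empty string, so one pass through (string whitespace) can succeed without consuming any input, and a * around it may then repeat at the same position indefinitely. The plain-Ruby snippet below imitates the two rules with StringScanner. It does not use Treetop, and the three-pass loop is only an illustration; it records where each pass starts and ends.

```ruby
require 'strscan'

# Stand-ins for the two grammar rules; these are plain Ruby regexes, not Treetop.
STRING_RULE     = /\S*/  # like rule string
WHITESPACE_RULE = /\s*/  # like rule whitespace

scanner = StringScanner.new("randomstring new")
passes = []

# Imitate three iterations of (string whitespace)*.
3.times do
  start = scanner.pos
  scanner.scan(STRING_RULE)      # may match zero characters
  scanner.scan(WHITESPACE_RULE)  # may match zero characters
  passes << [start, scanner.pos]
end

p passes  # => [[0, 13], [13, 16], [16, 16]]
```

The second pass swallows "new" before condition gets a chance to see it, and the third pass succeeds without moving at all, which is the kind of step that can repeat forever.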
In general, ordinary regexes return a match when I use the patterns above, but Treetop's don't.
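One way the repeated form might be written, as an untested sketch: require at least one character per word and per whitespace run (+ instead of *), and use a negative lookahead (!) so that a word is not allowed to consume the condition itself. Treetop grammars are PEGs, so a repetition does not give input back the way a regex would. The rule names word and condition_word are made up for this example.

```
grammar MyLine
  rule line
    # any number of non-condition words, then the condition, then more words
    (!condition_word word whitespace)* condition_word (whitespace word)*
  end
  rule word
    [\S]+
  end
  rule whitespace
    [\s]+
  end
  rule condition_word
    # the condition must be a whole word, so "newer" does not count
    ("new" / "old" / "used") ![\S]
  end
end
```

With these rules, "randomstring anotherstring new yetanother andanother" should parse, while input with leading or trailing whitespace would still need extra handling.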
Does anyone have any tips or pointers on how to go about this? Also, since there isn't much documentation for Treetop and the examples are either too trivial or too complex, does anyone know of more thorough documentation or a guide for Treetop?