Can Haskell's Parsec library be used to implement a recursive descent parser with backup?

Posted by Thor Thurn on Stack Overflow See other posts from Stack Overflow or by Thor Thurn
Published on 2010-03-20T14:37:19Z Indexed on 2010/03/20 14:41 UTC
Read the original article Hit count: 364

I've been considering using Haskell's Parsec parsing library to parse a subset of Java as a recursive descent parser as an alternative to more traditional parser-generator solutions like Happy. Parsec seems very easy to use, and parse speed is definitely not a factor for me. I'm wondering, though, if it's possible to implement "backup" with Parsec, a technique which finds the correct production to use by trying each one in turn. For a simple example, consider the very start of the JLS Java grammar:

Literal:
    IntegerLiteral  
    FloatingPointLiteral

I'd like a way to not have to figure out how I should order these two rules to get the parse to succeed. As it stands, a naive implementation like this:

literal = do {
    x <- try (do { v <- integer; return (IntLiteral v)}) <|>
         (do { v <- float; return (FPLiteral v)});
    return(Literal x)
}

Will not work... inputs like "15.2" will cause the integer parser to succeed first, and then the whole thing will choke on the "." symbol. In this case, of course, it's obvious that you can solve the problem by re-ordering the two productions. In the general case, though, finding things like this is going to be a nightmare, and it's very likely that I'll miss some cases. Ideally, I'd like a way to have Parsec figure out stuff like this for me. Is this possible, or am I simply trying to do too much with the library? The Parsec documentation claims that it can "parse context-sensitive, infinite look-ahead grammars", so it seems like something like I should be able to do something here.

© Stack Overflow or respective owner

Related posts about haskell

Related posts about parsec