Parsing: How to make error recovery in grammars like " a* b*"?

Posted by Lavir the Whiolet on Stack Overflow See other posts from Stack Overflow or by Lavir the Whiolet
Published on 2010-12-24T07:50:39Z Indexed on 2010/12/24 7:54 UTC
Read the original article Hit count: 215

Let we have a grammar like this:

Program ::= a* b*

where "*" is considered to be greedy.

I usually implement "*" operator naively:

  1. Try to apply the expression under "*" to input one more time.
  2. If it has been applied successfully then we are still under current "*"-expression; try to apply the expression under "*" one more time.
  3. Otherwise we have reached next grammar expression; put characters parsed by expression under "*" back into input and proceed with next expression.

But if there are errors in input in any of "a*" or "b*" part such a parser will "think" that in position of error both "a*" and "b*" have finished ("let's try "a"... Fail! OK, it looks like we have to proceed to "b*". Let's try "b"... Fail! OK, it looks like the string should have been finished...).

For example, for string "daaaabbbbbbc" it will "say": "The string must end at position 1, delete superflous characters: daaaabbbbbbc".

In short, greedy "*" operator becomes lazy if there are errors in input.

How to make "*" operator to recover from errors nicely?

© Stack Overflow or respective owner

Related posts about parsing

Related posts about language-agnostic