Parsing: How to make error recovery in grammars like " a* b*"?
- by Lavir the Whiolet
Let we have a grammar like this:
Program ::= a* b*
where "*" is considered to be greedy.
I usually implement "*" operator naively:
Try to apply the expression under "*" to input one more time.
If it has been applied successfully then we are still under current "*"-expression; try to apply the expression under "*" one more time.
Otherwise we have reached next grammar expression; put characters parsed by expression under "*" back into input and proceed with next expression.
But if there are errors in input in any of "a*" or "b*" part such a parser will "think" that in position of error both "a*" and "b*" have finished ("let's try "a"... Fail! OK, it looks like we have to proceed to "b*". Let's try "b"... Fail! OK, it looks like the string should have been finished...).
For example, for string "daaaabbbbbbc" it will "say": "The string must end at position 1, delete superflous characters: daaaabbbbbbc".
In short, greedy "*" operator becomes lazy if there are errors in input.
How to make "*" operator to recover from errors nicely?