Parsing: How to implement error recovery in grammars like "a* b*"?

Posted by Lavir the Whiolet on Stack Overflow
Published on 2010-12-24T07:50:39Z

Filed under: parsing | language-agnostic | error-correction | error-recovery

Suppose we have a grammar like this:

Program ::= a* b*

where "*" is considered to be greedy.

I usually implement the "*" operator naively:

Try to apply the expression under "*" to the input one more time.
If it was applied successfully, we are still inside the current "*"-expression; try to apply the expression under "*" one more time.
Otherwise we have reached the next grammar expression; put the characters consumed by the failed attempt back into the input and proceed with the next expression.
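
The three steps above can be sketched roughly as follows. This is only an illustration under assumptions, not the asker's actual code: the `Parser` type, the `literal` helper, and the `star` combinator are hypothetical names, and a parser is assumed to be a function that returns the new input position on success or `None` on failure.

```python
from typing import Callable, Optional

# Assumption: a parser takes (text, pos) and returns the new position,
# or None if it could not match at pos.
Parser = Callable[[str, int], Optional[int]]

def literal(ch: str) -> Parser:
    """Hypothetical helper: match exactly one given character."""
    def parse(text: str, pos: int) -> Optional[int]:
        if pos < len(text) and text[pos] == ch:
            return pos + 1
        return None
    return parse

def star(inner: Parser) -> Parser:
    """Naive greedy "*": keep applying `inner` until it fails."""
    def parse(text: str, pos: int) -> Optional[int]:
        while True:
            nxt = inner(text, pos)
            if nxt is None or nxt == pos:
                # The attempt failed (or consumed nothing): its characters
                # are "put back" by keeping the old position, and the caller
                # proceeds with the next grammar expression.
                return pos
            pos = nxt
    return parse

a_star = star(literal("a"))
print(a_star("aaab", 0))  # 3
print(a_star("daaa", 0))  # 0
```

Note that `star` itself never fails: any mismatch is silently treated as the end of the repetition, which is exactly the property that causes trouble with erroneous input.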

But if the input contains an error in either the "a*" or the "b*" part, such a parser will "think" that both "a*" and "b*" have finished at the position of the error ("Let's try "a"... Fail! OK, it looks like we have to proceed to "b*". Let's try "b"... Fail! OK, it looks like the string should have ended here...").

For example, for the string "daaaabbbbbbc" it will "say": "The string must end at position 1, delete superfluous characters: daaaabbbbbbc".
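
The behaviour described above can be reproduced with a small self-contained sketch. The names `star` and `parse_program` are hypothetical, the "*" is specialised to single characters for brevity, and the message wording and 1-based position are assumptions chosen to mirror the example.

```python
def star(ch: str, text: str, pos: int) -> int:
    """Naive greedy "*" over one character: advance while `ch` matches."""
    while pos < len(text) and text[pos] == ch:
        pos += 1
    return pos

def parse_program(text: str) -> str:
    """Program ::= a* b*, followed by the expected end of input."""
    pos = star("a", text, 0)    # stops at the first non-"a"
    pos = star("b", text, pos)  # stops at the first non-"b"
    if pos == len(text):
        return "ok"
    # Positions are reported 1-based, as in the example message.
    return (f"The string must end at position {pos + 1}, "
            f"delete superfluous characters: {text[pos:]}")

print(parse_program("aaaabbbbbb"))
print(parse_program("aaaabbbbbbc"))
print(parse_program("daaaabbbbbbc"))
```

With the stray "c" at the end, only that one character is reported. With the stray "d" at the start, both loops stop immediately at position 0, so the whole input is reported as superfluous.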

In short, the greedy "*" operator becomes lazy if there are errors in the input.

How can I make the "*" operator recover from errors gracefully?
