How lookaheads are propagated in "channel" method of building LALR parser?

Posted by greenoldman on Programmers See other posts from Programmers or by greenoldman
Published on 2012-11-30T15:33:42Z Indexed on 2012/11/30 17:19 UTC
Read the original article Hit count: 335

Filed under:

The method is described in Dragon Book, however I read about it in ""Parsing Techniques" by D.Grune and C.J.H.Jacobs".

I start from my understanding of building channels for NFA:

  1. channels are built once, they are like water channels with current
  2. you "drop" lookahead symbols in right places (sources) of the channel, and they propagate with "current"
  3. when symbol propagates, there are no barriers (the only sufficient things for propagation are presence of channel and direction/current); i.e. lookahead cannot just die out of the blue

Is that right?

If I am correct, then eof lookahead should be present in all states, because the source of it is the start production, and all other production states are reachable from start state.

How the DFA is made out of this NFA is not perfectly clear for me -- the authors of the mentioned book write about preserving channels, but I see no purpose, if you propagated lookaheads. If the channels have to be preserved, are they cut off from the source if the DFA state does not include source NFA state? I assume no -- the channels still runs between DFA states, not only within given DFA state.

In the effect eof should still be present in all items in all states. But when you take a look at DFA presented in book (pdf is from errata):

DFA for LALR (fig. 9.34 in the book, p.301)

you will see there are items without eof in lookahead.

The grammar for this DFA is:

S -> E
E -> E - T
E -> T
T -> ( E )
T -> n

So how it was computed, when eof was dropped, and on what condition?

Update

It is textual pdf, so two interesting states (in DFA; # is eof):

State 1:

S--- >•E[#]
E--- >•E-T[#-]
E--- >•T[#-]
T--- >•n[#-]
T--- >•(E)[#-]

State 6:

T--- >(•E)[#-)]
E--- >•E-T[-)]
E--- >•T[-)]
T--- >•n[-)]
T--- >•(E)[-)]

Arc from 1 to 6 is labeled (.

© Programmers or respective owner

Related posts about parser