LALR: ParseToken Function

 

function ParseToken (Token) returns Parse-Result-Token

This function implements the LALR Parser.

Parameters:

Token created by the RetrieveToken() function

Returns

Parse-Result-Constant which tells the Parse() function the action that was performed. This constant is used
internally.

This function analyzes a token and either:
1. Makes a SINGLE reduction and pushes a Reduction on the stack.
2. Accepts the token and shifts.
3. Reports (and handles) a syntax error

if an Action for Token exists in the Current-LALR-State

case type of Action is

Accept:

Set Result = Accept.

Shift:

Set the State property of Token to the Current-LALR-State.
Push Token onto the LALR-Token-Stack.
Set Current-LALR-State = target state of Action.

Set Result = Shift.

Reduce:

This section of the algorithm will reduce the rule specified by the action A.

At this point during the parse, the Token Stack contains a series of tokens that represent the body of the rule being reduced. This is called the 'handle'. For instance, given the rule:

<S> ::= if <E> then <S>

the handle is

if <E> then <S>

These tokens are then popped from the stack and are replaced by the nonterminal token representing the handle. In the example above, this will be <S>.

The token structure / object normally contains a 'data' attribute that is used to store the text read from the source stream. If the token contains a nonterminal, this attribute can be used to store a Reduction object.

Set Reduce-Rule = rule to be reduced which is specified in Action.

PART 1: REDUCE THE RULE

If Trim-Reductions is active and Reduce-Rule contains one nonterminal

The "Trim Reductions" feature is designed to simplify the parse tree by removing unneeded reductions. In particular, reductions which contain a single nonterminal can be eliminated from the tree without changing its meaning.

This feature is particularly useful if the parse tree is going to be analyzed based on its "content" rather than the structure of the language.

Essentially, the Data property of the Head will be assigned the Data property of the reduced token (i.e. the only one on the stack). The token on the top of the stack can either be replaced for modified. The code below discards it.

Set Item = Pop the top token off the LALR-Token-Stack.

Create a new token Head.
Set Parent-Symbol property of Head to the rule being reduced by Action.
Set the Data propety of Head to the Data property of Item.

Set Result = Reduce-Trimmed.

else

Part 1.a: Pop the handle off the Token Stack and create a reduction.

Create a new Reduction.
Set the Parent-Rule property of Reduction to the rule reduced by Action.

loop for each symbol in the handle of the Reduce-Rule

Pop a token Item from the stack.
Add Item to Reduction.

next loop

Part 2.b: Create a new token that will contain the nonterminal representing the reduced rule.

Create a new token Head.
Set Parent-Symbol property of Head to the rule being reduced by Action.
Set the Data property of Head to the Reduction.

Set Result = Reduce.

end if

PART 2: GO TO THE NEXT STATE
The top of the token stack contains the LALR state - in the token's 'data' field - that the algorithm was in before it shifted.

The algorithm looks at this state and takes the GOTO action for the rule that was just reduced. Essentially, the rule's nonterminal Head is found in state referred to by the top of the stack.

This simulates the shift of a nonterminal.

Set Lookup-State = State property of the token on the top of the LALR-Token-Stack.

Set Current-LALR-State = State specified by the goto for Head in Lookup-State.

PART 3: PUSH THE NEW NONTERMINAL TOKEN

Set the State property of Head to the Current-LALR-State.
Push Head onto the LALR-Token-Stack.

end case

else

There is a syntax error!

Each implementation of the Engine can take whatever actions that the developer thinks is best.

One possible approach is to read through the Shift actions for the Current-LALR-State. The terminal for each Shift is an "expected" reserved word or symbol. This information can be passed to the application.

Set Result = Syntax-Error.

end if

Return the Result.

end function