|Backus-Naur Form||Backus-Naur Form is a notation used to describe the syntax of programming languages. In particular is it used to define productions.|
|Configuration||In parsing terms, a configuration is a production in the process of being completed. Configurations play a major role in the construction of LR parse tables.|
|A Deterministic Finite Automaton is often used to analyze a series of characters. Often it is implemented using state driven transition graph. Please see theDeterministic Finite Automata page for more information.|
|Grammar||Please see theGrammar and Backus-Naur Form page|
|LALR Parsing||LALR Parsing, or "Lookahead LR
parsing", is a variant of LR Parsing that
combines different "configurations" to limit the size of the parse tables. As a
result, the algorithm is slightly less powerful than the LR Parsing. Grammars that can be
parsed by a LR parser, might be found to be "ambiguous" by LALR. However, this
is very rarely the case and real-world examples are few.
The number of states eliminated by LALR are sometimes huge. The C programming language, for instance, has over 10,000 LR states. LALR drops this number to slightly more than 200.
LR Parsing, or Left-to-right Right-derivative parsing, uses tables to determine when a rule is complete and when additional tokens are needed to be read from the source.The LR Parser does very little "thinking" at runtime. All decisions are based on the content of the parse tables. The construction of these tables where all the "thinking" takes place. LR parser generators, such as YACC and GOLD, construct these tables by analyzing the grammar and determining all the possible "states" the system can be in when parsing.
Each state represents a point in the parse process where a number of tokens have been read from the source and rules are in different states of completion. Each production in a state of completion is called a "configuration" and each state is really a configuration set.
LR parse tables can be huge, and, as a result, often a variant of LR Parsing is used. For more information, please see theLALR Algorithm page.
|Nonterminal||A nonterminal is a symbol used in Backus-Naur
form represent a syntactic structures defined in the grammar.
Please see theDefine Rules page
A nullable rule is a type of rule which is optional, or in other words, can contain no symbols. In the GOLD Meta-Language, a nullable rule can be specified by adding a blank entry:
|Parser||A parser is software, such as a procedure or library, that organizes text into a set of logical units used by a programming language.|
|Parser Generator||A parser generator is a generic term that refers to any software program that helps develop a working parser. Examples include the GOLD, YACC and ANTLR.|
|Production||Please see theDefine Rules page.|
|A Reduce-Reduce Conflict is a caused when a grammar allows two
or more different rules to be reduced at the same time,
for the same token. When this happens, the grammar becomes ambiguous since a program can
be interpreted more than one way.
For instance, assume you have the following grammar:
When the system reads an identifier, it cannot determine if it has completed <A> or<B>.
When a LALR parser generator is analyzing a grammar and constructing the parse tables, these conflicts are located immediately.
|Regular Expression||A Regular Expression is a notation used describe patterns of characters. In programming language theory, they are often used to describe a languages' terminals. For more information, please see theDefine Terminals page|
|Rule||Please see theDefine Rules page.|
|Semantics||The semantics of a programming language refers to how the
actual statements, constructs, etc... are interpreted. This is quite different from the
syntax of a programming language which refers to how
different symbols and reserved words are arranged.
For instance, in both Visual Basic and C++, the following is a valid expression:
Even though the syntax is the same between the two languages, the semantics are quite different. In C++, this expression is interpreted as a binary-and of "a" and "b". In Visual Basic, however, this expression returns the concatenation of two strings.
|Symbol||In parsing terms, a symbol is the building block of a grammar and can be either a terminal or nonterminal. Essentially, the term "symbol" is used to refer to either of its two forms.|
|Syntax||The term "syntax" refers to the structure of a programming language, in particular, the different series of symbols and words that make up the basic parts of the language. This is quite different from the programming language's semantics - the actual meaning of those different parts.|
|Shift||A "shift" is an action performed by a parsing
engine when it reads a token that is valid in the current state.
Lookahead parsers maintain a list of terminals which are expected to be read from each state,
given that the syntax of the program is correct. If the token is not the expected, a
syntax error occurs.
In the bottom-up parser engines, such as GOLD and YACC, a shift pushes the token onto an internal stack that is used to hold tokens that are being used to construct completed productions.
|The Shift-Reduce Conflict is the most common type of conflict
found in grammars. It is caused when the grammar allows a
to be reduced for particular token, but, at the
same time, allowing another rule to be
shifted for that
As a result, the grammar is ambiguous since a program can be interpreted more than one way.
This error is often caused by recursive grammar definitions where the system cannot determine when one rule is complete and another is just started. The Builder documentation contains an example of the common if-then-else grammar problem and how to fit it.
|Terminal||In Backus-Naur Form, a terminal is used to denote a programming language's reserved words and symbols. Please see thePlease see theDefine Rules page.|