What is a "Parser"?

While the text of a program is easy for humans to understand, the computer must convert it into a form it can work with before any interpretation or compilation can begin.

This process is generally known as "parsing" and consists of two distinct parts.

Components

Lexical Analysis

The first component is called the "lexer", sometimes also called the "scanner". The lexer takes the source text and breaks it into the reserved words, constants, identifiers, and symbols that are defined in the language.

Lexical analysis is concerned with a grammar's terminals.

The result of the lexical analysis is a series of "tokens" that contain the text of the source broken into individual pieces of data. While terminals represent the classification of information, tokens contain the actual information.

Essentially, a token is an instance of a terminal. For example, the common identifier is a specific type of terminal, but it can appear in various forms such as "Value1", "cat", "Sacramento", etc.
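The distinction between terminals and tokens can be sketched in a few lines of Python. This is only an illustration, not how GOLD itself is implemented: the terminal names, the regular expressions, and the `tokenize` function are all assumptions made up for a toy language.

```python
import re
from typing import NamedTuple

class Token(NamedTuple):
    terminal: str  # the classification, e.g. "Identifier"
    text: str      # the actual information, e.g. "Value1"

# Hypothetical terminals for a toy language (not taken from any real grammar).
TERMINALS = [
    ("Whitespace", r"\s+"),
    ("Number",     r"\d+"),
    ("Identifier", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("Symbol",     r"[=+\-*/()]"),
]
PATTERN = re.compile("|".join(f"(?P<{name}>{regex})" for name, regex in TERMINALS))

def tokenize(source):
    """Break the source text into tokens, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = PATTERN.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        if match.lastgroup != "Whitespace":
            # lastgroup names the terminal whose pattern matched.
            tokens.append(Token(match.lastgroup, match.group()))
        pos = match.end()
    return tokens

tokens = tokenize("Value1 = cat + 42")
```

Here "Value1" and "cat" become two different tokens that share the same terminal, Identifier.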

Syntactic Analysis

Syntactic analysis is concerned with a grammar's productions.

After the text is broken into a stream of tokens, the system needs to determine which groups of tokens form the meaningful constructs of the language.

The second component is called the "parser". This is where the terminology gets a tad confusing. Since a parser requires a lexer to function properly, the term "parser" is often used to refer to both.

The tokens created by the lexer are subsequently passed to the actual "parser", which analyzes the series of tokens and determines when one of the language's syntax rules is complete.
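To make "determining when a rule is complete" concrete, here is a minimal sketch of a recognizer for a two-rule toy grammar. It uses simple recursive descent, which is a different technique from the LALR parsing covered elsewhere on this site, and the grammar, the `Recognizer` class, and the `recognize` function are hypothetical names chosen for illustration. Tokens are written by hand as (terminal, text) pairs, standing in for lexer output.

```python
class Recognizer:
    """Recognizes a toy grammar (assumed for this example only):

        <Expression> ::= <Value> '+' <Expression> | <Value>
        <Value>      ::= Identifier | Number
    """

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0
        self.completed = []  # rules, in the order they are completed

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def value(self):
        tok = self.peek()
        if tok is None or tok[0] not in ("Identifier", "Number"):
            raise SyntaxError(f"expected a value at token {self.pos}")
        self.pos += 1
        self.completed.append(f"<Value> ::= {tok[0]}")

    def expression(self):
        self.value()
        if self.peek() == ("Symbol", "+"):
            self.pos += 1          # consume '+'
            self.expression()      # right-hand side
            self.completed.append("<Expression> ::= <Value> '+' <Expression>")
        else:
            self.completed.append("<Expression> ::= <Value>")

def recognize(tokens):
    """Return the list of completed rules, or raise SyntaxError."""
    recognizer = Recognizer(tokens)
    recognizer.expression()
    if recognizer.peek() is not None:
        raise SyntaxError("unexpected trailing token")
    return recognizer.completed

# Tokens for the source text "cat + 42".
rules = recognize([("Identifier", "cat"), ("Symbol", "+"), ("Number", "42")])
```

The `completed` list records each rule at the moment its last symbol has been seen, so inner rules such as `<Value>` always appear before the rules that contain them.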

Finally...

The result of the lexical and syntactic analysis components is a tree that follows the structure of the grammar and contains all the tokens created by the lexer. Essentially, nonterminals function as the tree's interior nodes while tokens are the tree's leaves.
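The shape of such a tree can be sketched as a small data structure. The `Node` and `Leaf` classes, the `leaves` helper, and the toy grammar behind the example tree are assumptions for illustration, not the representation any particular parser engine uses.

```python
from dataclasses import dataclass, field

@dataclass
class Leaf:
    """A token: a terminal plus the text it was created from."""
    terminal: str
    text: str

@dataclass
class Node:
    """A nonterminal with the symbols of one completed rule as children."""
    nonterminal: str
    children: list = field(default_factory=list)

def leaves(tree):
    """Collect the token text at the leaves, left to right."""
    if isinstance(tree, Leaf):
        return [tree.text]
    result = []
    for child in tree.children:
        result.extend(leaves(child))
    return result

# Hand-built tree for the source text "cat + 42" under a hypothetical grammar:
#   <Expression> ::= <Value> '+' <Expression> | <Value>
#   <Value>      ::= Identifier | Number
tree = Node("Expression", [
    Node("Value", [Leaf("Identifier", "cat")]),
    Leaf("Symbol", "+"),
    Node("Expression", [
        Node("Value", [Leaf("Number", "42")]),
    ]),
])

source_tokens = leaves(tree)
```

Reading the leaves from left to right gives back the original token stream, while the interior nodes record which grammar rule grouped them.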

In this form, the program is ready to be processed by the application. This can mean compiling it to a new program, interpreting it directly, or translating the text to another programming language.
