Difference between revisions of "Recursive Descent Parsers"

From Progzoo
Jump to: navigation, search
(Production Rules)
 
(4 intermediate revisions by the same user not shown)
Line 1: Line 1:
 
A Recursive Descent Parser takes a stream of tokens and builds a syntax tree.
 
A Recursive Descent Parser takes a stream of tokens and builds a syntax tree.
 +
[[Recursive Descent Parser Tutorial]]
 
==JSON Data Format==
 
==JSON Data Format==
 
As an example we will take the JSON data format. This is simple, but it does have the recursive structure that shows off the technique.
 
As an example we will take the JSON data format. This is simple, but it does have the recursive structure that shows off the technique.
Line 11: Line 12:
 
There are hash arrays
 
There are hash arrays
 
For example:
 
For example:
*{num:123, str: "one" }
+
*{a:123, b: "one" }
 
You can have a list inside a list of course:
 
You can have a list inside a list of course:
 
*[1, 2, [3, 4]]
 
*[1, 2, [3, 4]]
 
Similarly you can have a hash array of lists, and in JavaScript you can mix types:
 
Similarly you can have a hash array of lists, and in JavaScript you can mix types:
 
*{a:1, b:[1,2,3], c:{x:2}}
 
*{a:1, b:[1,2,3], c:{x:2}}
 +
 
==Production Rules==
 
==Production Rules==
 +
An eta ε production allows the empty string as an option.
 +
 
  N -> ''number''
 
  N -> ''number''
 
  N -> [ N Z ]
 
  N -> [ N Z ]
Line 24: Line 28:
 
  Y -> , ''word'' : N Y
 
  Y -> , ''word'' : N Y
 
  Y -> ε
 
  Y -> ε
 +
Productions such as
 +
N -> ''number''
 +
Are permitted because the lexer can recognise numbers, similarly strings.
 +
==A Recursive Descent Parser in Java==
 +
*Each of the non-terminals N Z and Y have a method ParseN, ParseZ and ParseY.
 +
*In each case the method takes the token stream as input and does the following:
 +
**Consumes all tokens associated with the thing parsed
 +
**Returns a data structure representing the thing parsed
 +
<syntaxhighlight>
 +
static Object ParseN(StreamTokenizer st) throws Exception
 +
{
 +
st.nextToken();
 +
if (st.ttype == StreamTokenizer.TT_NUMBER)
 +
return null;
 +
if (st.ttype=='[')
 +
{
 +
ParseN(st);
 +
ParseZ(st);
 +
if (st.nextToken()!=']')
 +
throw new Exception("Was expecting ]");
 +
return null;
 +
}
 +
… section missing …
 +
throw new Exception("Unexpected input.");
 +
}
 +
</syntaxhighlight>
 +
[[Recursive Descent Parser Tutorial]]

Latest revision as of 08:42, 8 March 2011

A Recursive Descent Parser takes a stream of tokens and builds a syntax tree. Recursive Descent Parser Tutorial

JSON Data Format

As an example we will take the JSON data format. This is simple, but it does have the recursive structure that shows off the technique. JSON is used to represent data. There are two primitive type: string and number.

  • "one"
  • 123

There are lists For example:

  • [123, "one" ]

There are hash arrays For example:

  • {a:123, b: "one" }

You can have a list inside a list of course:

  • [1, 2, [3, 4]]

Similarly you can have a hash array of lists, and in JavaScript you can mix types:

  • {a:1, b:[1,2,3], c:{x:2}}

Production Rules

An eta ε production allows the empty string as an option.

N -> number
N -> [ N Z ]
N -> { word : N Y }
Z -> , N Z
Z -> ε
Y -> , word : N Y
Y -> ε

Productions such as

N -> number

Are permitted because the lexer can recognise numbers, similarly strings.

A Recursive Descent Parser in Java

  • Each of the non-terminals N Z and Y have a method ParseN, ParseZ and ParseY.
  • In each case the method takes the token stream as input and does the following:
    • Consumes all tokens associated with the thing parsed
    • Returns a data structure representing the thing parsed
static Object ParseN(StreamTokenizer st) throws Exception
{
	st.nextToken();
	if (st.ttype == StreamTokenizer.TT_NUMBER)
		return null;
	if (st.ttype=='[')
	{
		ParseN(st);
		ParseZ(st);
		if (st.nextToken()!=']')
			throw new Exception("Was expecting ]");
		return null;
	}
	… section missing …
	throw new Exception("Unexpected input.");
}

Recursive Descent Parser Tutorial