homechevron_rightProfessionalchevron_rightComputers

# LL1 grammar analysis. Yet another top-down parser generator.

The calculator checks LL1 grammar correctness, parses a text using the grammar, shows FIRST, FOLLOW and FIRST PLUS sets, parsing tree and gives PLANETCALC parsing code.

The following calculator generates the parser code by the grammar given in EBNF form. As example we use classic math expression grammar.

The result can be used to create LL(1) parser using PCP library, available on this site .
Code example:

var parser = new PCP.parser( [/* put generated code here */] );
var tree = parser.parse( text ); //creates parsing tree
//result traversal
tree.visit( {
/* define custom traversal functions here with names according your grammar
(nt1 and nt2 names are choosen here for illustration purposes only);*/
"nt1": function ( v ) {
// use to get expression value: v.getValue()
}
,"nt2": function ( v ) {
// use for children traversal: v.visit( childvisitor );
}
});

Some details regarding EBNF syntax and this calculator usage are below the calculator.

### LL1 parser generator

Extended BNF grammar in ISO/IEC 14977 : 1996(E) form.

Rule names divided by comma, which must be removed.

Grammar analysis
PLANETCALC parser code

Expression parsing result

Save the calculation to reuse next time, to extension embed in your website or share share with friends.

## Extended Backus-Naur Form (EBNF)

EBNF has may advantages comparing to simple BNF:

• EBNF rule can contain a few text lines. The rule end is marked by the semicolon (;).

terminal character
= letter
| decimal digit
| concatenate symbol
| defining symbol
| definition separator symbol
| end comment symbol
| end group symbol
| end option symbol
| end repeat symbol
| except symbol
| first quote symbol
| repetition symbol
| second quote symbol
| special sequence symbol
| start comment symbol
| start group symbol
| start option symbol
| start repeat symbol
| terminator symbol
| other character;

• Nonterminal symbols must not be marked by angle brackets (but all terminal symbols bust be in single or double quotes).

decimal digit = ’0’ | ’1’ | ’2’ | ’3’ | ’4’ | ’5’ | ’6’ | ’7’| ’8’ | ’9’;

• EBNF has a special syntax for Kleene closure - any expression in braces can be repeated zero or more times:

syntax= syntax rule, {syntax rule};

• Exceptions can be defined using minus sign. For example the following excerption form EBNF standard defines any symbol except single quotation mark:

first terminal character= terminal character - first quote symbol;

Full RBNF description can be found in ISO/IEC 14977 standard.

## EBNF standard extensions

A few extensions to EBNF standard are implemented in the calculator above. As a special sequence you can use a regular expression or hexadecimal symbol code. E.g., for line feed you may use the following rule, defined by two hexadecimal symbols:

lineend= ?0xd?,?0xa?;

E.g. the following regular expression-based rule can be used for non printable symbols matching:

sp=?/\s+/?;

## Grammar analysis

The calculator performs simple grammar analysis. It detects the following errors and unacceptable conditions:

• Grammar syntax errors
• Root rule missing
• Single root rule check
• Left recursion
• Back-track free parsing condition

## Scanning and parsing combined together

In classic compiler theory there is scanning step which produces tokens from the input stream. During scanning, all unnecessary symbols e.g. unprintable ones and comments get separated from tokens, which is used further on parsing step. Commonly scanners are based on regular expressions. This calculator does scanning and parsing in the same way using single grammar. Commonly, if you define a single grammar for scanning and parsing, the grammar become very complex. To simplify the grammar we allow to split whole grammar in multiple parts each of which has its own root rule. Remove rules input box defines the rules to be removed during preliminary parsing (scanning) steps. To see an example grammar with two parsing steps and tokens to be removed on preliminary parsing step click the following link:

Two step grammar example

Literature: