Recursive descent parser in JavaScript
The parser enables you to write BNF-like rules directly in JavaScript without need to compile the file (like with Ragel, YACC, Bison, ANTLR etc.)
The grammar is a JS function with 11 arguments (9 rules and 3 hooks). Each rule gets two arguments: text (string) and state (arbitrary value) and returns a tuple of new text and new state (or null if rule was not matched). Parser walks character by character from left to right. text is always a tail of the initial text. Generally, text is empty string when parser finishes.
All(rule1, rule2, …) — a sequence of rules. Example: All(Char(“[”), list, Char(“]”)) defines a JS array.
Any(rule1, rule2, …) — “OR” rule for any number of rules. Example: JSONValue = Any(StringRule, ObjectRule, ArrayRule, …)
Char(alphabet) — character matching rule. Example: digit = Char(“0123456789”)
NotChar(alphabet) — inverse of Char(). Any character — NotChar(“”).
Optional(rule) — tries to match rule and skips it if not matched. Example: optSpace = Optional(Char(“ \t\r\n”))
EOF — matches the end of the text. Fails if text != “”.
Terminator — terminates parser. That is, always returns empty text.
Y — Y-combinator for defining recursive rules. Example: X = Y(function(x) { return abc(x) } ), where x is equal to X. Google for more info on Y-combinator.
Hooks enable you to actually build some structures using your grammars. Every hook returns a new state value to use in further rules. You should avoid mutable state values because some rules may be thrown away if not matched later (remember: this is a recursive parser!). For example, use array.concat([value]) instead of array.push(value).
Capture(rule, function(buffer, state1){ return state2 }) — captures raw string buffer to store in the state2.
Before(rule, function(state1){ return state2 }) — creates a new state for the rule (e.g. creates empty array for array syntax).
After(rule, function(beforeState, afterState){ return newState }) — creates a new state after successful match. You can put nested values to the container if beforeState is a container before rule parsing, afterState is a nested value (after rule match) and newState is a new container with this value.
See usage examples in JSONGrammar.js
See the parser source code in Parser.js
