Extended Parser<br />
Up until now we’ve been using parser combinators to build our parsers. Parser combinators build top-down<br />
parsers, formally in the LL(k) family. The parser proceeds top-down, with a sequence of<br />
k characters used to dispatch on the leftmost production rule. Combined with backtracking (i.e. the try<br />
combinator), this is both an extremely powerful and simple model to implement, as we<br />
saw before with our simple 100 line parser library.<br />
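However, there is a family of grammars with left recursion that LL(k) parsers handle inefficiently and are often<br />
incapable of parsing at all. A left-recursive rule is one where the left-most symbol of the rule recurses on<br />
itself. A naive top-down parser expands such a rule by calling itself before consuming any input, so it never<br />
terminates. For example:<br />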
e ::= e op atom<br />
We demonstrated before that these cases can be handled with the parser combinator<br />
chainl1 function. While this works, in many cases it makes inefficient use of the parser<br />
stack and can lead to ambiguous cases.<br />
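As a reminder of how chainl1 sidesteps the left-recursive rule, here is a minimal sketch. It assumes the ReadP combinators from base (Text.ParserCombinators.ReadP) rather than our own parser library, and the Expr type and the names atom, op, expr and parseExpr are hypothetical, chosen only for illustration. The rule e ::= e op atom is rewritten as one atom followed by zero or more (op atom) pairs, and the results are folded to the left.

```haskell
import Data.Char (isDigit)
import Text.ParserCombinators.ReadP

-- Hypothetical expression tree for the rule  e ::= e op atom
data Expr
  = Num Int
  | BinOp Char Expr Expr
  deriving (Eq, Show)

-- atom: a non-negative integer literal; trailing whitespace is skipped
atom :: ReadP Expr
atom = do
  ds <- munch1 isDigit
  skipSpaces
  pure (Num (read ds))

-- op: '+' or '-', returned as a function that builds a binary node
op :: ReadP (Expr -> Expr -> Expr)
op = do
  c <- char '+' +++ char '-'
  skipSpaces
  pure (BinOp c)

-- e ::= e op atom, expressed as  atom (op atom)*  and folded to the left,
-- so the parser never calls itself before consuming input
expr :: ReadP Expr
expr = chainl1 atom op

-- Accept only parses that consume the whole input
parseExpr :: String -> Maybe Expr
parseExpr s =
  case [e | (e, "") <- readP_to_S (skipSpaces *> expr <* eof) s] of
    [e] -> Just e
    _   -> Nothing
```

In GHCi, parseExpr "1 - 2 - 3" evaluates to Just (BinOp '-' (BinOp '-' (Num 1) (Num 2)) (Num 3)), which is the left-nested tree the left-recursive rule describes.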
The other major family of parsers, LR, is not plagued by the same concerns over left recursion. On the<br />
other hand, LR parsers are considerably more complicated to implement, relying on generated parse tables (and, in the<br />
generalized GLR case, a rather sophisticated method known as Tomita’s algorithm) to do the heavy lifting. The<br />
production rules are usually written in a DSL, in a form the algorithm can handle, and the tooling generates<br />
the code for the parser from it. While the tooling is fairly robust, there is a level of indirection between us and<br />
the code that can often be a bit brittle to extend with custom logic.<br />
The most common form of this toolchain is the Lex/Yacc lexer and parser generator pair, which compiles into<br />
efficient C parsers for LR grammars. Haskell’s Alex and Happy are roughly the Haskell equivalents of<br />
these tools.<br />
Toolchain<br />
Our parser logic will be spread across two different modules.<br />
• Lexer.x<br />
• Parser.y<br />
The code in each of these modules is a hybrid of the specific Alex/Happy grammar syntax and arbitrary<br />
Haskell logic that is spliced in. Code delimited by braces ({}) is regular Haskell, while code outside is<br />
parser/lexer logic.<br />
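To make the split concrete, here is a minimal, hypothetical Lexer.x sketch. It assumes Alex's "basic" wrapper, and the Token type and its constructors are invented for illustration rather than taken from our language. Everything inside braces is ordinary Haskell; the macro definition and the rules after tokens :- are Alex syntax.

```haskell
{
-- Ordinary Haskell: the module header is copied into the generated file
module Lexer where
}

%wrapper "basic"

$digit = 0-9

tokens :-
  $white+   ;
  $digit+   { \s -> TokenNum (read s) }
  \+        { \_ -> TokenPlus }

{
-- Ordinary Haskell again: the (hypothetical) token type the rules produce
data Token
  = TokenNum Int
  | TokenPlus
  deriving (Eq, Show)
}
```

With the basic wrapper, each rule's action is a function from the matched String to a token, and the generated module exposes alexScanTokens for turning an input string into a list of tokens.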