25.10.2015 Views

Write You a Haskell Stephen Diehl


Extended Parser

Up until now we've been using parser combinators to build our parsers. Parser combinators build top-down parsers, formally in the LL(k) family of parsers. The parser proceeds top-down, with a sequence of k characters used to dispatch on the leftmost production rule. Combined with backtracking (i.e. the try combinator) this is simultaneously both an extremely powerful and simple model to implement, as we saw before with our simple 100 line parser library.

However, there is a family of grammars that include left-recursion, which LL(k) parsers handle inefficiently and are often incapable of parsing at all. Left-recursive rules are the case where the left-most symbol of the rule recurses on itself. For example:

e ::= e op atom

We demonstrated before that we could handle these cases using the parser combinator chainl1 function. While this is possible, in many cases it makes inefficient use of the parser stack and can lead to ambiguous cases.
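
To make the chainl1 approach concrete, here is a minimal, self-contained sketch. It does not reuse the parser library from the earlier chapters: the Parser type, satisfy, token, symbol, atom, op and parseExpr are all stand-ins defined here for illustration, and the grammar is assumed to be integer literals joined by + and -. The idea is that the left-recursive rule e ::= e op atom is rewritten as "one atom, followed by zero or more (op atom) pairs", folding to the left as we go.

```haskell
import Control.Applicative (Alternative (..))
import Data.Char (isDigit, isSpace)

-- A deliberately tiny parser type (hypothetical, for this sketch only):
-- consume a prefix of the input and return the result together with the
-- remaining input, or fail with Nothing.
newtype Parser a = Parser { runParser :: String -> Maybe (a, String) }

instance Functor Parser where
  fmap f (Parser p) = Parser $ \s -> case p s of
    Nothing      -> Nothing
    Just (a, s') -> Just (f a, s')

instance Applicative Parser where
  pure a = Parser $ \s -> Just (a, s)
  Parser pf <*> Parser pa = Parser $ \s -> case pf s of
    Nothing      -> Nothing
    Just (f, s') -> case pa s' of
      Nothing       -> Nothing
      Just (a, s'') -> Just (f a, s'')

instance Monad Parser where
  Parser p >>= f = Parser $ \s -> case p s of
    Nothing      -> Nothing
    Just (a, s') -> runParser (f a) s'

instance Alternative Parser where
  empty = Parser (const Nothing)
  -- Try the left parser; on failure, retry the right one on the same input.
  Parser p <|> Parser q = Parser $ \s -> case p s of
    Nothing -> q s
    r       -> r

satisfy :: (Char -> Bool) -> Parser Char
satisfy ok = Parser $ \s -> case s of
  (c:cs) | ok c -> Just (c, cs)
  _             -> Nothing

-- Run a parser, then skip any whitespace that follows it.
token :: Parser a -> Parser a
token p = p <* many (satisfy isSpace)

symbol :: Char -> Parser Char
symbol = token . satisfy . (==)

-- atom: a non-negative integer literal.
atom :: Parser Int
atom = token (read <$> some (satisfy isDigit))

-- op: a binary operator, returned as the function it denotes.
op :: Parser (Int -> Int -> Int)
op = ((+) <$ symbol '+') <|> ((-) <$ symbol '-')

-- Parse one p, then repeatedly parse (operator, p) pairs, combining
-- results left-associatively. No rule ever calls itself without first
-- consuming input, so the left recursion is gone.
chainl1 :: Parser a -> Parser (a -> a -> a) -> Parser a
chainl1 p o = p >>= rest
  where
    rest acc = step acc <|> pure acc
    step acc = do
      f <- o
      x <- p
      rest (f acc x)

-- e ::= e op atom, expressed without left recursion.
expr :: Parser Int
expr = chainl1 atom op

-- Succeed only if the whole input is consumed.
parseExpr :: String -> Maybe Int
parseExpr s = case runParser expr s of
  Just (n, "") -> Just n
  _            -> Nothing

main :: IO ()
main = do
  print (parseExpr "10 - 3 - 2")  -- Just 5, i.e. (10 - 3) - 2
  print (parseExpr "1 + 2 - 4")   -- Just (-1)
```

Note that the accumulator is threaded through rest, so every operator application costs an extra layer of monadic bind. That is one way to see the overhead the text refers to, compared with a table-driven parser that handles the left-recursive rule directly.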

The other major family of parsers, LR, is not plagued with the same concerns over left recursion. On the other hand, LR parsers are considerably more complicated to implement, relying on generated parse tables (and, in the generalized GLR case, on a rather sophisticated method known as Tomita's algorithm) to do the heavy lifting. The tooling around the construction of the production rules in a form that can be handled by the algorithm is often a DSL that generates the code for the parser. While the tooling is fairly robust, there is a level of indirection between us and the code that can often be a bit brittle to extend with custom logic.

The most common form of this toolchain is the Lex/Yacc lexer and parser generator, which compile into efficient C parsers for LR grammars. Haskell's Happy and Alex are roughly the Haskell equivalent of these tools.

Toolchain<br />

Our parser logic will be spread across two different modules.<br />

• Lexer.x<br />

• Parser.y<br />

The code in each of these modules is a hybrid of the specific Alex/Happy grammar syntax and arbitrary Haskell logic that is spliced in. Code delineated by braces ({}) is regular Haskell, while code outside is parser/lexer logic.
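
As a rough illustration of that split, a minimal Lexer.x might look like the sketch below. The Token type, its constructors and the two rules are hypothetical and chosen only to show where the braces fall. It assumes Alex's "basic" wrapper, under which each action is a function from the matched String to a token.

```
{
-- Inside braces: ordinary Haskell, copied into the generated module.
module Lexer where
}

%wrapper "basic"

$digit = 0-9

tokens :-
  -- Outside braces: Alex rules. A rule ending in ; discards its match.
  $white+    ;
  -- The action in braces is again plain Haskell.
  $digit+    { \s -> TokenNum (read s) }
  \+         { \_ -> TokenAdd }

{
-- Trailing Haskell block: the token type the actions above construct.
data Token
  = TokenNum Int
  | TokenAdd
  deriving (Eq, Show)
}
```

Running alex over this file generates a Lexer.hs module. With the "basic" wrapper the generated code exposes alexScanTokens :: String -> [Token]. Parser.y follows the same pattern, with Haskell header and trailer blocks in braces and a Haskell semantic action in braces after each production.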
