
Parsing Expression Grammar as a Primitive Recursive-Descent Parser with Backtracking
An article by Roman Redziejowski
Presented by: Jørgen Ulrik B. Krag


PEG sequence
● Syntax
    ● A B C D
● Semantics
    ● Apply each rule in order, consuming input for each
    ● Return success if all rules succeed
    ● Else reset input position and return failure
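The sequence semantics above can be sketched in Python. This is a minimal illustration, not the article's implementation: it assumes a parsing procedure is a function taking the text and a position and returning the new position on success or None on failure. The names `Parser`, `lit` and `seq` are made up for this sketch.

```python
from typing import Callable, Optional

# Assumed convention: a parser returns the new position, or None on failure.
Parser = Callable[[str, int], Optional[int]]

def lit(s: str) -> Parser:
    """Match the literal string s at the current position."""
    def parse(text: str, pos: int) -> Optional[int]:
        return pos + len(s) if text.startswith(s, pos) else None
    return parse

def seq(*rules: Parser) -> Parser:
    """A B C D: apply each rule in order, consuming input for each."""
    def parse(text: str, pos: int) -> Optional[int]:
        cur = pos
        for rule in rules:
            nxt = rule(text, cur)
            if nxt is None:
                # The caller still holds the original pos, so the input
                # position is effectively reset on failure.
                return None
            cur = nxt
        return cur
    return parse

abcd = seq(lit("A"), lit("B"), lit("C"), lit("D"))
ok = abcd("ABCD", 0)    # 4: all four rules succeeded
bad = abcd("ABCX", 0)   # None: the fourth rule failed
```

Passing the position by value is one way to get the "reset input position" behavior for free; a parser with a mutable cursor would have to save and restore it explicitly.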


PEG &Expr
● Syntax
    ● &Expr
● Semantics
    ● Match against Expr, but do not consume any input
    ● If match succeeds return success, otherwise return failure


PEG !Expr
● Syntax
    ● !Expr
● Semantics
    ● Match against Expr, but do not consume any input
    ● If match succeeds return failure, otherwise return success
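Both predicates (&Expr and !Expr) can be sketched as wrappers that run the inner rule and then discard whatever it consumed. As an assumption for illustration, a parser here is a function from text and position to a new position, or None on failure; `and_pred` and `not_pred` are hypothetical names.

```python
from typing import Callable, Optional

Parser = Callable[[str, int], Optional[int]]

def lit(s: str) -> Parser:
    """Match the literal string s at the current position."""
    def parse(text: str, pos: int) -> Optional[int]:
        return pos + len(s) if text.startswith(s, pos) else None
    return parse

def and_pred(rule: Parser) -> Parser:
    """&Expr: succeed if rule matches, but never consume input."""
    def parse(text: str, pos: int) -> Optional[int]:
        return pos if rule(text, pos) is not None else None
    return parse

def not_pred(rule: Parser) -> Parser:
    """!Expr: succeed if rule fails, and never consume input."""
    def parse(text: str, pos: int) -> Optional[int]:
        return pos if rule(text, pos) is None else None
    return parse

followed_by_ab = and_pred(lit("ab"))
not_followed_by_ab = not_pred(lit("ab"))
```

On success both wrappers return the position they were given, which is what "do not consume any input" means under this convention.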


PEG repetition
● Syntax
    ● Expr+, Expr*
● Semantics
    ● Consume input as long as Expr matches
    ● + returns failure if fewer than one match was made
    ● * always returns success
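A possible sketch of the two repetition operators, under the assumption that a parser maps text and position to a new position or None. The names `star` and `plus` are illustrative. The guard against a rule that matches without consuming input is an addition of this sketch, not something stated on the slide.

```python
from typing import Callable, Optional

Parser = Callable[[str, int], Optional[int]]

def lit(s: str) -> Parser:
    """Match the literal string s at the current position."""
    def parse(text: str, pos: int) -> Optional[int]:
        return pos + len(s) if text.startswith(s, pos) else None
    return parse

def star(rule: Parser) -> Parser:
    """Expr*: consume input as long as rule matches; always succeeds."""
    def parse(text: str, pos: int) -> Optional[int]:
        cur = pos
        while True:
            nxt = rule(text, cur)
            # Stop on failure, and also when the rule matched the empty
            # string (otherwise the loop would never terminate).
            if nxt is None or nxt == cur:
                return cur
            cur = nxt
    return parse

def plus(rule: Parser) -> Parser:
    """Expr+: like Expr*, but fails if there was no match at all."""
    many = star(rule)
    def parse(text: str, pos: int) -> Optional[int]:
        first = rule(text, pos)
        return None if first is None else many(text, first)
    return parse

a_star = star(lit("a"))
a_plus = plus(lit("a"))
```

Both operators are greedy and never give input back, which is why a rule such as `'a'* 'a'` can never succeed in a PEG.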


PEG zero-or-one
● Syntax
    ● Expr?
● Semantics
    ● If Expr matches, consume input and return success
    ● Else return success without consuming input
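The optional operator is small enough to sketch in a few lines. As before, the parser convention (text and position in, new position or None out) and the name `opt` are assumptions made for illustration.

```python
from typing import Callable, Optional

Parser = Callable[[str, int], Optional[int]]

def lit(s: str) -> Parser:
    """Match the literal string s at the current position."""
    def parse(text: str, pos: int) -> Optional[int]:
        return pos + len(s) if text.startswith(s, pos) else None
    return parse

def opt(rule: Parser) -> Parser:
    """Expr?: consume input if rule matches; succeed either way."""
    def parse(text: str, pos: int) -> Optional[int]:
        nxt = rule(text, pos)
        return pos if nxt is None else nxt
    return parse

optional_minus = opt(lit("-"))
```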


PEG literal matching
● Syntax
    ● [s], [c1-c2], 'literal string', .
● Semantics
    ● [abcd], match a, b, c or d and consume or return failure
    ● [0-3], match 0, 1, 2, 3 and consume or return failure
    ● 'literal string', match the string and consume or return failure
    ● ., match any single character or return failure (at the end of input)
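The four terminal forms above might be written as follows. This is a sketch under the assumption that a parser maps text and position to a new position or None; the function names are invented here.

```python
from typing import Callable, Optional

Parser = Callable[[str, int], Optional[int]]

def char_set(chars: str) -> Parser:
    """[abcd]: match any one of the listed characters."""
    def parse(text: str, pos: int) -> Optional[int]:
        return pos + 1 if pos < len(text) and text[pos] in chars else None
    return parse

def char_range(lo: str, hi: str) -> Parser:
    """[c1-c2]: match one character between lo and hi, inclusive."""
    def parse(text: str, pos: int) -> Optional[int]:
        return pos + 1 if pos < len(text) and lo <= text[pos] <= hi else None
    return parse

def literal(s: str) -> Parser:
    """'literal string': match the whole string."""
    def parse(text: str, pos: int) -> Optional[int]:
        return pos + len(s) if text.startswith(s, pos) else None
    return parse

def any_char(text: str, pos: int) -> Optional[int]:
    """.: match any single character; fails only at the end of input."""
    return pos + 1 if pos < len(text) else None

abcd = char_set("abcd")
zero_to_three = char_range("0", "3")
```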


PEG example
● Value = [0-9]+ / '(' Expr ')'
● Product = Value ( ( '*' / '/' ) Value )*
● Sum = Product ( ( '+' / '-' ) Product )*
● Expr = Sum
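To make the "primitive recursive-descent parser" reading concrete, the example grammar can be transcribed by hand into one procedure per rule. The sketch below is only a recognizer (it builds no tree and computes no value), and the convention of returning the new position or None is an assumption of this example. `sum_` has a trailing underscore only to avoid shadowing Python's built-in `sum`.

```python
from typing import Optional

def value(text: str, pos: int) -> Optional[int]:
    # Value = [0-9]+ / '(' Expr ')'
    end = pos
    while end < len(text) and "0" <= text[end] <= "9":
        end += 1
    if end > pos:
        return end                      # first alternative matched
    if text.startswith("(", pos):       # second alternative, from pos again
        inner = expr(text, pos + 1)
        if inner is not None and text.startswith(")", inner):
            return inner + 1
    return None

def product(text: str, pos: int) -> Optional[int]:
    # Product = Value ( ( '*' / '/' ) Value )*
    cur = value(text, pos)
    if cur is None:
        return None
    while cur < len(text) and text[cur] in "*/":
        nxt = value(text, cur + 1)
        if nxt is None:
            break                       # backtrack: operator is not consumed
        cur = nxt
    return cur

def sum_(text: str, pos: int) -> Optional[int]:
    # Sum = Product ( ( '+' / '-' ) Product )*
    cur = product(text, pos)
    if cur is None:
        return None
    while cur < len(text) and text[cur] in "+-":
        nxt = product(text, cur + 1)
        if nxt is None:
            break
        cur = nxt
    return cur

def expr(text: str, pos: int) -> Optional[int]:
    # Expr = Sum
    return sum_(text, pos)

def matches(text: str) -> bool:
    """True if Expr consumes the entire input."""
    return expr(text, 0) == len(text)
```

Note that `expr("1+", 0)` succeeds with position 1: the repetition gives up on the dangling `+` and leaves it unconsumed, so a caller that needs a full match has to check for the end of input, as `matches` does.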


PEG pitfalls
● No left recursion
● Hidden prefix capture
    ● ( '+' / '++' ) [a-z] does not match “++n”
● Spacing
    ● No lexer to remove ignored input
    ● Spacing rule must be applied in the grammar wherever there can be whitespace
    ● Easy way to do it: Spacing before the first rule and spacing after every “token”
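The hidden prefix capture pitfall can be reproduced with a few hand-written combinators. The sketch assumes a parser maps text and position to a new position or None; `choice`, `seq`, `lit` and `lower` are illustrative names. The reordered rule at the end is one common workaround, not something the slide prescribes.

```python
from typing import Callable, Optional

Parser = Callable[[str, int], Optional[int]]

def lit(s: str) -> Parser:
    def parse(text: str, pos: int) -> Optional[int]:
        return pos + len(s) if text.startswith(s, pos) else None
    return parse

def lower(text: str, pos: int) -> Optional[int]:
    """[a-z]"""
    return pos + 1 if pos < len(text) and "a" <= text[pos] <= "z" else None

def choice(*rules: Parser) -> Parser:
    """A / B: ordered choice; the first alternative that succeeds wins."""
    def parse(text: str, pos: int) -> Optional[int]:
        for rule in rules:
            nxt = rule(text, pos)
            if nxt is not None:
                return nxt   # later alternatives are never reconsidered
        return None
    return parse

def seq(*rules: Parser) -> Parser:
    def parse(text: str, pos: int) -> Optional[int]:
        cur: Optional[int] = pos
        for rule in rules:
            cur = rule(text, cur)
            if cur is None:
                return None
        return cur
    return parse

# ( '+' / '++' ) [a-z]: '+' always wins, so '++' is unreachable.
shadowed = seq(choice(lit("+"), lit("++")), lower)
# ( '++' / '+' ) [a-z]: longer alternative first.
reordered = seq(choice(lit("++"), lit("+")), lower)
```

On the input `++n`, `shadowed` commits to the single `+`, then fails on the second `+` where it expects a letter, and the sequence never goes back to try `'++'`.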


PEG example with spacing
● Value = [0-9]+ S / '(' S Expr ')' S
● Product = Value ( ( '*' S / '/' S ) Value )*
● Sum = Product ( ( '+' S / '-' S ) Product )*
● Expr = S Sum
● S = [ \n\t]*
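The "spacing after every token" convention can be captured in a small helper. To keep the sketch short it uses a reduced grammar in which Value is only a number (no parentheses or Product level), which is a simplification made for this example. `spacing`, `token`, `number`, `sum_` and `start` are hypothetical names.

```python
from typing import Callable, Optional

Parser = Callable[[str, int], Optional[int]]

def spacing(text: str, pos: int) -> int:
    # S = [ \n\t]*   (always succeeds)
    while pos < len(text) and text[pos] in " \n\t":
        pos += 1
    return pos

def token(s: str) -> Parser:
    """'s' S: match a literal, then skip trailing spacing."""
    def parse(text: str, pos: int) -> Optional[int]:
        if text.startswith(s, pos):
            return spacing(text, pos + len(s))
        return None
    return parse

def number(text: str, pos: int) -> Optional[int]:
    # [0-9]+ S
    end = pos
    while end < len(text) and "0" <= text[end] <= "9":
        end += 1
    return spacing(text, end) if end > pos else None

plus_tok, minus_tok = token("+"), token("-")

def sum_(text: str, pos: int) -> Optional[int]:
    # Sum = Number ( ( '+' S / '-' S ) Number )*   (reduced grammar)
    cur = number(text, pos)
    if cur is None:
        return None
    while True:
        op = plus_tok(text, cur)
        if op is None:
            op = minus_tok(text, cur)
        nxt = number(text, op) if op is not None else None
        if nxt is None:
            return cur
        cur = nxt

def start(text: str, pos: int = 0) -> Optional[int]:
    # Start = S Sum: spacing before the first rule
    return sum_(text, spacing(text, pos))
```

Because every token eats the whitespace after it, each rule can assume it starts on a non-space character; only the very first rule needs the explicit leading `S`.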


Testing PEG/Packrat parsing
● Is packrat parsing necessary?
    ● Most programming languages are mainly LL(1)
    ● Backtracking time is exponential in the length of a statement, times the number of statements
● Experiment
    ● Write Java 1.5 PEG parser
    ● Apply to 10522 source files
    ● Experiment with saving the last result of each procedure


Test results
● Uses about 20 calls per byte of input, regardless of input size
● 16.1% of calls were repeated calls: calls to the same rule at the same input position
● Saving and reusing the last result of each procedure reduces repeated calls to 3.3%
● Storing the last two results yields 1.1% repeated calls
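The idea of remembering only the last one or two results of each procedure, rather than full packrat memoization, could look roughly like the decorator below. It is not the code used in the article's experiment. The names `remember` and `build`, the toy rule `Start = Digits 'x' / Digits 'y'`, and the choice to key the cache by position only (which assumes one parse per set of procedures) are all assumptions of this sketch.

```python
from collections import OrderedDict
from typing import Callable, Optional

Parser = Callable[[str, int], Optional[int]]

def remember(last_n: int):
    """Wrap a parsing procedure so it reuses its last_n most recent results."""
    def decorate(proc: Parser):
        cache: "OrderedDict[int, Optional[int]]" = OrderedDict()  # pos -> result
        seen = set()          # positions this procedure has already executed at
        def wrapper(text: str, pos: int) -> Optional[int]:
            if pos in cache:
                return cache[pos]           # reuse a saved result
            wrapper.executions += 1
            if pos in seen:
                wrapper.repeated += 1       # same rule, same position, again
            seen.add(pos)
            result = proc(text, pos)
            if last_n > 0:
                cache[pos] = result
                while len(cache) > last_n:
                    cache.popitem(last=False)   # drop the oldest entry
            return result
        wrapper.executions = 0
        wrapper.repeated = 0
        return wrapper
    return decorate

def build(last_n: int):
    """Build fresh procedures for: Start = Digits 'x' / Digits 'y'."""
    @remember(last_n)
    def digits(text: str, pos: int) -> Optional[int]:
        end = pos
        while end < len(text) and "0" <= text[end] <= "9":
            end += 1
        return end if end > pos else None

    def start(text: str, pos: int) -> Optional[int]:
        for suffix in ("x", "y"):
            end = digits(text, pos)     # both alternatives begin with Digits
            if end is not None and text.startswith(suffix, end):
                return end + 1
        return None
    return start, digits
```

With `build(0)` and the input `123y`, `digits` executes twice at position 0 (one repeated call). With `build(1)` the second alternative is served from the saved result, so there is a single execution and no repeated call.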


Optimizations
● Identifiers used the following rule:
    ● Identifier = !Keyword Letter LetterOrDigit*
    ● Had to test 53 keywords before checking for identifier
● Using a hash table instead gave 10.3%, 1.6% and 0.6% repeated calls when remembering the results of the last 0, 1 and 2 calls respectively
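One way to read the hash table optimization: instead of trying every keyword as a `!Keyword` alternative, scan `Letter LetterOrDigit*` once and then look the scanned word up in a set. The sketch below assumes that reading. The keyword set is a small illustrative subset rather than the 53 Java keywords, and treating `_` as a letter is likewise an assumption of this example.

```python
from typing import Optional

# Illustrative subset only; the slide mentions 53 keywords.
KEYWORDS = frozenset({"class", "if", "else", "while", "return"})

def is_letter(ch: str) -> bool:
    return ch.isalpha() or ch == "_"

def identifier(text: str, pos: int) -> Optional[int]:
    """Identifier = !Keyword Letter LetterOrDigit*, with a set lookup."""
    if pos >= len(text) or not is_letter(text[pos]):
        return None
    end = pos + 1
    while end < len(text) and (is_letter(text[end]) or text[end].isdigit()):
        end += 1
    # One hash lookup replaces an ordered choice over all keywords.
    if text[pos:end] in KEYWORDS:
        return None
    return end
```

Because the whole word is scanned before the lookup, an identifier that merely starts with a keyword, such as `whileLoop`, is still accepted.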


My opinion
● Good overview and introduction to PEG
● Identifies some problems with using PEGs for specifying languages
● Gives some idea of the effectiveness of a PEG parser for Java
● Does not go into details about some of the true benefits of PEGs
● Asks a lot of questions in the conclusion


Other aspects of PEGs
● Unambiguous
● Two PEGs can be combined to form a new PEG
● Many packrat parsing tools allow you to specify semantics along with the syntax
    ● Makes it possible to make language extensions like in Fortress
● Error recovery can be very hard


PEG implementations
● Libraries and parser generators for a lot of languages: C, C#, Java, Python, Ruby, JavaScript, Lisp...
● Rats!
    ● The parser generator used in Fortress
● Perl 6: native PEG functionality as an extension to RegExps


Features of other PEG implementations
● Warning about hidden prefix capture
● Optimized literal choice matching
● Full packrat behavior


Examples
