6.6.3.2 Efficient prediction

As discussed in Section 6.4.9, the worst-case run-time on fully parameterized CNF grammars is dominated by the completion step. However, this is not necessarily true of sparse grammars. Our experiments showed that the computation is dominated by the generation of Earley states during the prediction steps. It is therefore worthwhile to minimize the total number of predicted states generated by the parser.

Since predicted states only affect the derivation if they lead to subsequent scanning, we can use the next input symbol to constrain the relevant predictions. To this end, we compute the extended left-corner relation $R_{LT}$, indicating which terminals can appear as left corners of which nonterminals. $R_{LT}$ is a Boolean matrix with rows indexed by nonterminals and columns indexed by terminals. It can be computed as the product
$$R_{LT} = R_L P_{LT},$$
where $P_{LT}$ has a non-zero entry at $(X, a)$ iff there is a production for nonterminal $X$ that starts with terminal $a$, and $R_L$ is the old left-corner relation.

During the prediction step we can ignore incoming states whose RHS nonterminal following the dot cannot have the current input as a left corner, and then eliminate from the remaining predictions all those whose LHS cannot produce the current input as a left corner. These filtering steps are very fast, as they involve only table lookups.
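To make the two filtering steps concrete, here is a minimal sketch in Python. The representation is an assumption for illustration, not the dissertation's implementation (which used Common Lisp/CLOS sparse matrices): grammars are dicts mapping nonterminals to lists of right-hand sides, $R_L$ is a dict mapping each nonterminal to the set of nonterminals reachable from it as a left corner (including itself), Earley states are (lhs, rhs, dot) triples, and the function names are hypothetical.

```python
def extended_left_corner(grammar, nonterminals, R_L):
    """Compute the extended left-corner relation R_LT = R_L P_LT.

    Returns a dict mapping each nonterminal X to the set of terminals
    that can appear as a left corner of X.
    """
    # P_LT has a non-zero entry at (Y, a) iff some production for Y
    # starts with terminal a.
    P_LT = {}
    for lhs, rhss in grammar.items():
        for rhs in rhss:
            if rhs and rhs[0] not in nonterminals:
                P_LT.setdefault(lhs, set()).add(rhs[0])
    # The Boolean matrix product R_L P_LT, realized as a union of
    # P_LT rows over the nonterminals reachable from X.
    return {X: set().union(*(P_LT.get(Y, set()) for Y in R_L[X]))
            for X in R_L}

def predict_filtered(states, grammar, R_L, R_LT, next_input):
    """Prediction step with bottom-up filtering.

    states: Earley states (lhs, rhs, dot) whose dot precedes a
    nonterminal.  Both filters are single table (set) lookups.
    """
    predicted = set()
    for _lhs, rhs, dot in states:
        Z = rhs[dot]  # nonterminal following the dot
        # Filter 1: ignore states whose nonterminal after the dot
        # cannot have the next input symbol as a left corner.
        if next_input not in R_LT.get(Z, ()):
            continue
        for Y in R_L[Z]:
            # Filter 2: keep only predictions whose LHS can itself
            # produce the next input as a left corner.
            if next_input not in R_LT.get(Y, ()):
                continue
            for rhs_Y in grammar.get(Y, ()):
                predicted.add((Y, tuple(rhs_Y), 0))
    return predicted
```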
On a test corpus this technique cut the number of generated predictions to almost 1/4 and sped up parsing by a factor of 3.3. The corpus consisted of 1,143 sentences with an average length of 4.65 words. The top-down prediction alone generated 991,781 states and parsed at a rate of 590 milliseconds per sentence. With bottom-up filtered prediction only 262,287 states were generated, resulting in 180 milliseconds per sentence.

A trivial optimization often found in Earley parsers is to precompute the entire first prediction step, since it does not depend on the input and may eliminate a substantial portion of the total predictions per sentence.[18] We found that with bottom-up filtering this technique lost its edge: scanning the precomputed predicted states turned out to be slower than computing the zeroth state set filtered by the first input.
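Under the same illustrative representation as above, the precomputation amounts to the following sketch; the start symbol 'S' and the dummy start state are assumptions.

```python
def initial_predictions(grammar, R_L, start):
    """Predictions for state set 0.  They depend only on the grammar
    and the start symbol, never on the input, so they can be computed
    once per grammar and copied into each new chart."""
    return {(Y, tuple(rhs), 0)
            for Y in R_L[start]
            for rhs in grammar.get(Y, ())}

# Precomputed once per grammar:
#   state_set_0 = initial_predictions(grammar, R_L, 'S')
# With bottom-up filtering, however, it proved faster to rebuild the
# zeroth state set per sentence, filtered by the first input symbol:
#   predict_filtered([("S'", ('S',), 0)], grammar, R_L, R_LT, words[0])
```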
6.7 Discussion

6.7.1 Relation to finite-state models

Throughout the exposition of the Earley algorithm and its probabilistic extension we have been alluding, in concepts and terminology, to the algorithms used with probabilistic finite-state models, in particular Hidden Markov Models (Rabiner & Juang 1986). Many concepts carry over, if suitably generalized, most notably that of forward probabilities. Prefix probabilities can be computed from forward probabilities by the Earley parser just as in HMMs, because Earley states summarize past history in much the same way as the states in a finite-state model. There are important differences, however. The number of states in an HMM

Footnotes:
… Common Lisp/CLOS implementation of generic sparse matrices that was not particularly optimized for this task.
[18] The first prediction step accounted for roughly 30% of all predictions on our test corpus.