
The dissertation of Andreas Stolcke is approved: University of ...



CHAPTER 6. EFFICIENT PARSING WITH STOCHASTIC CONTEXT-FREE GRAMMARS

$$ i:\ {}_k X \to \lambda . Y \mu \quad [\alpha, \gamma] \;\Longrightarrow\; i:\ {}_i Y \to . \nu \quad [\alpha', \gamma'] $$

$$ \alpha' \mathrel{+}= \alpha \, P(Y \to \nu), \qquad \gamma' = P(Y \to \nu) $$

Rationale. $\alpha'$ is the sum of all path probabilities leading up to ${}_k X \to \lambda . Y \mu$, times the probability of choosing production $Y \to \nu$. The value $\gamma'$ is just a special case of the definition.

Scanning (probabilistic)

$$ i:\ {}_k X \to \lambda . a \mu \quad [\alpha, \gamma] \;\Longrightarrow\; i+1:\ {}_k X \to \lambda a . \mu \quad [\alpha', \gamma'] $$

for all states with terminal $a$ matching input at position $i$. Then

$$ \alpha' = \alpha, \qquad \gamma' = \gamma. $$

Rationale. Scanning does not involve any new choices since the terminal was already selected as part of the production during prediction. [10]

Completion (probabilistic)

$$ \left. \begin{array}{ll} i:\ {}_j Y \to \nu . & [\alpha'', \gamma''] \\ j:\ {}_k X \to \lambda . Y \mu & [\alpha, \gamma] \end{array} \right\} \;\Longrightarrow\; i:\ {}_k X \to \lambda Y . \mu \quad [\alpha', \gamma'] $$

Then

$$ \alpha' \mathrel{+}= \alpha \, \gamma'' \tag{6.1} $$

$$ \gamma' \mathrel{+}= \gamma \, \gamma'' \tag{6.2} $$

Note that $\alpha''$ is not used.

Rationale. To update the old forward/inner probabilities $\alpha$ and $\gamma$ to $\alpha'$ and $\gamma'$, respectively, the probabilities of all paths expanding $Y \to \nu$ have to be factored in. These are exactly the paths summarized by the inner probability $\gamma''$.

6.4.5 Coping with recursion

The standard Earley algorithm, together with the probability computations described in the previous section, would be sufficient if it weren't for the problem of recursion in the prediction and completion steps.

The non-probabilistic Earley algorithm can stop recursing as soon as all predictions/completions yield states already contained in the current state set. For the computation of probabilities, however, this would mean truncating the probabilities resulting from the repeated summing of contributions.

[10] In different parsing scenarios the scanning step may well modify probabilities. For example, if the input symbols themselves have attached likelihoods, these can be integrated by multiplying them onto $\alpha$ and $\gamma$ when a symbol is scanned. That way it is possible to perform efficient Earley parsing with integrated joint probability computation directly on weighted lattices of input symbols.
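The scanning and completion updates, and the truncation problem raised in Section 6.4.5, can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the names `Probs`, `scan`, `complete`, and `predicted_alphas` are hypothetical, and the left-recursive grammar S -> S a [q] | a [1 - q] is an assumed example chosen only to make the repeated summing visible.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Probs:
    """Forward (alpha) and inner (gamma) probability of one Earley state."""
    alpha: float = 0.0
    gamma: float = 0.0


def scan(prev: Probs) -> Probs:
    # Scanning makes no new choice, so both values are copied unchanged.
    return Probs(prev.alpha, prev.gamma)


def complete(target: Probs, waiting: Probs, completed: Probs) -> None:
    # Equations (6.1) and (6.2): only the inner probability of the
    # completed state is used; its forward probability is ignored.
    target.alpha += waiting.alpha * completed.gamma
    target.gamma += waiting.gamma * completed.gamma


def predicted_alphas(q: float, passes: int) -> Tuple[float, float]:
    """Forward mass of S -> .S a and S -> .a after `passes` prediction rounds.

    Assumed grammar (hypothetical): S -> S a [q] | a [1 - q], with the
    initial state predicting S with forward probability 1.
    """
    p = 1.0 - q
    alpha_rec, alpha_term = 0.0, 0.0
    incoming = 1.0  # forward mass of the state that predicts in this round
    for _ in range(passes):
        alpha_rec += incoming * q   # contribution to S -> .S a
        alpha_term += incoming * p  # contribution to S -> .a
        incoming *= q               # S -> .S a predicts S again
    return alpha_rec, alpha_term


if __name__ == "__main__":
    print(predicted_alphas(0.5, 1))   # stopping after one round
    print(predicted_alphas(0.5, 50))  # many rounds approach the full sums
```

Under these assumptions, stopping as soon as the predicted states already exist corresponds to a single round and leaves the forward probabilities at q and 1 - q, whereas the full geometric sums are q / (1 - q) and 1.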
