The dissertation of Andreas Stolcke is approved: University of ...

CHAPTER 6. EFFICIENT PARSING WITH STOCHASTIC CONTEXT-FREE GRAMMARS

6.6 Implementation Issues

This section briefly discusses some of the experience gained from implementing the probabilistic Earley parser. Implementation is mainly straightforward, and many of the standard techniques for context-free grammars can be used (Graham et al. 1980). However, some aspects are unique due to the addition of probabilities.

6.6.1 Prediction

Due to the collapsing of transitive predictions, this step can be implemented in a very efficient and straightforward manner. As explained in Section 6.4.5, one has to perform a single pass over the current state set, identifying all nonterminals $Z$ occurring to the right of dots, and add states corresponding to all productions $Y \to \nu$ that are reachable through the left-corner relation $Z \Rightarrow_L Y$. As indicated in equation (6.3), contributions to the forward probabilities of new states have to be summed when several paths lead to the same state. However, the summation in equation (6.3) can be mostly eliminated if the $\alpha$ values for all old states with the same nonterminal $Z$ are summed first, and then multiplied by $R(Z \Rightarrow_L Y)$. These quantities are then summed over all nonterminals $Z$, and the result is multiplied once by the rule probability $P(Y \to \nu)$ to give the forward probability for the predicted state.

6.6.2 Completion

Unlike prediction, the completion step still involves iteration. Each complete state derived by completion can potentially feed other completions. An important detail here is to ensure that all contributions to a state's $\alpha$ and $\gamma$ are summed before that state is used as input to further completion steps.

One approach to this problem is to insert complete states into a prioritized queue. The queue orders states by their start indices, highest first. This is because states corresponding to later expansions always have to be completed before they can lead to the completion of earlier expansions. For each start index, the entries are managed as a first-in-first-out queue, ensuring that the directed dependency graph formed by the states is traversed in breadth-first order.

A completion pass can now be implemented as follows. Initially, all complete states from the previous scanning step are inserted in the queue. States are then removed from the front of the queue and used to complete other states. Among the new states thus produced, complete ones are again added to the queue. The process iterates until no more states remain in the queue. Because the computation of probabilities already includes chains of unit productions, states derived from such productions need not be queued, which also ensures that the iteration terminates.

A similar queuing scheme, with the start index order reversed, can be used for the reverse completion step needed in the computation of outer probabilities (Section 6.5.2).
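The queue discipline for the completion pass described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the dissertation's implementation: the names `CompletionQueue`, `completion_pass`, and `toy_complete` are hypothetical, states are assumed to be hashable values, and the caller-supplied `complete` callback stands in for the actual completion step (including the accumulation of probabilities and the exclusion of states derived from unit productions, both omitted here). The sketch shows only the ordering: highest start index first, first-in-first-out within a start index, and each state queued at most once.

```python
import heapq
from itertools import count


class CompletionQueue:
    """Complete states ordered by start index (highest first), FIFO within an index."""

    def __init__(self):
        self._heap = []
        self._seq = count()   # insertion counter: FIFO order among equal start indices
        self._seen = set()    # states ever queued (assumption: states are hashable)

    def push(self, start, state):
        """Queue a complete state once; return False if it was queued before."""
        if state in self._seen:
            return False
        self._seen.add(state)
        # heapq is a min-heap, so the start index is negated to pop the highest first.
        heapq.heappush(self._heap, (-start, next(self._seq), state))
        return True

    def pop(self):
        neg_start, _, state = heapq.heappop(self._heap)
        return -neg_start, state

    def __len__(self):
        return len(self._heap)


def completion_pass(scanned, complete):
    """Run one completion pass and return the order in which states were processed.

    scanned:  iterable of (start_index, state) pairs from the previous scanning step.
    complete: hypothetical callback; given (start_index, state) it returns the newly
              produced *complete* states as (start_index, state) pairs.
    """
    queue = CompletionQueue()
    for start, state in scanned:
        queue.push(start, state)
    order = []
    while queue:
        start, state = queue.pop()
        order.append((start, state))
        for new_start, new_state in complete(start, state):
            queue.push(new_start, new_state)
    return order


def toy_complete(start, state):
    # Toy dependency table (illustrative only): which complete states each state yields.
    table = {"PP[3]": [(2, "NP[2]")], "NP[2]": [(0, "S[0]")]}
    return table.get(state, [])


order = completion_pass([(2, "NP[2]"), (3, "PP[3]")], toy_complete)
```

In this toy run the state starting at index 3 is processed before the one starting at index 2, even though it was inserted later, so its contribution is available before the earlier expansion is itself used for completion. For the reverse completion step, the same sketch would apply with the sign of the start index left unnegated.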
