12.07.2015 Views

The dissertation of Andreas Stolcke is approved: University of ...


CHAPTER 6. EFFICIENT PARSING WITH STOCHASTIC CONTEXT-FREE GRAMMARS 164

exact SCFG prefix and next-word probabilities to a tightly-coupled speech decoder (Jurafsky et al. 1994b).

An essential idea in the probabilistic formulation of Earley’s algorithm is the collapsing of recursive predictions and unit completion chains, replacing them by lookups in precomputed matrices. This idea arises in our formulation out of the need to compute probability sums given as infinite series. Graham et al. (1980) use a non-probabilistic version of the same technique to create a highly optimized Earley-like parser for general CFGs that implements prediction and completion by operations on Boolean matrices.[25]

The matrix inversion method for dealing with left-recursive prediction is borrowed from the LRI algorithm of Jelinek & Lafferty (1991) for computing prefix probabilities for SCFGs in CNF.[26] We then use that idea a second time to deal with the similar recursion arising from unit productions in the completion step. We suspect, but have not proved, that the Earley computation of forward probabilities when applied to a CNF grammar performs a computation that is in some sense isomorphic to that of the LRI algorithm. In any case, we believe that the parser-oriented view afforded by the Earley framework makes for a more intuitive solution to the prefix probability problem, with the added advantage that it is not restricted to CNF grammars.

Kupiec (1992a) has proposed a version of the Inside/Outside algorithm that allows it to operate on non-CNF grammars.
Interestingly, Kupiec’s algorithm is also based on a generalization of finite-state models, namely, Recursive Transition Networks (RTNs). Probabilistic RTNs are essentially HMMs that allow nonterminals as output symbols. Also, the dotted productions appearing in Earley states are exactly equivalent to the states in an RTN derived from a CFG.

6.7.5 A simple typology of SCFG algorithms

The various known algorithms for probabilistic CFGs share many similarities, and vary along similar dimensions. One such dimension is whether the quantities entered into the parser chart are defined in a bottom-up (CYK) fashion, or whether left-to-right constraints are an inherent part of their definition.[27]

Another point of variation is the ‘sparseness’ trade-off. If we were given a set of nonterminals and wanted to list all possible CFG rules involving those nonterminals, the list would be infinite due to the arbitrary length of the right-hand sides of productions. This is a problem, for example, when training a CFG starting with complete ignorance about the structure of the rules.

A workaround is to restrict the rule format somehow, usually to CNF, and then list all possible productions.
Algorithms that assume CNF are usually formulated in terms of such a fully parameterized grammar where all triples $X, Y, Z$ form a possible rule $X \rightarrow Y Z$ with non-zero probability, although in many cases they may be specialized to handle sparse grammars efficiently.

At the other extreme we have algorithms which accept unrestricted CFG productions and are therefore meant for sparse grammars, where almost all (in the set theoretic sense) possible productions have probability

[25] This connection to the GHR algorithm was pointed out by Fernando Pereira. Exploration of this link then led to the extension of our algorithm to handle $\epsilon$-productions, as described in Section 6.4.7.

[26] Their method uses the transitive (but not reflexive) closure over the left-corner relation $P_L$, for which they chose the symbol $Q_L$. We chose the symbol $R_L$ in this chapter to point to this difference.

[27] Of course a CYK-style parser can operate left-to-right, right-to-left, or otherwise by reordering the computation of chart entries.
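
As a rough illustration of the matrix inversion idea discussed earlier in this section, the sketch below collapses all chains of left-recursive predictions into a single precomputed matrix. It assumes a matrix P whose entry (i, j) is the total probability of rules that rewrite nonterminal i with nonterminal j as leftmost symbol, and it evaluates the infinite series I + P + P^2 + ... in closed form as (I - P)^{-1}, i.e. the reflexive, transitive closure of the left-corner relation. The toy grammar, the function names, and the use of exact fractions are assumptions made for this example and do not come from the dissertation.

```python
from fractions import Fraction

# Hypothetical toy SCFG (illustration only): (lhs, rhs, probability).
NONTERMINALS = ["S", "A"]
RULES = [
    ("S", ("S", "a"), Fraction(1, 4)),  # direct left recursion
    ("S", ("A", "b"), Fraction(3, 4)),
    ("A", ("S", "c"), Fraction(1, 2)),  # indirect left recursion via A
    ("A", ("d",), Fraction(1, 2)),
]


def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def left_corner_matrix(nonterminals, rules):
    """P[i][j] = summed probability of rules X_i -> X_j ... (X_j is the left corner)."""
    index = {x: i for i, x in enumerate(nonterminals)}
    p = [[Fraction(0)] * len(nonterminals) for _ in nonterminals]
    for lhs, rhs, prob in rules:
        if rhs and rhs[0] in index:
            p[index[lhs]][index[rhs[0]]] += prob
    return p


def invert(m):
    """Gauss-Jordan inversion over exact fractions (fails if m is singular)."""
    n = len(m)
    a = [list(row) + unit for row, unit in zip(m, identity(n))]
    for col in range(n):
        pivot = next(r for r in range(col, n) if a[r][col] != 0)
        a[col], a[pivot] = a[pivot], a[col]
        scale = 1 / a[col][col]
        a[col] = [v * scale for v in a[col]]
        for r in range(n):
            if r != col and a[r][col] != 0:
                factor = a[r][col]
                a[r] = [v - factor * w for v, w in zip(a[r], a[col])]
    return [row[n:] for row in a]


def reflexive_transitive_closure(p):
    """Closed form of I + P + P^2 + ... = (I - P)^(-1)."""
    n = len(p)
    eye = identity(n)
    return invert([[eye[i][j] - p[i][j] for j in range(n)] for i in range(n)])


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def series_partial_sum(p, terms):
    """I + P + ... + P^terms, for comparison with the closed form."""
    n = len(p)
    total, power = identity(n), identity(n)
    for _ in range(terms):
        power = matmul(power, p)
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return total


P = left_corner_matrix(NONTERMINALS, RULES)
R = reflexive_transitive_closure(P)
for row in R:
    print([str(v) for v in row])
# prints ['8/3', '2'] and then ['4/3', '2']
```

For this toy grammar the series converges, so `series_partial_sum(P, 200)` agrees with `R` to within floating-point precision. A parser would compute the closed-form matrix once per grammar and then replace each chain of recursive predictions by a single lookup. The same construction could plausibly be reused for chains of unit productions in the completion step, with a unit-production matrix in place of P.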
