12.07.2015 Views

The dissertation of Andreas Stolcke is approved: University of ...



CHAPTER 6. EFFICIENT PARSING WITH STOCHASTIC CONTEXT-FREE GRAMMARS

set $i$. Having said that, we will refer to $\alpha$ simply as a probability, both for the sake of brevity, and to keep the analogy to the HMM terminology of which this is a generalization. [7] Note that for scanned states, $\alpha$ is always a probability, since by definition a scanned state can occur only once along a path.

The inner probabilities, on the other hand, represent the probability of generating a substring of the input from a given nonterminal, using a particular production. Inner probabilities are thus conditional on the presence of a given nonterminal $X$ with expansion starting at position $k$, unlike the forward probabilities, which include the generation history starting with the initial state. The inner probabilities as defined here correspond closely to the quantities of the same name in Baker (1979). The sum of $\gamma$ over all states with a given LHS $X$ is exactly Baker's inner probability for $X$.

The following lemma is essentially a restatement of Lemma 6.2 in terms of forward and inner probabilities. It shows how to obtain the sentence and string probabilities we are interested in, provided that forward and inner probabilities can be computed effectively.

Lemma 6.3 The following assumes an Earley chart constructed by the parser on an input string $x$ with $|x| = l$.

a) Provided that $S \Rightarrow^*_L x_{0 \ldots k-1} X \nu$ is a possible left-most derivation of the grammar (for some $\nu$), the probability that a nonterminal $X$ generates the substring $x_k \ldots x_{i-1}$ can be computed as the sum

$$P(X \Rightarrow^* x_{k \ldots i-1}) = \sum_{i:\, {}_kX \to \lambda.} \gamma(i:\, {}_kX \to \lambda.)$$

(sum of inner probabilities over all complete states with LHS $X$ and start index $k$).

b) In particular, the string probability $P(S \Rightarrow^* x)$ can be computed as [8]

$$P(S \Rightarrow^* x) = \gamma(l:\, {}_0 \to S.) = \alpha(l:\, {}_0 \to S.)$$

c) The prefix probability $P(S \Rightarrow^*_L x)$, with $|x| = l$, can be computed as

$$P(S \Rightarrow^*_L x) = \sum_{{}_kX \to \lambda x_{l-1}.\mu} \alpha(l:\, {}_kX \to \lambda x_{l-1}.\mu)$$

(sum of forward probabilities over all scanned states).

The restriction in (a) that $X$ be preceded by a possible prefix is necessary since the Earley parser at position $i$ will only pursue derivations that are consistent with the input up to position $i$. This constitutes the main distinguishing feature of Earley parsing compared to the strict bottom-up computation used in the standard inside probability computation (Baker 1979). There, inside probabilities for all positions and nonterminals are computed, regardless of possible prefixes.

[7] The same technical complication was noticed by Wright (1990) in the computation of probabilistic LR parser tables. The relation to LR parsing will be discussed in Section 6.7.3. Incidentally, a similar interpretation of forward 'probabilities' is required for HMMs with non-emitting states.

[8] The definitions of forward and inner probabilities coincide for the final state.
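As an illustration of how the three sums in Lemma 6.3 could be read off a finished chart, here is a minimal Python sketch. It is not the dissertation's implementation: the `State` record, its field names, and the three helper functions are hypothetical, and the chart is filled in by hand for an assumed toy grammar (S → a with probability 0.7, S → a b with probability 0.3) on the input "a" rather than produced by a parser. The dummy initial production is represented by an empty LHS string.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class State:
    """One Earley state i: _k X -> lambda . mu (hypothetical layout)."""
    end: int                 # i, the state set this state belongs to
    start: int               # k, where the expansion of X began
    lhs: str                 # X; "" stands for the dummy initial production
    rhs: Tuple[str, ...]     # right-hand side symbols
    dot: int                 # position of the dot within rhs
    forward: float           # alpha
    inner: float             # gamma

    def is_complete(self) -> bool:
        return self.dot == len(self.rhs)

    def is_scanned(self, terminals: Set[str]) -> bool:
        # A scanned state has its dot immediately after a terminal.
        return self.dot > 0 and self.rhs[self.dot - 1] in terminals


def substring_probability(chart: List[State], lhs: str, k: int, i: int) -> float:
    """Part (a): sum of inner probabilities over complete states
    with the given LHS and start index k in state set i."""
    return sum(s.inner for s in chart
               if s.end == i and s.start == k
               and s.lhs == lhs and s.is_complete())


def string_probability(chart: List[State], l: int, start_symbol: str = "S") -> float:
    """Part (b): inner probability of the final state l: _0 -> S."""
    return sum(s.inner for s in chart
               if s.end == l and s.start == 0 and s.lhs == ""
               and s.rhs == (start_symbol,) and s.is_complete())


def prefix_probability(chart: List[State], l: int, terminals: Set[str]) -> float:
    """Part (c): sum of forward probabilities over scanned states in set l."""
    return sum(s.forward for s in chart
               if s.end == l and s.is_scanned(terminals))


# Hand-built chart for the assumed grammar S -> a [0.7] | a b [0.3], input "a".
chart = [
    State(0, 0, "", ("S",), 0, 1.0, 1.0),        # initial state
    State(0, 0, "S", ("a",), 0, 0.7, 0.7),       # predicted
    State(0, 0, "S", ("a", "b"), 0, 0.3, 0.3),   # predicted
    State(1, 0, "S", ("a",), 1, 0.7, 0.7),       # scanned, complete
    State(1, 0, "S", ("a", "b"), 1, 0.3, 0.3),   # scanned, incomplete
    State(1, 0, "", ("S",), 1, 0.7, 0.7),        # final state
]

if __name__ == "__main__":
    print(substring_probability(chart, "S", 0, 1))
    print(string_probability(chart, 1))
    print(prefix_probability(chart, 1, {"a", "b"}))
```

In this toy chart the prefix sum includes the incomplete state for S → a b, which is why the prefix probability of "a" exceeds its string probability: every string of the assumed grammar begins with "a", but only one of the two productions yields exactly "a".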
