The dissertation of Andreas Stolcke is approved: University of ...
CHAPTER 6. EFFICIENT PARSING WITH STOCHASTIC CONTEXT-FREE GRAMMARS

remains fixed, whereas the number of possible Earley states grows linearly with the length of the input (due to the start index).

Incidentally, the HMM concept of backward probabilities has no useful analog in Earley parsing. It is tempting to define a backward probability for a state as the conditional probability that the generator produces the remaining string given that it is currently in that state. Alas, this would be an ill-defined quantity, since the generation of a suffix depends (via completion) on more than just the current state.

The solution found by Baker (1979), adopted here in modified form, is to use outer probabilities instead of 'backward' probabilities. Outer probabilities follow the hierarchical structure of a derivation, rather than the sequential structure imposed by left-to-right processing. Fortunately, outer probability computation is just as well supported by the Earley chart as forward and inner probabilities.[19]

6.7.2 Online pruning

In finite-state parsing (especially speech decoding) one often makes use of the forward probabilities for pruning partial parses before having seen the entire input. Pruning is formally straightforward in Earley parsers: in each state set, rank states according to their forward probabilities, then remove those states with small probabilities compared to the current best candidate, or simply those whose rank exceeds a given limit. Notice that this will not only omit certain parses, but will also underestimate the forward and inner probabilities of the derivations that remain.
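The pruning rule just described can be sketched as follows. This is a minimal illustration, not the dissertation's implementation: the state objects, the `alpha` attribute holding a precomputed forward probability, and the threshold values are all hypothetical.

```python
def prune_state_set(states, beam=1e-4, max_states=None):
    """Beam-prune one Earley state set by forward probability.

    Keeps states whose forward probability is within a factor `beam`
    of the best state's, optionally capped at `max_states` survivors.
    As noted in the text, pruning leaves the surviving forward and
    inner probabilities as underestimates, since pruned derivations
    no longer contribute to them.
    """
    if not states:
        return []
    # Rank states by forward probability, best first.
    ranked = sorted(states, key=lambda s: s.alpha, reverse=True)
    best = ranked[0].alpha
    # Relative-threshold pruning against the current best candidate.
    kept = [s for s in ranked if s.alpha >= beam * best]
    # Optional rank-based cutoff ("rank exceeds a given limit").
    if max_states is not None:
        kept = kept[:max_states]
    return kept
```

In a full parser this would be applied to each completed state set before predicting and scanning into the next one, so that low-probability partial parses never spawn successors.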
Pruning procedures have to be evaluated empirically, since they invariably sacrifice completeness and, in the case of the Viterbi algorithm, optimality of the result.

While Earley-based on-line pruning awaits further study, there is reason to believe that the Earley framework has inherent advantages over strategies based only on bottom-up information (including so-called 'over-the-top' parsers). Context-free forward probabilities include all probabilistic information (subject to the assumptions implicit in the SCFG formalism) available from an input prefix, whereas the usual inside probabilities do not take into account the nonterminal prior probabilities that result from the top-down relation to the start state. Using top-down constraints does not necessarily mean sacrificing robustness, as discussed in Section 6.5.4. On the contrary, by using Earley-style parsing with a set of carefully designed and estimated 'fault-tolerant' top-level productions, it should be possible to use probabilities to better advantage in robust parsing. This approach is a subject of ongoing work in the tight-coupling framework of the BeRP system (Jurafsky et al. 1994b; see below).

6.7.3 Relation to probabilistic LR parsing

One of the major alternative context-free parsing paradigms besides Earley's algorithm is LR parsing (Aho & Ullman 1972). A comparison of the two approaches, both in their probabilistic and non-probabilistic aspects, is interesting and provides useful insights. The following remarks assume familiarity with both approaches.
We sketch the fundamental relations, as well as the important trade-offs between the two approaches.

[19] The closest thing to an HMM backward probability is probably the suffix probability, i.e., the probability of generating the remaining portion of the input.
