
The dissertation of Andreas Stolcke is approved: University of ...



CHAPTER 6. EFFICIENT PARSING WITH STOCHASTIC CONTEXT-FREE GRAMMARS

6.4.5.1 Prediction loops

As an example, consider the following simple left-recursive SCFG.

$$S \to a \quad [p]$$
$$S \to S\,b \quad [q]$$

where $q = 1 - p$. Non-probabilistically, the prediction loop at position 0 would stop after producing the states

$${}_0\ \to .S, \qquad {}_0 S \to .a, \qquad {}_0 S \to .S\,b$$

This would leave the forward probabilities at

$$\alpha_0(S \to .a) = p, \qquad \alpha_0(S \to .S\,b) = q$$

corresponding to just two out of an infinity of possible paths. The correct forward probabilities are obtained as a sum of infinitely many terms, accounting for all possible paths of length 1.

$$\alpha_0(S \to .a) = p + qp + q^2 p + \cdots = p(1-q)^{-1} = 1$$
$$\alpha_0(S \to .S\,b) = q + q^2 + q^3 + \cdots = q(1-q)^{-1} = p^{-1} q$$

In these sums each $p$ corresponds to a choice of the first production, each $q$ to a choice of the second production. If we didn't care about finite computation the resulting geometric series could be computed by letting the prediction loop (and hence the summation) continue indefinitely.

Fortunately, all repeated prediction steps, including those due to left-recursion in the productions, can be collapsed into a single, modified prediction step, and the corresponding sums computed in closed form. For this purpose we need a probabilistic version of the well-known parsing concept of a left corner, which is also at the heart of the prefix probability algorithm of Jelinek & Lafferty (1991).

Definition 6.5 The following definitions are relative to a given SCFG $G$.

a) Two nonterminals $X$ and $Y$ are said to be in a left-corner relation $X \to_L Y$ iff there exists a production for $X$ that has a RHS starting with $Y$,

$$X \to Y\lambda.$$

b) The probabilistic left-corner relation¹¹ $P_L = P_L(G)$ is the matrix of probabilities $P(X \to_L Y)$, defined as the total probability of choosing a production for $X$ that has $Y$ as a left corner:

$$P(X \to_L Y) = \sum_{X \to Y\lambda \,\in\, G} P(X \to Y\lambda).$$

¹¹ If a probabilistic relation $R$ is replaced by its set-theoretic version $R'$, i.e., $(x, y) \in R'$ iff $R(x, y) \neq 0$, then the closure operations used here reduce to their traditional discrete counterparts; hence the choice of terminology.
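
As a numerical check on the prediction-loop example, the following Python sketch builds the left-corner matrix for the two-production grammar (S → a with probability p, S → S b with probability q = 1 − p) and compares truncated versions of the two infinite sums against their closed forms. It is an illustration only, not the dissertation's implementation: the value p = 0.25, the rule encoding as `(lhs, rhs, prob)` tuples, and the function names `left_corner_matrix` and `truncated_forward` are all assumptions made for this example.

```python
from collections import defaultdict

# Example grammar: S -> a [p], S -> S b [q], with q = 1 - p.
# p = 0.25 is an arbitrary illustrative value (assumption).
p = 0.25
q = 1.0 - p
NONTERMINALS = {"S"}
RULES = [("S", ("a",), p), ("S", ("S", "b"), q)]


def left_corner_matrix(rules, nonterminals):
    """Return P_L as nested dicts: P_L[X][Y] = sum of P(X -> Y lambda)."""
    P_L = {x: defaultdict(float) for x in nonterminals}
    for lhs, rhs, prob in rules:
        # Only productions whose RHS starts with a nonterminal contribute.
        if rhs and rhs[0] in nonterminals:
            P_L[lhs][rhs[0]] += prob
    return P_L


def truncated_forward(p, q, n_terms):
    """Sum the first n_terms of each prediction-loop series."""
    alpha_a = sum(q**k * p for k in range(n_terms))       # p + qp + q^2 p + ...
    alpha_Sb = sum(q**k for k in range(1, n_terms + 1))   # q + q^2 + q^3 + ...
    return alpha_a, alpha_Sb


P_L = left_corner_matrix(RULES, NONTERMINALS)
alpha_a, alpha_Sb = truncated_forward(p, q, 200)

# Closed forms of the geometric series.
closed_a = p / (1.0 - q)    # p (1 - q)^{-1}
closed_Sb = q / (1.0 - q)   # q (1 - q)^{-1}

print("P_L[S][S]      =", P_L["S"]["S"])
print("alpha(S -> .a)  truncated vs closed:", alpha_a, closed_a)
print("alpha(S -> .Sb) truncated vs closed:", alpha_Sb, closed_Sb)
```

For this grammar the only nonzero left-corner entry is the one relating S to itself, and it equals q, which is why both series are geometric in q.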
