12.07.2015 Views

The dissertation of Andreas Stolcke is approved: University of ...



CHAPTER 7. n-GRAMS FROM STOCHASTIC CONTEXT-FREE GRAMMARS

This leads to

$$c(a \mid S) = \frac{p}{1 - 2q} = \frac{p}{2p - 1} \qquad (7.10)$$

Now, for $p = 0.5$ this becomes infinity, and for probabilities $p < 0.5$, the solution is negative! This is a rather striking manifestation of the failure of this grammar, for $p \le 0.5$, to be consistent in the sense of Booth & Thompson (1973) (see Section 6.4.8). An inconsistent grammar is one in which the stochastic derivation process has non-zero probability of not terminating. The expected length of the generated strings should therefore be infinite in this case.

Booth and Thompson derive a criterion for checking the consistency of a SCFG: Find the first-moment matrix $E = (e_{XY})$, where $e_{XY}$ is the expected number of occurrences of nonterminal $Y$ in a one-step expansion of nonterminal $X$, and make sure its powers $E^k$ converge to 0 as $k \to \infty$. If so, the grammar is consistent, otherwise it is not.⁵

For the grammar in (7.9), $E$ is the $1 \times 1$ matrix $(2q)$. Thus we can confirm our earlier observation by noting that $(2q)^k$ converges to 0 iff $q < 0.5$, or $p > 0.5$.

Notice that $E$ is identical to the matrix $A$ that occurs in the linear equations (7.6) for the $n$-gram computation. The actual coefficient matrix is $I - A$, and its inverse, if it exists, can be written as the geometric sum

$$(I - A)^{-1} = I + A + A^2 + A^3 + \cdots$$

This series converges precisely if $A^k$ converges to 0.
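As a rough illustration of the geometric-sum argument, the following Python sketch builds the first-moment matrix for a toy grammar and approximates $(I - A)^{-1} b$ by a truncated series. It assumes that grammar (7.9), which is not reproduced in this excerpt, has the two rules S → a with probability p and S → S S with probability q = 1 − p, and that the right-hand side for the unigram a is b = (p). The function names `first_moment_matrix` and `neumann_solve` are hypothetical and are not part of the implementation described in the text.

```python
from collections import Counter


def first_moment_matrix(rules, nonterminals):
    """E[X][Y] = expected number of occurrences of nonterminal Y
    in a one-step expansion of nonterminal X."""
    index = {x: i for i, x in enumerate(nonterminals)}
    E = [[0.0] * len(nonterminals) for _ in nonterminals]
    for lhs, prob, rhs in rules:
        for symbol, count in Counter(rhs).items():
            if symbol in index:  # terminals do not contribute to E
                E[index[lhs]][index[symbol]] += prob * count
    return E


def neumann_solve(A, b, terms=2000):
    """Approximate (I - A)^{-1} b by the truncated geometric sum
    b + A b + A^2 b + ...; only meaningful if A^k converges to 0."""
    n = len(b)
    total = list(b)
    term = list(b)
    for _ in range(terms):
        term = [sum(A[i][j] * term[j] for j in range(n)) for i in range(n)]
        total = [t + x for t, x in zip(total, term)]
    return total


# Assumed form of grammar (7.9): S -> a [p], S -> S S [q = 1 - p].
p = 0.75
q = 1.0 - p
rules = [("S", p, ["a"]), ("S", q, ["S", "S"])]

E = first_moment_matrix(rules, ["S"])  # the 1 x 1 matrix (2q)
c = neumann_solve(E, [p])              # assumed right-hand side b = (p)
closed_form = p / (2 * p - 1)          # equation (7.10)

print("E =", E)
print("series estimate:", c[0], "closed form:", closed_form)
```

With p = 0.75 the truncated sum agrees with the closed form p/(2p − 1). For p ≤ 0.5 the partial sums grow without bound as more terms are added, so the series never approaches the infinite or negative closed-form value.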
We have thus shown that the existence of a solution for the $n$-gram problem is equivalent to the consistency of the grammar in question. Furthermore, the solution vector $c = (I - A)^{-1} b$ will always consist of non-negative numbers: it is the sum and product of the non-negative values given by equations (7.7) and (7.8).

The matrix $I - A$ and its inverse turn out to have a special role for SCFGs: it is, in a sense, a 'universal problem solver' for a whole series of global quantities associated with probabilistic grammars. A brief overview of these is given in the appendix to this chapter.

7.6 Experiments

The algorithm described here has been implemented, and is being used to generate bigrams for a speech recognizer that is part of the BeRP spoken-language system (Jurafsky et al. 1994a). The speech decoder and language model components of the BeRP system were used in an experiment to assess the benefit of using bigram probabilities obtained through SCFGs versus estimating them directly from the available training corpus. The system's domain is inquiries about restaurants in the city of Berkeley. Table 7.1 gives statistics for the training and test corpora used, as well as the language models involved in the experiment.

Our experiments made use of a context-free grammar hand-written for the BeRP domain. Computing the bigram probabilities from this SCFG of 133 nonterminals involves solving 657 linear systems for unigram

⁵ An alternative version of this criterion is to check the magnitude of the largest of $E$'s eigenvalues (its spectral radius). If that value is $> 1$, the grammar is inconsistent; if $< 1$, it is consistent.
