The dissertation of Andreas Stolcke is approved: University of ...



CHAPTER 7. n-GRAMS FROM STOCHASTIC CONTEXT-FREE GRAMMARS

The algorithm has been found practical in the context of a medium-scale speech understanding system, where it gave improved estimates for a bigram language model, based on a hand-written SCFG and very small amounts of available training data.

Deriving n-gram probabilities from more sophisticated language models appears to be a generally useful technique which can both improve upon direct estimation of n-grams and allow available higher-level linguistic knowledge to be effectively integrated into speech decoding or other tasks that place strong constraints on usable language models.

7.8 Appendix: Related Problems

We have seen how n-gram expectations for SCFGs can be obtained by solving linear systems based on the matrix I - A, where I is the identity matrix and A is the first-moment (expectation) matrix of nonterminal occurrences for a single nonterminal expansion. As it turns out, a number of apparently unrelated problems arising in connection with SCFGs and other probabilistic grammars have solutions based on this same matrix. These are briefly surveyed below, without detailed proofs.

7.8.1 Expected string length

To compute the expected number of terminals in a string, the I - A system is solved for the right-hand side vector containing the average number of terminals generated in a single production, for each nonterminal. For example, if [...] are all productions with LHS X, the entry indexed by X in the right-hand side vector would be [...].

The solution vector contains the expected lengths for the sublanguages generated by each of the nonterminals.
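The computation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy grammar, the names GRAMMAR, solve and expected_lengths, and the convention that row X of A holds the expected number of occurrences of each nonterminal in one expansion of X are all hypothetical choices made here, not taken from the dissertation. Exact fractions are used so the result can be checked by hand.

```python
from fractions import Fraction as F

# Hypothetical toy SCFG (not from the source): LHS -> [(probability, RHS symbols)].
# Symbols that appear as keys are nonterminals; everything else is a terminal.
GRAMMAR = {
    "S": [(F(3, 10), ["A", "S"]), (F(7, 10), ["A"])],
    "A": [(F(1, 2), ["a"]), (F(1, 2), ["a", "A"])],
}


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination (exact with Fractions)."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Pick any row with a nonzero pivot; raises StopIteration if singular.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def expected_lengths(grammar):
    """Expected number of terminals generated from each nonterminal."""
    nts = list(grammar)
    index = {x: i for i, x in enumerate(nts)}
    n = len(nts)
    A = [[F(0)] * n for _ in range(n)]  # first-moment matrix of nonterminal occurrences
    t = [F(0)] * n                      # average terminals emitted per single expansion
    for lhs, productions in grammar.items():
        for prob, rhs in productions:
            for sym in rhs:
                if sym in index:
                    A[index[lhs]][index[sym]] += prob
                else:
                    t[index[lhs]] += prob
    i_minus_a = [[(1 if i == j else 0) - A[i][j] for j in range(n)] for i in range(n)]
    return dict(zip(nts, solve(i_minus_a, t)))


lengths = expected_lengths(GRAMMAR)
```

For this toy grammar the entry for A is 2 and the entry for S is 20/7, which agrees with solving the two recurrences x_A = 1 + x_A/2 and x_S = 0.3 x_S + x_A by hand. The sketch assumes a grammar for which I - A is invertible; it does not check that the production probabilities of each nonterminal sum to one.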
Thus, the expected sentence string length is the S-entry in the solution vector.

The problem and its solution are easily generalized to obtain the expected number of terminals of a particular type occurring in a string (Booth & Thompson 1973).

7.8.2 Derivation entropy

The derivation entropy is the average number of bits required to specify a derivation from an SCFG. It is computed from a right-hand side vector that contains the average negative log probabilities for the productions of each LHS nonterminal. For example, based on
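As a rough illustration of the derivation entropy computation, the following Python sketch builds the right-hand side vector of average negative log2 production probabilities and solves the I - A system for it. The toy grammar and the names GRAMMAR, solve and derivation_entropy are hypothetical and chosen here for illustration; they are not the example used in the dissertation. The sketch assumes all production probabilities are strictly positive and that I - A is invertible.

```python
import math

# Hypothetical toy SCFG (not from the source): LHS -> [(probability, RHS symbols)].
# Symbols that appear as keys are nonterminals; everything else is a terminal.
GRAMMAR = {
    "S": [(0.3, ["A", "S"]), (0.7, ["A"])],
    "A": [(0.5, ["a"]), (0.5, ["a", "A"])],
}


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def derivation_entropy(grammar):
    """Entropy in bits of the derivations rooted at each nonterminal."""
    nts = list(grammar)
    index = {x: i for i, x in enumerate(nts)}
    n = len(nts)
    A = [[0.0] * n for _ in range(n)]  # first-moment matrix of nonterminal occurrences
    h = [0.0] * n                      # average negative log2 probability per expansion
    for lhs, productions in grammar.items():
        for prob, rhs in productions:
            h[index[lhs]] -= prob * math.log2(prob)
            for sym in rhs:
                if sym in index:
                    A[index[lhs]][index[sym]] += prob
    i_minus_a = [[(1.0 if i == j else 0.0) - A[i][j] for j in range(n)]
                 for i in range(n)]
    return dict(zip(nts, solve(i_minus_a, h)))


entropy = derivation_entropy(GRAMMAR)
```

In this toy grammar each expansion of A is a fair coin flip, so its right-hand side entry is exactly 1 bit and the solved entropy for A is 2 bits; the entry for S then follows from the same linear system.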
