12.07.2015 Views

The dissertation of Andreas Stolcke is approved: University of ...



CHAPTER 4. STOCHASTIC CONTEXT-FREE GRAMMARS

    S --> A B        (10)
      --> A A B B    (8)
      --> A S B      (2)
      --> A A S B B  (1)
    A --> a          (30)
    B --> b          (30)

(log likelihood = -8.71).

Another chunking operation, chunk(A S B) = Y, produces

    S --> A B        (10)
      --> A A B B    (8)
      --> Y          (2)
      --> A Y B      (1)
    Y --> A S B      (3)
    A --> a          (30)
    B --> b          (30)

and again the new nonterminal Y is immediately merged with S:

    S --> A B        (10)
      --> A A B B    (8)
      --> A S B      (4)
    A --> a          (30)
    B --> b          (30)

(log likelihood = -8.69).

After one more chunking, chunk(A B) = Z, and subsequent merging step, merge(Z, S) = S, the grammar has its final form

    S --> A B        (18)
      --> A S B      (12)
    A --> a          (30)
    B --> b          (30)

(log likelihood = -8.77).

As in the HMM example, it is instructive to consider additional merging steps, which would lead to various overly general grammars, such as

    merge(A, B)    log likelihood = -26.83
    merge(S, A)    log likelihood = -23.77
    merge(S, B)    log likelihood = -23.77

This confirms our intuition that the right time to stop merging is characterized by a large drop in the likelihood. We should therefore see a posterior probability maximum at this point under any reasonable prior.

4.3.3 Bracketed samples

The major source of added difficulty in learning SCFGs, as opposed to finite-state models, comes from the added uncertainty about the phrase structure (bracketing) of the observed samples. The chunking operation was created precisely to account for this part of the learning process.
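
As an illustration of how the chunking and merging operations act on rule counts, the following Python sketch replays the steps of the example above. It is a minimal reconstruction under stated assumptions, not the implementation used in the dissertation: the names `chunk` and `merge`, the dictionary representation of the grammar, and the decision to drop rules of the form X --> X after a merge are all inferred from the listings (the rule S --> Y disappears once Y is merged into S). Likelihood computation is not attempted, since it depends on the sample corpus, which is not reproduced here.

```python
from collections import Counter

# A grammar is a Counter mapping (lhs, rhs_tuple) -> usage count.
# This representation is an assumption made for illustration only.

def chunk(grammar, seq, new_nt):
    """Replace every occurrence of `seq` on a right-hand side by `new_nt`
    and add the rule new_nt --> seq, counted once per replaced occurrence
    (weighted by the count of the rule it occurred in)."""
    seq = tuple(seq)
    n = len(seq)
    out = Counter()
    uses = 0
    for (lhs, rhs), count in grammar.items():
        new_rhs = []
        i = 0
        while i < len(rhs):
            if rhs[i:i + n] == seq:
                new_rhs.append(new_nt)
                uses += count
                i += n
            else:
                new_rhs.append(rhs[i])
                i += 1
        out[(lhs, tuple(new_rhs))] += count
    out[(new_nt, seq)] += uses
    return out

def merge(grammar, old, new):
    """Rename nonterminal `old` to `new` everywhere, summing the counts of
    rules that become identical. Rules of the form X --> X are dropped
    (an assumption inferred from the example listings)."""
    out = Counter()
    for (lhs, rhs), count in grammar.items():
        lhs = new if lhs == old else lhs
        rhs = tuple(new if sym == old else sym for sym in rhs)
        if rhs == (lhs,):
            continue
        out[(lhs, rhs)] += count
    return out

# Starting grammar: the first listing of the example.
start = Counter({
    ("S", ("A", "B")): 10,
    ("S", ("A", "A", "B", "B")): 8,
    ("S", ("A", "S", "B")): 2,
    ("S", ("A", "A", "S", "B", "B")): 1,
    ("A", ("a",)): 30,
    ("B", ("b",)): 30,
})

chunked_y = chunk(start, ("A", "S", "B"), "Y")   # chunk(A S B) = Y
merged_y = merge(chunked_y, "Y", "S")            # merge(Y, S) = S
chunked_z = chunk(merged_y, ("A", "B"), "Z")     # chunk(A B) = Z
final = merge(chunked_z, "Z", "S")               # merge(Z, S) = S

for (lhs, rhs), count in sorted(final.items()):
    print(f"{lhs} --> {' '.join(rhs)} ({count})")
```

Run on the first listing, the sketch yields the same rule counts as the intermediate and final grammars shown in the text, which suggests that the counts in the example follow from simple bookkeeping over rule usages.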
