Perplexity of n-gram and dependency language models
Language Models – design decisions<br />
1) How to factorize P(w1, w2, ... wm) into Πi=1..m P(wi | hi),<br />
i.e. which word positions will be used as the context hi?<br />
2) What additional context information will be used<br />
(apart from word forms),<br />
e.g. stems, lemmata, POS tags, word classes,... ?<br />
3) How to estimate P(wi | hi) from the training data?<br />
Which smoothing technique will be used?<br />
(Good-Turing, Jelinek-Mercer, Katz, Kneser-Ney, ...)<br />
Linear interpolation: weights trained by EM<br />
Generalized Parallel Backoff, etc.<br />
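Decisions 1) and 3) above can be sketched together in a small example: a trigram factorization P(wi | wi-2, wi-1) estimated by maximum likelihood and smoothed with Jelinek-Mercer-style linear interpolation, plus the perplexity that the lecture's title refers to. This is a minimal illustration, not the lecture's implementation; the interpolation weights are fixed by hand here rather than trained by EM, and the toy corpus and all function names are invented for the sketch.

```python
import math
from collections import Counter

def train_counts(tokens, n_max=3):
    """Collect n-gram counts of orders 1..n_max, padding with <s>."""
    padded = ["<s>"] * (n_max - 1) + tokens
    counts = {n: Counter() for n in range(1, n_max + 1)}
    for i in range(len(padded)):
        for n in range(1, n_max + 1):
            if i - n + 1 >= 0:
                counts[n][tuple(padded[i - n + 1 : i + 1])] += 1
    return counts

def interp_prob(counts, history, w, lambdas, vocab_size):
    """Linearly interpolated trigram probability:
       P(w|h) = l3*ML(w|u,v) + l2*ML(w|v) + l1*ML(w) + l0/|V|.
    The uniform 1/|V| term keeps every probability strictly positive."""
    l0, l1, l2, l3 = lambdas           # must sum to 1
    p = l0 / vocab_size
    # unigram ML (denominator includes <s> padding; fine for a sketch)
    p += l1 * counts[1][(w,)] / sum(counts[1].values())
    v = tuple(history[-1:])            # bigram context
    if counts[1][v] > 0:
        p += l2 * counts[2][v + (w,)] / counts[1][v]
    uv = tuple(history[-2:])           # trigram context
    if counts[2][uv] > 0:
        p += l3 * counts[3][uv + (w,)] / counts[2][uv]
    return p

def perplexity(counts, tokens, lambdas, vocab_size):
    """PP = 2^(-(1/m) * sum_i log2 P(wi | hi))."""
    padded = ["<s>", "<s>"] + tokens
    logp = 0.0
    for i in range(2, len(padded)):
        logp += math.log2(
            interp_prob(counts, padded[i - 2 : i], padded[i],
                        lambdas, vocab_size))
    return 2 ** (-logp / len(tokens))

# Toy usage (illustrative corpus):
tokens = "the cat sat on the mat the cat ate".split()
counts = train_counts(tokens)
V = len(set(tokens)) + 1               # vocabulary incl. padding symbol
pp = perplexity(counts, tokens, (0.1, 0.2, 0.3, 0.4), V)
```

Because the interpolation weights sum to one and each maximum-likelihood component is at most one, every interpolated probability lies in (0, 1], so the log-probabilities and perplexity are always finite.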