12.07.2015 Views

The dissertation of Andreas Stolcke is approved: University of ...



Given counts $c(q_1 \to q')$ and $c(q_2 \to q')$ in the current model, the merged state $q_3$ would be assigned a count

$$c(q_3 \to q') = c(q_1 \to q') + c(q_2 \to q').$$

This is correct if all samples with Viterbi paths through the transitions $q_1 \to q'$ and $q_2 \to q'$ retain their most likely paths in the merged model, simply replacing the merged states with $q_3$, and no other samples change their paths to include $q_3 \to q'$.

This path preservation assumption is not strictly true but holds most of the time, since the merges actually chosen are those that collapse states with similar distributions of transition and emission probabilities. The assumption can be easily tested, and the counts corrected, by reparsing the training data from time to time.

In an incremental model building scenario, where new samples are available in large number and incorporated one by one, interleaved with merging, one might not want to store all data seen in the past. In this case an exponentially decaying average of Viterbi counts can be kept instead. This has the effect that incorrect Viterbi counts will eventually fade away, being replaced by up-to-date counts obtained from parsing more recent data with the current model.

Incremental model evaluation

Using the techniques described in the previous sections, the evaluation of a model variant due to merging is now possible in $O(|Q| + |\Sigma|)$ amortized time, instead of $O((|Q| + |\Sigma|) \cdot |Q| \cdot |X|)$.
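The bookkeeping for Viterbi counts under merging, and the decaying average suggested for the incremental scenario, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names `merge_transition_counts` and `decay_update`, the state labels, the dictionary representation of counts, and in particular the weighting `decay * old + (1 - decay) * new`, since the text does not specify the exact form of the average.

```python
from collections import Counter


def merge_transition_counts(counts, q1, q2, q3):
    """Replace states q1 and q2 by q3, summing the Viterbi transition counts.

    `counts` maps (source_state, target_state) pairs to counts.
    This relies on the path preservation assumption: every Viterbi path
    through q1 or q2 is assumed to pass through q3 after the merge.
    """
    def rename(state):
        return q3 if state in (q1, q2) else state

    merged = Counter()
    for (src, dst), c in counts.items():
        merged[(rename(src), rename(dst))] += c
    return merged


def decay_update(avg, new_counts, decay=0.9):
    """Hypothetical exponentially decaying average of Viterbi counts.

    Old counts are down-weighted by `decay`, so counts that no longer
    match the current model fade as newer samples are parsed.
    """
    keys = set(avg) | set(new_counts)
    return {
        k: decay * avg.get(k, 0.0) + (1.0 - decay) * new_counts.get(k, 0.0)
        for k in keys
    }


# Illustrative counts: q1 and q2 both lead to the same successor state.
counts = Counter({
    ("start", "q1"): 3,
    ("start", "q2"): 2,
    ("q1", "q_next"): 3,
    ("q2", "q_next"): 2,
})
merged = merge_transition_counts(counts, "q1", "q2", "q3")

# Counts from reparsing a more recent sample with the merged model.
updated = decay_update({("q3", "q_next"): 5.0}, {("q3", "q_next"): 1.0}, decay=0.5)
```

In this sketch the merged transition from `q3` to `q_next` receives the sum of the two original counts, mirroring the count assignment for the merged state. Reparsing the training data would replace `merged` with freshly computed counts whenever the path preservation assumption needs to be checked.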
