12.07.2015 Views

The dissertation of Andreas Stolcke is approved: University of ...



CHAPTER 3. HIDDEN MARKOV MODELS

local posterior probability maxima in the space of HMM structures constructed by successive merging operations.

By far the most common problem found in practice is that the stopping criterion is triggered too early, since a single merging step alone decreases the posterior model probability, although additional related steps might eventually increase it. This happens although in the vast majority of cases the first step is in the right direction. The straightforward solution to this problem is to add a ‘lookahead’ to the best-first strategy. The stopping criterion is modified to trigger only after a fixed number of steps > 1 have produced no improvement; merging still proceeds along the best-first path. Due to this, the lookahead depth does not entail an exponential increase in computation as a full tree search would. The only additional cost is the work performed by looking ahead in vain at the end of a merging sequence. That cost is amortized over several samples if incremental merging with a batch size > 1 is being used.

Best-first merging with lookahead has been our method of choice for almost all applications, using lookaheads between 2 and 5. However, we have also experimented with beam search strategies. In these, a set of working models is kept at each time, either limited in number (say, the top-scoring ones), or by the difference in score to the current best model. On each inner loop of the search algorithm, all current models are modified according to the possible merges, and among the pool thus generated the best ones according to the beam criterion are retained.
(By including the unmerged models in the pool we get the effect of a lookahead.)

Some duplication of work results from the fact that different sequences of merges can lead to the same final HMM structure. To remove such gratuitous duplicates from the beam we attach a list of disallowed merges to each model, which is propagated from a model to its successors generated by merging. Multiple successors of the same model have the list extended so that later successors cannot produce identical results from simply permuting the merge sequence.

The resulting beam search version of our algorithm does indeed produce superior results on data that requires aligning long substrings of states, and where the quality of the alignment can only be evaluated after several coordinated merging steps. On the other hand, beam search is considerably more expensive than best-first search and may not be worth a marginal improvement.

All results in Section 3.6 were obtained using best-first search with lookahead. Nevertheless, improved search strategies and heuristics for merging remain an important problem for future research.

3.5 Related Work

Many of the ideas used in our approach to Bayesian HMM induction are not new by themselves, and can be found in similar forms in the vast literatures on grammar induction and statistical inference.
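
As a rough illustration of the lookahead stopping criterion used with best-first merging in the search discussion preceding Section 3.5, the following Python sketch may help. It is not the implementation used in this work: the function name `best_first_with_lookahead`, the callbacks `successors` and `score`, and the toy data are all hypothetical, and "no improvement" is assumed to mean no improvement over the best score seen so far. Models are opaque values here; in the HMM setting, `successors` would enumerate the models reachable by one merge and `score` would return the log posterior.

```python
from typing import Callable, Iterable, Tuple, TypeVar

M = TypeVar("M")  # opaque model type (an HMM structure in the text)


def best_first_with_lookahead(
    initial: M,
    successors: Callable[[M], Iterable[M]],  # hypothetical: models one merge away
    score: Callable[[M], float],             # hypothetical: log posterior of a model
    lookahead: int,
) -> Tuple[M, float]:
    """Follow the single best-first path; stop after `lookahead`
    consecutive steps that fail to beat the best score seen so far."""
    if lookahead < 1:
        raise ValueError("lookahead must be at least 1")

    best_model, best_score = initial, score(initial)
    current = initial
    failures = 0  # consecutive steps without improvement

    while failures < lookahead:
        scored = [(score(m), m) for m in successors(current)]
        if not scored:
            break  # nothing left to merge
        # Best-first: always take the highest-scoring successor,
        # even if it scores worse than the current model.
        current_score, current = max(scored, key=lambda pair: pair[0])
        if current_score > best_score:
            best_model, best_score = current, current_score
            failures = 0
        else:
            failures += 1

    # Steps taken "in vain" past the best model are discarded.
    return best_model, best_score


# Toy search space (assumed for illustration): the first step from "a"
# lowers the score, but a second related step raises it above the start.
TOY_SUCCESSORS = {"a": ["b", "c"], "b": ["d"], "c": ["e"], "d": [], "e": []}
TOY_SCORES = {"a": 0.0, "b": -1.0, "c": -2.0, "d": 3.0, "e": 5.0}

if __name__ == "__main__":
    for depth in (1, 2):
        result = best_first_with_lookahead(
            "a", TOY_SUCCESSORS.__getitem__, TOY_SCORES.__getitem__, depth
        )
        print(depth, result)
```

With a lookahead of 1 the toy search stops at the starting model, because the first step lowers the score; with a lookahead of 2 it continues along the same path and reaches "d". The higher-scoring "e" is never visited, which reflects the point that lookahead extends the single best-first path rather than performing a tree search.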
