Foundations of Data Science

previous state j by putting an arrow with edge label t from i to j. At the end, one can find the most likely sequence by tracing backwards, as is standard for dynamic programming algorithms.

Example: For the earlier example, what is the most likely sequence of states to produce the output hhht?

[Dynamic-programming table for t = 0, 1, 2, 3: for each of the states q and p, the maximum probability of any state sequence producing the output seen so far, with the maximization at each step taken over the possible previous states.]

Note that the two sequences of states, qqpq and qpqq, are tied for the most likely sequence of states.
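The table-and-backtrack procedure above can be sketched in code. This is a minimal sketch of the standard Viterbi dynamic program; the transition and emission values shown in the usage below are illustrative stand-ins, not the parameters of the book's earlier example, and ties (such as the one noted above) are broken by lowest state index.

```python
def viterbi(trans, emit, init, obs):
    """Most likely state sequence of an HMM, by dynamic programming.

    trans[i][j]: transition probability from state i to state j
    emit[j][k] : probability that state j outputs symbol k
    init[i]    : initial state distribution
    obs        : observed output, as a list of symbol indices
    Returns (probability of the best sequence, the sequence itself).
    """
    n = len(init)
    # V[j]: max probability of any state sequence ending in state j
    # that produces the output seen so far.
    V = [init[j] * emit[j][obs[0]] for j in range(n)]
    back = []  # back[t][j]: best previous state, for tracing backwards
    for sym in obs[1:]:
        ptr, newV = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: V[i] * trans[i][j])
            ptr.append(best_i)
            newV.append(V[best_i] * trans[best_i][j] * emit[j][sym])
        back.append(ptr)
        V = newV
    # Trace backwards from the most likely final state.
    state = max(range(n), key=lambda j: V[j])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return max(V), path

# Illustrative two-state HMM (hypothetical parameters), output hhht
# encoded as h = 0, t = 1:
prob, path = viterbi(
    trans=[[0.75, 0.25], [0.25, 0.75]],
    emit=[[0.5, 0.5], [0.25, 0.75]],
    init=[1.0, 0.0],
    obs=[0, 0, 0, 1],
)
```

Recording the backpointer `back[t][j]` at each step is exactly the "arrow with edge label t from i to j" described above; the final pass reads the arrows off in reverse.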

Determining the underlying hidden Markov model<br />

Given an n-state HMM, how do we adjust the transition probabilities and output probabilities to maximize the probability of an output sequence O_1 O_2 · · · O_T? The assumption is that T is much larger than n. There is no known computationally efficient method for solving this problem. However, there are iterative techniques that converge to a local optimum.

Let a_ij be the transition probability from state i to state j and let b_j(O_k) be the probability of output O_k given that the HMM is in state j. Given estimates for the HMM parameters, a_ij and b_j, and the output sequence O, we can improve the estimates by calculating for each unit of time the probability that the HMM goes from state i to state j and outputs the symbol O_k.

Given estimates for the HMM parameters, a_ij and b_j, and the output sequence O, the probability δ_t(i, j) of going from state i to state j at time t is given by the probability of producing the output sequence O and going from state i to state j at time t, divided by the probability of producing the output sequence O:

δ_t(i, j) = α_t(i) a_ij b_j(O_{t+1}) β_{t+1}(j) / p(O)

The probability p(O) is the sum over all pairs of states i and j of the numerator in the above formula for δ_t(i, j). That is,

p(O) = ∑_i ∑_j α_t(i) a_ij b_j(O_{t+1}) β_{t+1}(j).
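The computation of δ_t(i, j) can be sketched directly from these formulas. The sketch below assumes the standard forward probabilities α_t(i) (probability of producing O_1 · · · O_{t+1} and ending in state i, with time indexed from 0) and backward probabilities β_t(j) (probability of producing the remaining output given state j at time t); the function names and the normalization-by-sum structure are implementation choices, not prescribed by the text.

```python
def forward(trans, emit, init, obs):
    """alpha[t][i]: probability of emitting obs[0..t] and being in state i."""
    n = len(init)
    alpha = [[init[i] * emit[i][obs[0]] for i in range(n)]]
    for sym in obs[1:]:
        prev = alpha[-1]
        alpha.append([sum(prev[i] * trans[i][j] for i in range(n)) * emit[j][sym]
                      for j in range(n)])
    return alpha

def backward(trans, emit, obs, n):
    """beta[t][i]: probability of emitting obs[t+1..] given state i at time t."""
    T = len(obs)
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = [sum(trans[i][j] * emit[j][obs[t + 1]] * beta[t + 1][j]
                       for j in range(n)) for i in range(n)]
    return beta

def delta(trans, emit, init, obs, t):
    """delta[i][j] = alpha_t(i) a_ij b_j(O_{t+1}) beta_{t+1}(j) / p(O)."""
    n = len(init)
    alpha = forward(trans, emit, init, obs)
    beta = backward(trans, emit, obs, n)
    # Numerator of the formula for delta_t(i, j).
    num = [[alpha[t][i] * trans[i][j] * emit[j][obs[t + 1]] * beta[t + 1][j]
            for j in range(n)] for i in range(n)]
    # p(O) is the sum of the numerator over all pairs of states i and j.
    pO = sum(sum(row) for row in num)
    return [[x / pO for x in row] for row in num]
```

By construction the entries of δ_t sum to 1 over all pairs (i, j), since p(O) is exactly the sum of the numerators; this is the per-time-step quantity from which the iterative re-estimation of a_ij and b_j proceeds.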

