Notes on computational linguistics.pdf - UCLA Department of ...
Stabler - Lx 185/209 2003
c. Finally, given P(qi ⇒ a1 ...an) for all qi ∈ ΩX,

P(a1 ...an) = Σ_{qi ∈ ΩX} P0(qi) P(qi ⇒ a1 ...an)
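The backward computation in (c) can be sketched directly in Python. The two-state machine below is a made-up stand-in for illustration only (it is not the coffee machine of (84)); its states, transition table T, and emission table E are all assumptions of this sketch:

```python
# Backward computation: P(qi => a1...an) is the probability that,
# starting in state qi, the machine emits the whole output sequence.

def seq_prob_from(q, outputs, T, E):
    # Base case: a single remaining output is emitted by q itself.
    if len(outputs) == 1:
        return E[q][outputs[0]]
    # Emit outputs[0] in q, then sum over all possible next states.
    return E[q][outputs[0]] * sum(
        T[q][r] * seq_prob_from(r, outputs[1:], T, E) for r in T[q])

def output_prob(outputs, states, P0, T, E):
    # P(a1...an) = sum over initial states qi of P0(qi) * P(qi => a1...an)
    return sum(P0[q] * seq_prob_from(q, outputs, T, E) for q in states)

# A made-up two-state machine (NOT the coffee machine of (84)):
STATES = ['q1', 'q2']
P0 = {'q1': 0.6, 'q2': 0.4}            # initial state distribution
T = {'q1': {'q1': 0.7, 'q2': 0.3},     # transition probabilities P(qj|qi)
     'q2': {'q1': 0.4, 'q2': 0.6}}
E = {'q1': {'a': 0.9, 'b': 0.1},       # emission probabilities P(a|q)
     'q2': {'a': 0.2, 'b': 0.8}}
```

For the two-symbol output 'a','b' this gives P(ab) = 0.6·0.9·(0.7·0.1 + 0.3·0.8) + 0.4·0.2·(0.4·0.1 + 0.6·0.8) = 0.209.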
(86) Exercise: Use the coffee machine as elaborated in (84) and the backward method to compute the probability of the output sequence s1s3s3.
8.1.11 Computing most probable parses: Viterbi’s algorithm
(87) Given a string a1 ...an output by a Markov model, what is the most likely sequence of states that could have yielded this string? This is analogous to finding a most probable parse of a string.
Notice that we could solve this problem by calculating the probabilities of the output sequence for each of the |ΩX|^n state sequences, but this is not feasible!
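For concreteness, the infeasible brute-force search might look like the sketch below: it scores every one of the |ΩX|^n state sequences. The two-state machine and its probabilities are made up for illustration (not the coffee machine of (84)):

```python
from itertools import product

def brute_force_best_path(outputs, states, P0, T, E):
    # Enumerate all |states|**len(outputs) state sequences -- exponential!
    best_prob, best_path = -1.0, None
    for path in product(states, repeat=len(outputs)):
        # Joint probability of this state sequence producing the outputs.
        p = P0[path[0]] * E[path[0]][outputs[0]]
        for t in range(1, len(outputs)):
            p *= T[path[t - 1]][path[t]] * E[path[t]][outputs[t]]
        if p > best_prob:
            best_prob, best_path = p, list(path)
    return best_path, best_prob

# A made-up two-state machine (NOT the coffee machine of (84)):
STATES = ['q1', 'q2']
P0 = {'q1': 0.6, 'q2': 0.4}
T = {'q1': {'q1': 0.7, 'q2': 0.3},
     'q2': {'q1': 0.4, 'q2': 0.6}}
E = {'q1': {'a': 0.9, 'b': 0.1},
     'q2': {'a': 0.2, 'b': 0.8}}
```

Even for this toy machine, a length-n output forces 2^n sequence evaluations; the Viterbi algorithm below reduces this to work linear in n.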
(88) The Viterbi algorithm allows efficient calculation of the most probable sequence of states producing a given output (Viterbi 1967; Forney 1973), using an idea similar to the forward calculation of output sequence probabilities in §8.1.9 above.
Intuitively, once we know the best way to get to any state in ΩX at time t, the best path to the next state is an extension of one of those.
(89) a. Calculate, for each possible initial state qi ∈ ΩX,
P(qi, a1) = P0(qi)P(a1|qi)
and record: qi : P(qi, a1)@ɛ.
That is, for each state qi, we record the probability of the state sequence ending in qi.
b. Recursive step: Given qi : P(q qi, a1 ...at)@q for each qi ∈ ΩX,
for each qj ∈ ΩX find a qi that maximizes
P(q qi qj, a1 ...at at+1) = P(q qi, a1 ...at) P(qj|qi) P(at+1|qj)
and record: qj : P(q qi qj, a1 ...at at+1)@q qi.³⁷
c. After these values have been computed up to the final time n, we choose a qi : P(q qi, a1 ...an)@q with a maximum probability P(q qi, a1 ...an).
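Steps (a)–(c) can be sketched in Python as follows, keeping for each state the best probability together with its state path (the record P(...)@path of the text). The two-state machine is again a made-up stand-in, not the coffee machine of (84):

```python
def viterbi(outputs, states, P0, T, E):
    # Step (a): for each initial state, record P0(qi)*P(a1|qi) and the
    # path so far (just qi itself; the path before it is empty).
    best = {q: (P0[q] * E[q][outputs[0]], [q]) for q in states}
    # Step (b): for each later output, extend the best recorded path
    # into each state qj, maximizing over the predecessor qi.
    for a in outputs[1:]:
        best = {
            qj: max(
                ((p * T[qi][qj] * E[qj][a], path + [qj])
                 for qi, (p, path) in best.items()),
                key=lambda pair: pair[0])
            for qj in states}
    # Step (c): choose a final record with maximum probability.
    prob, path = max(best.values(), key=lambda pair: pair[0])
    return path, prob

# A made-up two-state machine (NOT the coffee machine of (84)):
STATES = ['q1', 'q2']
P0 = {'q1': 0.6, 'q2': 0.4}
T = {'q1': {'q1': 0.7, 'q2': 0.3},
     'q2': {'q1': 0.4, 'q2': 0.6}}
E = {'q1': {'a': 0.9, 'b': 0.1},
     'q2': {'a': 0.2, 'b': 0.8}}
```

Note that only the records from the previous time step are consulted when building the new ones, which is the observation taken up in (91) below.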
(90) Exercise: Use the coffee machine as elaborated in (84) to compute the most likely state sequence underlying the output sequence s1s3s3.
(91) The Viterbi algorithm is not incremental: at every time step, |ΩX| different parses are being considered. As stated, the algorithm stores arbitrarily long state paths at each step, but notice that each step only needs the results of the previous step: |ΩX| different probabilities (an unbounded memory requirement, unless precision can be bounded).
³⁷In case more than one qi ties for the maximum P(q qi qj, a1 ...at at+1), we can either make a choice, or else carry all the winning options forward.