20.07.2013 Views

Notes on computational linguistics.pdf - UCLA Department of ...


Stabler - Lx 185/209 2003

c. Finally, given P(qi ⇒ a1 ...an) for all qi ∈ ΩX,

P(a1 ...an) = Σ_{qi ∈ ΩX} P0(qi) P(qi ⇒ a1 ...an)
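The final step in c. can be sketched in Python as follows. Steps a. and b. of the backward method are not shown in this section, so the recursion used here is an assumption: it takes P(qi ⇒ at ...an) to be the probability of emitting at ...an when the model is in qi at time t, with each state emitting its output as in P0(qi)P(a1|qi). The two-state model (STATES, P0, TRANS, EMIT) is hypothetical and is not the coffee machine of (84).

```python
# Hypothetical toy model, for illustration only (not the coffee machine of (84)).
STATES = ["A", "B"]
P0 = {"A": 0.6, "B": 0.4}                      # initial probabilities P0(q)
TRANS = {"A": {"A": 0.7, "B": 0.3},            # TRANS[qi][qj] = P(qj|qi)
         "B": {"A": 0.4, "B": 0.6}}
EMIT = {"A": {"x": 0.5, "y": 0.5},             # EMIT[q][a] = P(a|q)
        "B": {"x": 0.1, "y": 0.9}}


def backward_probability(outputs, states, p0, trans, emit):
    """Return P(a1 ...an) for a non-empty output sequence.

    beta[q] holds P(q => at ...an), computed from t = n down to t = 1.
    The recursion is an assumed form of the steps not shown in this section.
    """
    # Assumed base case: P(q => an) = P(an|q).
    beta = {q: emit[q][outputs[-1]] for q in states}
    # Assumed recursive case:
    # P(q => at ...an) = P(at|q) * sum over q' of P(q'|q) P(q' => at+1 ...an).
    for a in reversed(outputs[:-1]):
        beta = {q: emit[q][a] * sum(trans[q][r] * beta[r] for r in states)
                for q in states}
    # Step c: weight each P(qi => a1 ...an) by the initial probability P0(qi).
    return sum(p0[q] * beta[q] for q in states)


p = backward_probability(["x", "y"], STATES, P0, TRANS, EMIT)
```

One sanity check on a sketch like this: summed over every output string of a fixed length, the values returned should add up to 1.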

(86) Exercise: Use the coffee machine as elaborated in (84) and the backward method to compute the probability of the output sequence s1s3s3.

8.1.11 Computing most probable parses: Viterbi’s algorithm<br />

(87) Given a string a1 ...an output by a Markov model, what is the most likely sequence of states that could have yielded this string? This is analogous to finding a most probable parse of a string.

Notice that we could solve this problem by calculating the probabilities of the output sequence for each of the |ΩX|^n state sequences, but this is not feasible: the number of state sequences grows exponentially with n.
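To make the cost of the naive method concrete, here is a minimal Python sketch that enumerates all |ΩX|^n state sequences and keeps the most probable one. The function names and the two-state model (STATES, P0, TRANS, EMIT) are hypothetical; the model is not the coffee machine of (84).

```python
from itertools import product

# Hypothetical toy model, for illustration only.
STATES = ["A", "B"]
P0 = {"A": 0.6, "B": 0.4}                      # initial probabilities P0(q)
TRANS = {"A": {"A": 0.7, "B": 0.3},            # TRANS[qi][qj] = P(qj|qi)
         "B": {"A": 0.4, "B": 0.6}}
EMIT = {"A": {"x": 0.5, "y": 0.5},             # EMIT[q][a] = P(a|q)
        "B": {"x": 0.1, "y": 0.9}}


def path_probability(path, outputs, p0, trans, emit):
    """Joint probability of one state sequence and the output sequence."""
    p = p0[path[0]] * emit[path[0]][outputs[0]]
    for prev, cur, a in zip(path, path[1:], outputs[1:]):
        p *= trans[prev][cur] * emit[cur][a]
    return p


def brute_force_best_path(outputs, states, p0, trans, emit):
    """Score every one of the len(states)**len(outputs) state sequences."""
    candidates = product(states, repeat=len(outputs))
    scored = ((path_probability(path, outputs, p0, trans, emit), path)
              for path in candidates)
    return max(scored, key=lambda pair: pair[0])


best_prob, best_path = brute_force_best_path(["x", "y"], STATES, P0, TRANS, EMIT)
```

With only 2 states, an output of length 50 already has 2^50 candidate sequences, which is why this enumeration is useful only as a check on very short strings.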

(88) The Viterbi algorithm allows efficient calculation of the most probable sequence of states producing a given output (Viterbi 1967; Forney 1973), using an idea that is similar to the forward calculation of output sequence probabilities in §8.1.9 above.

Intuitively, once we know the best way to get to any state in ΩX at a time t, the best path to the next state is an extension of one of those.

(89) a. Calculate, for each possible initial state qi ∈ ΩX,

P(qi,a1) = P0(qi)P(a1|qi).

and record: qi : P(qi,a1)@ε.

That is, for each state qi, we record the probability of the state sequence ending in qi.

b. Recursive step: Given qi : P(qqi,a1 ...at)@q for each qi ∈ ΩX,

for each qj ∈ ΩX find a qi that maximizes

P(qqiqj,a1 ...at at+1) = P(qqi,a1 ...at)P(qj|qi)P(at+1|qj)

and record: qj : P(qqiqj,a1 ...at at+1)@qqi. [37]

c. After these values have been computed up to the final time tn, we choose a qi : P(qqi,a1 ...an)@q with a maximum probability P(qqi,a1 ...an).
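Steps a. to c. can be sketched in Python as follows. Each record qi : P@q is stored as a pair (probability, prefix of earlier states). The function name and the two-state model (STATES, P0, TRANS, EMIT) are hypothetical, not the coffee machine of (84), and ties are broken by taking the first maximum found, which is one way of "making a choice".

```python
# Hypothetical toy model, for illustration only.
STATES = ["A", "B"]
P0 = {"A": 0.6, "B": 0.4}                      # initial probabilities P0(q)
TRANS = {"A": {"A": 0.7, "B": 0.3},            # TRANS[qi][qj] = P(qj|qi)
         "B": {"A": 0.4, "B": 0.6}}
EMIT = {"A": {"x": 0.5, "y": 0.5},             # EMIT[q][a] = P(a|q)
        "B": {"x": 0.1, "y": 0.9}}


def viterbi(outputs, states, p0, trans, emit):
    """Return (probability, state sequence) of a most probable path.

    Assumes a non-empty output sequence.
    """
    # Step a: record qi : P(qi, a1) @ empty prefix.
    best = {q: (p0[q] * emit[q][outputs[0]], ()) for q in states}

    # Step b: extend the best path into each qj by one more output symbol.
    for a in outputs[1:]:
        new_best = {}
        for qj in states:
            scored = ((best[qi][0] * trans[qi][qj] * emit[qj][a], qi)
                      for qi in states)
            prob, best_qi = max(scored, key=lambda pair: pair[0])
            # Record qj : prob @ (old prefix followed by the chosen qi).
            new_best[qj] = (prob, best[best_qi][1] + (best_qi,))
        best = new_best

    # Step c: choose a final state whose record has maximum probability.
    q_final = max(states, key=lambda q: best[q][0])
    prob, prefix = best[q_final]
    return prob, prefix + (q_final,)


prob, path = viterbi(["y", "y", "y"], STATES, P0, TRANS, EMIT)
```

Each time step does |ΩX|^2 multiplications, so the whole computation is linear in the length of the output, in contrast to scoring all |ΩX|^n state sequences.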

(90) Exercise: Use the coffee machine as elaborated in (84) to compute the most likely state sequence underlying the output sequence s1s3s3.

(91) The Viterbi algorithm is not incremental: at every time step |ΩX| different parses are being considered.

As stated, the algorithm stores arbitrarily long state paths at each step, but notice that each step only needs the results of the previous step: |ΩX| different probabilities (an unbounded memory requirement, unless precision can be bounded).
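The observation that each step needs only the previous step's |ΩX| probabilities can be sketched as follows. Instead of copying whole state paths into every record, this variant keeps one probability per state and, for each time step, one back-pointer per state; the path is read off at the end. This is a common variant of the method rather than the formulation in (89), and the function name and two-state model (STATES, P0, TRANS, EMIT) are hypothetical.

```python
# Hypothetical toy model, for illustration only.
STATES = ["A", "B"]
P0 = {"A": 0.6, "B": 0.4}                      # initial probabilities P0(q)
TRANS = {"A": {"A": 0.7, "B": 0.3},            # TRANS[qi][qj] = P(qj|qi)
         "B": {"A": 0.4, "B": 0.6}}
EMIT = {"A": {"x": 0.5, "y": 0.5},             # EMIT[q][a] = P(a|q)
        "B": {"x": 0.1, "y": 0.9}}


def viterbi_backpointers(outputs, states, p0, trans, emit):
    """Most probable path, keeping only the previous step's probabilities."""
    # prev[q]: probability of the best path ending in q at the current time.
    prev = {q: p0[q] * emit[q][outputs[0]] for q in states}
    backpointers = []  # one dict per later time step: state -> best predecessor

    for a in outputs[1:]:
        cur, pointers = {}, {}
        for qj in states:
            # Only prev is consulted here; earlier steps are never revisited.
            best_qi = max(states, key=lambda qi: prev[qi] * trans[qi][qj])
            cur[qj] = prev[best_qi] * trans[best_qi][qj] * emit[qj][a]
            pointers[qj] = best_qi
        backpointers.append(pointers)
        prev = cur

    # Trace the pointers back from a best final state.
    last = max(states, key=lambda q: prev[q])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return prev[last], tuple(path)


prob, path = viterbi_backpointers(["y", "y", "y"], STATES, P0, TRANS, EMIT)
```

The probabilities themselves shrink with every step, so for long outputs their exact representation needs more and more digits; this is the precision issue mentioned above.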

[37] In case more than one qi ties for the maximum P(qqiqj,a1 ...at at+1), we can either make a choice, or else carry all the winning options forward.
