20.07.2013 Views

Notes on computational linguistics.pdf - UCLA Department of ...


Stabler - Lx 185/209 2003

8.2.2 Stochastic CKY parser

(133) We have extended the CKY parsing strategy so that it can handle any CFG, and augmented the chart entries so that they indicate the rule used to generate each item and the positions of internal boundaries.

We still have a problem getting the parses out of the chart, since there can be too many of them: we do not want to take them out one at a time!

One thing we can do is to extract just the most probable parse. An equivalent idea is to make all the relevant comparisons before adding an item to the chart.

(134) For any input string, the CKY parser chart represents a grammar that generates only the input string. We can find the most probable parse using the trellis-like construction familiar from Viterbi’s algorithm.

(135) For any positions 0 ≤ i, j ≤ |input|, we can find the rule (i, j) : A : X with maximal probability.

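\[
(i-1,\, a_i,\, i) \quad [\text{axiom}]
\]

\[
\frac{(i,\, a,\, j)}{(i,\, A,\, j,\, p)} \quad [\text{reduce1}] \quad \text{if } A \xrightarrow{p} a \text{ and } \neg\exists\, A \xrightarrow{p'} a \text{ such that } p' > p
\]

\[
\frac{(i,\, B,\, j,\, p_1) \quad (j,\, C,\, k,\, p_2)}{(i,\, A,\, k,\, p_1 * p_2 * p_3)} \quad [\text{reduce2}] \quad \text{if } A \xrightarrow{p_3} B\,C \text{ and } \neg\exists\, (i,\, B',\, j,\, p'_1),\ (j,\, C',\, k,\, p'_2),\ A \xrightarrow{p'_3} B'\,C' \text{ such that } p'_1 * p'_2 * p'_3 > p_1 * p_2 * p_3
\]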
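The reduce1 and reduce2 rules in (135) can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not code from these notes: the toy grammar (LEXICAL, BINARY), the function names viterbi_cky and tree, and the example sentence are all hypothetical, and the grammar is assumed to be in Chomsky normal form. The sketch keeps just the most probable item for each category over each span, maximizing over all split points j, and records a backpointer so that the best parse can be read off afterwards.

```python
# Hypothetical toy PCFG in Chomsky normal form (an assumption for
# illustration; it does not come from the notes).
LEXICAL = {  # (A, a) -> p   for rules  A -p-> a
    ("NP", "she"): 0.3, ("NP", "fish"): 0.3, ("NP", "forks"): 0.2,
    ("V", "eats"): 1.0, ("P", "with"): 1.0,
}
BINARY = {   # (A, B, C) -> p3   for rules  A -p3-> B C
    ("S", "NP", "VP"): 1.0,
    ("VP", "V", "NP"): 0.6, ("VP", "VP", "PP"): 0.4,
    ("NP", "NP", "PP"): 0.2,
    ("PP", "P", "NP"): 1.0,
}

def viterbi_cky(words, lexical, binary):
    """Return (best, back): best[(i, A, k)] is the probability of the most
    probable A spanning positions i..k; back holds how it was built."""
    n = len(words)
    best, back = {}, {}
    # reduce1: from the axiom (i, a, i+1) build (i, A, i+1, p),
    # keeping only the most probable p for each A.
    for i, w in enumerate(words):
        for (A, a), p in lexical.items():
            if a == w and p > best.get((i, A, i + 1), 0.0):
                best[(i, A, i + 1)] = p
                back[(i, A, i + 1)] = w
    # reduce2: combine (i, B, j, p1) and (j, C, k, p2) into
    # (i, A, k, p1 * p2 * p3), narrower spans first.
    for width in range(2, n + 1):
        for i in range(0, n - width + 1):
            k = i + width
            for j in range(i + 1, k):
                for (A, B, C), p3 in binary.items():
                    p1 = best.get((i, B, j))
                    p2 = best.get((j, C, k))
                    if p1 is None or p2 is None:
                        continue
                    p = p1 * p2 * p3
                    # The comparison happens before the item enters the chart.
                    if p > best.get((i, A, k), 0.0):
                        best[(i, A, k)] = p
                        back[(i, A, k)] = (j, B, C)
    return best, back

def tree(back, i, A, k):
    """Rebuild the most probable tree for category A over span i..k."""
    entry = back[(i, A, k)]
    if isinstance(entry, str):      # lexical item
        return (A, entry)
    j, B, C = entry
    return (A, tree(back, i, B, j), tree(back, j, C, k))

best, back = viterbi_cky("she eats fish with forks".split(), LEXICAL, BINARY)
print(best[(0, "S", 5)])
print(tree(back, 0, "S", 5))
```

With this toy grammar the sentence has two analyses, one attaching the PP to the VP and one attaching it to the object NP. Because only the more probable VP over the span 1..5 is kept, the chart holds a single S item over the whole input, and the tree can be recovered without enumerating alternatives.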

(136) This algorithm does (approximately) as many comparisons of items as the non-probabilistic version, since the reduce rules require identifying the most probable items of each category over each span of the input.

To reduce the chart size, we need to restrict the rules so that we do not get all the items in there, and then there is a risk of missing some analyses.
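One common way to restrict what enters the chart is beam pruning, shown here only as an illustration; the notes do not say which restriction they have in mind. In this minimal sketch, prune_cell, the beam parameter, and the probabilities in CELL are all hypothetical. A cell holds the items over a single span, and any item whose probability falls below a fixed fraction of the best item in that cell is dropped.

```python
def prune_cell(cell, beam):
    """Keep only the items whose probability is at least
    beam * (probability of the best item in the cell).

    cell maps category -> probability for one span (i, k);
    beam is a fraction between 0 and 1 (hypothetical parameter).
    """
    if not cell:
        return {}
    threshold = beam * max(cell.values())
    return {cat: p for cat, p in cell.items() if p >= threshold}

# Hypothetical items over one span, with made-up probabilities.
CELL = {"NP": 0.012, "VP": 0.0001, "S": 0.000002}

print(prune_cell(CELL, 0.01))   # a narrow beam drops the low-probability items
print(prune_cell(CELL, 0.0))    # beam 0 keeps everything
```

The risk mentioned above shows up directly: if the discarded VP item were the one needed to build the only S over the whole input, the pruned chart would contain no complete analysis at all, even though the grammar generates the string.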
