20.07.2013 Views

Notes on computational linguistics.pdf - UCLA Department of ...


Stabler - Lx 185/209 2003

(50) Nem fogok haza-menni kezdeni akarni      V1 M V4 V3 V2
     not will-1s home-go-inf begin-inf want-inf

One analysis of verbal clusters in Hungarian (Koopman and Szabolcsi, 2000a) suggests that they “roll up” from the end of the string as shown below:

rolling up:

Nem fogok akarni kezdeni haza-menni      V1 V2 V3 M V4
not will-1s want-inf begin-inf home-go-inf

Nem fogok akarni haza-menni kezdeni      V1 V2 M V4 V3

Nem fogok haza-menni kezdeni akarni      V1 M V4 V3 V2
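The three orders displayed above can be reproduced mechanically. The following Python sketch only illustrates the surface word orders, not the Koopman and Szabolcsi analysis itself: it treats the sentence as a prefix of verbs followed by a growing cluster. The function name `roll_up_orders` and the choice to leave V1 in place are assumptions made for this illustration.

```python
def roll_up_orders(prefix, cluster):
    """Return the successive word orders produced by rolling up.

    prefix  -- verbs still to the left of the cluster, e.g. V1 V2 V3
    cluster -- the constituent that rolls up, initially [M, V4]
    """
    prefix, cluster = list(prefix), list(cluster)
    orders = [prefix + cluster]
    # Assumption: the finite verb V1 stays in place, so stop with one verb left.
    while len(prefix) > 1:
        pivot = prefix.pop()      # the verb immediately to the left of the cluster
        cluster.append(pivot)     # the cluster moves in front of it and absorbs it
        orders.append(prefix + cluster)
    return orders


for order in roll_up_orders(["V1", "V2", "V3"], ["M", "V4"]):
    print(" ".join(order))
```

Each printed line corresponds to one line of the display: V1 V2 V3 M V4, then V1 V2 M V4 V3, then V1 M V4 V3 V2.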

[M] moves around V4, then [M V4] rolls up around V3, then [M V4 V3] rolls up around V2, and so on. It turns out that this kind of derivation can derive complex patterns of dependencies which can yield formal languages like $\{a^n b^n c^n \mid n \geq 0\}$, or even $\{a^n b^n c^n d^n e^n \mid n \geq 0\}$: any number of counting dependencies. We can define these languages without any kind of “rolling up” constituents if we help ourselves to (unboundedly many) feature values and unification:

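% anbncn.pl
% X is a unary numeral (0, s(0), s(s(0)), ...) shared by 'A', 'B' and 'C',
% so unification forces the three blocks to have the same length.
'S' :~ ['A'(X),'B'(X),'C'(X)].
'A'(s(X)) :~ [a,'A'(X)]. 'B'(s(X)) :~ [b,'B'(X)]. 'C'(s(X)) :~ [c,'C'(X)].
'A'(0) :~ []. 'B'(0) :~ []. 'C'(0) :~ [].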
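To see what the shared feature value does, the grammar above can be mimicked in a few lines of Python. This is a minimal sketch, not the parser used in these notes: a Python integer `n` stands in for the unary numeral, and the names `expand`, `generate` and `recognize` are hypothetical helpers introduced only for this illustration.

```python
def expand(symbol, n):
    """Mimic 'A'(s(X)) :~ [a,'A'(X)] and 'A'(0) :~ [] for one category."""
    if n == 0:
        return []
    return [symbol] + expand(symbol, n - 1)


def generate(n):
    """Mimic 'S' :~ ['A'(X),'B'(X),'C'(X)]: all three categories share n."""
    return expand("a", n) + expand("b", n) + expand("c", n)


def recognize(string):
    """Accept exactly the strings a^n b^n c^n by guessing n from the length."""
    n, remainder = divmod(len(string), 3)
    return remainder == 0 and list(string) == generate(n)


print("".join(generate(2)))
print(recognize("aaabbbccc"), recognize("aabbc"), recognize("abcabc"))
```

Because the same `n` is passed to all three calls of `expand`, the three blocks cannot differ in length, which is the work that unification on `X` does in the grammar.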

4.2 Semilinearity and some inhuman linguistic patterns<br />

In the previous section we saw grammars for $\{a^n b^n \mid n \geq 0\}$, $\{xx \mid x \in \{a,b\}^*\}$, and $\{a^n b^n c^n \mid n \geq 0\}$. These languages all have a basic property in common, which can be seen by counting the number of symbols in each string of each language.

For example, for

$\{a^n b^n \mid n \geq 0\} = \{\epsilon, ab, aabb, aaabbb, \ldots\}$

we can use $(x, y)$ to represent $x$ a's and $y$ b's, so we see that the strings in this language have the following counts:

$\{(0,0), (1,1), (2,2), \ldots\} = \{(x,y) \mid x = y\}$.
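The counting step can be checked on a finite sample of the language. The sketch below is only an illustration: `counts` and `anbn` are names chosen here, and the alphabet order `"ab"` is assumed to fix the order of the coordinates.

```python
def counts(string, alphabet="ab"):
    """Return the tuple of symbol counts, one coordinate per alphabet symbol."""
    return tuple(string.count(symbol) for symbol in alphabet)


def anbn(n):
    """Return the string a^n b^n."""
    return "a" * n + "b" * n


sample = [anbn(n) for n in range(4)]      # '', 'ab', 'aabb', 'aaabbb'
print([counts(s) for s in sample])        # [(0, 0), (1, 1), (2, 2), (3, 3)]
```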

For $\{xx \mid x \in \{a,b\}^*\}$, each symbol of $x$ is counted twice, so we have the pairs $\{(2i, 2j) \mid i, j \in \mathbb{N}\}$. For $\{a^n b^n c^n \mid n \geq 0\}$ we have the set of triples $\{(x,y,z) \mid x = y = z\}$. If we look at just the number of a's in each language, considering the set of values of first coordinates of the tuples, then we can list those sets by value, obtaining for the first and third languages the sequence:

0, 1, 2, 3, ...

and for the second the sequence 0, 2, 4, 6, ...
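Listing the first coordinates by value is easy to do for a finite sample. In this sketch, `first_coordinates` is a hypothetical helper, and the samples cover only $n < 5$ for $a^n b^n$ and $a^n b^n c^n$, so it shows an initial segment of the sequence rather than the whole set.

```python
def first_coordinates(strings, symbol="a"):
    """Return the distinct counts of `symbol` in the sample, in increasing order."""
    return sorted({s.count(symbol) for s in strings})


anbn_sample = ["a" * n + "b" * n for n in range(5)]
anbncn_sample = ["a" * n + "b" * n + "c" * n for n in range(5)]

print(first_coordinates(anbn_sample))     # [0, 1, 2, 3, 4]
print(first_coordinates(anbncn_sample))   # [0, 1, 2, 3, 4]
```

The same function can be applied to a sample of any other language to see which values of the first coordinate occur.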

The patterns of dependencies we looked at above will not always give us the sequence 0, 1, 2, 3, ..., though. For example, the language

$\{(ab)^n (ba)^n \mid n \geq 0\} = \{\epsilon, abba, ababbaba, \ldots\}$
