Notes on computational linguistics.pdf - UCLA Department of ...
Stabler - Lx 185/209 2003
(50) Nem fogok   haza-menni  kezdeni   akarni      V1 M V4 V3 V2
     not will-1s home-go-inf begin-inf want-inf
One analysis of verbal clusters in Hungarian (Koopman and Szabolcsi, 2000a) suggests that they “roll up” from
the end of the string as shown below:
rolling up:
Nem fogok   akarni   kezdeni   haza-menni       V1 V2 V3 M V4
not will-1s want-inf begin-inf home-go-inf
Nem fogok   akarni   haza-menni kezdeni         V1 V2 M V4 V3
Nem fogok   haza-menni kezdeni  akarni          V1 M V4 V3 V2
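The reordering in the display above can be sketched on flat lists of labels. This is only an illustration of the resulting word orders, not of the syntactic analysis itself: the function name `roll_up` and the starting order `V1 V2 V3 V4 M` (with M to the right of V4 before any movement) are assumptions made for the sketch.

```python
def roll_up(base):
    """Return the successive orders obtained by rolling up from the end.

    At each step the cluster that starts at position i moves in front of
    the element immediately to its left; the first element (V1) is never
    rolled around.
    """
    order = list(base)
    i = len(order) - 1          # the cluster initially consists of the last element
    steps = []
    while i > 1:
        # move order[i:] around order[i-1]
        order = order[:i - 1] + order[i:] + [order[i - 1]]
        i -= 1                  # the cluster has grown by one element
        steps.append(" ".join(order))
    return steps


for line in roll_up(["V1", "V2", "V3", "V4", "M"]):
    print(line)
# V1 V2 V3 M V4
# V1 V2 M V4 V3
# V1 M V4 V3 V2
```

The three printed orders match the three V-label sequences in the display.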
[M] moves around V4, then [M V4] rolls up around V3, then [M V4 V3] rolls up around V2, ... It turns out that
this kind of derivation can derive complex patterns of dependencies which can yield formal languages like
$\{a^n b^n c^n \mid n \geq 0\}$, or even $\{a^n b^n c^n d^n e^n \mid n \geq 0\}$, with any number of counting dependencies. We can define these
languages without any kind of “rolling up” constituents if we help ourselves to (unboundedly many) feature
values and unification:
% anbncn.pl
'S' :~ ['A'(X),'B'(X),'C'(X)].
'A'(s(X)) :~ [a,'A'(X)]. 'B'(s(X)) :~ [b,'B'(X)]. 'C'(s(X)) :~ [c,'C'(X)].
'A'(0) :~ []. 'B'(0) :~ []. 'C'(0) :~ [].
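To see what the shared variable X is doing in this grammar, here is a minimal Python sketch of a recognizer for the same language. It is not the parser used in these notes; the names `run` and `accepts` are hypothetical, and an integer counter stands in for the numeral s(s(...0)) that unification passes between 'A', 'B' and 'C'.

```python
def run(symbol, s, i):
    """Consume consecutive copies of `symbol` in s starting at index i.

    Returns (count, next_index). The count plays the role of the
    feature value X in 'A'(X), 'B'(X), 'C'(X).
    """
    n = 0
    while i < len(s) and s[i] == symbol:
        n += 1
        i += 1
    return n, i


def accepts(s, symbols="abc"):
    """True iff s consists of n copies of each symbol, in order, for one shared n."""
    i = 0
    shared = None               # the value all categories must agree on
    for sym in symbols:
        n, i = run(sym, s, i)
        if shared is None:
            shared = n          # first category fixes the value
        elif n != shared:
            return False        # "unification" fails
    return i == len(s)          # no leftover input


print(accepts("aabbcc"))              # True
print(accepts("aabbc"))               # False
print(accepts("aabbccddee", "abcde")) # True
```

Passing a longer `symbols` string gives the analogous check for more counting dependencies, just as adding more categories with the same argument X would in the grammar.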
4.2 Semilinearity and some inhuman linguistic patterns
In the previous section we saw grammars for $\{a^n b^n \mid n \geq 0\}$, $\{xx \mid x \in \{a,b\}^*\}$, and $\{a^n b^n c^n \mid n \geq 0\}$. These
languages all have a basic property in common, which can be seen by counting the number of symbols in each
string of each language.
For example, for the language
$$\{a^n b^n \mid n \geq 0\} = \{\epsilon, ab, aabb, aaabbb, \ldots\}$$
we can use $(x, y)$ to represent $x$ a's and $y$ b's, so we see that the strings in this language have the following
counts:
$$\{(0,0), (1,1), (2,2), \ldots\} = \{(x,y) \mid x = y\}.$$
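The counting step can be checked mechanically. The following is a small sketch, with hypothetical helper names `counts` and `anbn`, that maps each string to its tuple of symbol counts and lists the tuples for the first few strings of the language.

```python
from collections import Counter


def counts(string, alphabet):
    """Return the tuple of occurrence counts of each alphabet symbol in string."""
    c = Counter(string)
    return tuple(c[sym] for sym in alphabet)   # Counter gives 0 for absent symbols


def anbn(max_n):
    """The strings a^n b^n for n = 0, ..., max_n."""
    return ["a" * n + "b" * n for n in range(max_n + 1)]


vectors = [counts(w, "ab") for w in anbn(3)]
print(vectors)   # [(0, 0), (1, 1), (2, 2), (3, 3)]
```

Every tuple produced this way satisfies x = y, as in the set description above.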
For $\{xx \mid x \in \{a,b\}^*\}$, every string contains an even number of a's and an even number of b's, so we have the pairs $\{(2i, 2j) \mid i, j \in \mathbb{N}\}$. For $\{a^n b^n c^n \mid n \geq 0\}$ we have the set of triples $\{(x,y,z) \mid x = y = z\}$.
If we look at just the number of a's in each language, considering the set of values of first coordinates
of the tuples, then we can list those sets by value, obtaining for $\{a^n b^n\}$ and $\{a^n b^n c^n\}$ the sequence:
0, 1, 2, 3, ...
and for $\{xx\}$ the sequence 0, 2, 4, 6, ...
The patterns of dependencies we looked at above will not always give us the sequence 0, 1, 2, 3, ..., though.
For example, the language
$$\{(ab)^n (ba)^n \mid n \geq 0\} = \{\epsilon, abba, ababbaba, \ldots\}$$