20.07.2013 Views

Notes on computational linguistics.pdf - UCLA Department of ...


Stabler - Lx 185/209 2003

also has nested dependencies just like {a^n b^n | n ≥ 0}, but this time the number of a’s in words of the language is

0, 2, 4, 6, ...

Plotting position in the sequence against value, these sets are both linear.

Let’s write the scalar product of an integer k and a pair (x, y) this way:

k(x, y) = (kx, ky),

and we add pairs componentwise in the usual way: (x, y) + (z, w) = (x + z, y + w). Then a set S of pairs (or tuples of higher arity) is said to be linear iff there are finitely many pairs (tuples) v_0, v_1, ..., v_k such that

$$S = \{\, v_0 + \sum_{i=1}^{k} n_i v_i \mid n_i \in \mathbb{N},\ 1 \le i \le k \,\}.$$

A set is semilinear iff it is the union of finitely many linear sets.
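To make the two definitions concrete, here is a minimal Python sketch that enumerates a finite part of a linear set and of a semilinear set. It is an illustration, not part of the notes: the names `linear_set`, `semilinear_set` and the cutoff `max_coeff` are hypothetical, and the sketch assumes the usual reading in which each vector v_i gets its own natural-number coefficient.

```python
from itertools import product

def linear_set(v0, periods, max_coeff):
    """Enumerate v0 + sum(n_i * v_i) for all coefficients 0 <= n_i <= max_coeff.

    A true linear set is infinite; max_coeff (an assumption of this sketch)
    only bounds how much of it we list.
    """
    points = set()
    for coeffs in product(range(max_coeff + 1), repeat=len(periods)):
        point = tuple(
            base + sum(n * v[d] for n, v in zip(coeffs, periods))
            for d, base in enumerate(v0)
        )
        points.add(point)
    return points

def semilinear_set(components, max_coeff):
    """Union of finitely many linear sets, each given as a pair (v0, periods)."""
    result = set()
    for v0, periods in components:
        result |= linear_set(v0, periods, max_coeff)
    return result

# Letter counts (number of a's, number of b's) in {a^n b^n | n >= 0}:
# v0 = (0, 0) and a single period vector (1, 1).
anbn_counts = linear_set((0, 0), [(1, 1)], max_coeff=4)
print(sorted(anbn_counts))  # [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)]

# The even numbers 0, 2, 4, ... as a one-dimensional linear set.
evens = linear_set((0,), [(2,)], max_coeff=3)
print(sorted(evens))  # [(0,), (2,), (4,), (6,)]
```

Both example sets need only one period vector. A semilinear set would pass several `(v0, periods)` pairs to `semilinear_set`.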

Theorem: Finite state and context free languages are semilinear

Semilinearity Hypothesis: Human languages are semilinear (Joshi, 1985)<br />

Theorem: Many unification grammar languages are not semilinear!

Here is a unification grammar that accepts {a^(2^n) | n > 0}.

% apowtwon.pl
'S'(0) :~ [a,a].
'S'(s(X)) :~ ['S'(X),'S'(X)].
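The effect of the shared variable X can be sketched in Python. This is only a simulation of the two rules above, not the parser used in the course. The function name `derive` is hypothetical, and the integer n stands for the numeral built by applying s to 0 n times.

```python
def derive(n):
    """Return the string derived from 'S' applied to the numeral s^n(0)."""
    if n == 0:
        return "aa"          # first rule: 'S'(0) rewrites to a a
    half = derive(n - 1)     # second rule: both daughters share the same X,
    return half + half       # so the two halves must be identical in length

lengths = [len(derive(n)) for n in range(5)]
print(lengths)  # [2, 4, 8, 16, 32]

# Gaps between successive string lengths.
gaps = [b - a for a, b in zip(lengths, lengths[1:])]
print(gaps)  # [2, 4, 8, 16]
```

The gaps between successive lengths keep doubling. A semilinear set of natural numbers is ultimately periodic, so its gaps stay bounded, which is why this language cannot be semilinear.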


Michaelis and Kracht (1997) argue against Joshi’s semilinearity hypothesis on the basis of the case markings in Old Georgian,[18] which we see in examples like these (cf. also Boeder 1995; Bhatt & Joshi 2003):

(51) saidumlo-j igi sasupevel-isa m-is γmrt-isa-jsa-j<br />

mystery-nom the-nom kingdom-gen the-gen God-gen-gen-nom<br />

‘the mystery of the kingdom of God’

(52) govel-i igi sisxl-i saxl-isa-j m-is Saul-is-isa-j<br />

all-nom the-nom blood-nom house-gen-nom the-nom Saul-gen-gen-nom<br />

‘all the blood of the house of Saul’

Michaelis and Kracht infer from examples like these that in this kind of possessive, Old Georgian requires the embedded nouns to repeat the case markers on all the heads that dominate them, yielding the following pattern (writing K for each case marker):

[N1 − K1[N2 − K2 − K1[N3 − K3 − K2 − K1 ...[Nn − Kn − ...− K1]]]]<br />

It is easy to calculate that in this pattern, when there are n nouns, there are n(n+1)/2 case markers. Such a language is not semilinear.
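The count can be checked with a short Python sketch that builds the schematic pattern and tallies the markers. The function names `case_stacking` and `count_markers` are hypothetical, and the strings are the abstract N and K symbols of the pattern, not real Old Georgian forms.

```python
def case_stacking(n):
    """Build the pattern for n nested nouns: noun i carries K_i, K_(i-1), ..., K_1."""
    words = []
    for i in range(1, n + 1):
        markers = [f"K{j}" for j in range(i, 0, -1)]
        words.append("-".join([f"N{i}"] + markers))
    return words

def count_markers(n):
    """Total number of case markers in the pattern with n nouns."""
    return sum(word.count("K") for word in case_stacking(n))

print(case_stacking(3))  # ['N1-K1', 'N2-K2-K1', 'N3-K3-K2-K1']

counts = [count_markers(n) for n in range(1, 6)]
print(counts)  # [1, 3, 6, 10, 15]
```

Noun i contributes i markers, so the total is 1 + 2 + ... + n = n(n+1)/2. The number of markers grows quadratically in the number of nouns, which no finite union of linear sets can track.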

[18] A Kartvelian language with translations of the Gospel from the 5th century. Modern Georgian does not show the phenomenon noted here.

