
Stabler - Lx 185/209 2003

8.1.12 Markov models in human syntactic analysis?

(92) Shannon (1948, pp. 42-43) says:

We can also approximate to a natural language by means of a series of simple artificial languages…To give a visual idea of how this series approaches a language, typical sequences in the approximations to English have been constructed and are given below…

5. First-order word approximation…Here words are chosen independently but with their appropriate frequencies.

REPRESENTING AND SPEEDILY IS AN GOOD APT OR COME CAN DIFFERENT NATURAL HERE HE THE IN CAME THE TO OF TO EXPERT GRAY COME TO FURNISHES THE LINE MESSAGE HAD BE THESE

6. Second-order word approximation. The word transition probabilities are correct but no further structure is included.

THE HEAD AND IN FRONTAL ATTACK ON AN ENGLISH WRITER THAT THE CHARACTER OF THIS POINT IS THEREFORE ANOTHER METHOD FOR THE LETTERS THAT THE TIME OF WHO EVER TOLD THE PROBLEM FOR AN UNEXPECTED

The resemblance to ordinary English text increases quite noticeably at each of the above steps…It appears then that a sufficiently complex stochastic process will give a satisfactory representation of a discrete source.
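Shannon's word approximations are easy to reproduce mechanically. The sketch below is one way to generate them (not Shannon's own procedure), assuming only a plain-text input file; corpus.txt is a hypothetical name:

    import random
    from collections import Counter, defaultdict

    def first_order(words, n=20):
        # First-order approximation: words drawn independently,
        # weighted by their corpus frequencies.
        freqs = Counter(words)
        vocab = list(freqs)
        return " ".join(random.choices(vocab, weights=[freqs[w] for w in vocab], k=n))

    def second_order(words, n=20):
        # Second-order approximation: each word drawn from the empirical
        # distribution of successors of the preceding word.
        successors = defaultdict(list)
        for w, nxt in zip(words, words[1:]):
            successors[w].append(nxt)
        out = [random.choice(words)]
        while len(out) < n:
            nxts = successors.get(out[-1]) or words  # dead end: restart anywhere
            out.append(random.choice(nxts))
        return " ".join(out)

    words = open("corpus.txt").read().upper().split()
    print(first_order(words))
    print(second_order(words))

Sampling a successor from the list of its occurrences automatically weights it by its transition frequency, so no explicit probability table is needed.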

(93) Damerau (1971) confirms this trend in an experiment that involved generating 5th-order approximations. All these results are hard to interpret, though, since (i) sparse data in generation will tend to yield near copies of portions of the source texts (on the sparse data problem, see the results from Jelinek mentioned in (95) below), and (ii) human linguistic capabilities are not well reflected in typical texts.

(94) Miller and Chomsky objection 1: The number of parameters to set is enormous.

Notice that for a vocabulary of 100,000 words, where each different word is emitted by a different state, we would need at least 100,000 states. The full transition matrix then has 100,000^2 = 10^10 entries. Since the last column of the transition matrix is redundant, a 10^9 matrix will do.
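The arithmetic behind these figures, as a back-of-the-envelope check (the seconds figure is Miller and Chomsky's, below; the rest is not in the original notes):

    n = 100_000            # vocabulary size: one emitting state per word
    entries = n * n        # full transition matrix: 100,000^2 = 10^10 entries
    childhood = 10 ** 8    # roughly the number of seconds in a childhood
    print(entries / childhood)  # ~100 parameters per second, before any reduction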

Miller and Chomsky (1963, p. 430) say:

We cannot seriously propose that a child learns the values of 10^9 parameters in a childhood lasting only 10^8 seconds.

Why not? This is very far from obvious, unless the parameters are independent, and there is no reason to assume they are.

(95) Miller and Chomsky (1963, p. 430) objection 2: The amount of input required to set the parameters of a reasonable model is enormous.

Jelinek (1985) reports that after collecting the trigrams from a 1,500,000-word corpus, he found that, in the next 300,000 words, 25% of the trigrams were new.
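This kind of measurement is easy to replicate on any corpus split. A sketch, with train.txt and heldout.txt as hypothetical file names:

    def trigrams(words):
        return zip(words, words[1:], words[2:])

    train = open("train.txt").read().split()      # e.g. ~1,500,000 words
    heldout = open("heldout.txt").read().split()  # e.g. the next ~300,000 words

    seen = set(trigrams(train))
    held = list(trigrams(heldout))
    new = sum(t not in seen for t in held)
    print(f"{new / len(held):.0%} of held-out trigrams are new")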

No surprise! Some generalization across lexical combinations is required. In this context, the "generalization" is sometimes achieved with various "smoothing" functions, which will be discussed later. With generalization, setting large numbers of parameters becomes quite conceivable.
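As a placeholder for that later discussion, the simplest such scheme is add-one (Laplace) smoothing, sketched here for bigrams. It is only an illustration of how unseen combinations get nonzero probability, not the scheme developed later; corpus.txt is again a hypothetical file:

    from collections import Counter

    words = open("corpus.txt").read().split()
    unigrams = Counter(words)
    bigrams = Counter(zip(words, words[1:]))
    V = len(unigrams)                           # vocabulary size

    def p_smoothed(w1, w2):
        # Add-one smoothing: pretend every possible bigram occurred once
        # more than it did, so unseen bigrams are rare but not impossible.
        return (bigrams[w1, w2] + 1) / (unigrams[w1] + V)

    print(p_smoothed("the", "unexpected"))      # nonzero even if never observed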

Without a better understanding <strong>of</strong> the issues, I find objecti<strong>on</strong> 2 completely unpersuasive.<br />

(96) Miller and Chomsky (1963, p. 425) objection 3:

Since human messages have dependencies extending over long strings of symbols, we know that any pure Markov source must be too simple…

