Notes on computational linguistics.pdf - UCLA Department of ...
Stabler - Lx 185/209 2003
The best models of human language processing are based on the programmatic hypothesis that
human language processes are (at least, in large part) computational. That is, the hypothesis is that
understanding or producing a coherent utterance typically involves changes of neural state that
can be regarded as a calculation, as the steps in some kind of derivation.
We could try to understand what is going on by attempting to map out the neural responses to
linguistic stimulation, as has been done for example in the visual system of the frog (e.g., Lettvin et al.,
1959). Unfortunately, the careful in vivo single cell recording that is required for this kind of
investigation of human neural activity is impractical and unethical (except perhaps in some unusual
cases where surgery is happening anyway, as in the studies of Ojemann?).
Another way to study language use is to consider how human language processing problems could
possibly be solved by any sort of system. Designing and even building computational systems with
properties similar to the human language user not only avoids the ethical issues (the devices we
build appear to be much too simple for any kind of “robot rights” to kick in) but also allows
us to begin with systems that are simplified in various respects. That this is an appropriate initial
focus will be seen from the fact that many problems are quite clear and difficult well before we get
to any of the subtle nuances of human language use.
So these lecture notes briefly review some of the basic work on how human language processing
problems could possibly be solved by any sort of system, rather than trying to model in detail the
resources that humans have available for language processing. Roughly, the problems we would
like to understand include these:
perception: given an utterance, compute its meaning(s), in context. This involves recognition of
syntactic properties (subject, verb, object), semantic properties (e.g. entailment relations, in
context), and pragmatic properties (assertion, question,…).
production: given some (perhaps only vaguely) intended syntactic, semantic, and pragmatic properties,
create an utterance that has them.
acquisition: given some experience in a community of language users, compute a representation
of the language that is similar enough to others that perception/production is reliably consistent
across speakers.
Note that the main focus of this text is “computational linguistics” in this rather scientific sense,
as opposed to “natural language processing” in the sense of building commercially viable tools
for language analysis or information retrieval, or “corpus linguistics” in the sense of studying the
properties of collections of texts with available tools. Computational linguistics overlaps to some
extent with these other interests, but the goals here are really quite different.
The notes are very significantly changed from earlier versions, and so the contributions of the
class participants were enormously valuable. Thanks especially to Dan Albro, Leston Buell, Heidi
Fleischhacker, Alexander Kaiser, Greg Kobele, Alex MacBride, and Jason Riggle. Ed Keenan provided
many helpful suggestions and inspiration during this work.
No doubt, many typographical errors and infelicities of other sorts remain. I hope to continue
revising and improving these notes, so comments are welcome!
stabler@ucla.edu