20.07.2013 Views

Notes on computational linguistics.pdf - UCLA Department of ...


Stabler - Lx 185/209 2003

The best models of human language processing are based on the programmatic hypothesis that human language processes are (at least, in large part) computational. That is, the hypothesis is that understanding or producing a coherent utterance typically involves changes of neural state that can be regarded as a calculation, as the steps in some kind of derivation.

We could try to understand what is going on by attempting to map out the neural responses to linguistic stimulation, as has been done, for example, in the visual system of the frog (e.g., Lettvin et al., 1959). Unfortunately, the careful in vitro single cell recording that is required for this kind of investigation of human neural activity is impractical and unethical (except perhaps in some unusual cases where surgery is happening anyway, as in the studies of Ojemann?).

Another way to study language use is to consider how human language processing problems could possibly be solved by any sort of system. Designing and even building computational systems with properties similar to the human language user not only avoids the ethical issues (the devices we build appear to be much too simple for any kind of “robot rights” to kick in), but also allows us to begin with systems that are simplified in various respects. That this is an appropriate initial focus will be seen from the fact that many problems are quite clear and difficult well before we get to any of the subtle nuances of human language use.

So these lecture notes briefly review some of the basic work on how human language processing problems could possibly be solved by any sort of system, rather than trying to model in detail the resources that humans have available for language processing. Roughly, the problems we would like to understand include these:

perception: given an utterance, compute its meaning(s), in context. This involves recognition of syntactic properties (subject, verb, object), semantic properties (e.g. entailment relations, in context), and pragmatic properties (assertion, question, …).

production: given some (perhaps only vaguely) intended syntactic, semantic, and pragmatic properties, create an utterance that has them.

acquisition: given some experience in a community of language users, compute a representation of the language that is similar enough to others that perception/production is reliably consistent across speakers.

Note that the main focus of this text is “computational linguistics” in this rather scientific sense, as opposed to “natural language processing” in the sense of building commercially viable tools for language analysis or information retrieval, or “corpus linguistics” in the sense of studying the properties of collections of texts with available tools. Computational linguistics overlaps to some extent with these other interests, but the goals here are really quite different.

The notes are very significantly changed from earlier versions, and so the contributions of the class participants were enormously valuable. Thanks especially to Dan Albro, Leston Buell, Heidi Fleischhacker, Alexander Kaiser, Greg Kobele, Alex MacBride, and Jason Riggle. Ed Keenan provided many helpful suggestions and inspiration during this work.

No doubt, many typographical errors and infelicities of other sorts remain. I hope to continue revising and improving these notes, so comments are welcome!

stabler@ucla.edu
