
finding relevant pages on the world wide web). The second, the one most relevant to cognitive science, seeks to better understand how language comprehension and generation occur in humans. Rather than performing experiments on humans as done in psycholinguistics, or developing theories that account for the data with a focus on handling possible counterexamples as in linguistics and philosophy, researchers in natural language processing test theories by building explicit computational models to see how well they behave. Most research in the field is still at the exploratory stage of this endeavor, trying to construct “existence proofs” (i.e., to find any mechanism that can understand language within limited scenarios) rather than building computational models and comparing them to human performance. But once such existence-proof systems are completed, the stage will be set for more detailed comparative study between human and computational models. Whatever the motivation behind the work in this area, however, computational models have provided the inspiration and starting point for much work in psycholinguistics and linguistics in the last twenty years.

Although there is a diverse set of methods used in natural language processing, the techniques can be broadly classified into three general approaches: statistical methods, structural/pattern-based methods, and reasoning-based methods. It is important to note that these approaches are not mutually exclusive. In fact, the most comprehensive models combine all three techniques. The approaches differ in the kind of processing tasks they can perform and in the degree to which systems require handcrafted rules as opposed to automatic training or learning from language data. A good source that gives an overview of the field involving all three approaches is Allen 1995.

Statistical methods involve using large corpora of language data to compute statistical properties such as word co-occurrence and sequence information (see also STATISTICAL TECHNIQUES IN NATURAL LANGUAGE PROCESSING). For instance, a bigram statistic captures the probability of a word with certain properties following a word with other properties. This information can be estimated from a corpus that is labeled with the properties needed, and then used to predict what properties a word might have based on its preceding context. Although limited, bigram models can be surprisingly effective in many tasks. For instance, bigram models involving part-of-speech labels (e.g., noun, verb) can typically predict the correct part of speech for over 95 percent of words in general text. Statistical models are not restricted to part-of-speech tagging, however, and they have been used for semantic disambiguation, structural disambiguation (e.g., prepositional phrase attachment), and many other properties. Much of the initial work in statistical language modeling was performed for automatic speech-recognition systems, where good word prediction can double the word-recognition accuracy rate. The techniques have also proved effective in tasks such as information retrieval and producing rough “first-cut” drafts in machine translation. A big advantage of statistical techniques is that they can be automatically trained from language corpora. The challenge for statistical models concerns how to capture higher-level structure, such as semantic information, and structural properties, such as sentence structure. In general, the most successful approaches to these problems involve combining statistical approaches with other approaches. A good introduction to statistical approaches is Charniak 1993.
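
To make the bigram idea concrete, the sketch below (in Python, using an invented three-sentence corpus and a tiny tagset, both hypothetical) estimates tag-transition and word-emission frequencies from labeled data and then tags new text greedily, left to right. It is only an illustration of the technique described above; practical taggers train on large corpora and use smoothing and Viterbi search over whole sentences.

```python
from collections import defaultdict

# Toy labeled corpus (invented for illustration); real systems
# train on large hand-tagged corpora.
corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
]

# Count tag bigrams, for P(tag | previous tag), and word emissions,
# for P(word | tag).
transitions = defaultdict(lambda: defaultdict(int))
emissions = defaultdict(lambda: defaultdict(int))
for sentence in corpus:
    prev = "<S>"  # sentence-start marker
    for word, tag in sentence:
        transitions[prev][tag] += 1
        emissions[tag][word.lower()] += 1
        prev = tag

def prob(table, given, outcome):
    """Relative-frequency estimate; 0.0 for unseen events (no smoothing)."""
    total = sum(table[given].values())
    return table[given][outcome] / total if total else 0.0

def tag_sentence(words, tagset=("DET", "NOUN", "VERB")):
    """Greedy left-to-right tagging: for each word, pick the tag
    maximizing P(tag | previous tag) * P(word | tag)."""
    prev, result = "<S>", []
    for word in words:
        best = max(tagset,
                   key=lambda t: prob(transitions, prev, t) *
                                 prob(emissions, t, word.lower()))
        result.append((word, best))
        prev = best
    return result

print(tag_sentence(["the", "cat", "barks"]))
# [('the', 'DET'), ('cat', 'NOUN'), ('barks', 'VERB')]
```

Note how the transition statistic alone resolves ambiguity here: “barks” was seen only once in training, but following a NOUN the model strongly prefers VERB.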

Structural and pattern-based approaches have the closest connection to traditional linguistic models. These approaches involve defining structural properties of language, such as defining FORMAL GRAMMARS for natural languages. Active research issues include the design of grammatical formalisms that capture natural language structure yet retain good computational properties, and the design of efficient parsing algorithms to interpret sentences with respect to a grammar. Structural approaches are not limited solely to syntax, however. Many more practical systems use semantically based grammars, where the primitive units in the grammar are semantic classes rather than syntactic ones. And other approaches dispense with fully analyzing sentence structure altogether, using simpler patterns of lexical, syntactic, and semantic information that match sentence fragments. Such techniques are especially useful in limited-domain speech-driven applications where errors in the input can be expected. Because the domain is limited, certain phrases (e.g., a prepositional phrase) may have only one possible interpretation in the application. Structural models also appear at the DISCOURSE level, where models are developed that capture the interrelationships between sentences and build models of topic flow. Structural models provide a capability for detailed analysis of linguistic phenomena, but the more detailed the analysis, the more one must rely on hand-constructed rules rather than automatic training from data. An excellent collection of papers on structural approaches, though missing recent work, is Grosz, Sparck Jones, and Webber 1986.
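
As a concrete illustration of interpreting a sentence with respect to a grammar, here is a minimal top-down backtracking parser in Python over an invented toy grammar. Everything here (the grammar, the rule format, the function names) is a hypothetical sketch; practical systems use efficient chart-parsing algorithms such as Earley or CKY, and far larger grammars.

```python
# Toy context-free grammar (invented). Each nonterminal maps to a list of
# right-hand sides; a right-hand side is either a sequence of nonterminals
# or a single terminal word.
GRAMMAR = {
    "S":   [["NP", "VP"]],
    "NP":  [["DET", "N"]],
    "VP":  [["V", "NP"], ["V"]],
    "DET": [["the"], ["a"]],
    "N":   [["dog"], ["cat"]],
    "V":   [["chased"], ["sleeps"]],
}

def parse(symbol, words, start):
    """Yield (tree, next_position) for every way `symbol` can derive a
    prefix of words[start:]. Plain top-down backtracking search."""
    for rhs in GRAMMAR.get(symbol, []):
        # A rule like DET -> "the" has one terminal on its right side.
        if len(rhs) == 1 and rhs[0] not in GRAMMAR:
            if start < len(words) and words[start] == rhs[0]:
                yield (symbol, rhs[0]), start + 1
            continue
        # Otherwise expand each nonterminal in sequence, backtracking
        # over all alternatives.
        def expand(children, pos, remaining):
            if not remaining:
                yield (symbol, *children), pos
                return
            for child, nxt in parse(remaining[0], words, pos):
                yield from expand(children + [child], nxt, remaining[1:])
        yield from expand([], start, rhs)

def full_parses(sentence):
    """Keep only analyses that span the whole input."""
    words = sentence.split()
    return [tree for tree, end in parse("S", words, 0) if end == len(words)]

print(full_parses("the dog chased a cat"))
# [('S', ('NP', ('DET', 'the'), ('N', 'dog')),
#        ('VP', ('V', 'chased'), ('NP', ('DET', 'a'), ('N', 'cat'))))]
```

The same machinery works for a semantically based grammar: replacing the syntactic categories above with domain categories (e.g., FLIGHT, CITY) turns the parse tree directly into a domain interpretation.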

Reasoning-based approaches involve encoding knowledge and reasoning processes and using these to interpret language. This work has much in common with work in KNOWLEDGE REPRESENTATION as well as work in the philosophy of language. The idea here is that the interpretation of language is highly dependent on the context in which the language appears. By trying to capture the knowledge a human may have in a situation, and to model common-sense reasoning, problems such as word-sense and sentence-structure disambiguation, the analysis of referring expressions, and the recognition of the intentions behind language can be addressed. These techniques become crucial in discourse, whether it be extended text that needs to be understood or a dialogue that needs to be engaged in. Most dialogue-based systems use a speech-act–based approach to language and computational models of PLANNING and plan recognition to define a conversational agent. Specifically, such systems first attempt to recognize the intentions underlying the utterances they hear, and then plan their own utterances based on their goals and knowledge (including what was just recognized about the other agent). The advantage of this approach is that it provides a mechanism for contextual interpretation of language. The disadvantage is the complexity of the models required to define the conversational agent. Two good sources for work in this area are Cohen, Morgan, and Pollack 1990 and Carberry 1991.
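
The recognize-then-plan cycle described above can be caricatured in a few lines of Python. In the sketch below, all of the patterns, act names, and the knowledge base are invented for illustration; real plan-based systems represent beliefs, goals, and intentions explicitly and use plan-recognition inference rather than surface-pattern matching.

```python
import re

# Hypothetical domain knowledge for a travel-information agent.
KNOWLEDGE = {"train to boston": "departs at 4 pm from gate 7"}

def recognize_act(utterance):
    """Step 1: map the surface form to a speech act plus its content.
    (A stand-in for genuine intention recognition.)"""
    u = utterance.lower().strip("?!. ")
    if m := re.match(r"when does the (.+) leave", u):
        return ("ASK-DEPARTURE", m.group(1))
    if m := re.match(r"please (.+)", u):
        return ("REQUEST", m.group(1))
    return ("INFORM", u)

def plan_response(act, content):
    """Step 2: choose a responding act that serves the agent's goal of
    cooperating, given the recognized intention of the other agent."""
    if act == "ASK-DEPARTURE":
        fact = KNOWLEDGE.get(content)
        return f"The {content} {fact}." if fact else \
               f"I don't know about the {content}."
    if act == "REQUEST":
        return f"Okay, I will {content}."
    return "I see."

act, content = recognize_act("When does the train to Boston leave?")
print(plan_response(act, content))
# The train to boston departs at 4 pm from gate 7.
```

Even this caricature shows why the approach supports contextual interpretation, and also why it is costly: every interpretation and response is mediated by an explicit model of what the other agent wants, and building that model is the hard part.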
