17.01.2014 Views

Linguistic Modeling for Multilingual Machine Translation



2.2. MODULARITY

8 Morphological generation is done with the help of the external morphological module mpro*. The surface string is generated from the basic lexeme and the information about compounding, derivation and inflection.


(23) MSEN: The man is afraid of a sentence which cannot be pronounced.

The advantages of this stratificational approach are the reduced complexity of every submodule and the restriction of certain modules to specific functionalities. One risk with this approach, however, is that some information might not be accessible at a given level even though it is necessary for unambiguous processing. Faced with ambiguities at a given level, stratificational systems produce a number of possible structures which have to be filtered out at higher levels of processing. This is inefficient if the point where an ambiguity is introduced and the point where it is resolved are far apart (cf. [Mehrjerdian92]). A second problem is the possible incompatibility between levels due to the nonmonotonicity introduced by different strata. In order to maintain the advantages of the stratificational approach while keeping these risks to a minimum, a static modularity is introduced in CAT2 which operates orthogonally to the stratification modules.
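The cost of late disambiguation can be made concrete with a small sketch. It is not part of CAT2; the function name `structures_per_level` and the numbers of readings are hypothetical and serve only to illustrate how the number of candidate structures grows when an ambiguity introduced at the first level is filtered out only at a distant level.

```python
def structures_per_level(readings_per_level, resolve_at):
    """Count the candidate structures alive after each level.

    readings_per_level[i] is the (hypothetical) number of alternative
    readings introduced at level i. resolve_at is the index of the level
    that can filter the ambiguity of level 0 down to a single reading.
    """
    counts = []
    live = 1
    for level, readings in enumerate(readings_per_level):
        # Every live structure is multiplied by the new readings.
        live *= readings
        if level == resolve_at:
            # The level-0 ambiguity is resolved here: one reading survives.
            live //= readings_per_level[0]
        counts.append(live)
    return counts


# Same ambiguities, resolved immediately versus two levels later.
early = structures_per_level([2, 3, 2], resolve_at=0)
late = structures_per_level([2, 3, 2], resolve_at=2)
print(early, sum(early))
print(late, sum(late))
```

In this toy setting both runs end with the same number of structures, but the late-resolving run carries more candidates through the intermediate levels, which is the inefficiency described above.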

2.2.2 Static Modularity

In order to assure unambiguous processing at every stratificational level, every type of information should in principle be accessible throughout all levels. As the main data come from the lexicon, the lexicon should be present at every level, supplying each level even with those types of information which are not typically required there. Ambiguities at one level (e.g. the different syntactic and semantic properties of the verb to be) should be represented so as not to cause overgeneration at a level where these ambiguities are not relevant (e.g. the morphological level). As CAT2 uses one lexicon which can be accessed at every level for each type of information, semantic information can already be used at level CS in order to exclude spurious objects, and morpho-syntactic information can be used at an early stage of generation in order to speed it up. In order to ensure the consistency of the lexicon and the grammar modules, a language declaration system, including a feature declaration system and a macro system, is employed.
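The idea of one lexicon shared by all levels, checked against a feature declaration, can be sketched as follows. This is a minimal illustration and not the CAT2 formalism: the feature names, the values and the helpers `check_entry` and `level_view` are all hypothetical.

```python
# Hypothetical feature declaration: feature name -> allowed values.
FEATURE_DECLARATION = {
    "cat": {"v", "n"},
    "sem": {"copula", "existential", "human"},
    "infl": {"regular", "irregular"},
}

# One lexicon for all levels; "be" has two syntactic/semantic readings.
LEXICON = [
    {"lex": "be", "cat": "v", "sem": "copula", "infl": "irregular"},
    {"lex": "be", "cat": "v", "sem": "existential", "infl": "irregular"},
    {"lex": "man", "cat": "n", "sem": "human", "infl": "irregular"},
]


def check_entry(entry):
    """Return the declaration violations found in one lexicon entry."""
    errors = []
    for feature, value in entry.items():
        if feature == "lex":
            continue
        if feature not in FEATURE_DECLARATION:
            errors.append(f"undeclared feature: {feature}")
        elif value not in FEATURE_DECLARATION[feature]:
            errors.append(f"undeclared value: {feature}={value}")
    return errors


def level_view(lexeme, relevant_features):
    """Project the readings of a lexeme onto the features relevant at one
    level, merging readings that are identical at that level."""
    readings = []
    for entry in LEXICON:
        if entry["lex"] != lexeme:
            continue
        projected = {f: entry[f] for f in relevant_features if f in entry}
        if projected not in readings:
            readings.append(projected)
    return readings


# The morphological view sees one reading of "be"; a view that includes
# semantic features sees both.
print(level_view("be", ["cat", "infl"]))
print(level_view("be", ["cat", "sem"]))
```

Projecting onto the features that matter at a level keeps the full lexicon available everywhere while avoiding overgeneration where a distinction is irrelevant. The declaration check catches entries that use features or values the grammar modules do not know.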
