Linguistic Modeling for Multilingual Machine Translation
2.2. MODULARITY
8 Morphological generation is done with the help of the external morphological
module mpro*. The surface string is generated from the basic lexeme
and the information about compounding, derivation and inflection.
(23) MSEN: The man is afraid of a sentence which cannot be pronounced.
The advantages of this stratificational approach are the reduced complexity of
every submodule and the restriction of certain modules to specific functionalities.
One risk with this approach, however, is that some information might
not be accessible at a given level though necessary for unambiguous processing.
In the face of ambiguities at a given level, stratificational systems produce a
number of possible structures which have to be filtered out at higher levels of
processing. This approach is not efficient if the point where an ambiguity is
introduced and the point where it is resolved are distant (cf. [Mehrjerdian92]).
A second problem with this approach is the possible incompatibility at different
levels due to the nonmonotonicity introduced by different strata. In order to
maintain the advantages of the stratificational approach while reducing the
risks incurred to a minimum, a static modularity is introduced in CAT2 which
operates orthogonally on the stratification modules.
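
The efficiency concern can be made concrete with a small Python sketch. It is a toy illustration only, not CAT2 code: the word readings, the level functions and the "exactly one verb" filter are all hypothetical. It counts how many candidate structures must be carried through intermediate levels when an ambiguity introduced at the first level can only be resolved at the last one.

```python
from itertools import product

# Hypothetical toy lexicon: each word maps to its possible readings.
READINGS = {
    "the": ["det"],
    "man": ["noun", "verb"],
    "saw": ["noun", "verb"],
    "her": ["det", "pron"],
    "duck": ["noun", "verb"],
}

def analyses(words):
    """First level: enumerate every combination of readings.
    This is where the ambiguity is introduced."""
    return [tuple(combo) for combo in product(*(READINGS[w] for w in words))]

def passthrough(candidates):
    """An intermediate level that lacks the information needed to
    resolve the ambiguity and therefore carries every candidate along."""
    return list(candidates)

def late_filter(candidates):
    """A higher level that finally rejects candidates.
    Assumed constraint for this sketch: exactly one verb per sentence."""
    return [c for c in candidates if c.count("verb") == 1]

def run(words, intermediate_levels):
    """Return the surviving candidates and the total number of
    candidate structures handled across all levels before filtering."""
    candidates = analyses(words)
    work = len(candidates)
    for _ in range(intermediate_levels):
        candidates = passthrough(candidates)
        work += len(candidates)
    return late_filter(candidates), work

sentence = ["the", "man", "saw", "her", "duck"]
survivors_near, work_near = run(sentence, intermediate_levels=0)
survivors_far, work_far = run(sentence, intermediate_levels=3)
```

In this sketch the set of surviving candidates is the same in both runs, but the number of structures handled grows with every level that lies between the introduction of the ambiguity and its resolution.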
2.2.2 Static Modularity
In order to assure unambiguous processing at every stratificational level, every
type of information should in principle be accessible throughout all levels. As
the main data come from the lexicon, the lexicon should be present at every
level, supplying the levels even with those types of information which are not
typically required at that level. Ambiguities at one level (e.g. different syntactic
and semantic properties of the verb to be) should be represented so as not to
create an overgeneration at a level where these ambiguities are not relevant (e.g.
the morphological level). As CAT2 uses one lexicon to which there is access
at every level for each type of information, semantic information can already
be used at level CS in order to exclude spurious objects, or morpho-syntactic
information can be used in generation at an early stage in order to speed up
generation. In order to ensure the consistency of the lexicon and the grammar
modules, a language declaration system, including a feature declaration system
and a macro system, is employed.
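
The idea of one lexicon serving every level, checked against a feature declaration, can be sketched as follows. This is a minimal Python illustration under stated assumptions and does not reproduce CAT2's actual formalism: the feature names (`cat`, `frame`, `sem`), their values, the entries, and the functions `validate` and `view` are all hypothetical.

```python
# Hypothetical feature declaration: feature name -> set of allowed values.
FEATURE_DECLARATION = {
    "cat": {"n", "v", "det"},
    "frame": {"copula", "auxiliary"},
    "sem": {"human", "abstract", "event"},
}

# A single hypothetical lexicon shared by all levels.
LEXICON = [
    {"lex": "be", "cat": "v", "frame": "copula"},
    {"lex": "be", "cat": "v", "frame": "auxiliary"},
    {"lex": "man", "cat": "n", "sem": "human"},
]

def validate(entry):
    """Check one entry against the declaration and return a list of
    violations (an empty list means the entry is consistent)."""
    errors = []
    for feature, value in entry.items():
        if feature == "lex":  # the lexeme itself is not a declared feature
            continue
        if feature not in FEATURE_DECLARATION:
            errors.append(f"undeclared feature: {feature}")
        elif value not in FEATURE_DECLARATION[feature]:
            errors.append(f"undeclared value for {feature}: {value}")
    return errors

def view(lexeme, relevant_features):
    """Project the entries for a lexeme onto the features relevant at a
    given level, merging entries that become identical after projection."""
    seen = []
    for entry in LEXICON:
        if entry["lex"] != lexeme:
            continue
        projected = {k: v for k, v in entry.items() if k in relevant_features}
        if projected not in seen:
            seen.append(projected)
    return seen

# A morphological level that ignores the syntactic frame sees one entry
# for "be"; a syntactic level that needs the frame sees both readings.
morph_view = view("be", {"lex", "cat"})
syntax_view = view("be", {"lex", "cat", "frame"})
```

Projecting before use is one possible way to keep an ambiguity from multiplying structures at a level where it is irrelevant, while the full entries stay available to any level that needs them.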