Word Expert Translation from German into Chinese
in the Slow Intelligence Framework
Abstract—This paper presents a novel approach to translating German sentences into Chinese using word expert translators, thereby extending the application area of the slow intelligence architecture. The word expert perspective on natural language understanding is reviewed, and the motivation for word expert translation is presented: it is shown in detail that the Chinese language depends crucially on the topic-comment relation and is better suited to being understood from the word expert perspective. Five principles for communication among Chinese word experts are proposed. The main activities of word expert translators consist of enumerating possible Chinese lexemes for a German lexeme, determining the linear ordering among Chinese lexemes within one word expert translator, determining topic-comment relations among word experts, constructing nested topic-comment relations among word expert translators, and choosing among the possible Chinese lexemes. All of these activities fit well within the slow intelligence framework.
I. INTRODUCTION

Machine translation (MT) is the transformation of texts from one natural language into another by computer. The worldwide market for machine translation is large and steadily increasing, with a growth rate of around 20% per year, in part because manual translation by human translators is expensive and slow. The main technologies for MT are the interlingua method, the statistical method, and the rule-based approach. The rule-based method assumes that languages can be governed by rules. However, every rule has exceptions, and the mapping between rules of different languages can be complicated, sometimes even impossible. The statistical method [1] focuses on the probability of translating a sentence in the source language into a sentence in the target language; it therefore requires a highly qualified and extensive sample translation corpus. Even if such a sample corpus is available, what the statistical method guarantees is only a probability. The interlingua approach ideally assumes a common semantic representation for all natural languages. Translation is then a process of transforming text in the source language into the interlingua, followed by a process of transforming the interlingua into the target language. The difficulty is that such an interlingua is hard to develop, or at least it was at the time the ALPAC report was written. Recent research on the semantic representation of natural language has made fruitful progress, so it is time to reconsider MT with the interlingua approach, e.g. [8]. The present paper adopts the MultiNet representation [5] as the interlingua for the semantic representation of natural languages, and applies the word expert perspective to machine translation.
Tiansi Dong and Ingo Glöckner
Department of Mathematics and Computer Science
University of Hagen
Email: {tiansi.dong|ingo.gloeckner}@fernuni-hagen.de
The rest of the paper is structured as follows: Section 2 reviews the word expert perspective and the successful application of the WOCADI parser developed by [6]; Section 3 presents the topic-comment structure of the Chinese language and motivates the word expert translator method; Section 4 presents the main activities of word expert translators and shows how these activities are carried out in the slow intelligence framework.
II. WORD EXPERT PERSPECTIVE

A. The Perspective

The traditional perspective views words as passive data (with knowledge of part of speech and meaning), and languages as infinite sets of word sequences (satisfying grammatical rules). The word expert perspective pioneered by Rieger [9] views individual words as active procedures: each word of a language is seen as an active lexical agent called a word expert, which participates in the overall control of the parsing process through its internal actions and its interactions with other such agents [10, p.1]. The word expert view advocates an integrated syntax-semantics coupling approach to language understanding. Traditional syntax is viewed as an artifact describing patterns of lexical interactions, and cannot be used to model comprehension because of the rich semantic particularities of lexemes [10, p.3]. The word expert perspective views text understanding as a process of interactions among word experts that results in a disambiguation. Parsing is designed as the decision-making process by which each word expert chooses one suitable interpretation for the word that it represents. Comprehension is therefore simulated as an activity of looking for the best possible fit among word experts. This view differs not only from the traditional rule-based approach (in which words are passive) but also from the statistical approach (in which the best fit is guessed by looking back at an existing sample corpus).
B. Word Expert Parser for English

Following Wilks' parsing system [13], Small [10] developed one of the most influential word expert parsers for English. His theoretical position is that words have no meaning per se, but rather that fragments of lexical items mean something through their interrelationships [11, p.70]. That is, each lexical item is viewed as having certain interactions with its neighboring items, and thereby produces meaning.
C. Word Class Expert Parser for German

Based on the ideas of Word Class Functions (WCFA) [7], Helbig and Hartrumpf [6] developed the first semantically oriented word class expert parser for the German language, WOCADI. In contrast to Small's distributed-interaction approach, WCFA describes the grammatical functions of whole classes of words [6, p.313]. The parser transforms German sentences into the MultiNet formalism [5]. It has been tested on all of the texts in the German Wikipedia. The MultiNet formalism and the WOCADI parser have been successfully applied in the LogAnswer question answering (QA) system, which scored second among non-English QA systems in the CLEF competition [4].
D. Word Experts for Translation

As pointed out by Small, the word expert perspective suggests a new way to look at translation [10, p.15]. The generation step, in particular, requires word experts to arrange themselves into a meaningful sequence by communicating with each other.¹ For example, let three word experts represent drive, Joanie and car. The drive expert would send out the message: in front of me there shall be someone, behind me there shall be a vehicle; who fits? The Joanie expert would reply: I can stay in front of you; the car expert would reply: I can stay behind you. The sequence Joanie drives car is thereby formed.
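The negotiation just described can be sketched as a small message-passing program. This is a minimal illustration under assumed names (`WordExpert`, `linearize`, the category labels), not the authors' implementation:

```python
# Minimal sketch of word-expert ordering negotiation (hypothetical design).

class WordExpert:
    def __init__(self, lexeme, category):
        self.lexeme = lexeme
        self.category = category  # e.g. "person", "action", "vehicle"

def linearize(experts):
    """The 'action' expert asks for a 'person' in front of itself and a
    'vehicle' behind itself; matching experts volunteer for the slots."""
    action = next(e for e in experts if e.category == "action")
    front = next(e for e in experts if e.category == "person")
    back = next(e for e in experts if e.category == "vehicle")
    return [front.lexeme, action.lexeme, back.lexeme]

experts = [WordExpert("drives", "action"),
           WordExpert("Joanie", "person"),
           WordExpert("car", "vehicle")]
print(" ".join(linearize(experts)))  # Joanie drives car
```

The point of the sketch is that the linear order is not given by a grammar rule; it emerges from the slot requests of the action expert and the replies of its neighbors.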
Translation from the word expert perspective consists of two processes: the first is word expert parsing from the source language into a meaning representation, and the second is word expert generation from the meaning representation into the target language. We focus on translation from German into Chinese. The first process is carried out by the WOCADI parser. The second process transforms the MultiNet semantic representation [5] into Chinese.
III. THE CHINESE LANGUAGE AND CHINESE WORD EXPERTS

The grammar of Chinese differs fundamentally from the grammars of German and English: most parts of speech can serve as both the subject and the predicate in a Chinese sentence. An attempt to map grammar rules between German and Chinese therefore only complicates matters. We will show that the word expert perspective is a very suitable way to explain the Chinese language.
A. Subject and Predicate as Topic and Comment

Chao [3] studied the Chinese language and concluded that the relation between the subject and the predicate in a Chinese sentence is a topic-comment relation. This relation holds in all Chinese dialects [12], e.g., Mandarin (spoken in the Peking area), Wu (spoken around Shanghai), Cantonese (spoken in and around Canton), and WenYan (the language of ancient China). For example, in ����(��/John, ��/dead), ��/John is the topic and ��/dead is the comment, which comments on ��; the sentence means he is dead. In ������(��/John, ��/dead, ��/father), ��/John is the topic and ����(��/dead, ��/father) is the comment, which comments on ��; the sentence means his father is dead. In ���(�/water, ��/boil), �/water is the topic and ��/boil comments on the water; in ����(�/book, �/read, ��/finish), �(book) is the topic and ���(�/read, ��/finish) is the comment: (I) have finished reading the book.

¹ For simplicity, we adopt Small's individual word expert view here, though an abstraction to word class experts would again make sense.
The topic-comment relation introduces a question-answer relation between the subject and the predicate [3, p.81]. Imagine a man who returns home after work and asks his wife, ��? (where (is the) rice?). His wife answers, ����(all eaten). The man introduces the topic �/rice in a question; his wife comments on the topic by answering all eaten.
B. Full Chinese Sentences

A full Chinese sentence has a topic and a comment.

1) Nominal expressions as comments: In ����(he is an American), �/he is a pronoun and ���/American is a noun. There is no linking verb �/is between �/he and ���/American. The Chinese Word Expert � (abbreviated 'CWE�') asks its surrounding experts, for example: who can be my property? The CWE��� answers: I can be your property of nationality. If we view Chinese words as such active agents, instead of as passive data as in traditional rule-based grammar, the linking verb �/be is not necessary. In ������(��/inside of the room, ��/many, ��/mosquito), the nominal expression ���� is the predicate. The CWE�� asks: what is inside the room? The CWE���� answers: many mosquitoes. The meaning of the sentence is there are many mosquitoes inside the room.
2) Active verbs as comments: In �������(this matter has long been published), ���(this matter) is the topic and ��/publish is a verb in the active form. In Chinese, the passive form ���(be published) need not be used to mark a passive action, a construction taken for granted in English or German. From the word expert perspective, however, Chinese is simply more pragmatically efficient in expressing meaning: the CWE�� asks: what can be published? The CWE��� answers: this matter. So in Chinese the passive form is indeed not necessary. In ���, ��(wine, (I) do not drink; tobacco, (I) smoke), �/wine and �/tobacco are subjects, while �/drink and �/smoke are verbs in the active form. The CWE�� asks: what not to drink? The CWE� answers: wine. The CWE� asks: what to smoke? The CWE� answers: tobacco.
3) Adjectives as comments: In ��(�/I, �/poor), the whole comment is the single adjective �. The meaning of the sentence is I am poor. The CWE� asks: how about me? The CWE� answers: poor. In ��(�/dish, �/salty), the whole comment is the single adjective �. The CWE� asks: who can serve as my property? The CWE� answers: salty. The meaning of the sentence is the dish is salty.
4) Full sentences as comments: In ������(��/this, �/man, ��/ear, �/soft), the comment is the full sentence ���. The direct translation is (as for) this man, the ear is soft. The CWE��� and the CWE�� both ask: what is my property? The CWE� answers both: soft. As the CWE� is nearer to the CWE�� than to the CWE���, its answer is first accepted by the CWE��. As a result, a new word expert CWE��� is formed, which means gullible. This new expert answers the question raised by the CWE���. The meaning of the sentence is this man is gullible. From this example we propose two principles for communication among Chinese word experts:

Chinese WE Principle 1: Neighboring word experts have priority in communication.

Chinese WE Principle 2: New word experts may appear after successful communications and play roles in communication with other experts.
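Principles 1 and 2 can be illustrated with a short sketch. The data structures are hypothetical (English glosses stand in for the Chinese lexemes, and the compound table stands in for lexical knowledge); this is not the paper's implementation:

```python
# Sketch of Principles 1 and 2: adjacent experts communicate first, and a
# successful exchange may yield a new, merged word expert.

def merge_neighbors(experts, compounds):
    """Scan adjacent pairs; where the lexicon lists a compound reading,
    replace the pair with a single new expert (Principle 2).  Adjacency
    itself encodes Principle 1: only neighboring experts are tried."""
    result = list(experts)
    i = 0
    while i < len(result) - 1:
        pair = (result[i], result[i + 1])
        if pair in compounds:
            result[i:i + 2] = [compounds[pair]]  # new expert replaces the pair
        else:
            i += 1
    return result

# In the paper's example, 'ear' + 'soft' idiomatically means 'gullible'.
compounds = {("ear", "soft"): "gullible"}
print(merge_neighbors(["this-man", "ear", "soft"], compounds))
```

After the merge, the new gullible expert is the one that answers the remaining question of the topic expert.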
In ������(��/this, �/man, ��/mind, ��/simple), the comment is the full sentence ����. The direct translation is this man (is such that his) mind is simple. The meaning of the sentence is the mind of this man is simple. With the two principles, the CWE�� first communicates with the CWE�� and forms a new word expert CWE����, which answers the question of the CWE���. The whole process can be simulated by two question-answer rounds: "how about the mind?" "simple"; "how about this man?" "mind is simple".
5) Verbal expressions as topics: In ��, ����(�/go, �/all right, �/not, �/also), the verbal expressions � and �� are topics. The meaning of the sentence is to go is all right, not to go is also all right. Bare verbs as subjects are allowed neither in English nor in German, but they are perfectly normal in Chinese and are easily explained from the word expert perspective. The CWE� asks: shall I perform? The CWE� answers: all right. The CWE�� asks: can I not perform? The CWE� answers: all right.
6) Spatial-temporal expressions as topics: In ���(��/today, �/cold), the temporal expression �� is the topic. The meaning of the sentence is today is cold. The CWE�� asks: how is today? The CWE� answers: cold. In �����? (��/here, �/is, ��/where), the spatial expression �� is the whole topic. The meaning of the sentence is where is here? The CWE�� asks: where is here? The CWE�� will communicate with word experts in the next sentence for an answer.
7) Conditional expressions as topics: In ������������(�/he, ��/dead, ��/if, ��/simply, �����/unthinkable), the topic is the conditional expression �����. The meaning of the sentence is the supposition that he should die is simply unthinkable. The CWE� asks: how is he? The CWE�� answers: dead. The CWE�� asks: what will be the result, and under which condition? The CWE��� serves as the condition; the CWE���� serves as the result.
8) Prepositional expressions as topics: In �������(�/through, ��/chairman, ��/convene, ��/meeting), the topic is the prepositional expression ���. The meaning of the sentence is the meeting is convened through the chairman. The CWE� asks: through what? By whom? The CWE�� answers: chairman. A new word expert CWE��� is formed. The CWE�� asks: who convenes what? How is it convened? The CWE��� answers the how question, the CWE�� answers the who question, and the CWE�� answers the what question.
9) Full sentences as topics and comments: In �������(�/he, ��/dead, �/I, �/awfully, ��/feel bad), the topic is the full sentence ���(he is dead), and the comment is also a full sentence, ����(I feel awfully bad). The direct translation is he is dead, I feel awfully bad. The meaning of the sentence is that he is dead is something about which I feel awfully bad. The CWE� asks: how is he? The CWE�� answers: dead. The CWE��� asks: how? What is the result? The CWE� asks: how about myself? The CWE��� answers: awfully bad. The CWE���� answers the question: what is the result? If the sentence were ���, ��(�/he, ��/dead, ��/traffic accident), the CWE�� would answer the question of the CWE���: how come?
We conclude that the structures of full Chinese sentences violate many important grammar rules of Western languages, and that understanding full Chinese sentences can be achieved through communication among Chinese word experts following a few communication principles. The meaning of a full Chinese sentence can be represented by a dialog process among the word experts.
C. Minor Sentences

In conversation, it is normal that one speaker introduces a topic and the other makes a comment. This motivates the term minor sentence [3, p.60]. In contrast to a full sentence, a minor sentence does not have both a topic and a comment. The conclusion above is supported by the structure of Chinese minor sentences, in that a speaker in a conversation may say only a few words, answering the questions raised by the word experts of the other speaker.
D. Compound and Complex Sentences

By placing two or more sentences in parallel, we can construct compound Chinese sentences. For example, �����, �����(you do not know me and I do not know you) is a compound sentence built from two parallel 'A ��� B' (A does not know B) sentences. By nesting a full sentence either in the topic or in the comment, we can construct complex sentences. For example, �������(�/I, ��/dead, ��/funeral, ��/simple) is a complex sentence, meaning when I die, the funeral should be simple, constructed by using the full sentence ���(I die) as the topic. Both structures can easily be explained in the word expert framework with the second communication principle (Chinese WE Principle 2) and the following third principle.

Chinese WE Principle 3: A newly formed word expert has priority over older ones in communication.

With this principle, although both the CWE�� and the CWE��� answer the CWE����'s conditional question, the CWE��� is newly formed and therefore has priority.
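Principle 3 amounts to a recency-ordered agenda: the most recently formed expert is consulted first. The stack-based design below is a hypothetical sketch of this idea, with English glosses standing in for the Chinese experts:

```python
# Principle 3 as a stack: newly formed experts are pushed on top and are
# the first candidates to answer an open question (hypothetical sketch).

class Agenda:
    def __init__(self, experts):
        self.stack = list(experts)  # oldest expert first

    def form(self, new_expert):
        """A successful communication yields a new expert (Principle 2),
        which gains priority over the older ones (Principle 3)."""
        self.stack.append(new_expert)

    def first_responder(self, can_answer):
        # scan from the newest expert downwards
        for expert in reversed(self.stack):
            if can_answer(expert):
                return expert
        return None

agenda = Agenda(["dead"])
agenda.form("he-dead")  # newly formed compound expert
print(agenda.first_responder(lambda e: e in {"dead", "he-dead"}))  # he-dead
```

Although the older dead expert could also answer, the newly formed compound expert responds first, as Principle 3 requires.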
E. Pivotal Constructions

A Chinese sentence with a pivotal construction normally has two verbs and a nominal expression. This nominal expression serves as the object of the first verb and as the subject of the second verb [3, pp.124-125]. For example, in �������(��/we, �/order, �/he, �/serve, ��/representative), there are two verbs, �/order and �/serve, and the pronoun �/he is the object of �/order and the subject of �/serve (in Chinese, �/he has the same form in the nominative and in the dative case). The direct translation of the sentence is 'we order he serve as representative'. The meaning is that we delegate him to be the representative. Within the word expert perspective, we can explain this special construction without introducing any new terminology (as is done in rule-based grammar theories). The CWE� asks: who is ordered? The CWE� asks: who serves? The CWE� answers both: �/he.
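The pivotal construction reduces, in this view, to one nominal expert filling two open roles at once. A minimal sketch under assumed names (the role table and glosses are illustrative, not the authors' representation):

```python
# Sketch of the pivotal construction: the single pivot expert answers both
# the object question of the first verb and the subject question of the
# second verb (hypothetical representation; glosses replace characters).

def answer_both(pivot, open_questions):
    """Each verb expert has posted one open role; the pivotal nominal
    fills them all, as in 'we order he serve as representative'."""
    filled = {}
    for verb, role in open_questions.items():
        filled[verb] = {role: pivot}
    return filled

open_questions = {"order": "object",   # CWE order asks: whom do we order?
                  "serve": "subject"}  # CWE serve asks: who serves?
print(answer_both("he", open_questions))
```

No new grammatical terminology is needed: the shared argument is simply the expert that happens to satisfy both questions.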
IV. GERMAN-CHINESE WORD EXPERT TRANSLATORS

Given a sentence in German, the WOCADI parser can be used to generate a corresponding semantic representation in the MultiNet formalism. We need to design word expert translators that communicate with each other, linearize themselves to form a (nested) topic-comment structure, and transform the MultiNet representation into Chinese sentences.
We start with a simple example to introduce the main idea. Suppose the German sentence is Er ist ein Deutscher (He is a German). The WOCADI parser delivers the MultiNet semantic representation illustrated in Figure 1: er.1.1 is the word sense² of the lexeme er/he; c275 represents the word expert for a concrete individual³ subordinate (SUB) to er.1.1; similarly, c287 represents the word expert whose concept is subordinate to deutsche.1.1/German. The word expert c278 has two arguments: the first is the topic, c275, pointed to by ARG1; the second is the comment, c287, pointed to by ARG2; the temporal status (TEMP) of c278 is present (present.0). The word expert c275 posts a message: if I am the topic, who can be my comment? The word expert c278 answers: as far as I know, the word expert c287 is your comment. If word expert c275 knows that its Chinese lexeme is �/he, and word expert c287 knows that its Chinese lexeme is ���/German, they will know that in the linearization of the Chinese sentence c275 comes before c287. The simplest case is ����(He German), which is indeed a valid Chinese sentence with the same meaning as Er ist ein Deutscher. If word expert c278 knows that its Chinese lexeme is �/is, and knows that it shall stand between the topic and the comment in the Chinese sentence, the Chinese sentence will be �����(he is German). If c278 knows that ��/now is the Chinese lexeme meaning present, and decides that its temporal knowledge shall also be encoded in the Chinese sentence, the Chinese sentence could be �������, �������, �������, or �������. If it knows that its temporal label shall appear in front of it, the two linearizations will be ������� and �������, both of which are valid Chinese sentences. These word experts are now also experts for translating. The activities of each word expert translator comprise: transforming its German lexeme into possible Chinese lexemes, determining a linear ordering among the Chinese lexemes of a single word expert translator, determining whether its Chinese lexemes shall appear in the Chinese translation, establishing a linear ordering relation with other word expert translators by communication, and choosing the most appropriate Chinese lexemes.

² Our lexicon uses a double indexing scheme to distinguish word senses.
³ Entities mentioned in the text are represented by constants c1, c2, etc.

Fig. 1. MultiNet representation of the sentence Er ist ein Deutscher (he is a German)
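The linearization choices in the Er ist ein Deutscher example (topic before comment, an optional copula between them, an optional temporal label in front of the expert it modifies) can be sketched as a candidate generator. The function name and attachment rules are simplifying assumptions, and English glosses stand in for the Chinese lexemes:

```python
# Sketch: enumerating candidate Chinese linearizations for a topic-comment
# pair with an optional copula and an optional temporal marker.

from itertools import product

def candidate_sentences(topic, comment, copula=None, temporal=None):
    """Yield orderings consistent with the constraints in the text:
    the topic precedes the comment; the copula, if realized, sits between
    them; the temporal marker, if realized, precedes what it modifies."""
    copulas = [None, copula] if copula else [None]
    temporals = [None, temporal] if temporal else [None]
    for cop, tmp in product(copulas, temporals):
        core = [topic] + ([cop] if cop else []) + [comment]
        if tmp:
            # the temporal label may attach before the whole sentence
            # or directly after the topic (an assumed simplification)
            yield [tmp] + core
            yield [core[0], tmp] + core[1:]
        else:
            yield core

for s in candidate_sentences("he", "German", copula="is", temporal="now"):
    print(" ".join(s))  # e.g. "he German", "he is German", ...
```

With both optional elements available, six candidates result, mirroring the several valid Chinese sentences listed in the text.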
A. Transform<strong>in</strong>g <strong><strong>in</strong>to</strong> <strong>Ch<strong>in</strong>ese</strong> Lexemes<br />
This task is to find Chinese lexemes for a given German lexeme such that they represent the same concept. This is not always feasible. Some German lexemes have no corresponding native Chinese lexeme, e.g., names of cheeses, beers, and chocolates – in German, Franziskaner can refer to a kind of beer, and there is no corresponding native Chinese lexeme. For those lexemes that do have native Chinese counterparts, the counterparts may differ across Chinese dialects. For example, the German lexeme wir (we) can be mapped to �� in Mandarin, �� in the Shanghai dialect, and �� in the Canton dialect. Chao [3] suggests that a complete lexicon shall be constructed to make selection applicable in grammar. For the translation from German into Chinese, we need to embed the Chinese lexical ontology system into the German lexical ontology system and mark each Chinese lexeme with its dialect group, while neglecting fine-grained Chinese lexemes that have no corresponding German lexemes. German lexemes that have no native Chinese lexemes will be translated separately.
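The embedded bilingual lexicon described above can be sketched as a simple lookup table. This is our own illustration, not the authors' implementation; the lexeme identifiers and dialect labels are assumptions, and the actual Chinese characters (lost in our source) are replaced by placeholder strings.

```python
# Sketch of a German-to-Chinese lexicon where each German lexeme maps
# to candidate Chinese lexemes tagged with a dialect group. Placeholder
# strings stand in for the actual Chinese characters.
LEXICON = {
    "wir.1.1": [  # German "wir" (we)
        {"zh": "<Mandarin form>", "dialect": "Mandarin"},
        {"zh": "<Shanghai form>", "dialect": "Shanghai"},
        {"zh": "<Canton form>", "dialect": "Canton"},
    ],
    # Lexemes with no native Chinese counterpart (e.g. the beer sense
    # of Franziskaner) carry an empty list and are handled separately.
    "franziskaner.1.2": [],
}

def chinese_candidates(german_lexeme, dialect="Mandarin"):
    """Return the Chinese lexemes of the given dialect, or None if the
    German lexeme has no native Chinese counterpart."""
    entries = LEXICON.get(german_lexeme, [])
    if not entries:
        return None  # translated separately
    return [e["zh"] for e in entries if e["dialect"] == dialect]
```

Marking every entry with its dialect group keeps dialect selection a simple filter rather than a separate lexicon per dialect.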
B. Lexeme Ordering within a Word Expert Translator
A word expert in the analysis of a German sentence may have a temporal property representing the tense. In Chinese, tense is expressed by particles: � � � � �, as in ����� (er hat gegessen / he has eaten). The word expert essen.1.1/eat has a temporal property past.0, which shall be translated into the Chinese particles � and �. The corresponding Chinese lexemes of the word expert essen.1.1 are �����, whose linear ordering in a Chinese sentence is stated as follows.
Chinese WE Principle 4: Let L be a word expert which may have � or � as particle. (1) If � is the only particle, it must occur directly after L; (2) if � is the only particle, it shall occur after L; (3) if both � and � are particles, � shall occur before �, besides obeying rules (1) and (2); (4) if � is used twice, besides obeying rule (2), the two occurrences must be separated by lexemes of another word expert, and one must directly follow L.

Fig. 2. MultiNet representation of the sentence Das Kind hat Angst und fängt an zu weinen (The child is scared and begins to cry)
With the above principle, the following orderings are all understandable translations: ��� – (1), ��� – (2), ��� – (2), ���� – (1)(2)(3), ���� – (1)(2)(3), ���� – (2)(4).
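Principle 4 can be sketched as an executable ordering check. Since the two particles are not reproducible from our source, we use the placeholder tokens P1 (the particle that must directly follow L) and P2 (the particle that must come after L) – both names are assumptions.

```python
# Sketch of Chinese WE Principle 4 as a predicate over a linearized
# token sequence. P1/P2 are placeholder names for the two particles.
def satisfies_principle_4(tokens, L="L", P1="P1", P2="P2"):
    """tokens: lexemes/particles of one word expert plus any
    interleaved material, e.g. ["L", "P1", "x", "P2"]."""
    if L not in tokens:
        return False
    i = tokens.index(L)
    p1_pos = [j for j, t in enumerate(tokens) if t == P1]
    p2_pos = [j for j, t in enumerate(tokens) if t == P2]
    # (1) P1, if present, occurs exactly once, directly after L
    if p1_pos and p1_pos != [i + 1]:
        return False
    # (2) every P2 occurs after L
    if any(j < i for j in p2_pos):
        return False
    # (3) when both occur, P1 precedes P2
    if p1_pos and p2_pos and p1_pos[0] > p2_pos[0]:
        return False
    # (4) a doubled P2: occurrences separated by other lexemes,
    #     and one directly follows L
    if len(p2_pos) == 2:
        if p2_pos[1] - p2_pos[0] < 2 or (i + 1) not in p2_pos:
            return False
    return True
```

Such a predicate lets a word expert translator filter the orderings it enumerates before communicating them to other word experts.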
C. Determining Whether to Appear
Not all the word experts of a German sentence shall appear in the translated Chinese sentence. For example, the German sentence Das Kind hat Angst und fängt an zu weinen (The child is scared and begins to cry) shall be translated into Chinese ����� (��/Kind/child, �/Angst/scare, �/weinen/cry, �/particle), as illustrated in Figure 2. The following word experts appear in the translated Chinese sentence: kind.1.1/child (c97), weinen.1.1/cry (c138), angst.1.1/scare (c101), and haben.1.1/have (c98). The carrier (SCAR) of haben.1.1/have is kind.1.1/child, which is the actor (AGT) of both weinen.1.1/cry and anfangen.1.2/begin. The affected object (AFF) of anfangen.1.2/begin is weinen.1.1/cry, which is the second argument (ARG2) of anfangen.1.1. 4
Considering the topic-comment relations among the word experts, we find that the comment of angst.1.1/scare is weinen.1.1/cry, and that the two word experts anfangen.1.2/begin and anfangen.1.1/begin simply duplicate this topic-comment relation. Therefore, they shall not appear in the translated Chinese sentence – the redundancy is only an artifact of our deep semantic analysis. A principle is stated as follows.

Chinese WE Principle 5: A word expert shall not appear in the translated Chinese sentence if it duplicates an existing topic-comment relation.
4 This analysis reflects that if someone starts something (anfangen.1.1), then this starting action causes something to start (anfangen.1.2).
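Principle 5 amounts to a deduplication pass over the word experts. The following sketch (our illustration, not the paper's code) encodes the Figure 2 example, where anfangen.1.1 and anfangen.1.2 both duplicate the relation between angst.1.1 and weinen.1.1:

```python
# Each word expert may contribute one (topic, comment) pair; word
# experts whose pair duplicates an already-seen relation are dropped.
RELATIONS = {
    "angst.1.1": ("angst.1.1", "weinen.1.1"),
    "anfangen.1.1": ("angst.1.1", "weinen.1.1"),  # duplicate
    "anfangen.1.2": ("angst.1.1", "weinen.1.1"),  # duplicate
}

def prune_duplicates(word_experts, relations):
    """Keep word experts whose (topic, comment) pair is new or absent."""
    seen, kept = set(), []
    for we in word_experts:
        rel = relations.get(we)
        if rel is not None and rel in seen:
            continue  # Principle 5: duplicates an existing relation
        if rel is not None:
            seen.add(rel)
        kept.append(we)
    return kept
```

Running this over the six word experts of the example removes both anfangen readings while keeping kind.1.1, haben.1.1, angst.1.1, and weinen.1.1.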
D. Ordering of Word Expert Translators
For those word experts that will appear in the translated Chinese sentence, a nested topic-comment ordering shall be constructed. As we have semantic representations from WOCADI and lexemes tagged with semantic roles, topic-comment relations among word experts are not difficult to obtain – we only need to examine the description of the lexeme and the possible MultiNet constructions to see which elements can be the comment of which other elements in the meaning representation. In Figure 2, angst.1.1/scare is the topic for weinen.1.1/cry, so we have a list (WE-angst.1.1/scare WE-weinen.1.1/cry); kind.1.1/child is the topic for angst.1.1/scare, so we have (WE-kind.1.1 (WE-angst.1.1 WE-weinen.1.1)). By flattening this nested structure, we have a linear ordering of the word experts: (WE-kind.1.1 WE-angst.1.1 WE-weinen.1.1). Chinese sentences can be obtained by replacing each word expert with its Chinese lexeme(s). For example, �����, �����, and ������ are all understandable translations, where � is the particle of �, whose ordering follows Chinese WE Principle 4.
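The flattening step above is a simple left-to-right traversal. A sketch, representing nested topic-comment structures as tuples of word-expert names (head = topic, rest = comments):

```python
def flatten(structure):
    """Flatten a nested topic-comment structure into a linear
    ordering of word experts."""
    if isinstance(structure, str):  # a single word-expert name
        return [structure]
    out = []
    for part in structure:  # topic first, then its comments
        out.extend(flatten(part))
    return out

# (WE-kind.1.1 (WE-angst.1.1 WE-weinen.1.1)) flattens to
# [WE-kind.1.1, WE-angst.1.1, WE-weinen.1.1]
```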
E. Choosing Chinese Lexemes
A word expert may have more than one corresponding Chinese lexeme even within one Chinese dialect. Communication among word experts is required to select the most suitable one, or to delete incompatible ones. For example, the word expert ein.1.1/a in the German phrase ein Baum (a tree) can be mapped to ����������� .... By communicating with the word expert Baum.1.1/tree, the word expert ein.1.1/a learns that it can only be mapped to ��. This requires word experts of countable objects to carry measure-word information.
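This communication step can be sketched as a compatibility filter: the word expert for the article asks the head noun which measure word it licenses and keeps only matching candidates. The table, the token names, and the pair representation below are all our assumptions (the actual characters are not reproducible here).

```python
# Measure word licensed by each countable head noun (placeholder names).
MEASURE_OF = {"baum.1.1": "MW-tree"}

def select_compatible(candidates, head_lexeme):
    """candidates: renderings of ein.1.1 as ("one", measure_word)
    pairs; keep those whose measure word the head noun licenses."""
    licensed = MEASURE_OF.get(head_lexeme)
    if licensed is None:
        return candidates  # head carries no measure-word information
    return [c for c in candidates if c[1] == licensed]
```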
F. A Slow Intelligence (SIS) Workflow
The word expert translation system can be organized in the slow intelligence architecture [2]. Each word expert that occurs in the German parse is a unit slow intelligence system. It enumerates possible Chinese lexemes, determines possible linear orderings among them, and determines the topic-comment relations of its arguments. Word experts communicate among themselves and form a larger slow intelligence system: nested topic-comment structures are first enumerated; then duplicated topic-comment relations are removed, incorrect lexemes are pruned, and the ordering of particles is linearized with the lexemes of other word experts. Finally, possible Chinese sentences are produced.
For the German sentence Ich fällte einen Baum mit einer Axt (I cut down a tree with an ax), the MultiNet analysis results in four word experts: WE-ich.1.1/I (abbreviated WE-I in the following), WE-baum.1.1/tree (WE-tree), WE-axt.1.1/ax (WE-ax), and WE-fällen.1.1/cut-down (WE-cut-down). The system first creates one CWE for each word expert WE. All the CWEs are unit SISs, searching for possible Chinese lexemes and enumerating linear orders based on the CWE principles. In our current German-Chinese dictionary, CWE-I has one Chinese lexeme: �; CWE-cut-down has five Chinese lexemes: ��� � � � � � � � � � and two particles, � and �;
CWE-tree has one Chinese lexeme: ���; CWE-ax has one Chinese lexeme: �. Only CWE-cut-down needs to carry out an enumeration-elimination process to linearize the orders of its Chinese lexemes and particles.

Fig. 3. Word expert translation results for the input sentence Ich fällte einen Baum mit einer Axt

The orders among the
four CWEs will be enumerated based on the topic-comment
relations as follows: (1) CWE-I is the actor of CWE-cut-down; therefore, CWE-I is the topic and CWE-cut-down is the comment. (2) CWE-tree is the affected object of CWE-cut-down; therefore, CWE-cut-down is the topic and CWE-tree is the comment. (3) CWE-ax is analysed as the association of CWE-tree, but their features delivered by the WOCADI parser differ: CWE-ax is an instrument, while CWE-tree is a non-instrument. Therefore, CWE-ax is an instrument comment on both the action and the actor. This introduces the particle �/using before the lexeme of CWE-ax: �� (using an ax).
The two nested topic-comment structures are (CWE-I ((CWE-cut-down CWE-tree) CWE-ax)) and (CWE-I (CWE-ax (CWE-cut-down CWE-tree))). In the adaptation phase, CWEs communicate with each other to prune lexemes. �� and �� of CWE-cut-down are removed, as they are incompatible with the instrument CWE-ax. A demo translation system has been developed; its translation results are illustrated in Figure 3.
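The enumeration-elimination cycle just described can be miniaturized as follows. This is our own illustration: the lexeme inventories use placeholder tokens (the actual Chinese characters are not reproducible here), and the veto set stands for the two cut-down verbs incompatible with an instrument.

```python
from itertools import product

# Candidate Chinese lexemes per CWE (placeholder tokens).
CANDIDATES = {
    "CWE-I": ["I"],
    "CWE-ax": ["AX"],
    "CWE-cut-down": ["CUT1", "CUT2", "CUT3", "CUT4", "CUT5"],
    "CWE-tree": ["TREE"],
}
# Adaptation phase: CWE-ax vetoes two cut-down lexemes.
INCOMPATIBLE_WITH_INSTRUMENT = {"CUT4", "CUT5"}

# One flattened topic-comment structure:
# (CWE-I (CWE-ax (CWE-cut-down CWE-tree)))
ORDER = ["CWE-I", "CWE-ax", "CWE-cut-down", "CWE-tree"]

def enumerate_translations():
    """Enumerate all lexeme combinations (enumeration), dropping those
    vetoed by communication with CWE-ax (elimination)."""
    results = []
    for combo in product(*(CANDIDATES[we] for we in ORDER)):
        if INCOMPATIBLE_WITH_INSTRUMENT & set(combo):
            continue  # pruned in the adaptation phase
        results.append(list(combo))
    return results
```

With five cut-down candidates and two vetoed, three candidate sentences survive for this ordering.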
V. CONCLUSIONS
The work reported in this paper is mainly based on a classic research work on Chinese grammar [3]. We argue that the Chinese language structure is more suitable to be understood from the word expert perspective than from the traditional rule-based perspective. Machine translation from German into Chinese is outlined from the word expert perspective, and a slow intelligence workflow for machine translation is illustrated with an example.
ACKNOWLEDGMENTS
Hermann Helbig introduced us to the topic and critically commented on the draft paper; Peiling Cui critiqued the Chinese translations. Financial support from the DFG is gratefully acknowledged.
REFERENCES
[1] P. F. Brown, J. Cocke, S. A. Della Pietra, et al. A Statistical Approach to Machine Translation. Computational Linguistics, 16(2):79–85, 1990.
[2] S. K. Chang. A general framework for slow intelligence systems. International Journal of Software Engineering and Knowledge Engineering, 20(1):1–18, 2010.
[3] Y. R. Chao. A Grammar of Spoken Chinese. University of California Press, 1968.
[4] I. Glöckner and B. Pelzer. The LogAnswer project at ResPubliQA 2010. In CLEF 2010 Working Notes, September 2010.
[5] H. Helbig. Knowledge Representation and the Semantics of Natural Language. Springer-Verlag, 2006.
[6] H. Helbig and S. Hartrumpf. Word class functions for syntactic-semantic analysis. In Proceedings of the 2nd International Conference on Recent Advances in Natural Language Processing, pages 312–317, Tzigov Chark, Bulgaria, 1997.
[7] H. Helbig. Syntactic-semantic analysis of natural language by a new word-class controlled functional analysis. Computers and Artificial Intelligence, 5(1):53–59, 1986.
[8] V. Mihailevschi. Machine Translation Interlingua Based on MultiNet. VDM Verlag Dr. Müller, 2008.
[9] C. Rieger. Viewing Parsing as Word Sense Discrimination. In Dingwall, editor, A Survey of Linguistic Science. Greylock Publishers, 1977.
[10] S. Small. Word Expert Parsing: A Theory of Distributed Word-Based Natural Language Understanding. PhD thesis, Department of Computer Science, University of Maryland, 1980.
[11] S. Small. Viewing Word Expert Parsing as Linguistic Theory. In Proceedings of the 7th International Joint Conference on Artificial Intelligence, pages 70–76. Erlbaum, Hillsdale, 1981.
[12] L. Wang. History of the Chinese Language. Zhonghua Book Co. Press, 1980.
[13] Y. Wilks. Making Preferences More Active. AI Memo 206, Artificial Intelligence Laboratory, Stanford University, 1973.