An Application of Lexicalized Grammars in English-Persian Translation

Heshaam Feili and Gholamreza Ghassem-Sani¹

Abstract. Increasing the domain of locality by using Tree Adjoining Grammars (TAG) has led some applications, such as machine translation, to employ it for the disambiguation process. Successful experiments employing TAG in French-English and Korean-English machine translation encouraged us to use it for another language pair with very divergent properties, Persian and English. Using Synchronous TAG (S-TAG) for this pair of languages can benefit from syntactic and semantic features for transferring the source into the target language. Here, we report our experiments in translating English into Persian. We also present a model for lexical selection disambiguation based on decision trees, and introduce a method for automatically learning the required decision trees from a sample data set.

1 INTRODUCTION

Tree adjoining grammars (TAGs) have several unique properties that make them suitable for applications such as semantic interpretation and automatic translation of natural language [2, 8, 14]. This type of grammar belongs to a class of grammars named Mildly Context-Sensitive Grammars (MCSGs), which is placed between context-free and context-sensitive grammars with respect to generative power [5].

Tree Adjoining Grammars are an extension of context-free grammars (CFGs) that use trees instead of productions as the primary representing structure. Formally, a TAG is a 5-tuple (V_N, V_T, S, I, A), where V_N is a finite set of non-terminal symbols, V_T is a finite set of terminal symbols, S is the axiom of the grammar, I is a finite set of initial trees, and A is a finite set of auxiliary trees.
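
The 5-tuple above can be written down directly as a small data structure. The following Python sketch is only an illustration: the class and field names (`Node`, `TAG`, `nonterminals`, and so on) and the toy grammar are assumptions made here, not part of the system described in this paper. The two consistency checks reflect the usual convention in the TAG literature rather than anything stated in the text.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """A tree node; `label` is a non-terminal, a terminal, or "" for the empty string."""
    label: str
    children: list[Node] = field(default_factory=list)


@dataclass(frozen=True)
class TAG:
    """The 5-tuple (V_N, V_T, S, I, A); the field names are illustrative."""
    nonterminals: frozenset     # V_N
    terminals: frozenset        # V_T
    axiom: str                  # S
    initial_trees: tuple        # I
    auxiliary_trees: tuple      # A

    def __post_init__(self):
        # Assumption (conventional, not stated in the text): the axiom is a
        # non-terminal and the two alphabets do not overlap.
        if self.axiom not in self.nonterminals:
            raise ValueError("axiom S must be a non-terminal")
        if self.nonterminals & self.terminals:
            raise ValueError("V_N and V_T must be disjoint")


# Hypothetical toy grammar with a single initial tree S(NP, VP).
toy = TAG(
    nonterminals=frozenset({"S", "NP", "VP"}),
    terminals=frozenset({"John", "sleeps"}),
    axiom="S",
    initial_trees=(Node("S", [Node("NP"), Node("VP")]),),
    auxiliary_trees=(),
)
```

Keeping I and A as separate fields mirrors the definition, since the two kinds of trees take part in different derivation operations.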
The union of I and A is the set of elementary trees. Internal nodes are labeled by non-terminals, and leaf nodes by terminals or the empty string, except for exactly one leaf node in each auxiliary tree (called the foot node), which is labeled by the same non-terminal as its root node. New trees are derived by substitution or adjoining.

Substitution of a node labeled A in an elementary tree T by another tree T' whose root is labeled A is performed by replacing that node with the whole tree T'. Let T be a tree containing a node labeled A, and let T' be an auxiliary tree whose root and foot node are both labeled A. Then adjoining T' into T is performed by excising the sub-tree of T rooted at the node labeled A (called the adjunction node), attaching T' at that node, and attaching the excised sub-tree to the foot of T' [6]. Adjunction nodes are labeled by a symbol (*) and substitution nodes are labeled by a symbol (↓).

¹ Computer Engineering Department, Sharif University of Technology, Tehran, Iran, email: {hfaili@mehr.sharif.edu, sani@sharif.edu}.

A lexicalized tree adjoining grammar (LTAG) is a version of TAG in which every elementary tree is associated with a lexical item. The leaf node associated with the lexical item is named the anchor node. Anchor nodes are usually labeled by a symbol (◊).

The use of TAG for automatic translation of natural languages has led to a new concept named synchronous tree adjoining grammar (S-TAG). The use of S-TAG for machine translation was first described by Abeille et al. [1], and since then several experiments have been reported, most of which adopted the XTAG system [17]. Abeille et al. noted that traditionally difficult problems mentioned by Dorr [7], such as structural, lexical, conflation, and thematic divergences, are not regarded as problems for an S-TAG based approach [10].

S-TAG is defined as two related TAGs for the source and target languages.
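
As a rough illustration of the substitution and adjoining operations defined earlier in this section, the following Python sketch applies both to toy trees. The `Node` class, the function names, and the example trees are assumptions made for illustration and are not taken from the system reported here. Picking the first matching node is a simplification: a real derivation names the exact node at which each operation applies.

```python
from __future__ import annotations

import copy
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    children: list[Node] = field(default_factory=list)
    is_foot: bool = False    # foot node of an auxiliary tree
    is_subst: bool = False   # substitution site (the "↓" nodes)


def find(node, pred):
    """Return the first node in pre-order that satisfies pred, or None."""
    if pred(node):
        return node
    for child in node.children:
        hit = find(child, pred)
        if hit is not None:
            return hit
    return None


def substitute(tree, initial):
    """Replace a substitution node labeled A with a tree rooted in A."""
    tree, initial = copy.deepcopy(tree), copy.deepcopy(initial)
    site = find(tree, lambda n: n.is_subst and n.label == initial.label)
    if site is None:
        raise ValueError("no substitution node labeled " + initial.label)
    site.children, site.is_subst = initial.children, False
    return tree


def adjoin(tree, aux):
    """Adjoin auxiliary tree `aux` at the first internal node with its root label."""
    tree, aux = copy.deepcopy(tree), copy.deepcopy(aux)
    foot = find(aux, lambda n: n.is_foot)
    if foot is None or foot.label != aux.label:
        raise ValueError("auxiliary tree needs a foot labeled like its root")
    site = find(tree, lambda n: n.label == aux.label and bool(n.children))
    if site is None:
        raise ValueError("no adjunction node labeled " + aux.label)
    # The excised sub-tree hangs from the foot; the auxiliary tree takes its place.
    foot.children, foot.is_foot = site.children, False
    site.children = aux.children
    return tree


def yield_of(node):
    """Left-to-right leaf labels, skipping empty-string leaves."""
    if not node.children:
        return [node.label] if node.label else []
    return [word for child in node.children for word in yield_of(child)]


# Hypothetical elementary trees.
alpha_sleeps = Node("S", [Node("NP", is_subst=True),
                          Node("VP", [Node("V", [Node("sleeps")])])])
alpha_john = Node("NP", [Node("John")])
beta_soundly = Node("VP", [Node("VP", is_foot=True),
                           Node("Adv", [Node("soundly")])])

derived = adjoin(substitute(alpha_sleeps, alpha_john), beta_soundly)
print(" ".join(yield_of(derived)))  # John sleeps soundly
```

Both functions work on deep copies, so the elementary trees stay unchanged and can be reused in other derivations.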
Any sentence in the source language, together with its structure in the TAG formalism, is related to its associated structure in the target language, again in the TAG formalism [14].

2 PERSIAN LANGUAGE

Persian, also known as Farsi, is the official language of Iran and Tajikistan, and one of the two main languages used in Afghanistan. The language has been influenced by its local environments, for example by Arabic (in Iran) and by Russian. Here, we use the Persian that is the official language of Iran.

Arabic has heavily influenced Persian, but has not changed its structure. In other words, Persian has only borrowed a large number of lexical words from Arabic. In spite of this influence, the syntactic and morphological forms of Persian are therefore unaffected [9].

Persian is an SOV language with a large potential for free word order, especially for prepositional adjuncts and complements. For example, adverbs can be placed at the beginning, at the end, or in the middle of a sentence, and this usually does not change the meaning. This flexibility in word ordering is usually useful in language generation.

Persian is written from right to left and uses the Arabic alphabet². The vowels generally known as short vowels (a, e, o) are usually not written. This causes some ambiguities in the pronunciation of Persian words.

² By Persian, we mean the language that is used in Iran. In Tajikistan, Persian is written with English letters.
