06.07.2014 Views

A Treebank-based Investigation of IPP-triggering Verbs in Dutch


Non-Projective Structures in Indian Language Treebanks

Riyaz Ahmad Bhat and Dipti Misra Sharma

Language Technology Research Center, IIIT-Hyderabad, India

E-mail: riyaz.bhat@research.iiit.ac.in, dipti@iiit.ac.in

Abstract

In recent years non-projective structures have been widely studied across different languages. These dependency structures have been reported to reduce parsing efficiency and to pose problems for grammatical formalisms. Non-projective structures are particularly frequent in morphologically rich languages like Czech and Hindi [8], [10]. In Hindi a large share of parse errors are due to non-projective structures [6], which motivates a thorough analysis of these structures, at both the linguistic and the formal level, in Hindi and other related languages. In this work we study non-projectivity in Indian languages (ILs), which are morphologically rich and have relatively free word order. We present a formal characterization and linguistic categorization of non-projective dependency structures across four Indian language treebanks.

1 Introduction

Non-projective structures, in contrast to projective dependency structures, contain a node with a discontinuous yield. These structures are common in natural languages and are particularly frequent in morphologically rich languages with flexible word order, such as Czech and German. In the recent past the formal characterization of non-projective structures has been thoroughly studied, motivated by the challenges these structures pose to dependency parsing [7], [11], [5]. Other studies have tried to provide an adequate linguistic description of non-projectivity in individual languages [4], [10]. Mannem et al. [10] carried out a preliminary study on the Hyderabad Dependency Treebank (HyDT), a pilot dependency treebank of Hindi containing 1865 sentences annotated with dependency structures. They identified the different construction types in the treebank that exhibit non-projectivity. In this work we present our analysis of non-projectivity across four IL treebanks. ILs are morphologically rich: grammatical relations are expressed through the morphology of words rather than through syntax. This allows words in these languages to move around in the sentence structure. Such movements quite often, as we will see in subsequent sections, lead to non-projectivity in the dependency structure. We studied treebanks of four Indian languages, viz. Hindi (Indo-Aryan), Urdu (Indo-Aryan), Bangla (Indo-Aryan) and Telugu (Dravidian). They all have an unmarked Subject-Object-Verb (SOV) word order; however, the order can be altered under appropriate pragmatic conditions. Movement of arguments and modifiers away from the head is the major phenomenon observed to induce non-projectivity in these languages.
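The definition that opens this section (a non-projective structure contains a node with a discontinuous yield) can be illustrated with a minimal sketch. The head-list encoding, the function names `yields` and `non_projective_nodes`, and the toy trees are assumptions made for illustration; they are not taken from the treebanks or tools discussed in this paper.

```python
from typing import Dict, List, Set


def yields(heads: List[int]) -> Dict[int, Set[int]]:
    """Map every node to its yield: the node itself plus all its descendants.

    Assumed encoding (hypothetical, for this sketch only): tokens are
    numbered from 1, heads[i] is the head of token i + 1, and a head of 0
    marks the root. The input is assumed to be a well-formed, acyclic tree.
    """
    result = {node: {node} for node in range(1, len(heads) + 1)}
    for node in range(1, len(heads) + 1):
        # Walk up to the root, adding this node to every ancestor's yield.
        ancestor = heads[node - 1]
        while ancestor != 0:
            result[ancestor].add(node)
            ancestor = heads[ancestor - 1]
    return result


def non_projective_nodes(heads: List[int]) -> List[int]:
    """Return the nodes whose yield is not a contiguous span of positions."""
    bad = []
    for node, span in yields(heads).items():
        # A contiguous yield covers exactly max - min + 1 positions.
        if max(span) - min(span) + 1 != len(span):
            bad.append(node)
    return sorted(bad)


# Toy tree 1: token 2 is the root and heads tokens 1 and 3.
projective = [2, 0, 2]
# Toy tree 2: token 1 depends on token 3, but token 2 (the root) sits
# between them, so the yield of token 3 is {1, 3} and has a gap.
non_projective = [3, 0, 2, 2]

print(non_projective_nodes(projective))
print(non_projective_nodes(non_projective))
```

In the second toy tree the dependent of token 3 has been displaced to the left of the root, which mirrors, in a very simplified way, the movement of arguments and modifiers away from their head described above.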

