06.07.2014 Views

A Treebank-based Investigation of IPP-triggering Verbs in Dutch


constituency and dependency, possibly simultaneously, as stated in the SynAF metamodel. The metamodel is based on the following components:

• The T_node component represents the terminal nodes of a syntactic tree, consisting of morpho-syntactically annotated word forms, as well as empty elements when appropriate. T_nodes are annotated with syntactic categories valid for the word level.

• The NT_node component represents the non-terminal nodes of a syntactic tree. The NT_nodes are annotated with syntactic categories that are valid at the phrasal, clausal and/or sentential levels.

• The Edge component represents a relation between syntactic nodes (both terminal and non-terminal nodes). For example, the dependency relation is binary, consisting of a pair of source and target nodes, with one or more annotations.
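The three components above can be pictured as a small data structure. The following is a minimal Python sketch, not part of SynAF itself: the class names TNode, NTNode and Edge, their fields, the helper children(), and the example phrase and category labels are all illustrative assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class TNode:
    """Terminal node: a word form (or empty element) with a word-level category."""
    node_id: str
    form: str        # empty string for an empty element
    category: str    # word-level syntactic category (hypothetical label)


@dataclass
class NTNode:
    """Non-terminal node with a phrasal, clausal or sentential category."""
    node_id: str
    category: str


@dataclass
class Edge:
    """Binary relation between two nodes, carrying one or more annotations."""
    source: str
    target: str
    annotations: dict = field(default_factory=dict)


# Hypothetical example: a noun phrase dominating two terminals (constituency)
# plus one dependency edge between the two terminals.
terminals = [TNode("t1", "the", "DET"), TNode("t2", "tree", "N")]
phrase = NTNode("nt1", "NP")
edges = [
    Edge("nt1", "t1", {"type": "constituency"}),
    Edge("nt1", "t2", {"type": "constituency", "label": "head"}),
    Edge("t2", "t1", {"type": "dependency", "label": "det"}),
]


def children(node_id, edge_type):
    """Return the target ids of all edges of the given type leaving node_id."""
    return [e.target for e in edges
            if e.source == node_id and e.annotations.get("type") == edge_type]
```

Because both kinds of relation are stored as annotated edges over the same nodes, constituency and dependency information can coexist in one structure, which is the point made in the text.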

From this metamodel, a specific syntactic annotation model can be obtained by combining the above-mentioned components with data categories characterising or refining their semantics.

It should be noted that the terminal node level in SynAF is strongly equivalent to the word form level in MAF (ISO 24611 [5]), for which we offer concrete interfaces below. It is thus left to the implementer to either separate or merge these two components, depending on whether it is relevant, for instance, to clearly differentiate the data categories attached to word forms from those attached to terminals within a multi-layered annotated corpus.
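The choice between separating and merging the two levels can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the classes WordForm and Terminal, their fields, the word_forms lookup table and the helper surface_form() are hypothetical and are not defined by SynAF or MAF.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WordForm:
    """Entry of a separate morphosyntactic layer (hypothetical fields)."""
    wf_id: str
    form: str
    lemma: str
    pos: str


@dataclass
class Terminal:
    """Terminal node of the syntactic layer.

    Separated design: word_form_ref points into the word form layer.
    Merged design: form and pos are stored on the terminal itself.
    """
    t_id: str
    word_form_ref: Optional[str] = None
    form: Optional[str] = None
    pos: Optional[str] = None


# Hypothetical word form layer, indexed by identifier.
word_forms = {"wf1": WordForm("wf1", "trees", "tree", "N")}


def surface_form(terminal):
    """Resolve the surface string of a terminal under either design."""
    if terminal.word_form_ref is not None:
        return word_forms[terminal.word_form_ref].form
    return terminal.form


separated = Terminal("t1", word_form_ref="wf1")
merged = Terminal("t2", form="trees", pos="N")
```

In the separated design, the data categories of the word form stay in their own layer and the terminal carries only a reference; in the merged design, both sets of categories sit on the same object.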

3 as a SynAF compliant Tiger<br />

The main characteristics of the datamodel can be summarized as follows:

• Terminal nodes are implemented as elements, either referring to a textual segment or pointing to a word form in a morphosyntactically annotated corpus. Non-terminal nodes are implemented as elements and are used to represent hierarchical structures like syntactic trees.

• The Edge component is implemented by means of an edge element, which may relate terminal or non-terminal elements to each other.

• A generic @type attribute inside terminal, non-terminal and edge elements can further qualify the meaning of these elements. For instance, a non-terminal node can be qualified as a verbal complex, a terminal can be specified as a compound node, etc.

• The terminal, non-terminal and edge elements also serve as placeholders for further linguistic annotations. Such annotations can freely be added as generic attribute-value pairs with no restrictions to
