06.07.2014 Views

A Treebank-based Investigation of IPP-triggering Verbs in Dutch


                 P         R         F1
Nominal
  MUC            75.33%    81.33%    78.21%
  B³             72.80%    83.95%    77.97%
  BLANC          90.5%     87.57%    88.98%
Pronominal       76.9%     60.0%     67.4%

Table 2: Results of the anaphora resolution system.

5 Experimental Results

We evaluate the two coreference resolution processes with different strategies, since they link coreferences in different ways. The pronominal anaphora resolution process returns the five most probable antecedents for each anaphor, while the nominal coreference resolution process returns, for each nominal coreference, a cluster of coreferential mentions.

The evaluation metrics are chosen to suit each kind of output. We use the classic measures (precision, recall and F1) to evaluate the pronominal anaphora resolution process, counting as a success each instance in which the real antecedent of the pronominal anaphor is among the five most probable antecedents. To evaluate nominal coreferences, we use the BLANC, MUC and B³ metrics, as they are the three most significant metrics used for this task.
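The top-five success criterion for pronominal anaphora can be illustrated with a minimal sketch. The text does not spell out how precision and recall are counted, so the sketch assumes that precision is computed over the anaphors for which the system proposes at least one candidate, and recall over all gold anaphors. The function name `top_k_scores` and the identifiers in the toy data are hypothetical.

```python
from typing import Dict, List, Tuple


def top_k_scores(
    gold: Dict[str, str],
    ranked: Dict[str, List[str]],
    k: int = 5,
) -> Tuple[float, float, float]:
    """Precision, recall and F1 under a top-k success criterion.

    gold:   anaphor id -> id of its real antecedent
    ranked: anaphor id -> candidate antecedents, most probable first
            (an anaphor that is missing or has an empty list counts
            as "not attempted"; this is an assumption of the sketch)
    """
    attempted = [a for a in gold if ranked.get(a)]
    # A hit: the real antecedent is among the k most probable candidates.
    hits = sum(1 for a in attempted if gold[a] in ranked[a][:k])
    precision = hits / len(attempted) if attempted else 0.0
    recall = hits / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy data (hypothetical): four anaphors, one of them left unanswered.
gold = {"a1": "m3", "a2": "m7", "a3": "m1", "a4": "m9"}
ranked = {
    "a1": ["m2", "m3"],                          # hit at rank 2
    "a2": ["m1", "m2", "m4", "m5", "m6", "m7"],  # real antecedent at rank 6: miss
    "a3": ["m1"],                                # hit at rank 1
}
precision, recall, f1 = top_k_scores(gold, ranked)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

In the toy data, two of the three attempted anaphors are hits, and four anaphors exist in total, so precision and recall differ in the same direction as the pronominal row of Table 2.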

We use 1004 NPs to develop our nominal coreference resolution system and 281 NPs to evaluate it. For the evaluation of our pronominal anaphora resolution system, we use 130 pronominal anaphors from those 1285 NPs.

We present the results of our coreference resolution system in Table 2. The nominal coreference resolution system obtains an F-score of approximately 78% or higher under each of the three above-mentioned metrics (77.97% to 88.98%). The pronominal coreference resolution system, on the other hand, obtains an F-score of 67.4%. Although these results are not the best obtained by coreference resolution systems, they provide a solid base for improving our system and indicate that it is of considerable use in speeding up the manual annotation of nominal and pronominal anaphora. This, in turn, will allow us to create a broader corpus and use it to improve our hybrid approach to automatic corpus annotation.
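As a quick consistency check, the F1 values in Table 2 can be recomputed from the reported precision and recall as their harmonic mean. The sketch below does this for the MUC, B³ and pronominal rows; the helper name `f1_score` is hypothetical. BLANC is left out because its F-score is defined as the average of two per-link-type F-scores, so it cannot be derived from the overall P and R alone.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both on the same scale)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# (P, R, reported F1) in percent, copied from Table 2.
table2 = {
    "MUC": (75.33, 81.33, 78.21),
    "B3": (72.80, 83.95, 77.97),
    "Pronominal": (76.9, 60.0, 67.4),
}

for name, (p, r, reported) in table2.items():
    recomputed = f1_score(p, r)
    print(f"{name}: recomputed {recomputed:.2f}, reported {reported:.2f}")
```

The recomputed values agree with the reported ones up to rounding in the last digit.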

6 Conclusions and Future Work

In this work we present a system for automatically annotating nominal and pronominal coreferences using a combination of rules and ML methods. Our work begins by detecting incorrectly tagged NPs and, in most cases, correcting them, recovering 63% of the incorrectly tagged NPs. Next, in the case of the nominal coreferences, we divide the NPs into different groups according to their morphological features to find coreferences among the compatible groups. Then we use an ML approach to solve pronominal anaphora; this returns, for each anaphor, a cluster that contains
