06.07.2014 Views

A Treebank-based Investigation of IPP-triggering Verbs in Dutch


[Figure: aligned constituency trees, with part-of-speech and function labels, for the French sentence "Pourtant, même cette dernière partie de notre aventure au Mont Blanc est vécue avec un sentiment de bonheur." and the German sentence "Doch auch dieser letzte Teil unserer Montblanc-Abenteuer wird glücklich überstanden."]

Figure 4: Aligned French-German tree pair from the Alpine treebank

The scores normally indicate the percentage of overlapping n-grams between the reference phrase (checkpoint instance) and the output produced by the MT system. However, in this context, the scores reported for the automatic alignments do not reflect the quality of the MT system. The evaluation module takes the same input in all three cases, except for the alignments, which are computed in different ways and generate different outcomes accordingly. Therefore the scores should be seen as estimates of the accuracy of the evaluation. The more precise the alignments, the more reliable the evaluation results.
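As a rough illustration of what such an overlap score measures, the sketch below computes the fraction of reference n-grams that also appear in an MT output. It is a minimal sketch under stated assumptions, not the evaluation module used here: the function names, the choice of unigrams plus bigrams, the count clipping, and the example MT output are all hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_overlap(reference, hypothesis, max_n=2):
    """Fraction of reference n-grams (n = 1..max_n) also found in the hypothesis.

    Counts are clipped: a reference n-gram is matched at most as often as it
    occurs in the hypothesis. This scoring scheme is an assumption made for
    illustration only.
    """
    matched = total = 0
    for n in range(1, max_n + 1):
        ref_counts = ngrams(reference, n)
        hyp_counts = ngrams(hypothesis, n)
        matched += sum(min(c, hyp_counts[g]) for g, c in ref_counts.items())
        total += sum(ref_counts.values())
    return matched / total if total else 0.0


# Checkpoint instance taken from the German sentence in Figure 4.
reference = "dieser letzte Teil".split()
# Invented MT output, for illustration only.
hypothesis = "auch dieser letzte Abschnitt wird überstanden".split()

score = ngram_overlap(reference, hypothesis)
print(f"overlap score: {score:.3f}")
```

If the alignment selects the wrong target words as the reference phrase, the same MT output is scored against a different reference, which is why the score depends on alignment quality and not only on the MT system.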

We notice that the domain of the texts used for training GIZA++ does not significantly influence the accuracy, since the resulting scores are similar (e.g. less than 2% difference between Europarl and the Alpine texts). However, when we compare the evaluation results obtained with automatic alignments to those obtained with manual alignments, the latter are significantly better (up to 12% increase). This finding demonstrates the validity of our claim, namely that feeding manually proofed alignments from a parallel treebank to the evaluation pipeline generates more reliable results.

Checkpoint      Alignment type      Final score
Verb            GIZA++: Europarl    0.19029
Verb            GIZA++: Alpine      0.19178
Verb            Parallel Treebank   0.28365
Det+Noun+Adj    GIZA++: Europarl    0.22882
Det+Noun+Adj    GIZA++: Alpine      0.24099
Det+Noun+Adj    Parallel Treebank   0.48017

Table 1: Evaluation results for different alignments<br />
