24.02.2013 Views

Proceedings of the LFG 02 Conference National Technical - CSLI ...

Proceedings of the LFG 02 Conference National Technical - CSLI ...

Proceedings of the LFG 02 Conference National Technical - CSLI ...

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

compaction techniques [Krotov et al, 1998]. Our results are summarised in <strong>the</strong> following<br />

tables.<br />

Pipeline Integrated<br />

PCFG PCFG+P A-PCFG A-PCFG+P<br />

# Rules 19439 30<strong>02</strong>6 29216 35815<br />

F-Score Labelled 77.37 80.49 81.26 81.18<br />

Full F-Score Unlabelled 79.89 82.56 83.16 83.08<br />

Grammar F-Str Fragm. 95.80 95.64 94.69 94.93<br />

F-Score Gold Stnd. 51.98 56.28 60.66 60.21<br />

# Parses 2240 2227 2223 2207<br />

All fractions are percentages. F-scores are calculated as<br />

2 precision recall/(precision recall). The first row shows <strong>the</strong> number <strong>of</strong><br />

rules extracted for each grammar. PCFG+P is <strong>the</strong> standard PCFG with <strong>the</strong> parent<br />

transformation and A-PCFG+P is <strong>the</strong> annotated PCFG with <strong>the</strong> parent transformation<br />

(here <strong>the</strong> functional annotations <strong>of</strong> <strong>the</strong> mo<strong>the</strong>r nodes do not carry over to<br />

daughter nodes). Interestingly, PCFG+P and <strong>the</strong> automatically annotated A-PCFG<br />

are <strong>of</strong> roughly equal size. The next two rows show labelled and unlabelled f-score<br />

(composite precision and recall) on trees against <strong>the</strong> section 23 reference trees as<br />

computed by evalb. The parent transfom shows a 3% advantage over <strong>the</strong> PCFG<br />

while our A-PCFG performs slightly better with an almost 4% advantage for<br />

labelled f-score. Interestingly, combining <strong>the</strong> parent transform with our automatic<br />

f-structure annotation does not seem to improve results. The next row measures<br />

f-structure fragmentation. Our current results seem to indicate that (surprisingly<br />

perhaps) <strong>the</strong> pipeline parsing architecture outperforms <strong>the</strong> integrated model with<br />

respect to f-structure fragmentation. The next row reports f-score results against<br />

<strong>the</strong> hand-coded, gold-standard reference f-structures for <strong>the</strong> 105 randomly selected<br />

trees in section 23. The results show clearly that currently <strong>the</strong> f-structures generated<br />

by <strong>the</strong> integrated model are <strong>of</strong> higher quality than <strong>the</strong> ones generated by <strong>the</strong><br />

pipeline model.<br />

Pipeline Integrated<br />

PCFG PCFG+P A-PCFG A-PCFG+P<br />

# Rules 7906 12117 11545 13970<br />

F-Score Labelled 76.81 79.79 80.08 80.39<br />

Threshold 1 F-Score Unlabelled 79.41 81.92 82.09 82.30<br />

F-Str Fragm. 96.14 93.92 94.79 95.41<br />

F-Score Gold Stnd. 51.99 56.07 59.15 60.42<br />

# Parses 2227 2173 2189 2115<br />

This table reports on a simple thresholding grammar compaction experiment<br />

with a threshold set to 1. This means that every rule that occurs only once in <strong>the</strong><br />

88

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!