W10-09
parsing using a stack-based, shift-reduce algorithm with runtime that is linear in the input length. This lightweight approach is very efficient; however, it may not be quite as accurate as more complex, chart-based approaches (e.g., the approach of Charniak and Johnson (2005) for syntactic parsing).

We trained the discourse parser over the causal and temporal relations contained in the RST corpus. Examples of these relations are shown below:
(1) [cause Packages often get buried in the load] [result and are delivered late.]
(2) [before Three months after she arrived in L.A.] [after she spent $120 she didn't have.]
The RST corpus defines many fine-grained relations that capture causal and temporal properties. For example, the corpus differentiates between result and reason for causation, and temporal-after and temporal-before for temporal order. In order to increase the amount of available training data, we collapsed all causal and temporal relations into two general relations, causes and precedes. This step required normalization of asymmetric relations such as temporal-before and temporal-after.
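The collapsing and normalization step can be sketched as a simple lookup table. The exact fine-grained relation inventory and the direction conventions below are assumptions for illustration, not the authors' published mapping:

```python
# Hypothetical sketch of collapsing fine-grained RST relations into the
# two generalized relations. The swap flag normalizes asymmetric
# relations so arguments always appear in canonical order
# (cause before effect, earlier event before later event).
COLLAPSE = {
    "result":          ("causes",   False),
    "reason":          ("causes",   True),   # assumed reversed direction
    "temporal-before": ("precedes", False),
    "temporal-after":  ("precedes", True),
}

def collapse_relation(label, arg1, arg2):
    """Return (collapsed_label, arg1, arg2) in canonical argument order."""
    collapsed, swap = COLLAPSE[label]
    if swap:
        arg1, arg2 = arg2, arg1
    return collapsed, arg1, arg2
```

Under this convention, a temporal-after instance such as example (2) would be rewritten as a precedes relation with the earlier event as its first argument.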
To evaluate the discourse parser described above, we manually annotated 100 randomly selected weblog stories from the story corpus produced by Gordon and Swanson (2009). For increased efficiency, we limited our annotation to the generalized causes and precedes relations described above. We attempted to keep our definitions of these relations in line with those used by RST. Following previous discourse annotation efforts, we annotated relations over clause-level discourse units, permitting relations between adjacent sentences. In total, we annotated 770 instances of causes and 1,009 instances of precedes.
We experimented with two versions of the RST parser, one trained on the fine-grained RST relations and the other trained on the collapsed relations. At testing time, we automatically mapped the fine-grained relations to their corresponding causes or precedes relation. We computed the following accuracy statistics:
Discourse segmentation accuracy: For each predicted discourse unit, we located the reference discourse unit with the highest overlap. Accuracy for the predicted discourse unit is equal to the percentage word overlap between the reference and predicted discourse units.
Argument identification accuracy: For each discourse unit of a predicted discourse relation, we located the reference discourse unit with the highest overlap. Accuracy is equal to the percentage of times that a reference discourse relation (of any type) holds between the reference discourse units that overlap most with the predicted discourse units.
Argument classification accuracy: For the subset of instances in which a reference discourse relation holds between the units that overlap most with the predicted discourse units, accuracy is equal to the percentage of times that the predicted discourse relation matches the reference discourse relation.
Complete accuracy: For each predicted discourse relation, accuracy is equal to the percentage word overlap with a reference discourse relation of the same type.
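The segmentation metric above can be sketched in a few lines. Here discourse units are assumed to be word-index spans `(start, end)` with an exclusive end, and "percentage word overlap" is read as intersection over union; both are assumptions about details the text leaves open:

```python
# Minimal sketch of the discourse segmentation accuracy described above.
def overlap(a, b):
    """Number of word positions shared by two (start, end) spans."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))

def segmentation_accuracy(predicted, reference):
    """For each predicted unit, find the reference unit with the highest
    overlap, score it by percentage word overlap (here: intersection over
    union), and average over all predicted units."""
    scores = []
    for p in predicted:
        best = max(reference, key=lambda r: overlap(p, r))
        union = max(p[1], best[1]) - min(p[0], best[0])
        scores.append(overlap(p, best) / union)
    return 100.0 * sum(scores) / len(scores)
```

A predicted unit that exactly matches a reference unit scores 100%; one covering half of its best-matching reference unit scores 50%.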
Table 1 shows the accuracy results for the fine-grained and collapsed versions of the RST discourse parser. As shown in Table 1, the collapsed version of the discourse parser exhibits higher overall accuracy. Both parsers predicted the causes relation much more often than the precedes relation, so the overall scores are biased toward the scores for the causes relation. For comparison, Sagae (2009) evaluated a similar RST parser over the test section of the RST corpus, obtaining precision of 42.9% and recall of 46.2% (F1 = 44.5%).
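The reported comparison figures are internally consistent: F1 is the harmonic mean of precision and recall.

```python
# Check that the cited precision and recall yield the cited F1.
p, r = 0.429, 0.462
f1 = 2 * p * r / (p + r)
print(round(100 * f1, 1))  # -> 44.5
```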
In addition to the automatic evaluation described above, we also manually assessed the output of the discourse parsers. One of the authors judged the correctness of each extracted discourse relation, and we found that the fine-grained and collapsed versions of the parser performed equally well, with a precision near 33%; however, throughout our experiments, we observed more desirable discourse segmentation when working with the collapsed version of the discourse parser. This fact, combined with the results of the automatic evaluation presented above,