Componential Analysis for Recognizing Textual Entailment

… in performing the various tasks.

The KMS text processing component consists of three elements: (1) a sentence splitter that separates the source documents into individual sentences; (2) a full sentence parser which produces a parse tree containing the constituents of the sentence; and (3) a parse tree analyzer that identifies important discourse constituents (sentences and clauses, discourse entities, verbs and prepositions) and creates an XML-tagged version of the document.

The XML representations of the documents are used in performing the various KMS tasks. To perform the RTE task, we made use of summarization and question answering modules, each of which employs lower level modules for dictionary lookup, WordNet analysis, linguistic testing, and XML functions. Litkowski (2006), Litkowski (2005a), and Litkowski (2005b) provide more details on the methods used in TREC question answering and DUC summarization.

3 System for Assessing Textual Entailment

To perform the RTE task, we developed a graphical user interface on top of various modules from KMS, as appropriate. The development of this interface is in itself illuminating about factors that appear relevant to the task.

KMS is document-centric, so it was first necessary to create an appropriate framework for analyzing each instance of the RTE data sets (working initially with only the development set). Since these data were available in XML, we were able to exploit KMS' underlying XML functionality to read the files. We first created a list box for displaying information about each instance as the file was read. Initially, this list box contained a checkbox for each item (so that subsets of the data could be analyzed), its ID, its task, its entailment, an indication of whether the text and the hypothesis were properly parsed, the results of our evaluation, and a confidence score (used initially, but then discarded since we did not develop this aspect further). Subsequently, we added columns to record and characterize any problem with our evaluation and to identify the main verb in the hypothesis.

The interface was designed with text boxes so that an item could be selected from the instances and both the text and the hypothesis could be displayed. We associated a menu of options with the list box so that we could perform various tasks. Initially, the options consisted of (1) selecting all items, (2) clearing all selections, and (3) parsing all items.

The first step in performing the RTE task was to parse the texts and hypotheses and to create XML representations for further analysis. We were able to incorporate KMS' routines for processing each text and each hypothesis as a distinct "document" (applying KMS' sentence splitting, parsing, discourse analysis, and XML representation routines).[1] After performing this step (taking about 15 minutes for the full set), it was found that several texts had not been parsed, due to a bug in a sentence splitting routine. As a result, another option was added to reparse selected items, useful when corrections were made to underlying routines. The result of this parsing step was the creation of an XML rendition of the entire RTE set, approximately 10 times the size of the original data.[2]
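The paper does not show the KMS routines themselves, but the preprocessing step described above (splitting each text and hypothesis into sentences, analyzing them, and emitting an XML-tagged "document" for each RTE pair) can be pictured with a minimal sketch along the following lines. All function names, the placeholder analysis, and the XML layout are illustrative assumptions, not KMS code:

    import re
    import xml.etree.ElementTree as ET

    def split_sentences(text):
        # Stand-in for the KMS sentence splitter: a naive split on
        # sentence-final punctuation followed by whitespace.
        return [s for s in re.split(r'(?<=[.!?])\s+', text.strip()) if s]

    def analyze_sentence(sentence):
        # Stand-in for the full parser and parse tree analyzer; a real
        # system would extract discourse entities, verbs, and
        # prepositions from the parse tree.
        return {"text": sentence, "entities": [], "verbs": []}

    def to_xml(pair_id, text, hypothesis):
        # Build an XML-tagged rendition of one RTE pair, treating the
        # text and the hypothesis as two miniature "documents".
        pair = ET.Element("pair", id=str(pair_id))
        for role, content in (("text", text), ("hypothesis", hypothesis)):
            doc = ET.SubElement(pair, role)
            for i, sent in enumerate(split_sentences(content), start=1):
                info = analyze_sentence(sent)
                s = ET.SubElement(doc, "sentence", n=str(i))
                s.text = info["text"]
        return ET.tostring(pair, encoding="unicode")

    print(to_xml(1, "The cat sat on the mat. It purred.", "A cat sat on a mat."))

Applied over the whole data set, a loop of this kind would produce the enlarged XML rendition mentioned above, ready for the entailment evaluation described next.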
The next extension of the interface was the addition of an option to make our evaluation of whether the texts entailed the hypotheses. Our initial implementation of this evaluation was drawn from the KMS summarization functionality. As used in multi-document DUC summarization (see Litkowski, 2005b, for details), KMS extracts top sentences that have a high match with either the terms in the documents or the terms in a topic description. KMS has generally performed quite well in DUC, primarily through its use of an overlap assessment that excludes relevant sentences that are highly repetitive of what has already been included in a growing summary. A key feature of that success is the use of anaphoric references in place of the anaphors. While this feature is significant in multi-document summarization, it is less so for the RTE task. Notwithstanding, this overlap assessment is the basis for the RTE judgment.

The overlap analysis is not strict, but rather based on an assessment of "preponderance." In RTE, the analysis looks at each discourse entity in the hypothesis and compares it to the discourse …

[1] Since only a relatively small number of the RTE texts consisted of more than one sentence, the use of KMS discourse analysis functionality was minimal.
[2] The developers of the RTE data sets are to be commended for the integrity of the data. Processing of the data proceeded quite smoothly, enabling us to focus on the task, rather than dealing with problems in the underlying data.
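The "preponderance" assessment is described only informally above. A rough sketch of the idea, counting what fraction of the hypothesis's discourse entities find a match among the text's discourse entities and accepting entailment when that fraction is high enough, might look like the following. The matching test, the 0.75 threshold, and all names are assumptions for illustration; the actual KMS comparison draws on dictionary lookup, WordNet analysis, and linguistic tests:

    def entity_match(hyp_entity, text_entities):
        # Stand-in for the KMS matching tests; here, a simple
        # case-insensitive substring comparison.
        h = hyp_entity.lower()
        return any(h == t.lower() or h in t.lower() for t in text_entities)

    def preponderance(hyp_entities, text_entities, threshold=0.75):
        # Judge entailment when a "preponderance" of the hypothesis's
        # discourse entities are covered by the text's discourse entities.
        # The threshold is an assumed value, not taken from the paper.
        if not hyp_entities:
            return False
        matched = sum(1 for e in hyp_entities if entity_match(e, text_entities))
        return matched / len(hyp_entities) >= threshold

    # Example: three of the four hypothesis entities are covered, so the
    # non-strict overlap test accepts the entailment.
    print(preponderance(["cat", "mat", "room", "floor"],
                        ["the cat", "a mat", "the room"]))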
