29.01.2014 Views

Extractive Summarization of Development Emails


individual vote of each participant, which is already included in the sum stored in the column "hmn_score") that could distort the results. We started the environment with the following settings:

• We chose ClassifierSubsetEval 15 as the attribute evaluator; it uses a classifier to estimate the "merit" of a set of attributes.

• The search method was BestFirst, "which explores a graph by expanding the most promising node chosen according to a specified rule" 16 .

• We set "class" as the feature to evaluate.
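To make the combination of a subset evaluator and a best-first search more concrete, here is a minimal sketch in Python. It is not Weka's implementation: the function `best_first_subset`, the stopping parameter `max_stale`, the attributes `noise_a` and `noise_b`, and the hand-written `toy_merit` function (with its weights) are all assumptions made for illustration. In the setup described above, the merit would instead come from a classifier evaluated on the candidate attribute subset.

```python
import heapq
from itertools import count


def best_first_subset(attributes, merit, max_stale=5):
    """Forward best-first search over attribute subsets.

    merit: callable mapping a frozenset of attribute names to a float.
    max_stale: stop after this many consecutive expansions that do not
               improve on the best merit seen so far (an assumed rule).
    """
    start = frozenset()
    tie = count()  # tie-breaker so the heap never has to compare frozensets
    open_heap = [(-merit(start), next(tie), start)]
    visited = {start}
    best_subset, best_merit = start, float("-inf")
    stale = 0

    while open_heap and stale < max_stale:
        # Expand the most promising node, i.e. the one with the highest merit.
        neg_merit, _, subset = heapq.heappop(open_heap)
        if -neg_merit > best_merit:
            best_subset, best_merit = subset, -neg_merit
            stale = 0
        else:
            stale += 1
        # Children are all subsets obtained by adding one more attribute.
        for attr in attributes:
            child = subset | {attr}
            if child not in visited:
                visited.add(child)
                heapq.heappush(open_heap, (-merit(child), next(tie), child))

    return best_subset, best_merit


# Hypothetical weights, NOT taken from the paper: they stand in for the
# accuracy gain a classifier would report when an attribute is added.
USEFUL = {"chars": 0.30, "rel_pos_norm": 0.25, "num_verbs_norm": 0.10}
ATTRS = ["chars", "rel_pos_norm", "num_verbs_norm", "noise_a", "noise_b"]


def toy_merit(subset):
    # Reward useful attributes and charge a small cost per attribute used.
    return sum(USEFUL.get(a, 0.0) for a in subset) - 0.02 * len(subset)


selected, score = best_first_subset(ATTRS, toy_merit)
print(sorted(selected), round(score, 2))
```

With these made-up weights the search keeps the three useful attributes and drops the two noise attributes, because adding either of the latter only lowers the merit.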

Figure 11. Weka processing platform

Weka Explorer returned a relevance tree [Figure 12] stating that the 6 attributes which determine the relevance of a sentence are chars, num_nouns_norm, num_stopw_norm, num_verbs_norm, rel_pos_norm, and subj_words_norm. Using the tree, we can decide whether a sentence should be included in the summary simply by taking its values for these 6 attributes and following the conditions written in the tree, starting from the root. If we arrive at a leaf labeled "relevant", we include the sentence in the summary; otherwise we do not.
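The root-to-leaf walk described above can be sketched as follows. The tree in this example is purely hypothetical: its thresholds, its branching order, the "not_relevant" label, and the helper names `classify` and `include_in_summary` are invented for illustration and use only three of the six attributes. The actual conditions are those of the tree in Figure 12.

```python
# Hypothetical tree: thresholds and structure are assumptions, not the
# values learned by Weka. Inner nodes test one attribute against a
# threshold; leaves carry a label.
TREE = {
    "attr": "chars", "threshold": 40,
    "low": {"label": "not_relevant"},
    "high": {
        "attr": "rel_pos_norm", "threshold": 0.5,
        "low": {"label": "relevant"},
        "high": {
            "attr": "num_verbs_norm", "threshold": 0.1,
            "low": {"label": "not_relevant"},
            "high": {"label": "relevant"},
        },
    },
}


def classify(features, tree=TREE):
    """Walk from the root to a leaf and return the leaf's label."""
    node = tree
    while "label" not in node:
        value = features[node["attr"]]
        # Take the "low" branch when value <= threshold, else "high".
        node = node["low"] if value <= node["threshold"] else node["high"]
    return node["label"]


def include_in_summary(features):
    """A sentence is kept only if the walk ends at a 'relevant' leaf."""
    return classify(features) == "relevant"


sentence = {"chars": 120, "rel_pos_norm": 0.2, "num_verbs_norm": 0.0}
print(include_in_summary(sentence))
```

Representing the tree as nested dictionaries keeps the traversal to a single loop, so swapping in the real conditions would only require changing the data, not the code.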

15 http://bio.informatics.indiana.edu/ml_docs/weka/weka.attributeSelection.ClassifierSubsetEval.html

16 http://en.wikipedia.org/wiki/Best-first_search
