29.01.2014 Views

Extractive Summarization of Development Emails

Extractive Summarization of Development Emails

Extractive Summarization of Development Emails

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

in charge <strong>of</strong> some "creative" production is always error prone, because the possibility <strong>of</strong> receiving a text with wrong<br />

concepts is high.<br />

The solution we propose is: Extracting the most relevant sentences from an email, aggregating them in a<br />

logical order, and returning this text to the user. This choice stems from the following reasons:<br />

• Since we use parts <strong>of</strong> the sentences included in the original email, composing new phrases with gibberish<br />

synonyms is improbable. Even if the logic among sentences might miss something, we are sure that the<br />

internal meaning <strong>of</strong> every single sentence remains exactly what it is supposed to be.<br />

• An extractive process is more lightweight than an intelligent procedure <strong>of</strong> summary composition. This is very<br />

frequently translated in less time needed to get the result.<br />

• When a user reads some sentence in the summary, he can quickly find it in the original email if needed. On<br />

the contrary, if he has a summary composed <strong>of</strong> different words (but same concepts) and wants to recover the<br />

exact point in the email where that topic was treated, he has to go through the entire conversation.<br />

1.4 Structure <strong>of</strong> the document<br />

In Section 2 we analyze some documents treating summarization also in other fields like bug reports, speeches, etc.<br />

In Section 3 we explain the several steps we scale to create a benchmark to be used in machine learning. In Section 4<br />

we described the systems we implemented to summarize emails with sentence extraction technique. In Section 5 we<br />

briefly analyze the quality <strong>of</strong> the summaries we managed to automatically create with our system, and we propose<br />

some optimizations that could be developed in the future.<br />

4

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!