Extractive Summarization of Development Emails
in charge of some "creative" production is always error prone, because the possibility of receiving a text with wrong
concepts is high.
The solution we propose is to extract the most relevant sentences from an email, aggregate them in a
logical order, and return this text to the user. This choice stems from the following reasons:
• Since we reuse a subset of the sentences included in the original email, composing new phrases with nonsensical
synonyms is improbable. Even if the logical connection among sentences might miss something, we are sure that the
meaning of every single sentence remains exactly what it was in the original.
• An extractive process is more lightweight than an intelligent procedure of summary composition. This very
often translates into less time needed to obtain the result.
• When a user reads a sentence in the summary, they can quickly find it in the original email if needed. By
contrast, if the summary is composed of different words (but the same concepts) and the user wants to recover the
exact point in the email where that topic was discussed, they have to go through the entire conversation.
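
The extract, reorder, and return steps described above can be sketched in a few lines of Python. This is only an illustrative sketch and not the system described in this document: the function names (`split_sentences`, `extractive_summary`), the naive sentence splitter, and the word-frequency relevance score are all assumptions chosen for brevity and stand in for whatever relevance model is actually used.

```python
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extractive_summary(email_text, max_sentences=2):
    """Return up to max_sentences sentences copied verbatim from email_text."""
    sentences = split_sentences(email_text)
    words_per_sentence = [re.findall(r"[a-z']+", s.lower()) for s in sentences]

    # Placeholder relevance score (assumption): the average frequency, over the
    # whole email, of the words in each sentence.
    freq = Counter(w for words in words_per_sentence for w in words)
    scores = [
        sum(freq[w] for w in words) / len(words) if words else 0.0
        for words in words_per_sentence
    ]

    # Select the highest-scoring sentences, then restore their original order
    # so that the summary follows the flow of the email.
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    chosen = sorted(top[:max_sentences])
    return [sentences[i] for i in chosen]


# Hypothetical email used only for illustration.
email = (
    "The build fails on Windows. I think the parser is broken. "
    "Thanks for your help! The parser fails when the build runs on Windows."
)
summary = extractive_summary(email, max_sentences=2)
print(" ".join(summary))
```

Because every selected sentence is copied unchanged, each one can be located in the original email with a plain substring search, which is the traceability property argued for in the last point of the list.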
1.4 Structure of the document
In Section 2 we review related work on summarization, including work in other fields such as bug reports and speeches.
In Section 3 we explain the steps we took to create a benchmark to be used in machine learning. In Section 4
we describe the systems we implemented to summarize emails with the sentence extraction technique. In Section 5 we
briefly analyze the quality of the summaries we managed to create automatically with our system, and we propose
some optimizations that could be developed in the future.