Development of a Stemmer for the Greek Language - SAIS
Development of a Stemmer for the Greek Language - SAIS
Development of a Stemmer for the Greek Language - SAIS
Create successful ePaper yourself
Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.
as well and that’s why <strong>the</strong>re are two stems <strong>for</strong> every verb. So we uphold this<br />
distinction and we accept that a verb has a different stem on <strong>the</strong> past tenses and on<br />
<strong>the</strong> o<strong>the</strong>r tenses.<br />
Present/Future Tenses Past Tenses Present stem Past stem<br />
∆ΕΝΩ<br />
Ε∆ΕΣΑ ∆ΕΝ ∆ΕΣ<br />
(I tide)<br />
(I tided)<br />
ΦΕΥΓΩ<br />
ΕΦΥΓΑ ΦΕΥΓ ΕΦΥΓ<br />
(I leave)<br />
(I left)<br />
ΠΑΙΖΩ<br />
ΕΠΑΙΞΑ ΠΑΙΖ ΕΠΑΙΞ<br />
(I play)<br />
(I played)<br />
Table 1: Present and Past stem <strong>for</strong> <strong>the</strong> same word<br />
An accurate <strong>Greek</strong> stemmer should deal with both <strong>the</strong> inflectional and<br />
derivational endings (Kalamboukis 1995). But <strong>the</strong> <strong>Greek</strong> language is rich in<br />
derivative words. This means that we can have many words coming from <strong>the</strong><br />
same stem. And if we think that <strong>the</strong>re could be more than 10 derivational endings<br />
with around 50 inflectional endings each we have a list <strong>of</strong> around 500 words<br />
belongs in <strong>the</strong> same family and having <strong>the</strong> same stem. As we want to apply <strong>the</strong><br />
<strong>Greek</strong> stemmer on a search engine, this will not be useful, because we will have<br />
too generic results. So we will deal only with inflectional endings.<br />
Extended constrains about <strong>the</strong> <strong>Greek</strong> stemming algorithm are following in <strong>the</strong><br />
second part <strong>of</strong> <strong>the</strong> <strong>the</strong>sis.<br />
10