09.11.2014 Views

SEEU Review vol. 5 Nr. 2 (pdf) - South East European University


Mentor Hamiti, Visar Shehu and Agni Dika

The compound letters, also known as bigrams (digraphs), are formed by joining two Latin characters into one letter that represents a single phoneme in the Albanian language. There are nine in total [3]: dh, gj, ll, nj, rr, sh, th, xh, zh. These letters pose another challenge for computer analysis, because each combination of two characters must be treated as a single letter; that is, a two-character string must be handled as one character. The problem is usually solved programmatically, with algorithms that recognize these combinations.
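As an illustration of the kind of algorithm meant here, the following minimal sketch splits a word into Albanian letters, treating each of the nine compound letters as a single unit. It is not the authors' implementation: the function name `split_letters` is hypothetical, and the greedy left-to-right matching is an assumption that does not handle words in which two adjacent characters happen to belong to different letters.

```python
# The nine Albanian compound letters listed in the text.
DIGRAPHS = ("dh", "gj", "ll", "nj", "rr", "sh", "th", "xh", "zh")


def split_letters(word: str) -> list:
    """Split a word into Albanian letters.

    Assumption: matching is greedy and left to right, so any two
    adjacent characters that spell a compound letter are always
    taken together as one letter.
    """
    letters = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair.lower() in DIGRAPHS:
            # Two characters, one letter: keep the original casing.
            letters.append(pair)
            i += 2
        else:
            letters.append(word[i])
            i += 1
    return letters


if __name__ == "__main__":
    for sample in ("shqip", "rrugë", "Xhami", "dita"):
        print(sample, split_letters(sample))
```

With this representation, the length of the returned list is the number of letters in the word, which differs from the number of characters whenever a compound letter is present.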

Based on the two above-mentioned characteristics, we can conclude that existing software for the textual analysis of other languages, for example English, cannot be used for a thorough analysis of texts written in Albanian, because Albanian is more complex in this respect. For this reason, we have no choice but to use programming languages and write an original application specifically for the textual analysis of Albanian texts.

III. 2. Rules for the division of words into syllables and the respective algorithm

To identify syllables programmatically, strict rules must be defined in advance, based on theoretical and professional knowledge of linguistics and covering the possible configurations of syllables. An algorithm is therefore needed that takes into account all possible vowel-consonant letter combinations and, according to the defined rules, makes correct decisions about the division of words into syllables.

To respect the rules of phonetic separation of words in the Albanian language, to cover syllables of all possible configurations, and to design an effective algorithm, we distinguish five categories with the following conditions:

1. A word has as many syllables as the number of vowels V within the word. A single vowel can also represent a syllable.

2. The case VCV is always treated as V-CV. That is, a consonant between two vowels regularly goes with the second vowel when forming a syllable. Ex: a-ra, di-ta, na-ta, je-ta, etc.

3. A more complicated case arises when there are two consonants between two vowels, VC1C2V; then two cases occur:
