Letter-to-Sound Conversion for Urdu Text-to-Speech System


ا ب پ ت ٹ ث ج چ ح خ د ڈ ذ ر ڑ ز ژ س ش ص ض ط ظ ع غ ف ق ك گ ل م ن و ہ ئ ى ے
ھ ة ں آ
َ ِ ُ ْ ّ ً ٰ

Table 1: Urdu basic (top) and secondary (middle) letters and aerab (bottom)

Combinations of these characters realize a rich inventory of 44 consonants, 8 long oral vowels, 7 long nasal vowels, 3 short vowels and numerous diphthongs (e.g. Saleem et al. 2002, Hussain 1997; the set of Urdu diphthongs is still under analysis). This phonemic inventory is given in Table 2. The italicized phonemes, whose existence is still not determined, are not considered any further (see Saleem et al. 2002 for further discussion). The mapping of this phonetic inventory to the characters given in Table 1 is discussed later.

(a) p b p b m m
    t d t d n n k k t d t d q f v s z x h
    r r j l l
(b) i e æ u o
    i e æ u o

Table 2: Urdu (a) consonantal and (b) vocalic phonemic inventory

3 NLP for Urdu TTS

As discussed earlier, enabling a text-to-speech system for any language requires a Natural Language Processing component. The NLP system may have differing requirements for different languages. However, it always takes raw text as input and outputs a precise phonetic transcription for the language. The system can be divided into two parts, the Text Normalization Component and the Phonological Processing Component. These components may be further divided. A simplified schematic is shown in Figure 1.¹

[Figure 1: NLP architecture for Urdu TTS system. Urdu Raw Text Input → Tokenizer, Semantic Tagger, String Generator → Normalized Urdu Text → Letter to Sound Converter, Syllabifier, Sound Change Manager, Stress Marker, Intonation Marker → Annotated Phonetic Output]

¹ This diagram is based on the architecture of the Urdu Text to Speech system under development at the Center for Research in Urdu Language Processing (www.crulp.org).

Workshop on Arabic Script Based Languages, COLING 2004, Geneva
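The staged architecture of Figure 1 can be pictured as a chain of functions, each consuming the output of the previous one. The following is only a minimal sketch under stated assumptions: the stage names come from Figure 1, but `Stage`, `make_stub`, `run_pipeline` and `build_stub_pipeline` are hypothetical names introduced here, every stage is a pass-through stub, and the intermediate data is simplified to a plain string.

```python
from typing import Callable, List

# Assumption: every stage maps a string to a string. A real system would
# pass richer structures (tokens, syllables, annotated phones).
Stage = Callable[[str], str]

# Stage names as they appear in Figure 1, in processing order.
TEXT_NORMALIZATION = ["Tokenizer", "Semantic Tagger", "String Generator"]
PHONOLOGICAL_PROCESSING = [
    "Letter to Sound Converter",
    "Syllabifier",
    "Sound Change Manager",
    "Stress Marker",
    "Intonation Marker",
]


def make_stub(name: str, trace: List[str]) -> Stage:
    """Return a pass-through stage that only records that it ran."""
    def stage(data: str) -> str:
        trace.append(name)
        return data  # a real stage would transform the data here
    return stage


def build_stub_pipeline(trace: List[str]) -> List[Stage]:
    """Build stub stages for both components, normalization first."""
    names = TEXT_NORMALIZATION + PHONOLOGICAL_PROCESSING
    return [make_stub(name, trace) for name in names]


def run_pipeline(text: str, stages: List[Stage]) -> str:
    """Feed the text through every stage in order."""
    for stage in stages:
        text = stage(text)
    return text


if __name__ == "__main__":
    trace: List[str] = []
    run_pipeline("raw text", build_stub_pipeline(trace))
    print(" -> ".join(trace))
```

Keeping each stage behind the same function signature makes it possible to replace one stub at a time with a rule-based implementation without touching the rest of the chain.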
The Text Normalization component takes a character string as input and converts it into a string of letters. Within it, the Tokenizer uses the punctuation marks and spaces between words to mark token boundaries, which are then stamped as words, punctuation, date, time and other relevant categories by the Semantic Tagger. The String Generator takes any non-letter based input (e.g. a number or a date containing digits) and converts it into a letter string.

After the input is converted into a string consisting only of letters, the Phonological Processing Component generates the corresponding phonetic transcription. This is done through a series of processes. The first process is to use the Letter-to-Sound Converter (detailed below) to convert the normalized text input to a phonemic string. This process may also be referred to as grapheme-to-phoneme conversion. It is followed by the Syllabifier, which marks syllable boundaries. The intermediate output is then forwarded to a module which applies Urdu sound change rules to generate the corresponding phonetic string. Following these modules, the Stress Marker and Intonation Marker modules add stress and intonation to the string being processed. Resyllabification is also performed after sound change rules are applied, in case phones are epenthesized or deleted and syllable boundaries require re-adjustment. Urdu shows reasonably regular behavior, and most of these tasks can be achieved through rule-based systems (e.g. see Hussain 1997 for a stress assignment algorithm).

This paper focuses on Letter-to-Sound rules for Urdu, the first in the series of modules in the Phonological Processing Component.

4 Urdu Letter to Sound Rules

Urdu shows a very regular mapping from graphemes to phonemes. However, to explain the behavior, the letters need to be further classified into the following categories:
a. Consonantal characters
b. Dual (consonantal and vocalic) behavior characters
c. Vowel modifier character
d. Consonant modifier character
e. Composite (consonantal and vocalic) character

Similarly, the aerab set can also be divided into the following categories:
f. Basic vowel specifier
g. Extended vowel specifier
h. Consonantal gemination specifier
i. Dual (vocalic and consonantal) insertor

Finally, there is a third category which may take the shape of a letter and an aerab:
j. Vowel-aerab placeholder

The Consonantal characters in (a) above always represent a consonant of Urdu. In Urdu, there is always a single consonant corresponding to a single character of this category, unlike some other languages, e.g. English maps the string “ph” to the phoneme /f/. Most of the Urdu consonantal characters fall into this category. These characters and the corresponding consonantal phonemes are given in Table 3 below. A simple mapping rule would generate the phoneme corresponding to these characters.

ب پ ت ٹ ث ج چ
t d s t p b
ح خ د ڈ ذ ر ڑ
r z d x h
ز ژ س ش ص ض ط
t z s s z
ظ ع غ ف ق ك گ
k q f z
ل م ن ہ ة
t h n m l

Table 3: Consonantal characters and their corresponding phonemes

Three characters of Urdu show dual behavior, i.e. in certain contexts they transform into consonants, but in certain other contexts they transform into vowels. These characters are Alef (ا), Vao (و), and Yay (ى or ے). Alef acts exceptionally in this category and is therefore discussed separately in (j) below. Vao changes to /v/ and Yay changes to the approximant /j/ when they occur in consonantal positions (in the onset or coda of a syllable). However, when they occur as the nucleus of a syllable, they form long vowels. As an example, Yay occurs as a consonant when it occurs in the onset of the single-syllable word ر
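The two rule types described above, the one-to-one mapping for consonantal characters and the position-dependent behavior of Vao and Yay, can be illustrated with a short lookup. This is a hedged sketch, not the system's actual converter: `letter_to_sound`, `CONSONANT_MAP`, `DUAL_MAP` and `LONG_VOWEL` are hypothetical names, only a small subset of the consonantal characters is included, the syllable position is assumed to be supplied by the caller, and the vowel quality of the nucleus case is left as a placeholder because it is not specified at this point in the text.

```python
# Subset of consonantal characters, written as code points for clarity.
CONSONANT_MAP = {
    "\u0628": "b",  # Bay
    "\u067E": "p",  # Pay
    "\u062E": "x",  # Khay
    "\u0641": "f",  # Fay
    "\u0642": "q",  # Qaf
    "\u0644": "l",  # Lam
    "\u0645": "m",  # Meem
    "\u0646": "n",  # Noon
}

# Dual-behavior characters and their consonantal realizations.
# Alef is deliberately left out, since it is treated separately.
DUAL_MAP = {
    "\u0648": "v",  # Vao
    "\u0649": "j",  # Yay (one encoding)
    "\u06CC": "j",  # Yay (another common encoding)
    "\u06D2": "j",  # Yay (baree form)
}

# Placeholder: the specific long vowel is not determined in this sketch.
LONG_VOWEL = "<long vowel>"


def letter_to_sound(letter: str, position: str = "onset") -> str:
    """Map one letter to a phoneme symbol.

    position is assumed to be "onset", "coda" or "nucleus"; it only
    matters for the dual-behavior characters.
    """
    if letter in CONSONANT_MAP:
        return CONSONANT_MAP[letter]
    if letter in DUAL_MAP:
        if position in ("onset", "coda"):
            return DUAL_MAP[letter]
        if position == "nucleus":
            return LONG_VOWEL
        raise ValueError(f"unknown syllable position: {position!r}")
    raise KeyError(f"letter not covered by this sketch: {letter!r}")


if __name__ == "__main__":
    print(letter_to_sound("\u0628"))             # consonantal character
    print(letter_to_sound("\u0648", "onset"))    # Vao as consonant
    print(letter_to_sound("\u0648", "nucleus"))  # Vao as vowel
```

In a full converter the position would not be passed in by hand; it would have to be inferred from the surrounding letters and aerab.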