pdf - Universität zu Köln
#Hash with encodings for combinations of two characters
#the § is probably to avoid later matching of UE in names like bauer ?
our %doublechars=(SC=>'C', SZ=>'C', CZ=>'C', TZ=>'C', SZ=>'C', TS=>'C',
                  KS=>'X', PF=>'V', QU=>'KW', PH=>'V', UE=>'Y', AE=>'E',
                  OE=>'Ö', EI=>'AY', EY=>'AY', EU=>'OY', AU=>'A§', OU=>'§');
sub PHONEM
{
    my $string = uc shift;
    #iterate over two character substitutions
    foreach my $index (0..((length $string)-2))
    {
        if ($doublechars{substr $string,$index,2})
        {
            substr ($string,$index,2) = $doublechars{substr $string,$index,2};
        }
    }
    #single character substitutions via tr
    #umlauts are still lower case, since they are not converted by uc
    $string =~ tr/ZKGQäüIJßFWPT§àáéèúûooîı/CCCCEYYYSVVBDUAAEEUUOOYY/;
    #delete forbidden characters by using the complementary operator
    $string =~ tr/ABCDLMNORSUVWXYö//cd;
    #remove double chars
    $string =~ tr/ABCDLMNORSUVWXYö//s;
    return $string;
}
1;
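The stages of the routine above (two-character substitutions, single-character mapping, deletion of disallowed characters, squeezing of repeats) can be traced with a reduced sketch. This is not the full encoder: the name phonem_ascii and the table %digraphs are hypothetical, only an ASCII subset of the rules is included so that the result does not depend on the source file encoding, and the loop re-checks the string length on every pass (an assumption made here so that a shrinking string is never read past its end).

```perl
use strict;
use warnings;

# Hypothetical, ASCII-only subset of the two-character table.
my %digraphs = (SC => 'C', TZ => 'C', KS => 'X', PH => 'V',
                QU => 'KW', EI => 'AY', EY => 'AY');

sub phonem_ascii {
    my $s = uc shift;

    # Stage 1: replace two-character combinations from left to right.
    # The length is re-evaluated each pass because a replacement
    # such as SC => C shortens the string.
    for (my $i = 0; $i < length($s) - 1; $i++) {
        my $pair = substr $s, $i, 2;
        substr($s, $i, 2) = $digraphs{$pair} if exists $digraphs{$pair};
    }

    # Stage 2: map single characters onto their class representative.
    $s =~ tr/ZKGQIJFWPT/CCCCYYVVBD/;

    # Stage 3: c complements the list and d deletes, so every
    # character that is NOT in the list is removed.
    $s =~ tr/ABCDLMNORSUVWXY//cd;

    # Stage 4: s squeezes runs of the same listed character into one.
    $s =~ tr/ABCDLMNORSUVWXY//s;

    return $s;
}

print join(' ', map { phonem_ascii($_) } qw(Meier Meyer Maier)), "\n";
# MAYR MAYR MAYR
print join(' ', map { phonem_ascii($_) } qw(Schmidt Schmitt)), "\n";
# CMYD CMYD
```

Spelling variants of the same name collapse to one code, which is the point of the encoding: a lookup by code finds Meier, Meyer and Maier together.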
11.4 Implementation of the Syllabification
# Syllabification algorithm tailored for German words, as suggested by
# multiple authors, for example
# Spencer, 1996, Phonology: Theory and Description
# ISBN 0-631-19233-6
#
# Martin Wilz 2004-12-10