Festival Speech Synthesis System: - Speech Resource Pages
for i in a b c d e f g h i j k l m n o p q r s t u v w x y z
do
   # Stop value for wagon
   STOP=2
   echo letter $i STOP $STOP
   # Find training set for letter $i
   cat oald.train.feats |
    awk '{if ($6 == "'$i'") print $0}' >ltsdataTRAIN.$i.feats
   # Split training set to get held-out data for stepwise testing
   traintest ltsdataTRAIN.$i.feats
   # Extract test data for letter $i
   cat oald.test.feats |
    awk '{if ($6 == "'$i'") print $0}' >ltsdataTEST.$i.feats
   # Run wagon to build the model
   wagon -data ltsdataTRAIN.$i.feats.train -test ltsdataTRAIN.$i.feats.test \
         -stepwise -desc ltsOALD.desc -stop $STOP -output lts.$i.tree
   # Test the resulting tree against the test data for letter $i
   wagon_test -heap 2000000 -data ltsdataTEST.$i.feats -desc ltsOALD.desc \
              -tree lts.$i.tree
done
The script `traintest' splits the given file `X' into `X.train' and `X.test' with every tenth line in
`X.test' and the rest in `X.train'.
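
The split described above can be sketched in a few lines of shell. This is only an illustration of the behaviour, not the `traintest' script itself: the function name `split_traintest' is hypothetical, and the choice of lines 10, 20, 30, ... (rather than 1, 11, 21, ...) as "every tenth line" is an assumption.

```shell
# Hypothetical stand-in for `traintest'; the real script may differ in detail.
# Writes every tenth line of FILE to FILE.test and the rest to FILE.train.
split_traintest() {
    # Create both outputs up front so they exist even for very short inputs
    : > "$1.train"
    : > "$1.test"
    awk -v train="$1.train" -v heldout="$1.test" '
        NR % 10 == 0 { print > heldout; next }   # lines 10, 20, 30, ...
                     { print > train }           # everything else
    ' "$1"
}

# Toy usage: a 25-line file, one number per line
demo_dir=$(mktemp -d)
awk 'BEGIN { for (i = 1; i <= 25; i++) print i }' > "$demo_dir/X"
split_traintest "$demo_dir/X"
cat "$demo_dir/X.test"    # prints 10 and 20, one per line
```

With a 25-line input, 23 lines end up in `X.train' and 2 in `X.test', so roughly 10% of the data is held out for wagon's stepwise testing.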
This script can take a significant amount of time to run, about 6 hours on a Sun Ultra 140.
Once the models are created they must be collected together into a single list structure. The trees generated by
`wagon' contain full probability distributions at each leaf. At this point that information can be removed, as only
the most probable phone will actually be predicted. This substantially reduces the size of the trees.
(merge_models 'oald_lts_rules "oald_lts_rules.scm")
(merge_models is defined within `lts_build.scm'.) The given file will contain a set! of the given variable
name to an assoc list mapping each letter to its trained tree. Note that the above function naively assumes that the
letters in the alphabet are the 26 lower case letters of the English alphabet; you will need to edit it to add accented
letters if required. Note that adding "'" (single quote) as a letter is a little tricky in Scheme but can be done: the
command (intern "'") will give you the symbol for single quote.
To test a set of lts models, load the saved model and call the following function with the test align file:
festival oald-table.scm oald_lts_rules.scm
festival> (lts_testset "oald.test.align" oald_lts_rules)
The result (after showing all the failed ones) will be a table showing the results for each letter, for all letters and
for complete words. The failed entries may give some notion of how good or bad the result is: sometimes the errors
will be simple vowel differences (long versus short, schwa versus full vowel), other times whole consonants may be
missing. Remember that the ultimate measure of the letter to sound rules is how adequate they are at providing
acceptable pronunciations, rather than how good the numeric score is.
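
Per-letter and whole-word scores can diverge, because a single wrong letter fails the entire word. The following sketch shows the two counts side by side. Everything in it is an assumption for illustration: the three-column input format, the phone names, and the function name `lts_score' are hypothetical and are not the format or output of lts_testset.

```shell
# Hypothetical input: word, predicted phones, reference phones.
# Phones are comma-separated, one per letter ("_" marks a silent letter).
# This format is invented for the example; it is not what lts_testset uses.
score_file=$(mktemp)
cat > "$score_file" <<'EOF'
cat k,a,t k,a,t
about @,b,au,_,t a,b,au,_,t
knee _,n,ii,_ _,n,ii,_
EOF

# Count correct letters and correct whole words
lts_score() {
    awk '{
        n = split($2, pred, ",")
        split($3, ref, ",")
        ok = 1
        for (i = 1; i <= n; i++) {
            letters++
            if (pred[i] == ref[i]) {
                letters_ok++
            } else {
                ok = 0            # one wrong letter fails the word
            }
        }
        words++
        words_ok += ok
    }
    END {
        printf "letters %d/%d words %d/%d\n", letters_ok, letters, words_ok, words
    }' "$1"
}

lts_score "$score_file"   # letters 11/12 words 2/3
```

Here one vowel error in twelve letters costs one word in three, which is why the word-level figure is usually much lower than the letter-level one.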
For some languages (e.g. English) it is also necessary to find a stress pattern for unknown words. Ultimately, for this
to work well you need to know the morphological decomposition of the word. At present we provide a CART trained
system to predict stress patterns for English. It gets 94.6% correct on an unseen test set, but that isn't really very
good. Later tests suggest that predicting stressed and unstressed phones directly is actually better for getting whole
words correct, even though the models do slightly worse on a per phone basis (black98).
As the lexicon may be a large part of the system, we have also experimented with removing entries from the lexicon if
the letter to sound rules system (and stress assignment system) can correctly predict them. For OALD this allows us to
halve the size of the lexicon; it could possibly allow more if a certain amount of fuzzy acceptance was allowed (e.g.
with schwa). For other languages the gain here can be very significant, for German and French we can reduce the