12.12.2012 Views

Festival Speech Synthesis System: - Speech Resource Pages


for i in a b c d e f g h i j k l m n o p q r s t u v w x y z
do
   # Stop value for wagon
   STOP=2
   echo letter $i STOP $STOP
   # Find training set for letter $i
   cat oald.train.feats |
    awk '{if ($6 == "'$i'") print $0}' >ltsdataTRAIN.$i.feats
   # split training set to get heldout data for stepwise testing
   traintest ltsdataTRAIN.$i.feats
   # Extract test data for letter $i
   cat oald.test.feats |
    awk '{if ($6 == "'$i'") print $0}' >ltsdataTEST.$i.feats
   # run wagon to predict model
   wagon -data ltsdataTRAIN.$i.feats.train -test ltsdataTRAIN.$i.feats.test \
          -stepwise -desc ltsOALD.desc -stop $STOP -output lts.$i.tree
   # Test the resulting tree against the test data for letter $i
   wagon_test -heap 2000000 -data ltsdataTEST.$i.feats -desc ltsOALD.desc \
          -tree lts.$i.tree
done

The script `traintest' splits the given file `X' into `X.train' and `X.test' with every tenth line in `X.test' and the rest in `X.train'.
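
The split described above can be sketched as a small shell function. This is a minimal illustration, not the `traintest' script shipped with the tools: the function name `split_traintest` is hypothetical, and it assumes "every tenth line" means lines 10, 20, 30 and so on.

```shell
# Hypothetical stand-in for `traintest' (an illustration only):
# writes every tenth line of X to X.test and all other lines to X.train.
split_traintest() {
    infile=$1
    # Create both outputs up front so short inputs still yield two files.
    : > "$infile.train"
    : > "$infile.test"
    awk -v trainf="$infile.train" -v testf="$infile.test" '
        NR % 10 == 0 { print >> testf; next }   # assumed: lines 10, 20, 30, ...
        { print >> trainf }                     # everything else
    ' "$infile"
}
```

Called as `split_traintest ltsdataTRAIN.a.feats`, this would leave `ltsdataTRAIN.a.feats.train` and `ltsdataTRAIN.a.feats.test` beside the input, which are the file names the wagon command in the loop expects.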

This script can take a significant amount of time to run: about 6 hours on a Sun Ultra 140.

Once the models are created they must be collected together into a single list structure. The trees generated by `wagon' contain full probability distributions at each leaf. At this point that information can be removed, as only the most probable class will actually be predicted. This substantially reduces the size of the trees.
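
The idea of keeping only the most probable class at each leaf can be illustrated on a toy input. The format below (one `leaf class probability` triple per line) is an assumption made for this sketch and is not the tree format `wagon' writes; the function name `most_probable` is likewise hypothetical.

```shell
# Toy illustration: reduce each leaf's distribution to its single most
# probable class. Input lines are "leaf class probability" (assumed format).
most_probable() {
    awk '
        # Remember the highest-probability class seen so far for each leaf.
        !($1 in best) || ($3 + 0 > best[$1]) { best[$1] = $3 + 0; label[$1] = $2 }
        # Record leaves in order of first appearance for stable output.
        !seen[$1]++ { order[++n] = $1 }
        END { for (i = 1; i <= n; i++) print order[i], label[order[i]] }
    ' "$1"
}
```

Each leaf collapses from a list of class and probability pairs to one label, which is where the size reduction comes from.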

(merge_models 'oald_lts_rules "oald_lts_rules.scm")

(merge_models is defined within `lts_build.scm'.) The given file will contain a set! of the given variable name to an assoc list mapping each letter to its trained tree. Note that the above function naively assumes that the letters of the alphabet are the 26 lower case letters of the English alphabet; you will need to edit it, adding accented letters, if required. Note that adding "'" (single quote) as a letter is a little tricky in Scheme but can be done: the command (intern "'") will give you the symbol for single quote.

To test a set of LTS models, load the saved model and call the following function with the test align file:

festival oald-table.scm oald_lts_rules.scm
festival> (lts_testset "oald.test.align" oald_lts_rules)

The result (after showing all the failed entries) will be a table showing the results for each letter, for all letters, and for complete words. The failed entries may give some notion of how good or bad the result is: sometimes the errors will be simple vowel differences (long versus short, schwa versus full vowel), other times whole consonants may be missing. Remember that the ultimate measure of quality for the letter to sound rules is how adequate they are at providing acceptable pronunciations, rather than how good the numeric score is.
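
To make the difference between the per-letter and whole-word scores concrete, here is a minimal sketch of how such counts could be computed. It is not how `lts_testset` works internally. It assumes a hypothetical tab-separated file with one word per line: the word, the expected phone for each letter, and the predicted phone for each letter, with phones separated by spaces.

```shell
# Sketch: per-letter and whole-word accuracy from a hypothetical file of
#   word <TAB> expected phones <TAB> predicted phones
score_lts() {
    awk -F'\t' '
        {
            n = split($2, want, " ")
            m = split($3, got, " ")
            word_ok = (n == m)          # a length mismatch fails the word
            for (i = 1; i <= n; i++) {
                letters++
                if (i <= m && want[i] == got[i]) letters_ok++
                else word_ok = 0        # one wrong letter fails the word
            }
            words++
            words_ok += word_ok
        }
        END {
            printf "letters %d/%d\n", letters_ok, letters
            printf "words %d/%d\n", words_ok, words
        }
    ' "$1"
}
```

A single wrong phone costs one letter but the whole word, so the word score is always the harsher of the two.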

For some languages (e.g. English) it is also necessary to find a stress pattern for unknown words. Ultimately, for this to work well you need to know the morphological decomposition of the word. At present we provide a CART trained system to predict stress patterns for English. It gets 94.6% correct on an unseen test set, but that isn't really very good. Later tests suggest that predicting stressed and unstressed phones directly is actually better for getting whole words correct, even though the models do slightly worse on a per phone basis (black98).

As the lexicon may be a large part of the system, we have also experimented with removing entries from the lexicon if the letter to sound rules system (and stress assignment system) can correctly predict them. For OALD this allows us to halve the size of the lexicon; it could possibly allow more if a certain amount of fuzzy acceptance were allowed (e.g. with schwa). For other languages the gain here can be very significant: for German and French we can reduce the
