Festival Speech Synthesis System: - Speech Resource Pages
lexicon by over 90%. The function reduce_lexicon in `festival/lib/lts_build.scm' was used to do<br />
this. The use of the above technique as a dictionary compression method is discussed in pagel98. A<br />
morphological decomposition algorithm, like that described in black91, may help even more.<br />
The technique described in this section, and its relative merits with respect to a number of languages/lexicons and<br />
tasks, are discussed more fully in black98.<br />
13.6 Lexicon requirements<br />
For English there are a number of assumptions made about the lexicon which are worthy of explicit mention. If you<br />
are basically going to use the existing token rules you should try to include at least the following in any lexicon that<br />
is to work with them.<br />
● The letters of the alphabet: when a token is identified as an acronym it is spelled out. The tokenization<br />
assumes that the individual letters of the alphabet are in the lexicon with their pronunciations. They should be<br />
identified as nouns. (This is to distinguish a as a determiner, which can be schwa'd, from a as a letter, which<br />
cannot.) The part of speech should be nn by default, but the value of the variable token.letter_pos is<br />
used and may be changed if this is not what is required.<br />
● One-character symbols such as dollar, at-sign, percent, etc. It's difficult to get a complete list and to know what<br />
the pronunciation of some of these is (e.g. hash or pound sign), but the letter-to-sound rules cannot deal with<br />
them, so they need to be explicitly listed. See the list in the function mrpa_addend in<br />
`festival/lib/dicts/oald/oaldlex.scm'. This list should also contain the control characters<br />
and eight-bit characters.<br />
● The possessive 's should be in your lexicon as schwa and voiced fricative (z). It should be in twice, once with<br />
part of speech pos and once as n (used in plurals of numbers, acronyms, etc., e.g. 1950's). 's is treated as a<br />
word and is separated from the tokens it appears with. The post-lexical rule (the function<br />
postlex_apos_s_check) will delete the schwa and devoice the z in appropriate contexts. Note that this post-lexical<br />
rule brazenly assumes that the unvoiced fricative in the phoneset is s. If it is not in your phoneset, copy<br />
the function (it is in `festival/lib/postlex.scm'), change it for your phoneset, and use your<br />
version as a post-lexical rule.<br />
● Numbers as digits (e.g. "1", "2", "34", etc.) should normally not be in the lexicon. The number conversion<br />
routines convert numbers to words (i.e. "one", "two", "thirty four", etc.).<br />
● The word "unknown" or whatever is in the variable token.unknown_word_name. This is used in a few<br />
obscure cases when there just isn't anything that can be said (e.g. single characters which aren't in the lexicon).<br />
Some people have suggested it should be possible to make this a sound rather than a word. I agree, but<br />
<strong>Festival</strong> doesn't support that yet.<br />
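The requirements above are typically met with addendum entries and variable settings in a voice's setup file. The following is a minimal sketch using Festival's lex.add.entry; the spellings and phone names shown are illustrative only (a hypothetical phoneset), not taken from any distributed lexicon:<br />
<pre>
;; Sketch of addendum entries meeting the requirements above.
;; Phone names and pronunciations are illustrative; substitute
;; the phones of your own phoneset.

;; Letters of the alphabet as nouns (so "a" the letter is not reduced).
(lex.add.entry '("a" nn (((ei) 1))))
(lex.add.entry '("b" nn (((b ii) 1))))

;; One-character symbols the letter-to-sound rules cannot handle.
(lex.add.entry '("%" n (((p @@) 1) ((s e n t) 0))))
(lex.add.entry '("$" n (((d o) 1) ((l @) 0))))

;; The possessive 's, twice: once as pos and once as n (e.g. "1950's").
(lex.add.entry '("'s" pos (((@ z) 0))))
(lex.add.entry '("'s" n (((@ z) 0))))

;; The word used when nothing else can be said.
(set! token.unknown_word_name "unknown")
</pre>
Each entry is a list of the word, its part of speech, and its pronunciation as syllables, where each syllable is a list of phones followed by a stress value.<br />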
13.7 Available lexicons<br />
Currently <strong>Festival</strong> supports a number of different lexicons. They are all defined in the file `lib/lexicons.scm',<br />
each with a number of common extra words added to its addendum. They are<br />
`CUVOALD'<br />
The Computer Users Version of the Oxford Advanced Learner's Dictionary is available from the Oxford Text<br />
Archive ftp://ota.ox.ac.uk/pub/ota/public/dicts/710. It contains about 70,000 entries and is part of the BEEP<br />
lexicon. It is more consistent in its marking of stress, though its syllable marking is not what works best for our<br />
synthesis methods. Many syllabic `l''s, `n''s, and `m''s mess up the syllabification algorithm, making<br />
results sometimes appear over-reduced. It is, however, our current default lexicon. It is also the only lexicon<br />
with part-of-speech tags that can be distributed (for non-commercial use).<br />
`CMU'<br />
This is automatically constructed from `cmu_dict-0.4' available from many places on the net (see