Festival Speech Synthesis System: Speech Resource Pages


lexicon by over 90%. The function reduce_lexicon in `festival/lib/lts_build.scm' was used to do this. The use of this technique as a dictionary compression method is discussed in pagel98. A morphological decomposition algorithm, like that described in black91, may help even more.

The technique described in this section and its relative merits with respect to a number of languages/lexicons and tasks are discussed more fully in black98.


13.6 Lexicon requirements

For English there are a number of assumptions made about the lexicon which are worthy of explicit mention. If you are basically going to use the existing token rules you should try to include at least the following in any lexicon that is to work with them.

● The letters of the alphabet: when a token is identified as an acronym it is spelled out. The tokenization assumes that the individual letters of the alphabet are in the lexicon with their pronunciations. They should be identified as nouns. (This is to distinguish a as a determiner, which can be schwa'd, from a as a letter, which cannot.) The part of speech should be nn by default, but the value of the variable token.letter_pos is used and may be changed if this is not what is required.

● One-character symbols such as dollar, at-sign, percent, etc. It's difficult to get a complete list and to know what the pronunciation of some of these are (e.g. hash or pound sign), but the letter-to-sound rules cannot deal with them, so they need to be explicitly listed. See the list in the function mrpa_addend in `festival/lib/dicts/oald/oaldlex.scm'. This list should also contain the control characters and eight-bit characters.

● The possessive 's should be in your lexicon as schwa plus voiced fricative (z). It should be in twice, once with part of speech type pos and once as n (used in plurals of numbers, acronyms, etc., e.g. 1950's). 's is treated as a word and is separated from the tokens it appears with. The post-lexical rule (the function postlex_apos_s_check) will delete the schwa and devoice the z in appropriate contexts. Note that this post-lexical rule brazenly assumes that the unvoiced fricative in the phoneset is s. If it is not, copy the function (it is in `festival/lib/postlex.scm'), change it for your phoneset, and use your version as a post-lexical rule.

● Numbers as digits (e.g. "1", "2", "34", etc.) should normally not be in the lexicon. The number conversion routines convert numbers to words (i.e. "one", "two", "thirty four", etc.).

● The word "unknown" or whatever is in the variable token.unknown_word_name. This is used in a few<br />

obscure cases when there just isn't anything that can be said (e.g. single characters which aren't in the lexicon).<br />

Some people have suggested it should be possible to make this a sound rather than a word. I agree, but<br />

<strong>Festival</strong> doesn't support that yet.<br />
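The requirements above can be met by adding entries to the lexicon's addenda with lex.add.entry. The following is a minimal sketch only: the phone names and pronunciations shown are illustrative assumptions for an MRPA-like phoneset, not definitive entries, and should be adapted to whatever phoneset your lexicon actually uses.

```scheme
;; Illustrative addenda entries.  Phone names assume an MRPA-like
;; phoneset; adapt them to your own.

;; Letters of the alphabet, identified as nouns (nn) so that
;; "a" the letter is not reduced like "a" the determiner.
(lex.add.entry '("a" nn (((ei) 1))))
(lex.add.entry '("b" nn (((b ii) 1))))

;; One-character symbols the letter-to-sound rules cannot handle.
(lex.add.entry '("%" nn (((p @) 0) ((s e n t) 1))))

;; Possessive 's as schwa plus voiced fricative, entered twice:
;; once with part of speech pos, once as n (for 1950's etc.).
(lex.add.entry '("'s" pos (((@ z) 0))))
(lex.add.entry '("'s" n (((@ z) 0))))
```

Each entry is a word, a part of speech tag, and a list of syllables, where each syllable is a list of phones plus a stress value.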


13.7 Available lexicons

Currently Festival supports a number of different lexicons. They are all defined in the file `lib/lexicons.scm', each with a number of common extra words added to their addenda. They are:

`CUVOALD'

The Computer Users Version of Oxford Advanced Learner's Dictionary is available from the Oxford Text Archive ftp://ota.ox.ac.uk/pub/ota/public/dicts/710. It contains about 70,000 entries and is a part of the BEEP lexicon. It is more consistent in its marking of stress, though its syllable marking is not what works best for our synthesis methods. Many syllabic `l's, `n's, and `m's mess up the syllabification algorithm, making results sometimes appear over-reduced. It is however our current default lexicon. It is also the only lexicon with part-of-speech tags that can be distributed (for non-commercial use).

`CMU'

This is automatically constructed from `cmu_dict-0.4', available from many places on the net (see
