Festival Speech Synthesis System: - Speech Resource Pages

The segments of an utterance may be saved in a file using the function utt.save.segs, which saves the segments of the named utterance in xlabel format. Any other relation may be saved using the more general utt.save.relation, which takes the additional argument of a relation name. The name and the end feature of each item are saved in the named file, again in xlabel format; other features are saved in extra fields. For more elaborate saving methods you can easily write a Scheme function to save data in an utterance in whatever format is required. See the file `lib/mbrola.scm' for an example.

The function display shows an utterance in Entropic's Xwaves tool. It simply saves the waveform and the segments and sends the appropriate commands to the (already running) Xwaves and xlabel programs.

The function utt.resynth synthesizes an externally specified utterance. It takes two filename arguments, an xlabel segment file and an F0 file, then loads, synthesizes and plays an utterance built from these files. The loading is done by the underlying function utt.load.segf0.

15. Text analysis

15.1 Tokenizing                 Splitting text into tokens
15.2 Token to word rules
15.3 Homograph disambiguation   "Wed 5 may wind US Sen up"

15.1 Tokenizing

A crucial stage in text processing is the initial tokenization of text. A token in Festival is an atom separated by whitespace from a text file (or string). If punctuation for the current language is defined, characters matching that punctuation are removed from the beginning and end of a token and held as features of the token.
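The splitting and stripping just described can be modelled outside Festival. The following is a minimal Python sketch, not Festival's implementation: it uses the default whitespace and punctuation sets declared in `lib/token.scm', and the names Token, split_on_whitespace, make_token and tokenize are hypothetical. It assumes leading characters are matched against the prepunctuation set and trailing characters against the punctuation set, and it does not model edge cases such as a token made up entirely of punctuation.

```python
from dataclasses import dataclass

# Default character sets, as declared in lib/token.scm.
WHITESPACE = " \t\n\r"
PUNCTUATION = "\"'`.,:;!?(){}[]"
PREPUNCTUATION = "\"'`({["


@dataclass
class Token:
    name: str            # token text with surrounding punctuation removed
    prepunctuation: str  # characters stripped from the front
    punctuation: str     # characters stripped from the end


def split_on_whitespace(text):
    """Split text on any character in WHITESPACE, dropping empty pieces."""
    pieces, current = [], ""
    for ch in text:
        if ch in WHITESPACE:
            if current:
                pieces.append(current)
            current = ""
        else:
            current += ch
    if current:
        pieces.append(current)
    return pieces


def make_token(raw):
    """Strip leading prepunctuation and trailing punctuation from one piece."""
    start = 0
    while start < len(raw) and raw[start] in PREPUNCTUATION:
        start += 1
    end = len(raw)
    while end > start and raw[end - 1] in PUNCTUATION:
        end -= 1
    return Token(raw[start:end], raw[:start], raw[end:])


def tokenize(text):
    """Return the list of tokens for a string."""
    return [make_token(raw) for raw in split_on_whitespace(text)]
```

For the input `He said, "Stop (now)!"` this model yields four tokens named He, said, Stop and now. The comma, the opening quote, and the `(` and `)!"` around the last token are kept as punctuation fields, not as part of the names.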
The default list of characters to be treated as whitespace is defined as

    (defvar token.whitespace " \t\n\r")

while the default sets of punctuation characters are

    (defvar token.punctuation "\"'`.,:;!?(){}[]")
    (defvar token.prepunctuation "\"'`({[")

These are declared in `lib/token.scm' but may be changed for different languages, text modes etc.

15.2 Token to word rules

Tokens are further analysed into lists of words. A word is an atom that can be given a pronunciation by the lexicon (or letter to sound rules). A token may give rise to a number of words or none at all. For example the basic tokens

    This pocket-watch was made in 1983.

would give a word relation of

    this pocket watch was made in nineteen eighty three

Because the relationship between tokens and words is in some cases complex, a user function may be specified for translating tokens into words. This is designed to deal with things like numbers, email addresses, and other nonobvious pronunciations of tokens as zero or more words. Currently a builtin function builtin_english_token_to_words offers much of the necessary functionality for English but a user may further customize this.

If the user defines a function token_to_words which takes two arguments, a token item and a token name, it will be called by the Token_English and Token_Any modules. A substantial example is given as english_token_to_words in `festival/lib/token.scm'. It is quite elaborate and covers most of the common multiword tokens in English including numbers, money symbols, Roman numerals, dates, times, plurals of symbols, number ranges, telephone numbers and various other symbols.

Let us look at the treatment of one particular phenomenon which shows the use of these rules. Consider the expression "$12 million" which should be rendered as the words "twelve million dollars". Note that the word "dollars", which is introduced by the "$" sign, ends up at the end of the expression. There are two cases we need to deal with as there are two tokens.

The first clause of the cond checks that the current token name is a money value and that the following token is a magnitude (million, billion, trillion, zillion etc.). If that is the case the "$" is removed and the remaining number is pronounced by calling the builtin token to word function. The second clause deals with the second token. It confirms that the previous token is a money value (using the same regular expression as before) and then returns the word followed by the word "dollars".
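The control flow of these two cases can be checked outside Festival. The following Python sketch is only a model and makes several assumptions. The money pattern is taken to mean "a dollar sign, digits and commas, then an optional decimal part". builtin_stub is a hypothetical stand-in for builtin_english_token_to_words that returns the name unchanged and does not expand digits into words. Neighbouring tokens are looked up by list index, not through item features.

```python
import re

# Assumed reading of the money pattern: "$", digits/commas, optional decimals.
MONEY = re.compile(r"\$[0-9,]+(\.[0-9]+)?")
# Magnitude words such as million, billion, trillion (optionally plural).
MAGNITUDE = re.compile(r".*illion.?")


def builtin_stub(name):
    """Hypothetical stand-in for the builtin rule: return the name as one word."""
    return [name]


def token_to_words(names, i):
    """Return the words for the token at position i in the list of token names."""
    name = names[i]
    prev_name = names[i - 1] if i > 0 else ""
    next_name = names[i + 1] if i + 1 < len(names) else ""
    # Case 1: money value followed by a magnitude; drop the "$".
    if MONEY.fullmatch(name) and MAGNITUDE.fullmatch(next_name):
        return builtin_stub(name.split("$", 1)[1])
    # Case 2: magnitude preceded by a money value; append "dollars".
    if MONEY.fullmatch(prev_name) and MAGNITUDE.fullmatch(name):
        return [name, "dollars"]
    # Otherwise fall back to the default treatment.
    return builtin_stub(name)


def words_for(names):
    """Concatenate the words produced for every token name in order."""
    words = []
    for i in range(len(names)):
        words.extend(token_to_words(names, i))
    return words
```

With this model, `words_for(["$12", "million"])` gives `["12", "million", "dollars"]`: the currency word is produced by the second token, not the first. A lone `"$12"` with no magnitude after it falls through to the stub unchanged.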
If it is neither of these forms then the builtin function is called.

    (define (token_to_words token name)
      "(token_to_words TOKEN NAME)  Returns a list of words for NAME from TOKEN."
      (cond
       ;; Money value followed by a magnitude: drop the "$", say the number
       ((and (string-matches name "\\$[0-9,]+\\(\\.[0-9]+\\)?")
             (string-matches (item.feat token "n.name") ".*illion.?"))
        (builtin_english_token_to_words token (string-after name "$")))
       ;; Magnitude preceded by a money value: append "dollars"
       ((and (string-matches (item.feat token "p.name")
                             "\\$[0-9,]+\\(\\.[0-9]+\\)?")
             (string-matches name ".*illion.?"))
        (list name "dollars"))
       ;; Anything else: default treatment
       (t
        (builtin_english_token_to_words token name))))

It is valid to make some conditions return no words, though some care should be taken with that, as punctuation information may no longer be available to later processing if there are no words related to a token.

15.3 Homograph disambiguation

Not all tokens can be rendered as words easily. Their context may affect the way they are to be pronounced. For example in the utterance