Festival Speech Synthesis System: - Speech Resource Pages


We save the above definitions in a file `spanish_el.scm'. Now we can declare the new voice to Festival. See section 24.3 Defining a new voice, for a description of methods for adding new voices. For testing purposes we can explicitly load the file `spanish_el.scm'. The voice is now available for use in Festival.

    festival> (voice_spanish_el)
    spanish_el
    festival> (SayText "hola amigos")

As you can see, adding a new voice is not very difficult. Of course, there is quite a lot more than the above to adding a high quality, robust voice to Festival. But as we can see, many of the basic tools that we wish to use already exist. The main difference between the above voice and the English voices already in Festival is that the models in the English voices are better trained from databases. This produces, in general, better results, but the concepts behind them are basically the same. All of those trainable methods may be parameterized with data for new voices.

As Festival develops, more modules will be added with better support for training new voices, so in the end we hope that adding high quality new voices is actually as simple as (or indeed simpler than) the above description.

24.2.9 Resetting globals

Because the version of Scheme used in Festival has only a single flat name space, it is unfortunately too easy for a voice to set some global which accidentally affects all other voices selected after it. We have therefore introduced a convention to try to minimise the possibility of this becoming a problem. Each voice function defined should always call voice_reset at the start. This will reset any globals and also call a tidy up function provided by the previous voice function.

Likewise, in your new voice function you should provide a tidy up function to reset any non-standard global variables you set. The function current_voice_reset will be called by voice_reset. If the value of current_voice_reset is nil then it is not called.
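The tidy-up part of this convention can be sketched as follows. This is only an illustration under stated assumptions, not the definition shipped with Festival: the names sketch_voice_reset and sketch_current_voice_reset are hypothetical stand-ins for voice_reset and current_voice_reset, and the resetting of standard globals that voice_reset also performs is left out.

```scheme
;; Hypothetical names throughout.  This shows only the tidy-up hook,
;; not the resetting of standard globals done by the real voice_reset.
(define sketch_current_voice_reset nil)

(define (sketch_voice_reset)
  "Run the previous voice's tidy-up function, if any, then clear it."
  ;; Call the tidy-up function only when the previous voice set one.
  (if sketch_current_voice_reset
      (sketch_current_voice_reset))
  ;; Clear the hook so the same tidy-up cannot be run a second time.
  (set! sketch_current_voice_reset nil))
```

Clearing the hook after calling it means a voice that sets no tidy-up function of its own does not inherit one from the voice selected before it.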
voice_reset sets current_voice_reset to nil after calling it.

For example, suppose some new voice requires the audio device to be directed to a different machine. In this example we make the giant's voice go through the netaudio machine big_speakers while the standard voice goes through small_speakers. Although we can easily select the machine big_speakers as output when our voice_giant is called, we also need to set it back when the next voice is selected, and we don't want to have to modify every other voice defined in the system. Let us first define two functions to select the audio output.

    (define (select_big)
      (set! giant_previous_audio (getenv "AUDIOSERVER"))
      (setenv "AUDIOSERVER" "big_speakers"))

    (define (select_normal)
      (setenv "AUDIOSERVER" giant_previous_audio))

Note we save the previous value of AUDIOSERVER rather than simply assuming it was small_speakers. Our definition of voice_giant will look something like
    (define (voice_giant)
      "comment comment ..."
      (voice_reset)  ;; get into a known state
      (select_big)
      ;;; other giant voice parameters
      ...
      (set! current_voice_reset select_normal)
      (set! current-voice 'giant))

The obvious question is which variables a voice should reset. Unfortunately there is no definitive answer to that. To a certain extent we don't want to define that list, as there will be many variables set by various people in Festival which are not in the original distribution, and we don't want to restrict them. The longer term answer is some form of partitioning of the Scheme name space, perhaps having voice local variables (cf. Emacs buffer local variables). But ultimately a voice may set global variables which could redefine the operation of later selected voices, and there seems no real way to stop that and keep the generality of the system.

Note the convention of setting the global current-voice at the end of any voice definition file. We do not enforce this but probably should. The variable current-voice should at any time identify the current voice; the voice description information (described below) will relate this name to properties identifying it.

24.3 Defining a new voice

As there are a number of voices available for Festival, and they may or may not exist in different installations, we have tried to make it as simple as possible to add new voices to the system without having to change any of the basic distribution. In fact, if the voices use the following standard method for describing themselves, it is merely a matter of unpacking them in order for them to be used by the system.

The variable voice-path contains a list of directories where voices will be automatically searched for. If this is not set, it is set automatically by appending `/voices/' to all paths in festival load-path.
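As a minimal sketch of extending that search list from an initialization file: the directory `/home/me/festival_voices/' is purely hypothetical, and the fragment assumes voice-path already has a value at the point where it is evaluated. If it does not, the reference to voice-path would fail and the default list would have to be built first.

```scheme
;; Hypothetical directory name; substitute wherever your voices unpack.
;; Assumes voice-path is already set when this form is evaluated.
(set! voice-path
      (cons "/home/me/festival_voices/" voice-path))
```

Putting the new directory at the front of the list rather than replacing the list keeps the standard voice directories searchable.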
You may add new directories explicitly to this variable in your `sitevars.scm' file or your own `.festivalrc' as you wish.

Each voice directory is assumed to be of the form

    LANGUAGE/VOICENAME/

Within the VOICENAME/ directory itself it is assumed there is a file `festvox/VOICENAME.scm' which, when loaded, will define the voice itself. The actual voice function should be called voice_VOICENAME.

For example, the voices distributed with the standard Festival distribution all unpack in `festival/lib/voices'. The American voice `ked_diphone' unpacks into

    festival/lib/voices/english/ked_diphone/

Its actual definition file is in

    festival/lib/voices/english/ked_diphone/festvox/ked_diphone.scm

Note the name of the directory and the name of the Scheme definition file must be the same.

Alternative voices, using perhaps a different encoding of the database but the same front end, may be defined in the same way by using symbolic links in the language directory to the main directory. For example a PSOLA version of the ked voice may be defined in