Festival Speech Synthesis System: - Speech Resource Pages

festival/lib/voices/english/ked_diphone/festvox/ked_psola.scm

Adding a symbolic link in `festival/lib/voices/english/' to `ked_diphone' called `ked_psola' will allow that voice to be automatically registered when Festival starts up.

Note that this method doesn't actually load the voices it finds, as that could make start up prohibitively slow. It blindly assumes that there is a file `VOICENAME/festvox/VOICENAME.scm' to load. An autoload definition is given for voice_VOICENAME which, when called, will load that file and call the real definition if it exists in the file.

This is only a recommended method to make adding new voices easier; it may be ignored if you wish. However, even if you use your own conventions for adding new voices, we still recommend that you consider using the autoload function to define them in, for example, the `siteinit.scm' file or `.festivalrc'. The autoload function takes three arguments: a function name, a file containing the actual definition and a comment. For example, a voice can be defined explicitly by

    (autoload voice_f2b "/home/awb/data/f2b/ducs/f2b_ducs" "American English female f2b")

Of course you can also load the definition file explicitly if you wish.

In order to allow the system to start making intelligent use of voices we recommend that all voice definitions include a call to the function proclaim_voice. This allows the system to know some properties of the voice such as language, gender and dialect. The proclaim_voice function takes two arguments: a name (e.g. rab_diphone) and an assoc list of features and names. Currently we require language, gender, dialect and description, the last being a textual description of the voice itself. An example proclamation is

    (proclaim_voice
     'rab_diphone
     '((language english)
       (gender male)
       (dialect british)
       (description
        "This voice provides a British RP English male voice using a residual excited LPC diphone synthesis method. It uses a modified Oxford Advanced Learners' Dictionary for pronunciations. Prosodic phrasing is provided by a statistically trained model using part of speech and local distribution of breaks. Intonation is provided by a CART tree predicting ToBI accents and an F0 contour generated from a model trained from natural speech. The duration model is also trained from data using a CART tree.")))

There are functions to access a description. voice.description will return the description for a given voice and will load that voice if it is not already loaded. voice.describe will describe the given voice by synthesizing the textual description using the current voice. It would be nice to use the voice itself to give a self introduction, but unfortunately that introduces the problem of deciding which language the description should be in; we are not all as fluent in Welsh as we'd like to be.

The function voice.list will list the potential voices in the system. These are the names of voices which have been found in the voice-path. As they have not actually been loaded they can't be confirmed as usable voices. One solution to this would be to load all voices at start up time, which would allow confirmation that they exist and give access to their full description through proclaim_voice. But start up is already too slow in Festival, so we have to accept this state for the time being. Splitting the description of the voice from the actual definition is a possible solution to this problem but we have not yet looked into this.
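As a rough illustration of how these pieces fit together, the following is a minimal sketch of what a file `my_diphone/festvox/my_diphone.scm' might contain. The voice name my_diphone, its feature values and its description are hypothetical and chosen only for illustration; a real voice definition would also set up its phoneset, lexicon, prosody models and waveform synthesizer.

```scheme
;; Hypothetical voice definition file: my_diphone/festvox/my_diphone.scm
;; The name my_diphone and all feature values below are assumptions made
;; for illustration; they do not refer to a voice shipped with Festival.

(define (voice_my_diphone)
  "(voice_my_diphone)
Placeholder definition of a hypothetical voice."
  ;; A real definition would select the phoneset, lexicon, prosody
  ;; models and waveform synthesizer here.
  'my_diphone)

;; Declare the four properties the text lists as currently required.
(proclaim_voice
 'my_diphone
 '((language english)
   (gender male)
   (dialect british)
   (description "A placeholder description of a hypothetical voice.")))
```

Assuming such a file is found on the voice-path, the access functions described above could then be tried at the Festival prompt:

```scheme
(voice.list)                      ; names of the voices found on the voice-path
(voice.description 'my_diphone)   ; loads the voice if necessary
(voice.describe 'my_diphone)      ; speaks the description with the current voice
```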
25. Tools

A number of basic data manipulation tools are supported by Festival. These often make building new modules very easy and are already used in many of the existing modules. They typically offer a Scheme method for entering data, and Scheme and C++ functions for evaluating it.

25.1 Regular expressions
25.2 CART trees            Building and using CART
25.3 Ngrams                Building and using Ngrams
25.4 Viterbi decoder       Using the Viterbi decoder
25.5 Linear regression     Building and using linear regression models

25.1 Regular expressions

Regular expressions are a formal method for describing a certain class of mathematical languages. They may be viewed as patterns which match some set of strings. They are very common in many software tools such as the UNIX shell, Perl, awk, Emacs etc. Unfortunately the exact form of regular expressions often differs slightly between applications, which can make their use a little tricky.

Festival supports regular expressions based mainly on the form used in the GNU libg++ Regex class, though we have our own implementation of it. Our implementation (EST_Regex) is actually based on Henry Spencer's `regex.c' as distributed with BSD 4.4.

Regular expressions are represented as character strings which are interpreted as regular expressions by certain Scheme and C++ functions. Most characters in a regular expression are treated as literals and match only that character, but a number of others have special meaning. Some characters may be escaped with preceding backslashes to change them from operators to literals (or sometimes literals to operators).

.
    Matches any character.
$
    Matches end of string.
^
    Matches beginning of string.
X*
    Matches zero or more occurrences of X. X may be a character, range or parenthesized expression.
X+
    Matches one or more occurrences of X. X may be a character, range or parenthesized expression.
X?
    Matches zero or one occurrence of X. X may be a character, range or parenthesized expression.
[...]
    A range matches any of the values in the brackets. The range operator "-" allows specification of ranges, e.g. a-z for all lower case characters. If the first character of the range is ^ then it matches any character except those specified in the range. If you wish - to be in the range you must put that first.
\\(...\\)
    Treats the contents of the parentheses as a single object, allowing the operators *, +, ? etc. to operate on more than single characters.
X\\|Y
    Matches either X or Y. X or Y may be single characters, ranges or parenthesized expressions.

Note that actually only one backslash is needed before a character to escape it, but because these expressions are most often contained within Scheme or C++ strings, the escape mechanism for those strings requires that the backslash itself be escaped; hence you will most often be required to type two backslashes.
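The operators above can be tried out at the Festival prompt. The sketch below assumes the Scheme function string-matches, which Festival provides for testing a string against a regular expression and which returns t on a match and nil otherwise. The test strings are made up for illustration. Note how every backslash belonging to the regular expression is typed twice because the pattern is written as a Scheme string.

```scheme
;; Each pattern is a Scheme string, so a regular expression backslash
;; is written as two backslashes.

(string-matches "1983" "[0-9]+")       ; range with one-or-more operator
(string-matches "abab" "\\(ab\\)+")    ; parenthesized group repeated
(string-matches "dog" "cat\\|dog")     ; alternation between two strings
(string-matches "b" "[^a]")            ; negated range: any character but a
(string-matches "abc" "[0-9]+")        ; no digits, so this should not match
```

Written outside a Scheme string, the second pattern would be \(ab\)+ with single backslashes; the doubling is purely a consequence of string quoting.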