Festival Speech Synthesis System: - Speech Resource Pages

More documents

Recommendations

Info

If you have never heard any audio ever on your machine then you must first work out if you have the appropriate hardware. If you do, you also need the appropriate software to drive it. Festival can directly interface with a number of audio systems or use external methods for playing audio. The currently supported audio methods are `NAS' NCD's NAS, is a network transparent audio system (formerly called netaudio). If you already run servers on your machines you simply need to ensure your AUDIOSERVER environment variable is set (or your DISPLAY variable if your audio output device is the same as your X Windows display). You may set NAS as your audio output method by the command (Parameter.set 'Audio_Method 'netaudio) `/dev/audio' On many systems `/dev/audio' offers a simple low level method for audio output. It is limited to mu-law encoding at 8KHz. Some implementations of `/dev/audio' allow other sample rates and sample types but as that is non-standard this method only uses the common format. Typical systems that offer these are Suns, Linux and FreeBSD machines. You may set direct `/dev/audio' access as your audio method by the command (Parameter.set 'Audio_Method 'sunaudio) `/dev/audio (16bit)' Later Sun Microsystems workstations support 16 bit linear audio at various sample rates. Support for this form of audio output is supported. It is a compile time option (as it requires include files that only exist on Sun machines. If your installation supports it (check the members of the list *modules*) you can select 16 bit audio output on Suns by the command (Parameter.set 'Audio_Method 'sun16audio) Note this will send it to the local machine where the festival binary is running, this might not be the one you are sitting next to--that's why we recommend netaudio. A hacky solution to playing audio on a local machine from a remote machine without using netaudio is described in 6. Installation `/dev/dsp (voxware)' Both FreeBSD and Linux have a very similar audio interface through `/dev/dsp'. There is compile time support for these in the speech tools and when compiled with that option Festival may utilise it. Check the value of the variable *modules* to see which audio devices are directly supported. On FreeBSD, if supported, you may select local 16 bit linear audio by the command (Parameter.set 'Audio_Method 'freebsd16audio) While under Linux, if supported, you may use the command (Parameter.set 'Audio_Method 'linux16audio) Some earlier (and smaller machines) only have 8bit audio even though they include a `/dev/dsp' (Soundblaster PRO for example). This was not dealt with properly in earlier versions of the system but now the support automatically checks to see the sample width supported and uses it accordingly. 8 bit at higher frequencies that 8K sounds better than straight 8k ulaw so this feature is useful. `mplayer' Under Windows NT or 95 you can use the `mplayer' command which we have found requires special treatement to get its parameters right. Rather than using Audio_Command you can select this on Windows machine with the following command (Parameter.set 'Audio_Method 'mplayeraudio) Alternatively built-in audio output is available with (Parameter.set 'Audio_Method 'win32audio) `SGI IRIX' Builtin audio output is now available for SGI's IRIX 6.2 using the command
(Parameter.set 'Audio_Method 'irixaudio) `Audio Command' Alternatively the user can provide a command that can play an audio file. Festival will execute that command in an environment where the shell variables SR is set to the sample rate (in Hz) and FILE which, by default, is the name of an unheadered raw, 16bit file containing the synthesized waveform in the byte order of the machine Festival is running on. You can specify your audio play command and that you wish Festival to execute that command through the following command (Parameter.set 'Audio_Command "sun16play -f $SR $FILE") (Parameter.set 'Audio_Method 'Audio_Command) On SGI machines under IRIX the equivalent would be (Parameter.set 'Audio_Command "sfplay -i integer 16 2scomp rate $SR end $FILE") (Parameter.set 'Audio_Method 'Audio_Command) The Audio_Command method of playing waveforms Festival supports two additional audio parameters. Audio_Required_Rate allows you to use Festival's internal sample rate conversion function to any desired rate. Note this may not be as good as playing the waveform at the sample rate it is originally created in, but as some hardware devices are restrictive in what sample rates they support, or have naive resample functions this could be optimal. The second additional audio parameter is Audio_Required_Format which can be used to specify the desired output forms of the file. The default is unheadered raw, but this may be any of the values supported by the speech tools (including nist, esps, snd, riff, aiff, audlab, raw and, if you really want it, ascii). For example suppose you have a program that only plays sun headered files at 16000 KHz you can set up audio output as (Parameter.set 'Audio_Method 'Audio_Command) (Parameter.set 'Audio_Required_Rate 16000) (Parameter.set 'Audio_Required_Format 'snd) (Parameter.set 'Audio_Command "sunplay $FILE") Where the audio method supports it, you can specify alternative audio device for machine that have more than one audio device. (Parameter.set 'Audio_Device "/dev/dsp2") If Netaudio is not available and you need to play audio on a machine different from teh one Festival is running on we have had reports that `snack' (http://www.speech.kth.se/snack/) is a possible solution. It allows remote play but importnatly also supports Windows 95/NT based clients. Because you do not want to wait for a whole file to be synthesized before you can play it, Festival also offers an audio spooler that allows the playing of audio files while continuing to synthesize the following utterances. On reasonable workstations this allows the breaks between utterances to be as short as your hardware allows them to be. The audio spooler may be started by selecting asynchronous mode (audio_mode 'async) This is switched on by default be the function tts. You may put Festival back into synchronous mode (i.e. the utt.play command will wait until the audio has finished playing before returning). by the command (audio_mode 'sync) Additional related commands are (audio_mode 'close)
Page 1 and 2:
[Top] [Contents] [Index] [ ? ] Fest
Page 3 and 4:
The Festival Speech Synthesis Syste
Page 5 and 6:
3.3 Edinburgh Speech Tools Library
Page 7 and 8:
multiple methods, though we will of
Page 9 and 10:
for non-commercial use (we are work
Page 11 and 12:
festlex_CMU.tar.gz festlex_OALD.tar
Page 13 and 14:
held), and voices_dir (pointing to
Page 15 and 16:
Ensure your audio device actually w
Page 17 and 18:
$ festival Festival Speech Synthesi
Page 19 and 20:
eference to a manual section and re
Page 21 and 22:
[ < ] [ > ] [ > ] [Top] [Contents]
Page 23 and 24:
To convert a symbol whose print nam
Page 25 and 26:
filter A Unix shell program filter
Page 27 and 28: into name and IP address. Note that
Page 29 and 30: The boy saw the girl in the park
Page 31 and 32: VOLUME Allows the specification of
Page 33 and 34: festival/lib/tts.scm). [ < ] [ > ]
Page 35 and 36: 13.2 Defining lexicons Building new
Page 37 and 38: (debug_output t) before compilation
Page 39 and 40: ) The above isn't the most efficien
Page 41 and 42: The process involves the following
Page 43 and 44: (y _epsilon_ i ii i@ ai uh y @ ai-@
Page 45 and 46: lexicon by over 90%. The function r
Page 47 and 48: (define (postlex_apos_s_check utt)
Page 49 and 50: a list of syllables. Each member wi
Page 51 and 52: Phrase This allows explicit phrasin
Page 53 and 54: `(item.daughter2 ITEM)' Return the
Page 55 and 56: `stress' This item's lexical stress
Page 57 and 58: This pocket-watch was made in 1983.
Page 59 and 60: ((string-matches name "\\([dD][Rr]\
Page 61 and 62: (set! simple_phrase_cart_tree ' ((R
Page 63 and 64: accented (i.e. has an IntEvent rela
Page 65 and 66: (Utterance Words (boy (saw ((accent
Page 67 and 68: After prediction the segmental dura
Page 69 and 70: aa-ll &aa-l This states that the di
Page 71 and 72: The UniSyn_module_hooks are run bef
Page 73 and 74: for i in wave/*.wav do fname=`basen
Page 75 and 76: used on the signal, and/or up to th
Page 77: lib/voices/english/don_diphone/fest
Page 81 and 82: voice_el_diphone A male Castilian S
Page 83 and 84: ) (PhoneSet.silences '(#)) Note som
Page 85 and 86: (set! spanish_phrase_cart_tree ' ((
Page 87 and 88: (us_diphone_init (list '(name "el_l
Page 89 and 90: (define (voice_giant) "comment comm
Page 91 and 92: 25. Tools A number of basic data ma
Page 93 and 94: CART ::= QUESTION-NODE || ANSWER-NO
Page 95 and 96: (define (pos_cand_function w) ;; se
Page 97 and 98: some label files identify point typ
Page 99 and 100: Building the models and getting goo
Page 101 and 102: `./src/modules/diphone' An optional
Page 103 and 104: to this function should be added to
Page 105 and 106: #include "festival.h" static LISP u
Page 107 and 108: In yout `Makefile' for this directo
Page 109 and 110: Every effort has been made to minim
Page 111 and 112: A typical example use of `festival_
Page 113 and 114: A simpler C only interface example
Page 115 and 116: 29.2 Singing Synthesis As an intere
Page 117 and 118: Magisterarbeit, Institute of Natura
Page 119 and 120: B C adding new LISP objects 27.2.4
Page 121 and 122: F G H Edinburgh Speech Tools Librar
Page 123 and 124: M N O P load-path 6.3 Site initiali
Page 125 and 126: S resynthesis 14.7 Utterance I/O ru
Page 127 and 128: U V W ungrouped diphones 20.1 UniSy
Page 129 and 130:
12. Phonesets 13. Lexicons 13.1 Lex
Page 131 and 132:
[Top] [Contents] [Index] [ ? ] Shor
show all

Festival Speech Synthesis System: - Speech Resource Pages

You also want an ePaper? Increase the reach of your titles

Delete template?

Save as template?