12.12.2012 Views

Festival Speech Synthesis System: - Speech Resource Pages


To convert a symbol whose print name is a number into a number, use parse-number. This is the equivalent of atof in C.

Note that all i/o from Scheme input files is assumed to be some form of Scheme data (though it can be just numbers or tokens). For more elaborate analysis of incoming data, it is possible to use the text tokenization functions, which offer a fully programmable method of reading data.


9. TTS

Festival supports text to speech for raw text files. If you are not interested in using Festival in any way other than as a black box for rendering text as speech, the following method is probably what you want.

festival --tts myfile

This will say the contents of `myfile'. Alternatively, text may be submitted on standard input:

echo hello world | festival --tts

cat myfile | festival --tts

Festival supports the notion of text modes, where the type of a text file may be identified, allowing Festival to process the file in an appropriate way. Currently only two types are considered stable, STML and raw, but other types such as email, HTML, LaTeX, etc. are being developed and are discussed below. This follows the idea of buffer modes in Emacs, where a file's type can be used to best display the text. The text mode may also be selected based on a filename's extension.

Within the command interpreter, the function tts is used to render files as speech; it takes a filename and the text mode as arguments.
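As a rough sketch, tts can also be called from the shell without an interactive session, by running Festival in batch mode and passing the call as a command-line expression. The helper name tts_file is hypothetical. Two things are assumptions to check against your installation: that a mode of nil leaves the choice of mode to Festival, and that a named mode is written as a quoted symbol.

```shell
# tts_file FILE [MODE]: hypothetical wrapper, not part of Festival.
# Runs Festival in batch mode (-b) and evaluates (tts "FILE" MODE).
# MODE is a Scheme expression such as nil or 'email; it defaults to nil
# (assumed here to leave the choice of mode to Festival).
# FILE must not contain double quotes or backslashes.
tts_file() {
    tts_file_mode=${2:-nil}
    festival -b "(tts \"$1\" $tts_file_mode)"
}
```

For example, `tts_file myfile` evaluates (tts "myfile" nil), and `tts_file message.txt "'email"` requests the email mode explicitly.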

9.1 Utterance chunking: From text to utterances
9.2 Text modes: Mode-specific text analysis
9.3 Example text mode: An example mode for reading email


9.1 Utterance chunking

Text to speech works by first tokenizing the file and then chunking the tokens into utterances. Utterance breaks are determined by the utterance tree in the variable eou_tree. A default version is given in `lib/tts.scm'.
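To make the idea of chunking concrete, the following is a minimal shell sketch. It is not Festival's implementation and is far cruder than eou_tree. It treats blank lines and the punctuation ?, ! and : as breaks, and treats a period as a break only when the word does not look like an abbreviation and the next token starts with a capital letter. The function name chunk_utterances and the abbreviation patterns are assumptions made for illustration.

```shell
# chunk_utterances: hypothetical helper, not part of Festival.
# Reads text on standard input and prints one utterance per line.
chunk_utterances() {
    awk '
    BEGIN { RS = "" }                      # paragraph mode: blank lines end a record
    {
        n = split($0, tok, /[ \t\n]+/)     # whitespace-separated tokens
        utt = ""
        for (i = 1; i <= n; i++) {
            if (tok[i] == "") continue
            utt = (utt == "" ? tok[i] : utt " " tok[i])
            brk = 0
            if (tok[i] ~ /[?!:]$/) {
                brk = 1                    # unambiguous punctuation
            } else if (tok[i] ~ /\.$/) {
                word = substr(tok[i], 1, length(tok[i]) - 1)
                # Abbreviation guess: internal period, short capitalised word, or "etc"
                abbrev = (word ~ /\./ || word ~ /^[A-Z][A-Za-z]?[A-Za-z]?$/ || word == "etc")
                nextcap = (i < n && tok[i + 1] ~ /^[A-Z]/)
                brk = (!abbrev && nextcap)
            }
            if (brk) { print utt; utt = "" }
        }
        if (utt != "") print utt           # the end of a paragraph always ends an utterance
    }'
}
```

Given the input "Dr. Smith arrived at 5 p.m. today. He was late.", the sketch keeps "Dr." and "p.m." inside the first utterance and breaks only after "today.". Each utterance could then be piped to festival --tts in turn.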

The default version uses a decision tree to determine what signifies an utterance break. Blank lines are probably the most reliable indicator, followed by certain punctuation. Because periods are used for both sentence breaks and abbreviations, further heuristics are needed to guess which use is intended. The following tree is currently used; it works better than simply using punctuation.
