Festival Speech Synthesis System: - Speech Resource Pages
25. Tools
A number of basic data manipulation tools are supported by Festival. These often make building new modules very
easy and are already used in many of the existing modules. They typically offer a Scheme method for entering data,
and Scheme and C++ functions for evaluating it.
25.1 Regular expressions
25.2 CART trees            Building and using CART
25.3 Ngrams                Building and using Ngrams
25.4 Viterbi decoder       Using the Viterbi decoder
25.5 Linear regression     Building and using linear regression models
25.1 Regular expressions
Regular expressions are a formal method for describing a certain class of mathematical languages. They may be
viewed as patterns which match some set of strings. They are common in many software tools such as the UNIX
shell, Perl, awk and Emacs. Unfortunately the exact form of regular expressions often differs slightly between
applications, which can make them a little tricky to use.
Festival supports regular expressions based mainly on the form used in the GNU libg++ Regex class, though we have
our own implementation of it. Our implementation (EST_Regex) is actually based on Henry Spencer's
`regex.c' as distributed with BSD 4.4.
Regular expressions are represented as character strings which are interpreted as regular expressions by certain
Scheme and C++ functions. Most characters in a regular expression are treated as literals and match only that
character, but a number of others have special meaning. Some characters may be escaped with preceding backslashes
to change them from operators to literals (or sometimes from literals to operators).
.
    matches any character.
$
    matches end of string.
^
    matches beginning of string.
X*
    matches zero or more occurrences of X. X may be a character, range or parenthesized expression.
X+
    matches one or more occurrences of X. X may be a character, range or parenthesized expression.
X?
    matches zero or one occurrence of X. X may be a character, range or parenthesized expression.
[...]
    a range, which matches any of the values in the brackets. The range operator "-" allows specification of
    ranges, e.g. a-z for all lower case characters. If the first character of the range is ^ then it matches any
    character except those specified in the range. If you wish - to be in the range you must put it first.
\\(...\\)
    treats the contents of the parentheses as a single object, allowing the operators *, +, ? etc. to operate on
    more than single characters.
X\\|Y
    matches either X or Y. X or Y may be single characters, ranges or parenthesized expressions.
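The operators listed above can be tried out with a small C++ sketch. EST_Regex itself needs the Edinburgh Speech
Tools library, so this sketch substitutes the standard std::regex class and a hypothetical helper, to_ecmascript,
which is not part of Festival. The helper rewrites a pattern in the form described here into the ECMAScript syntax
that std::regex expects. It assumes that a pattern must match the whole string, that bracket expressions contain
no ] or backslash, and that backslashes are only used before punctuation. The patterns are written as C++ string
literals, so each backslash appears doubled.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper (not part of Festival or EST_Regex): rewrite a pattern
// written in the form described above into ECMAScript syntax for std::regex.
std::string to_ecmascript(const std::string& pat) {
    std::string out;
    for (std::size_t i = 0; i < pat.size(); ++i) {
        const char c = pat[i];
        if (c == '[') {
            // Copy a bracket expression verbatim up to its closing ']'.
            const std::size_t close = pat.find(']', i + 1);
            if (close == std::string::npos) { out += "\\["; continue; }
            out.append(pat, i, close - i + 1);
            i = close;
        } else if (c == '\\' && i + 1 < pat.size()) {
            const char next = pat[++i];
            if (next == '(' || next == ')' || next == '|') {
                out += next;        // escaped form is the operator
            } else {
                out += '\\';
                out += next;        // escaped literal, e.g. \. or \*
            }
        } else if (c == '\\') {
            out += "\\\\";          // trailing backslash: keep it as a literal
        } else if (c == '(' || c == ')' || c == '|' || c == '{' || c == '}') {
            out += '\\';
            out += c;               // bare form is a literal here
        } else {
            out += c;               // . $ ^ * + ? and ordinary characters
        }
    }
    return out;
}

// Whole-string match, an assumption made for this sketch.
bool matches(const std::string& text, const std::string& pattern) {
    return std::regex_match(text, std::regex(to_ecmascript(pattern)));
}
```

With these definitions, matches("abab", "\\(ab\\)+") and matches("dog", "cat\\|dog") are both true, while
matches("Hello", "[a-z]+") is false because of the upper case H.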
Note that actually only one backslash is needed before a character to escape it, but because these expressions are
most often contained within Scheme or C++ strings, the escape mechanism for those strings requires that the
backslash itself be escaped; hence you will most often be required to type two backslashes.
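The effect of the string escape mechanism can be checked directly in C++. The sketch below uses only the standard
library. The variable names are illustrative, and the pattern is the parenthesized example from the list above. It
shows that a literal typed with two backslashes holds only one backslash per escape, and that a C++11 raw string
literal avoids the doubling altogether.

```cpp
#include <cassert>
#include <string>

// Typed with doubled backslashes; the compiler stores the 7 characters
//   \ ( a b \ ) +
// which is the form the regular expression code actually receives.
const std::string escaped = "\\(ab\\)+";

// The same pattern as a raw string literal, typed with single backslashes.
const std::string raw = R"(\(ab\)+)";
```

Both variables hold the same seven characters, so either spelling may be used where a C++ string is passed as a
regular expression.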