Festival Speech Synthesis System: - Speech Resource Pages
20.1.2 Generating LPC coefficients
LPC coefficients are generated using the `sig2fv' command, and the residual using `sigfilter'. Two stages are
required: generating the LPC coefficients, then generating the residual by inverse filtering the waveform with
those coefficients. The prototypical commands for these are
sig2fv wav/file001.wav -o lpc/file001.lpc -otype est -lpc_order 16 \
    -coefs "lpc" -pm pm/file001.pm -preemph 0.95 -factor 3 \
    -window_type hamming
sigfilter wav/file001.wav -o lpc/file001.res -otype nist \
    -lpcfilter lpc/file001.lpc -inv_filter
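A database normally holds many utterances, so the two commands have to be repeated for each file. The following is a minimal sketch of one way to do that, assuming the wav/, pm/ and lpc/ layout of the example above. The names `lpc_commands' and `run_all' are hypothetical helpers, not part of Festival or the speech tools; the command-line options are copied unchanged from the example.

```python
import subprocess


def lpc_commands(stem):
    """Build the two example command lines for one utterance.

    `stem` is the file name without directory or extension, e.g. "file001".
    The directory layout and all options are copied from the example above.
    """
    wav = f"wav/{stem}.wav"
    lpc = f"lpc/{stem}.lpc"
    analysis = [
        "sig2fv", wav, "-o", lpc, "-otype", "est",
        "-lpc_order", "16", "-coefs", "lpc",
        "-pm", f"pm/{stem}.pm", "-preemph", "0.95",
        "-factor", "3", "-window_type", "hamming",
    ]
    residual = [
        "sigfilter", wav, "-o", f"lpc/{stem}.res", "-otype", "nist",
        "-lpcfilter", lpc, "-inv_filter",
    ]
    return analysis, residual


def run_all(stems, run=subprocess.run):
    """Run both stages for every stem (hypothetical helper).

    The coefficients are generated first because the residual
    command reads the .lpc file that the first command writes.
    """
    for stem in stems:
        for command in lpc_commands(stem):
            # check=True raises an error at the first failing command.
            run(command, check=True)
```

The list of stems could be collected with, for example, sorted(p.stem for p in pathlib.Path("wav").glob("*.wav")). Passing a different `run' function allows the commands to be printed or logged rather than executed.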
For some databases you may need to normalize the power. Properly normalizing power is difficult, but we provide a
simple function which may do the job acceptably. You should do this on the waveform before LPC analysis (and
ensure you also do the residual extraction on the normalized waveform rather than the original).
ch_wave -scaleN 0.5 wav/file001.wav -o file001.Nwav
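This normalizes the power by first scaling the signal up to its maximum level and then multiplying it by the given
factor. If the database waveforms are clean (i.e. no clicks) this can give reasonable results; a click would
dominate the peak level and leave the speech itself quieter than intended.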
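The effect of this kind of scaling, and the reason clicks matter, can be illustrated with a short sketch. This is not the `ch_wave' implementation, only a model of the behaviour described above; the function name `scale_normalized' and the `full_scale' parameter (defaulting to the 16-bit maximum) are assumptions made for the illustration.

```python
def scale_normalized(samples, factor, full_scale=32767):
    """Scale samples so the peak reaches full_scale, then multiply by factor.

    Illustrative model only; full_scale=32767 assumes 16-bit samples.
    """
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        # Silence (or an empty waveform) has nothing to scale.
        return list(samples)
    gain = full_scale / peak * factor
    return [int(round(s * gain)) for s in samples]


# A clean waveform: the loudest speech sample sets the gain.
clean = scale_normalized([0, 100, -200], 0.5, full_scale=1000)

# The same waveform with one click of amplitude 1000: the click
# sets the gain, so the speech samples come out much quieter.
clicked = scale_normalized([0, 100, -200, 1000], 0.5, full_scale=1000)
```

In the second call the speech samples end up at a fifth of the level they reach in the first, which is why the method is only suggested for clean recordings.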
20.2 Generating a diphone index
The diphone index consists of a short header followed by an ASCII list of diphones, each giving the file it comes
from followed by its start, middle and end times in seconds. For most databases this file needs to be generated by
some database-specific script.
An example header is
EST_File index<br />
DataType ascii<br />
NumEntries 2005<br />
IndexName rab_diphone<br />
EST_Header_End<br />
The most notable part is the number of entries, which can get out of sync with the actual number of
entries if you edit the file by hand. That is, if you add an entry and the system still can't find it, check that
the number of entries is right.
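This check is easy to automate. The following is a minimal sketch, assuming the header layout shown in the example above and one entry per non-blank line after `EST_Header_End'. The function name `check_num_entries' is hypothetical.

```python
def check_num_entries(text):
    """Return (declared, actual) entry counts for the text of a diphone index.

    `declared` is the NumEntries value from the header (None if absent);
    `actual` is the number of non-blank lines after EST_Header_End.
    """
    lines = [line.strip() for line in text.splitlines()]
    # Raises ValueError if the header terminator is missing.
    header_end = lines.index("EST_Header_End")
    declared = None
    for line in lines[:header_end]:
        fields = line.split()
        if len(fields) == 2 and fields[0] == "NumEntries":
            declared = int(fields[1])
    actual = sum(1 for line in lines[header_end + 1:] if line)
    return declared, actual
```

If the two numbers differ after a hand edit, the NumEntries line in the header is the first thing to correct.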
The entries themselves may take one of two forms: full entries or index entries. Full entries consist of a diphone
name, where the phones are separated by "-"; a file name, which is used to index into the pitchmark, LPC and
waveform files; and the start, middle (change-over point between phones) and end of the diphone in the file, in
seconds. For example
r-uh edx_1001 0.225 0.261 0.320<br />
r-e edx_1002 0.224 0.273 0.326<br />
r-i edx_1003 0.240 0.280 0.321<br />
r-o edx_1004 0.212 0.253 0.320<br />
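Since the index usually comes from a database-specific script, it can help to parse and write full entries through one small set of functions, so that the entry count in the header always matches the list. The sketch below assumes the five whitespace-separated fields shown above. The names `DiphoneEntry', `parse_full_entry' and `format_index' are hypothetical, and the check that start < middle < end is an added sanity check rather than a documented requirement.

```python
from typing import NamedTuple


class DiphoneEntry(NamedTuple):
    name: str      # e.g. "r-uh": two phones separated by "-"
    file_id: str   # indexes into the pitchmark, LPC and waveform files
    start: float   # seconds
    mid: float     # change-over point between the phones, seconds
    end: float     # seconds


def parse_full_entry(line):
    """Parse one full entry; raise ValueError if it looks malformed."""
    name, file_id, start, mid, end = line.split()
    left, dash, right = name.partition("-")
    if not (left and dash and right):
        raise ValueError(f"diphone name needs two phones: {name!r}")
    entry = DiphoneEntry(name, file_id, float(start), float(mid), float(end))
    # Assumed sanity check: the three times should be in increasing order.
    if not entry.start < entry.mid < entry.end:
        raise ValueError(f"times out of order for {name!r}")
    return entry


def format_index(index_name, entries):
    """Return index text whose NumEntries is computed from the entries."""
    header = [
        "EST_File index",
        "DataType ascii",
        f"NumEntries {len(entries)}",
        f"IndexName {index_name}",
        "EST_Header_End",
    ]
    body = [
        f"{e.name} {e.file_id} {e.start:.3f} {e.mid:.3f} {e.end:.3f}"
        for e in entries
    ]
    return "\n".join(header + body) + "\n"
```

Only full entries are handled here; index entries would need a separate case.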
The second form of entry is an index entry, which simply states that any reference to that diphone should actually be made
to another. For example