LR Rabiner and RW Schafer, June 3
DRAFT: L. R. Rabiner and R. W. Schafer, June 3, 2009<br />
8.3. HOMOMORPHIC ANALYSIS OF THE SPEECH MODEL 447<br />
[Figure 8.15, four panels, each with horizontal axis "frequency in Hz" from 0 to 5000: (a) Glottal Pulse Spectrum, vertical axis $\log_e|G(e^{j2\pi FT})|$; (b) Vocal Tract Frequency Response, vertical axis $\log_e|V(e^{j2\pi FT})|$; (c) Radiation Load Frequency Response, vertical axis $\log_e|R(e^{j2\pi FT})|$; (d) Voiced Excitation Spectrum, vertical axis $|P(e^{j2\pi FT})|$.]<br />
Figure 8.15: Log magnitude (base e) of DTFTs: (a) Glottal pulse DTFT,<br />
$\log|G(e^{j\omega})|$. (b) Vocal tract frequency response, $\log|V(e^{j\omega})|$. (c) Radiation<br />
load frequency response, $\log|R(e^{j\omega})|$. (d) Magnitude of DTFT of periodic excitation,<br />
$|P(e^{j\omega})|$.<br />
in Figure 8.15 in corresponding locations. Note that the discrete-time Fourier<br />
transforms are plotted as $\log_e|\cdot|$ rather than in dB (i.e., $20\log_{10}|\cdot|$), as is<br />
common elsewhere throughout this text. To convert the plots in Figure 8.15(a),<br />
(b), and (c) to dB, simply multiply by $20\log_{10}e = 8.6859$. We see that the<br />
spectral contribution due to the glottal pulse is a lowpass component that has a<br />
dynamic range of about 6 (in natural-log units) between F = 0 and F = 5000 Hz. This is equivalent<br />
to about 50 dB of spectral falloff (6 × 8.6859 ≈ 52 dB). Figure 8.15(b) shows the spectral contribution<br />
of the vocal tract system. The peaks of the spectrum are approximately at<br />
the locations given in Table 8.2 with bandwidths that increase with increasing<br />
frequency. As depicted in Figure 8.15(c), the effect of radiation is to give a high<br />
frequency boost that partially compensates for the falloff due to the glottal<br />
pulse. Finally, Figure 8.15(d) shows $|P(e^{j2\pi FT})|$ (not the log) as a function of<br />
F. Note the periodic structure due to the periodicity of p[n]. The fundamental<br />
frequency for $N_p = 80$ is $F_0 = 10000/80 = 125$ Hz.<sup>6</sup><br />
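The arithmetic in this paragraph can be checked with a short Python sketch. It is illustrative only and is not the Matlab code used for Figure 8.15. The function names are hypothetical, the 10000 Hz sampling rate is inferred from the expression 10000/80, and the excitation model $P(z) = 1/(1 - \beta z^{-N_p})$ is an assumption made here (the model itself is not defined on this page); only the value β = 0.999 comes from footnote 6.

```python
import cmath
import math

# Factor that converts a natural-log magnitude to dB: 20*log10(e).
NEPER_TO_DB = 20.0 * math.log10(math.e)


def ln_mag_to_db(ln_mag):
    """Convert a log_e|.| value to 20*log10|.| (dB)."""
    return NEPER_TO_DB * ln_mag


def fundamental_hz(fs_hz, period_samples):
    """Fundamental frequency of an excitation with a period of Np samples."""
    return fs_hz / period_samples


def excitation_mag(f_hz, fs_hz, period_samples, beta):
    """|P(e^{j 2 pi F T})| for the ASSUMED model P(z) = 1/(1 - beta*z^(-Np))."""
    omega = 2.0 * math.pi * f_hz / fs_hz
    return 1.0 / abs(1.0 - beta * cmath.exp(-1j * omega * period_samples))


FS_HZ = 10000.0  # assumption: sampling rate implied by F0 = 10000/80
NP = 80          # pitch period in samples, from the text
BETA = 0.999     # value quoted in footnote 6

f0 = fundamental_hz(FS_HZ, NP)
print(round(NEPER_TO_DB, 4))        # 8.6859
print(round(ln_mag_to_db(6.0), 1))  # 52.1
print(f0)                           # 125.0

# Under the assumed model, the magnitude peaks at multiples of F0
# and dips midway between adjacent harmonics.
peak = excitation_mag(f0, FS_HZ, NP, BETA)
valley = excitation_mag(f0 / 2.0, FS_HZ, NP, BETA)
print(peak > valley)                # True
```

With β < 1 the peaks at the harmonics of 125 Hz stay finite, which is consistent with footnote 6. The peak heights under this assumed model need not match the vertical scale of Figure 8.15(d), since that depends on how the plot was actually computed.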
<sup>6</sup> To make the plot in Figure 8.15(d) in Matlab, it was necessary to use β = 0.999; i.e., the excitation was not perfectly periodic.<br />
Now if the components of the speech model are combined by convolution,<br />
as defined in the upper branch of Figure 8.12, the result is the synthetic speech<br />
signal s[n], which is plotted in Figure 8.16(a). The frequency-domain repre-<br />