LR Rabiner and RW Schafer, June 3
DRAFT: L. R. Rabiner and R. W. Schafer, June 3, 2009
8.5. HOMOMORPHIC FILTERING OF NATURAL SPEECH
[Figure 8.33: Homomorphic filtering of voiced speech; (a) and (b) estimates of log magnitude and phase of $E_w(e^{j\omega})$. Panel (a): log magnitude of excitation component. Panel (b): phase (radians) of excitation component. Horizontal axis: frequency (Hz), 0 to 4000.]
$\hat{y}[n] = l_{hp}[n]\hat{x}[n]$, which is shown in Figure 8.32b, approximates an impulse train with spacing equal to the pitch period and amplitudes retaining the shape of the Hamming window used to weight the input signal. Thus, with the highpass lifter, $y[n]$ serves as an estimate of $e_w[n]$.
If the same value of $n_{co}$ is used for both the lowpass and highpass lifters, then $l_{lp}[n] + l_{hp}[n] = 1$ for all $n$. Thus, the choice of the lowpass and highpass lifters defines $e_w[n]$ and $h_V[n]$ so that $h_V[n] * e_w[n] = x[n]$; i.e., convolution of the waveforms in Figures 8.32a and 8.32b will result in the original windowed speech signal shown in Figure 8.32c. In terms of the corresponding discrete-time Fourier transforms, adding the curves in Figures 8.33a and 8.33b to the smooth curves plotted with thick lines in Figures 8.29a and 8.29b, respectively, results in the rapidly varying curves in Figures 8.29a and 8.29b.
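The additive relation between the smooth and rapidly varying log-magnitude curves can be checked numerically. The following is a minimal sketch, not the procedure used to produce the figures: it works only with the log magnitude (the real cepstrum) rather than the complex cepstrum, it uses a synthetic voiced-like segment, and it assumes the lifters are defined as $l_{lp}[n] = 1$ for $|n| < n_{co}$ and $l_{hp}[n] = 1 - l_{lp}[n]$. The sampling rate, pitch period, resonance, and cutoff values are illustrative assumptions.

```python
import numpy as np

# Illustrative (assumed) parameters, not taken from the text.
fs = 8000            # sampling rate in Hz
N = 1024             # DFT length (even)
seg_len = 400        # length of the windowed segment
pitch_period = 80    # samples between excitation impulses
nco = 50             # lifter cutoff quefrency in samples

# Synthetic voiced-like segment: impulse train through a damped resonance,
# weighted by a Hamming window.
excitation = np.zeros(seg_len)
excitation[::pitch_period] = 1.0
m = np.arange(100)
resonance = (0.95 ** m) * np.cos(2 * np.pi * 700 * m / fs)
segment = np.convolve(excitation, resonance)[:seg_len] * np.hamming(seg_len)

# Log magnitude spectrum and its inverse DFT (the cepstrum).
X = np.fft.rfft(segment, N)
log_mag = np.log(np.abs(X) + 1e-12)   # small offset guards against log(0)
c = np.fft.irfft(log_mag, N)

# Symmetric lowpass and highpass lifters with a common cutoff.
# Index n and index N - n both represent quefrency magnitude |n|.
idx = np.arange(N)
abs_n = np.minimum(idx, N - idx)
l_lp = (abs_n < nco).astype(float)
l_hp = 1.0 - l_lp

# Smooth (vocal tract) and rapidly varying (excitation) log-magnitude parts.
smooth = np.fft.rfft(l_lp * c).real
fine = np.fft.rfft(l_hp * c).real

# Because the lifters sum to one, the two parts add up to the original curve.
max_error = np.max(np.abs(smooth + fine - log_mag))
print("largest deviation from log magnitude:", max_error)
```

Because the DFT is linear, the same argument applies to the phase curves when the complex cepstrum is liftered; this sketch omits that step to avoid the details of phase unwrapping.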
8.5.4 Minimum-Phase Analysis
Since the cepstrum is the inverse Fourier transform of the logarithm of the magnitude of the Fourier transform of the windowed speech segment, it is also the even part of the complex cepstrum. If the input signal is known to have the minimum-phase property, we also know that the complex cepstrum is zero for $n < 0$, and therefore it can be obtained from the cepstrum by the operation in Eq. (8.35), which can be seen to be equivalent to multiplying the cepstrum by a lifter; i.e., $\hat{x}_{mnp}[n] = l_{mnp}[n]c[n]$, where
$$
l_{mnp}[n] =
\begin{cases}
0 & n < 0 \\
1 & n = 0 \\
2 & 0 < n.
\end{cases}
\tag{8.99a}
$$
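The lifter in Eq. (8.99a) can be tried on a short sequence. The following is a minimal sketch under stated assumptions: the cepstrum is approximated with an even-length DFT, so indices above `nfft // 2` stand in for negative quefrencies, and the bin at `nfft // 2` (which is its own negative) is given weight 1, a common DFT convention that is not part of Eq. (8.99a). The function names and the two-sample test sequences are hypothetical and chosen only for illustration.

```python
import numpy as np

def minimum_phase_lifter(nfft):
    """DFT-indexed version of l_mnp[n]; nfft is assumed to be even."""
    lifter = np.zeros(nfft)
    lifter[0] = 1.0               # n = 0
    lifter[1:nfft // 2] = 2.0     # 0 < n
    lifter[nfft // 2] = 1.0       # assumed convention for the shared bin
    return lifter                 # indices above nfft // 2 (n < 0) stay 0

def minimum_phase_analysis(x, nfft=1024):
    """Return the cepstrum, the liftered cepstrum, and the sequence it implies."""
    X = np.fft.fft(x, nfft)
    c = np.fft.ifft(np.log(np.abs(X))).real         # cepstrum (even sequence)
    xhat_mnp = minimum_phase_lifter(nfft) * c       # xhat_mnp[n] = l_mnp[n] c[n]
    x_mnp = np.fft.ifft(np.exp(np.fft.fft(xhat_mnp))).real
    return c, xhat_mnp, x_mnp

# A minimum-phase sequence (zero at z = -0.5) is recovered from its cepstrum.
c_min, xhat_min, x_min = minimum_phase_analysis(np.array([1.0, 0.5]))

# Its time reversal is maximum phase with the same magnitude spectrum, so the
# same minimum-phase sequence results.
c_max, xhat_max, x_max = minimum_phase_analysis(np.array([0.5, 1.0]))

print(np.round(x_min[:4], 6))
print(np.round(x_max[:4], 6))
```

The second call illustrates why the minimum-phase assumption matters: the cepstrum retains only magnitude information, so any sequence with the same magnitude spectrum maps to the same minimum-phase result.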