LR Rabiner and RW Schafer, June 3
DRAFT: L. R. Rabiner and R. W. Schafer, June 3, 2009
8.5. HOMOMORPHIC FILTERING OF NATURAL SPEECH 471
cepstrum by $l[n]$ corresponds to convolving its DTFT $L(e^{j\omega})$ with the complex
logarithm $\hat{X}(e^{j\omega})$ as in
$$\hat{Y}(e^{j\omega}) = \frac{1}{2\pi}\int_{-\pi}^{\pi} \hat{X}(e^{j\theta})\, L(e^{j(\omega-\theta)})\, d\theta. \tag{8.95}$$
This operation, which is simply linear filtering of the complex logarithm of the
DTFT, was also called “liftering” by Bogert et al. [1], and therefore $l[n]$ is often
called a “lifter”. The resulting windowed complex cepstrum is processed by the
inverse characteristic system to recover the desired component.
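The discrete counterpart of Eq. (8.95) can be checked numerically. The following is a minimal sketch, assuming NumPy: the random sequence `x_hat` merely stands in for a complex cepstrum, the lifter cutoff is arbitrary, and `circular_convolve` is a helper introduced here for illustration. For an N-point DFT, the periodic convolution integral becomes a circular convolution scaled by 1/N.

```python
import numpy as np

def circular_convolve(a, b):
    """Circular convolution: out[k] = sum_m a[m] * b[(k - m) mod N]."""
    N = len(a)
    idx = (np.arange(N)[:, None] - np.arange(N)[None, :]) % N
    return (a[None, :] * b[idx]).sum(axis=1)

rng = np.random.default_rng(0)
N = 64

# Stand-in for a complex cepstrum (illustrative data, not from speech).
x_hat = rng.standard_normal(N)

# Illustrative even lifter in DFT sample ordering (cutoff chosen arbitrarily).
lifter = np.zeros(N)
lifter[:8] = 1.0    # quefrencies 0..7
lifter[-7:] = 1.0   # quefrencies -7..-1

# Route 1: multiply in the quefrency domain, then transform.
Y_direct = np.fft.fft(lifter * x_hat)

# Route 2: transform each sequence, then convolve circularly in frequency.
Y_conv = circular_convolve(np.fft.fft(x_hat), np.fft.fft(lifter)) / N

print(np.allclose(Y_direct, Y_conv))
```

The two routes agree to within rounding error, which is why multiplying the cepstrum by a lifter can be read as linear filtering of the complex logarithm.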
This is illustrated by the thick lines in Figures 8.29a and 8.29b, which show
the log magnitude and phase obtained in the process of implementing the inverse
characteristic system (i.e., $\hat{Y}(e^{j\omega})$) when $l[n]$ is of the form
$$l_{lp}[n] = \begin{cases} 1, & |n| < n_{co} \\ 0.5, & |n| = n_{co} \\ 0, & |n| > n_{co}, \end{cases} \tag{8.96}$$
where, in general, $n_{co}$ is chosen to be less than the pitch period, $N_p$, and in this
example, $n_{co} = 50$ as shown in Figure 8.30a.$^{18}$
When using the DFT implementation, the lifter in Eq. (8.96) must conform
to the sample ordering of the DFT, i.e., the negative quefrencies fall in the
interval $N/2 < n \le N-1$ for an $N$-point DFT. Thus, for DFT implementations,
the lowpass lifter has the form
$$\tilde{l}_{lp}[n] = \begin{cases} 1, & 0 \le n < n_{co} \\ 0.5, & n = n_{co} \\ 0, & n_{co} < n < N - n_{co} \\ 0.5, & n = N - n_{co} \\ 1, & N - n_{co} < n \le N - 1. \end{cases} \tag{8.97}$$
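As a minimal sketch, assuming NumPy, the DFT-ordered lifter of Eq. (8.97) could be generated by mapping each DFT index to its signed quefrency and then applying the definition of Eq. (8.96). The function name `lowpass_lifter_dft` and its argument names are hypothetical, chosen only for this illustration.

```python
import numpy as np

def lowpass_lifter_dft(N, nco):
    """Lowpass lifter in N-point DFT sample ordering (sketch of Eq. 8.97)."""
    if not 0 < nco < N // 2:
        raise ValueError("nco must satisfy 0 < nco < N/2")
    n = np.arange(N)
    # Indices above N/2 represent negative quefrencies n - N.
    signed = np.where(n <= N // 2, n, n - N)
    mag = np.abs(signed)
    lifter = np.zeros(N)
    lifter[mag < nco] = 1.0    # passband
    lifter[mag == nco] = 0.5   # one-sample transition
    return lifter

# Small case so the wrap-around is easy to inspect.
print(lowpass_lifter_dft(16, 3))
```

For N = 16 and a cutoff of 3, the ones occupy indices 0 to 2 and 14 to 15, with the 0.5 transition samples at indices 3 and 13.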
For simplicity, we shall henceforth define lifters in DTFT form as in Eq. (8.96),
recognizing that the DFT form is always obtained by the process that yielded
Eq. (8.97).
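To make lowpass liftering concrete, the following is a minimal sketch, assuming NumPy, applied to a synthetic voiced frame. Everything about the signal is an assumption made for illustration: the 8 kHz sampling rate, the 80-sample pitch period, and the single damped resonance standing in for a vocal tract response. The sketch also swaps in the real cepstrum (inverse DFT of the log magnitude) for the complex cepstrum. Because the lifter is even, the real part of the liftered log spectrum depends only on the even part of the complex cepstrum, so the log magnitude can be smoothed without phase unwrapping.

```python
import numpy as np

fs = 8000    # assumed sampling rate in Hz
N = 1024     # DFT length
Np = 80      # assumed pitch period in samples
nco = 50     # lifter cutoff, chosen below Np
L = 400      # frame length in samples

# Hypothetical vocal tract response: one damped resonance near 500 Hz.
m = np.arange(200)
h = (0.97 ** m) * np.cos(2 * np.pi * 500 * m / fs)

# Impulse-train excitation, filtered and windowed to form the frame.
e = np.zeros(L)
e[::Np] = 1.0
x = np.convolve(e, h)[:L] * np.hamming(L)

# Log magnitude spectrum and real cepstrum.
log_mag = np.log(np.abs(np.fft.fft(x, N)) + 1e-12)
c = np.fft.ifft(log_mag).real

# Lowpass lifter in DFT ordering (0.5 at the transition samples).
k = np.arange(N)
q = np.minimum(k, N - k)   # magnitude of the signed quefrency
lifter = np.where(q < nco, 1.0, np.where(q == nco, 0.5, 0.0))

# Smoothed log magnitude: transform of the liftered cepstrum.
Y = np.fft.fft(lifter * c)
smooth = Y.real

def roughness(s):
    """Sum of squared circular first differences."""
    return np.sum((s - np.roll(s, 1)) ** 2)

print(roughness(smooth) < roughness(log_mag))
```

Plotting `log_mag` and `smooth` against `k * fs / N` should show the pitch harmonics in the former and only the slowly varying envelope in the latter.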
The thick lines that are superimposed on the plots of $\log|X(e^{j\omega})|$ and
$\arg\{X(e^{j\omega})\}$ in Figures 8.29a and 8.29b show the real and imaginary parts
of $\hat{Y}(e^{j\omega})$ corresponding to the liftered complex cepstrum $\hat{y}[n] = l_{lp}[n]\hat{x}[n]$.
By comparing these plots to the corresponding plots with thin lines in Figures
8.29b and 8.29c, respectively, it can be seen that $\hat{Y}(e^{j\omega})$ is a lowpass filtered
(smoothed) version of $\hat{X}(e^{j\omega})$. The result of the lowpass liftering is to remove
the effect of the excitation in the short-time Fourier transform. That is, retaining
only the low quefrency components of the complex cepstrum is a way of
estimating $\hat{H}_V(e^{j\omega}) = \log|H_V(e^{j\omega})| + j\arg\{H_V(e^{j\omega})\}$, the complex logarithm
of the frequency response of the vocal tract system. We see that the smoothed
log magnitude function in Figure 8.29a clearly displays formant resonances at
about 500, 1500, 2250, and 3100 Hz. Also note that if the lifter $l_{lp}[n]$ is applied
$^{18}$ A one-sample transition is included in Eq. (8.96). Expanding or omitting this transition
completely usually has little effect.