LR Rabiner and RW Schafer, June 3
DRAFT: L. R. Rabiner and R. W. Schafer, June 3, 2009
CHAPTER 8. THE CEPSTRUM AND HOMOMORPHIC SPEECH PROCESSING
where $\Longleftrightarrow$ denotes the unique relationship between a sequence and its DTFT.
An interesting result can be obtained if we represent the complex cepstrum as
$$\hat{h}[n] = c[n] + d[n], \tag{8.112}$$
where $c[n] = \mathrm{Ev}\{\hat{h}[n]\}$ is the even part and $d[n] = \mathrm{Odd}\{\hat{h}[n]\}$ is the odd part of
the complex cepstrum. Recalling that the DTFT of the complex cepstrum is,
by definition, $\hat{H}(e^{j\omega}) = \log|H(e^{j\omega})| + j\arg\{H(e^{j\omega})\}$, it can be shown that the
following DTFT relations hold:
$$n\,c[n] \Longleftrightarrow j\,\frac{d\log|H(e^{j\omega})|}{d\omega}, \tag{8.113a}$$
and
$$n\,d[n] \Longleftrightarrow -\frac{d\arg\{H(e^{j\omega})\}}{d\omega}. \tag{8.113b}$$
The DTFT expression on the right in (8.113b) is the group delay function [15]
for $H(e^{j\omega})$; i.e.,
$$\mathrm{grd}\{H(e^{j\omega})\} = -\frac{d\arg\{H(e^{j\omega})\}}{d\omega}. \tag{8.114}$$
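As a numerical sanity check on (8.113b) and (8.114), the following sketch compares the group delay of a small all-pole model computed two ways: from a finite difference of the phase, and from the cepstrum via the DTFT of $n\,d[n]$. It is an illustration under assumptions, not a procedure from the text: the model is a hypothetical one specified by its poles $p_k$, the function names are invented here, and the cepstrum is obtained from the standard series $\hat{h}[n] = \sum_k p_k^n / n$ for $n > 0$, which holds for a minimum-phase all-pole system with unit gain.

```python
import cmath
import math

# Hypothetical all-pole model H(z) = 1 / prod_k (1 - p_k z^-1), given by
# two conjugate pole pairs inside the unit circle (illustrative values).
POLES = [cmath.rect(0.9, 0.3 * math.pi), cmath.rect(0.9, -0.3 * math.pi),
         cmath.rect(0.8, 0.6 * math.pi), cmath.rect(0.8, -0.6 * math.pi)]

def complex_cepstrum(poles, n_max):
    """hhat[n] for n = 1..n_max, using hhat[n] = sum_k p_k^n / n (n > 0)."""
    return [sum(p ** n for p in poles).real / n for n in range(1, n_max + 1)]

def freq_response(poles, w):
    """Evaluate H(e^{jw}) directly from the poles."""
    h = 1.0 + 0j
    for p in poles:
        h /= 1.0 - p * cmath.exp(-1j * w)
    return h

def group_delay_from_phase(poles, w, eps=1e-5):
    """-d arg{H}/dw by a central difference; the ratio avoids phase unwrapping."""
    ratio = freq_response(poles, w + eps) / freq_response(poles, w - eps)
    return -cmath.phase(ratio) / (2.0 * eps)

def group_delay_from_cepstrum(hhat, w):
    """DTFT of n*d[n]. With hhat[n] = 0 for n < 0, n*d[n] is even and equals
    n*hhat[n]/2 for n > 0, so the two-sided sum collapses to a cosine series."""
    return sum(n * h_n * math.cos(w * n) for n, h_n in enumerate(hhat, start=1))

hhat = complex_cepstrum(POLES, 400)  # 0.9**400 is negligible, so 400 terms suffice
for w in (0.1 * math.pi, 0.3 * math.pi, 0.6 * math.pi):
    print(round(w, 3), group_delay_from_phase(POLES, w),
          group_delay_from_cepstrum(hhat, w))
```

The two columns should agree to within the finite-difference error, and the largest values appear near the pole angles, which is why the group delay is a useful indicator of resonance locations.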
Now if $h[n]$ is assumed to be obtained by all-pole modeling as discussed in
Section 8.6, the complex cepstrum satisfies $\hat{h}[n] = 0$ for $n < 0$. This means that
$\hat{h}[n] = 2c[n] = 2d[n]$ for $n > 0$. If we define $l[n] = n$, then the liftered cepstrum
distance
$$D = \sum_{m=-\infty}^{\infty} \bigl|\,l[m]c[m] - l[m]\bar{c}[m]\,\bigr|^2 = \sum_{m=-\infty}^{\infty} \bigl|\,l[m]d[m] - l[m]\bar{d}[m]\,\bigr|^2 \tag{8.115a}$$
is equivalent, by Parseval's theorem, to either
$$D = \frac{1}{2\pi}\int_{-\pi}^{\pi} \left| \frac{d\log|H(e^{j\omega})|}{d\omega} - \frac{d\log|\bar{H}(e^{j\omega})|}{d\omega} \right|^2 d\omega \tag{8.115b}$$
or
$$D = \frac{1}{2\pi}\int_{-\pi}^{\pi} \bigl( \mathrm{grd}\{H(e^{j\omega})\} - \mathrm{grd}\{\bar{H}(e^{j\omega})\} \bigr)^2 d\omega. \tag{8.115c}$$
The result of (8.115b) was also given by Tohkura [26].
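The equivalence of (8.115a) with (8.115b) and (8.115c) can be checked numerically. The sketch below is an illustration under assumptions, not the authors' procedure: the two models are hypothetical all-pole systems given by their poles, the function names are invented here, the cepstral differences are squared (as Parseval's theorem requires), the cepstral sum is truncated, and the integrals are approximated by a Riemann sum on a uniform frequency grid.

```python
import cmath
import math

def pole_pairs(specs):
    """Expand (radius, angle) tuples into complex-conjugate pole pairs."""
    poles = []
    for r, theta in specs:
        poles += [cmath.rect(r, theta), cmath.rect(r, -theta)]
    return poles

def freq_response(poles, w):
    """H(e^{jw}) for the all-pole model H(z) = 1 / prod_k (1 - p_k z^-1)."""
    h = 1.0 + 0j
    for p in poles:
        h /= 1.0 - p * cmath.exp(-1j * w)
    return h

def log_spectrum_slope(poles, w, eps=1e-5):
    """Central-difference estimate of d/dw log H(e^{jw}).
    Real part: d log|H|/dw.  Imaginary part: d arg{H}/dw = -grd{H}."""
    ratio = freq_response(poles, w + eps) / freq_response(poles, w - eps)
    return cmath.log(ratio) / (2.0 * eps)

def cepstral_distance(poles_a, poles_b, n_terms=400):
    """Two-sided sum of (m*d[m] - m*dbar[m])**2 with l[m] = m, as in (8.115a).
    For m > 0, m*hhat[m] = sum_k p_k^m and m*d[m] = m*hhat[m]/2; since m*d[m]
    is even in m, the two-sided sum is twice the one-sided sum."""
    total = 0.0
    for m in range(1, n_terms + 1):
        diff = (sum(p ** m for p in poles_a) - sum(p ** m for p in poles_b)).real / 2.0
        total += 2.0 * diff * diff
    return total

def spectral_distances(poles_a, poles_b, n_freq=1024):
    """Riemann-sum estimates of (8.115b) and (8.115c) over [-pi, pi)."""
    d_logmag = 0.0
    d_grd = 0.0
    for k in range(n_freq):
        w = -math.pi + 2.0 * math.pi * k / n_freq
        delta = log_spectrum_slope(poles_a, w) - log_spectrum_slope(poles_b, w)
        d_logmag += delta.real ** 2 / n_freq  # (1/2pi) * (2pi/n_freq) * |.|^2
        d_grd += delta.imag ** 2 / n_freq
    return d_logmag, d_grd

# Illustrative models: the first resonance is moved and broadened in model B.
model_a = pole_pairs([(0.90, 0.30 * math.pi), (0.80, 0.60 * math.pi)])
model_b = pole_pairs([(0.85, 0.35 * math.pi), (0.80, 0.60 * math.pi)])
print(cepstral_distance(model_a, model_b), spectral_distances(model_a, model_b))
```

All three printed numbers should agree closely. The cepstral form is the cheapest to evaluate, since it needs only a short sum over quefrency rather than derivatives of the spectrum.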
Instead of $l[n] = n$ for all $n$, or the lifter of (8.110), Itakura proposed the
lifter
$$l[n] = n^{s}\, e^{-n^{2}/2\tau^{2}}. \tag{8.116}$$
This lifter has great flexibility. For example, if $s = 0$ we have simply
low-quefrency liftering of the cepstrum. If $s = 1$ and $\tau$ is large, we have essentially
$l[n] = n$ for small $n$ with high-quefrency tapering. The effect of liftering with
Eq. (8.116) is illustrated in Figure 8.40. Panel (a) shows the short-time Fourier
transform of a segment of voiced speech along with a linear predictive analysis
spectrum with $p = 12$. Panel (b) shows the liftered group delay spectrum for
$s = 1$ and $\tau$ ranging from 5 to 30. Observe that as $\tau$ increases, the formant
frequencies are increasingly emphasized. If larger values of $s$ are used, even
greater enhancement of the resonance structure is observed.
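To make the behavior of the lifter in (8.116) concrete, the sketch below evaluates it and applies it to the group delay series of a hypothetical single-resonance all-pole model. This is an illustration under assumptions rather than the procedure used for Figure 8.40: the pole location, the truncation length, and the function names are invented here, the cepstrum is taken from the series $\hat{h}[n] = \sum_k p_k^n / n$ for $n > 0$, and the "liftered group delay spectrum" is assumed to mean the DTFT of $l[n]\,d[n]$.

```python
import cmath
import math

def itakura_lifter(n, s, tau):
    """l[n] = n**s * exp(-n**2 / (2 tau**2)) for n >= 0, as in (8.116)."""
    return (n ** s) * math.exp(-(n * n) / (2.0 * tau * tau))

def liftered_group_delay(poles, w, s, tau, n_terms=400):
    """DTFT of l[n]*d[n] for a minimum-phase all-pole model (assumed form).
    For s = 1 both l[n] and d[n] are odd, so l[n]*d[n] is even; with
    d[n] = hhat[n]/2 for n > 0 the DTFT reduces to
    sum_{n>=1} l[n] * hhat[n] * cos(w n)."""
    total = 0.0
    for n in range(1, n_terms + 1):
        hhat_n = sum(p ** n for p in poles).real / n
        total += itakura_lifter(n, s, tau) * hhat_n * math.cos(w * n)
    return total

theta = 0.3 * math.pi  # hypothetical formant angle (radians/sample)
poles = [cmath.rect(0.9, theta), cmath.rect(0.9, -theta)]
for tau in (5, 10, 20, 30):
    print(tau, liftered_group_delay(poles, theta, s=1, tau=tau))
```

For a single resonance evaluated at its own angle, every term of the series is nonnegative, so the peak value grows monotonically with $\tau$, consistent with the trend described for Figure 8.40(b). Note also that the lifter itself peaks at $n = \tau\sqrt{s}$ (where $dl/dn = 0$), so with $s = 1$ the parameter $\tau$ directly sets the quefrency that receives the most weight.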