18.07.2013 Views

LR Rabiner and RW Schafer, June 3


DRAFT: L. R. Rabiner and R. W. Schafer, June 3, 2009

CHAPTER 8. THE CEPSTRUM AND HOMOMORPHIC SPEECH PROCESSING

where $\Longleftrightarrow$ denotes the unique relationship between a sequence and its DTFT.

An interesting result can be obtained if we represent the complex cepstrum as

$$\hat{h}[n] = c[n] + d[n], \tag{8.112}$$

where $c[n] = \mathrm{Ev}\{\hat{h}[n]\}$ is the even part and $d[n] = \mathrm{Odd}\{\hat{h}[n]\}$ is the odd part of the complex cepstrum. Recalling that the DTFT of the complex cepstrum is, by definition, $\hat{H}(e^{j\omega}) = \log|H(e^{j\omega})| + j\arg\{H(e^{j\omega})\}$, it can be shown that the following DTFT relations hold:

$$n\,c[n] \Longleftrightarrow j\,\frac{d\log|H(e^{j\omega})|}{d\omega} \tag{8.113a}$$

and

$$n\,d[n] \Longleftrightarrow -\frac{d\arg\{H(e^{j\omega})\}}{d\omega}. \tag{8.113b}$$

The DTFT expression on the right in (8.113b) is the group delay function [15] for $H(e^{j\omega})$; i.e.,

$$\mathrm{grd}\{H(e^{j\omega})\} = -\frac{d\arg\{H(e^{j\omega})\}}{d\omega}. \tag{8.114}$$
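The relations (8.113a), (8.113b), and (8.114) can be checked numerically. The following is a minimal sketch, not taken from the text. It assumes a hypothetical minimum-phase all-pole model $H(z) = 1/\prod_k (1 - a_k z^{-1})$ with unit gain, for which the complex cepstrum has the closed form $\hat{h}[n] = \sum_k a_k^n / n$ for $n > 0$ and is zero otherwise. The pole values, the truncation length `N_MAX`, and all function names are illustrative choices, and the frequency derivatives are approximated by central differences.

```python
import cmath
import math

# Assumed example model: one complex-conjugate pole pair inside the unit circle.
POLES = [0.9 * cmath.exp(1j * math.pi / 4), 0.9 * cmath.exp(-1j * math.pi / 4)]
N_MAX = 400  # cepstrum truncation length; 0.9**400 is negligible


def freq_response(omega):
    """H(e^{jw}) = 1 / prod_k (1 - a_k e^{-jw})."""
    z_inv = cmath.exp(-1j * omega)
    h = 1.0 + 0.0j
    for a in POLES:
        h /= 1.0 - a * z_inv
    return h


def complex_cepstrum(n):
    """h_hat[n] = sum_k a_k^n / n for n > 0, and 0 for n <= 0 (unit gain)."""
    if n <= 0:
        return 0.0
    return sum(a ** n for a in POLES).real / n


def even_part(n):
    """c[n] = Ev{h_hat[n]}."""
    return 0.5 * (complex_cepstrum(n) + complex_cepstrum(-n))


def odd_part(n):
    """d[n] = Odd{h_hat[n]}."""
    return 0.5 * (complex_cepstrum(n) - complex_cepstrum(-n))


def dtft(seq, omega):
    """Truncated DTFT of seq(n) over |n| <= N_MAX."""
    return sum(seq(n) * cmath.exp(-1j * omega * n) for n in range(-N_MAX, N_MAX + 1))


def log_magnitude_slope(omega, delta=1e-5):
    """Central-difference estimate of d log|H| / d omega."""
    upper = math.log(abs(freq_response(omega + delta)))
    lower = math.log(abs(freq_response(omega - delta)))
    return (upper - lower) / (2.0 * delta)


def group_delay(omega, delta=1e-5):
    """Central-difference estimate of -d arg{H} / d omega.

    The phase of the ratio H(w + delta) / H(w - delta) is small,
    so no phase unwrapping is needed.
    """
    ratio = freq_response(omega + delta) / freq_response(omega - delta)
    return -cmath.phase(ratio) / (2.0 * delta)


for omega in (0.3, 0.8, 2.0):
    lhs_a = dtft(lambda n: n * even_part(n), omega)  # DTFT of n c[n], (8.113a)
    lhs_b = dtft(lambda n: n * odd_part(n), omega)   # DTFT of n d[n], (8.113b)
    err_a = abs(lhs_a - 1j * log_magnitude_slope(omega))
    err_b = abs(lhs_b - group_delay(omega))
    print(f"w={omega:.1f}  (8.113a) error={err_a:.2e}  (8.113b) error={err_b:.2e}")
```

Both errors should be limited by the finite-difference step rather than by the cepstrum truncation, since the cepstrum of this model decays geometrically.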

Now if $h[n]$ is assumed to be obtained by all-pole modeling as discussed in Section 8.6, the complex cepstrum satisfies $\hat{h}[n] = 0$ for $n < 0$. Since $c[n] = (\hat{h}[n] + \hat{h}[-n])/2$ and $d[n] = (\hat{h}[n] - \hat{h}[-n])/2$, this means that $\hat{h}[n] = 2c[n] = 2d[n]$ for $n > 0$. If we define $l[n] = n$, then the liftered cepstrum distance

$$D = \sum_{m=-\infty}^{\infty} \bigl| l[m]c[m] - l[m]\bar{c}[m] \bigr|^2 = \sum_{m=-\infty}^{\infty} \bigl| l[m]d[m] - l[m]\bar{d}[m] \bigr|^2 \tag{8.115a}$$

is equivalent, by Parseval's theorem, to either

$$D = \frac{1}{2\pi} \int_{-\pi}^{\pi} \left| \frac{d\log|H(e^{j\omega})|}{d\omega} - \frac{d\log|\bar{H}(e^{j\omega})|}{d\omega} \right|^2 d\omega \tag{8.115b}$$

or

$$D = \frac{1}{2\pi} \int_{-\pi}^{\pi} \bigl| \mathrm{grd}\{H(e^{j\omega})\} - \mathrm{grd}\{\bar{H}(e^{j\omega})\} \bigr|^2 d\omega. \tag{8.115c}$$
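As a numerical sanity check on (8.115a) through (8.115c), the sketch below compares the three forms of $D$ for two hypothetical all-pole models specified by their poles. It is not from the text. It assumes unit gain and poles inside the unit circle, and it takes squared differences inside the sum and the integrals, as Parseval's theorem requires. It also relies on two closed forms obtained by differentiating $\log(1 - a_k e^{-j\omega})$: $n\hat{h}[n] = \sum_k a_k^n$ for $n > 0$, and $\mathrm{grd}\{H\} + j\,d\log|H|/d\omega = \sum_k a_k e^{-j\omega}/(1 - a_k e^{-j\omega})$. The function names, pole values, and grid sizes are illustrative.

```python
import cmath
import math


def conjugate_pair(radius, angle):
    """Poles of a real second-order resonator (illustrative helper)."""
    return [radius * cmath.exp(1j * angle), radius * cmath.exp(-1j * angle)]


def n_times_cepstrum(poles, n):
    """n * h_hat[n] = sum_k a_k^n for n > 0 (minimum-phase all-pole model)."""
    return sum(a ** n for a in poles).real


def cepstral_distance(poles_1, poles_2, n_max=400):
    """(8.115a) with l[m] = m, truncated to 0 < |m| <= n_max.

    Because h_hat[n] = 0 for n < 0, c[m] = h_hat[|m|] / 2, so
    m * c[m] = sign(m) * (|m| * h_hat[|m|]) / 2; the m = 0 term vanishes.
    """
    total = 0.0
    for m in range(-n_max, n_max + 1):
        if m == 0:
            continue
        n = abs(m)
        diff = 0.5 * (n_times_cepstrum(poles_1, n) - n_times_cepstrum(poles_2, n))
        total += diff * diff  # the sign of m drops out after squaring
    return total


def slope_and_group_delay(poles, omega):
    """Return (d log|H| / d omega, grd{H}) from the per-pole closed form."""
    z_inv = cmath.exp(-1j * omega)
    q = sum(a * z_inv / (1.0 - a * z_inv) for a in poles)
    return q.imag, q.real


def spectral_distances(poles_1, poles_2, n_grid=4096):
    """(8.115b) and (8.115c) as mean squared differences on a uniform grid.

    For a periodic integrand, (1 / 2pi) * integral over [-pi, pi) is
    approximated by the average over equally spaced frequencies.
    """
    slope_sum = 0.0
    grd_sum = 0.0
    for i in range(n_grid):
        omega = -math.pi + 2.0 * math.pi * i / n_grid
        slope_1, grd_1 = slope_and_group_delay(poles_1, omega)
        slope_2, grd_2 = slope_and_group_delay(poles_2, omega)
        slope_sum += (slope_1 - slope_2) ** 2
        grd_sum += (grd_1 - grd_2) ** 2
    return slope_sum / n_grid, grd_sum / n_grid


model = conjugate_pair(0.90, math.pi / 4)
reference = conjugate_pair(0.85, math.pi / 4 + 0.1)
d_cep = cepstral_distance(model, reference)
d_slope, d_grd = spectral_distances(model, reference)
print(d_cep, d_slope, d_grd)  # the three values should agree closely
```

The quefrency-domain sum needs only a few hundred terms here, whereas the frequency-domain forms need a dense grid near sharp resonances, which is one practical reason to prefer the cepstral form.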

The result of (8.115b) was also given by Tohkura [26].

Instead of $l[n] = n$ for all $n$, or the lifter of (8.110), Itakura proposed the lifter

$$l[n] = n^{s} e^{-n^2/(2\tau^2)}. \tag{8.116}$$

This lifter has great flexibility. For example, if $s = 0$ we have simply low-quefrency liftering of the cepstrum. If $s = 1$ and $\tau$ is large, we have essentially $l[n] = n$ for small $n$ with high-quefrency tapering. The effect of liftering with Eq. (8.116) is illustrated in Figure 8.40, which shows in (a) the short-time Fourier transform of a segment of voiced speech along with a linear predictive analysis spectrum with $p = 12$. In (b) is shown the liftered group delay spectrum for $s = 1$ and $\tau$ ranging from 5 to 30. Observe that as $\tau$ increases, the formant frequencies are increasingly emphasized. If larger values of $s$ are used, even greater enhancement of the resonance structure is observed.
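The behavior of the lifter in (8.116) can be explored with a small sketch. It is an illustration under stated assumptions, not the procedure used to produce Figure 8.40. A hypothetical single-resonance all-pole model (pole radius 0.9, angle $\pi/4$) stands in for the linear predictive spectrum, and the liftered group delay spectrum is taken to be the cosine series $\sum_{n \ge 1} l[n]\,\hat{h}[n] \cos(\omega n)$, which for $s = 1$ and $\tau \to \infty$ reduces to the group delay of (8.114). Function names and parameter values are illustrative.

```python
import math


def itakura_lifter(n, s, tau):
    """l[n] = n**s * exp(-n**2 / (2 * tau**2)) from (8.116), for n >= 0."""
    return (n ** s) * math.exp(-(n * n) / (2.0 * tau * tau))


def cepstrum_of_resonance(n, radius=0.9, angle=math.pi / 4):
    """h_hat[n], n > 0, of an assumed pole pair radius * exp(+-j * angle).

    Uses the closed form 2 * radius**n * cos(n * angle) / n.
    """
    return 2.0 * (radius ** n) * math.cos(n * angle) / n


def liftered_group_delay(omega, s, tau, n_max=400):
    """Assumed definition: sum over n >= 1 of l[n] * h_hat[n] * cos(omega * n)."""
    return sum(
        itakura_lifter(n, s, tau) * cepstrum_of_resonance(n) * math.cos(omega * n)
        for n in range(1, n_max + 1)
    )


# s = 0: a pure Gaussian (low-quefrency) weighting that decays with n.
print([round(itakura_lifter(n, 0, 10.0), 3) for n in (0, 5, 10, 20)])

# s = 1 with large tau: close to l[n] = n for small n.
print([round(itakura_lifter(n, 1, 1000.0), 3) for n in (1, 2, 3)])

# Height of the liftered spectrum at the resonance frequency for s = 1.
peaks = {tau: liftered_group_delay(math.pi / 4, 1, tau) for tau in (5, 10, 20, 30)}
print(peaks)
```

For this model the value at the resonance frequency grows with $\tau$, because a wider Gaussian retains more of the high-quefrency terms that sharpen the peak.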
