LR Rabiner and RW Schafer, June 3

DRAFT: L. R. Rabiner and R. W. Schafer, June 3, 2009

8.3. HOMOMORPHIC ANALYSIS OF THE SPEECH MODEL

[Figure 8.17, four panels, each plotted against quefrency nT in ms: (a) Glottal Pulse Complex Cepstrum, (b) Vocal Tract Complex Cepstrum, (c) Radiation Load Complex Cepstrum, (d) Voiced Excitation Complex Cepstrum.]

Figure 8.17: Complex cepstra of the speech model: (a) glottal pulse $\hat{g}[n]$, (b) vocal tract impulse response $\hat{v}[n]$, (c) radiation load impulse response $\hat{r}[n]$, and (d) periodic excitation $\hat{p}[n]$.

from which it follows that

$$\hat{p}[n] = \sum_{k=1}^{\infty} \frac{\beta^{k}}{k}\,\delta[n - kN_p]. \tag{8.46}$$

As seen in Figure 8.17(d), the spacing between impulses in the complex cepstrum due to the input $p[n]$ is $N_p = 80$ samples, corresponding to a pitch period of $1/F_0 = 80/10000$ s $= 8$ ms. Note that in Figure 8.17 the discrete quefrency index is shown in ms; i.e., the horizontal axis shows $nT$.
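The structure of Eq. (8.46) can be checked numerically. The following is a minimal sketch, not the authors' code: it places $\beta^k/k$ at $n = kN_p$ and compares the resulting z-transform on the unit circle with $-\log(1 - \beta e^{-j\omega N_p})$, the closed form that the series in Eq. (8.46) sums to. The values of `BETA` and `omega` and the function name `excitation_cepstrum` are assumptions made for illustration; only $N_p = 80$ and the 10 kHz sampling rate come from the text.

```python
import cmath

def excitation_cepstrum(beta, n_p, length):
    """Return p_hat[n] per Eq. (8.46): beta**k / k at n = k*n_p, zero elsewhere."""
    p_hat = [0.0] * length
    k = 1
    while k * n_p < length:
        p_hat[k * n_p] = beta ** k / k
        k += 1
    return p_hat

FS = 10000    # sampling rate in Hz (from 80/10000 in the text)
N_P = 80      # pitch period in samples (from the text)
BETA = 0.9    # assumed value, chosen so the truncated series converges quickly

p_hat = excitation_cepstrum(BETA, N_P, length=20000)

# Evaluate the z-transform of p_hat at z = exp(j*omega) and compare it with
# the closed form -log(1 - beta * z**(-N_P)) that the power series sums to.
omega = 0.3   # arbitrary test frequency in radians/sample (assumption)
series = sum(v * cmath.exp(-1j * omega * n) for n, v in enumerate(p_hat) if v)
closed_form = -cmath.log(1 - BETA * cmath.exp(-1j * omega * N_P))
error = abs(series - closed_form)

# The first nonzero cepstral sample sits at the pitch period.
first_peak = next(n for n, v in enumerate(p_hat) if v > 0)
pitch_period_ms = 1000.0 * first_peak / FS
```

With these settings `first_peak` is 80 samples, which converts to the 8 ms pitch period quoted above, and `error` should be negligible because the neglected tail of the series is tiny for `BETA = 0.9`.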

According to Eq. (8.44), the complex cepstrum of the synthetic speech output is the sum of all of the complex cepstra in Figure 8.17. Thus, $\hat{s}[n] = \hat{h}_V[n] + \hat{p}[n]$ is depicted in Figure 8.18(a). The cepstrum, being the even part of $\hat{s}[n]$, is depicted in Figure 8.18(b). Note that in both cases, the impulses due to the periodic excitation tend to stand out from the contributions due to the system impulse response. The first impulse peak is located at quefrency $N_p$, which is the period of the excitation. This is the basis for the use of the cepstrum or complex cepstrum for pitch detection; i.e., the presence of a strong peak signals voiced speech, and its quefrency is an estimate of the pitch period.
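The pitch-detection idea can be illustrated with a small sketch. It is an assumption-laden toy, not the procedure used to produce Figure 8.18: the synthetic signal is a decaying impulse train with $N_p = 80$ at a 10 kHz sampling rate, filtered by a single damped 500 Hz resonance that stands in for $h_V[n]$. The resonance, the value of `beta`, the FFT size, the 2 to 15 ms search range, and all function names are hypothetical choices. The cepstrum is computed as the inverse DFT of the log magnitude spectrum, and the largest value in the search range is taken as the pitch peak.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def real_cepstrum(x, nfft):
    """c[n] = IDFT{ log|DFT{x}| }, with x zero-padded to nfft samples."""
    padded = list(x) + [0.0] * (nfft - len(x))
    log_mag = [math.log(abs(X) + 1e-12) for X in fft(padded)]
    # log_mag is real and even, so its inverse DFT equals its forward DFT / nfft.
    return [c.real / nfft for c in fft(log_mag)]

def synth_voiced(fs=10000, n_p=80, periods=12, beta=0.999):
    """Toy voiced signal: impulse train beta**k at n = k*n_p, filtered by h."""
    # Hypothetical stand-in for h_V[n]: one damped resonance at 500 Hz.
    h = [0.95 ** n * math.cos(2 * math.pi * 500 * n / fs) for n in range(200)]
    s = [0.0] * ((periods - 1) * n_p + len(h))
    for k in range(periods):
        for n, hn in enumerate(h):
            s[k * n_p + n] += beta ** k * hn
    return s

def cepstral_pitch(x, fs, nfft=2048, min_ms=2.0, max_ms=15.0):
    """Return (peak quefrency in samples, pitch period in ms, peak height)."""
    c = real_cepstrum(x, nfft)
    lo = int(min_ms * fs / 1000)
    hi = int(max_ms * fs / 1000)
    n_peak = max(range(lo, hi + 1), key=lambda n: c[n])
    return n_peak, 1000.0 * n_peak / fs, c[n_peak]

period_samples, period_ms, peak = cepstral_pitch(synth_voiced(), fs=10000)
```

For this synthetic input the peak falls at quefrency 80 samples, i.e. 8 ms. A practical detector would also compare `peak` against a threshold to decide whether the frame is voiced; the threshold value is left open here because the text does not specify one.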

Finally, it is worthwhile to connect the z-transform analysis employed in this example to the discrete-time Fourier transform representation of the complex cepstrum. This is depicted in Figures 8.19(a) and 8.19(b) which show
