LR Rabiner and RW Schafer, June 3
DRAFT: L. R. Rabiner and R. W. Schafer, June 3, 2009<br />
[Block diagrams. (a) x[n] → Linear Predictive Analysis → H(z) = G/A(z) → Factor A(z) to obtain z_k → Employ Eq. (8.102) → ĥ[n]. (b) x[n] → Linear Predictive Analysis → H(z) = G/A(z) → Compute h[n], Eq. (8.101) → Employ recursion, Eq. (8.103) → ĥ[n].]<br />
Figure 8.39: Computation of the complex cepstrum of the impulse response of an<br />
all-pole minimum-phase model of the vocal tract system: (a) polynomial rooting<br />
of the denominator of the all-pole system function; (b) recursive computation<br />
using Eq. (8.103). (Numbers in parentheses refer to text equations.)<br />
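The two paths in the figure can be compared numerically. The following is a minimal sketch, not the authors' code: the function names are hypothetical, the predictor sign convention A(z) = 1 − Σ α_k z^{-k} is an assumption, path (a) uses the standard closed form ĥ[n] = (1/n) Σ_k z_k^n for n > 0, and path (b) uses the standard minimum-phase recursion, which is assumed to correspond to the recursion cited in the caption.

```python
import numpy as np

def allpole_impulse_response(G, alpha, n_samples):
    """h[n], n = 0..n_samples-1, of H(z) = G / (1 - sum_k alpha[k-1] z^-k)."""
    p = len(alpha)
    h = np.zeros(n_samples)
    h[0] = G
    for n in range(1, n_samples):
        # For n > 0 the input G*delta[n] is zero, so only past outputs contribute.
        for k in range(1, min(n, p) + 1):
            h[n] += alpha[k - 1] * h[n - k]
    return h

def cepstrum_from_roots(G, alpha, n_samples):
    """Path (a): factor A(z), then sum powers of the poles z_k."""
    # The poles are the roots of z^p A(z) = z^p - alpha_1 z^(p-1) - ... - alpha_p.
    poles = np.roots(np.concatenate(([1.0], -np.asarray(alpha, dtype=float))))
    h_hat = np.zeros(n_samples)
    h_hat[0] = np.log(G)  # assumes G > 0
    for n in range(1, n_samples):
        h_hat[n] = np.real(np.sum(poles ** n)) / n
    return h_hat

def cepstrum_from_impulse_response(h):
    """Path (b): recursion from h[n] to h_hat[n] for a minimum-phase h[n]."""
    N = len(h)
    h_hat = np.zeros(N)
    h_hat[0] = np.log(h[0])  # assumes h[0] > 0
    for n in range(1, N):
        acc = sum(k * h_hat[k] * h[n - k] for k in range(1, n))
        h_hat[n] = (h[n] - acc / n) / h[0]
    return h_hat

# Illustrative second-order model with poles at 0.6 +/- 0.6j (inside the unit circle).
G, alpha = 2.0, [1.2, -0.72]
h = allpole_impulse_response(G, alpha, 20)
cep_a = cepstrum_from_roots(G, alpha, 20)
cep_b = cepstrum_from_impulse_response(h)
```

For a stable (minimum-phase) model, `cep_a` and `cep_b` should agree to within rounding error; path (b) avoids polynomial rooting at the cost of first computing h[n].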
obtain<br />
\[
h[n] =
\begin{cases}
0, & n < 0 \\
G, & n = 0 \\
G\,\hat{h}[n] + \displaystyle\sum_{k=0}^{n-1} \frac{k}{n}\,\hat{h}[k]\,h[n-k], & n > 0.
\end{cases}
\tag{8.104}
\]<br />
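As a rough illustration, the recursion of Eq. (8.104) can be coded directly. This is a sketch under stated assumptions: the function name is hypothetical, G is passed in explicitly rather than recovered from ĥ[0], and the check uses a one-pole system H(z) = G/(1 − a z^{-1}), for which ĥ[n] = a^n/n and h[n] = G a^n for n > 0.

```python
import numpy as np

def impulse_response_from_cepstrum(h_hat, G):
    """Recover h[n] from the complex cepstrum h_hat[n] via the Eq. (8.104) recursion."""
    N = len(h_hat)
    h = np.zeros(N)
    h[0] = G
    for n in range(1, N):
        # The k = 0 term of the sum is zero because of the factor k/n.
        acc = sum(k * h_hat[k] * h[n - k] for k in range(1, n))
        h[n] = G * h_hat[n] + acc / n
    return h

# One-pole check (illustrative values): a = 0.5, G = 3.
a, G, N = 0.5, 3.0, 10
n = np.arange(1, N)
h_hat = np.concatenate(([np.log(G)], a ** n / n))
h = impulse_response_from_cepstrum(h_hat, G)
expected = G * a ** np.arange(N)
```

Each h[n] depends only on earlier values h[0], ..., h[n−1], so the sequence is built up sample by sample at a cost that grows quadratically with the number of samples.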
This method of computation of the complex cepstrum of the all-pole vocal<br />
tract model relies on the linear predictive analysis to remove the effects of<br />
the excitation. By restricting p to be much less than the pitch period Np,<br />
linear predictive modeling accomplishes what the lowpass lifter accomplishes in<br />
homomorphic filtering.<br />
8.7 Cepstrum Distance Measures<br />
Perhaps the most pervasive application of the cepstrum in speech processing is<br />
its use in pattern recognition problems such as vector quantization (VQ) and<br />
automatic speech recognition (ASR). In such applications, a speech signal is<br />
represented on a frame-by-frame basis by a sequence of short-time cepstrums.<br />
In later discussions in this section, it will be useful to use somewhat more<br />
complicated notation. Specifically, we denote the cepstrum of the mth frame of<br />
a signal $x_m[n]$ as $c_m^{(x)}[n]$, where n denotes the quefrency index of the cepstrum.<br />
In cases where it is not necessary to distinguish between signals or frames, these<br />
additional designations will be omitted as we have done up to this point in this<br />
chapter.<br />
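To make the frame-by-frame notation concrete, the sketch below stores the cepstrum of frame m at quefrency n in `C[m, n]`. It is an illustration only: the function name, the Hamming window, the frame and hop lengths, and the choice of the FFT-based real cepstrum (inverse DFT of the log magnitude spectrum) are all assumptions, and any of the cepstrum-like representations could be substituted.

```python
import numpy as np

def short_time_cepstra(x, frame_len, hop, n_ceps, eps=1e-12):
    """Return C with C[m, n] = real cepstrum of frame m at quefrency index n."""
    x = np.asarray(x, dtype=float)
    window = np.hamming(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop  # requires len(x) >= frame_len
    C = np.zeros((n_frames, n_ceps))
    for m in range(n_frames):
        frame = x[m * hop : m * hop + frame_len] * window
        # eps guards against log(0) at spectral nulls.
        log_mag = np.log(np.abs(np.fft.fft(frame)) + eps)
        C[m] = np.real(np.fft.ifft(log_mag))[:n_ceps]
    return C

# Synthetic stand-in for a speech signal (random noise, fixed seed).
rng = np.random.default_rng(0)
x = rng.standard_normal(1000)
C = short_time_cepstra(x, frame_len=256, hop=128, n_ceps=13)
C_scaled = short_time_cepstra(3.0 * x, frame_len=256, hop=128, n_ceps=13)
```

Scaling the signal by a constant gain changes only the n = 0 column (here by log 3), which is why the zeroth coefficient is the natural place to apply gain normalization.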
Cepstrum-like representations can be obtained in many ways as we have<br />
seen. No matter how it is computed, we can assume that the cepstrum vector<br />
corresponds to a gain-normalized (c[0] = 0) minimum-phase vocal tract impulse