02.10.2019 Views

UploadFile_6417

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

Linear Predictive Coding (LPC) of Speech 625<br />

Then<br />

N−1<br />

∑<br />

G 2<br />

n=0<br />

x 2 (n) =<br />

N−1<br />

∑<br />

n=0<br />

e 2 (n) (12.36)<br />

If the input excitation is normalized to unit energy by design, then<br />

G 2 =<br />

N−1<br />

∑<br />

n=0<br />

e 2 (n) =r ss (0) +<br />

p∑<br />

a p (k) r ss (k) (12.37)<br />

Thus G 2 is set equal to the residual energy resulting from the least-squares<br />

optimization.<br />

Once the LPC coefficients are computed, we can determine whether<br />

the input speech frame is voiced, and if so, what the pitch is. This is<br />

accomplished by computing the sequence<br />

where r a (k) isdefined as<br />

r e (n) =<br />

r a (k) =<br />

k=1<br />

p∑<br />

r a (k) r ss (n − k) (12.38)<br />

k=1<br />

p∑<br />

a p (i) a p (i + k) (12.39)<br />

i=1<br />

which is the autocorrelation sequence of the prediction coefficients.<br />

The pitch is detected by finding the peak of the normalized sequence<br />

r e (n)/r e (0) in the time interval that corresponds to 3 to 15 ms in the<br />

20-ms sampling frame. If the value of this peak is at least 0.25, the frame<br />

of speech is considered voiced with a pitch period equal to the value of<br />

n = N p , where r e (N p ) /r e (0) is a maximum. If the peak value is less than<br />

0.25, the frame of speech is considered unvoiced and the pitch is zero.<br />

The values of the LPC coefficients, the pitch period, and the type of<br />

excitation are transmitted to the receiver, where the decoder synthesizes<br />

the speech signal by passing the proper excitation through the all-pole<br />

filter model of the vocal tract. Typically, the pitch period requires 6 bits,<br />

and the gain parameter may be represented by 5 bits after its dynamic<br />

range is compressed logarithmically. If the prediction coefficients were to<br />

be coded, they would require between 8 to 10 bits per coefficient for accurate<br />

representation. The reason for such high accuracy is that relatively<br />

small changes in the prediction coefficients result in a large change in the<br />

pole positions of the filter model. The accuracy requirements are lessened<br />

by transmitting the reflection coefficients {K i }, which have a smaller dynamic<br />

range—that is, |K i | < 1. These are adequately represented by 6<br />

bits per coefficient. Thus for a 10th-order predictor the total number of<br />

Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).<br />

Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!