05.06.2013 Views

A Study of the ITU-T G.729 Speech Coding Algorithm ...

A Study of the ITU-T G.729 Speech Coding Algorithm ...

A Study of the ITU-T G.729 Speech Coding Algorithm ...

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

Open<br />

MASTER THESIS<br />

Datum - Date Rev Dokumentnr - Document no.<br />

04-09-28 PA1<br />

CELP-coding, <strong>the</strong> long-term predictor is also called an adaptive codebook. The adaptivecodebook<br />

entries are segments <strong>of</strong> <strong>the</strong> recently syn<strong>the</strong>sized signal [15]. The short-term<br />

predictor accounts for <strong>the</strong> spectral envelope, i.e. <strong>the</strong> formants, and is <strong>the</strong> same type <strong>of</strong><br />

production filter used in vocoders. Figure 11 depicts how speech is syn<strong>the</strong>sized in a CELPencoder<br />

[27]. Thus, <strong>the</strong> fixed innovation comes from <strong>the</strong> fixed codebook which is <strong>the</strong>n<br />

filtered through <strong>the</strong> long-term production filter followed by <strong>the</strong> short-term production filter.<br />

Ano<strong>the</strong>r way to see this is that <strong>the</strong> excitation vectors from <strong>the</strong> adaptive and fixed codebook<br />

are added to construct <strong>the</strong> excitation signal which is <strong>the</strong>n filtered through <strong>the</strong> short-term<br />

production filter.<br />

Figure 11: Long-term prediction / Adaptive codebook<br />

By subtracting <strong>the</strong> long-term and short-term predictions <strong>of</strong> <strong>the</strong> speech signal, <strong>the</strong> redundancy<br />

<strong>of</strong> <strong>the</strong> signal is removed. The remaining signal is called <strong>the</strong> innovation or excitation<br />

signal. A set <strong>of</strong> signals which approximates <strong>the</strong> possible excitation sequences forms <strong>the</strong><br />

codebook. Thus, a CELP- encoder algorithm consist <strong>of</strong> three stages:<br />

1. Finding <strong>the</strong> best long-term and short-term linear predictors.<br />

2. Filtering <strong>the</strong> predictor filters with all possible excitations and finding <strong>the</strong> sequence in<br />

<strong>the</strong> codebook that minimizes <strong>the</strong> perceptually-weighted mean squared error. This<br />

is an analysis-by-syn<strong>the</strong>sis procedure. The perceptual weighting takes into account<br />

how humans perceive sound or speech and is absolutely essential to <strong>the</strong> success <strong>of</strong><br />

CELP-coding.<br />

3. Send information to <strong>the</strong> decoder about <strong>the</strong> long- and short-term predictor coefficients<br />

and <strong>the</strong> binary codebook word corresponding to <strong>the</strong> sequence chosen from <strong>the</strong> codebook.<br />

The decoder <strong>the</strong>n looks up <strong>the</strong> excitation sequence from <strong>the</strong> codebook with <strong>the</strong> received<br />

codebook word and drives <strong>the</strong> excitation sequence through <strong>the</strong> long- and short-time production<br />

filters [13]. The CELP-coding principle is depicted in Figure 12 [8].<br />

CELP Complexity<br />

A CELP-coding algorithm is complex and <strong>the</strong>refore computationally demanding. Assuming<br />

a CELP-coder that uses a 10-bit word as an index to <strong>the</strong> codebook, would give a codebook<br />

consisting <strong>of</strong> 2 10 = 1024 excitation sequences. Applying each <strong>of</strong> <strong>the</strong>se 1024 excitation<br />

28 (78)

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!