A Study of the ITU-T G.729 Speech Coding Algorithm ...
A Study of the ITU-T G.729 Speech Coding Algorithm ...
A Study of the ITU-T G.729 Speech Coding Algorithm ...
Create successful ePaper yourself
Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.
Open<br />
MASTER THESIS<br />
Datum - Date Rev Dokumentnr - Document no.<br />
04-09-28 PA1<br />
Then, for each subframe analysis-by-syn<strong>the</strong>sis is performed to retrieve <strong>the</strong> CELPparameters<br />
for <strong>the</strong> best syn<strong>the</strong>sized signal.<br />
In Figure 13, <strong>the</strong> excitation signal consists <strong>of</strong> contributions from both, <strong>the</strong> adaptive and<br />
fixed codebook. The adaptive codebook <strong>the</strong>n simulates <strong>the</strong> pitch, and <strong>the</strong>refore voiced<br />
sounds. The fixed codebook simulates unvoiced sounds. In reality, <strong>the</strong> adaptive codebook<br />
is an adaptive filter through which <strong>the</strong> fixed excitation signal is filtered. The following two<br />
sections deal with <strong>the</strong> fixed and adaptive codebook in detail.<br />
3.3.1 Fixed Codebook<br />
The fixed codebook has an algebraic structure and is called interleaved single-pulse permutation<br />
(ISPP) design [13]. Each 40-sample fixed codebook excitation vector has only<br />
four pulses, denoted i0, i1, i2 and i3, with, as previously mentioned, two possible amplitudes:<br />
+1 and -1. The fixed codebook is displayed in Table 4. Each excitation codevector<br />
is a sum <strong>of</strong> four pulses, given by<br />
c(n) = s0δ(n − m0) + s1δ(n − m1) + s2δ(n − m2) + s3δ(n − m3) n = 0, ..., 39 (15)<br />
where δ(n) is <strong>the</strong> unit impulse at time instant n.<br />
Table 4: <strong>G.729</strong> fixed codebook<br />
Pulse Sign Positions<br />
i0<br />
i1<br />
i2<br />
i3<br />
s0 : +<br />
− 1 m0:0, 5, 10, 15, 20, 25, 30, 36<br />
s1 : +<br />
− 1 m1:1, 6, 11, 16, 21, 26, 31, 36<br />
s2 : +<br />
− 1 m2:2, 7, 12, 17, 22, 27, 32, 37<br />
s3 : +<br />
− 1 m3:3, 8, 13, 18, 23, 28, 33, 38<br />
4, 9, 14, 19, 24, 29, 34, 39<br />
The fixed codebook is searched in four nested loops, one for every pulse. However, to<br />
reduce complexity, <strong>the</strong> fourth loop is not entirely searched since a full search in <strong>the</strong> fourth<br />
loop would produce slightly higher speech quality, but also require high computational effort.<br />
Since <strong>the</strong> four excitation pulses are unable to have <strong>the</strong> same position, <strong>the</strong> search is<br />
named conjugate-structured.<br />
As can be seen in Table 4, <strong>the</strong> index to <strong>the</strong> fixed codebook consists <strong>of</strong> two parts: sign<br />
and position. Each pulse requires one bit per sign, thus, a total <strong>of</strong> 4 bits is needed to<br />
encode <strong>the</strong> signs. For <strong>the</strong> first three pulses, i0, i1, i2, 3 bits are needed to encode <strong>the</strong> pulse<br />
positions. For <strong>the</strong> fourth pulse, i3, 4 bits are required. This yields a total <strong>of</strong> 13 bits to index<br />
<strong>the</strong> pulse positions. Thus, altoge<strong>the</strong>r 4 + 13 = 17 bits are needed to transmit <strong>the</strong> fixed<br />
codebook index (without gain), which in fact is done once every subframe.<br />
3.3.2 Adaptive Codebook<br />
The adaptive codebook, which in fact is an adaptive pitch filter is given by<br />
P (z) =<br />
1<br />
1 − βz −T<br />
(16)<br />
32 (78)