Speech Coding


Class Presentation of Custom DSP Implementation Course on:

Speech Coding

ECE Department - University of Tehran

Presented by: Neda Kazemian Amiri

May 2005

This is a class presentation. All data are copyrighted by their respective authors, as listed in the references, and have been used here for educational purposes only.


Outline

• Introduction
• Pulse Code Modulation (PCM)
• Algorithm Objectives and Requirements
• Vector Quantization vs. Scalar Quantization
• Linear Predictive Coding (LPC) and Speech Synthesis
• Speech Coding Schemes
• Measurement of Speech Quality
• ITU Encoding Standards


Introduction

The Spectrum of Human <strong>Speech</strong> [5]


Pulse Code Modulation (PCM)

The sampling and coding process and the resultant signal [5]


Algorithm Objectives and Requirements

• Quality and capacity
• Coding delay
• Robustness
• Complexity and cost
• Voiceband data handling


Scalar Quantization

Uniform Quantization and Non-Uniform Quantization [1]
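The difference between the two quantizer types can be sketched in a few lines. This is only an illustration under stated assumptions: the slide does not say which non-uniform characteristic it depicts, so continuous μ-law companding (with μ = 255) is used here as one common choice, and all function names are hypothetical.

```python
import math

def uniform_quantize(x, bits, x_max=1.0):
    """Mid-rise uniform quantizer: equal-width cells over [-x_max, x_max]."""
    levels = 2 ** bits
    step = 2.0 * x_max / levels
    # Clamp the cell index so out-of-range inputs map to the outermost cells.
    index = min(levels - 1, max(0, math.floor((x + x_max) / step)))
    return -x_max + (index + 0.5) * step

def mu_law_compress(x, mu=255.0):
    """Compress the amplitude of x in [-1, 1] (assumed mu-law curve)."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def mu_law_expand(y, mu=255.0):
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)

def mu_law_quantize(x, bits, mu=255.0):
    """Non-uniform quantizer: compress, quantize uniformly, then expand."""
    return mu_law_expand(uniform_quantize(mu_law_compress(x, mu), bits), mu)

# Compare the quantization error of both schemes on a low-level sample.
small = 0.01
print("uniform error:", abs(uniform_quantize(small, 8) - small))
print("mu-law error: ", abs(mu_law_quantize(small, 8) - small))
```

The non-uniform quantizer spends more of its levels on small amplitudes, so low-level samples are reproduced more accurately at the cost of coarser steps near full scale.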


Scalar Quantization

The technique of Predictive Quantization [5]
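As a rough illustration of predictive quantization, the sketch below quantizes the difference between each sample and a prediction formed from the previously reconstructed sample. The first-order predictor, its coefficient `a = 0.9`, the fixed step size, and the unbounded integer codes are all assumptions made for the example; the figure in [5] may use a different predictor.

```python
def dpcm_encode(samples, step, a=0.9):
    """Quantize the prediction error instead of the sample itself."""
    codes = []
    recon_prev = 0.0  # the encoder tracks what the decoder will reconstruct
    for s in samples:
        pred = a * recon_prev
        code = round((s - pred) / step)   # integer code for the residual
        codes.append(code)
        recon_prev = pred + code * step   # same update as the decoder
    return codes

def dpcm_decode(codes, step, a=0.9):
    """Rebuild the signal by adding each decoded residual to the prediction."""
    out = []
    recon_prev = 0.0
    for code in codes:
        recon_prev = a * recon_prev + code * step
        out.append(recon_prev)
    return out

samples = [0.0, 0.3, 0.5, 0.45, 0.2, -0.1]
codes = dpcm_encode(samples, step=0.05)
decoded = dpcm_decode(codes, step=0.05)
print(codes)
print(decoded)
```

Because the encoder predicts from reconstructed values rather than from the original samples, the quantization error does not accumulate: each decoded sample stays within half a step of the input.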


Vector Quantization

Partitioning of a 2-dimensional space into 18 cells

Full search Codebook [1]

[Figure: full-search vector quantizer. Labels: input X(n), vector buffer, vector x, vector matching, codebook Y, code vector y, index i]
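A full-search vector quantizer compares the input vector with every codebook entry and transmits the index of the closest one. The sketch below assumes squared Euclidean distance as the matching criterion and uses a tiny made-up codebook; the function names are hypothetical.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def vq_encode(x, codebook):
    """Full search: test every code vector and return the best index."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(x, codebook[i]))

def vq_decode(index, codebook):
    """The decoder holds the same codebook and simply looks up the index."""
    return codebook[index]

# Illustrative 2-dimensional codebook with four entries.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
i = vq_encode((0.9, 0.2), codebook)
print(i, vq_decode(i, codebook))
```

The search cost grows linearly with the codebook size, which is the motivation for the structured codebook types listed on the following slides.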


Vector Quantization

Codebook Types

• Full search codebooks
• Binary search codebooks
• Cascaded codebooks
• Split codebooks
• Gain shape codebooks
• Adaptive codebooks
• Random codebooks


Vector Quantization

Binary search codebooks

Binary Splitting into 8 cells [1]


Vector Quantization

Cascaded Codebooks

Cascaded Vector Quantizer [1]


Vector Quantization

Gain Shape Codebooks

Gain-Shape Vector Quantizer [1]
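A gain-shape quantizer codes a vector as a scalar gain times a unit-norm shape vector, each with its own codebook. The sketch below follows one common formulation (pick the shape with the largest inner product, then quantize that inner product as the gain); whether the quantizer in [1] is organised exactly this way is an assumption, and the codebooks are made up for illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gain_shape_encode(x, shape_codebook, gain_levels):
    """Return (shape index, gain index) for input vector x.

    Assumes every entry of shape_codebook has unit norm.
    """
    # Best shape: the unit-norm vector most correlated with x.
    shape_idx = max(range(len(shape_codebook)),
                    key=lambda i: dot(x, shape_codebook[i]))
    # For a unit-norm shape, the optimal gain is the inner product itself.
    gain = dot(x, shape_codebook[shape_idx])
    gain_idx = min(range(len(gain_levels)),
                   key=lambda k: abs(gain - gain_levels[k]))
    return shape_idx, gain_idx

def gain_shape_decode(shape_idx, gain_idx, shape_codebook, gain_levels):
    g = gain_levels[gain_idx]
    return tuple(g * c for c in shape_codebook[shape_idx])

r = 1.0 / math.sqrt(2.0)
shapes = [(1.0, 0.0), (0.0, 1.0), (r, r)]
gains = [0.5, 1.0, 2.0, 4.0]
idx = gain_shape_encode((2.1, 1.9), shapes, gains)
print(idx, gain_shape_decode(*idx, shapes, gains))
```

Splitting the search this way lets a small shape codebook serve signals of very different loudness.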


Voiced and Unvoiced Speech Waveforms [1]

Speech Signal for word “TO”


Block Diagram of a Simplified Source Filter Model of Speech Production [1]

$$H(z) = \frac{S(z)}{X(z)} = \frac{G}{A(z)}$$

$$A(z) = 1 - \sum_{j=1}^{p} a_j z^{-j}$$

$$s(n) = G\,x(n) + \sum_{j=1}^{p} a_j\, s(n-j)$$
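The source-filter model amounts to an all-pole recursion: each output sample s(n) is the scaled excitation G·x(n) plus a weighted sum of the previous p output samples with weights a_j. A minimal sketch of that recursion follows; the function name and the zero initial state are assumptions made for the example.

```python
def lpc_synthesize(excitation, a, gain=1.0):
    """All-pole synthesis: s(n) = gain * x(n) + sum_j a[j-1] * s(n - j).

    `a` holds the predictor coefficients a_1 .. a_p; samples before n = 0
    are taken to be zero.
    """
    p = len(a)
    s = []
    for n, x in enumerate(excitation):
        acc = gain * x
        for j in range(1, p + 1):
            if n - j >= 0:
                acc += a[j - 1] * s[n - j]
        s.append(acc)
    return s

# Impulse response of a first-order model with a_1 = 0.5 and G = 2.
print(lpc_synthesize([1.0, 0.0, 0.0, 0.0], a=[0.5], gain=2.0))
```

Feeding the filter a periodic pulse train gives a voiced-like output, while a noise excitation gives an unvoiced-like one.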


Linear Predictive Coding (LPC) and Speech Synthesis [5]


Speech Coding Schemes [1], [2]


Measurement of Speech Quality

• Signal to Noise Ratio (SNR)
• Mean Opinion Scores (MOS)

MOS is a subjective measure of the quality of speech coded at low bit rates.

1 = bad, 2 = poor, 3 = fair, 4 = good, 5 = excellent
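Unlike MOS, SNR is an objective measure that can be computed directly from the original and decoded waveforms. The sketch below uses the conventional definition (signal energy over error energy, in decibels); the function name is hypothetical and the sample values are made up.

```python
import math

def snr_db(original, decoded):
    """SNR in dB: 10 * log10(signal energy / coding-error energy)."""
    signal = sum(s * s for s in original)
    noise = sum((s - d) ** 2 for s, d in zip(original, decoded))
    if noise == 0:
        return math.inf  # perfect reconstruction
    return 10.0 * math.log10(signal / noise)

original = [1.0, -1.0, 1.0, -1.0]
decoded = [0.9, -0.9, 0.9, -0.9]
print(snr_db(original, decoded))
```

SNR rewards waveform matching, so it suits waveform coders better than coders that aim only to sound similar, which is one reason subjective scores such as MOS are used alongside it.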


Quality Comparison of Speech Coding Schemes [2]


Block Diagram of a Sub-band Coder [1]

A Frequency Domain Hybrid Coder


Code Excited Linear Prediction (CELP)

a) Coder and b) Decoder [3]


Adaptive Code Excited Linear Prediction (ACELP) [3]

Adaptive Excitation Codebook


CELP Encoder with an Adaptive Codebook


Generalized Block Diagram of AbS LPC Coder with Different Excitation Types [1]


ITU Encoding Standards [7]

G.711: The first standard introduced for speech compression. It uses PCM at a bit rate of 64 kbps and is used in the PSTN.

G.721: It uses ADPCM at a bit rate of 32 kbps. It is also used in the PSTN.

G.722: Like G.721, it is based on ADPCM. Its maximum bit rate is 64 kbps.

G.726: It uses ADPCM. Its bit rates are 16, 24, 32 and 40 kbps.

G.723.1: It is a hybrid coder. With the MP-MLQ algorithm its bit rate is 6.3 kbps; with the ACELP algorithm its bit rate is 5.3 kbps. It is used for videophones.

G.728: It is a hybrid coder using the LD-CELP algorithm. Its bit rate is 16 kbps. It uses 5-sample frames.

G.729: It is a hybrid coder using the CS-ACELP algorithm. Its bit rate is 8 kbps. It uses 10 ms frames.
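The bit rates and frame sizes quoted above can be related with simple arithmetic. The sketch below assumes the usual 8 kHz telephone-band sampling rate and 8 bits per PCM sample, neither of which is stated on the slide; the helper names are hypothetical.

```python
SAMPLE_RATE_HZ = 8000  # assumed narrowband telephony sampling rate

def pcm_bit_rate(bits_per_sample, sample_rate_hz=SAMPLE_RATE_HZ):
    """Bit rate of plain PCM in bits per second."""
    return bits_per_sample * sample_rate_hz

def bits_per_frame(bit_rate_bps, frame_samples, sample_rate_hz=SAMPLE_RATE_HZ):
    """Number of bits a coder may spend on one frame of speech."""
    return bit_rate_bps * frame_samples // sample_rate_hz

# G.711: 8-bit samples (assumed) at 8 kHz give the quoted 64 kbps.
print(pcm_bit_rate(8))

# G.729: a 10 ms frame is 80 samples at 8 kHz; 8 kbps leaves 80 bits per frame.
print(bits_per_frame(8000, 80))

# G.728: a 5-sample frame lasts 0.625 ms; 16 kbps leaves 10 bits per frame.
print(bits_per_frame(16000, 5))
```

The short G.728 frame is what keeps its coding delay low, while the longer G.729 frame gives the coder more samples to analyse at half the bit rate.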


ITU Encoding Standards [7]


Comparison of Some Standards in Terms of Quality, Bit Rate and Frame Size


Some Characteristics of Speech Compression Algorithms


Comparison of Some Standards in Terms of Quality and Bit Rate


Summary and Conclusion

• Some speech coding techniques have been studied and compared in terms of quality, bit rate, frame size, and complexity.
• Algorithms with low bit rate, low complexity and fair quality are preferred.
• For a given algorithm, quality increases as the bit rate increases.
• Algorithms designed for low bit rate communications are more complex and thus consume more power.
• Waveform coders such as G.711 and G.726 have low complexity and introduce low delay.
• Hybrid coders have high complexity. G.728 is the most complex algorithm; G.729 and G.723.1 have moderate complexity.


References

[1] Kondoz, A. M., “Digital Speech: Coding for Low Bit Rate Communications Systems”, John Wiley & Sons, 1995.

[2] Xydeas, C., “An Overview of Speech Coding Techniques”, Speech Coding: Techniques and Applications, IEE Colloquium, 14 Apr. 1992, pp. 111-125.

[3] Kipper, U., Reininger, H., and Wolf, D., “CELP Coding with Adaptive Excitation Codebooks”, IEEE, 1991.

[4] Schroeder, M. R., and Atal, B. S., “Code-Excited Linear Prediction (CELP): High-Quality Speech at Very Low Bit Rates”, IEEE, 1985.

[5] Owen, F. E., “PCM and Digital Transmission Systems”, McGraw-Hill Book Company, 1976.

[6] Proakis, J. G., and Salehi, M., “Contemporary Communication Systems Using MATLAB”, PWS Publishing Company, 1997.

[7] Brunner, S., and Ali, A. A., “Voice over IP, Understanding VoIP Networks”, Juniper Networks, Inc., 2004. www.juniper.net
