04.02.2014 Views

Lecture Series in Mobile Telecommunications and Networks (1583KB)

Questions & Answers

Michael Walker: Thank you, Peter, for that splendid talk. Peter has agreed to take questions, so we have the opportunity to tease out more information.

John Lowe (Institution of Mechanical Engineers): Thank you, Peter – it is a fascinating future that you have described. What worries me is the duplication of standards between the three territories of the Far East, Europe and North America. Could you fill us in on the prospect of some compatibility or singularity of direction?

Peter Vary: Yes, that is a very good question. We have many standards – perhaps too many – but this is also evolution. We are in the happy situation that, in the meantime, we have worldwide agreement on two sides. One is the ITU (International Telecommunication Union), which is more or less responsible for the fixed-line and voice-over-IP business. On the other side we have what is called the 3GPP (Third Generation Partnership Project). They talk to each other, and the ITU world is meanwhile taking over the standards we have for cellular radio. Things are getting better.

There will be many different standards, but I am pretty sure there will be a convergence. For reasons of compatibility, however, you will need to have the old codecs. I have a collection of some old phones – my first one is the Motorola shown on the slide, and it still works. If I switch it on, it says that it is no longer compatible with the system, but it works. We have to keep the compatibility but, on the other side, it is not too expensive because it is just software. Saying that it is ‘just software’ means that most of the signal processing is carried out on a programmable signal processor in the mobile phone. You have a lot of memory there, and selecting a different codec is just a matter of selecting the part of the memory that holds the program for codec A, B, C, D or E. It is not that difficult, but it is more difficult than in the old world, where you just had one or two different versions: A-law PCM and μ-law PCM. It is becoming more difficult but it is not too expensive.
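As a rough illustration of the point that choosing a codec amounts to choosing which stored routine to run, the sketch below keeps two codecs in a lookup table and selects one by name. It uses the textbook continuous μ-law and A-law companding curves (μ = 255, A = 87.6), not the segmented 8-bit encodings of a real telephony codec. The table `CODECS` and all function names are hypothetical and are not taken from any handset software.

```python
import math

MU = 255.0  # mu-law parameter (North American and Japanese networks)
A = 87.6    # A-law parameter (European networks)


def mu_law_compress(x: float) -> float:
    """Continuous mu-law curve for a sample x in [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mu_law_expand(y: float) -> float:
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def a_law_compress(x: float) -> float:
    """Continuous A-law curve: linear near zero, logarithmic above 1/A."""
    ax = abs(x)
    denom = 1.0 + math.log(A)
    if ax < 1.0 / A:
        y = A * ax / denom
    else:
        y = (1.0 + math.log(A * ax)) / denom
    return math.copysign(y, x)


def a_law_expand(y: float) -> float:
    """Inverse of a_law_compress."""
    ay = abs(y)
    denom = 1.0 + math.log(A)
    if ay < 1.0 / denom:
        x = ay * denom / A
    else:
        x = math.exp(ay * denom - 1.0) / A
    return math.copysign(x, y)


# Hypothetical codec table: selecting a codec is selecting an entry,
# much like jumping to a different program region in memory.
CODECS = {
    "mu-law": (mu_law_compress, mu_law_expand),
    "a-law": (a_law_compress, a_law_expand),
}


def round_trip(codec_name: str, sample: float) -> float:
    """Compress and expand one sample with the selected codec."""
    compress, expand = CODECS[codec_name]
    return expand(compress(sample))
```

Adding a further codec to this sketch would mean adding one more entry to the table, which mirrors the remark that supporting several codecs costs memory rather than new hardware.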

Professor Ralph Benjamin (Visiting Professor, Bristol University): Right at the beginning, you mentioned the importance of understanding the physiological constraints on speech generation as a guide to the optimum use of a limited number of bits in the information, which will probably be even more important in speech extension, bandwidth expansion, or in turbo decoding. However, this is only one of probably three separate elements because you also have the constraints of language and dialect. In American southern states, people talk with a long drawl; other languages are totally different from English and, in one extreme, some African languages consist of a lot of staccato clicks. The features you have to encode depend rather critically on this.

A third feature might be that different elements in your speech may be of different sensitivity in terms of recognition and understanding, or acceptability by the user. You may need rather a clever combination of those three features to make the best use of the information in encoding in the first place, or in knowing which preamble or supporting parameters are required to tell you how to do your bandwidth extension. Thank you.

Peter Vary: The standardisation of the different speech codecs is always based on subjective listening tests. In the codec competition, different companies propose codec candidates, and there is no objective measure reliable enough to decide between them. Therefore, we need subjective listening tests, which are very expensive, in different languages, with native listeners and native speakers. That, first of all, is a necessary constraint which is fulfilled for the coding part.

For bandwidth extension, I can tell you that it makes a big difference whether you apply it to German, French or Japanese. You hear the variation, and it might work better for one language than for another. You could implement a language switch, or language recognition, or something like that. That is why artificial bandwidth extension without any side information is only the second best solution: it gives you some wideband impression and sometimes it works well. If you could adapt it to the speaker, then it would be very nice.

Hence the proposal to add side information. In that case, the concept is to hide these bits in the bit-stream, but we first have to get the bits. This is done at the transmit side, where we still have the higher frequency band: we apply the artificial bandwidth extension locally and compare the result with the original. We then transmit the correction terms as hidden information to the receiver. If you have a strange language there, hopefully the

56 The Royal Academy of Engineering
