
COMMUNICATION THEORY

Frans M.J. Willems (1)

Spring 2005

(1) Group Signal Processing Systems, Electrical Engineering Department, Eindhoven University of Technology.


Contents

I Some History

1 Historical facts related to communication
1.1 Telegraphy and telephony
1.2 Wireless communications
1.3 Electronics
1.4 Classical modulation methods
1.5 Fundamental bounds and concepts

II Optimum Receiver Principles

2 Decision rules for discrete channels
2.1 System description
2.2 Decision rules, an example
2.3 The "maximum a-posteriori" decision rule
2.4 The "maximum likelihood" decision rule
2.5 Scalar and vector channels
2.6 Exercises

3 Decision rules for the real scalar channel
3.1 System description
3.2 MAP and ML decision rules
3.3 The scalar additive Gaussian noise channel
3.3.1 Introduction
3.3.2 The MAP decision rule
3.3.3 The Q-function
3.3.4 Probability of error
3.4 Exercises

4 Real vector channels
4.1 System description
4.2 Decision variables, MAP decoding
4.3 Decision regions
4.4 The additive Gaussian noise vector channel
4.4.1 Introduction
4.4.2 Minimum Euclidean distance decision rule
4.4.3 Error probabilities
4.5 Theorem of reversibility
4.6 Exercises

5 Waveform channels, correlation receiver
5.1 System description, additive white Gaussian noise
5.2 Waveform expansions in series of orthonormal functions
5.3 An equivalent vector channel
5.4 A correlation receiver is optimum
5.5 Sufficient statistic
5.6 Exercises

6 Waveform channels, building-blocks
6.1 Another sufficient statistic
6.2 The Gram-Schmidt orthogonalization procedure
6.3 Transmitting vectors instead of waveforms
6.4 Signal space, signal structure
6.5 Receiving vectors instead of waveforms
6.6 The additive Gaussian vector channel
6.7 Processing the received vector
6.8 Relation between waveform and vector channel
6.9 Exercises

7 Matched filters
7.1 Matched-filter receiver
7.2 Direct receiver
7.3 Signal-to-noise ratio
7.4 Parseval relation
7.5 Notes
7.6 Exercises

8 Orthogonal signaling
8.1 Orthogonal signal structure
8.2 Optimum receiver
8.3 Error probability
8.4 Capacity
8.5 Exercises

III Transmitter Optimization

9 Signal energy considerations
9.1 Translation and rotation of signal structures
9.2 Signal energy
9.3 Translating a signal structure
9.4 Comparison of orthogonal and antipodal signaling
9.5 Energy of |M| orthogonal signals
9.6 Upper bound on the error probability
9.7 Exercises

10 Signaling for message sequences
10.1 Introduction
10.2 Definitions, problem statement
10.3 Bit-by-bit signaling
10.3.1 Description
10.3.2 Probability of error considerations
10.4 Block-orthogonal signaling
10.4.1 Description
10.4.2 Probability of error considerations
10.5 Exercises

11 Time, bandwidth, and dimensionality
11.1 Number of dimensions for bit-by-bit and block-orthogonal signaling
11.2 Dimensionality as a function of T
11.3 Remark
11.4 Exercises

12 Bandlimited transmission
12.1 Introduction
12.2 Sphere hardening (noise vector)
12.3 Sphere hardening (received vector)
12.4 The interior volume of a hypersphere
12.5 An upper bound for the number |M| of signal vectors
12.6 A lower bound to the number |M| of signal vectors
12.6.1 Generating a random code
12.6.2 Error probability
12.6.3 Achievability result
12.7 Capacity of the bandlimited channel
12.8 Exercises

13 Wideband transmission
13.1 Introduction
13.2 Capacity of the wideband channel
13.3 Relation between capacities, signal-to-noise ratio
13.4 Exercises

IV Channel Models

14 Pulse amplitude modulation
14.1 Introduction
14.2 Orthonormal pulses: the Nyquist criterion
14.2.1 The Nyquist result
14.2.2 Discussion
14.2.3 Proof of the Nyquist result
14.2.4 Receiver implementation
14.3 Multi-pulse transmission
14.4 Exercises

15 Bandpass channels
15.1 Introduction
15.2 Quadrature multiplexing
15.3 Optimum receiver for quadrature multiplexing
15.4 Transmission of a complex signal
15.5 Dimensions per second
15.6 Capacity of the bandpass channel
15.7 Quadrature amplitude modulation
15.8 Serial quadrature amplitude modulation
15.9 Exercises

16 Random carrier-phase
16.1 Introduction
16.2 Optimum incoherent reception
16.3 Signals with equal energy, receiver implementation
16.4 Envelope detection
16.5 Probability of error for two orthogonal signals
16.6 Exercises

17 Colored noise
17.1 Introduction
17.2 Another result on reversibility
17.3 Additive nonwhite Gaussian noise
17.4 Receiver implementation
17.4.1 Correlation type receiver
17.4.2 Matched-filter type receiver
17.5 Exercises

18 Capacity of the colored noise waveform channel
18.1 Problem statement
18.2 The sub-channels
18.3 Capacity of parallel channels, water-filling
18.4 Back to the non-white waveform channel again
18.5 Capacity of the linear filter channel
18.6 Exercises

V Coded Modulation

19 Coding for bandlimited channels
19.1 AWGN channel
19.2 |A|-Pulse amplitude modulation
19.3 Normalized S/N
19.4 Baseline performance in terms of S/N_norm
19.5 The promise of coding
19.6 Union bound
19.6.1 Sequence error probability
19.6.2 Symbol error probability
19.7 Extended Hamming codes
19.8 Coding by set-partitioning
19.8.1 Construction of the signal set
19.8.2 Distance between the signals
19.8.3 Asymptotic coding gain
19.8.4 Effective coding gain
19.8.5 Coding gains for more complex codes
19.9 Remark
19.10 Exercises

VI Appendices

A An upper bound for the Q-function
B The Fourier transform
B.1 Definition
B.2 Some properties
B.2.1 Parseval's relation
C Impulse signal, filters
C.1 The impulse or delta signal
C.2 Linear time-invariant systems, filters
D Correlation functions, power spectra
D.1 Expectation of an integral
D.2 Power spectrum
D.3 Interpretation
D.4 Wide-sense stationarity
D.5 Gaussian processes
D.6 Properties of S_x(f) and R_x(τ)
E Gram-Schmidt procedure, proof, example
E.1 Proof of result 6.2
E.2 An example
F Schwarz inequality
G Bound error probability orthogonal signaling
H Decibels
I Whitening filters
J Elementary energy E_pam of an |A|-PAM constellation


Part I

Some History


Chapter 1

Historical facts related to communication

SUMMARY: In this chapter we mention some historical facts related to telegraphy and telephony, wireless communication, electronics, and modulation. Furthermore we briefly discuss some classic fundamental bounds and concepts relevant for digital communication.

1.1 Telegraphy and telephony

1800 Alessandro Volta (Italy 1745 - 1827) Announcement of the electric battery (Volta's pile). This battery made constant-current electricity possible. Motivated by the "frog-leg" experiments of Luigi Galvani (Italy 1737 - 1798), anatomy professor at the university of Bologna, Volta discovered that it was possible to construct a battery from a plate of silver and a plate of zinc separated by spongy matter impregnated with a saline solution, repeated thirty or forty times.

1819 Hans Christian Oersted (Denmark 1777 - 1851) Discovery of the fact that an electric current generates a magnetic field. This can be regarded as the first electro-magnetic result. It took twenty years after Volta's invention to realize and prove that this effect exists, although it was known at the time that a stroke of lightning magnetized objects.

1827 Joseph Henry (U.S. 1797 - 1878) Constructed electromagnets using coils of wire.

1838 Samuel Morse (U.S. 1791 - 1872) Demonstration of the electric telegraph in Morristown, New Jersey. The first telegraph line linked Washington with Baltimore. It became operational in May 1844. By 1848 every state east of the Mississippi was linked by telegraph lines. The alphabet that was designed by Morse (a portrait-painter) transforms letters into variable-length sequences (code words) of dots and dashes (see table 1.1). A dash should last roughly three times as long as a dot. Frequent letters (e.g. E) get short code words; less frequently occurring letters (e.g. J, Q, and Y) are represented by longer words. The first transcontinental telegraph line was completed in 1861. A transatlantic cable became operational in 1866; a cable had already been completed in 1858, but it failed after a few weeks.


A . -       G - - .     M - -       S . . .     Y - . - -
B - . . .   H . . . .   N - .       T -         Z - - . .
C - . - .   I . .       O - - -     U . . -     period . - . - . -
D - . .     J . - - -   P . - - .   V . . . -   ? . . - - . .
E .         K - . -     Q - - . -   W . - -
F . . - .   L . - . .   R . - .     X - . . -

Table 1.1: Morse code.

It should be mentioned that in 1833 C.F. Gauss and W.E. Weber demonstrated an electromagnetic telegraph in Göttingen, Germany.

1875 Alexander Graham Bell (Scotland 1847 - 1922 U.S.) Invention of the telephone. Bell was a teacher of the deaf. His invention was patented in 1876 (electro-magnetic telephone). In 1877 Bell established the Bell Telephone Company. Early versions provided service over several hundred miles. Advances in quality resulted e.g. from the invention of the carbon microphone.

1900 Michael Pupin (Yugoslavia 1858 - 1935 U.S.) Obtained a patent on loading coils for the improvement of telephone communications. Adding these Pupin-coils at specified intervals along a telephone line reduced attenuation significantly. The same invention was disclosed by Campbell two days later than Pupin. Pupin sold his patent for $455,000 to AT&T Co. Pupin (and Campbell) were the first to set up a theory of transmission lines. The implications of this theory were not at all obvious to "experimentalists".

1.2 Wireless communications

1831 Michael Faraday (England 1791 - 1867) Discovery of the fact that a changing magnetic field induces an electric current in a conducting circuit. This is the famous law of induction.

1873 James Clerk Maxwell (Scotland 1831 - 1879, England) The publication of "Treatise on Electricity and Magnetism". Maxwell combined the results obtained by Oersted and Faraday into a single theory. From this theory he could predict the existence of electromagnetic waves.

1886 Heinrich Hertz (Germany 1857 - 1894) Demonstration of the existence of electromagnetic waves. In his laboratory at the university of Karlsruhe, Hertz used a spark-transmitter to generate the waves and a resonator to detect them.

1890 Edouard Branly (France 1844 - 1940) Origination of the coherer. This is a receiving device for electromagnetic waves based on the principle that, while most powdered metals are poor direct-current conductors, metallic powder becomes conductive when a high-frequency current is applied. The coherer was further refined by Augusto Righi (Italy 1850 - 1920) at the University of Bologna and by Oliver Lodge (England 1851 - 1940).



1896 Aleksander Popov (Russia 1859 - 1906) Wireless telegraph transmission between two buildings (200 meters) was shown to be possible. Marconi did similar experiments at roughly the same time.

1901 Guglielmo Marconi (Italy 1874 - 1937) attended the lectures of Righi in Bologna and built a first radio telegraph in 1895. He used Hertz's spark transmitter and Lodge's coherer, and added antennas to improve performance. In 1898 he developed a tuning device and was now able to use two radio circuits simultaneously. The first cross-Atlantic radio link was operated by Marconi in 1901: a radio signal originating from Cornwall, England, was received at St. John's, Newfoundland, 1700 miles away. It was important that Marconi applied directional antennas.

1955 John R. Pierce (U.S. 1910 - 2002) Pierce (Bell Laboratories) proposed the use of satellites for communications and did pioneering work in this area. Originally this idea came from Arthur C. Clarke, who already in 1945 suggested using earth-orbiting satellites as relay points between earth stations. The first such satellite, Telstar I, built by Bell Laboratories, was launched in 1962. It served as a relay station for TV programs across the Atlantic. Pierce also headed the Bell-Labs team that developed the transistor and suggested the name for it.

1.3 Electronics

1874 Karl Ferdinand Braun (Germany 1850 - 1918) Observation of rectification at metal contacts to galena (lead sulfide). These semiconductor devices were the forerunners of the "cat's whisker" diodes used for the detection of radio signals (crystal detectors).

1904 John Ambrose Fleming (England 1849 - 1945) Invention of the thermionic diode, "the valve", that could be used as a rectifier. A rectifier can convert trains of high-frequency oscillations into trains of intermittent but unidirectional current. A telephone can then be used to produce a sound with the frequency of the trains of sparks.

1906 Lee DeForest (U.S. 1873 - 1961) Invention of the vacuum triode, a diode with grid control. This invention made practical the cascade amplifier, the triode oscillator and the regenerative feedback circuit. As a result, transcontinental telephone transmission became operational in 1915. A transatlantic cable was not laid until 1953.

1948 John Bardeen, Walter Brattain and William Shockley Development of the theory of the junction transistor. This transistor was first fabricated in 1950. The point-contact transistor was invented in December 1947 at Bell Telephone Labs.

1958 Jack Kilby and Robert Noyce (U.S.) Invention of the integrated circuit.
1958 Jack Kilby and Robert Noyce (U.S.) Invention of the integrated circuit.



1.4 Classical modulation methods

Beginning of the 20th century Carrier-frequency currents were generated by electric arcs or by high-frequency alternators. These currents were modulated by carbon transmitters and demodulated (or converted to audio) by a crystal detector or another kind of rectifier.

1909 George A. Campbell (U.S. 1870 - 1954) Internal AT&T memorandum on the "Electric Wave Filter". Campbell there discussed band-pass filters that would reject frequencies other than those in a narrow band. A patent application filed in 1915 was awarded [2].

1915 Edwin Howard Armstrong (U.S. 1890 - 1954) Invention of regenerative amplifiers and oscillators. Regenerative amplification (using positive feedback to obtain a nearly oscillating circuit) considerably increased the sensitivity of the receiver. DeForest claimed the same invention.

1915 Hendrik van der Bijl, Ralph V.L. Hartley, and Raymond A. Heising In a patent that was filed in 1915, van der Bijl showed how the non-linear portion of a vacuum tube could be utilized to modulate and demodulate. Heising participated in a project (in Arlington, VA) in which a vacuum-tube transmitter based on the van der Bijl modulator was designed. Heising invented the constant-current modulator. Hartley also participated in the Arlington project and designed the receiver. He invented an oscillator that was named after him. In a paper published in 1923, Hartley explained how suppressing the carrier signal and using only one of the sidebands could economize transmitter power and reduce interference.

1915 John R. Carson Publication of a mathematical analysis of the modulation and demodulation process. Description of single-sideband and suppressed-carrier methods.

1918 Edwin Howard Armstrong Invention of the superheterodyne radio receiver. The combination of the received signal with a local oscillation resulted in an audio beat-note.

1922 John R. Carson "Notes on the Theory of Modulation" [3]. Carson compares amplitude modulation (AM) and frequency modulation (FM). He came to the conclusion that FM was inferior to AM from the perspective of bandwidth requirement. He overlooked the possibility that wide-band FM might offer some advantages over AM, as was later demonstrated by Armstrong.

1933 Edwin Howard Armstrong Demonstration of an FM system to RCA. A patent was granted to Armstrong. Despite the larger bandwidth that is needed, FM gives considerably better performance than AM.

1.5 Fundamental bounds and concepts

1924 Harry Nyquist (Sweden 1889 - 1976 U.S.) Shows that the number of resolvable (non-interfering) pulses that can be transmitted per second over a bandlimited channel is proportional to the channel bandwidth [15]. If the bandwidth is W Hz, the number of pulses per second cannot exceed 2W. This is strongly related to the sampling theorem, which says that time is essentially discrete: if a time-continuous signal has bandwidth not larger than W Hz, then it is completely specified by samples taken from it at discrete time instants 1/(2W) seconds apart.

1928 Ralph V.L. Hartley (U.S. 1888 - 1970) Determined the number of distinguishable pulse amplitudes [8]. Suppose that the amplitudes are confined to the interval [−A, +A], and that the receiver can estimate an amplitude reliably only to an accuracy of ±Δ. Then the number of distinguishable pulses is roughly 2(A + Δ)/(2Δ) = A/Δ + 1. Clearly Hartley was concerned with digital communication. He realized that inaccuracy (due to noise) limited the amount of information that could be transmitted.
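As a small worked illustration (the numbers here are chosen purely for illustration and are not taken from the text): for amplitudes confined to [−1, +1] volt and a receiver accuracy of ±0.1 volt, Hartley's count gives 2(A + Δ)/(2Δ) = A/Δ + 1 = 1/0.1 + 1 = 11 distinguishable amplitudes, i.e. about log_2 11 ≈ 3.46 bits per pulse.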

1938 Alec Reeves Invention of Pulse Code Modulation (PCM) for digital encoding of speech signals. In World War II, PCM-encoding allowed transmission of encrypted speech (Bell Laboratories). In [16] Oliver, Pierce and Shannon compared PCM to classical modulation systems.

1939 Homer Dudley Description of a vocoder. This system overthrew the idea that communication requires a bandwidth at least as wide as that of the signal to be communicated.

1942 Norbert Wiener (U.S. 1894 - 1964) Investigated the problem shown in figure 1.1. The linear filter is to be chosen such that its output ŝ(t) is the best mean-square approximation to s(t), given the statistical properties of the processes S(t) and N(t). A drawback of Wiener's result was that modulation did not fit into his model and could not be analyzed.

[Figure 1.1: The communication problem considered by Wiener: the signal s(t), corrupted by additive noise n(t), is passed through a linear filter whose output is the estimate ŝ(t).]

1943 Dwight O. North Discovery of the matched filter. This filter was shown to be the optimum detector of a known signal in additive white noise.

1947 Vladimir A. Kotelnikov (Russia 1908 - ) Analysis of several modulation systems. His noise-immunity work [11] dealt with how to design receivers (but not how to choose the transmitted signals) to minimize the error probability in the case of digital signals, or the mean-square error in the case of analog signals. Kotelnikov discovered the sampling theorem independently of Nyquist in 1933.


[Figure 1.2: Claude E. Shannon, founder of Information Theory. Photo IEEE-IT Soc. Newsl., Summer 1998.]

1948 Claude E. Shannon (U.S. 1916 - 2001) Publication of "A Mathematical Theory of Communication" [20]. Shannon (see figure 1.2) showed that noise does not place an inescapable restriction on the accuracy of communication. He showed that noise properties, channel bandwidth, and restricted signal magnitude can be incorporated into a single parameter C which he called the channel capacity. If the cardinality |M| of the message set M grows as a function of the signal duration T such that

(1/T) log_2 |M| ≈ C − ε,

for some small ε > 0, then arbitrarily high communication accuracy is possible by increasing T, i.e. by taking longer and longer signals (code words). Conversely, Shannon showed that reliable communication is not possible when

(1/T) log_2 |M| ≈ C + ε,

for any ε > 0. Therefore a channel can be considered as a pipe through which bits can reliably be transmitted up to rate C.

Other results of Shannon include the application of Boole's algebra to switching circuits (M.S. thesis, MIT, 1938). It was also Shannon who introduced the sampling theorem to the engineering community in 1949.


Part II

Optimum Receiver Principles


Chapter 2

Decision rules for discrete channels

SUMMARY: In this first chapter we investigate a discrete channel, i.e. a channel with a discrete input and output alphabet. For this discrete channel we determine the optimum receiver, i.e. the receiver that minimizes the error probability P_E. The so-called maximum a-posteriori (MAP) receiver turns out to be optimum. When all messages are equally likely, the optimum receiver reduces to a maximum-likelihood (ML) receiver. In the last section of this chapter we distinguish between discrete scalar and vector channels.

2.1 System description

[Figure 2.1: Elements in a communication system based on a discrete channel: source → m → transmitter → s_m → discrete channel → r → receiver → m̂ → destination.]

We start by considering a very simple communication system based on a discrete channel. It consists of the following elements (see figure 2.1):

SOURCE An information source produces a message m ∈ M = {1, 2, · · · , |M|}, one out of |M| alternatives. Message m occurs with probability Pr{M = m} for m ∈ M. This probability is called the a-priori message probability, and M is the name of the random variable associated with this mechanism.

TRANSMITTER The transmitter sends a signal s_m if message m is to be transmitted. This signal is input to the channel. It assumes values from the discrete channel input alphabet S. The random variable corresponding to the signal is denoted by S. The collection of used signals is s_1, s_2, · · · , s_|M|.

DISCRETE CHANNEL The channel produces an output r that assumes a value from a discrete alphabet R. If the channel input is signal s ∈ S, the output r ∈ R occurs with conditional probability Pr{R = r|S = s}. This channel output is directed to the receiver. The random variable associated with the channel output is denoted by R.

When message m ∈ M occurs, the transmitter chooses s_m as channel input. Therefore

Pr{R = r|M = m} = Pr{R = r|S = s_m} for all r ∈ R.   (2.1)

These conditional probabilities describe the behavior of the transmitter followed by the channel.

RECEIVER The receiver forms an estimate m̂ of the transmitted message (or signal) by looking at the received channel output r ∈ R, hence m̂ = f(r). The random variable corresponding to this estimate is called M̂. We assume here that m̂ ∈ M = {1, 2, · · · , |M|}, thus the receiver has to choose one of the possible messages (and cannot, e.g., declare an error).

DESTINATION The destination accepts the estimate m̂.

Note that we call our system discrete because the channel is discrete: it has a discrete input and output alphabet.

The performance of our communication system is evaluated by considering the probability of error. This performance depends on the receiver, i.e. on the mapping f(·).

Definition 2.1 The mapping f(·) is called the decision rule.

Definition 2.2 The probability of error is defined as

P_E = Pr{M̂ ≠ M}.   (2.2)

We are now interested in choosing a decision rule f(·) that minimizes P_E. A decision rule that minimizes the error probability P_E is called optimum. The corresponding receiver is called an optimum receiver.

The probability of correct (right) decision is denoted by P_C, and obviously P_E + P_C = 1.

2.2 Decision rules, an example

To get more familiar with the properties of our system we study an example first.

Example 2.1 We assume that |M| = 2, i.e. there are two possible messages. Their a-priori probabilities can be found in the following table:

m    Pr{M = m}
1    0.4
2    0.6

The two signals corresponding to the messages are s_1 and s_2, and the conditional probabilities Pr{R = r|S = s_m} are given in the table below for the values of r and m that can occur. There are three values that r can assume, i.e. R = {a, b, c}.

m    Pr{R = a|S = s_m}    Pr{R = b|S = s_m}    Pr{R = c|S = s_m}
1    0.5                  0.4                  0.1
2    0.1                  0.3                  0.6

Note that the row sums are equal to one.

Now suppose that we use the decision rule f(·) that is given by:

r       a    b    c
f(r)    1    1    2

This means that the receiver outputs the estimate m̂ = 1 if the channel output is a or b, and m̂ = 2 if the channel output is c. To determine the probability of error P_E that is achieved by this rule we first compute the joint probabilities

Pr{M = m, R = r} = Pr{M = m} Pr{R = r|M = m}
                 = Pr{M = m} Pr{R = r|S = s_m},   (2.3)

where we used (2.1). We list these joint probabilities in the following table:

m    Pr{M = m, R = a}    Pr{M = m, R = b}    Pr{M = m, R = c}
1    0.20                0.16                0.04
2    0.06                0.18                0.36

Now we can determine the probability of correct decision P_C = 1 − P_E, which is

P_C = Pr{M = 1, R = a} + Pr{M = 1, R = b} + Pr{M = 2, R = c}
    = 0.20 + 0.16 + 0.36 = 0.72.   (2.4)

This decision rule is not optimum. To see why, note that for every r ∈ R a certain message m̂ ∈ M is chosen. In other words, the decision rule selects in each column exactly one joint probability. These selected probabilities are added together to form P_C. Our decision rule selects the joint probability Pr{M = 1, R = b} = 0.16 in the column that corresponds to R = b. A larger P_C is obtained if for output R = b message 2 is chosen instead of message 1. In that case the larger joint probability Pr{M = 2, R = b} = 0.18 is selected. Now the probability of correct decision becomes

P_C = Pr{M = 1, R = a} + Pr{M = 2, R = b} + Pr{M = 2, R = c}
    = 0.20 + 0.18 + 0.36 = 0.74.   (2.5)

We may conclude that, in general, to maximize P_C, we have to choose the m that achieves the largest Pr{M = m, R = r} for the r that was received. This will be made more precise in the following section.
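The computation in Example 2.1 is easy to mechanize. The following minimal Python sketch (not part of the original text) evaluates P_C for any decision rule on this example channel by summing the selected joint probabilities Pr{M = f(r), R = r}:

# Minimal sketch (not from the text): brute-force evaluation of decision rules for Example 2.1.
prior = {1: 0.4, 2: 0.6}                         # a-priori probabilities Pr{M = m}
trans = {1: {'a': 0.5, 'b': 0.4, 'c': 0.1},      # transition probabilities Pr{R = r | S = s_m}
         2: {'a': 0.1, 'b': 0.3, 'c': 0.6}}

def prob_correct(rule):
    """P_C for a decision rule given as a dict mapping each output r to an estimate m."""
    return sum(prior[rule[r]] * trans[rule[r]][r] for r in rule)

print(round(prob_correct({'a': 1, 'b': 1, 'c': 2}), 2))   # 0.72, the suboptimal rule of Example 2.1
print(round(prob_correct({'a': 1, 'b': 2, 'c': 2}), 2))   # 0.74, picking the largest joint probability per column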



2.3 The "maximum a-posteriori" decision rule

We can upper bound the probability of correct decision as follows:

P_C = ∑_r Pr{M = f(r), R = r} ≤ ∑_r max_m Pr{M = m, R = r}.   (2.6)

This upper bound is achieved if for all r the estimate f(r) corresponds to the message m that achieves the maximum in max_m Pr{M = m, R = r}. Hence an optimum decision rule for the discrete channel achieves

Pr{M = f(r), R = r} ≥ Pr{M = m, R = r}, for all m ∈ M,   (2.7)

for all r ∈ R that can be received, i.e. that have Pr{R = r} > 0. Note that it is possible that for some channel outputs r ∈ R more than one decision f(r) is optimum.

Definition 2.3 For a communication system based on a discrete channel the joint probabilities

Pr{M = m, R = r} = Pr{M = m} Pr{R = r|M = m}
                 = Pr{M = m} Pr{R = r|S = s_m},   (2.8)

which are, given the received output value r, indexed by m ∈ M, are called the decision variables. An optimum decision rule f(·) is based on these variables.

RESULT 2.1 (MAP decision rule) To minimize the probability of error P_E, the decision rule f(·) should produce for each received r a message m having the largest decision variable. Hence for r ∈ R that actually can occur, i.e. that have Pr{R = r} > 0,

f(r) = arg max_{m∈M} Pr{M = m} Pr{R = r|S = s_m}.   (2.9)

For such r we can divide all decision variables by the channel output probability Pr{R = r} = ∑_{m∈M} Pr{M = m} Pr{R = r|M = m}. This results in the a-posteriori probabilities of the messages m ∈ M given the channel output r. By Bayes' rule

Pr{M = m} Pr{R = r|S = s_m} / Pr{R = r} = Pr{M = m|R = r}, for all m ∈ M,   (2.10)

therefore the optimum receiver chooses as m̂ a message that has maximum a-posteriori probability (MAP). This decision rule is called the MAP decision rule; the receiver is called a MAP receiver.



Example 2.2 Below are the a-posteriori probabilities that correspond to the example in the previous section (obtained from Pr{R = a} = 0.26, Pr{R = b} = 0.34, and Pr{R = c} = 0.4). Note that the probabilities in a column now add up to one.

m    Pr{M = m|R = a}    Pr{M = m|R = b}    Pr{M = m|R = c}
1    20/26              16/34              4/40
2    6/26               18/34              36/40

From this table it is clear that the optimum decision rule is

r       a    b    c
f(r)    1    2    2

This leads to an error probability

P_E = ∑_m Pr{M = m} Pr{M̂ ≠ m|M = m} = 0.4(0.4 + 0.1) + 0.6 · 0.1 = 0.26,   (2.11)

where Pr{M̂ ≠ m|M = m} = ∑_{r: f(r)≠m} Pr{R = r|S = s_m} is the probability of error conditional on the fact that message m was sent.
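As a hedged companion to Example 2.2 (not part of the original text), the a-posteriori probabilities, the MAP rule, and the error probability (2.11) can be obtained directly from Bayes' rule:

# Minimal sketch (not from the text): MAP rule of Example 2.2 via the a-posteriori probabilities.
prior = {1: 0.4, 2: 0.6}
trans = {1: {'a': 0.5, 'b': 0.4, 'c': 0.1},
         2: {'a': 0.1, 'b': 0.3, 'c': 0.6}}

p_r = {r: sum(prior[m] * trans[m][r] for m in prior) for r in 'abc'}              # Pr{R = r}: 0.26, 0.34, 0.40
posterior = {r: {m: prior[m] * trans[m][r] / p_r[r] for m in prior} for r in 'abc'}
map_rule = {r: max(posterior[r], key=posterior[r].get) for r in 'abc'}            # arg max_m Pr{M = m | R = r}
p_error = sum(prior[m] * sum(trans[m][r] for r in 'abc' if map_rule[r] != m) for m in prior)
print(map_rule, round(p_error, 2))                                                # {'a': 1, 'b': 2, 'c': 2} 0.26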

2.4 The "maximum likelihood" decision rule

When all messages are equally likely, i.e. when

Pr{M = m} = 1/|M| for all m ∈ M = {1, 2, · · · , |M|},   (2.12)

we get for the joint probabilities

Pr{M = m, R = r} = Pr{M = m} Pr{R = r|M = m} = (1/|M|) Pr{R = r|S = s_m}.   (2.13)

Considering (2.7) we obtain:

RESULT 2.2 (ML decision rule) If the a-priori message probabilities are all equal, then in order to minimize the error probability P_E a decision rule f(·) has to be applied that satisfies

f(r) = arg max_{m∈M} Pr{R = r|S = s_m},   (2.14)

for r ∈ R that can occur, i.e. for r with Pr{R = r} > 0. Such a decision rule is called a maximum likelihood (ML) decision rule; the resulting receiver is a maximum likelihood receiver.

Given the received channel output r ∈ R, the receiver chooses a message m̂ ∈ M for which the received r has maximum likelihood. The transition probabilities Pr{R = r|S = s_m} for r ∈ R and m ∈ M are called likelihoods. A receiver that operates like this (no matter whether or not the messages are indeed equally likely) is called a maximum likelihood receiver. It should be noted that such a receiver is less complex than a maximum a-posteriori receiver since it does not have to multiply the transition probabilities Pr{R = r|S = s_m} by the a-priori probabilities Pr{M = m}.



Example 2.3 Consider again the transition probabilities of the example in section 2.2:

m    Pr{R = a|S = s_m}    Pr{R = b|S = s_m}    Pr{R = c|S = s_m}
1    0.5                  0.4                  0.1
2    0.1                  0.3                  0.6

From this table follows the maximum-likelihood decision rule, which is

r       a    b    c
f(r)    1    1    2

When the messages 1 and 2 both have a-priori probability 1/2, the probability of correct decision is

P_C = ∑_m Pr{M = m} Pr{M̂ = m|M = m} = (1/2)(0.5 + 0.4) + (1/2) · 0.6 = 0.75.   (2.15)

Here Pr{M̂ = m|M = m} = ∑_{r: f(r)=m} Pr{R = r|S = s_m} is the probability of a correct decision conditioned on the fact that message m was sent. Note that the maximum likelihood decision rule is optimum here since both messages have the same a-priori probability.
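A corresponding sketch for the ML rule (again not part of the original text): the maximum-likelihood receiver ignores the priors and picks, for each output r, the message with the largest likelihood Pr{R = r|S = s_m}.

# Minimal sketch (not from the text): ML rule of Example 2.3 and its P_C for equally likely messages.
trans = {1: {'a': 0.5, 'b': 0.4, 'c': 0.1},
         2: {'a': 0.1, 'b': 0.3, 'c': 0.6}}

ml_rule = {r: max(trans, key=lambda m: trans[m][r]) for r in 'abc'}   # largest likelihood per column
p_correct = sum(0.5 * trans[m][r] for r, m in ml_rule.items())        # both priors equal to 1/2, as in (2.15)
print(ml_rule, round(p_correct, 2))                                   # {'a': 1, 'b': 1, 'c': 2} 0.75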

2.5 Scalar and vector channels

[Figure 2.2: Elements in a communication system based on a discrete vector channel: source → m → vector transmitter → s_m = (s_m1, s_m2, · · · , s_mN) → discrete vector channel → r = (r_1, r_2, · · · , r_N) → vector receiver → m̂ → destination.]

So far in this chapter we have considered a discrete channel that accepts a single input s ∈ S and produces a single output r ∈ R. This channel could be called a discrete scalar channel. There is also a vector variant of this channel, see figure 2.2. The components of a discrete vector system are described below.

TRANSMITTER Let N be a positive integer. Then the vector transmitter sends a signal vector s_m = (s_m1, s_m2, · · · , s_mN) of N components from the discrete alphabet S if message m is to be conveyed. This signal vector is input to the discrete vector channel. The random variable corresponding to the signal vector is denoted by S. The collection of used signal vectors is s_1, s_2, · · · , s_|M|.

DISCRETE VECTOR CHANNEL This channel produces an output vector r = (r_1, r_2, · · · , r_N) with N components, all assuming a value from a discrete alphabet R. If the channel input was signal s ∈ S^N, the output r ∈ R^N occurs with conditional probability Pr{R = r|S = s}. This channel output vector is directed to the vector receiver. The random variable associated with the channel output is denoted by R.

When message m ∈ M occurs, s_m is chosen by the transmitter as channel input vector. Then the conditional probabilities

Pr{R = r|M = m} = Pr{R = r|S = s_m} for all r ∈ R^N,   (2.16)

describe the behavior of the vector transmitter followed by the discrete vector channel.

RECEIVER The vector receiver forms an estimate m̂ of the transmitted message (or signal) by looking at the received channel output vector r ∈ R^N, hence m̂ = f(r). The mapping f(·) is again called the decision rule.

It will not be a big surprise that we define the decision variables for the discrete vector channel as follows:

Definition 2.4 For a communication system based on a discrete vector channel the decision variables are again the joint probabilities

Pr{M = m, R = r} = Pr{M = m} Pr{R = r|M = m}
                 = Pr{M = m} Pr{R = r|S = s_m},   (2.17)

which are, given the received output vector r, indexed by m ∈ M.

The following result tells us what the optimum decision rule should be for the discrete vector channel:

RESULT 2.3 (MAP decision rule for the discrete vector channel) To minimize the probability of error P_E, the decision rule f(·) should produce for each received vector r a message m having the largest decision variable. Hence for vectors r ∈ R^N that actually can occur, i.e. that have Pr{R = r} > 0,

f(r) = arg max_{m∈M} Pr{M = m} Pr{R = r|S = s_m}.   (2.18)

This decision rule is the MAP decision rule. There is obviously also a "maximum likelihood" version of this result.
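As a hedged sketch of Result 2.3 in action (not part of the original text): if the vector channel is assumed to be memoryless, the likelihood factorizes as Pr{R = r|S = s} = ∏_i Pr{R_i = r_i|S_i = s_i}. For a binary symmetric channel with cross-over probability p < 1/2 and equally likely codewords, the ML (and hence MAP) rule then reduces to choosing the codeword at minimum Hamming distance, which is what exercise 4 below asks you to show.

# Hedged sketch (assumes a memoryless binary symmetric channel and illustrative codewords).
p = 0.1                                            # cross-over probability, 0 < p < 1/2
codewords = {1: (0, 0, 0, 0, 0),                   # illustrative signal vectors s_1 and s_2
             2: (1, 1, 1, 1, 1)}

def likelihood(r, s):
    """Pr{R = r | S = s} = prod_i Pr{R_i = r_i | S_i = s_i} for the memoryless BSC."""
    d = sum(ri != si for ri, si in zip(r, s))      # Hamming distance d_H(r, s)
    return (p ** d) * ((1 - p) ** (len(r) - d))

def ml_decision(r):
    """Equally likely codewords: MAP = ML; for p < 1/2 this is minimum Hamming distance."""
    return max(codewords, key=lambda m: likelihood(r, codewords[m]))

print(ml_decision((0, 1, 0, 0, 1)))                # 1: two positions disagree with s_1, three with s_2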

2.6 Exercises

1. Consider the noisy discrete communication channel illustrated in figure 2.3.

[Figure 2.3: A noisy discrete communication system with input M and output R (transition diagram not reproduced here).]

(a) If Pr{M = 1} = 0.7 and Pr{M = 2} = 0.3, determine the optimum decision rule (assignment of a, b or c to 1, 2) and the resulting probability of error P_E.

(b) There are eight decision rules. Plot the probability of error for each decision rule versus Pr{M = 1} on one graph.

(c) Each decision rule has a maximum probability of error, which occurs for some least favorable a-priori probability Pr{M = 1}. The decision rule whose maximum is smallest is called the minimax decision rule. Which of the eight rules is minimax?

(Exercise 2.12 from Wozencraft and Jacobs [25].)

2. Consider a communication system based on a discrete channel. The number of messages is five, i.e. M = {1, 2, 3, 4, 5}. The a-priori probabilities are Pr{M = 1} = Pr{M = 5} = 0.1, Pr{M = 2} = Pr{M = 4} = 0.2, and Pr{M = 3} = 0.4. The signal corresponding to message m is equal to m, thus s_m = m, for m ∈ M. This signal is the input for the discrete channel.

The channel adds the noise n to the signal, hence the channel output

r = s_m + n.   (2.19)

The noise variable N is independent of the channel input and can have the values −1, 0, and +1 only. It is given that Pr{N = 0} = 0.4 and Pr{N = −1} = Pr{N = +1} = 0.3.

(a) Note that the channel output assumes values in {0, 1, 2, 3, 4, 5, 6}. Determine the probability Pr{R = 2}. For each message m ∈ M compute the a-posteriori probability Pr{M = m|R = 2}.

(b) Note that an optimum receiver minimizes the probability of error P_E. Give for each possible channel output the estimate m̂ that an optimum receiver will make. Determine the resulting error probability.

(c) Consider a maximum-likelihood receiver. Give for each possible channel output the estimate m̂ that a maximum-likelihood receiver will make. Again determine the resulting probability of error.

(Exam Communication Principles, July 5, 2002.)
(Exam Communication Principles, July 5, 2002.)



3. Consider an optical communication system. A binary message is transmitted by switching the intensity λ of a laser "on" or "off" within the time-interval [0, T]. For message m = 1 ("off") the intensity λ = λ_off ≥ 0. For message m = 2 ("on") the intensity λ = λ_on > λ_off. The laser produces photons that are counted by a detector. The number K of photons is a non-negative random variable, and the corresponding probabilities are

Pr{K = k} = ((λT)^k / k!) exp(−λT), for k = 0, 1, 2, · · · .

This probability distribution is known as the Poisson distribution.

The a-priori message probabilities are Pr{M = 1} = p_off and Pr{M = 2} = p_on. It is assumed that 0 < p_on = 1 − p_off < 1.

The detector forms the message estimate m̂ based on the number k of photons that are counted.

(a) Suppose that the detector chooses m̂ such that the probability of error P_E = Pr{M̂ ≠ M} is minimized. For what values of k does the detector choose m̂ = 1 and when does it choose m̂ = 2?

(b) Consider a maximum-likelihood detector. For what values of k does this detector choose m̂ = 1 and when does it choose m̂ = 2 now?

(c) Assume that λ_off = 0. Consider a maximum-a-posteriori detector. For what values of p_on does the detector choose m̂ = 2 no matter what k is? What is in that case the error probability P_E?

(Exam Communication Principles, July 8, 2003.)

[Figure 2.4: A binary symmetric channel with cross-over probability p: input 0 or 1 is received correctly with probability 1 − p and flipped with probability p.]

4. Consider transmission over a binary symmetric channel with cross-over probability 0 < p < 1/2 (see figure 2.4). All messages m ∈ M are equally likely. Each message m now corresponds to a signal vector (codeword) s_m = (s_m1, s_m2, · · · , s_mN) consisting of N binary digits, hence s_mi ∈ {0, 1} for i = 1, . . . , N. Such a codeword (sequence) is transmitted over the binary symmetric channel, and the resulting channel output sequence is denoted by r.

The Hamming distance d_H(x, y) between two sequences x and y, both consisting of N binary digits, is the number of positions at which they differ.

(a) Show that the optimum receiver should choose the codeword that has minimum Hamming distance to the received channel output sequence.

(b) What is the optimum receiver for 1/2 < p < 1?

(c) Assume that the messages are not equally likely and that the cross-over probability is p = 1/2. What is the minimum error probability now? What does the corresponding receiver do?

(Exam Communication Principles, October 6, 2003.)

5. The weather type in Paris is a random variable that is denoted by P. Similarly, the weather type in Marseille is a random variable denoted by M. Both random variables can assume the values {s, c, r}. Here s stands for sunny, c for cloudy, and r for rainy. We assume that the weather type in Marseille M depends on the weather type in Paris P. The joint probability of a certain weather type in Paris together with a certain weather type in Marseille can be found in the table below.

p    Pr{P = p, M = s}    Pr{P = p, M = c}    Pr{P = p, M = r}
s    0.25                0.05                0.00
c    0.20                0.10                0.10
r    0.05                0.15                0.10

The table shows e.g. that the joint probability of cloudy weather in Paris and sunny weather in Marseille is 0.20.

(a) Suppose that you have to predict a combination of weather types for Paris and Marseille such that your prediction is correct with a probability as large as possible. What combination do you choose and what is the probability that your prediction will be wrong?

(b) Suppose that you have to guess the weather type in Marseille knowing the weather type in Paris. How do you decide for each of the three possible types of weather in Paris if you want to be correct with the highest possible probability? How large is the probability that your guess is wrong?

(c) Suppose that you know the weather type in Marseille and that you must suggest two types of weather in Paris such that the probability that at least one of your suggestions is correct is maximal. What is your pair of suggestions for each of the three possible types of weather in Marseille? Compute the probability that both your suggestions are wrong, also for this case.

(Exam Communication Theory, January 12, 2005)


Chapter 3

Decision rules for the real scalar channel

SUMMARY: Here we will consider transmission of information over a channel with a single real-valued input and output. Again the optimum receiver is determined. As an example we investigate the additive Gaussian noise channel, i.e. the channel that adds Gaussian noise to the input signal.

3.1 System description

[Figure 3.1: Elements in a communication system based on a real scalar channel: source → m → transmitter → s_m → real channel → r → receiver → m̂ → destination.]

A communication system based on a channel with real-valued input and output alphabet, i.e. a real scalar channel, does not differ very much from a system based on a discrete channel (see figure 3.1).

SOURCE An information source produces the message m ∈ M = {1, 2, · · · , |M|} with a-priori probability Pr{M = m} for m ∈ M. Again M is the name of the random variable associated with this mechanism.

TRANSMITTER The transmitter now sends a real-valued scalar signal s_m if message m is to be transmitted. The scalar signal is input to the channel. It assumes a value in the range (−∞, ∞). The random variable corresponding to the signal is denoted by S. The collection of used signals is s_1, s_2, · · · , s_|M|.

REAL SCALAR CHANNEL The channel now produces an output r in the range (−∞, ∞). When the input signal is the real-valued scalar s, the channel output is generated according to the conditional probability density function p_R(r|S = s), and thus R is a real-valued scalar random variable. The probability of receiving a channel output r ≤ R < r + dr is equal to p_R(r|S = s)dr for an infinitesimally small interval dr.

When message m ∈ M occurs, signal s_m is chosen by the transmitter as channel input. Then the conditional probability density function

p_R(r|M = m) = p_R(r|S = s_m) for all r ∈ (−∞, ∞),   (3.1)

describes the behavior of the transmitter followed by the real scalar channel.

RECEIVER The receiver forms an estimate m̂ of the transmitted message (or signal) based on the received real-valued scalar channel output r, hence m̂ = f(r). The mapping f(·) is called the decision rule.

DESTINATION The destination accepts the estimate m̂.

3.2 MAP and ML decision rules<br />

For the real scalar channel the probability of a correct decision can be written as
P_C = \int_{-\infty}^{\infty} \Pr\{M = f(r)\}\, p_R(r \mid M = f(r))\, dr.   (3.2)

An optimum decision rule is obtained if, after receiving the scalar R = r, the decision f (r) is<br />

taken in such a way that<br />

Pr{M = f (r)}p R (r|M = f (r)) ≥ Pr{M = m}p R (r|M = m) for all m ∈ M. (3.3)<br />

This leads to the definition of the decision variables given below.<br />

Alternatively we can assume that a value r ≤ R < r + dr was received. Then the decision<br />

variables would be<br />

Pr{M = m, r ≤ R < r + dr} = Pr{M = m} Pr{r ≤ R < r + dr|S = s m }<br />

= Pr{M = m}p R (r|S = s m )dr. (3.4)<br />

Also this reasoning leads to the definition of the decision variables below.<br />

Definition 3.1 For a system based on a real scalar channel the products<br />

Pr{M = m}p R (r|M = m) = Pr{M = m}p R (r|S = s m ) (3.5)<br />

which are, given the received output r, indexed by m ∈ M, are called the decision variables.<br />

The optimum decision rule f (·) is again based on these variables. It is possible that for certain channel outputs r more than one decision f (r) is optimum.



RESULT 3.1 (MAP) To minimize the error probability P E , the decision rule f (·) should be<br />

such that for each received r a message m is chosen with the largest decision variable. Hence<br />

for r that can be received, an optimum decision rule f (·) should satisfy
f(r) = \arg\max_{m \in \mathcal{M}} \Pr\{M = m\}\, p_R(r \mid S = s_m).   (3.6)

Both sides of the inequality (3.3) can be divided by p_R(r) = \sum_{m \in \mathcal{M}} \Pr\{M = m\}\, p_R(r \mid S = s_m), i.e. the density for the value of r that actually did occur. Then we obtain
f(r) = \arg\max_{m \in \mathcal{M}} \Pr\{M = m \mid R = r\},   (3.7)
for r for which p_R(r) > 0. This rule is again called the maximum a-posteriori (MAP) decision rule.

RESULT 3.2 (ML) When all messages have equal a-priori probabilities, i.e. when Pr{M = m} = 1/|M| for all m ∈ M, we observe from (3.6) that the optimum receiver has to choose
f(r) = \arg\max_{m \in \mathcal{M}} p_R(r \mid S = s_m),   (3.8)
for all r with p_R(r) > 0. This rule is referred to as the maximum likelihood (ML) decision rule.
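To make the MAP and ML rules concrete, the short sketch below (an added illustration, not part of the original notes) evaluates the decision variables Pr{M = m} p_R(r|S = s_m) numerically; the Gaussian likelihood used here is only an assumed example density, and the signal values and priors are arbitrary.

```python
import numpy as np

def map_decision(r, signals, priors, sigma):
    """Return the message index m maximizing Pr{M=m} * p_R(r | S=s_m) for a
    scalar observation r, assuming (as an example) Gaussian likelihoods."""
    signals = np.asarray(signals, dtype=float)
    priors = np.asarray(priors, dtype=float)
    likelihoods = np.exp(-(r - signals) ** 2 / (2 * sigma ** 2)) / np.sqrt(2 * np.pi * sigma ** 2)
    decision_variables = priors * likelihoods       # one value per message m
    return int(np.argmax(decision_variables)) + 1   # messages are numbered 1..|M|

def ml_decision(r, signals, sigma):
    """ML is MAP with equal a-priori probabilities."""
    uniform = np.ones(len(signals)) / len(signals)
    return map_decision(r, signals, uniform, sigma)

# Example: two messages, signals s_1 = +1 and s_2 = -1, priors 3/4 and 1/4.
print(map_decision(-0.3, [1.0, -1.0], [0.75, 0.25], sigma=1.0))  # -> 1
print(ml_decision(-0.3, [1.0, -1.0], sigma=1.0))                 # -> 2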

3.3 The scalar additive Gaussian noise channel<br />

3.3.1 Introduction<br />

Gaussian noise is probably the most important kind of impairment. Therefore we will investigate<br />

a simple communication situation based on a channel that adds a Gaussian noise sample n to the<br />

real scalar channel input signal s (see figure 3.2).<br />

Figure 3.2: Scalar additive Gaussian noise channel: the output is r = s + n, where the noise n has density p_N(n) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{n^2}{2\sigma^2}\right).

Definition 3.2 The scalar additive Gaussian noise (AGN) channel adds Gaussian noise N to the input signal S. This Gaussian noise N has variance σ² and mean 0. The probability density function of the noise is defined to be
p_N(n) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{n^2}{2\sigma^2}\right).   (3.9)
The noise variable N is assumed to be independent of the signal S.
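As a quick illustration (added here, not part of the original definition), the scalar AGN channel can be simulated by drawing the noise from a zero-mean Gaussian; the signal values below are arbitrary example choices.

```python
import numpy as np

def agn_channel(s, sigma, rng=None):
    """Scalar AGN channel of definition 3.2: r = s + n with n ~ N(0, sigma^2),
    independent of the input s. `s` may be a scalar or an array of inputs."""
    rng = np.random.default_rng() if rng is None else rng
    s = np.asarray(s, dtype=float)
    return s + sigma * rng.standard_normal(s.shape)

rng = np.random.default_rng(1)
r = agn_channel(np.array([+1.0, -1.0, +1.0]), sigma=1.0, rng=rng)
print(r)  # three noisy channel outputs
```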



3.3.2 The MAP decision rule<br />

We assume e.g. that |M| = 2, i.e. there are two messages and M can be either 1 or 2. The<br />

corresponding two signals s 1 and s 2 are the inputs of our channel. Without loss of generality let<br />

s 1 > s 2 .<br />

The conditional probability density function of receiving R = r given the message m depends<br />

only on the value n that the noise variable N gets. In order to receive R = r when the signal is<br />

s m , the noise variable N should have value r − s m . The noise N is independent of the signal S<br />

(and the message M). Therefore (assuming that dr is infinitely small)<br />

p_R(r \mid S = s_m) = \Pr\{r \leq R < r + dr \mid S = s_m\}/dr
= \Pr\{r - s_m \leq N < r - s_m + dr\}/dr
= p_N(r - s_m)
= \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(r - s_m)^2}{2\sigma^2}\right), \quad \text{for } m = 1, 2.   (3.10)

We obtain an optimum (maximum a-posteriori) receiver when ˆm = 1 is chosen if
\Pr\{M = 1\} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(r - s_1)^2}{2\sigma^2}\right) \geq \Pr\{M = 2\} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(r - s_2)^2}{2\sigma^2}\right),   (3.11)
and ˆm = 2 otherwise. The following inequalities are all equivalent to (3.11):
\ln \Pr\{M = 1\} - \frac{(r - s_1)^2}{2\sigma^2} \geq \ln \Pr\{M = 2\} - \frac{(r - s_2)^2}{2\sigma^2},
2\sigma^2 \ln \Pr\{M = 1\} - (r - s_1)^2 \geq 2\sigma^2 \ln \Pr\{M = 2\} - (r - s_2)^2,
2\sigma^2 \ln \Pr\{M = 1\} + 2 r s_1 - s_1^2 \geq 2\sigma^2 \ln \Pr\{M = 2\} + 2 r s_2 - s_2^2,
2 r s_1 - 2 r s_2 \geq 2\sigma^2 \ln \frac{\Pr\{M = 2\}}{\Pr\{M = 1\}} + s_1^2 - s_2^2.   (3.12)

RESULT 3.3 (Optimum receiver for the scalar additive Gaussian noise channel) A receiver that decides ˆm = 1 if
r \geq r^* = \frac{\sigma^2}{s_1 - s_2} \ln \frac{\Pr\{M = 2\}}{\Pr\{M = 1\}} + \frac{s_1 + s_2}{2},   (3.13)
and ˆm = 2 otherwise, is optimum. When the a-priori probabilities Pr{M = 1} and Pr{M = 2} are equal the optimum threshold is
r^* = \frac{s_1 + s_2}{2}.   (3.14)
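The optimum threshold of result 3.3 is easy to compute; the sketch below (an added illustrative aid with arbitrarily chosen numbers) implements (3.13) and the corresponding decision rule, assuming s_1 > s_2 as in the text.

```python
import numpy as np

def optimum_threshold(s1, s2, sigma, p1, p2):
    """Threshold r* of (3.13) for two signals s1 > s2 with priors p1, p2."""
    return sigma ** 2 / (s1 - s2) * np.log(p2 / p1) + (s1 + s2) / 2

def decide(r, s1, s2, sigma, p1, p2):
    """Decide m-hat = 1 if r >= r*, otherwise m-hat = 2 (result 3.3)."""
    return np.where(r >= optimum_threshold(s1, s2, sigma, p1, p2), 1, 2)

# Equal priors: threshold halfway between the signals.
print(optimum_threshold(+1.0, -1.0, 1.0, 0.5, 0.5))    # 0.0
# Priors 3/4 and 1/4: threshold moves away from the more probable signal s1.
print(optimum_threshold(+1.0, -1.0, 1.0, 0.75, 0.25))  # approx -0.5493
```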

Example 3.1 Assume that σ² = 1, s_1 = +1, and s_2 = −1. If the a-priori message probabilities are equal, i.e. when Pr{M = 1} = Pr{M = 2}, we obtain an optimum receiver if we decide ˆm = 1 only for r ≥ r* = (s_1 + s_2)/2. The decision changes exactly halfway between s_1 and s_2. The intervals (−∞, r*) and (r*, ∞) are called decision intervals. In our case, since s_2 = −s_1, the threshold r* = 0. This can also be seen from figure 3.3, where the decision variables
\Pr\{M = 1\} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(r - s_1)^2}{2\sigma^2}\right), \qquad \Pr\{M = 2\} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(r - s_2)^2}{2\sigma^2}\right)   (3.15)
are plotted as a function of r assuming that Pr{M = 1} = Pr{M = 2} = 1/2.

Figure 3.3: Decision variables for equal a-priori probabilities as a function of r.

Figure 3.4: Decision variables as a function of r for a-priori probabilities 1/4 and 3/4.

Next assume that the a-priori probabilities are not equal but let Pr{M = 1} = 3/4 and Pr{M = 2} = 1/4. Now the decision variables change (see figure 3.4) and we must also change the decision rule. It turns out that for r ≥ r* = −(ln 3)/2 ≈ −0.5493 the optimum receiver should choose ˆm = 1 (see again figure 3.4). The threshold r* has moved away from the more probable signal s_1.

Note that, no matter how much the a-priori probabilities differ, there is always a value of r, the threshold that we called r* before, for which (3.11) is satisfied with equality. For r > r* the left side of (3.11) is larger than the right side; for r < r* the right side is larger.



3.3.3 The Q-function<br />

To compute the probability of error we introduce the so-called Q-function (see figure 3.5).<br />

Definition 3.3 (Q-function) This function of x ∈ (−∞, ∞) is defined as
Q(x) = \int_x^{\infty} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{\alpha^2}{2}\right) d\alpha.   (3.16)

It is the probability that a Gaussian random variable with mean 0 and variance 1 assumes a value<br />

larger than x.<br />

Figure 3.5: Gaussian probability density function for mean 0 and variance 1. The shaded area corresponds to Q(1).

A useful property of the Q-function is<br />

Q(x) + Q(−x) = 1. (3.17)<br />

In appendix A an upper bound for Q(x) is derived. In table 3.1 we tabulated Q(x) for several<br />

values of x.<br />
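In numerical work the Q-function is usually evaluated through the complementary error function; the relation Q(x) = ½ erfc(x/√2) used below follows directly from definition 3.3 (the code itself is only an added illustration).

```python
from math import erfc, sqrt

def Q(x):
    """Q(x) = P(standard Gaussian > x), via the complementary error function."""
    return 0.5 * erfc(x / sqrt(2.0))

# Reproduces a few entries of table 3.1.
for x in (0.0, 1.0, 2.0, 3.0, 4.0):
    print(x, Q(x))
```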

3.3.4 Probability of error<br />

We now want to find an expression for the error probability of our scalar system with two messages.<br />

We write<br />

P E = Pr{M = 1} Pr{R < r ∗ |M = 1} + Pr{M = 2} Pr{R ≥ r ∗ |M = 2}, (3.18)



x     Q(x)       x     Q(x)           x     Q(x)
0.00  0.50000    1.5   6.6681·10^−2    5.0   2.8665·10^−7
0.25  0.40129    2.0   2.2750·10^−2    6.0   9.8659·10^−10
0.50  0.30854    2.5   6.2097·10^−3    7.0   1.2798·10^−12
0.75  0.22663    3.0   1.3499·10^−3    8.0   6.2210·10^−16
1.00  0.15866    4.0   3.1671·10^−5   10.0   7.6200·10^−24
Table 3.1: Table of Q(x) for some values of x.

where
\Pr\{R \geq r^* \mid M = 2\} = \int_{r^*}^{\infty} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(r - s_2)^2}{2\sigma^2}\right) dr
= \int_{r^*}^{\infty} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{((r/\sigma) - s_2/\sigma)^2}{2}\right) d(r/\sigma)
= \int_{r^*/\sigma - s_2/\sigma}^{\infty} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{\mu^2}{2}\right) d\mu = Q(r^*/\sigma - s_2/\sigma),   (3.19)

and similarly
\Pr\{R < r^* \mid M = 1\} = \int_{-\infty}^{r^*} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(r - s_1)^2}{2\sigma^2}\right) dr
= \int_{-\infty}^{r^*} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{((r/\sigma) - s_1/\sigma)^2}{2}\right) d(r/\sigma)
= \int_{-\infty}^{r^*/\sigma - s_1/\sigma} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{\mu^2}{2}\right) d\mu
= 1 - Q(r^*/\sigma - s_1/\sigma)   (*)
= Q(s_1/\sigma - r^*/\sigma).   (3.20)
Note that the last equality (*) follows from (3.17).

RESULT 3.4 (Minimum probability of error) If we combine (3.18), (3.19), and (3.20), we obtain
P_E = \Pr\{M = 1\}\, Q\!\left(\frac{s_1 - r^*}{\sigma}\right) + \Pr\{M = 2\}\, Q\!\left(\frac{r^* - s_2}{\sigma}\right).   (3.21)
If the a-priori message probabilities Pr{M = 1} and Pr{M = 2} are equal then, according to (3.14), we get r* = (s_1 + s_2)/2 and
\frac{s_1 - r^*}{\sigma} = \frac{s_1 - \frac{s_1 + s_2}{2}}{\sigma} = \frac{s_1 - s_2}{2\sigma}, \qquad \frac{r^* - s_2}{\sigma} = \frac{\frac{s_1 + s_2}{2} - s_2}{\sigma} = \frac{s_1 - s_2}{2\sigma},   (3.22)



hence for the minimum probability of error we obtain
P_E = Q\!\left(\frac{s_1 - s_2}{2\sigma}\right).   (3.23)

Example 3.2 For s_1 = +1, s_2 = −1, and σ² = 1, we obtain if Pr{M = 1} = Pr{M = 2} that
P_E = Q(1) ≈ 0.1587.   (3.24)
For Pr{M = 1} = 3/4 and Pr{M = 2} = 1/4 we get that r* = −(ln 3)/2 ≈ −0.5493 and
P_E = \frac{3}{4}\, Q\!\left(1 + \frac{\ln 3}{2}\right) + \frac{1}{4}\, Q\!\left(1 - \frac{\ln 3}{2}\right) ≈ 0.75\, Q(1.5493) + 0.25\, Q(0.4507) ≈ 0.75 \cdot 0.0607 + 0.25 \cdot 0.3261 ≈ 0.1270.   (3.25)
Note that this is smaller than Q(1) ≈ 0.1587, which would be the error probability after ML detection (which is suboptimum here).
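A quick Monte Carlo experiment (an added illustration, not part of the original example) can confirm the numbers of examples 3.1 and 3.2: simulate the AGN channel, apply the threshold rule of result 3.3, and compare the empirical error rate with (3.21).

```python
import numpy as np

rng = np.random.default_rng(0)
s1, s2, sigma = +1.0, -1.0, 1.0
p1, p2 = 0.75, 0.25
n_trials = 200_000

# Draw messages, transmit the corresponding signals, add Gaussian noise.
m = rng.choice([1, 2], size=n_trials, p=[p1, p2])
s = np.where(m == 1, s1, s2)
r = s + sigma * rng.standard_normal(n_trials)

# MAP threshold from (3.13) and the resulting decisions.
r_star = sigma**2 / (s1 - s2) * np.log(p2 / p1) + (s1 + s2) / 2
m_hat = np.where(r >= r_star, 1, 2)

print("empirical P_E:", np.mean(m_hat != m))   # close to 0.1270 from (3.25)
```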

3.4 Exercises<br />

Figure 3.6: Conditional probability density functions p_R(r|M = 1) and p_R(r|M = 2).

1. A communication system is used to transmit one of two equally likely messages, 1 and 2.<br />

The channel output is a real-valued random variable R, the conditional density functions of<br />

which are shown in figure 3.6. Determine the optimum receiver decision rule and compute<br />

the resulting probability of error.<br />

(Exercise 2.23 from Wozencraft and Jacobs [25].)<br />

2. The noise n in figure 3.7a is Gaussian with zero mean, i.e. E[N] = 0. If one of two<br />

equally likely messages is transmitted, using the signals of 3.7b, an optimum receiver<br />

yields P E = 0.01.<br />

(a) What is the minimum attainable probability of error P_E^{min} when the channel of 3.7a is used with three equally likely messages and the signals of 3.7c? And with four

equally likely messages and the signals of 3.7d?<br />

(b) How do the answers to the previous questions change if it is known that E[N] = 1<br />

instead of 0?


Figure 3.7: (a) The additive noise channel r = s + n; (b) two signals at −2 and +2; (c) three signals at −4, 0, and +4; (d) four signals at −4, 0, +4, and +8.

(Exercise 4.1 from Wozencraft and Jacobs [25].)<br />

3. Show that p(n) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{n^2}{2}\right) for −∞ < n < ∞ is a probability density function (non-negative, integrating to 1).

4. Consider a communication channel with an input s that can only assume positive values.<br />

The probability density function of the output R when the input is s is given by<br />

p_R(r \mid S = s) = \begin{cases} \frac{s - |r|}{s^2} & \text{for } 0 \leq |r| \leq s \\ 0 & \text{for } |r| > s, \end{cases}

see figure 3.8.<br />

Figure 3.8: The probability density function p_R(r|S = s): a triangle of height 1/s on the interval [−s, s].

There are two messages i.e. M = {1, 2} and the corresponding signals are s 1 = 1/2 and<br />

s 2 = 2. Let the a-priori probabilities Pr{M = 1} = Pr{M = 2} = 1/2.<br />

(a) Sketch the decision variables as a function of the output r in a single figure. For what<br />

values r does an optimum receiver decide ˆM = 1?



(b) Determine the corresponding error probability P E .<br />

(c) If the a-priori probability Pr{M = 1} is small enough, an optimum receiver will<br />

choose ˆM = 2 for all r. What is the largest value of Pr{M = 1} for which this<br />

happens? What is the error probability for this value of Pr{M = 1}?<br />

(Exam Communication Theory, November 18, 2003)<br />

5. Consider a communication channel with an input s that can only assume positive values.<br />

The probability density function of the output R when the input is s is given by<br />

p_R(r \mid S = s) = \frac{1}{\sqrt{2\pi s^2}} \exp\left(-\frac{r^2}{2s^2}\right).

Note that conditionally on the input s, this R is Gaussian, with mean zero and variance s 2 .<br />

In figure 3.9 this density is plotted for s = 1.<br />

Figure 3.9: The probability density function p_R(r|S = 1).

There are two messages i.e. M = {1, 2} and the corresponding signals are s 1 = 1 and<br />

s 2 = √ e. Let the a-priori probabilities Pr{M = 1} = Pr{M = 2} = 1/2.<br />

(a) Sketch both probability density functions p R (r|S = s 1 ) and p R (r|S = s 2 ) in a single<br />

figure. For what values r does an optimum receiver decide ˆM = 1?<br />

(b) Determine the corresponding error probability P E . Express it in terms of the Q(·)<br />

function.<br />

(c) If the a-priori probability Pr{M = 1} is small enough, an optimum receiver will<br />

choose ˆM = 2 for all r. What is the largest value of Pr{M = 1} for which this<br />

happens? What is the error probability for this value of Pr{M = 1}?<br />

(Exam Communication Theory, January 23, 2004)


Chapter 4<br />

Real vector channels<br />

SUMMARY: In this chapter we discuss the channel whose input and output are vectors of<br />

real-valued components. An important example of such a real vector channel is the additive<br />

Gaussian noise (AGN) vector channel. This channel adds zero-mean Gaussian noise to each<br />

input signal component. All noise samples are assumed to be independent of each other and<br />

have the same variance. For the AGN vector channel we determine the optimum receiver.<br />

If all messages are equally likely this receiver chooses the signal vector that is closest to the<br />

received vector in Euclidean sense. We also investigate the error behavior of this channel.<br />

The last part of this chapter deals with the theorem of reversibility.<br />

4.1 System description<br />

Figure 4.1: A communication system based on a real vector channel: the transmitter maps the message m to the vector signal (s_{m1}, s_{m2}, · · ·, s_{mN}), the channel outputs (r_1, r_2, · · ·, r_N), and the receiver forms the estimate m̂.

In a communication system based on a real vector channel (see figure 4.1) the channel input<br />

and output are vectors with real-valued components.<br />

SOURCE The information source generates the message m ∈ M = {1, 2, · · · , |M|} with a-<br />

priori probability Pr{M = m} for m ∈ M. M is the random variable associated with this<br />

mechanism.<br />

TRANSMITTER The transmitter sends a vector-signal s m = (s m1 , s m2 , · · · , s m N ) consisting<br />

of N components if message m is to be transmitted. Each component assumes a value in<br />

the range (−∞, ∞). The random variable corresponding to the signal vector is denoted<br />


by S. The vector-signal is input to the channel. The collection of used signal vectors is<br />

s 1 , s 2 , · · · , s |M| .<br />

REAL VECTOR CHANNEL The channel produces an output vector r = (r 1 , r 2 , · · · , r N )<br />

consisting of N components. We assume here that all these components assume values<br />

from (−∞, ∞). The conditional probability density function of the channel output vector<br />

r when the input vector s is transmitted is p R (r|S = s). Thus, for S = s, the probability<br />

of receiving an N-dimensional channel output vector R with components R n for n = 1, N<br />

satisfying r n ≤ R n < r n + dr n is equal to p R (r|S = s)dr for an infinitely small dr =<br />

dr 1 dr 2 · · · dr N and r = (r 1 , r 2 , · · · , r N ).<br />

When message m ∈ M is sent, signal s m is chosen by the transmitter as channel input<br />

vector. Then the conditional probability density function<br />

p R (r|M = m) = p R (r|S = s m ) for all r ∈ (−∞, ∞) N , (4.1)<br />

describes the behavior of the cascade of the vector transmitter and the vector channel.<br />

RECEIVER The receiver forms ˆm based on the received real-valued channel output vector r,<br />

hence ˆm = f (r). Mapping f (·) is called the decision rule.<br />

DESTINATION The destination accepts ˆm.<br />

4.2 Decision variables, MAP decoding<br />

The optimum receiver upon receiving r determines which one of the possible messages has<br />

maximum a-posteriori probability. It therefore chooses the decision rule f (r) such that for all r<br />

that actually can be received<br />

Pr{M = f (r)}p R (r|M = f (r)) ≥ Pr{M = m}p R (r|M = m), for all m ∈ M. (4.2)<br />

In other words, upon receiving r, the optimum receiver produces an estimate ˆm that corresponds<br />

to a largest decision variable. Given r, the decision variables for the real vector channel are<br />

Pr{M = m}p R (r|M = m) = Pr{M = m}p R (r|S = s m ) for all m ∈ M. (4.3)<br />

That this is MAP-decoding follows from the Bayes rule which says that
\Pr\{M = m \mid R = r\} = \frac{\Pr\{M = m\}\, p_R(r \mid M = m)}{p_R(r)}   (4.4)

for r with p R (r) = ∑ m∈M Pr{M = m}p R(r|M = m) > 0. Note that p R (r) > 0 for an output<br />

vector r that has actually been received.



4.3 Decision regions<br />

The optimum receiver, upon receiving r, determines the maximum of all decision variables which<br />

are given in (4.3). To compute these decision variables the a-priori probabilities Pr{M = m}<br />

must be known to the receiver and also the conditional density functions p R (r|S = s m ) for all<br />

m ∈ M. This calculation can be carried out for every r in the observation space, i.e. the set of<br />

all possible output vectors r. Each point r in the observation space is therefore assigned to one of<br />

the possible messages m ∈ M, the message that is actually chosen by the receiver. This results<br />

in a partitioning of the observation space in at most |M| regions.<br />

Definition 4.1 Given the decision rule f (·) we can write
I_m = \{r \mid f(r) = m\},   (4.5)
where I_m is called the decision region that corresponds to message m ∈ M.

Note that in the example in subsection 3.3.2 of the previous chapter we considered decision<br />

intervals. A decision region is an N-dimensional generalization of a (one-dimensional) decision<br />

interval.<br />

Figure 4.2: Three two-dimensional signal vectors and their decision regions.

Example 4.1 Suppose (see figure 4.2) that we have three messages, i.e. M = {1, 2, 3}, and the corresponding signal vectors are two-dimensional, i.e. s_1 = (1, 2), s_2 = (2, 1), and s_3 = (1, −3). A possible partitioning of the observation space into the three decision regions I_1, I_2, and I_3 is shown in the figure.



4.4 The additive Gaussian noise vector channel<br />

4.4.1 Introduction<br />

The actual shape of the decision regions is determined by the a-priori message probabilities<br />

Pr{M = m}, the signals s m , and the conditional density functions p R (r|S = s m ) for all m ∈<br />

M. A relatively simple but again quite important situation is the case where the channel adds Gaussian noise to the signal components (see figure 4.3).

Figure 4.3: Additive Gaussian noise vector channel: r = s + n, where the noise vector n has density p_N(n) = \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left(-\frac{\|n\|^2}{2\sigma^2}\right).

Definition 4.2 For the output of the additive Gaussian noise (AGN) vector channel we have<br />

that<br />

r = s + n, (4.6)<br />

where n = (n 1 , n 2 , · · · , n N ) is an N-dimensional noise vector. We denote this random noise<br />

vector by N. The noise vector N is assumed to be independent of the signal vector S. Moreover<br />

the N components of the noise vector are assumed to be independent of each other. All noise<br />

components have mean 0 and the same variance σ 2 . Therefore the joint probability density<br />

function of the noise vector is given by<br />

p_N(n) = \prod_{i=1,N} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{n_i^2}{2\sigma^2}\right) = \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left(-\frac{1}{2\sigma^2} \sum_{i=1,N} n_i^2\right).   (4.7)

The notation in the definition can be contracted by noting that
\sum_{i=1,N} n_i^2 = (n \cdot n) = \|n\|^2,   (4.8)
where (a · b) = \sum_{i=1,N} a_i b_i is the dot (inner) product of the vectors a = (a_1, a_2, · · ·, a_N) and b = (b_1, b_2, · · ·, b_N). Thus
p_N(n) = \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left(-\frac{\|n\|^2}{2\sigma^2}\right).   (4.9)
) N/2 2σ 2



4.4.2 Minimum Euclidean distance decision rule<br />

If the channel output is r and the channel input was s m then the noise vector must have been<br />

n = r − s m . This and the independence of the noise vector and the signal vector yields that<br />

p R (r|M = m) = p R (r|S = s m ) = p N (r − s m |S = s m ) = p N (r − s m ), (4.10)<br />

similar to (3.10) in the previous chapter. Now we can easily determine the decision variables,<br />

one for each m ∈ M:<br />

\Pr\{M = m\}\, p_R(r \mid M = m) = \Pr\{M = m\}\, p_N(r - s_m) = \Pr\{M = m\} \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left(-\frac{\|r - s_m\|^2}{2\sigma^2}\right).   (4.11)

Note that the factor (2πσ 2 ) N/2 is independent of m. Hence maximizing over the decision variables<br />

in (4.11) is equivalent to minimizing<br />

‖r − s m ‖ 2 − 2σ 2 ln Pr{M = m} (4.12)<br />

over all m ∈ M. We obtain a very simple decision rule if all messages are equally likely.<br />

RESULT 4.1 (Minimum Euclidean distance decision rule) If the a-priori message probabilities<br />

are all equal, the optimum receiver has to minimize the squared Euclidean distance<br />

‖r − s m ‖ 2 , over all m ∈ M. (4.13)<br />

In other words the receiver has to choose the message ˆm with corresponding signal vector s ˆm<br />

that is closest in Euclidean sense to the received vector r. Note that this result holds for the<br />

additive Gaussian noise vector channel.<br />
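Result 4.1 and its weighted form (4.12) translate directly into a few lines of code; the sketch below is an added illustration in which the signal vectors, priors, and noise variance are arbitrary example values (the signals are those of example 4.2).

```python
import numpy as np

def vector_map_decision(r, signals, priors, sigma):
    """MAP rule for the AGN vector channel: minimize ||r - s_m||^2 - 2*sigma^2*ln Pr{M=m},
    cf. (4.12). With equal priors this reduces to the minimum Euclidean distance rule."""
    r = np.asarray(r, dtype=float)
    signals = np.asarray(signals, dtype=float)        # shape (|M|, N)
    metric = np.sum((r - signals) ** 2, axis=1) - 2 * sigma**2 * np.log(priors)
    return int(np.argmin(metric)) + 1                 # messages numbered 1..|M|

# The three two-dimensional signals of example 4.2, equal priors.
signals = [(1.0, 2.0), (2.0, 1.0), (1.0, -3.0)]
priors = np.array([1/3, 1/3, 1/3])
print(vector_map_decision((1.4, 1.6), signals, priors, sigma=1.0))  # -> 1
```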

Example 4.2 Consider again (see figure 4.4) the three two-dimensional signal vectors s_1 = (1, 2), s_2 = (2, 1), and s_3 = (1, −3).

The optimum decision regions can be found by drawing the perpendicular bisectors of the sides of the<br />

signal triangle. These are the boundaries of the decision regions I 1 , I 2 , and I 3 (see figure). Note that the<br />

three bisectors have a single point in common.<br />

4.4.3 Error probabilities<br />

The error probability of an additive Gaussian noise vector channel is determined by the location of the signals s_1, s_2, · · ·, s_|M| and, more importantly, the hyperplanes that separate these signal vectors. An error occurs if the noise "pushes" a signal point to the wrong side of a hyperplane. In general it is quite difficult to determine the exact error probability; however, it is easy to compute the probability that the received signal is on the wrong side of a hyperplane.

To study this behavior we investigate a simple example in two dimensions, i.e. N = 2. Consider a signal vector s = (a, b), see figure 4.5. This signal vector is corrupted by the additive Gaussian noise vector n = (n_1, n_2) with independent components that both have mean zero and variance σ². What is now the probability P_I that the received vector r = (r_1, r_2) = (a, b) + (n_1, n_2) is in region I, the region above the line, see the figure?

Figure 4.4: Three two-dimensional signal vectors and the corresponding optimum decision regions for the additive Gaussian noise channel.

Figure 4.5: A signal point (a, b) and a line; the rotated coordinates u_1 (perpendicular to the line) and u_2 (parallel to the line), the rotation angle θ, and the region I above the line are indicated.

To solve this problem we have to change the coordinate system. Let<br />

r 1 = a + u 1 cos θ − u 2 sin θ (4.14)<br />

r 2 = b + u 1 sin θ + u 2 cos θ, (4.15)<br />

where (a, b) is the center of a Cartesian system with coordinates u_1 and u_2, and θ is the rotation angle. Coordinate u_1 is perpendicular to the line in the figure and coordinate u_2 is parallel to it.

Note that
P_I = \int_I \frac{1}{2\pi\sigma^2} \exp\left(-\frac{(r_1 - a)^2 + (r_2 - b)^2}{2\sigma^2}\right) dr_1\, dr_2.   (4.16)

Elementary calculus (see e.g. [26], p. 386, theorem 7.7.13) tells us that
\int_I f(r_1, r_2)\, dr_1\, dr_2 = \int_I f(r_1(u_1, u_2), r_2(u_1, u_2))\, |J|\, du_1\, du_2,   (4.17)
where |J| is the determinant of the Jacobian J, which is
J = \begin{pmatrix} \frac{\partial r_1}{\partial u_1} & \frac{\partial r_1}{\partial u_2} \\ \frac{\partial r_2}{\partial u_1} & \frac{\partial r_2}{\partial u_2} \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}.   (4.18)

Note that |J| = cos²θ + sin²θ = 1 and (r_1 − a)² + (r_2 − b)² = u_1² + u_2². Therefore, writing Δ for the distance from (a, b) to the line,
P_I = \int_I \frac{1}{2\pi\sigma^2} \exp\left(-\frac{u_1^2 + u_2^2}{2\sigma^2}\right) du_1\, du_2
= \int_{u_1 = \Delta}^{\infty} \int_{u_2 = -\infty}^{\infty} \frac{1}{2\pi\sigma^2} \exp\left(-\frac{u_1^2 + u_2^2}{2\sigma^2}\right) du_1\, du_2
= \int_{\Delta}^{\infty} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{u_1^2}{2\sigma^2}\right) du_1 \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{u_2^2}{2\sigma^2}\right) du_2
= Q\!\left(\frac{\Delta}{\sigma}\right) \cdot 1 = Q\!\left(\frac{\Delta}{\sigma}\right).   (4.19)

The conclusion is that the probability depends only on the distance from the signal point (a, b) to the line. This result carries over to more than two dimensions:

RESULT 4.2 For the additive Gaussian noise vector channel, the probability that the noise pushes a signal to the wrong side of a hyperplane is
P_I = Q\!\left(\frac{\Delta}{\sigma}\right),   (4.20)
where Δ is the distance from the signal point to the hyperplane and σ² is the variance of each noise component. All noise components are assumed to be zero-mean.
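Result 4.2 is easy to verify numerically; the short sketch below (added here as an illustration with arbitrarily chosen numbers) estimates, for a two-dimensional example, the probability that the noise carries a signal point across a line at distance Δ and compares it with Q(Δ/σ).

```python
import numpy as np
from math import erfc, sqrt

rng = np.random.default_rng(2)
sigma, delta = 1.0, 1.5          # noise standard deviation and distance to the line
n_trials = 200_000

# Put the signal at the origin and take the line {u : u_1 = delta};
# by the rotational symmetry of the noise this is no loss of generality.
noise = sigma * rng.standard_normal((n_trials, 2))
crossed = noise[:, 0] > delta    # noise pushed the point to the wrong side

print("empirical:", np.mean(crossed))
print("Q(delta/sigma):", 0.5 * erfc(delta / (sigma * sqrt(2.0))))
```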



4.5 Theorem of reversibility<br />

It is interesting to investigate what kind of processing of the output of a channel is allowed, in the<br />

sense that it does not degrade the performance. An important result corresponding to this issue<br />

is stated below.<br />

THEOREM 4.3 (Theorem of reversibility) The minimum attainable probability of error is not<br />

affected by the introduction of a reversible operation at the output of a channel, see figure 4.6.<br />

Figure 4.6: If G is reversible then an optimum decision can also be made from r′ = G(r) (the cascade s → channel → r → G → r′).

Proof: Consider the receiver for r ′ that is depicted in figure 4.7. This receiver first recovers<br />

r = G −1 (r ′ ). This is possible since the mapping G from r to r ′ is reversible. Then an optimum<br />

receiver for r is used to determine ˆm. The receiver constructed in this way for r ′ yields the same<br />

error probability as an optimum receiver that observes r directly, thus a reversible operation does<br />

not (necessarily) lead to an increase of P E .<br />

✷<br />

Figure 4.7: An optimum receiver for r′: the inverse mapping G⁻¹ first recovers r from r′, and an optimum receiver for r then produces m̂.

4.6 Exercises<br />

1. One of four equally likely messages is to be communicated over a vector channel which<br />

adds a (different) statistically independent zero-mean Gaussian random variable with variance<br />

N 0 /2 to each transmitted vector component. Assume that the transmitter uses the<br />

signal vectors shown in figure 4.8 and express the P E produced by an optimum receiver in<br />

terms of the function Q(x).<br />

(Exercise 4.2 from Wozencraft and Jacobs [25].)<br />

2. In the communication system diagrammed in figure 4.9 the transmitted signal s m and the<br />

noises n 1 and n 2 are all random voltages and all statistically independent. Assume that


|M| = 2, i.e. M = {1, 2}, and that
\Pr\{M = 1\} = \Pr\{M = 2\} = 1/2, \qquad s_1 = \sqrt{2 E_s}, \qquad s_2 = 0, \qquad p_{N_1}(n) = p_{N_2}(n) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{n^2}{2\sigma^2}\right).   (4.21)

Figure 4.8: Signal structure: four signal vectors s_1, s_2, s_3, s_4 placed symmetrically about the origin in two dimensions; the indicated distance is \sqrt{E_s/2}.

Figure 4.9: Communication system: the transmitted signal s_m is observed through two noisy branches, r_1 = s_m + n_1 and r_2 = s_m + n_2, both of which are fed to an optimum receiver that outputs m̂.

Figure 4.10: Detector structure: r_1 and r_2 are added and the sum r_1 + r_2 is applied to a threshold device that outputs m̂.

(a) Show that the optimum receiver can be realized as diagrammed in figure 4.10. Determine<br />

the optimum threshold setting and the value ˆm for r 1 + r 2 larger than the<br />

threshold.


Figure 4.11: Communication system: the receiver observes r_1 = s_m + n_1 and r_2 = n_1 + n_2.

Figure 4.12: Detector structure: r_2 is multiplied by a constant a and the sum r_1 + a·r_2 is applied to a threshold device that outputs m̂.

(b) Express the resulting P E in terms of Q(·).<br />

(c) By what factor would E s have to be increased to yield this same probability of error<br />

if the receiver were restricted to observing only r 1 .<br />

(Exam Communication Principles, July 4, 2001.)<br />

3. In the communication system diagrammed in figure 4.11 the transmitted signal s and the<br />

noises n 1 and n 2 are all random voltages and all statistically independent. Assume that<br />

|M| = 2 i.e. M = {1, 2} and that<br />

\Pr\{M = 1\} = \Pr\{M = 2\} = 1/2, \qquad s_1 = -s_2 = \sqrt{E_s}, \qquad p_{N_1}(n) = p_{N_2}(n) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{n^2}{2\sigma^2}\right).   (4.22)

(a) Show that the optimum receiver can be realized as diagrammed in figure 4.12 where<br />

a is an appropriately chosen constant.<br />

(b) What is the optimum value for a?<br />

(c) What is the optimum threshold setting?<br />

(d) Express the resulting P E in terms of Q(x).



(e) By what factor would E s have to be increased to yield this same probability of error<br />

if the receiver were restricted to observing only r 1 .<br />

(Exercise 4.6 from Wozencraft and Jacobs [25].)<br />

4. Consider a set of signal vectors {s 1 , s 2 , · · · , s |M| }. Result 4.1 says that if the a-priori<br />

probabilities of the corresponding messages are equal, the minimum Euclidean distance<br />

decision rule is optimum if the channel is an additive Gaussian noise vector channel (definition<br />

4.2). In this case the decision regions can be described by a set of hyperplanes (one<br />

for each message pair). Show that this is also the case if the a-priori probabilities are not<br />

all equal.


Chapter 5<br />

Waveform channels, correlation receiver<br />

SUMMARY: All topics discussed so far are necessary to analyze the additive white Gaussian<br />

noise waveform channel. In a waveform channel continuous-time waveforms are transmitted<br />

over a channel that adds continuous-time white Gaussian noise to them. We show in<br />

this chapter that an optimum receiver for such a waveform channel correlates the received<br />

waveform with all possible transmitted waveforms and makes a decision based on all these<br />

correlations. Together these correlations form a sufficient statistic.<br />

5.1 System description, additive white Gaussian noise<br />

In a communication system based on an additive white Gaussian noise waveform channel (see<br />

figure 5.1) the channel input and output signals are waveforms.<br />

TRANSMITTER The transmitter chooses the waveform s m (t) for 0 ≤ t < T , if message m is<br />

to be transmitted. The set of used waveforms is therefore {s 1 (t), s 2 (t), · · · , s |M| (t)}. The<br />

chosen waveform is input to the channel. We assume here that all these waveforms have<br />

finite-energy, i.e.<br />

E sm<br />

=<br />

∫ T<br />

0<br />

sm 2 (t)dt < ∞ for all m ∈ M, (5.1)<br />

where E sm is the energy of waveform s m (t). To convince yourself that this is a reasonable<br />

definition assume that s m (t) is the voltage across (or the current through) a resistor of 1.<br />

The total energy that is dissipated in this resistor is then equal to E sm .<br />

Figure 5.1: Communication over an additive white Gaussian noise waveform channel: the transmitter sends s_m(t), the channel adds the noise n_w(t), and the receiver observes r(t) = s_m(t) + n_w(t).


WAVEFORM CHANNEL The channel accepts the input waveform s m (t) and adds white Gaussian<br />

noise n w (t) to it, i.e. for the received waveform r(t) we have that<br />

r(t) = s m (t) + n w (t). (5.2)<br />

The noise n_w(t) is supposed to be generated by the random process N_w(t) which is zero-mean (i.e. E[N_w(t)] = 0 for all t), stationary, white, and Gaussian.¹ The autocorrelation function of the noise is
R_{N_w}(t, s) = E[N_w(t) N_w(s)] = \frac{N_0}{2}\, \delta(t - s),   (5.3)
and depends only on the time difference t − s (by the stationarity). The noise has power spectral density
S_{N_w}(f) = \frac{N_0}{2} \quad \text{for } -\infty < f < \infty,   (5.4)
i.e. it is white, see appendix D. Note that f is the frequency in Hz.

RECEIVER The receiver forms the message estimate ˆm based on the observed waveform<br />

r(t), 0 ≤ t < T . The interval [0, T ) is called the observation interval.<br />

The transmission of messages over a waveform channel is what we actually want to study in<br />

the first part of these course notes. It is a model for many digital communication systems (e.g.<br />

a telephone line including the modems on both sides, wireless communication links, recording<br />

channels). In the next sections we will show that the optimum receiver should correlate the<br />

received signal r(t) with all possible transmitted waveforms s m (t), m ∈ M to form a good<br />

message estimate. In a second chapter on waveform communication we will study alternative<br />

implementations of optimum receivers.<br />

GAUSSIAN NOISE is always everywhere. It originates in the thermal fluctuations of all<br />

matter in the universe. It creates randomly varying electromagnetic fields that will produce a

fluctuating voltage across the terminals of an antenna. The random movements of electrons in<br />

e.g. resistors will create random voltages in the electronics of the receiver. Since these voltages<br />

are the sum of many minuscule effects, by the central-limit theorem, they are Gaussian.<br />

5.2 Waveform expansions in series of orthonormal functions<br />

To determine the optimum receiver some new techniques have to be introduced. It is our aim<br />

to transform the waveform channel into vector channel first and then use our knowledge of the<br />

vector channel to determine an optimum receiver. Therefore we first expand the noise process<br />

N w (t) and the signals s m (t) in terms of a countable set of orthonormal functions (waveforms).<br />

Consider a set of functions {f_i(t), i = 1, 2, · · ·} that are orthonormal over the interval [0, T) in the sense that
\int_0^T f_i(t) f_j(t)\, dt = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j. \end{cases}   (5.5)

1 One can argue that white noise is always Gaussian, see Helstrom [9] p.36.



Figure 5.2: Example of orthonormal functions: four time-translated pulses f_1(t), f_2(t), f_3(t), f_4(t).

Some well-known sets of orthonormal functions are given in the example that follows.<br />

Example 5.1 In figure 5.2 four functions are shown that are time-translated orthonormal pulses. Note that the energy in a pulse, i.e. \int_0^T f_i^2(t)\, dt, equals 1 for i = 1, 4. In figure 5.3 five functions are shown on a one-second time interval: a pulse with amplitude 1 and four sine and cosine waves whose amplitude is √2. All these functions are again mutually orthogonal and have unit energy. Note that functions like these form the first building blocks that are used in Fourier series.
form the first building blocks that are used in Fourier series.<br />

Now assume² that all outcomes n_w(t) of the random process N_w(t) can be represented in terms of the orthonormal functions f_1(t), f_2(t), · · · as
n_w(t) = \sum_{i=1}^{\infty} n_i f_i(t).   (5.6)
Then the coefficients n_i, i = 1, 2, · · ·, will satisfy
n_i = \int_0^T n_w(t) f_i(t)\, dt.   (5.7)
To see why, substitute (5.6) in the right side of (5.7) and integrate making use of (5.5). Next assume that also all signals can be expressed in terms of the same set of orthonormal functions, i.e.
s_m(t) = \sum_{i=1}^{\infty} s_{mi} f_i(t), \quad \text{for } m \in \{1, 2, \cdots, |\mathcal{M}|\},   (5.8)

² In this discussion we will focus only on the main concepts and do not bother very much about mathematical details such as the existence of such a "complete set" of orthonormal functions, extra conditions on the random process N_w(t) and on the waveforms s_m(t), the convergence of series, the interchanging of orders of summation and integration, etc. For a more precise treatment we refer e.g. to Gallager [7], Helstrom [9], or Lee and Messerschmitt [13].



Figure 5.3: Example of orthonormal functions: first five functions in a Fourier series (the constant 1, √2 cos(2πt), √2 cos(4πt), √2 sin(2πt), and √2 sin(4πt) on a one-second interval).

where the coefficients s_{mi}, m ∈ M, i = 1, 2, · · ·, are given by
s_{mi} = \int_0^T s_m(t) f_i(t)\, dt.   (5.9)
Then also the received signal r(t) = s_m(t) + n_w(t) can be expanded as
r(t) = \sum_{i=1}^{\infty} r_i f_i(t),   (5.10)
with coefficients r_i, i = 1, 2, · · ·, that satisfy
r_i = \int_0^T r(t) f_i(t)\, dt = \int_0^T s_m(t) f_i(t)\, dt + \int_0^T n_w(t) f_i(t)\, dt = s_{mi} + n_i.   (5.11)
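The coefficients (5.9) to (5.11) are just inner products over [0, T), so they can be approximated numerically; the sketch below (an added illustration with an arbitrarily chosen pulse basis and noise level) checks that projecting a received waveform onto orthonormal functions recovers r_i = s_{mi} + n_i.

```python
import numpy as np

T, n_samples = 1.0, 4000
t = np.linspace(0.0, T, n_samples, endpoint=False)
dt = T / n_samples

def inner(x, y):
    """Approximate the inner product integral_0^T x(t) y(t) dt."""
    return float(np.sum(x * y) * dt)

# Example basis: four unit-energy pulses translated in time (cf. figure 5.2);
# amplitude 2 on an interval of length T/4 gives energy 4 * (T/4) = 1.
basis = [np.where((t >= i * T / 4) & (t < (i + 1) * T / 4), 2.0, 0.0) for i in range(4)]

s_coeffs = np.array([0.5, -1.0, 0.0, 2.0])               # an example signal vector
s_wave = sum(c * f for c, f in zip(s_coeffs, basis))      # s(t) = sum_i s_i f_i(t)
rng = np.random.default_rng(3)
r_wave = s_wave + 0.2 * rng.standard_normal(n_samples) / np.sqrt(dt)  # crude white noise

r_coeffs = np.array([inner(r_wave, f) for f in basis])    # r_i = integral r(t) f_i(t) dt
print(r_coeffs)   # close to s_coeffs plus independent Gaussian noise samples
```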

5.3 An equivalent vector channel<br />

An optimum receiver can now make a decision based on the vector r ∞ = (r 1 , r 2 , r 3 , · · · ) instead<br />

of the waveform r(t). By the theorem of reversibility (Theorem 4.3) this does not affect the<br />

smallest possible achievable error probability P E .<br />

Similarly we can assume that instead of the signal s_m(t) the vector s_m^∞ = (s_{m1}, s_{m2}, s_{m3}, · · ·) is transmitted in order to convey the message m ∈ M; the reason for this is that s_m^∞ and s_m(t) are equivalent.



All this implies that we have transformed our waveform channel into a vector channel (with<br />

an infinite number of dimensions however). Since<br />

r^∞ = s_m^∞ + n^∞,   (5.12)

with n ∞ = (n 1 , n 2 , n 3 , · · · ) the resulting vector channel adds noise to the input vector. What<br />

can we say now about the noise components N i , i = 1, 2, · · · ?<br />

Note first that each component results from linear operations on the Gaussian process N w (t).<br />

Therefore the noise components are jointly Gaussian.<br />

For the mean of component N_i for i = 1, 2, · · ·, we find that
E[N_i] = E\!\left[\int_0^T N_w(t) f_i(t)\, dt\right] = \int_0^T E[N_w(t)]\, f_i(t)\, dt = 0.   (5.13)

For the correlation of component N_i, i = 1, 2, · · ·, and component N_j, j = 1, 2, · · ·, we obtain
E[N_i N_j] = E\!\left[\int_0^T\!\!\int_0^T N_w(\alpha) N_w(\beta) f_i(\alpha) f_j(\beta)\, d\alpha\, d\beta\right]
= \int_0^T\!\!\int_0^T E[N_w(\alpha) N_w(\beta)]\, f_i(\alpha) f_j(\beta)\, d\alpha\, d\beta
= \int_0^T\!\!\int_0^T \frac{N_0}{2}\, \delta(\alpha - \beta) f_i(\alpha) f_j(\beta)\, d\alpha\, d\beta
= \int_0^T \frac{N_0}{2}\, f_i(\alpha) f_j(\alpha)\, d\alpha = \frac{N_0}{2}\, \delta_{ij},   (5.14)
and thus the noise variables all have variance N_0/2 and are uncorrelated and thus independent of each other. In our derivation we used the Kronecker delta function which is defined as
\delta_{ij} = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \neq j. \end{cases}   (5.15)

As a consequence of all this our vector channel is an additive Gaussian noise vector channel<br />

as described in definition 4.2.<br />

5.4 A correlation receiver is optimum<br />

In the previous subsection we have seen that our waveform channel is equivalent to an additive<br />

Gaussian noise vector channel (with an infinite number of dimensions). From (4.12) we know<br />

what the decision variables are for this situation. The decision variable for m ∈ M is expressed<br />

as<br />

‖r ∞ − s ∞ m ‖2 − 2σ 2 ln Pr{M = m} = ‖r ∞ − s ∞ m ‖2 − N 0 ln Pr{M = m}, (5.16)



where we have substituted N 0 /2 for the variance σ 2 . The receiver has to minimize (5.16) over<br />

m ∈ M. With (a ∞ · b ∞ ) = ∑ ∞<br />

i=1 a i b i for a ∞ = (a 1 , a 2 , · · · ) and b ∞ = (b 1 , b 2 , · · · ), we<br />

rewrite<br />

\|r^\infty - s_m^\infty\|^2 = ((r^\infty - s_m^\infty) \cdot (r^\infty - s_m^\infty)) = (r^\infty \cdot r^\infty) - 2(r^\infty \cdot s_m^\infty) + (s_m^\infty \cdot s_m^\infty) = \|r^\infty\|^2 - 2(r^\infty \cdot s_m^\infty) + \|s_m^\infty\|^2.   (5.17)

Since \|r^\infty\|^2 does not depend on m, the optimum receiver should maximize
(r^\infty \cdot s_m^\infty) + c_m \quad \text{over } m \in \mathcal{M} = \{1, 2, \cdots, |\mathcal{M}|\},   (5.18)
with
c_m = \frac{N_0}{2} \ln \Pr\{M = m\} - \frac{\|s_m^\infty\|^2}{2}.   (5.19)

Now it would be nice if the receiver could use some simple method to determine the dot<br />

product (r^∞ · s_m^∞) and the squared vector length ‖s_m^∞‖², which are both sums of an infinite

number of components. We will see next that this is possible and in fact quite easy. For the dot<br />

product we find<br />

\int_0^T r(t) s_m(t)\, dt = \int_0^T r(t) \sum_{i=1}^{\infty} s_{mi} f_i(t)\, dt = \sum_{i=1}^{\infty} s_{mi} \int_0^T r(t) f_i(t)\, dt = \sum_{i=1}^{\infty} s_{mi} r_i = (r^\infty \cdot s_m^\infty).   (5.20)
Similarly we get for the squared vector length
E_{s_m} = \int_0^T s_m^2(t)\, dt = \int_0^T s_m(t) \sum_{i=1}^{\infty} s_{mi} f_i(t)\, dt = \sum_{i=1}^{\infty} s_{mi} \int_0^T s_m(t) f_i(t)\, dt = \sum_{i=1}^{\infty} s_{mi}^2 = \|s_m^\infty\|^2.   (5.21)
This leads to an important result.

RESULT 5.1 The optimum receiver for transmission of the signals s_m(t), m ∈ M, over an additive white Gaussian noise waveform channel has to maximize
\int_{-\infty}^{\infty} r(t) s_m(t)\, dt + c_m \quad \text{over } m \in \mathcal{M} = \{1, 2, \cdots, |\mathcal{M}|\}.   (5.22)
Here
c_m = \frac{N_0}{2} \ln \Pr\{M = m\} - \frac{E_{s_m}}{2}.   (5.23)
This receiver is called a correlation receiver (see figure 5.4).
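As an added sketch (using discrete-time approximations of the integrals and arbitrary example waveforms and parameters), the correlation receiver of result 5.1 can be written as follows.

```python
import numpy as np

def correlation_receiver(r, waveforms, priors, N0, dt):
    """Correlation receiver of result 5.1, with integrals approximated by sums.
    `r` is the sampled received waveform, `waveforms[m]` the sampled s_m(t)."""
    correlations = np.array([np.sum(r * s) * dt for s in waveforms])
    energies = np.array([np.sum(s * s) * dt for s in waveforms])
    c = (N0 / 2) * np.log(priors) - energies / 2          # the constants c_m of (5.23)
    return int(np.argmax(correlations + c)) + 1           # messages numbered 1..|M|

# Two example waveforms on [0, T): antipodal pulses.
T, n = 1.0, 1000
t = np.linspace(0.0, T, n, endpoint=False)
dt = T / n
s1, s2 = np.ones(n), -np.ones(n)
N0 = 0.5
rng = np.random.default_rng(4)
r = s1 + rng.standard_normal(n) * np.sqrt(N0 / (2 * dt))  # crude white-noise approximation
print(correlation_receiver(r, [s1, s2], np.array([0.5, 0.5]), N0, dt))  # usually 1
```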


Figure 5.4: Correlation receiver, an optimum receiver for waveforms in white Gaussian noise: r(t) is multiplied by each s_m(t) and integrated over [0, T), the constant c_m is added, and the largest result determines m̂.

5.5 Sufficient statistic<br />

In the previous section we have seen that the receiver can make an optimum decision by looking<br />

only at the correlations<br />

∫ T<br />

0<br />

r(t)s m (t)dt, for all m ∈ M. (5.24)<br />

These |M| correlations together form a so-called sufficient statistic for making an optimum decision.<br />

It is unnecessary to store the received waveform r(t) after having determined the correlations.<br />

Or, to put it in another way, given the correlations the received waveform r(t) is not<br />

relevant anymore.<br />

5.6 Exercises<br />

1. Consider a waveform communication system in which one of two equally likely messages<br />

is transmitted, hence M = {1, 2} and Pr{M = 1} = Pr{M = 2} = 1/2. The waveforms<br />

s 1 (t) and s 2 (t) corresponding to the messages both have energy E, i.e.<br />

\int_0^T s_1^2(t)\, dt = \int_0^T s_2^2(t)\, dt = E,

where [0, T ) is the observation interval. The channel adds zero-mean additive Gaussian<br />

noise with power spectral density S Nw ( f ) = 1 to the transmitted waveform.<br />

(a) Find an expression for the error probability obtained by a correlation receiver when<br />

s 1 (t) = −s 2 (t).<br />

(b) What is the error probability of a correlation receiver when \int_0^T s_1(t) s_2(t)\, dt = 0?


Chapter 6<br />

Waveform channels, building-blocks<br />

SUMMARY: In this chapter we show that when signal waveforms are linear combinations of a set of orthonormal waveforms (building-block waveforms), the waveform channel can be transformed into an AGN vector channel without loss of optimality. To each input

waveform there corresponds a vector in what is called the signal space. It is possible to<br />

make an optimum decision based on the correlations of the channel output waveform with<br />

the building-block waveforms. The difference between this vector of correlations and the<br />

transmitted vector is a noise vector which is Gaussian and spherically distributed.<br />

6.1 Another sufficient statistic<br />

Suppose that the number |M| of waveforms s m (t) is large. Then the receiver has to determine<br />

|M| correlations, which is quite complex. The question that now pops up is whether an optimum decision can be based on a smaller number of correlations. This turns out to be true when

all waveforms s m (t), m ∈ M, can be represented as linear combinations of N orthonormal functions<br />

and N < |M|. We call these orthonormal functions building-block waveforms or building<br />

blocks.<br />

The decomposition of waveforms into building-blocks is also essential when we are interested<br />

in evaluating the error probability of a system.<br />

Definition 6.1 The building-block waveforms {ϕ_i(t), i = 1, N} satisfy by definition
\int_0^T \varphi_i(t) \varphi_j(t)\, dt = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j. \end{cases}   (6.1)
Note that such a set of building blocks is just a set of functions orthonormal over the interval [0, T).

We say that a basis of N building-block waveforms exists for a collection s 1 (t), s 2 (t), · · · ,<br />

s |M| (t) of transmitted waveforms if we can decompose these transmitted waveforms as<br />

s_m(t) = \sum_{i=1,N} s_{mi} \varphi_i(t), \quad \text{for } m \in \mathcal{M} = \{1, 2, \cdots, |\mathcal{M}|\}.   (6.2)



Note that the coefficient s_{mi} corresponding to a building block ϕ_i(t) satisfies
\int_0^T s_m(t) \varphi_i(t)\, dt = \int_0^T \sum_{j=1,N} s_{mj} \varphi_j(t) \varphi_i(t)\, dt = \sum_{j=1,N} s_{mj} \int_0^T \varphi_j(t) \varphi_i(t)\, dt = s_{mi}.   (6.3)

Now the correlation of r(t) and the signal s_m(t) for m ∈ M can be expressed as
\int_0^T r(t) s_m(t)\, dt = \int_0^T r(t) \sum_{i=1,N} s_{mi} \varphi_i(t)\, dt = \sum_{i=1,N} s_{mi} \int_0^T r(t) \varphi_i(t)\, dt = \sum_{i=1,N} s_{mi} r_i,   (6.4)
with
r_i = \int_0^T r(t) \varphi_i(t)\, dt \quad \text{for } i = 1, N.   (6.5)

Observe that all the correlations \int_0^T r(t) s_m(t)\, dt, m ∈ M, can be computed from the N correlations \int_0^T r(t) \varphi_i(t)\, dt, i = 1, N. Therefore also these N correlations \int_0^T r(t) \varphi_i(t)\, dt, i = 1, N, are a sufficient statistic since from these N numbers an optimum decision can be made. This is summarized in the following theorem.

RESULT 6.1 If the transmitted waveforms s_m(t), m ∈ M, are linear combinations of the N building-block waveforms ϕ_i(t), i = 1, N, i.e. if
s_m(t) = \sum_{i=1,N} s_{mi} \varphi_i(t), \quad \text{for } m \in \mathcal{M},   (6.6)
then the receiver can make an optimum decision from
r_i = \int_0^T r(t) \varphi_i(t)\, dt, \quad \text{for } i = 1, N.   (6.7)
The numbers r_1, r_2, · · ·, r_N therefore form a sufficient statistic. The optimum receiver produces as estimate the message m that maximizes \sum_{i=1}^{N} r_i s_{mi} + c_m over m ∈ M. Note that c_m is given by (5.23).

All this results in the receiver that is shown in figure 6.1. There we first see a bank of N multipliers and integrators. This bank yields the correlations r_i = \int_0^T r(t) \varphi_i(t)\, dt, i = 1, N. Then a matrix multiplication yields \sum_{i=1,N} r_i s_{mi} = \int_0^T r(t) s_m(t)\, dt for all m ∈ M. Adding the constants c_m and choosing as message estimate the m that achieves the largest sum yields again an optimum receiver.


Figure 6.1: Correlation receiver based on building blocks: a bank of N multipliers and integrators produces r_1, · · ·, r_N, a weighting matrix forms \sum_{i=1}^{N} r_i s_{mi}, the constants c_m are added, and the largest result determines m̂.

6.2 The Gram-Schmidt orthogonalization procedure<br />

It is not so difficult to construct an orthonormal basis, i.e. a set of N building-block waveforms,

for a given set of |M| finite-energy signaling waveforms. The Gram-Schmidt procedure can be<br />

used.<br />

RESULT 6.2 (Gram-Schmidt) For an arbitrary signal collection, i.e. a collection of waveforms<br />

{s 1 (t), s 2 (t), · · · , s |M| (t)} on the interval [0, T ) we can construct a set of N ≤ |M|<br />

building-block waveforms {ϕ 1 (t), ϕ 2 (t), · · · , ϕ N (t)} and find coefficients s mi such that for m =<br />

1, 2, · · · , |M| the signals can be synthesized as s m (t) = ∑ i=1,N s miϕ i (t).<br />

The proof of this result can be found in appendix E. There is also an example there where for three signals we determine a set of building-blocks.

A certain set of signals can be expanded in many different sets of building-block waveforms,<br />

also in sets with a larger dimensionality in general. What remains the same, however, is the geometrical

configuration of the vector representations of the signals (see exercise 3 in this chapter).<br />

The Gram-Schmidt procedure always yields a base with the smallest possible dimension.<br />
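Result 6.2 is constructive; the sketch below (an added illustration working with sampled waveforms, not the appendix-E derivation itself) applies the Gram-Schmidt procedure to a list of sampled signals and returns the building-block waveforms and the coefficient vectors.

```python
import numpy as np

def gram_schmidt(signals, dt, tol=1e-9):
    """Orthonormalize sampled waveforms (entries of `signals`) over [0, T).
    Returns (basis, coeffs) with signals[m] ~= coeffs[m] @ basis."""
    basis, coeffs = [], []
    for s in np.asarray(signals, dtype=float):
        c = [np.sum(s * phi) * dt for phi in basis]    # projections on existing basis
        residual = s - sum(ci * phi for ci, phi in zip(c, basis))
        energy = np.sum(residual ** 2) * dt
        if energy > tol:                               # a new dimension is needed
            basis.append(residual / np.sqrt(energy))
            c.append(np.sqrt(energy))
        coeffs.append(c)
    N = len(basis)
    coeff_matrix = np.array([ci + [0.0] * (N - len(ci)) for ci in coeffs])
    return np.array(basis), coeff_matrix

# Example: three sampled waveforms on [0, 1); the third is a combination of the first two.
t = np.linspace(0.0, 1.0, 1000, endpoint=False)
dt = 1.0 / 1000
s1, s2 = np.ones_like(t), np.sqrt(2.0) * np.cos(2 * np.pi * t)
basis, coeffs = gram_schmidt([s1, s2, s1 + 0.5 * s2], dt)
print(basis.shape[0], coeffs)   # two basis functions suffice (N <= |M|)
```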

6.3 Transmitting vectors instead of waveforms<br />

In the previous section we have seen that for each collection of signals s 1 (t), s 2 (t), · · · , s |M| (t)<br />

we can construct a finite orthonormal base such that each waveform s m (t) for m ∈ M is a linear<br />

combination of the building-block waveforms ϕ 1 (t), ϕ 2 (t), · · · , ϕ N (t) that form the base, i.e.<br />

s_m(t) = \sum_{i=1,N} s_{mi} \varphi_i(t).   (6.8)


Figure 6.2: A modulator forms the waveform s_m(t) from the vector s_m: each coefficient s_mi multiplies ϕ_i(t) and the N products are summed.

Hence to each waveform s_m(t) there corresponds a vector consisting of N coefficients

    s_m = (s_m1, s_m2, · · · , s_mN).    (6.9)

Now, instead of the waveform s_m(t), we can say that the vector s_m is transmitted. A modulator (see figure 6.2) performs operation (6.8), which transforms a vector into the transmitted waveform.
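On the same sampled grid the modulator of figure 6.2 reduces to a single matrix-vector product; phi and coeff are taken as produced by the Gram-Schmidt sketch above (an assumption of this illustration).

def modulate(m, coeff, phi):
    # s_m(t_k) = sum_i s_mi * phi_i(t_k), i.e. operation (6.8) on samples
    return coeff[m] @ phi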

6.4 Signal space, signal structure

The next result applies to the set of all transmitted waveforms.

RESULT 6.3 If the waveforms s_1(t), s_2(t), · · · , s_|M|(t) are linear combinations of N building blocks, the collection of waveforms can be represented as a collection of vectors s_1, s_2, · · · , s_|M| in an N-dimensional space. This space is called the signal space. The collection of vectors s_1, s_2, · · · , s_|M| is called the signal structure.

Example 6.1 Consider for m = 1, 2, 3, 4 the set of phase-modulated transmitter waveforms

    s_m(t) = √(2E_s/T) cos(2π f_0 t + mπ/2)  for 0 ≤ t < T, and 0 elsewhere,    (6.10)

where f_0 is an integral multiple of 1/T.

Note that

    cos(2π f_0 t + mπ/2) = cos(2π f_0 t) cos(mπ/2) − sin(2π f_0 t) sin(mπ/2).    (6.11)


Figure 6.3: Four signals s_1, s_2, s_3, s_4 represented as vectors in a two-dimensional space (axes ϕ_1 and ϕ_2). The length of all signal vectors is √E_s.

If we now take as building-block waveforms

    ϕ_1(t) = √(2/T) cos(2π f_0 t), and
    ϕ_2(t) = √(2/T) sin(2π f_0 t),    (6.12)

for 0 ≤ t < T and zero elsewhere, we obtain the following vector representations for the signals:

    s_1 = (0, −√E_s),
    s_2 = (−√E_s, 0),
    s_3 = (0, √E_s),
    s_4 = (√E_s, 0).    (6.13)

These vectors are depicted in figure 6.3. Note that the four waveforms shown in figure 6.4 have the same signal structure (with respect to the basis also shown there) as the signals defined in (6.10). We will learn later in this chapter that both sets of waveforms also have identical error behavior.

Figure 6.4: Another set of waveforms that leads to the vector diagram in figure 6.3. Shown are two rectangular building-block waveforms ϕ_1(t) and ϕ_2(t) of height √(1/τ) on [0, τ) and [τ, 2τ), and the corresponding signals s_1(t), · · · , s_4(t) with amplitudes ±√(E_s/τ).
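The vector representation (6.13) can also be checked numerically by projecting the waveforms (6.10) onto the building blocks (6.12); the values of T, f_0 and E_s below are illustrative choices.

import numpy as np

T, f0, Es, dt = 1.0, 4.0, 2.0, 1e-4        # f0 = 4/T is an integer multiple of 1/T
t = np.arange(0.0, T, dt)
phi1 = np.sqrt(2 / T) * np.cos(2 * np.pi * f0 * t)
phi2 = np.sqrt(2 / T) * np.sin(2 * np.pi * f0 * t)
for m in (1, 2, 3, 4):
    s = np.sqrt(2 * Es / T) * np.cos(2 * np.pi * f0 * t + m * np.pi / 2)
    sm1 = np.sum(s * phi1) * dt            # coordinate along phi_1
    sm2 = np.sum(s * phi2) * dt            # coordinate along phi_2
    print(m, round(sm1, 3), round(sm2, 3)) # e.g. m = 4 gives (sqrt(Es), 0)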


6.5 Receiving vectors instead of waveforms

In the previous sections we have shown that a waveform transmitter actually sends a vector out of an N-dimensional signal space. In this section we will see that the optimum waveform receiver as derived in section 6.1, which operates with the sufficient statistics corresponding to the building blocks, actually receives a vector in this N-dimensional signal space.

Figure 6.5: Recovering the vector r = (r_1, r_2, · · · , r_N) from the received waveform r(t) with a bank of N multipliers (by ϕ_i(t)) and integrators.

Consider the first part (demodulator) of the optimum receiver for waveform communication working with sufficient statistics (see figure 6.5). This receiver observes

    r(t) = s_m(t) + n_w(t),    (6.14)

where s_m(t) for some m ∈ M is the transmitted waveform and n_w(t) is a realization of a white Gaussian noise process. The first part of the receiver determines the N-dimensional vector r = (r_1, r_2, · · · , r_N) whose components are defined as (see figure 6.5)

    r_i = ∫_0^T r(t)ϕ_i(t)dt  for i = 1, 2, · · · , N.    (6.15)

The components of this vector are the sufficient statistics mentioned in section 6.1. So we can say that the optimum receiver bases its decision on the vector r instead of on the waveform r(t).

6.6 The additive Gaussian vector channel

In the previous sections we have seen that without losing optimality we may assume that an N-dimensional vector s_m for some m ∈ M is transmitted and an N-dimensional vector r is received. We will now consider the vector channel that has s_m as input and r as output. Since r(t) = s_m(t) + n_w(t) we have, for i = 1, N,

    r_i = ∫_0^T r(t)ϕ_i(t)dt = ∫_0^T s_m(t)ϕ_i(t)dt + ∫_0^T n_w(t)ϕ_i(t)dt = s_mi + n_i,    (6.16)

with

    n_i = ∫_0^T n_w(t)ϕ_i(t)dt.    (6.17)

Hence again, as in section 5.3, we get

    r = s_m + n,    (6.18)

with r = (r_1, r_2, · · · , r_N), s_m = (s_m1, s_m2, · · · , s_mN), and n = (n_1, n_2, · · · , n_N). Note that the dimension N of the vectors is the dimension of the signal space (the number of building-block waveforms), which is now finite. In section 5.3 the dimension of the vectors was infinite.

What can we say about the noise vector n that disturbs the transmitted vector s_m? Just like before, in section 5.3, the noise components are jointly Gaussian. They result from linear operations on the Gaussian process N_w(t). For the mean of component N_i, i = 1, N, we again find that

    E[N_i] = E[ ∫_0^T N_w(t)ϕ_i(t)dt ] = ∫_0^T E[N_w(t)]ϕ_i(t)dt = 0.    (6.19)

By the orthonormality of the building-block waveforms we get for the correlation of component N_i, i = 1, N, and component N_j, j = 1, N, that

    E[N_i N_j] = E[ ∫_0^T ∫_0^T N_w(α)N_w(β)ϕ_i(α)ϕ_j(β) dα dβ ]
               = ∫_0^T ∫_0^T E[N_w(α)N_w(β)] ϕ_i(α)ϕ_j(β) dα dβ
               = ∫_0^T ∫_0^T (N_0/2) δ(α − β) ϕ_i(α)ϕ_j(β) dα dβ
               = (N_0/2) ∫_0^T ϕ_i(α)ϕ_j(α) dα = (N_0/2) δ_ij,    (6.20)

and thus all N noise variables have variance N_0/2 and are uncorrelated; since they are jointly Gaussian they are also independent of each other.

RESULT 6.4 The joint density function of the N-dimensional noise vector n, which is added to the input of a vector channel derived from a waveform communication system, is

    p_N(n) = (1/(π N_0)^{N/2}) exp(−‖n‖²/N_0),    (6.21)

hence the noise is spherically symmetric and depends on the magnitude but not on the direction of n. The noise projected on each dimension i = 1, N has variance N_0/2.

We may conclude that our channel with input s_m and output r is an additive Gaussian noise (AGN) vector channel as was described in definition 4.2.
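A one-line simulation of this AGN vector channel (the random generator rng is an implementation detail of the sketch):

import numpy as np

def agn_vector_channel(s_m, N0, rng=np.random.default_rng()):
    # each coordinate of s_m is disturbed by i.i.d. zero-mean Gaussian noise
    # of variance N0/2, as stated in result 6.4
    n = rng.normal(0.0, np.sqrt(N0 / 2), size=np.shape(s_m))
    return s_m + n                      # r = s_m + n, as in (6.18)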

6.7 Processing the received vector

Note (see (4.12)) that after having determined the vector r, an optimum vector receiver should search for the message m that minimizes

    ‖r − s_m‖² − 2σ² ln Pr{M = m}.    (6.22)

Note that by result 6.4 we must substitute N_0/2 for σ² here. Inspection now shows that an optimum vector receiver should maximize

    (r · s_m) + (N_0/2) ln Pr{M = m} − ‖s_m‖²/2 = ∑_{i=1,N} r_i s_mi + (N_0/2) ln Pr{M = m} − ‖s_m‖²/2    (6.23)

over m ∈ M. If we note that

    E_sm = ∫_0^T s_m²(t)dt = ∫_0^T s_m(t) ∑_{i=1,N} s_mi ϕ_i(t) dt = ∑_{i=1,N} s_mi ∫_0^T s_m(t)ϕ_i(t)dt = ∑_{i=1,N} s_mi² = ‖s_m‖²,    (6.24)

we can observe that the back end of the receiver shown in figure 6.1 performs exactly as we expect.
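In sampled form the complete decision rule (6.23) is a few lines of Python; the array S holds the signal structure row by row and priors holds the probabilities Pr{M = m} (names chosen for this sketch).

import numpy as np

def map_decide(r, S, priors, N0):
    """r: received vector (N,), S: signal structure (|M|, N), priors: (|M|,)."""
    metrics = S @ r + 0.5 * N0 * np.log(priors) - 0.5 * np.sum(S**2, axis=1)
    return int(np.argmax(metrics))      # maximizer of (6.23)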

6.8 Relation between waveform and vector channel

We have seen that to each set of waveforms there corresponds a finite set of building-block waveforms such that each waveform is a linear combination of these building-block waveforms. We have observed that instead of the waveform s_m(t) the vector s_m is transmitted over the channel. We have also seen that an optimum receiver only needs to consider the projections (correlations) of the received signal r(t) onto the building-block waveforms. The receiver can base its decision on the vector r alone, a sufficient statistic. It is important to note that these conclusions depend on the fact that the channel adds white Gaussian noise to the input waveform.

From the above we may conclude that transmission over an additive white Gaussian noise waveform channel can actually be considered as transmission over an additive Gaussian noise vector channel (see figure 6.6). A vector transmitter produces the vector s_m when message m is generated by the source. This vector is transformed into the waveform s_m(t) by a modulator.


Figure 6.6: Reduction of waveform channel to vector channel. Source → vector transmitter (s_m) → modulator (s_m(t)) → waveform channel (r(t)) → demodulator (r) → vector receiver (m̂); the modulator, waveform channel and demodulator together form the vector channel.

The waveform s_m(t) is input to the channel, which adds white Gaussian noise. The demodulator operates on the received waveform r(t) and forms the vector r. A vector receiver determines from r the estimate m̂ of the message that was sent. The combination modulator - waveform channel - demodulator is an additive Gaussian noise vector channel. In the previous chapter we have seen how an optimum receiver for such a channel can be constructed. The decision function that applies to this situation can be found in (4.12).

We may conclude that, by converting the problem of designing an optimum receiver for waveform communication into that of designing an optimum receiver for an additive white Gaussian noise vector channel, we have solved it. We want to end this chapter with a very important observation:

RESULT 6.5 The error behavior of the waveform communication system is determined only by the collection of signal vectors (the signal structure) and N_0/2. We should realize that, depending on the chosen building-block waveforms, different sets of waveforms having the same signal structure yield the same error performance. Which building-block waveforms are actually chosen by the communication engineer may depend on other factors, e.g. bandwidth requirements, as we will see in later parts of these course notes.

6.9 Exercises<br />

1. For each set of waveforms s 1 (t), s 2 (t), · · · , s |M| (t) we can apply the Gram-Schmidt procedure<br />

to construct an orthonormal base such that each waveform s m (t) for m ∈ M is a<br />

linear combination of the building-block waveforms ϕ 1 (t), ϕ 2 (t), · · · , ϕ N (t). Show that<br />

for a given set of waveforms the Gram-Schmidt procedure results in a base with the smallest<br />

possible dimension N.<br />

2. If we apply the Gram-Schmidt procedure we must assume that the waveforms s_m(t), for m ∈ M, have finite energy. In the construction it is essential that the energy E_θm corresponding to the auxiliary signal θ_m(t) is finite. Show that

       E_θm ≤ ∫_0^T s_m²(t)dt < ∞.    (6.25)


3. Show that, independent of the choice of the building-block waveforms, the length of any signal vector s_m for m ∈ M is constant. Moreover, show that the Euclidean distance between two signal vectors s_m and s_m′ for m, m′ ∈ M is constant. What does this imply for the energy and the expected error probability of the system?

4. (a) Calculate P_E^min when the signal sets (a), (b), and (c) specified in figure 6.7 are used to communicate one of two equally likely messages over a channel disturbed by additive Gaussian noise with S_n(f) = 0.15. Note that the third set is specified by the Fourier transforms (see appendix B) of the waveforms. Use Parseval's equation.

   (b) Also calculate P_E^min for the same three sets when the a-priori message probabilities are 1/4, 3/4.

   (Based on exercise 4.8 from Wozencraft and Jacobs [25].)

Figure 6.7: The signal sets of exercise 4. Sets (a) and (b) are given by rectangular pulses s_1(t), s_2(t) in the time domain; set (c) is specified by the Fourier transforms S_1(f) and S_2(f). S_i(f), the Fourier transform of s_i(t), is real.

5. One of three equally likely messages is to be transmitted over an additive white Gaussian noise channel with S_n(f) = N_0/2 = 1 by means of pulse amplitude modulation. Specifically

       s_1(t) = +2√2 sin(2πt),
       s_2(t) = 0,
       s_3(t) = −2√2 sin(2πt),    (6.26)

   for 0 ≤ t ≤ 1 and zero elsewhere.

   (a) What mathematical operations are performed by the optimum receiver?

   (b) Express the resulting average error probability in terms of Q(·).


   (c) Calculate the minimum attainable average error probability if instead of being equal the message probabilities satisfy

           Pr{M = 1} = Pr{M = 3} = 1/(e + 2)  and  Pr{M = 2} = e/(e + 2),    (6.27)

       where e is the base of the natural logarithm.

   (Exam Communication Principles, July 5, 2002.)

6. One of two equally likely messages is to be transmitted over an additive white Gaussian noise channel with S_n(f) = N_0/2 = 2 using two waveforms s_1(t) and s_2(t). Specifically

       s_1(t) = 2√2 φ_1(t), and
       s_2(t) = 2√2 φ_2(t),    (6.28)

   where φ_1(t) and φ_2(t) are two orthonormal waveforms, i.e.

       ∫_{−∞}^{∞} φ_1²(t)dt = ∫_{−∞}^{∞} φ_2²(t)dt = 1  and  ∫_{−∞}^{∞} φ_1(t)φ_2(t)dt = 0.    (6.29)

   (a) What mathematical operations are performed by an optimum receiver?

   (b) Express the resulting average error probability in terms of Q(·).

   (c) Calculate the minimum attainable average error probability if instead of being equal the message probabilities satisfy

           Pr{M = 1} = 1/(e² + 1)  and  Pr{M = 2} = e²/(e² + 1),    (6.30)

       where e is the base of the natural logarithm.

   (Exam Communication Principles, July 8, 2003.)


Chapter 7<br />

Matched filters<br />

SUMMARY: Here we discuss matched-filter receiver structures for optimum waveform<br />

communication. The optimum receiver can be implemented as a matched-filter receiver,<br />

so a matched filter not only maximizes the signal-to-noise ratio but it can also be used to<br />

minimize the error probability. In the last part of this chapter we consider the Parseval<br />

relation between correlation and dot product.<br />

7.1 Matched-filter receiver<br />

Figure 7.1: A filter with impulse response h_i(t) = ϕ_i(T − t), matched to the building-block waveform ϕ_i(t); input r(t), output u_i(t).

Note that for all i = 1, N the building-block waveforms are such that ϕ_i(t) ≡ 0 for t < 0 and t ≥ T.

We can now replace the N multipliers and integrators by N matched filters and samplers. This can be advantageous in analog implementations, since accurate multipliers are in that case more expensive than filters. Note that appendix C contains some elementary material on filters.

Consider (see figures 7.1 and 7.2) filters with an impulse response h_i(t) = ϕ_i(T − t) for i = 1, N. Note that h_i(t) ≡ 0 for t ≤ 0 (i.e. the filter is causal) and also for t > T. For the output of such a filter we can write

    u_i(t) = r(t) ∗ h_i(t) = ∫_{−∞}^{∞} r(α)h_i(t − α)dα = ∫_{−∞}^{∞} r(α)ϕ_i(T − t + α)dα.    (7.1)


Figure 7.2: A building-block waveform ϕ_i(t) and the impulse response ϕ_i(T − t) of the corresponding matched filter.

At t = T the matched-filter output is

    u_i(T) = ∫_{−∞}^{∞} r(α)ϕ_i(α)dα = ∫_0^T r(α)ϕ_i(α)dα = r_i,    (7.2)

hence we can determine the i-th component of the vector r in this way. Note that r_i is the i-th component of the sufficient statistic corresponding to the building blocks.
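Relation (7.2) is easy to check numerically: convolving a sampled r(t) with the time-reversed building block and reading the output at t = T reproduces the correlation r_i. The waveform and noise level below are arbitrary illustrative choices.

import numpy as np

dt = 1e-3
t = np.arange(0.0, 1.0, dt)                      # T = 1 in this sketch
phi = np.sqrt(2) * np.sin(2 * np.pi * t)         # one orthonormal building block
rng = np.random.default_rng(0)
r = 0.7 * phi + rng.normal(0, 0.3, t.size)       # some received waveform
h = phi[::-1]                                    # matched filter h(t) = phi(T - t)
u = np.convolve(r, h) * dt                       # full convolution on the grid
r_i_conv = u[t.size - 1]                         # sample the output at t = T
r_i_corr = np.sum(r * phi) * dt                  # direct correlation as in (6.15)
print(r_i_conv, r_i_corr)                        # the two numbers agree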

Figure 7.3: Matched-filter receiver based on building blocks. N filters ϕ_i(T − t) are sampled at t = T to produce r_1, · · · , r_N; a weighting matrix forms (r · s_m) = ∑_i r_i s_mi, the constants c_m are added, and the largest sum determines m̂.

Now we have an alternative method for computing the vector r. This results in the receiver shown in figure 7.3. Note that for all m ∈ M

    (r · s_m) = ∑_{i=1,N} r_i s_mi = ∫_0^T r(t)s_m(t)dt.    (7.3)


A filter whose impulse response is a delayed time-reversed version of a signal ϕ_i(t) is called "matched to" ϕ_i(t). A receiver that is equipped with such filters is called a matched-filter receiver.

7.2 Direct receiver

Note that, just like the building blocks, for all m ∈ M the transmitted waveforms are such that s_m(t) ≡ 0 for t < 0 and t ≥ T.

Now for all m ∈ M consider a filter with impulse response s_m(T − t), let the waveform channel output r(t) be the input of all these filters, and sample the |M| filter outputs at t = T. Then we obtain

    ∫_{−∞}^{∞} r(α)s_m(T − t + α)dα |_{t=T} = ∫_{−∞}^{∞} r(α)s_m(α)dα = ∫_0^T r(α)s_m(α)dα.    (7.4)

This gives another method to form an optimum receiver (see figure 7.4). This receiver is called a direct receiver since the filters are matched directly to the signals {s_m(t), m ∈ M}.

Note that a direct receiver is usually more expensive than a receiver with filters matched to the building-block waveforms, since always |M| ≥ N and in practice often even |M| ≫ N. The weighting-matrix operations are not needed here, however.

Figure 7.4: Direct matched-filter receiver. The received waveform is fed to |M| filters s_m(T − t), each sampled at t = T; the constants c_m are added and the largest value determines m̂.

7.3 Signal-to-noise ratio

We have seen in the previous section that a matched-filter receiver is optimum, i.e. it minimizes the error probability P_E. The matched filter is not only optimum in this sense; we will show next that it also maximizes the signal-to-noise ratio.

Figure 7.5: A matched filter maximizes S/N. The input r(t) = s(t) + n_w(t) passes through a filter h(t), whose output u_s(T) + u_n(T) is sampled at t = T.

To see what we mean by this, consider the communication situation shown in figure 7.5. A signal s(t) is assumed to be non-zero only for 0 ≤ t < T. This signal is observed in additive white noise, i.e. the observer receives r(t) = s(t) + n_w(t). The process N_w(t) is a zero-mean white Gaussian noise process with power spectral density S_w(f) = N_0/2 for all −∞ < f < ∞.

The problem is now e.g. to decide whether the signal s(t) was present in the noise or not (or, in a similar setting, to decide whether s(t) or −s(t) was seen). Therefore the observer uses a linear time-invariant filter with impulse response h(t) and samples the filter output at time t = T. For the sampled filter output at time t = T we can write

    u(T) = r(t) ∗ h(t)|_{t=T} = ∫_{−∞}^{∞} r(T − α)h(α)dα = u_s(T) + u_n(T),    (7.5)

with

    u_s(T) = ∫_{−∞}^{∞} s(T − α)h(α)dα, and    (7.6)

    u_n(T) = ∫_{−∞}^{∞} n_w(T − α)h(α)dα,    (7.7)

where u_s(T) is the signal component and u_n(T) the noise component in the sampled filter output.

Definition 7.1 We can now define the signal-to-noise ratio as

    S/N = u_s²(T) / E[U_n²(T)],    (7.8)

i.e. the ratio between signal energy and noise variance.

The noise variance can be expressed as

    E[U_n²(T)] = E[ ∫_{−∞}^{∞} N_w(T − α)h(α)dα ∫_{−∞}^{∞} N_w(T − β)h(β)dβ ]
               = ∫_{−∞}^{∞} ∫_{−∞}^{∞} E[N_w(T − α)N_w(T − β)] h(α)h(β) dα dβ
               = (N_0/2) ∫_{−∞}^{∞} ∫_{−∞}^{∞} δ(β − α) h(α)h(β) dα dβ = (N_0/2) ∫_{−∞}^{∞} h²(α)dα.    (7.9)


RESULT 7.1 If we substitute (7.9) and (7.6) in (7.8) we obtain for the maximum attainable signal-to-noise ratio

    S/N = [ ∫_{−∞}^{∞} s(T − α)h(α)dα ]² / ( (N_0/2) ∫_{−∞}^{∞} h²(α)dα )
        ≤ [ ∫_{−∞}^{∞} s²(T − α)dα · ∫_{−∞}^{∞} h²(α)dα ] / ( (N_0/2) ∫_{−∞}^{∞} h²(α)dα )    (∗)
        = ∫_{−∞}^{∞} s²(T − α)dα / (N_0/2) = ∫_0^T s²(T − α)dα / (N_0/2).    (7.10)

The inequality (∗) in this derivation comes from the Schwarz inequality (see appendix F). Equality is obtained if and only if h(t) = Cs(T − t) for some constant C, i.e. if the filter h(t) is matched to the signal s(t).

Note that the maximum signal-to-noise ratio depends only on the energy of the waveform s(t) and not on its specific shape.

It should be noted that in this section we have demonstrated a weaker form of optimality for the matched filter than the one obtained in the previous chapter. The matched filter is not only the filter that maximizes the signal-to-noise ratio, but can be used for optimum detection as well.
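Result 7.1 can be illustrated numerically: among candidate filters, the matched choice h(t) = s(T − t) attains the largest S/N. The signal s(t) and the competing filter below are arbitrary choices for this sketch.

import numpy as np

dt, N0 = 1e-3, 1.0
t = np.arange(0.0, 1.0, dt)
s = np.where(t < 0.5, 1.0, -0.5)                     # an arbitrary signal on [0, T)

def snr(h):
    us = np.sum(s[::-1] * h) * dt                    # u_s(T) = \int s(T - a) h(a) da
    return us**2 / (N0 / 2 * np.sum(h**2) * dt)      # (7.8) with (7.9)

matched = s[::-1]                                    # h(t) = s(T - t)
mismatched = np.ones_like(s)                         # e.g. a plain integration window
print(snr(matched), snr(mismatched))                 # the matched filter wins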

7.4 Parseval relation

Equation (7.3) demonstrates that a correlation can be expressed as the dot product of the corresponding vectors. We will investigate this phenomenon more closely here. Therefore consider a set of orthonormal waveforms {ϕ_i(t), i = 1, 2, · · · , N} over the interval [0, T) and two waveforms f(t) and g(t) that can be expressed in terms of these building blocks, i.e.

    f(t) = ∑_{i=1,N} f_i ϕ_i(t),
    g(t) = ∑_{i=1,N} g_i ϕ_i(t).    (7.11)

The vector representations that correspond to f(t) and g(t) are

    f = (f_1, f_2, · · · , f_N)  and  g = (g_1, g_2, · · · , g_N).    (7.12)

Now:

RESULT 7.2 (Parseval relation)

    ∫_0^T f(t)g(t)dt = ∫_0^T ∑_{i=1,N} f_i ϕ_i(t) ∑_{j=1,N} g_j ϕ_j(t) dt
                     = ∑_{i=1,N} ∑_{j=1,N} f_i g_j ∫_0^T ϕ_i(t)ϕ_j(t)dt
                     = ∑_{i=1,N} ∑_{j=1,N} f_i g_j δ_ij = ∑_{i=1,N} f_i g_i = (f · g).    (7.13)

This result says that the correlation of f(t) and g(t), which is defined as the integral of their product, is equal to the dot product of the corresponding vectors. Note that this result can be regarded as an analogue of the Parseval relation in Fourier analysis (see appendix B), which says that

    ∫_{−∞}^{∞} f(t)g(t)dt = ∫_{−∞}^{∞} F(f)G^∗(f)df,    (7.14)

with

    F(f) = ∫_{−∞}^{∞} f(t) exp(−j2πft)dt, and
    G(f) = ∫_{−∞}^{∞} g(t) exp(−j2πft)dt.    (7.15)

Here G^∗(f) is the complex conjugate of G(f).

The consequences of result 7.2 are:

• Take g(t) ≡ f(t); then

    ∫_0^T f²(t)dt = (f · f) = ‖f‖².    (7.16)

  This means that the energy of waveform f(t) is simply the square of the length of the corresponding vector f. We therefore also call the squared length of a vector its energy. We have seen before that

    E_sm = ∫_0^T s_m²(t)dt = ‖s_m‖²,    (7.17)

  the energy corresponding to the waveform s_m(t), for m ∈ M = {1, 2, · · · , |M|}.

• Consider

    ∫_0^T r(t)s_m(t)dt = ∫_0^T r(t) ∑_{i=1,N} s_mi ϕ_i(t)dt = ∑_{i=1,N} s_mi ∫_0^T r(t)ϕ_i(t)dt = ∑_{i=1,N} s_mi r_i = (s_m · r).    (7.18)

  This result is similar to the Parseval result 7.2 but not identical, since r(t) ≠ ∑_{i=1,N} r_i ϕ_i(t), i.e. r(t) cannot be expressed as a linear combination of the building-block waveforms. Note that (7.18) is identical to equation (7.3).
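Result 7.2 is easily verified on sampled waveforms; the orthonormal basis and the coefficient vectors below are illustrative choices.

import numpy as np

dt = 1e-4
t = np.arange(0.0, 1.0, dt)
phi = np.vstack([np.ones_like(t),                        # orthonormal on [0, 1)
                 np.sqrt(2) * np.cos(2 * np.pi * t),
                 np.sqrt(2) * np.sin(2 * np.pi * t)])
f_vec = np.array([0.5, -1.0, 2.0])
g_vec = np.array([1.5, 0.3, -0.7])
f_t, g_t = f_vec @ phi, g_vec @ phi                      # synthesize f(t) and g(t)
print(np.sum(f_t * g_t) * dt, f_vec @ g_vec)             # both numbers are (f · g)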


7.5 Notes

Figure 7.6: Dwight O. North, inventor of the matched filter. Photo IEEE-IT Soc. Newsl., Dec. 1998.

The matched filter as a filter for maximizing the signal-to-noise ratio was invented by North<br />

[14] in 1943. The result was published in a classified report at RCA Labs in Princeton. The<br />

name “matched filter” was coined by Van Vleck and Middleton who independently published<br />

the result a year later in a Harvard Radio Research Lab report [24].<br />

7.6 Exercises<br />

1. One of two equally likely messages is to be transmitted over an additive white Gaussian noise channel with S_n(f) = 0.05 by means of binary pulse-position modulation. Specifically

       s_1(t) = p(t)  and  s_2(t) = p(t − 2),    (7.19)

   in which the pulse p(t) is shown in figure 7.7.

(a) What mathematical operations are performed by the optimum receiver?<br />

(b) What is the resulting average error probability?<br />

(c) Indicate two methods of implementing the receiver, each of which uses a single linear<br />

filter followed by a sampler and comparison device. Method I requires that two samples<br />

from the filter output be fed into the comparison device. Method II requires that<br />

just one sample be used. For each method sketch the impulse response of the appropriate<br />

filter and its response to p(t). Which of the methods is most easily extended to<br />

M-ary pulse position modulation, where s m (t) = p(t − 2m + 2), m = 1, 2, · · · , M?


Figure 7.7: A pulse p(t) (triangular, with peak value 2, on the interval [0, 2]).

   (d) Suggest another pair of waveforms that require the same energy as the binary pulse-position waveforms and yield the same average error probability, and a pair that achieves a smaller error probability.

   (e) Calculate the minimum attainable average error probability if

           s_1(t) = p(t) and s_2(t) = p(t − 1).    (7.20)

       Repeat this for

           s_1(t) = p(t) and s_2(t) = −p(t − 1).    (7.21)

   (Exercise 4.10 from Wozencraft and Jacobs [25].)

Figure 7.8: The waveforms s_1(t) and s_2(t), defined on [0, 7], with amplitude level √(E_s/2) for s_1(t) and levels ±√(E_s/7) for s_2(t).

2. Specify a matched filter for each of the signals shown in figure 7.8 and sketch each filter<br />

output as a function of time when the signal matched to it is the input. Sketch the output<br />

of the filter matched to s 2 (t) when the input is s 1 (t).<br />

(Exercise 4.14 from Wozencraft and Jacobs [25].)<br />

3. Consider a transmitter that sends the signal

       s_a(t) = ∑_{k=1,K} a_k h(t − (k − 1)τ),


Figure 7.9: Pulse transmission and detection with a sampled-matched-filter receiver: impulses a_1, · · · , a_K (values ±1 at 0, τ, 2τ, · · ·) drive a filter h(t), white noise n_w(t) is added to s_a(t), and the receiver filters with h(T − t) and samples at t = T + (k − 1)τ. The pulse h(t) and a resulting signal s_a(t) are also shown.

   when the message a = a_1, a_2, · · · , a_K has to be conveyed to the decoder (i.e. pulse transmission, see figure 7.9). Assume that a_k ∈ {−1, +1} for k = 1, K; thus there are 2^K messages. The signal is sent over an additive white Gaussian noise waveform channel. Assume that h(t) = 0 for t < 0 and t ≥ T. The receiver now uses a matched filter h(T − t) and samples its output at t = T + (k − 1)τ, for k = 1, K. These K samples are then processed to form an estimate of the transmitted message (sequence).

   Show that this receiver is optimum if the processing of the samples is done in the right way. What is optimum processing here?

4. One of two equally likely messages is to be transmitted over an additive white Gaussian noise channel with S_n(f) = N_0/2 = 1 by means of binary pulse-position modulation. Specifically

       s_1(t) = p(t),
       s_2(t) = p(t − 2),

   in which the pulse p(t) is shown in figure 7.10.

   (a) Describe (and sketch) an optimum receiver for this case. Express the resulting error probability in terms of Q(·).

(b) Give the implementation of an optimum receiver which uses a single linear filter<br />

followed by a sampler and comparison device. Assume that two samples from the<br />

filter output are fed into the comparison device. Sketch the impulse response of the


Figure 7.10: A rectangular pulse p(t), shown on the interval 0 ≤ t ≤ 2.

appropriate filter. What is the output of the filter at both sample moments when the<br />

filter input is s 1 (t)? What are these outputs for filter input s 2 (t)?<br />

(c) Calculate the minimum attainable average error probability if<br />

s 1 (t) = p(t) and s 2 (t) = p(t − 1).<br />

(Exam Communication Principles, October 6, 2003.)<br />

5. In a communication system based on an additive white Gaussian noise waveform channel<br />

six signals (waveforms) are used. All signals are zero for t < 0 and t ≥ 8. For 0 ≤ t < 8<br />

the signals are<br />

s 1 (t) = 0<br />

s 2 (t) = +2 cos(πt/2)<br />

s 3 (t) = +2 cos(πt/2) + 2 sin(πt/2)<br />

s 4 (t) = +2 cos(πt/2) + 4 sin(πt/2)<br />

s 5 (t) = +4 sin(πt/2)<br />

s 6 (t) = +2 sin(πt/2)<br />

The messages corresponding to the signals all have probability 1/6. The power spectral<br />

density of the noise process N w (t) is N 0 /2 = 4/9 for all f . The receiver observes the<br />

received waveform r(t) = s m (t) + n w (t) in the time interval 0 ≤ t < 8.<br />

(a) Determine a set of building-block waveforms for these six signals. Sketch these<br />

building-block waveforms. Show that they are orthonormal over [0, 8). Give the<br />

vector representations of all six signals and sketch the resulting signal structure.<br />

(b) Describe for what received vectors r an optimum receiver chooses ˆM = 1, ˆM =<br />

2, · · · , and ˆM = 6. Sketch the corresponding decision regions I 1 , I 2 , · · · , I 6 . Give<br />

an expression for the error probability P E obtained by an optimum receiver. Use the<br />

Q(·)-function.<br />

(c) Sketch and specify a matched-filter implementation of an optimum receiver for the<br />

six signals (all occurring with equal a-priori probability). Use only two matched<br />

filters.



(d) Compute the average energy of the signals. We can translate the signal structure<br />

over a vector a such that the average signal energy is minimal. Determine this vector<br />

a. What is the minimal average signal energy? Specify the modified signals s′_1(t), s′_2(t), · · · , s′_6(t) that correspond to this translated signal structure.

(Exam Communication Theory, November 18, 2003.)


Chapter 8<br />

Orthogonal signaling<br />

SUMMARY: Here we investigate orthogonal signaling. In this case the waveforms that correspond to the messages are orthogonal. We determine the average error probability P_E for these signals. It turns out that P_E can be made arbitrarily small by increasing the number of waveforms, as long as the energy per transmitted bit is larger than N_0 ln 2. Note that this is a capacity result!

8.1 Orthogonal signal structure

Consider |M| signals s_m(t), or in vector representation s_m, with a-priori probabilities Pr{M = m} = 1/|M| for m ∈ M = {1, 2, · · · , |M|}. We now define an orthogonal signal set in the following way:

Definition 8.1 All signals in an orthogonal set are assumed to have equal energy and to be orthogonal, i.e.

    s_m = √E_s ϕ_m  for m ∈ M,    (8.1)

where ϕ_m is the unit vector corresponding to dimension m. There are as many building-block waveforms ϕ_m(t) and dimensions in the signal space as there are messages.

The signals that we have defined are indeed orthogonal, since

    ∫_{−∞}^{∞} s_m(t)s_k(t)dt = (s_m · s_k) = E_s (ϕ_m · ϕ_k) = E_s δ_mk  for m ∈ M and k ∈ M.    (8.2)

Note that all signals have energy equal to E_s.

So far we have not been very explicit about the actual signals. However, we can e.g. think of (disjoint) shifts of a pulse (pulse-position modulation, PPM) or sines and cosines with an integer number of periods over [0, T) (frequency-shift keying, FSK).


8.2 Optimum receiver

How does the optimum receiver decide when it receives the vector r = (r_1, r_2, · · · , r_|M|)? It has to choose the message m ∈ M that minimizes the squared Euclidean distance between s_m and r, i.e.

    ‖r − s_m‖² = ‖r‖² + ‖s_m‖² − 2(r · s_m) = ‖r‖² + E_s − 2√E_s r_m.    (8.3)

RESULT 8.1 Since only the term −2√E_s r_m depends on m, an optimum receiver has to choose m̂ such that

    r_m̂ ≥ r_m  for all m ∈ M.    (8.4)

Now we can find an expression for the error probability.
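Before turning to the exact expression, the decision rule (8.4) can be explored with a small Monte Carlo sketch; the sample size and the parameter values below are illustrative choices.

import numpy as np

def orthogonal_pe(M, Es, N0, trials=200_000, rng=np.random.default_rng(1)):
    # by symmetry assume message 1 was sent: r = (sqrt(Es), 0, ..., 0) + noise,
    # each noise coordinate i.i.d. Gaussian with variance N0/2; decide for the
    # largest coordinate and count errors
    r = rng.normal(0.0, np.sqrt(N0 / 2), size=(trials, M))
    r[:, 0] += np.sqrt(Es)
    return np.mean(np.argmax(r, axis=1) != 0)

print(orthogonal_pe(M=16, Es=8.0, N0=1.0))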

8.3 Error probability

To determine the error probability for orthogonal signaling we may assume, because of symmetry, that signal s_1(t) was actually sent. Then

    r_1 = √E_s + n_1,
    r_m = n_m,  for m = 2, 3, · · · , |M|,  with  p_N(n) = ∏_{m=1,|M|} (1/√(π N_0)) exp(−n_m²/N_0).    (8.5)

Note that the noise vector n = (n_1, n_2, · · · , n_|M|) consists of |M| independent components, all with mean 0 and variance N_0/2.

Suppose that the first component of the received vector is α. Then we can write for the correct probability, conditional on the fact that message 1 was sent and that the first component of r is α,

    Pr{M̂ = 1 | M = 1, R_1 = α} = Pr{N_2 < α, N_3 < α, · · · , N_|M| < α} = (Pr{N_2 < α})^{|M|−1} = ( ∫_{−∞}^{α} p_N(β)dβ )^{|M|−1},    (8.6)

and therefore the correct probability is

    P_C = ∫_{−∞}^{∞} p_N(α − √E_s) ( ∫_{−∞}^{α} p_N(β)dβ )^{|M|−1} dα.    (8.7)


We rewrite this correct probability as follows:

    P_C = ∫_{−∞}^{∞} (1/√(π N_0)) exp(−(α − √E_s)²/N_0) ( ∫_{−∞}^{α} · · · dβ )^{|M|−1} dα
        = ∫_{−∞}^{∞} (1/√(2π)) exp(−(α/√(N_0/2) − √E_s/√(N_0/2))²/2) ( ∫_{−∞}^{α} · · · dβ )^{|M|−1} dα/√(N_0/2)
        = ∫_{−∞}^{∞} (1/√(2π)) exp(−(µ − √(2E_s/N_0))²/2) ( ∫_{−∞}^{µ√(N_0/2)} · · · dβ )^{|M|−1} dµ,    (8.8)

with µ = α/√(N_0/2). Furthermore

    ∫_{−∞}^{µ√(N_0/2)} (1/√(π N_0)) exp(−β²/N_0) dβ = ∫_{−∞}^{µ√(N_0/2)} (1/√(2π)) exp(−(β/√(N_0/2))²/2) dβ/√(N_0/2)
                                                    = ∫_{−∞}^{µ} (1/√(2π)) exp(−λ²/2) dλ,    (8.9)

with λ = β/√(N_0/2). Therefore

    P_C = ∫_{−∞}^{∞} p(µ − b) ( ∫_{−∞}^{µ} p(λ)dλ )^{|M|−1} dµ,    (8.10)

with p(γ) = (1/√(2π)) exp(−γ²/2) and b = √(2E_s/N_0). Note that p(γ) is the probability density function of a Gaussian random variable with mean zero and variance 1.

From (8.10) we conclude that for a given |M| the correct probability P_C depends only on b, i.e. on the ratio E_s/N_0. This can be regarded as a signal-to-noise ratio, since E_s is the signal energy and N_0 is twice the variance of the noise in each dimension.

Example 8.1 In figure 8.1 the error probability P E = 1 − P C is depicted for values of log 2 |M| =<br />

1, 2, · · · , 16 and as a function of E s /N 0 .<br />
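Expression (8.10) is also easy to evaluate numerically, e.g. with a simple trapezoidal rule on a truncated µ-axis; the truncation and grid below are implementation choices of this sketch.

import numpy as np
from math import erf, sqrt

def pe_orthogonal(M, Es_over_N0):
    b = sqrt(2 * Es_over_N0)
    mu = np.linspace(-10, b + 10, 4001)
    p = np.exp(-(mu - b) ** 2 / 2) / np.sqrt(2 * np.pi)          # p(mu - b)
    Phi = np.array([0.5 * (1 + erf(x / sqrt(2))) for x in mu])   # \int p(lambda) dlambda
    Pc = np.trapz(p * Phi ** (M - 1), mu)                        # (8.10)
    return 1 - Pc

print(pe_orthogonal(M=16, Es_over_N0=8.0))   # same setting as the Monte Carlo sketch above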

8.4 Capacity

Consider the following experiment. We keep increasing |M| and want to know how we should increase E_s/N_0 such that the error probability P_E gets smaller and smaller. It turns out that it is the energy per bit that counts. We define E_b, i.e. the energy per transmitted bit of information, as

    E_b = E_s / log_2 |M|.    (8.11)

Reliable communication is possible if E_b > N_0 ln 2. More precisely:


Figure 8.1: Error probability P_E for |M| orthogonal signals as a function of E_s/N_0 in dB, for log_2 |M| = 1, 2, · · · , 16. Note that P_E increases with |M| for fixed E_s/N_0.

RESULT 8.2 The error probability for orthogonal signaling satisfies

    P_E ≤ 2 exp(−log_2 |M| [E_b/(2N_0) − ln 2])          for E_b/N_0 ≥ 4 ln 2,
    P_E ≤ 2 exp(−log_2 |M| [√(E_b/N_0) − √(ln 2)]²)      for ln 2 ≤ E_b/N_0 ≤ 4 ln 2.    (8.12)

The proof of result 8.2 can be found in appendix G.

The consequence of (8.12) is that if E_b, i.e. the energy per bit, is larger than N_0 ln 2, we can obtain an arbitrarily small error probability by increasing |M|, the dimensionality of the signal space. This is a capacity result!

Note that the number of bits that we transmit each time is log_2 |M|, and this number grows much more slowly than |M| itself.

Example 8.2 In figure 8.2 we have plotted the error probability P_E as a function of the ratio E_b/N_0. It appears that for ratios larger than ln 2 = −1.5917 dB (this number is called the Shannon limit) the error probability decreases when |M| is made larger.
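The bound (8.12) itself is a one-line computation; the following sketch evaluates it for a few values of log_2 |M| at a fixed E_b/N_0 (the chosen value 2.0 is illustrative).

import numpy as np

def pe_bound(log2M, Eb_over_N0):
    if Eb_over_N0 >= 4 * np.log(2):
        return 2 * np.exp(-log2M * (Eb_over_N0 / 2 - np.log(2)))
    if Eb_over_N0 >= np.log(2):
        return 2 * np.exp(-log2M * (np.sqrt(Eb_over_N0) - np.sqrt(np.log(2))) ** 2)
    return 1.0   # below the Shannon limit the bound gives no guarantee

for k in (4, 8, 16):
    print(k, pe_bound(k, Eb_over_N0=2.0))   # the bound tightens as log_2 |M| grows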


Figure 8.2: Error probability for |M| = 2, 4, 8, · · · , 32768, now as a function of the ratio E_b/N_0 in dB.

8.5 Exercises<br />

1. A Hadamard matrix is a matrix whose elements are ±1. When n is a power of 2, an n × n Hadamard matrix is constructed by means of the recursion:

       H_2 = [ +1 +1
               +1 −1 ],      H_2n = [ +H_n +H_n
                                      +H_n −H_n ].    (8.13)

   Let n be a power of 2 and M = {1, 2, · · · , n}. Consider for m ∈ M the signal vectors s_m = √(E_s/n) h_m, where h_m is the m-th row of the Hadamard matrix H_n.

(a) Show that the signal set {s m , m ∈ M} consists of orthogonal vectors all having<br />

energy E s .<br />

(b) What is the error probability P E if the signal vectors correspond to waveforms that<br />

are transmitted over a waveform channel with additive white Gaussian noise having<br />

spectral density N 0 /2.<br />

(c) What is the advantage of using the Hadamard signal set over the orthogonal set from<br />

definition 8.1 if we assume that the building-block waveforms are in both cases timeshifts<br />

of a pulse? And the disadvantage?


   (d) The matrix

           H*_n = [ +H_n
                    −H_n ]    (8.14)

       defines a set of 2n signals which is called bi-orthogonal. Determine the error probability of this signal set if the energy of each signal is E_s.

A bi-orthogonal “code” with n = 32 was used for an early deep-space mission (Mariner,<br />

1969). A fast Hadamard transform was used as decoding method [6].<br />

2. Consider a communication system based on frequency-shift keying (FSK). There are 8<br />

equiprobable messages, hence Pr{M = m} = 1/8 for m ∈ {1, 2, · · · , 8}. The signal<br />

waveform corresponding to message m is<br />

s m (t) = A √ 2 cos(2πmt), for 0 ≤ t < 1.<br />

For t < 0 and t ≥ 1 all signals are zero. The signals are transmitted over an additive white<br />

Gaussian noise waveform channel. The power spectral density of the noise is S n ( f ) =<br />

N 0 /2 for all frequencies f .<br />

(a) First show that the signals s m (t), m ∈ {1, 2, · · · , 8} are orthogonal 1 . Give the energies<br />

of the signals. What are the building-block waveforms ϕ 1 (t), ϕ 2 (t), · · · , ϕ 8 (t)<br />

that result in the signal vectors<br />

s 1 = (A, 0, 0, 0, 0, 0, 0, 0),<br />

s 2 = (0, A, 0, 0, 0, 0, 0, 0),<br />

· · ·<br />

s 8 = (0, 0, 0, 0, 0, 0, 0, A)?<br />

   (b) The optimum receiver first determines the correlations r_i = ∫_0^1 r(t)ϕ_i(t)dt for i = 1, 2, · · · , 8. Here r(t) is the received waveform, hence r(t) = s_m(t) + n_w(t). For

what values of the vector r = (r 1 , r 2 , · · · , r 8 ) does the receiver decide that message<br />

m was transmitted?<br />

(c) Give an expression for the error probability P E obtained by the optimum receiver.<br />

(d) Next consider a system with 16 messages all having the same a-priori probability.<br />

The signal waveform corresponding to message m is now given by<br />

s m (t) = A √ 2 cos(2πmt), for 0 ≤ t < 1 for m = 1, 2, · · · , 8,<br />

s m (t) = −A √ 2 cos(2π(m − 8)t), for 0 ≤ t < 1 for m = 9, 10, · · · , 16.<br />

For t < 0 and t ≥ 1 these signals are again zero. What are the signal vectors<br />

s 1 , s 2 , · · · , s 16 if we use the building vectors mentioned in part (a) again?<br />

1 Hint: 2 cos(a) cos(b) = cos(a − b) + cos(a + b).


   (e) The optimum receiver again determines the correlations r_i = ∫_0^1 r(t)ϕ_i(t)dt for i = 1, 2, · · · , 8. For what values of the vector r = (r_1, r_2, · · · , r_8) does the receiver now

decide that message m was transmitted? Give an expression for the error probability<br />

P E obtained by the optimum receiver now.<br />

(Exam Communication Principles, October 6, 2003.)


Part III<br />

Transmitter Optimization<br />



Chapter 9<br />

Signal energy considerations

SUMMARY: The collection of vectors that corresponds to the signal waveforms is called the signal structure. If the signal structure is translated or rotated, the error probability need not change. In this chapter we determine the translation vector that minimizes the average signal energy. After that we demonstrate that orthogonal signaling achieves a given error performance with twice as much energy as antipodal signaling. The energy loss of orthogonal sets with many signals can be neglected, however.

9.1 Translation and rotation of signal structures<br />

In the additive white 1 Gaussian noise case, if a signal structure, i.e. the collection of vectors<br />

s 1 , s 2 , · · · , s |M| , is translated or rotated the error probability P E need not change. To see why,<br />

just assume that the corresponding decision regions I 1 , I 2 , · · · , I |M| (which need not be optimum)<br />

are translated or rotated in the same way. Then, because the additive white Gaussian<br />

noise-vector is spherically symmetric in all dimensions in the signal space, and since distances<br />

between decoding regions and signal points did not change, the error probability remains the<br />

same. However, in general, the average signal energy changes if a signal structure is translated.<br />

Rotation about the origin has no effect on the average signal energy.<br />

9.2 Signal energy

Here we want to stress again that

    E_sm = ∫_0^T s_m²(t)dt = ‖s_m‖²,    (9.1)

¹ Although we are concerned with an AGN vector channel, we use the word "white" to emphasize that this vector channel is equivalent to a waveform channel with additive white Gaussian noise.


therefore we only need to know the collection of signal vectors, the signal structure, and the message probabilities to determine the average signal energy, which is defined as

    E_av = ∑_{m∈M} Pr{M = m} E_sm = ∑_{m∈M} Pr{M = m} ‖s_m‖² = E[‖S‖²].    (9.2)

9.3 Translating a signal structure

The smallest possible average error probability of a waveform communication system only depends<br />

on the signal structure, i.e. on the collection of vectors s 1 , s 2 , · · · , s |M| . It does not<br />

change if the entire signal structure is translated. The optimum decision regions simply translate<br />

too. Translation may affect the average signal energy however. We next determine the translation<br />

vector that minimizes the average signal energy.

Figure 9.1: Translation of a signal structure, moving the origin of the coordinate system to a (new axes ϕ′_1, ϕ′_2 centered at a instead of the original axes ϕ_1, ϕ_2).

Consider a certain signal structure. Let E_av(a) denote the average signal energy when the structure is translated over −a, or equivalently when the origin of the coordinate system is moved to a (see figure 9.1); then

    E_av(a) = ∑_{m∈M} Pr{M = m} ‖s_m − a‖² = E[‖S − a‖²].    (9.3)

We now have to find out how we should choose the translation vector a that minimizes E_av(a). It turns out that the best choice is

    a = ∑_{m=1,|M|} Pr{M = m} s_m = E[S].    (9.4)

This follows from considering an alternative translation vector b. The energy of the signal structure after translation by b is

E av (b) = E[‖S − b‖ 2 ] = E[‖(S − a) + (a − b)‖ 2 ]<br />

= E[‖S − a‖ 2 + 2(S − a) · (a − b) + ‖a − b‖ 2 ]<br />

= E[‖S − a‖ 2 ] + 2(E[S] − a) · (a − b) + ‖a − b‖ 2<br />

= E[‖S − a‖ 2 ] + ‖a − b‖ 2<br />

≥ E[‖S − a‖ 2 ], (9.5)<br />

where equality in the third line follows from a = E[S]. Observe that E av (b) is minimized only<br />

for b = a = E[S].<br />

If we do not translate, i.e. when b = 0, the average signal energy<br />

E av (0) = E[‖S − a‖ 2 ] + ‖a‖ 2 , (9.6)<br />

hence we can save ‖a‖ 2 if we translate the center of the coordinate system to a.<br />

RESULT 9.1 This implies that to minimize the average signal energy we should choose the<br />

center of gravity of the signal structure as the origin of the coordinate system. If the center of<br />

gravity of the signal structure a ̸= 0 we can decrease the average signal energy by ‖a‖ 2 by doing<br />

an optimal translation of the coordinate system.<br />

In what follows we will discuss some examples.<br />
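As a first, purely numerical example, the saving ‖a‖² of result 9.1 can be checked directly on a small signal structure; the structure and the probabilities below are arbitrary choices.

import numpy as np

S = np.array([[3.0, 2.0], [-1.0, -2.0]])     # an illustrative signal structure
P = np.array([0.5, 0.5])                     # message probabilities
a = P @ S                                    # center of gravity E[S]
E_before = P @ np.sum(S**2, axis=1)          # E_av(0)
E_after = P @ np.sum((S - a)**2, axis=1)     # E_av(a)
print(E_before, E_after, E_before - E_after, a @ a)   # the saving equals ||a||^2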

9.4 Comparison of orthogonal and antipodal signaling

Let M = {1, 2} and Pr{M = 1} = Pr{M = 2} = 1/2. Consider a first signal set of two orthogonal waveforms (see the two sub-figures in the top row of figure 9.2)

    s_1(t) = √(2E_s) sin(10πt),
    s_2(t) = √(2E_s) sin(12πt),  for 0 ≤ t < 1,    (9.7)

and a second signal set of two antipodal signals (see the bottom-row sub-figures of figure 9.2)

    s_1(t) = √(2E_s) sin(10πt),
    s_2(t) = −√(2E_s) sin(10πt),  also for 0 ≤ t < 1.    (9.8)

Note that binary FSK (frequency-shift keying) is the same as orthogonal signaling, while binary PSK (phase-shift keying) is identical to antipodal signaling.

The vector representations of the signal sets are

    s_1 = (√E_s, 0),
    s_2 = (0, √E_s),    (9.9)


Figure 9.2: The two signal sets of (9.7) and (9.8) in waveform representation (each panel shows s_1(t)/√E_s or s_2(t)/√E_s for 0 ≤ t < 1).

Figure 9.3: Vector representation of the orthogonal (a) and the antipodal signal set (b).


Figure 9.4: Probability of error for binary antipodal and orthogonal signaling as a function of E_s/N_0 in dB. It is assumed that Pr{M = 1} = Pr{M = 2} = 1/2. Observe the difference of 3 dB between the two curves.

for the first (orthogonal) set, and

    s_1 = (√E_s, 0),
    s_2 = (−√E_s, 0),    (9.10)

for the second (antipodal) set.

The average signal energy for both sets is equal to E_s. For the error probabilities in the case of additive white Gaussian noise with power spectral density N_0/2 we get

    P_E^{orthog.} = Q( √(2E_s) / (2√(N_0/2)) ) = Q(√(E_s/N_0)),
    P_E^{antipod.} = Q( 2√E_s / (2√(N_0/2)) ) = Q(√(2E_s/N_0)).    (9.11)

Note that the distance d between the signal points differs by a factor √2 and P_E = Q(d/(2σ)). These error probabilities are depicted in figure 9.4. We observe a difference in the error probabilities of 3.0 dB. For a description of the meaning of decibels etc. we refer to appendix H. By this we mean that, in order to achieve a certain value of P_E, we have to make the signal energy twice as large for orthogonal signaling as for antipodal signaling. The better performance of antipodal signaling relative to orthogonal signaling is best explained by the fact that for antipodal signaling the center of gravity is the origin of the coordinate system, while for orthogonal signaling this is certainly not the case.

We may conclude that binary PSK modulation achieves a better error-performance than binary<br />

FSK. An advantage of FSK modulation is however that efficient FSK receivers can be<br />

designed that do not recover the phase. PSK on the other hand requires coherent demodulation.<br />
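The 3 dB gap in (9.11) is easily tabulated; the sketch below uses the standard relation Q(x) = erfc(x/√2)/2 and a few illustrative values of E_s/N_0 in dB.

import numpy as np
from math import erfc, sqrt

def Q(x):
    return 0.5 * erfc(x / sqrt(2))

for EsN0_dB in (0, 5, 10):
    EsN0 = 10 ** (EsN0_dB / 10)
    print(EsN0_dB, Q(sqrt(EsN0)), Q(sqrt(2 * EsN0)))   # orthogonal vs antipodal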

9.5 Energy of |M| orthogonal signals

The average energy E_av of the signals in an orthogonal set (all signals having energy E_s) is

    E_av = E[‖S‖²] = ∑_{m∈M} Pr{M = m} ‖s_m‖² = E_s.    (9.12)

The center of gravity of our orthogonal signal structure is

    a = (1/|M|, 1/|M|, · · · , 1/|M|) √E_s.    (9.13)

We know from (9.6) that

    E_av(0) = E_av(a) + ‖a‖²,    (9.14)

thus, since E_av(0) = E_av,

    E_s = E_av(a) + E_s/|M|,    (9.15)

or

    E_av(a) = E_s (1 − 1/|M|).    (9.16)

Observe that E_av(a) is the average signal energy after translating the coordinate system such that its origin is the center of gravity of the signal structure. For |M| → ∞ the difference between E_av(a) and E_s can be neglected. We therefore conclude that an orthogonal signal set is not optimal in the sense that it needs the smallest possible average signal energy for a given error performance, but the difference diminishes as |M| → ∞.

9.6 Upper bound on the error probability

In general the probability of error P_E of an optimum receiver cannot be computed easily. However, we can use the "union bound" to obtain an upper bound for it. To see how this works, we assume that the a-priori message probabilities are all equal. Then for the error probability P_E^1 when message 1 is sent, we can write

    P_E^1 = Pr{ ⋃_{m∈M, m≠1} (‖R − s_m‖ ≤ ‖R − s_1‖) | M = 1}
          ≤ ∑_{m∈M, m≠1} Pr{‖R − s_m‖ ≤ ‖R − s_1‖ | M = 1}.    (9.17)

Note that in the last step we used the union bound Pr{∪_i E_i} ≤ ∑_i Pr{E_i}, where E_i is an event indexed by i.

Now the total error probability can be upper bounded by

    P_E ≤ ∑_{m∈M} Pr{M = m} P_E^m ≤ ∑_{m∈M} (1/|M|) ∑_{m′∈M, m′≠m} Pr{‖R − s_m′‖ ≤ ‖R − s_m‖ | M = m}.    (9.18)

Also when the a-priori probabilities are not all equal we can use the union bound to upper bound the error probability.
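For equally likely messages on the AGN vector channel each pairwise term in (9.18) equals Q(d/(2σ)) with σ² = N_0/2, so the union bound is straightforward to compute; the signal structure below is an illustrative orthogonal set.

import numpy as np
from math import erfc, sqrt

def Q(x):
    return 0.5 * erfc(x / sqrt(2))

def union_bound(S, N0):
    sigma = sqrt(N0 / 2)
    M = len(S)
    total = 0.0
    for m in range(M):
        for mp in range(M):
            if mp != m:
                d = np.linalg.norm(S[mp] - S[m])    # pairwise distance
                total += Q(d / (2 * sigma)) / M     # term of (9.18)
    return total

S = np.sqrt(2.0) * np.eye(4)       # four orthogonal signals with energy Es = 2
print(union_bound(S, N0=1.0))      # upper bound on P_E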

9.7 Exercises<br />

1. Either of the two waveform sets illustrated in figures 9.5 and 9.6 may be used to communicate<br />

one of four equally likely messages over an additive white Gaussian noise channel.<br />

Figure 9.5: Signal set (a): the waveforms s_1(t), s_2(t), s_3(t), s_4(t), each with amplitude √(E_s/2), plotted for 0 ≤ t ≤ 4.

(a) Show that both sets use the same energy.<br />

(b) Exploit the union bound to show that the set of figure 9.6 uses energy almost 3 dB<br />

more effectively than the set of figure 9.5 when a small P E is required.<br />

(Exercise 4.18 from Wozencraft and Jacobs [25].)<br />

2. In the communication system diagrammed in figure 9.7 the transmitted signal-vectors<br />

(s m1 , s m2 ) and the noises n 1 and n 2 are all statistically independent. Assume that |M| = 2


Figure 9.6: Signal set (b): the waveforms s_1(t), s_2(t), s_3(t), s_4(t), each with amplitude √E_s, plotted for 0 ≤ t ≤ 4.

i.e. M = {1, 2} and that<br />

Pr{M = 1} = Pr{M = 2} = 1/2,<br />

(s 11 , s 12 ) = (3 √ E, 2 √ E),<br />

(s 21 , s 22 ) = (− √ E, −2 √ E),<br />

p N1 (n) = p N2 (n) =<br />

1 n2<br />

√ exp(− ).<br />

2πσ 2 2σ 2<br />

(9.19)<br />

[Figure 9.7: Communication system: the transmitter maps m to (s_m1, s_m2); r_1 = s_m1 + n_1 and r_2 = s_m2 + n_2 enter the optimum receiver, which outputs m̂.]

(a) Show that the optimum receiver can be realized as diagrammed in figure 9.8. Determine<br />

α, the optimum threshold setting, and the value ˆm for r 1 + αr 2 larger than the<br />

threshold.<br />

(b) Express the resulting P E in terms of Q(·).


[Figure 9.8: Detector structure: r_1 + αr_2 is formed (r_2 multiplied by α) and applied to a threshold device, which outputs m̂.]

(c) What is the average energy E_av of the signals? By translation of the signal structure we can reduce this average energy without changing the error probability. Determine the smallest possible average signal energy that can be obtained in this way.

(Exam Communication Principles, July 5, 2002.)


Chapter 10<br />

Signaling for message sequences

SUMMARY: In this chapter we will describe the problem of transmitting a continuous<br />

stream of messages. How should we use the available signal power P s in such a way that we<br />

achieve a large information rate R together with a small error probability P E ?<br />

10.1 Introduction<br />

In the previous part we have considered transmission of a single randomly-chosen message<br />

m ∈ M, over a waveform channel during the time-interval 0 ≤ t < T . The problem that was to<br />

be solved there was to determine the optimum receiver, i.e. the receiver that minimizes the error

probability P E for a channel with additive white Gaussian noise. We have found the following<br />

solution:<br />

1. First form an orthonormal base for the set of signals.<br />

2. The optimum receiver determines the finite-dimensional vector representation of the received<br />

waveform, i.e. it projects the received waveform onto the orthonormal signal base.<br />

3. Then using this vector representation of the received waveform, it determines the message<br />

that has the (or a) largest a-posteriori probability. This is done by minimizing expression<br />

(4.12) over all m ∈ M, hence by acting as an optimum vector receiver.<br />

In the present part we shall investigate the more practical situation where we have a continuous<br />

stream of messages that are to be transmitted over our additive white Gaussian noise<br />

waveform channel.<br />

10.2 Definitions, problem statement<br />

Definition 10.1 We assume that every T seconds one out of |M| messages is to be transmitted. The messages are equally likely¹, i.e. Pr{M = m} = 1/|M| for all m ∈ M.
¹ We assume that all messages are equally likely only for the sake of simplicity; this is not essential.


[Figure 10.1: A continuous stream of messages: m_1 ∈ M, m_2 ∈ M, m_3 ∈ M, ... transmitted in the successive intervals [0, T), [T, 2T), [2T, 3T), ....]

The waveform that corresponds to message m ∈ M is s m (t). It is non-zero only for 0 ≤<br />

t < T . If the message in the k-th interval [(k − 1)T, kT ) is m, the signal in that interval is<br />

s k,m (t) = s m (t − (k − 1)T ).<br />

Definition 10.2 For each signal s_m(t), m ∈ M, the available energy is E_s, hence
E_s ≥ ∫_0^T s_m²(t) dt  Joule.   (10.1)
Therefore the available power is
P_s = E_s / T  Joule/second.   (10.2)
Definition 10.3 The transmission rate R is defined as
R = log₂|M| / T  bit/second.   (10.3)
Definition 10.4 The available energy per transmitted bit is now defined as
E_b = E_s / log₂|M|  Joule/bit.   (10.4)
From the above definitions we can deduce for the available energy per transmitted bit that
E_b = (E_s / T) · (T / log₂|M|) = P_s / R,   (10.5)
hence P_s = R E_b.

Our problem is now to determine the maximum rate at which we can communicate reliably<br />

over a waveform channel when the available power is P s . What are the signals that are to<br />

be used to achieve this maximum rate? In the next sections we will first consider two extremal<br />

situations, namely bit-by-bit signaling and block-orthogonal signaling.<br />

10.3 Bit-by-bit signaling<br />

10.3.1 Description<br />

Suppose that in T seconds, we want to transmit K binary digits b 1 b 2 · · · b K . Then<br />

|M| = 2^K,   R = log₂|M| / T = K / T.   (10.6)


[Figure 10.2: A bit-by-bit waveform for K = 5 and message 11010: panels x(t), x(t − 3τ) and s(t); the pulse x(t) has duration τ and energy ∫ x²(t) dt = E_b; time axis τ, 2τ, 3τ, 4τ, 5τ.]

We can realize this by transmitting a signal s(t) that is composed of K pulses x(t) that are shifted in time (see figure 10.2). More precisely
s(t) = Σ_{i=1,K} s_i x(t − (i−1)τ)   with   s_i = −1 if b_i = 0, and s_i = +1 if b_i = 1.   (10.7)
Here x(t) is a pulse with energy E_b and duration τ.
To evaluate the performance of bit-by-bit signaling we first have to determine the building-block waveforms that form the base of our signals. It will be clear that if we take, for i = 1, 2, ..., K,
ϕ_i(t) = x(t − (i−1)τ) / √E_b,   (10.8)
then for i = 1, 2, ..., K and j = 1, 2, ..., K
∫_{−∞}^{∞} ϕ_i(t) ϕ_j(t) dt = δ_ij,   (10.9)
and we have an orthonormal base. Hence the building-block waveforms are the time-shifts over multiples of τ of the normalized pulse x(t)/√E_b, as is shown in figure 10.3. Now we can determine the signal structures that correspond to all the signals. For K = 1, 2, 3 we have shown the signal structures for bit-by-bit signaling in figure 10.4. The structure is always a K-dimensional hypercube.
To see how the decision regions look, note first that
s_1 = (−√E_b, −√E_b, ..., −√E_b).   (10.10)


[Figure 10.3: Our K = 5 building-block waveforms: ϕ_1(t), ..., ϕ_5(t), the τ-shifted copies of the normalized pulse x(t)/√E_b on 0 ≤ t ≤ 5τ.]
[Figure 10.4: Bit-by-bit signal structure for K = 1, 2, 3: the 2^K signal points s_1, ..., s_{2^K} on the axes ϕ_1, ϕ_2, ϕ_3 form a hypercube with side 2√E_b centered at the origin.]


We claim² that the optimum receiver decides m̂ = 1 if
r_i < 0, for all i = 1, ..., K.   (10.11)
For messages other than m = 1 similar arguments show that the decision regions are separated by the hyperplanes r_i = 0 for i = 1, ..., K.
To find an expression for P_E note that the signal hypercube is symmetrical and assume that s_1 was transmitted. No error occurs if r_i = −√E_b + n_i < 0 for all i = 1, ..., K or, in other words, if
n_i < √E_b for all i = 1, ..., K,   (10.12)
hence, noting that σ² = N_0/2, we get
P_C = (1 − Q(√(2E_b/N_0)))^K.   (10.13)
Therefore
P_E = 1 − (1 − Q(√(2E_b/N_0)))^K.   (10.14)
Observe that to estimate bit b_i for i = 1, ..., K, an optimum receiver only needs to consider the received signal r_i in dimension i.
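The behavior of (10.14) is easy to explore numerically. The short sketch below (an added illustration, not from the original notes; the E_b/N_0 value is an arbitrary example) shows how the block error probability grows with the number of bits K per block.

```python
import math

def Q(x):
    # Gaussian tail function
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def bit_by_bit_PE(EbN0, K):
    # Block error probability (10.14) for K antipodal bits per block
    p_bit = Q(math.sqrt(2.0 * EbN0))
    return 1.0 - (1.0 - p_bit) ** K

EbN0 = 10 ** (9.6 / 10)          # 9.6 dB, an arbitrary example value
for K in (1, 10, 100, 1000):
    print(K, bit_by_bit_PE(EbN0, K))
```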

10.3.2 Probability of error considerations
Note that for bit-by-bit signaling K = RT (see equation (10.6)) and E_b = P_s/R (see equation (10.5)), and substitute this in (10.14), our expression for P_E. We then obtain
P_E = 1 − (1 − Q(√(2P_s/(R N_0))))^{RT}.   (10.15)
Based on this expression for P_E we can now investigate what we should do to improve the reliability of our system.
First note that K = RT ≥ 1. For K = 1 we get P_E = Q(√(2P_s/(R N_0))). There are two conclusions that can be drawn now:
• For fixed average power P_s and fixed rate R the average error probability increases and approaches 1 if we increase T. We therefore cannot improve reliability by increasing T.
• For fixed T the average error probability P_E can only be decreased by increasing the power P_s or by decreasing the rate R.
This seemed to be "the end of the story" for communication engineers before Shannon presented his ideas in [20] and [21].
² To see why, consider a signal s_m for some m ≠ 1. Consider the dimensions i for which s_mi = √E_b. Note that s_1i = −√E_b, hence for r_i < 0 we have (r_i − s_mi)² = (r_i − √E_b)² > (r_i + √E_b)² = (r_i − s_1i)². For dimensions i with s_mi = −√E_b = s_1i we get (r_i − s_mi)² = (r_i − s_1i)². Taking all dimensions i = 1, ..., K together we obtain ‖r − s_m‖² > ‖r − s_1‖² and therefore, as claimed, m̂ = 1 is chosen by an optimum receiver if r_i < 0 for all i = 1, ..., K.



10.4 Block-orthogonal signaling<br />

10.4.1 Description<br />

[Figure 10.5: A block-orthogonal waveform for K = 5 and message 11010, i.e. m = 27: the unit-energy pulse ϕ(t) of duration τ (∫ ϕ²(t) dt = 1) and the signal s_27(t) with ∫ s_27²(t) dt = E_s = T P_s, shown on 0 ≤ t ≤ 32τ.]

Next we will consider block-orthogonal signaling. We again want to transmit K bits in T seconds. We do this by sending one out of 2^K orthogonal pulses every T seconds. If we use pulse-position modulation (PPM) the |M| = 2^K signals are
s_m(t) = √E_s ϕ(t − (m−1)τ), for m = 1, ..., 2^K,   (10.16)
where ϕ(t) has energy 1 and duration not more than τ, with
τ = T / 2^K.   (10.17)
All signals within the block [0, T) are orthogonal and have energy E_s. That is why we call our signaling method block-orthogonal.

10.4.2 Probability of error considerations<br />

To determine the error probability P_E for block-orthogonal signaling we assume that
E_b/N_0 = (1 + ε)² ln 2, for 0 ≤ ε ≤ 1,   (10.18)
i.e. we are willing to spend slightly more than N_0 ln 2 Joule per transmitted bit of information. Now we obtain from (8.12) that
P_E ≤ 2 exp(− log₂|M| [√((1+ε)² ln 2) − √(ln 2)]²) = 2 exp(− log₂|M| ε² ln 2).   (10.19)
Finally substitution of log₂|M| = RT (see (10.3)) yields
P_E ≤ 2 exp(−ε² RT ln 2).   (10.20)
What does (10.18) imply for the rate R? Note that E_b = P_s/R (see (10.5)). This results in
E_b/N_0 = P_s/(R N_0) = (1 + ε)² ln 2,   (10.21)
in other words
R = P_s / ((1+ε)² N_0 ln 2).   (10.22)

Based on (10.20) and (10.22) we can now investigate what we should do to improve the reliability of our system. This leads to the following result.
THEOREM 10.1 For an available average power P_s we can achieve rates R smaller than but arbitrarily close to
C_∞ = P_s / (N_0 ln 2)  bit/second   (10.23)
while the error probability P_E can be made arbitrarily small by increasing T. Observe that C_∞, the capacity, depends only on the available power P_s and the power spectral density N_0/2 of the noise.
This is a Shannon-type result. The reliability can be increased not only by increasing the power P_s or decreasing the rate R but also by increasing the "codeword length" T. It is also important to note that only rates up to the capacity C_∞ can be achieved reliably. We will see later that rates larger than C_∞ are indeed not achievable. Therefore it is correct to use the term capacity here.
One might get the feeling that this could finish our investigations. We have found the capacity of the waveform channel with additive white Gaussian noise. In the next chapter we will see that so far we have ignored an important property of signal sets: their dimensionality.
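The interplay of (10.20), (10.22) and (10.23) is illustrated by the added sketch below (not from the original notes; the power and noise values are arbitrary): for a fixed margin ε the rate stays a fixed fraction of C_∞ while the bound on P_E shrinks as T grows.

```python
import math

Ps, N0 = 1.0, 1e-2               # available power and noise density (arbitrary example)
eps = 0.1                        # margin in (10.18), 0 <= eps <= 1
C_inf = Ps / (N0 * math.log(2.0))                    # capacity (10.23) in bit/s
R = Ps / ((1.0 + eps) ** 2 * N0 * math.log(2.0))     # rate (10.22)

for T in (1.0, 2.0, 5.0, 10.0):
    PE_bound = 2.0 * math.exp(-eps ** 2 * R * T * math.log(2.0))   # bound (10.20)
    print(f"T={T:5.1f} s  R/C_inf={R / C_inf:.3f}  P_E <= {PE_bound:.3e}")
```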

10.5 Exercises<br />

1. Rewrite the bound in (8.12) on P_E in terms of R, T and C_∞. Consider³
E(R) = lim_{T→∞} −(1/T) ln P_E(T, R),   (10.24)
for the best possible signaling methods with rate not smaller than R having signals with duration T. The function E(R) is called the reliability function.
Now compute a lower bound on the reliability function E(R) and draw a plot of this lower bound. It can be shown that E(R) is equal to the lower bound that we have just derived (see Gallager [7]).
³ Just assume that the limit in this definition exists.


Chapter 11<br />

Time, bandwidth, and dimensionality<br />

SUMMARY: Here we first discuss the fact that, to obtain a small error probability P E ,<br />

block-orthogonal signaling requires a huge number of dimensions per second. Then we<br />

show that a bandwidth W can only accommodate roughly 2W dimensions per second.<br />

Hence block-orthogonal signaling is only practical when the available bandwidth is very<br />

large.<br />

11.1 Number of dimensions for bit-by-bit and block-orthogonal<br />

signaling<br />

We again assume that in an interval of duration T we have to transmit K bits.<br />

Then for bit-by-bit signaling we need K = RT dimensions per block. For block-orthogonal signaling we need 2^K = 2^{RT} dimensions per block.
Therefore for bit-by-bit signaling we need K/T = R dimensions per second. For block-orthogonal signaling we need 2^K/T = 2^{RT}/T dimensions per second.
Note that for block-orthogonal signaling the number of dimensions per second explodes as T increases. In the next section we will see that a channel with a finite bandwidth cannot accommodate all these dimensions. Increasing T is however necessary to improve the reliability of a block-orthogonal system, hence a finite bandwidth creates a problem.

11.2 Dimensionality as a function of T<br />

The following theorem will not be proved. For more information on this subject check Wozencraft and Jacobs [25]. The Fourier transform is described briefly in appendix B.
RESULT 11.1 (Dimensionality theorem) Let {φ_i(t), i = 1, ..., N} denote any set of orthogonal waveforms such that for all i = 1, ..., N,
1. φ_i(t) = 0 for t outside [−T/2, T/2),
2. and also
∫_{−W}^{W} |Φ_i(f)|² df ≥ (1 − η_W²) ∫_{−∞}^{∞} |Φ_i(f)|² df   (11.1)
for the spectrum Φ_i(f) of φ_i(t). The parameter W is called the bandwidth (in Hz).
Then the dimensionality N of the set of orthogonal waveforms is upper-bounded by
N ≤ 2WT / (1 − η_W²).   (11.2)

The theorem says that if we require almost all spectral energy of the waveforms to be in the frequency range [−W, W], the number of waveforms cannot be much more than roughly 2WT. In other words, the number of dimensions is not much more than 2W per second.
The "definition" of bandwidth in the dimensionality theorem may seem somewhat arbitrary, but what is important is that the number of dimensions grows no faster than linearly in T.
Note that instead of [0, T) as in previous chapters, we have restricted ourselves here to the interval [−T/2, T/2). The reason for this is that the analysis is easier this way.

Next we want to show the converse statement, i.e. that the number of dimensions N can grow linearly with T, with a coefficient roughly equal to 2W. Therefore consider 2K + 1 orthogonal waveforms that are always zero except for −T/2 ≤ t < T/2. In that case the waveforms are defined as
φ_0(t) = 1
φ_1^c(t) = √2 cos(2π t/T)   and   φ_1^s(t) = √2 sin(2π t/T)
φ_2^c(t) = √2 cos(4π t/T)   and   φ_2^s(t) = √2 sin(4π t/T)
···
φ_K^c(t) = √2 cos(2Kπ t/T)   and   φ_K^s(t) = √2 sin(2Kπ t/T).   (11.3)

We now determine the spectra of all these waveforms.
1. The spectrum of the first waveform φ_0(t) is
Φ_0(f) = ∫_{−T/2}^{T/2} exp(−j2πft) dt = (−1/(j2πf)) (exp(−j2πfT/2) − exp(j2πfT/2)) = sin(πfT) / (πf).   (11.4)
Note that this is the so-called sinc-function. In figure 11.1 the signal φ_0(t) and the corresponding spectrum Φ_0(f) are shown for T = 1.


[Figure 11.1: The signal φ_0(t) (left) and the corresponding spectrum Φ_0(f) (right) for T = 1.]

2. The spectrum of the cosine-waveform φ_k^c(t) for a specified k can be determined by observing that
φ_k^c(t) = φ_0(t) √2 cos(2πkt/T) = φ_0(t) √2 (1/2) (exp(j2πkt/T) + exp(−j2πkt/T)),   (11.5)
hence (see appendix B)
Φ_k^c(f) = (1/√2) (Φ_0(f − k/T) + Φ_0(f + k/T)).   (11.6)
In figure 11.2 the signal φ_5^c(t) and the corresponding spectrum Φ_5^c(f) are shown, again for T = 1.

3. Similarly we can determine the spectrum of the sine-waveform φ_k^s(t) for a specified k.
We now want to find out how much energy is in certain frequency bands. E.g. MATLAB tells us that
∫_{−∞}^{∞} (sin(πfT)/(πf))² df = T   and   ∫_{−1/T}^{1/T} (sin(πfT)/(πf))² df = 0.9028 T,   (11.7)


[Figure 11.2: The signal φ_5^c(t) (left) and the corresponding spectrum Φ_5^c(f) (right) for T = 1.]

hence less than 10 % of the energy of the sinc-function is outside the frequency band [−1/T, 1/T].
Numerical analysis shows that never more than 12 % of the energy of Φ_k^c(f) is outside the frequency band [−(k+1)/T, (k+1)/T] for k = 1, 2, .... Similarly we can show that the spectrum of the sine-waveform φ_k^s(t) never has more than 5 % of its energy outside the frequency band [−(k+1)/T, (k+1)/T] for such k. For large k both percentages approach roughly 5 %.
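These percentages are easy to reproduce numerically. The added sketch below (not part of the original notes) sums the squared sinc-spectrum of (11.4) on a grid and reports the in-band energy fraction for [−1/T, 1/T], which should come out near the value 0.9028 T quoted in (11.7).

```python
import numpy as np

T = 1.0
f = np.linspace(-100.0, 100.0, 2_000_001)     # wide grid approximating (-inf, inf)
df = f[1] - f[0]
spectrum_sq = (T * np.sinc(f * T)) ** 2       # |sin(pi f T)/(pi f)|^2, since np.sinc(x) = sin(pi x)/(pi x)

total = spectrum_sq.sum() * df                # close to T (Parseval)
in_band = spectrum_sq[np.abs(f) <= 1.0 / T].sum() * df
print(total, in_band / total)                 # in-band fraction is roughly 0.90
```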

The frequency band needed to accommodate most of the energy of all 2K + 1 waveforms is [−(K+1)/T, (K+1)/T]. Now suppose that the available bandwidth is W. Then, to have not more than 12 % out-of-band energy, K should satisfy
W ≥ (K + 1)/T.   (11.8)
If, for a certain bandwidth W, we take the largest such K, the number N of orthogonal waveforms is
N = 2K + 1 = 2⌊WT − 1⌋ + 1 > 2(WT − 2) + 1 = 2WT − 3.   (11.9)

RESULT 11.2 There are at least N = 2WT − 3 orthogonal waveforms over [−T/2, T/2] such that less than 12 % of their spectral energy is outside the frequency band [−W, W]. This implies that a fixed bandwidth W can accommodate 2W dimensions per second for large T.

The dimensionality theorem shows that block-orthogonal signaling has a very unpleasant<br />

property that concerns bandwidth. This is demonstrated by the next example.<br />

Example 11.1 For block-orthogonal signaling the number of required dimensions in an interval of T<br />

seconds is 2 RT . If the channel bandwidth is W , the number of available dimensions is roughly 2W T ,<br />

hence the bandwidth W should be such that W ≥ 2 RT /(2T ).


Now consider subsection 10.4.2 and take ε = 0.1. Then (see (10.22)) R = (100/121) C_∞. To get an acceptable P_E we have to take RT ≫ 100 (see (10.20)), hence
W ≥ 2^{RT}/(2T) = (2^{RT}/(2RT)) R ≫ (2^{100}/200) R.   (11.10)
Even for very small values of the rate R, e.g. 1 bit per second, this causes a big problem. We can also conclude that only very small spectral efficiencies R/W can be realized¹.
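To make the explosion concrete, the added sketch below (an illustration, not from the original notes) compares the roughly 2WT dimensions a bandwidth W offers per block with the 2^{RT} dimensions block-orthogonal signaling needs, and prints the bandwidth W ≥ 2^{RT}/(2T) this would require even at a deliberately tiny rate.

```python
R = 1.0                                    # rate in bit/s (deliberately tiny example)
for T in (10.0, 50.0, 100.0):
    dims_needed = 2.0 ** (R * T)           # block-orthogonal: 2^{RT} dimensions per block
    W_needed = dims_needed / (2.0 * T)     # since a bandwidth W offers about 2WT dimensions
    print(f"T={T:5.0f} s  dimensions={dims_needed:.3e}  required W={W_needed:.3e} Hz")
```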

11.3 Remark<br />

In this chapter we have studied building-block waveforms that are time-limited. As a consequence<br />

these waveforms have a spectrum which is not frequency-limited. We can also investigate<br />

waveforms that are frequency-limited. However this implies that these waveforms are not<br />

time-limited anymore. Chapter 14 on pulse modulation deals with such building blocks.<br />

11.4 Exercises<br />

1. Consider the Gaussian pulse x(t) = (√(2πσ²))^{−1} exp(−t²/(2σ²)) and signals such as
s_m(t) = Σ_{i=1,N} s_mi x(t − iτ), for m = 1, 2, ..., |M|,   (11.11)
constructed from successive τ-second translates of x(t). Constrain the inter-pulse interference by requiring that
∫_{−∞}^{∞} x(t − kτ) x(t − lτ) dt ≤ 0.05 ∫_{−∞}^{∞} x²(t) dt, for all k ≠ l,   (11.12)
and constrain the signal bandwidth W by requiring that x(t) have no more than 10 % of its energy outside the frequency interval [−W, W]. Determine the largest possible value of the coefficient c in the equation N = cTW when N ≫ 1.
(Exercise 5.4 from Wozencraft and Jacobs [25].)
¹ Note however that e.g. for ε = 1 the required number of dimensions can be acceptable. However, then the rate R is only one fourth of the capacity P_s/(N_0 ln 2).


Chapter 12<br />

Bandlimited transmission<br />

SUMMARY: Here we determine the capacity of the bandlimited additive white Gaussian<br />

noise waveform channel. We approach this problem in a geometrical way. The key idea is<br />

random code generation (Shannon, 1948).<br />

12.1 Introduction<br />

In the previous chapter we have seen that for a waveform channel with bandwidth W, the number of available dimensions per second is roughly 2W. We have also seen that reliable block-orthogonal signaling requires, already for very modest rates, many more dimensions per second. The reason for this was that, for rates close to capacity, although the error probability decreases exponentially with T, the number of necessary dimensions increases exponentially with T. On the other hand bit-by-bit signaling requires as many dimensions as there are bits to be transmitted, which is acceptable from the bandwidth perspective. But bit-by-bit signaling can only be made reliable by increasing the transmitter power P_s or by decreasing the rate R.

This raises the question whether we can achieve reliable transmission at certain rates R by<br />

increasing T , when both the bandwidth W and available power P s are fixed. The following<br />

sections show that this is indeed possible. We will describe the results obtained by Shannon<br />

[21]. His analysis has a strongly geometrical flavor. These early results may not be as strong as<br />

possible but they certainly are very surprising and give the right intuition.<br />

We will determine the capacity per dimension, not per unit of bandwidth since the definition<br />

of bandwidth is somewhat arbitrary. We will work with vectors as we learned in the previous<br />

part of these course notes. Note that all messages are equiprobable.<br />

Definition 12.1 The energies of all signals are upper-bounded by E_s, which is defined to be the number of dimensions N times E_N, i.e.
‖s_m‖² = Σ_{i=1,N} s_mi² ≤ E_s = N E_N, for all m ∈ M.   (12.1)
E_N is therefore the energy available per dimension.


[Figure 12.1: Hardening of the noise sphere around a signal point: the origin 0, the normalized signal vector s′_m, the normalized noise vector n′, and the received vector r′.]

It is advantageous to consider normalized versions of the signal, noise and received vectors here. These normalized versions are defined in the following way:
Definition 12.2 The normalized version r′ of r is defined by r′ = r/√N. As usual N is the number of components in r, i.e. the number of dimensions. Consequently also R′ = R/√N for the random variables R′ and R that correspond to the realizations r′ and r. Similar definitions hold for the normalized signal vectors s′_m, m ∈ M, and for the normalized noise vector n′.

12.2 Sphere hardening (noise vector)
The random noise vector N has N components, each with mean 0 and variance σ² = N_0/2. The normalized noise vector is N′ = N/√N. Its expected squared length is
E[‖N′‖²] = E[(1/N) Σ_{i=1,N} N_i²] = (1/N) Σ_{i=1,N} E[N_i²] = (1/N) Σ_{i=1,N} N_0/2 = N_0/2.   (12.2)
For the variance of the squared length of N′ we find (using σ² = N_0/2) that
var[‖N′‖²] = (1/N²) var[Σ_{i=1,N} N_i²] = (1/N²) Σ_{i=1,N} var[N_i²]
           = (1/N²) Σ_{i=1,N} (E[N_i⁴] − (E[N_i²])²) = (1/N²) Σ_{i=1,N} (3σ⁴ − (σ²)²) = 2σ⁴/N = (2/N)(N_0/2)².   (12.3)
The second equality in this derivation holds since var[Σ_{i=1,N} X_i] = Σ_{i=1,N} var[X_i] for independent random variables X_1, X_2, ..., X_N. Moreover we have used that E[N_i⁴] = 3σ⁴, see exercise 1 at the end of this chapter. Finally observe that (2/N)(N_0/2)² approaches zero for N → ∞.
Chebyshev's inequality can be used to show that the probability of observing a normalized noise vector with squared length smaller than N_0/2 − ε or larger than N_0/2 + ε, for any fixed ε > 0, approaches zero for N → ∞. More precisely
Pr{ |‖N′‖² − N_0/2| > ε } ≤ var[‖N′‖²] / ε² = (2/N)(N_0/2)² / ε².   (12.4)
RESULT 12.1 We have demonstrated that the normalized received vector r′ is (roughly speaking) on the surface of a hypersphere with radius √(N_0/2) around the normalized signal vector s′_m. Small fluctuations are possible, but for N → ∞ these fluctuations diminish.
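A quick simulation (an added sketch, not part of the original notes; the noise level is an arbitrary example) makes the sphere-hardening effect visible: as N grows, the squared length of the normalized noise vector concentrates around N_0/2.

```python
import numpy as np

rng = np.random.default_rng(0)
N0 = 2.0                                   # noise spectral density, so sigma^2 = N0/2 = 1
sigma = np.sqrt(N0 / 2.0)

for N in (10, 100, 1000, 10000):
    noise = rng.normal(0.0, sigma, size=(5000, N))      # 5000 noise vectors of dimension N
    sq_len = np.sum((noise / np.sqrt(N)) ** 2, axis=1)  # squared lengths of N' = N/sqrt(N)
    print(N, sq_len.mean(), sq_len.std())               # mean stays near N0/2, spread shrinks
```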

12.3 Sphere hardening (received vector)
The received vector r = s_m + n has N components. The normalized received vector is r′ = s′_m + n′ where s′_m = s_m/√N. The squared length of the normalized received vector can now be expressed as
‖r′‖² = ‖s′_m + n′‖² = ‖s′_m‖² + 2(s′_m · n′) + ‖n′‖² = ‖s′_m‖² + (2/N) Σ_{i=1,N} s_mi n_i + ‖n′‖².   (12.5)
Note that the term ‖s′_m‖² is "non-random" for a given message m, while the other two terms in (12.5) depend on the noise and are therefore random variables. For the expected squared length of the random normalized received vector R′ we therefore obtain that
E[‖R′‖²] = ‖s′_m‖² + (2/N) Σ_{i=1,N} s_mi E[N_i] + E[‖N′‖²] = ‖s′_m‖² + N_0/2 ≤(a) E_N + N_0/2.   (12.6)
Here (a) follows from the fact that ‖s′_m‖² = ‖s_m‖²/N ≤ E_s/N = E_N by definition 12.1. It can be shown that also the variance of the squared length of R′ approaches zero for N → ∞. See also exercise 2 at the end of this chapter.
RESULT 12.2 We have demonstrated that the normalized received vector r′ is, roughly speaking, within a sphere with radius √(E_N + N_0/2) and center at the origin of the coordinate system. For a normalized signal vector with energy equal to E_N the received vectors are however on the surface of this hypersphere.


[Figure 12.2: A hypersphere with radius r − ε concentric within a hypersphere of radius r; the difference is a shell with thickness ε.]

12.4 The interior volume of a hypersphere
Consider in N dimensions a hypersphere with radius r. The volume of this sphere is V_r = B_N r^N, where B_N is some constant depending on N (see e.g. appendix 5D in Wozencraft and Jacobs [25]). E.g. B_2 = π and B_3 = 4π/3.
Now also consider a hypersphere with the same center but with a slightly smaller radius r − ε for some ε > 0, see figure 12.2. The volume of this smaller hypersphere is V_{r−ε} = B_N (r − ε)^N. This leads to the following result.
RESULT 12.3 The ratio of the volume of the "interior" of a hypersphere in N dimensions to the volume of the hypersphere itself is
V_{r−ε} / V_r = ((r − ε)/r)^N,   (12.7)
which approaches zero for N → ∞. Consequently for N → ∞ the shell volume V_r − V_{r−ε} constitutes essentially the entire volume V_r of the hypersphere, no matter how small the thickness ε > 0 of the shell is.

12.5 An upper bound for the number |M| of signal vectors<br />

For reliable transmission, the (disjoint) decision regions I m , m ∈ M should not be essentially<br />

smaller than a hypersphere with radius (slightly larger than) √ N 0 /2. This follows from result<br />

12.1. Note that we are referring to normalized vectors. On the other hand we know that all<br />

received vectors are inside a hypersphere of radius (slightly larger than) √ E N + N 0 /2. This was<br />

a consequence of result 12.2. To obtain an upper bound on the number |M| of signal vectors that<br />

allow reliable communication, we can now apply a so-called sphere-packing argument.<br />

RESULT 12.4 For reliable transmission, the number of signals |M| can not be larger than the<br />

volume of the hypersphere containing the received vectors, divided by the volume of the noise


hyperspheres, hence
|M| ≤ B_N (E_N + N_0/2)^{N/2} / (B_N (N_0/2)^{N/2}) = ((E_N + N_0/2) / (N_0/2))^{N/2}.   (12.8)
Here B_N is again the constant in the formula V_r = B_N r^N for the volume V_r of an N-dimensional hypersphere with radius r.
We now have our upper bound on the number |M| of signal vectors. It should be noted that in the book of Wozencraft and Jacobs [25] a more detailed proof can be found.
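The added sketch below (not from the original notes; the signal-to-noise values are arbitrary examples) evaluates the sphere-packing bound (12.8) as a rate per dimension, log₂|M|/N, which is exactly the expression C_N that appears in (12.16) further on.

```python
import math

def max_rate_per_dimension(EN, N0):
    # From (12.8): log2|M| / N <= 0.5 * log2((EN + N0/2) / (N0/2))
    return 0.5 * math.log2((EN + N0 / 2.0) / (N0 / 2.0))

N0 = 2.0
for EN in (1.0, 3.0, 15.0):       # energy per dimension (arbitrary examples)
    print(EN, max_rate_per_dimension(EN, N0))
```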

12.6 A lower bound to the number |M| of signal vectors<br />

Shannon [21] constructed an existence proof to show that there are signal sets that allow reliable communication and that contain surprisingly many signals. His argument involved random coding.
12.6.1 Generating a random code
Fix the number of signal vectors |M| and their length N. To obtain a signal set, choose |M| normalized signal vectors s′_1, s′_2, ..., s′_{|M|} at random, independently of each other, uniformly within a hypersphere centered at the origin having radius √E_N. Consider the ensemble of all signal sets that can be chosen in this way.
12.6.2 Error probability
We are now interested in P_E^{av}, i.e. the error probability P_E averaged over the ensemble of signal sets, i.e.
P_E^{av} = ∫_{s′_1, s′_2, ..., s′_{|M|}} p(s′_1, s′_2, ..., s′_{|M|}) P_E(s′_1, s′_2, ..., s′_{|M|}) d(s′_1, s′_2, ..., s′_{|M|}).   (12.9)
The averaging corresponds to the choice of the signal sets {s′_1, s′_2, ..., s′_{|M|}}. Once we know P_E^{av} we claim that there exists at least one signal set {s′_1, s′_2, ..., s′_{|M|}} with error probability P_E(s′_1, s′_2, ..., s′_{|M|}) ≤ P_E^{av}. Therefore we will first show that P_E^{av} is small enough.
Consider figure 12.3. Suppose that the signal s′_m for some fixed m ∈ M was actually sent. The noise vector n′ is added to the signal vector and hence the received vector turns out to be r′ = s′_m + n′. Now an optimum receiver will decode the message m only if there are no other signals s′_k, k ≠ m, inside a hypersphere with radius ‖n′‖ around r′. This is a consequence of result 4.1, which says that minimum Euclidean distance decoding is optimum if all messages are equally likely.
[Figure 12.3: Random coding situation: the origin 0, the signal vector s′_m, the noise vector n′, the received vector r′, and the "lens" with center C and half-width h formed by the intersection of the noise sphere around r′ with the signal sphere.]


• Note that result 12.3 implies that, with probability approaching one for N → ∞, the signal s′_m was chosen on the surface of a sphere with radius √E_N. This is a consequence of the selection procedure of the signal vectors, which is uniform over the hypersphere with radius √E_N. Therefore we may assume that ‖s′_m‖² = E_N.
• By result 12.2 the normalized received vector r′ is now, with probability approaching one, on the surface of a sphere with radius √(E_N + N_0/2) centered at the origin, and thus ‖r′‖² = E_N + N_0/2.
• Moreover the normalized noise vector n′ is on the surface of a sphere with radius √(N_0/2), hence ‖n′‖² = N_0/2, by the sphere-hardening argument (see result 12.1).
First we have to determine the probability that the signal corresponding to a fixed message k ≠ m is inside the sphere with center r′ and radius ‖n′‖. Therefore we need to know the volume of the "lens" in figure 12.3. Determining this volume is not so easy, but we know that it is not larger than the volume of a sphere with radius h (see figure 12.3) and center coinciding with the center C of the lens. This hypersphere contains the lens. Our next problem is therefore to find out how large h is.
Observing the lengths of s′_m, n′, and r′, we may conclude that the angle between the normalized signal vector s′_m and the normalized noise vector n′ is π/2 (Pythagoras). Now we can use simple geometry to determine the length h:
h = √(E_N · N_0/2) / √(E_N + N_0/2),   (12.10)
and consequently
V_lens ≤ B_N h^N = B_N (E_N (N_0/2) / (E_N + N_0/2))^{N/2}.   (12.11)
The probability that a signal s_k for a specific k ≠ m was selected inside the lens is (by the uniform distribution over the hypersphere with radius √E_N) now given by the volume of the lens divided by the volume of the sphere. This ratio can be upper-bounded as follows:
V_lens / (B_N E_N^{N/2}) ≤ ((N_0/2) / (E_N + N_0/2))^{N/2}.   (12.12)
By the union bound¹ the probability that any signal s_k for k ≠ m is chosen inside the lens is at most |M| − 1 times as large as the ratio (probability) considered in (12.12). Therefore we obtain for the error probability averaged over the ensemble of signal sets
P_E^{av} ≤ |M| ((N_0/2) / (E_N + N_0/2))^{N/2}.   (12.13)
¹ For the K events E_1, E_2, ..., E_K the probability Pr{E_1 ∪ E_2 ∪ ··· ∪ E_K} ≤ Pr{E_1} + Pr{E_2} + ··· + Pr{E_K}.


12.6.3 Achievability result
RESULT 12.5 Fix any δ > 0. Suppose that the number of signals |M| as a function of the number of dimensions N is given by
|M| = 2^{−δN} ((E_N + N_0/2) / (N_0/2))^{N/2},   (12.14)
then
lim_{N→∞} P_E^{av} ≤ lim_{N→∞} 2^{−δN} = 0.   (12.15)
This implies the existence of signal sets, one for each value of N, with |M| as in (12.14), for which lim_{N→∞} P_E = 0.
Note that relation (12.14) makes reliable transmission possible. Since better methods could exist, it serves as a lower bound on |M| when reliable transmission is required.

12.7 Capacity of the bandlimited channel
We can now combine results 12.4 and 12.5. Define
C_N = (1/2) log₂((E_N + N_0/2) / (N_0/2)),   (12.16)
then result 12.4 (upper bound) shows that reliable communication is only possible if the rate R_N per dimension satisfies
R_N = log₂|M| / N ≤ C_N.   (12.17)
On the other hand result 12.5 (lower bound) demonstrates the existence of signal sets that allow reliable communication with rates
R_N ≥ C_N − δ,   (12.18)
for any value of δ > 0. Since we can take δ as small as we like we get the following theorem:
THEOREM 12.6 (Shannon, 1949) Reliable communication is possible over a waveform channel, when the available energy per dimension is E_N and the power spectral density of the noise is N_0/2, for all rates R_N satisfying
R_N < C_N  bit/dimension.   (12.19)
Rates R_N larger than C_N are not realizable with reliable methods. The quantity C_N is called the capacity of the waveform channel in bit per dimension.


Example 12.1 In figure 12.4 we have depicted the behavior of randomly generated codes. In a MATLAB program we have generated codes (signal sets) with R_N = 1 bit/dimension and vector-lengths N = 4, 8, 12, and 16. Note that reliable transmission of R_N = 1 codes is only possible for signal-to-noise ratios E_N/(N_0/2) ≥ 3, or 4.7 dB. These codes are therefore tested with E_N/(N_0/2) between 3 and 11 dB. The resulting error probabilities for vector-lengths 4, 8, 12, and 16 are shown in the figure. For each experiment we have generated 32 codes, and each code was tested with 256 randomly-generated messages. The resulting number of errors was divided by 8192 to give an estimate of the error probability.
[Figure 12.4: Error probability (vertically, from 10⁰ down to 10⁻⁴) as a function of E_N/(N_0/2) in dB (horizontally, 3 to 11) for codes with length N = 4 (∗), 8 (◦), 12 (△), and 16 (□).]
The figure clearly shows that increasing N results in decreasing error probabilities for E_N/(N_0/2) ≥ 5 dB. Simulations for N ≥ 20 turned out to be unfeasible in MATLAB.
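A sketch of such a random-coding experiment is given below (in Python rather than MATLAB, and added here as an illustration rather than the original program): for a given N it draws 2^{R_N N} signal vectors uniformly inside the sphere of radius √(N E_N), transmits random messages over the additive Gaussian noise channel, and decodes by minimum Euclidean distance. The parameter choices are arbitrary examples of the procedure described in the example.

```python
import numpy as np

def random_code_error_rate(N, RN, EN, N0, n_codes=32, n_msgs=256, seed=1):
    rng = np.random.default_rng(seed)
    M = int(round(2 ** (RN * N)))            # number of messages
    sigma = np.sqrt(N0 / 2.0)
    errors = 0
    for _ in range(n_codes):
        # draw M points uniformly inside the sphere of radius sqrt(N*EN)
        g = rng.normal(size=(M, N))
        g /= np.linalg.norm(g, axis=1, keepdims=True)
        radius = np.sqrt(N * EN) * rng.uniform(size=(M, 1)) ** (1.0 / N)
        code = g * radius
        for _ in range(n_msgs):
            m = rng.integers(M)
            r = code[m] + rng.normal(0.0, sigma, size=N)
            m_hat = np.argmin(np.linalg.norm(code - r, axis=1))   # minimum-distance decoding
            errors += (m_hat != m)
    return errors / (n_codes * n_msgs)

# R_N = 1 bit/dimension, E_N/(N_0/2) = 8 (i.e. 9 dB), an arbitrary test point
for N in (4, 8, 12):
    print(N, random_code_error_rate(N, RN=1.0, EN=8.0, N0=2.0))
```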

In the previous chapter we have seen that a waveform channel with bandwidth W can accommodate (roughly) at most 2W dimensions per second. With 2W dimensions per second, the available energy per dimension is E_N = P_s/(2W) if P_s is the available transmitter power. In that case the channel capacity in bit per dimension is
C_N = (1/2) log₂((P_s/(2W) + N_0/2) / (N_0/2)) = (1/2) log₂(1 + P_s/(W N_0)).   (12.20)
Therefore we obtain the following result:
THEOREM 12.7 For a waveform channel with spectral noise density N_0/2, frequency bandwidth W, and available transmitter power P_s, the capacity in bit per second is
C = 2W C_N = W log₂(1 + P_s/(W N_0))  bit/second.   (12.21)
Thus reliable communication is possible for rates R in bit per second smaller than C, while rates larger than C are not realizable with arbitrarily small P_E.
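As a small numerical companion (added here, not in the original notes; the bandwidth, noise density and power values are arbitrary examples), the function below evaluates (12.21) for a few signal-to-noise ratios.

```python
import math

def capacity_bits_per_second(W, Ps, N0):
    # Theorem 12.7: C = W * log2(1 + Ps / (W * N0))
    return W * math.log2(1.0 + Ps / (W * N0))

W, N0 = 3000.0, 2.0e-6            # bandwidth in Hz and noise density (arbitrary)
for Ps in (0.006, 0.06, 0.6):     # transmitter power in Watt
    print(Ps, capacity_bits_per_second(W, Ps, N0))
```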

12.8 Exercises<br />

1. Let N be a random variable with density p_N(n) = (1/√(2πσ²)) exp(−n²/(2σ²)). Show that E[N⁴] = 3σ⁴.
2. Show that the variance of the squared length of the normalized received vector R′ approaches zero for N → ∞.

[Figure 12.5: Water-filling situation: the transmitter sends over two parallel channels, one adding white noise n_{w,a}(t) and one adding white noise n_{w,b}(t); both outputs go to the receiver, which produces m̂.]

3. A transmitter having power P_s = 10 Watt is connected to a receiver by two waveform channels (see figure 12.5). Each of these channels adds white Gaussian noise to its input waveform. The power densities of the two noise processes are N_0^a/2 = 1 and N_0^b/2 = 2 for −∞ < f < ∞, while the frequency bandwidth is W = 1 Hz. What is the total capacity of this parallel channel in bit per second? What happens if P_s = 1 Watt? The power allocation procedure that should be used here is called "water-filling".
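For readers who want to experiment, here is a hedged sketch (added to these notes, not a worked solution of the exercise) of the standard water-filling allocation over parallel additive Gaussian noise channels: power is poured onto the channels with the lowest noise level until the budget P_s is used, and each channel then contributes W log₂(1 + P_i/(W N_0^{(i)})) bit/s. The channel and power values below are arbitrary illustration values, not those of the exercise.

```python
import math

def water_filling(noise_powers, Ps, W=1.0):
    # noise_powers[i] = W * N0_i, the total noise power in channel i.
    # Find the water level mu such that sum(max(mu - n, 0)) = Ps (bisection).
    lo, hi = min(noise_powers), min(noise_powers) + Ps
    for _ in range(100):
        mu = 0.5 * (lo + hi)
        used = sum(max(mu - n, 0.0) for n in noise_powers)
        lo, hi = (mu, hi) if used < Ps else (lo, mu)
    powers = [max(mu - n, 0.0) for n in noise_powers]
    capacity = sum(W * math.log2(1.0 + p / n) for p, n in zip(powers, noise_powers))
    return powers, capacity

# Two channels with noise powers 1 and 3 (arbitrary), total power budgets 4 and 0.5
print(water_filling([1.0, 3.0], Ps=4.0))
print(water_filling([1.0, 3.0], Ps=0.5))
```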


Chapter 13<br />

Wideband transmission<br />

SUMMARY: We determine the capacity of the additive white Gaussian noise waveform<br />

channel in the case of unlimited bandwidth resources. The result that is obtained in this<br />

way shows that block-orthogonal signaling is optimal.<br />

13.1 Introduction<br />

We have seen that with block-orthogonal signaling we can achieve reliable communication at rates
R < P_s / (N_0 ln 2)  bit/second,   (13.1)
provided that we have access to as much bandwidth as we want. We will prove in this chapter

that rates larger than P s /(N 0 ln 2) are not realizable with reliable signaling schemes. Therefore<br />

we may indeed call P s /(N 0 ln 2) the capacity of the wideband waveform channel.<br />

13.2 Capacity of the wideband channel
In the previous chapter the capacity of a waveform channel with frequency bandwidth limited to W was shown to be
W log₂(1 + P_s/(W N_0))  bit/second,   (13.2)
see theorem 12.7. We can easily determine the wideband capacity by letting W → ∞.
[Figure 13.1: The functions ln(1 + S/N) and S/N, and the ratio ln(1 + S/N)/(S/N), plotted against S/N.]
RESULT 13.1 The capacity C_∞ of the wideband waveform channel with additive white Gaussian noise with power density N_0/2 for −∞ < f < ∞, when the transmitter power is P_s, follows from:
C_∞ = lim_{W→∞} W log₂(1 + P_s/(W N_0)) = lim_{W→∞} (P_s/(N_0 ln 2)) · [ln(1 + P_s/(W N_0)) / (P_s/(W N_0))] = P_s/(N_0 ln 2).   (13.3)
Expressed in nats per second this capacity is equal to P_s/N_0.
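The limit in (13.3) converges quite quickly. The added sketch below (not from the original notes; the power and noise values are arbitrary examples) prints W log₂(1 + P_s/(W N_0)) for growing W next to P_s/(N_0 ln 2).

```python
import math

Ps, N0 = 1.0, 1e-3                        # transmitter power and noise density (arbitrary)
C_inf = Ps / (N0 * math.log(2.0))
for W in (1e2, 1e3, 1e4, 1e5, 1e6):
    C = W * math.log2(1.0 + Ps / (W * N0))
    print(f"W={W:9.0f} Hz  C={C:10.1f}  C_inf={C_inf:10.1f} bit/s")
```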

13.3 Relation between capacities, signal-to-noise ratio<br />

Note that we can express the capacity of the bandlimited channel as
C = W log₂(1 + P_s/(W N_0)) = (P_s/(N_0 ln 2)) · [ln(1 + S/N) / (S/N)] = C_∞ · [ln(1 + S/N) / (S/N)],   with S/N = P_s/(W N_0).   (13.4)
Here S/N is the signal-to-noise ratio of the waveform channel. Note that the total noise power is equal to the frequency band 2W times the power spectral density N_0/2 of the noise. We have plotted the ratio ln(1 + S/N)/(S/N) in figure 13.1.


RESULT 13.2 If we note that C = W log₂(1 + S/N), then we can distinguish two cases:
C ≈ C_∞ if S/N ≪ 1,   and   C ≈ W log₂(S/N) if S/N ≫ 1.   (13.5)
The case S/N ≪ 1 is called the power-limited case; there is enough bandwidth. When on the other hand S/N ≫ 1 we speak about bandwidth-limited channels. Increasing the power by a factor of two then does not double the capacity but increases it only by one bit per second per Hz.
Finally we give some other ways to express the signal-to-noise ratio (under the assumption that the number of dimensions per second is exactly 2W):
S/N = P_s/(W N_0) = E_N/(N_0/2) = 2 R_N (E_b/N_0) = (R/W)(E_b/N_0).   (13.6)
13.4 Exercises

1. To obtain result 12.7 we have assumed that the number of dimensions that a channel with<br />

bandwidth W can accommodate per second is 2W . What would happen with the wideband<br />

capacity if instead the channel would give us αW dimensions per second?<br />

2. Find out whether a telephone line channel (W = 3400Hz, C = 34000 bit/sec) is power- or<br />

bandwidth-limited by determining its S/N . Also determine E b /N 0 and R N .<br />

3. For a fixed ratio E b /N 0 , for some values of R/W reliable transmission is possible, for<br />

other values of R/W this is not possible. Therefore the E b /N 0 × R/W - plane can be<br />

divided in a region in which reliable transmission is possible and a region where this is<br />

not possible. Find out what function of R/W describes the boundary between these two<br />

regions.<br />

(The ratio R/W is sometimes called the bandwidth efficiency. Note that R/W = 2R N<br />

where R N is the rate per dimension. Hence a similar partitioning is possible for the<br />

E b /N 0 × R N plane.)<br />

4. A receiver is connected to a transmitter by two channels, a bandlimited channel and a<br />

wide-band channel. The transmitter’s signalling power P s = 90 Watt. The transmitter can<br />

use all its power to communicate only via the bandlimited channel, only via the wide-band<br />

channel, or it may distribute its power over the bandlimited channel and the wide-band<br />

channel.<br />

(a) The bandlimited channel has bandwidth W = 200 Hz. The power spectral density of the noise in this channel is N_0′/2 = 0.015 (Watt/Hz). Assume that the transmitter uses all of its power to communicate over the bandlimited channel. First determine the capacity C_N′ of this channel in bits per dimension. Then determine the capacity C′ of the bandlimited channel in bits per second.


[Figure 13.2: Transmitter (P_s = 90), two parallel channels, namely a bandlimited channel with W = 200 and noise density N_0′/2 = 0.015 and a wide-band channel with noise density N_0″/2 = 0.06, and the receiver.]

(b) The spectral density of the noise in the wide-band channel is N_0″/2 = 0.06 (Watt/Hz). Assume now that the transmitter only communicates via the wide-band channel to the receiver. What is the capacity C″ of this wide-band channel in bits per second?

(c) How should the transmitter split up its power over the bandlimited channel and the<br />

wide-band channel to achieve the maximum total capacity in bits per second? How<br />

large is this maximum total capacity C tot ?<br />

(Exam Communication Theory, January 12, 2005.)


Part IV<br />

Channel Models<br />



Chapter 14<br />

Pulse amplitude modulation<br />

SUMMARY: In this chapter we discuss serial pulse amplitude modulation. This method<br />

provides us with a new channel dimension every T seconds if we choose the pulse according<br />

to the so-called Nyquist criterion. The Nyquist criterion implies that the required bandwidth<br />

W is larger than 1/(2T ). Multi-pulse transmission is also considered in this chapter.<br />

Just as in the single-pulse case a bandwidth of 1/(2T ) Hz per used pulse is required. This<br />

corresponds again to 2W = 1/T dimensions per second.<br />

14.1 Introduction<br />

[Figure 14.1: A serial pulse-amplitude modulation system: the impulses a_0, a_1, ..., a_{K−1} drive the pulse filter p(t); the resulting signal s_a(t) is added to white noise n_w(t) to give r(t); the sketches show the pulse p(t) on [0, 2T] and a signal s_a(t) built from ±1 amplitudes.]

In chapter 11 we could observe that a channel with a bandwidth of W Hz can accommodate roughly 2WT dimensions each T seconds (for large enough T). We considered there building-block waveforms of finite duration; therefore the bandwidth constraint could only be met approximately. Here we will investigate whether it is possible to get a new dimension every 1/(2W) seconds. We allow the building blocks to have a non-finite duration. Moreover all these building-block waveforms are time shifts of a "pulse". The subject of this chapter is therefore serial pulse transmission.
Consider figure 14.1. There we assume that the transmitter sends a signal that consists of amplitude-modulated time shifts of a pulse p(t) by integer multiples k of the so-called modulation interval T, hence
s_a(t) = Σ_{k=0,K−1} a_k p(t − kT).   (14.1)
The vector of amplitudes a = (a_0, a_1, ..., a_{K−1}) consists of symbols a_k, k = 0, ..., K−1, taking values in the alphabet A. We call this modulation method serial pulse-amplitude modulation (PAM).

14.2 Orthonormal pulses: the Nyquist criterion<br />

14.2.1 The Nyquist result<br />

If we have the possibility to choose the pulse p(t) ourselves, we can choose it in such a way that all time shifts of the pulse form an orthonormal base. In that case the pulse p(t) has to satisfy
∫_{−∞}^{∞} p(t − kT) p(t − k′T) dt = 1 if k = k′, and 0 if k ≠ k′,   (14.2)
for integer k and k′. This is equivalent to
∫_{−∞}^{∞} p(τ) p(τ − kT) dτ = p(t) ∗ p(−t)|_{t=kT} = h(kT) = 1 if k = 0, and 0 if k ≠ 0,   (14.3)
where h(t) = p(t) ∗ p(−t). This time-domain restriction on the pulse p(t) is called the zero-forcing (ZF) criterion. Later we will see why.
THEOREM 14.1 The frequency-domain equivalent to (14.3), which is known as the Nyquist criterion for zero intersymbol interference, is
Z(f) = (1/T) Σ_{m=−∞}^{∞} H(f + m/T) = (1/T) Σ_{m=−∞}^{∞} ‖P(f + m/T)‖² = 1 for all f,
where H(f) = P(f)P*(f) = ‖P(f)‖² is the Fourier transform of h(t) = p(t) ∗ p(−t). Note that ‖P(f)‖ is the modulus of P(f). Moreover Z(f) is called the 1/T-aliased spectrum of H(f).
Later in this section we will give the proof of this theorem. First, however, we will discuss it and consider an important consequence of it.


[Figure 14.2: The spectrum H(f) = ‖P(f)‖² corresponding to a pulse p(t) that does not satisfy the Nyquist criterion, together with its 1/T-aliased spectrum Z(f); frequency axis marked at ±1/(2T), ±1/T, ±3/(2T).]
[Figure 14.3: The ideally bandlimited spectrum H(f) = ‖P(f)‖² of height T on [−1/(2T), 1/(2T)] and the corresponding flat Z(f). Note that P(f) is a spectrum with the smallest possible bandwidth satisfying the Nyquist criterion.]

14.2.2 Discussion
• Since p(t) is a real signal, the real part of its spectrum P(f) is even in f, and the imaginary part of this spectrum is odd. Therefore the modulus ‖P(f)‖ of P(f) is an even function of the frequency f.
• If the bandwidth W of the pulse p(t) is strictly smaller than 1/(2T) (see figure 14.2), then the Nyquist criterion, which is based on H(f) = ‖P(f)‖², cannot be satisfied. Thus no pulse p(t) that satisfies a bandwidth-W constraint can lead to orthogonal signaling if W < 1/(2T).
• The smallest possible bandwidth W of a pulse that satisfies the Nyquist criterion is 1/(2T). The "basic" pulse with bandwidth 1/(2T) for which the Nyquist criterion holds has a so-called ideally bandlimited spectrum, which is given by
P(f) = √T if |f| < 1/(2T), and 0 if |f| > 1/(2T),   (14.4)
and √T/2 for |f| = 1/(2T). The corresponding H(f) = ‖P(f)‖² and 1/T-aliased spectrum Z(f) can be found in figure 14.3.
The basic pulse p(t) that corresponds to the ideally bandlimited spectrum is shown in figure 14.4. It is the well-known sinc-pulse
p(t) = (1/√T) sin(πt/T) / (πt/T).   (14.5)


[Figure 14.4: The sinc-pulse that corresponds to T = 1, i.e. p(t) = sin(πt)/(πt).]

[Figure 14.5: A spectrum H(f) with a positive excess bandwidth satisfying the Nyquist criterion, together with its 1/T-aliased spectrum Z(f); frequency axis marked at ±1/(2T), ±1/T, ±3/(2T).]

• A pulse with a bandwidth larger than 1/(2T ) can also satisfy the Nyquist criterion as can<br />

be seen in figure 14.5. Note that the so-called excess bandwidth, i.e. the bandwidth minus<br />

1/(2T ), can be larger than 1/(2T ). Square-root raised-cosine pulses p(t) have a spectrum<br />

H( f ) = |P( f )| 2 that satisfies the Nyquist criterion and an excess bandwidth that can be<br />

controlled.<br />

At the end of this subsection we state the implication of the previous discussion.

RESULT 14.2 The smallest possible bandwidth W of a pulse that satisfies the Nyquist criterion is W = 1/(2T). The sinc-pulse p(t) = (1/√T) sin(πt/T)/(πt/T) has this property. Note that this way of serial pulse transmission leads to exactly 1/T = 2W dimensions per second.

Moreover, observe that unlike before in chapter 11, our pulses and signals are not time-limited!

14.2.3 Proof of the Nyquist result<br />

We will next give the proof of result 14.1, the Nyquist result.<br />

Proof: Since H( f ) is the Fourier transform of h(t), the condition (14.3) can be rewritten as<br />

h(kT ) =<br />

∫ ∞<br />

−∞<br />

H( f ) exp( j2π f kT )d f =<br />

{ 1 if k = 0,<br />

0 if k ̸= 0.<br />

(14.6)


We now break up the integral into parts, one part for each integer m, and obtain

h(kT) = ∑_{m=−∞}^{∞} ∫_{(2m−1)/(2T)}^{(2m+1)/(2T)} H(f) exp(j2πf kT) df
      = ∑_{m=−∞}^{∞} ∫_{−1/(2T)}^{1/(2T)} H(f + m/T) exp(j2πf kT) df
      = ∫_{−1/(2T)}^{1/(2T)} [ ∑_{m=−∞}^{∞} H(f + m/T) ] exp(j2πf kT) df
      = ∫_{−1/(2T)}^{1/(2T)} Z(f) exp(j2πf kT) df,   (14.7)

where we have defined Z(f) by

Z(f) = ∑_{m=−∞}^{∞} H(f + m/T).   (14.8)

Observe that Z(f) is a periodic function of f with period 1/T. Therefore it can be expanded in terms of its Fourier series coefficients ..., z_{−1}, z_0, z_1, z_2, ... as

Z(f) = ∑_{k=−∞}^{∞} z_k exp(j2πk f T),   (14.9)

where

z_k = T ∫_{−1/(2T)}^{1/(2T)} Z(f) exp(−j2πf kT) df.   (14.10)

If we now combine (14.7) and (14.10) we obtain that

T h(−kT) = z_k,   (14.11)

for all integers k. Condition (14.3) now tells us that z_0 = T and that all other z_k are zero. This implies that

Z(f) = T,   (14.12)

or equivalently

∑_{m=−∞}^{∞} H(f + m/T) = T.   (14.13)

✷
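As a frequency-domain illustration of (14.13), the sketch below folds a raised-cosine H(f) = ‖P(f)‖² into its 1/T-aliased spectrum and checks that the result is the constant T. The raised-cosine expression used here is the standard textbook one (it is not derived in these notes), and the roll-off factor and frequency grid are assumed for the example.

```python
import numpy as np

# Numerical check of the Nyquist criterion (14.13): sum_m H(f + m/T) = T for all f.
# Assumed: raised-cosine H(f) with roll-off beta (i.e. a square-root raised-cosine |P(f)|^2).
T, beta = 1.0, 0.35

def H(freq):
    a = np.abs(freq)
    out = np.zeros_like(a)
    flat = a <= (1 - beta) / (2 * T)
    roll = (a > (1 - beta) / (2 * T)) & (a <= (1 + beta) / (2 * T))
    out[flat] = T
    out[roll] = (T / 2) * (1 + np.cos(np.pi * T / beta * (a[roll] - (1 - beta) / (2 * T))))
    return out

f = np.linspace(-0.5 / T, 0.5 / T, 1001, endpoint=False)
Z = sum(H(f + m / T) for m in range(-2, 3))   # aliases m = -2,...,2 are enough here
print(np.max(np.abs(Z - T)))                   # close to 0: the folded spectrum equals T
```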


Figure 14.6: The optimum receiver front-end for detection of serially transmitted orthonormal pulses: the received waveform r(t) is passed through the matched filter p(T_p − t) and sampled at t = T_p + kT, which yields r_k.

14.2.4 Receiver implementation<br />

If we use serial PAM with orthonormal pulses p(t − kT), k = 0, K − 1, then we can use a single matched filter m(t) = p(T_p − t) for the computation of the correlations of the received waveform r(t) with all building-block waveforms, i.e. with all pulses. Assume that the delay T_p is chosen in such a way that m(t) is (effectively¹) causal. The output of this filter m(t) when r(t) is the input signal is

u(t) = ∫_{−∞}^{∞} r(α) m(t − α) dα = ∫_{−∞}^{∞} r(α) p(T_p − t + α) dα.   (14.14)

At time t = T_p + kT we see at the filter output

r_k = u(T_p + kT) = ∫_{−∞}^{∞} r(α) p(α − kT) dα,   (14.15)

which is what the optimum receiver should determine, i.e. the correlation of r(t) with the pulse p(t − kT). This leads to the very simple receiver structure shown in figure 14.6. Processing the samples r_k, k = 1, K, should be done in the usual way.

When there is no noise, r(t) = ∑_{k=0,K−1} a_k p(t − kT). Then at time t = T_p + kT we see at the filter output

r_k = u(T_p + kT) = ∫_{−∞}^{∞} r(α) p(α − kT) dα
    = ∑_{k′=0,K−1} a_{k′} ∫_{−∞}^{∞} p(α − k′T) p(α − kT) dα = a_k,   (14.16)

by the orthonormality of the pulses. The conclusion is that there is no intersymbol interference present in the samples. In other words, the Nyquist criterion forces the intersymbol interference to be zero (zero-forcing (ZF)).

¹ Note that p(t) has a non-finite duration in general.
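A noise-free simulation makes this zero-forcing property concrete. The sketch below (not part of the original notes; 4-PAM amplitudes, T = 1, sinc pulses and sampling directly at t = kT, so the delay T_p is left out, are all assumed for illustration) builds r(t), applies the single matched filter by correlating with p(t − kT), and recovers the amplitudes a_k.

```python
import numpy as np

# Noise-free sketch of the serial-PAM receiver front-end of figure 14.6 (assumed parameters).
T, K, dt = 1.0, 8, 0.01
rng = np.random.default_rng(0)
a = rng.choice([-3.0, -1.0, 1.0, 3.0], size=K)            # 4-PAM amplitudes a_k
t = np.arange(-40.0, 40.0 + K * T, dt)
p = lambda tau: np.sinc(tau / T) / np.sqrt(T)              # orthonormal sinc pulse

r = sum(a[k] * p(t - k * T) for k in range(K))             # r(t) = sum_k a_k p(t - kT), no noise
r_hat = np.array([np.sum(r * p(t - k * T)) * dt for k in range(K)])  # samples r_k of (14.15)
print(np.round(r_hat, 3))                                  # approximately equal to a: no ISI
print(a)
```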



14.3 Multi-pulse transmission<br />

Instead of using shifted versions of a single pulse we can apply the shifts of a set of orthonormal pulses. Consider J pulses p_1(t), p_2(t), ..., p_J(t) which are orthonormal. The shifted versions of these pulses can be used to convey a stream of messages. Again the shifts are over multiples of T, and T is again called the modulation interval. Thus a signal s(t) can be written as

s(t) = ∑_{k=0,K−1} ∑_{j=1,J} a_{kj} p_j(t − kT),   (14.17)

for a_{kj} ∈ A. If we want all pulses and their time-shifts to form an orthonormal basis, then the time-shifts of all pulses should be orthogonal to the pulses p_j(t), j = 1, J. In other words, we require the J pulses to satisfy

∫_{−∞}^{∞} p_j(t − kT) p_{j′}(t − k′T) dt = 1 if j = j′ and k = k′, and 0 elsewhere,   (14.18)

for all j = 1, J, j′ = 1, J, and all integer k and k′. A set of pulses p_j(t) for j = 1, 2, ..., J that satisfies this restriction is

p_j(t) = (1/√T) · sin(πt/(2T))/(πt/(2T)) · cos((2j − 1)πt/(2T)).   (14.19)

For J = 4, and assuming that T = 1, these pulses and their spectra are shown in figure 14.7.<br />

Observe that in this example the total bandwidth of the J pulses is J/(2T ).<br />

RESULT 14.3 The smallest possible bandwidth that is occupied by orthogonal multi-pulse signaling with J pulses every T seconds is J/(2T). Note therefore that bandwidth-efficient serial multi-pulse transmission also leads to exactly 2W dimensions per second.

Proof: (Lee and Messerschmitt [13], p. 266) Just like in the proof of theorem 14.1 we can show that the pulse spectra P_j(f) for j = 1, J must satisfy

(1/T) ∑_{m=−∞}^{∞} P_j(f + m/T) P*_{j′}(f + m/T) = δ_{jj′}.   (14.20)

Now fix a frequency f ∈ [−1/(2T), +1/(2T)) and define for each j = 1, J the vector

P_j = (..., P_j(f − 1/T), P_j(f), P_j(f + 1/T), P_j(f + 2/T), ...).   (14.21)

Here P_j(f + m/T) is the component at position m. With this definition condition (14.20) implies that the vectors P_j, j = 1, J are orthogonal. Assume for a moment that there are fewer than J positions m for which at least one component P_j(f + m/T) for some j = 1, J is non-zero. This would imply that the J vectors P_j, j = 1, J are linearly dependent. Contradiction!

Thus for each frequency interval df ∈ [−1/(2T), 1/(2T)) we may conclude that the J pulses "fill" at least J disjoint intervals of size df of the frequency spectrum. In total the J pulses fill a part J/T of the spectrum. The minimally occupied bandwidth is therefore J/(2T). ✷

Note that the optimum receiver for multi-pulse transmission can be implemented with J matched filters that are sampled every T seconds.


Figure 14.7: Time-domain and frequency-domain plots of the pulses in (14.19) for j = 1, 2, 3, 4 and T = 1.

14.4 Exercises<br />

1. Determine the spectra of the pulses p j (t) for j = 1, 2, · · · , J defined in (14.19). Show<br />

that these pulses satisfy (14.18). Determine the total bandwidth of these J pulses.


Chapter 15<br />

Bandpass channels<br />

SUMMARY: Here we consider transmission over an ideal bandpass channel with bandwidth<br />

2W . We show that building-block waveforms for transmission over a bandpass channel<br />

can be constructed from baseband building-block waveforms having a bandwidth not<br />

larger than W . A first set of orthonormal baseband building-block waveforms can be modulated<br />

on a cosine carrier, a second orthonormal set on a sine carrier, and the resulting<br />

bandpass waveforms are all orthogonal to each other. This technique is called quadrature<br />

multiplexing. We determine the optimum receiver for this signaling method and the capacity<br />

of the bandpass channel. We finally discuss quadrature amplitude modulation (QAM).<br />

15.1 Introduction<br />

So far we have only considered baseband communication. In baseband communication the channel<br />

allows signaling only in the frequency band [−W, W ]. But what should we do if the channel<br />

only accepts signals with a spectrum in the frequency band ±[ f 0 − W, f 0 + W ]? In the present<br />

chapter we will describe a method that can be used for signaling over such a channel. We start<br />

by discussing this so-called bandpass channel. First we give a definition of the ideal bandpass<br />

channel.<br />

Definition 15.1 Consider figure 15.1. The input signal s(t) to the bandpass channel is first sent<br />

through a filter W_0(f). This filter W_0(f) is an ideal bandpass filter with bandwidth 2W and center frequency f_0 > W. It is defined by the transfer function

W_0(f) = 1 if f_0 − W < |f| < f_0 + W, and 0 elsewhere.   (15.1)

The noise process N_w(t) is assumed to be stationary, Gaussian, zero-mean, and white. The noise has power spectral density function

S_{N_w}(f) = N_0/2 for −∞ < f < ∞,   (15.2)

and autocorrelation function R_{N_w}(t, s) = E[N_w(t)N_w(s)] = (N_0/2) δ(t − s). The noise waveform n_w(t) is added to the output of the bandpass filter W_0(f), which results in r(t), the output of the bandpass channel.

Figure 15.1: Ideal bandpass channel with additive white Gaussian noise.

In the next section we will describe a signaling method for a bandpass channel. This method<br />

is called quadrature multiplexing.<br />

15.2 Quadrature multiplexing<br />

In quadrature multiplexing we combine two waveforms with a baseband spectrum into a waveform with a spectrum that matches the bandpass channel. We assume that the two baseband waveforms together contain a single message.

Consider a first set of baseband waveforms {s^c_1(t), s^c_2(t), ..., s^c_{|M|}(t)}, with waveform s^c_m(t) for all m ∈ {1, 2, ..., |M|} having bandwidth smaller than W, i.e. with a spectrum S^c_m(f) ≡ 0 for |f| ≥ W. To this set of baseband waveforms there corresponds a set of building-block waveforms {φ_i(t), i = 1, 2, ..., N_c}. For all i ∈ {1, 2, ..., N_c} the spectrum Φ_i(f) of the corresponding building-block waveform satisfies Φ_i(f) ≡ 0 for |f| ≥ W, since the building-block waveforms are linear combinations of the signal waveforms (Gram-Schmidt procedure, see appendix E).

Also consider a second set of baseband waveforms {s^s_1(t), s^s_2(t), ..., s^s_{|M|}(t)}, with waveform s^s_m(t) for all m ∈ {1, 2, ..., |M|} having bandwidth smaller than W, i.e. S^s_m(f) ≡ 0 for |f| ≥ W. To this second set of waveforms there corresponds a set of building-block waveforms {ψ_j(t), j = 1, 2, ..., N_s}. For all j ∈ {1, 2, ..., N_s} the spectrum Ψ_j(f) of the corresponding building-block waveform satisfies Ψ_j(f) ≡ 0 for |f| ≥ W.

Now, to obtain a passband signal, the two waveforms s^c_m(t) and s^s_m(t) are combined into a signal s_m(t) by modulating them on a cosine resp. sine wave with frequency f_0 and then adding the resulting signals together (see figure 15.2), thus

s_m(t) = s^c_m(t) √2 cos(2π f_0 t) + s^s_m(t) √2 sin(2π f_0 t).   (15.3)

We now make (15.3) more precise. Note that to each message m ∈ M there corresponds a waveform s^c_m(t) = ∑_{i=1,N_c} s^c_{mi} φ_i(t) and a waveform s^s_m(t) = ∑_{j=1,N_s} s^s_{mj} ψ_j(t). Therefore we


write

s_m(t) = s^c_m(t) √2 cos(2π f_0 t) + s^s_m(t) √2 sin(2π f_0 t)
       = ∑_{i=1,N_c} s^c_{mi} φ_i(t) √2 cos(2π f_0 t) + ∑_{j=1,N_s} s^s_{mj} ψ_j(t) √2 sin(2π f_0 t).   (15.4)

Figure 15.2: Quadrature multiplexing: s^c(t) is multiplied by √2 cos 2π f_0 t, s^s(t) by √2 sin 2π f_0 t, and the two products are added to form s(t).

This means that the signal s_m(t) can be regarded as a linear combination of "building-block" waveforms φ_i(t)√2 cos(2π f_0 t) and ψ_j(t)√2 sin(2π f_0 t). To see whether these waveforms are actually building blocks we have to check whether they form an orthonormal set.

To this end consider the Fourier transforms of the building-block waveforms¹ φ_i(t) and ψ_j(t):

Φ_i(f) = ∫_{−∞}^{∞} φ_i(t) exp(−j2π f t) dt,
Ψ_j(f) = ∫_{−∞}^{∞} ψ_j(t) exp(−j2π f t) dt.   (15.5)

Since both sets {φ_i(t), i = 1, N_c} and {ψ_j(t), j = 1, N_s} form an orthonormal base, using the Parseval relation we have for all i and i′ and all j and j′ that

δ_{i,i′} = ∫_{−∞}^{∞} φ_i(t) φ_{i′}(t) dt = ∫_{−∞}^{∞} Φ_i(f) Φ*_{i′}(f) df,
δ_{j,j′} = ∫_{−∞}^{∞} ψ_j(t) ψ_{j′}(t) dt = ∫_{−∞}^{∞} Ψ_j(f) Ψ*_{j′}(f) df.   (15.6)

¹ Note that we do not assume here that the baseband building blocks are limited in time as in the chapters on waveform communication.


Figure 15.3: Spectra after modulation with √2 cos 2π f_0 t and √2 sin 2π f_0 t. For simplicity it is assumed that the baseband spectra are real.

Now we are ready to determine the spectrum Φ_{c,i}(f) of a cosine or in-phase building-block waveform

φ_{c,i}(t) = φ_i(t) √2 cos 2π f_0 t   (15.7)

and the spectrum Ψ_{s,j}(f) of a sine or quadrature building-block waveform

ψ_{s,j}(t) = ψ_j(t) √2 sin 2π f_0 t.   (15.8)

We express these spectra in terms of the baseband spectra Φ_i(f) and Ψ_j(f) (see also figure 15.3):

Φ_{c,i}(f) = ∫_{−∞}^{∞} φ_i(t) √2 cos(2π f_0 t) exp(−j2π f t) dt = (1/√2) (Φ_i(f − f_0) + Φ_i(f + f_0)),

Ψ_{s,j}(f) = ∫_{−∞}^{∞} ψ_j(t) √2 sin(2π f_0 t) exp(−j2π f t) dt = (1/(j√2)) (Ψ_j(f − f_0) − Ψ_j(f + f_0)).   (15.9)
j ( f − f 0 ) − j ( f + f 0 ) ) . (15.9)


Note (see again figure 15.3) that the spectra Φ_{c,i}(f) and Ψ_{s,j}(f) are zero outside the passband ±[f_0 − W, f_0 + W].

We now first consider the correlation between φ_{c,i}(t) and φ_{c,i′}(t) for all i and i′. This leads to

∫_{−∞}^{∞} φ_{c,i}(t) φ_{c,i′}(t) dt = ∫_{−∞}^{∞} Φ_{c,i}(f) Φ*_{c,i′}(f) df
  = (1/2) ∫_{−∞}^{∞} [Φ_i(f − f_0) + Φ_i(f + f_0)] [Φ*_{i′}(f − f_0) + Φ*_{i′}(f + f_0)] df
  = (1/2) ∫_{−∞}^{∞} Φ_i(f − f_0) Φ*_{i′}(f − f_0) df + (1/2) ∫_{−∞}^{∞} Φ_i(f − f_0) Φ*_{i′}(f + f_0) df
    + (1/2) ∫_{−∞}^{∞} Φ_i(f + f_0) Φ*_{i′}(f − f_0) df + (1/2) ∫_{−∞}^{∞} Φ_i(f + f_0) Φ*_{i′}(f + f_0) df
  (a)= (1/2) ∫_{−∞}^{∞} Φ_i(f − f_0) Φ*_{i′}(f − f_0) df + (1/2) ∫_{−∞}^{∞} Φ_i(f + f_0) Φ*_{i′}(f + f_0) df
  (b)= (1/2) δ_{i,i′} + (1/2) δ_{i,i′} = δ_{i,i′}.   (15.10)

Here equality (a) follows from the fact that f_0 > W and the observation that for all i = 1, N_c the spectra Φ_i(f) ≡ 0 for |f| ≥ W. Therefore the cross-terms are zero, see also figure 15.3. Equality (b) follows from (15.6), which holds since the baseband building blocks φ_i(t), i = 1, N_c, form an orthonormal base.

In a similar way we can show that for all j and j′

∫_{−∞}^{∞} ψ_{s,j}(t) ψ_{s,j′}(t) dt = δ_{j,j′}.   (15.11)

What remains to be investigated are the correlations between all in-phase (cosine) building-block waveforms φ_{c,i}(t) for i = 1, N_c and all quadrature (sine) building-block waveforms ψ_{s,j}(t) for j = 1, N_s:

∫_{−∞}^{∞} φ_{c,i}(t) ψ_{s,j}(t) dt = ∫_{−∞}^{∞} Φ_{c,i}(f) Ψ*_{s,j}(f) df
  = (j/2) ∫_{−∞}^{∞} [Φ_i(f − f_0) + Φ_i(f + f_0)] [Ψ*_j(f − f_0) − Ψ*_j(f + f_0)] df
  = (j/2) ∫_{−∞}^{∞} Φ_i(f − f_0) Ψ*_j(f − f_0) df − (j/2) ∫_{−∞}^{∞} Φ_i(f − f_0) Ψ*_j(f + f_0) df
    + (j/2) ∫_{−∞}^{∞} Φ_i(f + f_0) Ψ*_j(f − f_0) df − (j/2) ∫_{−∞}^{∞} Φ_i(f + f_0) Ψ*_j(f + f_0) df
  (c)= (j/2) ∫_{−∞}^{∞} Φ_i(f − f_0) Ψ*_j(f − f_0) df − (j/2) ∫_{−∞}^{∞} Φ_i(f + f_0) Ψ*_j(f + f_0) df
  = 0.   (15.12)


Here equality (c) follows from the fact that f_0 > W and the observation that for all i = 1, N_c the spectra Φ_i(f) ≡ 0 for |f| ≥ W and for all j = 1, N_s the spectra Ψ_j(f) ≡ 0 for |f| ≥ W. Therefore the cross-terms are zero; moreover, the two remaining integrals are equal, so their difference vanishes. See figure 15.3.

RESULT 15.1 We have shown that all in-phase building-block waveforms φ c,i (t) for i = 1, N c<br />

and all quadrature building-block waveforms ψ s, j (t) for j = 1, N s together form an orthonormal<br />

base.<br />

Moreover the spectra Φ_{c,i}(f) and Ψ_{s,j}(f) of all these building-block waveforms are zero outside

the passband ±[ f 0 − W, f 0 + W ]. Therefore none of these building-block waveforms is hindered<br />

by the bandpass filter W 0 ( f ) when they are sent over our bandpass channel.<br />

It is important to note that the baseband building-block waveforms φ i (t) and ψ j (t), for any<br />

i and j, need not be orthogonal. Multiplication of these baseband waveforms by √ 2 cos 2π f 0 t<br />

resp. √ 2 sin 2π f 0 t results in the orthogonality of the bandpass building-block waveforms φ c,i (t)<br />

and ψ s, j (t).<br />

Also note that all the bandpass building block waveforms have unit energy. This follows from<br />

(15.10) and (15.11). Therefore the energy of s m (t) is equal to the squared length of the vector<br />

s_m = (s^c_m, s^s_m) = (s^c_{m1}, s^c_{m2}, ..., s^c_{mN_c}, s^s_{m1}, s^s_{m2}, ..., s^s_{mN_s}).   (15.13)
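The orthonormality claimed in result 15.1 is easy to check numerically. The sketch below (not part of the original notes; W, f_0 and the grid are assumed, and both baseband building blocks are deliberately taken equal) verifies that the cosine- and sine-modulated waveforms have unit energy and are orthogonal, even though the baseband waveforms themselves are not orthogonal.

```python
import numpy as np

# Numerical check of result 15.1 (assumed parameters W = 1, f0 = 5 > W, finite grid).
W, f0, dt = 1.0, 5.0, 1e-3
t = np.arange(-50.0, 50.0, dt)

phi = np.sqrt(2 * W) * np.sinc(2 * W * t)            # unit-energy baseband pulse, bandwidth W
psi = phi.copy()                                      # deliberately identical to phi
phi_c = phi * np.sqrt(2) * np.cos(2 * np.pi * f0 * t)
psi_s = psi * np.sqrt(2) * np.sin(2 * np.pi * f0 * t)

print(round(np.sum(phi_c * phi_c) * dt, 3))   # ~1: unit energy
print(round(np.sum(psi_s * psi_s) * dt, 3))   # ~1: unit energy
print(round(np.sum(phi_c * psi_s) * dt, 3))   # ~0: orthogonal although phi = psi
```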

15.3 Optimum receiver for quadrature multiplexing<br />

The result of the previous section actually states that by applying quadrature multiplexing for a<br />

bandpass channel we obtain a “normal” waveform channel communication problem. Quadrature<br />

multiplexing apparently yields the right building-block waveforms! The building-block waveforms<br />

have spectra that are matched to the bandpass channel. Each message m ∈ M results in<br />

a signal s m (t) that is a linear combination of N c cosine and N s sine building-block waveforms.<br />

Therefore the ideal bandpass filter at the input of the bandpass channel has no effect on the<br />

transmission of the signals.<br />

We have seen before that a waveform channel is actually equivalent to a vector channel. Also<br />

we know how the optimum receiver for a waveform channel should be constructed. If we restrict<br />

ourselves for a moment to the receiver that correlates the received signal r(t) with the building<br />

blocks, we know that it should start with N c + N s multipliers followed by integrators.<br />

∫_{−∞}^{∞} r(t) φ_{c,i}(t) dt = ∫_{−∞}^{∞} r(t) √2 cos(2π f_0 t) φ_i(t) dt = r^c_i, and

∫_{−∞}^{∞} r(t) ψ_{s,j}(t) dt = ∫_{−∞}^{∞} r(t) √2 sin(2π f_0 t) ψ_j(t) dt = r^s_j.   (15.14)

After having determined the received vector r = (r^c, r^s) = (r^c_1, r^c_2, ..., r^c_{N_c}, r^s_1, r^s_2, ..., r^s_{N_s}) we can form the dot products (r · s_m), where the signal vector s_m = (s^c_m, s^s_m) = (s^c_{m1}, s^c_{m2}, ..., s^c_{mN_c}, s^s_{m1}, s^s_{m2}, ..., s^s_{mN_s}):

(r · s_m) = ∑_{i=1,N_c} r^c_i s^c_{mi} + ∑_{j=1,N_s} r^s_j s^s_{mj}.   (15.15)


Figure 15.4: Optimum receiver for quadrature-multiplexed signals that are transmitted over a bandpass channel. The received waveform r(t) is correlated with the N_c + N_s bandpass building-block waveforms, the dot products (r · s_m) are formed with a weighting matrix, the constants c_m are added, and the largest decision variable determines m̂.



Adding the constants c m is now the last step before selecting ˆm as the message m that maximizes<br />

(r · s m ) + c m where (see result (6.1))<br />

c_m = (N_0/2) ln Pr{M = m} − ‖s_m‖²/2,   (15.16)

with

‖s_m‖² = ‖s^c_m‖² + ‖s^s_m‖² = ∑_{i=1,N_c} (s^c_{mi})² + ∑_{j=1,N_s} (s^s_{mj})²
       = ∫_{−∞}^{∞} (s^c_m(t))² dt + ∫_{−∞}^{∞} (s^s_m(t))² dt.   (15.17)

Note that ‖s_m‖² = ∫_{−∞}^{∞} s_m²(t) dt.

All this leads to the receiver implementation shown in figure 15.4.

15.4 Transmission of a complex signal

After the previous section we may think of quadrature multiplexing as a method for transmitting<br />

two real-valued signals sm c (t) and ss m (t). However we can also see quadrature multiplexing as a<br />

technique to transmit a single complex waveform

s^cs_m(t) = s^c_m(t) − j s^s_m(t).   (15.18)

Then we can write

s_m(t) = R[s^cs_m(t) √2 exp(j2π f_0 t)]
       = s^c_m(t) √2 cos(2π f_0 t) + s^s_m(t) √2 sin(2π f_0 t),   (15.19)

where R[x] denotes the real part of x.

The notion of energy of the complex signal s^cs_m(t) is in accordance with the energy that we have used in expressing the constants c_m, hence

∫_{−∞}^{∞} ‖s^cs_m(t)‖² dt = ∫_{−∞}^{∞} (s^c_m(t))² dt + ∫_{−∞}^{∞} (s^s_m(t))² dt.   (15.20)

15.5 Dimensions per second

By applying quadrature multiplexing we cannot obtain more than 4W dimensions per second

from our bandpass channel. By choosing both the baseband building-block waveform sets<br />

{φ i (t), i = 1, N c } and {ψ j (t), j = 1, N s } as large (rich) as possible given the bandwidth constraint<br />

of W Hz, i.e. by taking 2W dimensions per second for each of the building-block waveform<br />

sets, we obtain the upper bound of 4W dimensions per second. Note however that, as usual,<br />

this corresponds to 2 dimensions per Hz per second.



15.6 Capacity of the bandpass channel<br />

In the previous section we saw that the number of dimensions in the case of bandpass transmission<br />

is at most 4W per second. Moreover there exist baseband building-block sets that achieve<br />

this maximum.<br />

RESULT 15.2 Therefore the capacity in bit per dimension of the bandpass channel with bandwidth ±[f_0 − W, f_0 + W] is

C_N = (1/2) log_2(1 + P_s/(2N_0W)),   (15.21)

if the transmitter power is P_s and the noise spectral density is N_0/2 for all f. Consequently the capacity per second is

C = 4W·C_N = 2W log_2(1 + P_s/(2N_0W)) bit/second.   (15.22)

15.7 Quadrature amplitude modulation

In general we take N c = N s = N and the baseband in-phase and quadrature building-block<br />

waveforms equal, i.e. φ i (t) = ψ i (t) for all i = 1, N. Then the two dimensions corresponding<br />

to the passband building-blocks φ i (t) √ 2 cos(2π f 0 t) and ψ i (t) √ 2 sin(2π f 0 t) for i = 1, N are<br />

in some way linked together. This will become clear in chapter 16 on random carrier-phase<br />

communication. We refer to this method as quadrature amplitude modulation (QAM).<br />

15.8 Serial quadrature amplitude modulation<br />

It is important to realize that we could use serial pulse amplitude modulation with a pulse p(t)<br />

satisfying the Nyquist criterion (see theorem 14.1) on both carriers (cosine and sine). A Nyquist

pulse corresponds to an orthonormal basis consisting of time shifts of this pulse. Two such sets<br />

(based on the same pulse p(t)) can be used in a QAM scheme. Note that this signaling method<br />

leads to an extremely simple receiver structure.<br />

Example 15.1 QAM modulation results in two-dimensional signal structures. In serial QAM we can take<br />

as baseband building blocks<br />

φ_1(t) = ψ_1(t) = √(2W) · sin(2πWt)/(2πWt),   (15.23)

and all other baseband building blocks are shifts over multiples of 1/(2W) seconds of this sinc-pulse. Therefore the frequency bandwidth of all building blocks is W. After modulation a bandpass spectrum is obtained, thus the spectra of the signals fit into ±[f_0 − W, f_0 + W]. In each pair of dimensions corresponding to a shift over (k − 1)/(2W) seconds we can observe a two-dimensional part (s^c_k, s^s_k) of the signal vector. This part assumes values in a two-dimensional signal structure.

In figure 15.5 two signal structures are shown. The signal points are placed on a rectangular grid.<br />

This implies that their minimum Euclidean distance is not smaller than a certain value d. It is important to<br />

place the required number of signal points in such a way on the grid that their average energy is minimal.
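As a small illustration of this energy consideration, the sketch below (not part of the original notes; the 16-QAM grid of figure 15.5(a) with minimum distance d is assumed) computes the average energy of the two-dimensional signal structure.

```python
import numpy as np
from itertools import product

# Average energy of the 16-QAM structure of figure 15.5(a) (assumed minimum distance d).
d = 2.0
levels = d * (np.arange(4) - 1.5)                  # per-dimension levels -3d/2, -d/2, d/2, 3d/2
points = np.array(list(product(levels, levels)))   # 16 points (s^c_k, s^s_k)
print(np.mean(np.sum(points**2, axis=1)))          # 2.5 * d^2 = 10.0 for d = 2
```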


Figure 15.5: Two QAM signal structures, (a) 16-QAM, (b) 32-QAM. Note that 32-QAM is not square!

15.9 Exercises<br />

1. A signal structure must contain M signal points. These points are to be chosen on an integer

grid.<br />

Now consider a hypercube and a hypersphere in N dimensions. Both have their center in<br />

the origin of the coordinate system.<br />

We can either choose our signal structure as all the grid points inside the sphere or all the<br />

grid points inside the cube. Assume that the dimensions of the sphere and cube are such<br />

that they contain equally many signal points.<br />

Let M ≫ 1 so that the signal points can be assumed to be uniformly distributed over the<br />

sphere and the cube.<br />

(a) Let N = 2. Find the ratio between the average signal energies of the sphere-structure<br />

and that of the cube-structure. What is the difference in dB?<br />

(b) Let N be even. What is then the ratio for N → ∞? In dB?<br />

Hint: The volume of an N-dimensional sphere is π N/2 R N /(N/2)! for even N.<br />

The computed ratios are called shaping gains.


Chapter 16<br />

Random carrier-phase<br />

SUMMARY: Here we consider carrier modulation under the assumption that the carrier phase is not known to the receiver. Only one baseband signal s^b(t) will be transmitted. First we determine the optimum receiver for this case. Then we consider the implementation of the optimum receiver when all signals have equal energy. We investigate envelope detection and work out an example in which one out of two orthogonal signals is transmitted.

16.1 Introduction<br />

Because of oscillator drift or differences in propagation time it is not always reasonable to assume that the receiver knows the phase of the carrier wave, for instance when quadrature multiplexing is used. In that case coherent demodulation is not possible.

To investigate this situation we assume that the transmitter modulates with the baseband signal¹ s^b(t) a wave with a random phase θ, i.e.

s(t) = s^b(t) √2 cos(2π f_0 t − θ),   (16.1)

where the random phase is assumed to be uniform over [0, 2π), hence

p_Θ(θ) = 1/(2π), for 0 ≤ θ < 2π.   (16.2)

More precisely, for each message m ∈ M, let the signal s^b_m(t) correspond to the vector s_m = (s_{m1}, s_{m2}, ..., s_{mN}), hence

s^b_m(t) = ∑_{i=1,N} s_{mi} φ_i(t).   (16.3)

We assume as before that the spectrum of the signals s^b_m(t) for all m ∈ M is zero outside [−W, W]. Note that the same holds for the building-block waveforms φ_i(t) for i = 1, N. This is since the building-block waveforms are linear combinations of the signals. If we now use the goniometric identity cos(a − b) = cos a cos b + sin a sin b we obtain

s_m(t) = s^b_m(t) √2 cos(2π f_0 t − θ)
       = s^b_m(t) cos θ √2 cos(2π f_0 t) + s^b_m(t) sin θ √2 sin(2π f_0 t)
       = ∑_{i=1,N} s_{mi} cos θ φ_i(t) √2 cos(2π f_0 t) + ∑_{i=1,N} s_{mi} sin θ φ_i(t) √2 sin(2π f_0 t).   (16.4)

¹ Note that there is only one baseband signal involved here. The type of modulation is called double sideband suppressed carrier (DSB-SC) modulation.

If we now turn to the vector approach we can say that, although θ is unknown, this is equivalent<br />

to quadrature multiplexing where vector-signals s c m and ss m are transmitted satisfying<br />

s^c_m = s_m cos θ,
s^s_m = s_m sin θ.   (16.5)

After receiving the signal r(t) = s_m(t) + n_w(t) an optimum receiver forms the vectors r^c and r^s for which

r^c = s^c_m + n^c = s_m cos θ + n^c,
r^s = s^s_m + n^s = s_m sin θ + n^s,   (16.6)

where both n c and n s are independent Gaussian vectors with independent components all having<br />

mean zero and variance N 0 /2. In the next section we will further determine the optimum receiver<br />

for this situation.<br />

16.2 Optimum incoherent reception<br />

To determine the optimum receiver we first write out the decision variable for message m, i.e.

Pr{M = m} p_{R^c,R^s}(r^c, r^s | M = m) = Pr{M = m} ∫_0^{2π} p_Θ(θ) p_{R^c,R^s}(r^c, r^s | M = m, Θ = θ) dθ.   (16.7)

Note that we average over the unknown phase θ. Assume first that all messages are equally likely. Therefore only the integral is relevant. The informative term inside the integral is

p_{R^c,R^s}(r^c, r^s | M = m, Θ = θ) = p_{N^c,N^s}(r^c − s_m cos θ, r^s − s_m sin θ)
  = (1/(πN_0))^N exp( −(‖r^c − s_m cos θ‖² + ‖r^s − s_m sin θ‖²)/N_0 ).   (16.8)

Therefore we investigate

‖r^c − s_m cos θ‖² + ‖r^s − s_m sin θ‖²
  = ‖r^c‖² − 2(r^c · s_m) cos θ + ‖s_m‖² cos² θ + ‖r^s‖² − 2(r^s · s_m) sin θ + ‖s_m‖² sin² θ
  = ‖r^c‖² + ‖r^s‖² − 2(r^c · s_m) cos θ − 2(r^s · s_m) sin θ + ‖s_m‖²,   (16.9)


and note that both ‖r^c‖² and ‖r^s‖² do not depend on the message m and can be ignored. Therefore the relevant part of the decision variable is

∫_θ p_Θ(θ) exp( (2/N_0) [(r^c · s_m) cos θ + (r^s · s_m) sin θ] ) dθ · exp(−E_m/N_0),   (16.10)

where E_m = ‖s_m‖². Note that we have to maximize the decision variable over all m ∈ M.

We next consider (r^c · s_m) cos θ + (r^s · s_m) sin θ. Therefore, for each m ∈ M, we first make a transformation of the matched-filter outputs (r^c · s_m) and (r^s · s_m) from rectangular into polar coordinates (amplitude X_m and phase γ_m). Define (see figure 16.1)

X_m = √((r^c · s_m)² + (r^s · s_m)²),   (16.11)

and let the angle γ_m be such that

(r^c · s_m) = X_m cos γ_m,
(r^s · s_m) = X_m sin γ_m.   (16.12)

Figure 16.1: Polar transformation of matched-filter outputs.

Therefore

(r^c · s_m) cos θ + (r^s · s_m) sin θ = X_m cos γ_m cos θ + X_m sin γ_m sin θ = X_m cos(θ − γ_m),   (16.13)

and

∫_0^{2π} p_Θ(θ) exp( (2/N_0) [(r^c · s_m) cos θ + (r^s · s_m) sin θ] ) dθ
  = ∫_0^{2π} (1/(2π)) exp( (2X_m/N_0) cos(θ − γ_m) ) dθ
  (a)= ∫_0^{2π} (1/(2π)) exp( (2X_m/N_0) cos θ ) dθ = I_0(2X_m/N_0).   (16.14)

Here (a) follows from the periodicity of the cosine. Note that therefore the angle γ m is not<br />

relevant! The integral denoted by I 0 (·) is called the “zero-order modified Bessel function of the<br />

first kind” (see figure 16.2).


Figure 16.2: A plot of I_0(x) = (1/(2π)) ∫_0^{2π} exp(x cos θ) dθ as a function of x.

RESULT 16.1 Combining everything we obtain that the optimum receiver for "random-phase transmission" (incoherent detection) has to choose the message m ∈ M that maximizes

I_0(2X_m/N_0) · exp(−E_m/N_0).   (16.15)

16.3 Signals with equal energy, receiver implementation

If all the signals have the same energy the optimum receiver only has to maximize I_0(2X_m/N_0). However, since I_0(x) is a monotonically increasing function of x this is equivalent to maximizing X_m or its square

X_m² = (r^c · s_m)² + (r^s · s_m)².   (16.16)

How can we now determine e.g. (r c · s m )? We can use the standard approach as shown in<br />

figure 15.4 and correlate r(t) with the bandpass building-block waveforms φ i (t) √ 2 cos(2π f 0 t)<br />

to obtain the components of r c and then form the dot product, but there is also a more direct way.


Note that

(r^c · s_m) = ∑_{i=1,N} r^c_i s_{mi} = ∑_{i=1,N} ( ∫_{−∞}^{∞} r(t) φ_{c,i}(t) dt ) s_{mi}
  = ∑_{i=1,N} ( ∫_{−∞}^{∞} r(t) φ_i(t) √2 cos(2π f_0 t) dt ) s_{mi}
  = ∫_{−∞}^{∞} r(t) √2 cos(2π f_0 t) ∑_{i=1,N} s_{mi} φ_i(t) dt
  = ∫_{−∞}^{∞} r(t) √2 cos(2π f_0 t) s^b_m(t) dt.   (16.17)

A similar result holds for the product (r s · s m ). All this suggests the implementation shown in<br />

figure 16.3. As usual there is also a matched-filter version of this incoherent receiver.<br />

Figure 16.3: Correlation receiver for equal-energy signals with random phase.



16.4 Envelope detection<br />

Fix an m ∈ M. The matched filter for the (bandpass) signal s_m(t) = s^b_m(t)√2 cos(2π f_0 t − θ) has impulse response s_m(T − t) = s^b_m(T − t)√2 cos(2π f_0 (T − t) − θ). Here it is assumed that the signal s_m(t) is zero outside [0, T]. Since the phase θ of the transmitted signal is unknown to the receiver, the impulse response

h_m(t) = s^b_m(T − t) √2 cos(2π f_0 t)   (16.18)

is probably a good matched filter. Let's investigate this!

The response of this filter to r(t) is

u_m(t) = ∫_{−∞}^{∞} r(α) h_m(t − α) dα
       = ∫_{−∞}^{∞} r(α) s^b_m(T − t + α) √2 cos(2π f_0 (t − α)) dα
       = ∫_{−∞}^{∞} r(α) s^b_m(T − t + α) √2 [cos(2π f_0 t) cos(2π f_0 α) + sin(2π f_0 t) sin(2π f_0 α)] dα
       = cos(2π f_0 t) ∫_{−∞}^{∞} r(α) s^b_m(T − t + α) √2 cos(2π f_0 α) dα
         + sin(2π f_0 t) ∫_{−∞}^{∞} r(α) s^b_m(T − t + α) √2 sin(2π f_0 α) dα
       = u^c_m(t) cos(2π f_0 t) + u^s_m(t) sin(2π f_0 t),   (16.19)

with

u^c_m(t) = ∫_{−∞}^{∞} r(α) √2 cos(2π f_0 α) s^b_m(T − t + α) dα,
u^s_m(t) = ∫_{−∞}^{∞} r(α) √2 sin(2π f_0 α) s^b_m(T − t + α) dα.   (16.20)

The signals u^c_m(t) and u^s_m(t) are baseband signals. Why? The reason is that we can regard e.g. u^c_m(t) as the output of the filter with impulse response s^b_m(T − t) when the excitation is r(t)√2 cos(2π f_0 t). It is clear that signal components outside the frequency band [−W, W] cannot pass the filter s^b_m(T − t), since s^b_m(t) is a baseband signal.

We now use the trick of section 16.2 again. Write the matched-filter output as

u_m(t) = X_m(t) cos[2π f_0 t − γ_m(t)],   (16.21)

where

X_m(t) = √((u^c_m(t))² + (u^s_m(t))²)   (16.22)

and the angle γ_m(t) is such that

u^c_m(t) = X_m(t) cos γ_m(t),
u^s_m(t) = X_m(t) sin γ_m(t).   (16.23)


Now X_m(t) is a slowly-varying "envelope" (amplitude) and γ_m(t) a slowly-varying "phase" of the output u_m(t) of the matched filter. By slowly-varying we mean within bandwidth [−W, W]. Next let us consider what happens at t = T. Observe from equations (16.17) that

u^c_m(T) = (r^c · s_m),
u^s_m(T) = (r^s · s_m),   (16.24)

hence

X_m(T) = √((r^c · s_m)² + (r^s · s_m)²).   (16.25)

Therefore, for equal-energy signals, we can construct an optimum receiver by sampling the<br />

envelope of the outcomes of the matched filters h m (t) = s b m (T −t)√ 2 cos(2π f 0 t) for all m ∈ M<br />

and comparing the samples. This leads to the implementation shown in figure 16.4.<br />

Figure 16.4: Envelope-detector receiver for equal-energy signals with random phase.

Note that instead of using a matched filter with (passband) impulse response sm b (T − t)√ 2<br />

cos(2π f 0 t) we can first multiply the received signal r(t) by cos(2π f 0 t) and then use a matched<br />

filter with (baseband) impulse response sm b (T − t).<br />

Example 16.1 For some m ∈ M assume that sm b (t) = 1 for 0 ≤ t ≤ 1 and zero elsewhere. If the random<br />

phase turns out to be θ = π/2 then<br />

s m (t) = s b m (t)√ 2 cos(2π f 0 t − π/2) = s b m (t)√ 2 sin(2π f 0 t). (16.26)<br />

For the matched-filter impulse response, noting that T = 1, we can write<br />

h m (t) = s b m (1 − t)√ 2 cos(2π f 0 t). (16.27)


If we assume that there is no noise, hence r(t) = s_m(t), the output u_m(t) of the matched filter will be

u_m(t) = ∫_{−∞}^{∞} r(α) h_m(t − α) dα
       = ∫_{−∞}^{∞} s^b_m(α) √2 sin(2π f_0 α) s^b_m(1 − t + α) √2 cos(2π f_0 (t − α)) dα.   (16.28)

All these signals are shown in figure 16.5 for f 0 = 7Hz.<br />

Note that the envelope of u m (t) has the triangular shape which is the output of a matched filter for a<br />

rectangular pulse if the input of this matched filter is this rectangular pulse.<br />
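The example can also be reproduced numerically. The sketch below (not part of the original notes; the discretization is assumed, and the envelope is obtained from the analytic signal rather than from explicit quadrature filters) builds the signals of example 16.1, convolves with h_m(t), and samples the envelope at t = T.

```python
import numpy as np
from scipy.signal import hilbert

# Numerical version of example 16.1 (assumed grid): rectangular s^b_m(t) on [0,1],
# theta = pi/2, f0 = 7 Hz, no noise.  The envelope X_m(t) of u_m(t) at t = T should be ~1.
dt, f0, T = 1e-3, 7.0, 1.0
t = np.arange(-0.5, 2.5, dt)
sb = ((t >= 0) & (t <= T)).astype(float)                 # s^b_m(t); note s^b_m(1 - t) = s^b_m(t) here
r = sb * np.sqrt(2) * np.sin(2 * np.pi * f0 * t)         # received signal, random phase pi/2
h = sb * np.sqrt(2) * np.cos(2 * np.pi * f0 * t)         # matched filter h_m(t) of (16.27)

u = np.convolve(r, h) * dt                               # u_m(t), matched-filter output
tu = 2 * t[0] + np.arange(len(u)) * dt                   # time axis of the convolution
X = np.abs(hilbert(u))                                   # envelope X_m(t) via the analytic signal
print(round(X[np.argmin(np.abs(tu - T))], 3))            # ~1.0 although the filter phase is "wrong"
```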

Figure 16.5: Signals s^b_m(t), s_m(t), and u_m(t), and impulse response h_m(t).

16.5 Probability of error for two orthogonal signals<br />

An incoherent receiver does not consider phase information. Therefore it cannot yield as small an error probability as a coherent receiver. To illustrate this we will now calculate the probability of error for white Gaussian noise when one of two equally likely messages is communicated over a system utilizing an incoherent receiver and DSB-SC modulated equal-energy orthogonal baseband signals

s^b_1(t) = √E_s φ_1(t), hence s_1 = (√E_s, 0), and
s^b_2(t) = √E_s φ_2(t), hence s_2 = (0, √E_s).   (16.29)


An optimum receiver (see result 16.1) now chooses m̂ = 1 if and only if

(r^c · s_1)² + (r^s · s_1)² > (r^c · s_2)² + (r^s · s_2)².   (16.30)

Here all vectors are two-dimensional, more precisely

r^c = s_m cos θ + n^c,
r^s = s_m sin θ + n^s.   (16.31)

When M = 1 then the vector components of r^c = (r^c_1, r^c_2) and r^s = (r^s_1, r^s_2) are

r^c_1 = √E_s cos θ + n^c_1,
r^s_1 = √E_s sin θ + n^s_1,
r^c_2 = n^c_2,
r^s_2 = n^s_2,   (16.32)

where n^c = (n^c_1, n^c_2) and n^s = (n^s_1, n^s_2). Now by s_1 = (√E_s, 0) and s_2 = (0, √E_s) the optimum receiver decodes m̂ = 1 if and only if

(r^c_1)² + (r^s_1)² > (r^c_2)² + (r^s_2)².   (16.33)

The noise components are all statistically independent Gaussian variables with density function

p_N(n) = (1/√(πN_0)) exp(−n²/N_0).   (16.34)

Now fix some θ. Assume that (r^c_1)² + (r^s_1)² = ρ². What is now the probability of error given that M = 1 and Θ = θ? Therefore consider

Pr{M̂ = 2 | Θ = θ, M = 1, (R^c_1)² + (R^s_1)² = ρ²} = Pr{(R^c_2)² + (R^s_2)² ≥ ρ²}
  = ∫∫_{α²+β² ≥ ρ²} p_N(α) p_N(β) dα dβ
  = ∫∫_{α²+β² ≥ ρ²} (1/(πN_0)) exp(−(α² + β²)/N_0) dα dβ
  = ∫_ρ^{∞} ∫_0^{2π} (1/(πN_0)) exp(−r²/N_0) r dθ dr
  = exp(−ρ²/N_0).   (16.35)


We obtain the error probability for the case where M = 1 by averaging over all R^c_1 and R^s_1, hence

Pr{M̂ = 2 | Θ = θ, M = 1}
  = ∫_{−∞}^{∞} ∫_{−∞}^{∞} p_{R^c_1,R^s_1}(r^c_1, r^s_1 | Θ = θ, M = 1) exp(−((r^c_1)² + (r^s_1)²)/N_0) dr^c_1 dr^s_1
  = ∫_{−∞}^{∞} ∫_{−∞}^{∞} p_{R^c_1}(r^c_1 | Θ = θ, M = 1) p_{R^s_1}(r^s_1 | Θ = θ, M = 1) exp(−((r^c_1)² + (r^s_1)²)/N_0) dr^c_1 dr^s_1
  = ∫_{−∞}^{∞} p_{R^c_1}(r^c_1 | Θ = θ, M = 1) exp(−(r^c_1)²/N_0) dr^c_1 · ∫_{−∞}^{∞} p_{R^s_1}(r^s_1 | Θ = θ, M = 1) exp(−(r^s_1)²/N_0) dr^s_1.   (16.36)

Consider the first factor. Note that r^c_1 = √E_s cos θ + n^c_1. Therefore

∫_{−∞}^{∞} p_{R^c_1}(r^c_1 | Θ = θ, M = 1) exp(−(r^c_1)²/N_0) dr^c_1
  = ∫_{−∞}^{∞} (1/√(πN_0)) exp(−(α − √E_s cos θ)²/N_0) exp(−α²/N_0) dα.   (16.37)

With m = √E_s cos θ this integral becomes

∫_{−∞}^{∞} (1/√(πN_0)) exp(−(α − m)²/N_0) exp(−α²/N_0) dα
  = ∫_{−∞}^{∞} (1/√(πN_0)) exp(−(2α² − 2αm + m²)/N_0) dα
  = ∫_{−∞}^{∞} (1/√(πN_0)) exp(−(2(α² − αm + m²/4) + m²/2)/N_0) dα
  = (exp(−m²/(2N_0))/√2) ∫_{−∞}^{∞} (1/√(πN_0/2)) exp(−(α − m/2)²/(N_0/2)) dα
  = exp(−m²/(2N_0))/√2 = exp(−E_s cos² θ/(2N_0))/√2.   (16.38)

Combining this with a similar result for the second factor we obtain<br />

Pr{M̂ = 2 | Θ = θ, M = 1} = [exp(−E_s cos² θ/(2N_0))/√2] · [exp(−E_s sin² θ/(2N_0))/√2] = (1/2) exp(−E_s/(2N_0)).   (16.39)

Although we have fixed θ, this probability is independent of θ. Averaging over θ therefore yields

Pr{M̂ = 2 | M = 1} = (1/2) exp(−E_s/(2N_0)).   (16.40)

Based on symmetry we can obtain a similar result for Pr{ ˆM = 1|M = 2}.


RESULT 16.2 Therefore we obtain for the error probability of an incoherent receiver for two equally likely orthogonal signals, both having energy E_s, that

P_E = (1/2) exp(−E_s/(2N_0)).   (16.41)

Note that for coherent reception of two equally likely orthogonal signals we have obtained before that

P_E = Q(√(E_s/N_0)) ≤ (1/2) exp(−E_s/(2N_0)).   (16.42)

Here we have used the bound on the Q-function that was derived in appendix A. Figure 16.6<br />

shows that the error probabilities are comparable however.<br />

Figure 16.6: Probability of error for coherent and incoherent reception of two equally likely orthogonal signals of energy E_s as a function of E_s/N_0 in dB. Note that the probability of error for incoherent reception is the larger of the two.

Another disadvantage of incoherent transmission of two orthogonal signals is that we lose 3 dB relative to antipodal signaling. Also the bandwidth efficiency of random-phase transmission of two orthogonal signals is a factor of two smaller than that of coherent transmission: a transmitted dimension spreads out over two received dimensions.
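The comparison of figure 16.6 can be tabulated with a few lines of code. The sketch below (not part of the original notes; the E_s/N_0 grid is assumed, and the Q-function is evaluated via the Gaussian tail probability) prints both error probabilities.

```python
import numpy as np
from scipy.stats import norm

# Error probabilities for two equally likely orthogonal signals (assumed E_s/N_0 grid):
# coherent P_E = Q(sqrt(E_s/N_0)), incoherent P_E = 0.5 exp(-E_s/(2 N_0)), cf. (16.41)-(16.42).
N0 = 1.0
snr_db = np.arange(-15, 16, 5)
Es = N0 * 10.0 ** (snr_db / 10.0)

Pe_coherent = norm.sf(np.sqrt(Es / N0))        # Q(x) = P(N(0,1) > x)
Pe_incoherent = 0.5 * np.exp(-Es / (2 * N0))

for db, pc, pi in zip(snr_db, Pe_coherent, Pe_incoherent):
    print(f"{db:>4} dB   coherent {pc:.2e}   incoherent {pi:.2e}")
```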


16.6 Exercises

1. Let X and Y be two independent Gaussian random variables with common variance σ². The mean of X is m and Y is a zero-mean random variable. We define the random variable V as V = √(X² + Y²). Show that

p_V(v) = (v/σ²) I_0(mv/σ²) exp(−(v² + m²)/(2σ²)), for v > 0,   (16.43)

and 0 for v ≤ 0. Here I_0(·) is the modified Bessel function of the first kind and zero order. The distribution of V is known as the Rician distribution. In the special case where m = 0, the Rician distribution simplifies to the Rayleigh distribution.

(Exercise 3.31 from Proakis and Salehi [17].)<br />

2. In on-off keying of a carrier-modulated signal, the two possible signals are

s^b_1(t) = √E φ(t), hence s_1 = √E,
s^b_2(t) = 0, hence s_2 = 0.   (16.44)

Note that s_1 and s_2 are signals in one-dimensional "vector" notation. The signals are equiprobable, i.e. Pr{M = 1} = Pr{M = 2} = 1/2.

These signals are transmitted over a bandpass channel with a carrier with random phase θ with p_Θ(θ) = 1/(2π) for 0 ≤ θ < 2π. Power spectral density of the noise is N_0/2 for all frequencies.

An optimum receiver first determines r^c and r^s for which we can write

r^c = s_m cos θ + n^c,
r^s = s_m sin θ + n^s,   (16.45)

where both n^c and n^s are independent zero-mean Gaussian vectors with variance N_0/2.

Now let E/N_0 = 10 dB.

(a) Determine the optimum decision regions I_m for m = 1, 2.
Hint: Note that I_0(12.1571) = 22025.

(b) Determine the expected error probability that is achieved by the optimum receiver.
Hint: Note that ∫_0^{2.7184} x exp(−x²/2) I_0(x√20) dx = 635.72.

3. In on-off keying of a carrier modulated signal the two baseband signals are<br />

s b 1 (t) = Aφ(t), hence s 1 = A<br />

s b 2 (t) = 0, hence s 2 = 0.<br />

Here φ(t) is a baseband building-block waveform and s 1 and s 2 are the baseband signals<br />

in vector-representation (one dimensional). Both signals are equiprobable, hence Pr{M =<br />

1} = Pr{M = 2} = 1/2.


The baseband signals are modulated with a carrier having frequency f_0 and random phase Θ, hence

s_m(t) = s^b_m(t) √2 cos(2π f_0 t − θ).

The resulting signal s 1 (t) or s 2 (t) is transmitted over a bandpass channel, power spectral<br />

density of the additive white Gaussian noise is N 0 /2 for all frequencies f . Assume that<br />

A 2 /N 0 = ln(13/5).<br />

An optimum receiver correlates the received signal r(t) = s m (t) + n w (t) with the bandpass<br />

building-block waveforms φ(t) √ 2 cos(2π f 0 t) and φ(t) √ 2 sin(2π f 0 t) to obtain the<br />

received vector (r c , r s ).<br />

(a) Fix some θ. Express the received vector (r c , r s ) as a signal vector (s c , s s ) plus a<br />

noise vector (n c , n s ). What is the signal vector (s c , s s ) as a function of θ? What are<br />

the statistical properties of the noise vector (n c , n s )?<br />

(b) The random variable Θ can have the values +π/2 and −π/2 only, and Pr{Θ = +π/2} = Pr{Θ = −π/2} = 1/2. What are now the possible signal vectors (s^c, s^s)?

Give the decision variables as function of (r c , r s ) and specify 2 for what (r c , r s ) an<br />

optimum receiver decides ˆM = 2.<br />

(c) Give an expression for the error probability P E that is achieved by an optimum receiver.<br />

² Hint: note that x = ln 5 satisfies 26/5 = exp(x) + exp(−x).


Chapter 17<br />

Colored noise<br />

SUMMARY: So far we have only considered additive white Gaussian noise. In this chapter<br />

we will see that we should use a whitening filter if the channel noise is nonwhite. We will<br />

discuss the corresponding receiver structures.<br />

17.1 Introduction<br />

Figure 17.1: Communication over an additive waveform channel. The noise n(t) is nonwhite and Gaussian.

Consider figure 17.1. There the transmitter sends, depending on the message index m, one of<br />

the signals (waveforms) s 1 (t), s 2 (t), · · · , s |M| (t) over a waveform channel. The channel adds<br />

wide-sense-stationary zero-mean non-white Gaussian noise n(t) with power spectral density<br />

S n ( f ) to this waveform 1 . Therefore the output of the channel r(t) = s m (t) + n(t).<br />

How should an optimum receiver process this output signal r(t) now? We will show next<br />

that we can use a so-called whitening filter to make the colored noise white. Then we proceed as<br />

usual.<br />

17.2 Another result on reversibility<br />

We could try to construct an optimum receiver in two steps (see figure 17.2). First an operation is<br />

performed on the channel output r(t). This yields a new channel output r o (t). Then we construct<br />

an optimum receiver for the channel with input s_m(t) and output r^o(t).

¹ See appendix D.

Figure 17.2: Insertion of an operation between channel output and receiver.

It will be clear that such a two-step procedure cannot result in a smaller average error probability<br />

P E than an optimum one-step procedure would achieve.<br />

If an inverse operation exists which permits r(t) to be reconstructed from r^o(t), however, the theorem of reversibility (theorem 4.3) states that the average error probability need not be increased by a

two-step procedure. The second step could in that case consist of the calculation of r(t) from<br />

r o (t), followed by the optimum receiver for r(t). The theorem of reversibility in its most general<br />

form states:<br />

RESULT 17.1 A reversible operation, transforming r(t) into one or more waveforms, may be<br />

inserted between the channel output and the receiver without affecting the minimum attainable<br />

average error probability.<br />

17.3 Additive nonwhite Gaussian noise<br />

Suppose that we use a filter with impulse response g(t) and transfer function G( f ) to change<br />

the power spectral density S n ( f ) of the Gaussian noise process N(t). Consider figure 17.3. The<br />

filter produces the Gaussian noise process N o (t) at its output.<br />

From equation (D.6) in appendix D we know that<br />

S n o( f ) = S n ( f )G( f )G ∗ ( f ) = S n ( f )|G( f )| 2 . (17.1)<br />

If we use the filter g(t) to whiten the noise, i.e. to produce noise at the output with power spectral<br />

density S n o( f ) = N 0 /2, the filter should be such that S n ( f )|G( f )| 2 = N 0 /2. In appendix I we<br />

have shown how to determine a realizable linear filter with transfer function G( f ), which has a<br />

realizable inverse and for which<br />

|G(f)|² = (N_0/2) · 1/S_n(f). (17.2)

The method in the appendix is applicable whenever S_n(f) can be expressed (or approximated)

as a ratio of two polynomials in f . We will work out an example next.<br />

Figure 17.3: Filtering the noise process N(t).


Example 17.1 Consider (see figure 17.4) noise with power spectral density

S_n(f) = (f² + 4)/(f² + 1). (17.3)

Observe that we can take as transfer function of our whitening filter

G(f) = (f − j)/(f − 2j), (17.4)

then both G(f) and its inverse are realizable (see appendix I).

Figure 17.4: The power spectrum S_n(f) = (f² + 4)/(f² + 1) of the noise and the squared
modulus (f² + 1)/(f² + 4) of the transfer function G(f) of the corresponding whitening filter.
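The whitening condition S_n(f)|G(f)|² = N_0/2 can be checked numerically. The sketch below is a minimal Python illustration, assuming N_0/2 = 1 as implied in the example; it is not part of the derivation.

```python
import numpy as np

def S_n(f):
    """Noise power spectral density of example 17.1."""
    return (f**2 + 4.0) / (f**2 + 1.0)

def G(f):
    """Whitening filter G(f) = (f - j)/(f - 2j), so |G(f)|^2 = (f^2 + 1)/(f^2 + 4)."""
    return (f - 1j) / (f - 2j)

f = np.linspace(-10.0, 10.0, 2001)
flat = S_n(f) * np.abs(G(f))**2      # should equal N_0/2 = 1 for every f
assert np.allclose(flat, 1.0)
print(flat[:3])                       # [1. 1. 1.]
```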

17.4 Receiver implementation<br />

17.4.1 Correlation type receiver<br />

When the whitening filter g(t) has been determined we can construct the optimum receiver. Note<br />

that we actually place the whitening filter at the channel output r(t) as in the left part of figure<br />

17.5 but this is equivalent to the circuit containing two filters g(t) as in the right part of figure<br />

17.5. Therefore the effect of applying the whitening filter g(t) is that we have transformed the<br />

channel into an additive white Gaussian noise waveform channel and that the signals s m (t) are<br />

changed by the whitening filter g(t) into the signals<br />

s_m^o(t) = s_m(t) ∗ g(t) = ∫_{−∞}^{∞} s_m(α) g(t − α) dα   for m ∈ M. (17.5)


Figure 17.5: Two equivalent channel/whitening-filter circuits.

This observation leads to the optimum receiver shown in figure 17.6. This receiver correlates<br />

the output r o (t) of the whitening filter g(t) with the signals sm o (t) for all m ∈ M. The signals<br />

sm o (t) are obtained at the output of filters with impulse response g(t) when s m(t) is input. Note<br />

that the constants c_m^o for m ∈ M are now

c_m^o = (N_0/2) ln Pr{M = m} − (1/2) ∫_{−∞}^{∞} [s_m^o(t)]² dt, (17.6)

since the signal s_m(t) has changed into s_m^o(t).

17.4.2 Matched-filter type receiver

To determine the matched-filter type receiver here, we will do some frequency-domain investigations.<br />

Suppose that for some m ∈ M the Fourier spectrum of the waveform s m (t) is denoted by<br />

S m ( f ). Assume that s m (t) = 0 except for 0 ≤ t ≤ T . The transfer function of the corresponding<br />

matched filter h m (t) = s m (T − t) is then given by<br />

H_m(f) = ∫_{−∞}^{∞} s_m(T − t) exp(−j2πft) dt
       = ∫_{−∞}^{∞} s_m(α) exp(−j2πf(T − α)) dα
       = exp(−j2πfT) ∫_{−∞}^{∞} s_m(α) exp(j2πfα) dα = exp(−j2πfT) S_m^*(f), (17.7)

where S_m^*(f) is the complex conjugate of S_m(f).

Now what is the matched filter corresponding to the signal s_m^o(t)? First note that the spectrum
of s_m^o(t) is

S_m^o(f) = G(f) S_m(f). (17.8)

Suppose that g(t) is only non-zero for 0 ≤ t ≤ T_1. Moreover let all s_m(t) be zero outside the
time interval 0 ≤ t ≤ T_2. Then s_m^o(t) = ∫_{−∞}^{∞} s_m(α) g(t − α) dα can only be non-zero if an α


Figure 17.6: Correlation receiver for the additive nonwhite Gaussian noise channel (filtered-reference receiver).

exists such that 0 ≤ α ≤ T_2 and 0 ≤ t − α ≤ T_1, hence if 0 ≤ t ≤ T_1 + T_2. The transfer function
H_m^o(f) of the filter matched to s_m^o(t) should therefore be

H_m^o(f) = [S_m^o(f)]^* exp(−j2πf(T_1 + T_2))
         = G^*(f) S_m^*(f) exp(−j2πf(T_1 + T_2))
         = G^*(f) exp(−j2πfT_1) · S_m^*(f) exp(−j2πfT_2). (17.9)

This corresponds to a cascade of the filter g(T 1 − t) followed by the filter s m (T 2 − t), one for<br />

each m ∈ M. Thus the filter matched to sm o (t) consists of a part matched to the channel (thus<br />

to the whitening filter) and a part that is matched to the signal s m (t). This leads to the receiver<br />

structure in figure 17.7.<br />

It is interesting to see what the front-end of this receiver does. This consists of the cascade of<br />

the whitening filter g(t) and the part of the matched filter matched to the channel (the whitening<br />

filter). The transfer function of this cascade is<br />

G(f) G^*(f) exp(−j2πfT_1) = |G(f)|² exp(−j2πfT_1)
                          = (N_0/2)/S_n(f) · exp(−j2πfT_1). (17.10)

This cascade may be implemented as a single filter. The effect of this filter is to pass energy
over frequency bands where the noise power is small and to suppress energy over frequency
bands where it is strong.
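To make the filtered-reference receiver of figure 17.6 concrete, the following discrete-time sketch (Python; the whitening filter, the two signals and the noise model are made-up illustrative choices, not taken from the text) whitens the received samples, correlates them with the whitened reference signals, adds the constants c_m^o of (17.6), and selects the largest decision variable.

```python
import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(0)
N0 = 2.0                                  # noise level: sigma^2 = N0/2 = 1 per whitened sample
g = np.array([1.0, -0.5])                 # hypothetical whitening-filter impulse response (FIR)
signals = np.array([[1, 1, 1, 1, 0, 0],   # hypothetical discrete-time s_1 and s_2
                    [1, -1, 1, -1, 0, 0]], dtype=float)
priors = np.array([0.5, 0.5])

def decide(r):
    r_o = lfilter(g, [1.0], r)                            # whiten the received samples
    s_o = lfilter(g, [1.0], signals, axis=1)              # whitened reference signals s_m^o
    c = 0.5 * N0 * np.log(priors) - 0.5 * np.sum(s_o**2, axis=1)   # constants c_m^o of (17.6)
    metrics = s_o @ r_o + c                               # correlate and add the bias
    return int(np.argmax(metrics)) + 1                    # decided message index

m = 2
colored = lfilter([1.0], g, rng.normal(0.0, 1.0, signals.shape[1]))  # colored noise: white noise filtered by 1/G
r = signals[m - 1] + colored
print("decided message:", decide(r))
```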


Figure 17.7: Matched filter receiver for the additive nonwhite Gaussian noise channel (filtered-signal receiver). The matched-filter outputs are sampled at t = T_1 + T_2.

17.5 Exercises<br />

1. A communication system uses the {ϕ j (t)} shown in figure 17.8 to transmit one of two<br />

equally likely messages by means of the signals s 1 (t) = √ E s ϕ 1 (t), s 2 (t) = √ E s ϕ 2 (t).<br />

The channel is also illustrated in figure 17.8.<br />

(a) What is the probability of error for an optimum receiver when E s /N 0 = 5?<br />

(b) Give a detailed block diagram, including waveshapes in the absence of noise, of the<br />

optimum correlation receiver (filtered-signal receiver).<br />

(Exercise 7.1 from Wozencraft and Jacobs [25].)


Figure 17.8: Building-block waveforms ϕ_1(t) and ϕ_2(t), the communication system, and the channel impulse response h(t).


Chapter 18<br />

Capacity of the colored noise waveform<br />

channel<br />

SUMMARY: In this chapter we determine the capacity of an additive non-white Gaussian<br />

noise waveform channel. We have to find out how to distribute signal power over a number<br />

of parallel channels, each with their own noise power. These channels partition the<br />

spectrum in different frequency bands. It turns out that a "water-filling" method achieves

capacity.<br />

18.1 Problem statement<br />

Figure 18.1: Communication system based on an additive non-white Gaussian noise channel.

Consider the communication system shown in figure 18.1. The channel adds stationary zero-mean
Gaussian noise to the input waveform s(t). The power spectral density S_n(f) of the noise
process N(t) is assumed to be non-white, see figure 18.2. Our problem is to determine the
capacity of this channel when the available transmitter power is P_s. We will solve this problem

by decomposing the colored-noise channel into sub-channels. Each of these sub-channels is a<br />

bandpass channel. We then have to find out how the signal power P s is distributed over these<br />

sub-channels in an optimal way, i.e. to maximize the total capacity of all the sub-channels.<br />

Figure 18.2: Power spectral density S_n(f) of colored noise.

Figure 18.3: Decomposition of our channel into sub-channels, each having bandwidth Δf.

18.2 The sub-channels<br />

Suppose that we divide our channel into K sub-channels, each sub-channel being a bandpass
channel (see figure 18.3). We assume that channel k has a passband ±[(k − 1)Δf, kΔf]
for k = 1, . . . , K. For the moment we assume that Δf and K are suitably chosen.

Assume that the transmitter power used in sub-channel k is P_k. The spectral density of the noise
is not constant within the passband of a sub-channel. Therefore we make a staircase approximation
of the spectral density (see figure 18.3). We assume that the spectral density of the noise over
the entire passband of sub-channel k is S_n(f_k), where f_k = (k − 1/2)Δf is the center frequency
of the passband. Now the capacity of sub-channel k is

C_k = Δf log₂(1 + P_k / (S_n(f_k) · 2Δf))   bit/second. (18.1)

This is a consequence of result 15.2. In the next section we investigate how we should distribute

the signal power over all the sub-channels.


Figure 18.4: K parallel channels with different noise powers.

18.3 Capacity of parallel channels, water-filling<br />

Consider K parallel channels (see figure 18.4). Each channel has the same bandwidth Δf.
We assume that the signal power used in channel k is P_k. The noise power in that channel is
N_k = S_n(f_k) · 2Δf. The total available signal power is P_s, hence we obtain the condition

∑_{k=1,…,K} P_k ≤ P_s. (18.2)

Our problem is to distribute the signal power P_s over all the sub-channels such that the total
capacity is maximal. In other words

C = max_{P_1,P_2,···,P_K} ∑_{k=1,…,K} Δf log₂(1 + P_k/N_k), (18.3)

where condition (18.2) should be satisfied. We can solve this maximization problem using the<br />

following result.<br />

RESULT 18.1 (Gallager [7], p. 87). Let f(α) be a convex-∩ function of α = (α_1, α_2, · · · , α_K)
over the region R of probability vectors α. Assume that the partial derivatives ∂f(α)/∂α_k
are defined and continuous over the region R, with the possible exception that lim_{α_k→0} ∂f(α)/∂α_k
may be +∞. Then the conditions

∂f(α)/∂α_k = λ   for all k such that α_k > 0, (18.4)
∂f(α)/∂α_k ≤ λ   for all k such that α_k = 0, (18.5)



are necessary and sufficient conditions on a probability vector α to maximize f (α) over the<br />

region R.<br />

To find the capacity of a collection of K parallel channels we take the function

f(α) = ∑_{k=1,…,K} Δf log₂(1 + α_k P_s / N_k), (18.6)

i.e. the total capacity C. This is a convex-∩ function of the probability vectors α. R is the region
of all these vectors. The partial derivatives of f(α) are

∂f(α)/∂α_k = (Δf / ln 2) · P_s / (N_k + α_k P_s). (18.7)

Result 18.1 now implies that the probability vector α achieves the maximum if and only if

N_k + α_k P_s = L   for all k such that α_k > 0,
N_k ≥ L             for all k such that α_k = 0, (18.8)

for some power level L. This suggests the following procedure:

RESULT 18.2 The procedure is called "water-filling" (see figure 18.5). Before the water-filling
procedure begins, each channel is filled with noise power only: channel k contains noise power
N_k. Then we start pouring signal power into the channels. The signal power starts filling the
channel with the smallest noise power first, i.e. channel 1 here. When the water level reaches the
second-smallest noise power level N_2 and we keep pouring, both channels 1 and 2 are being
filled, etc. Observe that depending on the total amount of signal power some channels will not
get signal power at all (channel 4 here).

Figure 18.5: Water-filling with four parallel channels; the noise powers N_1, . . . , N_4 and the
signal powers α_1 P_s, α_2 P_s, α_3 P_s fill the channels up to the common level L.

This procedure leads to the optimal distribution of signal power P k = α k P s over all the<br />

channels. From these powers the total capacity C can be computed.<br />
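A small numerical sketch of this water-filling procedure (illustrative Python; the noise powers and the total power below are made-up values) finds the level L by bisection and then evaluates (18.3).

```python
import numpy as np

def water_fill(noise_powers, total_power):
    """Return the per-channel signal powers P_k that maximize sum log2(1 + P_k/N_k)."""
    N = np.asarray(noise_powers, dtype=float)
    lo, hi = N.min(), N.max() + total_power
    for _ in range(100):                     # bisection on the water level L
        L = 0.5 * (lo + hi)
        P = np.maximum(L - N, 0.0)           # fill channels with N_k < L up to level L
        if P.sum() > total_power:
            hi = L
        else:
            lo = L
    return np.maximum(lo - N, 0.0)

N_k = np.array([1.0, 2.0, 3.0, 8.0])         # hypothetical noise powers N_1..N_4
P_s = 5.0                                    # hypothetical total signal power
P_k = water_fill(N_k, P_s)
delta_f = 1.0                                # bandwidth of each sub-channel (Hz)
C = delta_f * np.sum(np.log2(1.0 + P_k / N_k))
print(P_k, C)                                # channel 4 gets no power, as in figure 18.5
```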

18.4 Back to the non-white waveform channel again<br />

Remember that we have decomposed the non-white waveform channel into K sub-channels.
These sub-channels all have bandwidth Δf. We have seen how we should divide up the signal
power over these sub-channels (water-filling procedure, result 18.2).

Now, instead of specifying the signal power in each sub-channel, it is advantageous to consider
the power spectral density P_s(f_k) of the signal in each sub-channel k, hence

P_s(f_k) = P_k / (2Δf) = α_k P_s / (2Δf). (18.9)

Moreover note that we can express the noise power in sub-channel k as N_k = S_n(f_k) · 2Δf. Therefore
a signal power distribution {P_s(f_k), k = 1, . . . , K} achieves the maximum if and only if

S_n(f_k) + P_s(f_k) = L′   for all k such that P_s(f_k) > 0,
S_n(f_k) ≥ L′              for all k such that P_s(f_k) = 0, (18.10)

for some power level L′ such that

∑_{k=1,…,K} P_s(f_k) · 2Δf = P_s. (18.11)

Note that L′ relates to the power level L in the previous section as L′ · 2Δf = L.

Now we are ready to let the bandwidth of the sub-channels Δf ↓ 0 and the number of
sub-channels K → ∞. This leads to the main result in this chapter.

RESULT 18.3 A spectral density function P_s(f) of the signal power achieves the maximum if
and only if

S_n(f) + P_s(f) = L″   for all f such that P_s(f) > 0,
S_n(f) ≥ L″            for all f such that P_s(f) = 0, (18.12)

for some power level L″ such that

∫_{−∞}^{∞} P_s(f) df = P_s. (18.13)

This suggests a water-filling procedure in the spectral domain (see figure 18.6).


Figure 18.6: Water-filling in the spectral domain.

For the capacity of the non-white waveform channel we can now write

C = ∫_{f∈B} (1/2) log₂(1 + P_s(f)/S_n(f)) df
  = ∫_{f∈B} (1/2) log₂(L″/S_n(f)) df   bit/second, (18.14)

where

B = {f | P_s(f) > 0}. (18.15)

Note that L” is the power spectral density level that determines how the spectral power is distributed<br />

over the frequency spectrum.<br />
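The spectral form of water-filling can be approximated on a frequency grid. The sketch below is illustrative only: it uses the spectrum S_n(f) = |f| + 1 of exercise 1 in section 18.6 and an arbitrarily chosen power P_s, determines L″ by bisection, and evaluates the capacity integral (18.14) numerically.

```python
import numpy as np

f = np.linspace(-5.0, 5.0, 10001)           # frequency grid (Hz)
S_n = np.abs(f) + 1.0                       # example noise spectrum, as in exercise 1
P_s = 4.0                                   # hypothetical available signal power (Watt)

lo, hi = S_n.min(), S_n.max() + P_s
for _ in range(100):                        # bisection on the level L''
    L = 0.5 * (lo + hi)
    P_f = np.maximum(L - S_n, 0.0)          # P_s(f) = L'' - S_n(f) where positive, else 0
    if np.trapz(P_f, f) > P_s:
        hi = L
    else:
        lo = L

C = np.trapz(0.5 * np.log2(1.0 + P_f / S_n), f)   # equation (18.14), bit/second
print(L, C)                                 # here L'' = 3, band B = [-2, 2]
```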

18.5 Capacity of the linear filter channel<br />

Figure 18.7: Creating a flat channel changes the spectral density of the noise.

Suppose that the channel filters the input signal s(t) with a filter with impulse response h(t)<br />

and then adds (possibly non-white) Gaussian noise n(t) to the filter output s(t) ∗ h(t). Then,


at the receiver side, the overall channel can be made flat by filtering the channel output r(t) =
s(t) ∗ h(t) + n(t) with an equalizing filter with impulse response g(t) whose transfer function satisfies

G(f) H(f) = 1, (18.16)

thus g(t) is the inverse¹ filter to h(t). The effect of this filter is however that also the noise n(t)
is filtered by this equalizing filter, hence

r_o(t) = r(t) ∗ g(t)
       = (s(t) ∗ h(t) + n(t)) ∗ g(t)
       = s(t) ∗ h(t) ∗ g(t) + n(t) ∗ g(t) = s(t) + n(t) ∗ g(t). (18.17)

The power spectral density of the noise n_o(t) = n(t) ∗ g(t) in r_o(t) is now given by

S_{n_o}(f) = S_n(f) |G(f)|² = S_n(f) / |H(f)|². (18.18)

Note that from the previous sections we know how to determine the capacity of this channel now.<br />

18.6 Exercises<br />

1. Determine the capacity C of the channel with noise spectrum S n ( f ) = | f | + 1 depicted in<br />

figure 18.8 as a function of the available signal power P s .<br />

Figure 18.8: A noise spectrum S_n(f) = |f| + 1.

2. Consider a waveform channel with additive non-white Gaussian noise. The output waveform<br />

r(t) = s(t) + n(t) where s(t) is the input waveform and n(t) the noise.<br />

The power spectral density of the noise is given by S n ( f ) = exp(| f |) for −∞ < f < ∞<br />

(see figure 18.9). Determine the signal power P s as a function of the capacity C of this<br />

channel. As usual C is expressed in bit per second.<br />

(Exam July 4, 2001.)<br />

1 We assume here that this inverse filter is realizable. Therefore the theorem of reversibility 4.3 applies.


Figure 18.9: Power spectral density of the noise S_n(f) = exp(|f|).

3. Consider a waveform channel with additive non-white Gaussian noise. The output waveform<br />

r(t) = s(t) + n(t) where s(t) is the input waveform and n(t) the noise.<br />

The power spectral density of the noise (see figure 18.10) is given by<br />

S_n(f) = 1   for |f| ≤ 1,
         2   for 1 < |f| ≤ 2,
         4   for 2 < |f| ≤ 3,
         ∞   otherwise. (18.19)

Figure 18.10: Power spectral density of the noise S_n(f) in Watt/Hz as a function of the frequency f in Hz.

(Exam July 5, 2002.)


Part V<br />

Coded Modulation<br />



Chapter 19<br />

Coding for bandlimited channels<br />

SUMMARY: Modulation makes it possible to transmit sequences of symbols over a waveform<br />

channel. Here we will investigate coding, i.e. we will study which sequences of symbols<br />

should be used to achieve efficient transmission. We study only coding methods, i.e. techniques<br />

that create distance between sequences. Shaping techniques, i.e. methods that aim<br />

at achieving spherical signal structures will not be discussed.<br />

19.1 AWGN channel<br />

Consider transmission over an additive white Gaussian noise (AWGN) channel. For this channel<br />

Figure 19.1: Additive white Gaussian noise channel; the noise density is
p_N(n_i) = (1/√(πN_0)) exp(−n_i²/N_0).

the i-th channel output r i is equal to the i-th input s i to which noise n i is added, hence<br />

r i = s i + n i , for i = 1, 2, · · · . (19.1)<br />

We assume that the noise variables N i are independent and normally distributed, with mean 0<br />

and variance σ 2 = N 0 /2.<br />

This AWGN-channel models transmission over a waveform channel with additive white<br />

Gaussian noise with power density S Nw ( f ) = N 0 /2 for −∞ < f < ∞ as we have seen<br />

before (see section 6.8).<br />

Note that i is the index of the dimensions. If e.g. the waveform channel has bandwidth W ,<br />

we can get a new dimension every 1/(2W) seconds (see chapter 14). This determines the relation

between the actual time t and the dimension index i but we will ignore this in what follows.<br />


In this chapter we call a dimension a transmission. We assume that the energy “invested” in<br />

each transmission (dimension) is E N . It is our objective to investigate coding techniques for this<br />

channel. First we consider uncoded transmission.<br />

19.2 |A|-Pulse amplitude modulation<br />

In this section we will consider pulse amplitude modulation (PAM) again. In chapter 14 the modulation<br />

properties of this technique were discussed. Here we will investigate the error behavior<br />

of this method which can be regarded as uncoded transmission. We will use (uncoded) PAM as<br />

a reference for evaluating alternative signaling methods, methods that do involve coding.<br />

In the case of PAM, the channel input symbols s_i for i = 1, 2, · · · assume values in the
symbol alphabet (or signal structure or signal constellation)

A = {−|A| + 1, −|A| + 3, · · · , |A| − 1}, (19.2)

scaled by a factor d_0/2. We write

s_i = (d_0/2) a_i   where a_i ∈ A. (19.3)

We will here call a i an elementary PAM-symbol.<br />

The constellation A is called an |A|-PAM constellation. It contains |A| equidistant elementary<br />

symbols centered on the origin and having minimum Euclidean distance 2. Therefore d 0 is<br />

the minimum Euclidean distance between the symbols s i . We assume in this chapter that |A| is<br />

even. In figure 19.2 an |A|-PAM constellation is shown.<br />

Figure 19.2: An |A|-PAM constellation (|A| even) with the |A| elementary symbols
−|A| + 1, . . . , −5, −3, −1, 1, 3, 5, . . . , |A| − 1.

Each symbol value has a probability of occurrence of 1/|A|. Therefore the information rate<br />

per transmission is<br />

R N = log 2 |A| bits per transmission. (19.4)<br />

For the average symbol energy E_N per transmission we now have that

E_N = (d_0²/4) E_pam, (19.5)

where E_pam is the elementary energy of an |A|-PAM constellation, which is defined as

E_pam = (1/|A|) ∑_{a=−|A|+1,−|A|+3,··· ,|A|−1} a². (19.6)


RESULT 19.1 In appendix J it is shown that

E_pam = (|A|² − 1)/3. (19.7)

The result holds both for even and odd |A|.

It now follows from the above result that, since s_i = (d_0/2)a_i, the average symbol energy is

E_N = (d_0²/4) · (|A|² − 1)/3. (19.8)

Example 19.1 Consider the 4-PAM constellation

A = {−3, −1, +1, +3}. (19.9)

The rate corresponding to this constellation is R_N = log₂ 4 = 2 bits per transmission. For the elementary
energy per transmission we obtain

E_pam = (1/4)((−3)² + (−1)² + (+1)² + (+3)²) = (4² − 1)/3 = 5. (19.10)
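Result 19.1 is easy to verify numerically; a minimal sketch (illustrative only):

```python
def pam_energy(A_size):
    """Average energy of the elementary |A|-PAM symbols {-|A|+1, -|A|+3, ..., |A|-1}."""
    symbols = range(-A_size + 1, A_size, 2)
    return sum(a * a for a in symbols) / A_size

for A_size in (2, 3, 4, 8, 16):
    assert abs(pam_energy(A_size) - (A_size**2 - 1) / 3) < 1e-9   # equation (19.7)
print(pam_energy(4))   # 5.0, as in example 19.1
```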

Apart from the rate R_N and the average symbol energy E_N, a third important parameter is
the symbol error probability P_E^s. This error probability is Q((d_0/2)/σ) for the two outer signal points
and 2Q((d_0/2)/σ) for the inner signal points. Therefore

P_E^s = ((2(|A| − 2) + 2)/|A|) · Q((d_0/2)/σ) = (2(|A| − 1)/|A|) · Q((d_0/2)/σ). (19.11)

By (19.8) and the fact that σ² = N_0/2, the square of the argument of the Q-function can be
written as

d_0²/(4σ²) = 3/(|A|² − 1) · E_N/(N_0/2). (19.12)

Therefore we can express the symbol error probability as

P_E^s = (2(|A| − 1)/|A|) · Q(√( 3/(|A|² − 1) · E_N/(N_0/2) ))
      = (2(|A| − 1)/|A|) · Q(√( 3/(|A|² − 1) · S/N )), (19.13)

if we note that the signal-to-noise ratio S/N = E_N/(N_0/2), i.e. the average symbol energy
divided by the variance of the noise in a transmission.

19.3 Normalized S/N<br />

We now want to look more closely at signal-to-noise ratios. Therefore note that reliable transmission<br />

is only possible if the rate R N (in bits per transmission, dimension) is smaller than the


capacity C_N per dimension. If we write the capacity per transmission (see 12.19) as a function
of S/N = E_N/(N_0/2) we obtain

R_N < C_N = (1/2) log₂(1 + S/N). (19.14)

Rewriting this inequality in the reverse direction we obtain the following lower bound for
S/N as a function of the rate R_N:

S/N > 2^{2R_N} − 1 = S/N_min. (19.15)

Definition 19.1 This immediately suggests defining the normalized signal-to-noise ratio as

S/N_norm = S/N / S/N_min = S/N / (2^{2R_N} − 1). (19.16)

This definition allows us to compare coding methods with different rates with each other.<br />

Now it follows from (19.15) that for reliable transmission S/N norm > 1 must hold. Good<br />

signaling methods achieve small error probabilities for S/N norm close to 1 while lousy methods<br />

need a larger S/N norm to realize reliable transmission. It is the objective of the communication<br />

engineer to design systems that achieve acceptable error probabilities for a S/N norm as close as<br />

possible to 1.<br />

19.4 Baseline performance in terms of S/N norm<br />

In a previous section we have determined the symbol error probability of |A|-PAM as a function<br />

of S/N . This symbol error probability can be expressed in terms of S/N norm in a very simple<br />

way as we see soon.<br />

Since we are considering bandlimited transmission in this chapter, we may assume that
S/N ≫ 1. We are aiming at achieving

log₂ |A| = R_N ≈ C_N = (1/2) log₂(1 + S/N) (19.17)

or |A| ≈ √(1 + S/N), and hence we make the assumption that also |A| ≫ 1. Therefore we may
conclude for the symbol error probability for |A|-PAM that

P_E^s ≈ 2Q(√( 3/(|A|² − 1) · S/N ))
     = 2Q(√( 3/(2^{2R_N} − 1) · S/N )) = 2Q(√(3 · S/N_norm)) (19.18)

by definition 19.1 of the normalized signal-to-noise ratio. This symbol error probability for PAM<br />

is depicted in figure 19.3. It will serve as reference performance for the alternative methods that<br />

will be investigated soon. We call it therefore the baseline performance. Note once more that the<br />

baseline performance corresponds to uncoded PAM transmission.
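The baseline curve of figure 19.3 follows directly from (19.18). A short sketch (illustrative; Q is evaluated with scipy's normal survival function):

```python
import numpy as np
from scipy.stats import norm

def Q(x):
    """Tail of the standard normal distribution."""
    return norm.sf(x)

snr_norm_dB = np.arange(0.0, 10.0, 1.0)
snr_norm = 10.0 ** (snr_norm_dB / 10.0)
P_E = 2.0 * Q(np.sqrt(3.0 * snr_norm))        # baseline symbol error probability, eq. (19.18)
for d, p in zip(snr_norm_dB, P_E):
    print(f"{d:4.1f} dB   P_E^s = {p:.2e}")
# Around 9 dB the error probability drops to roughly 1e-6, as observed in section 19.5.
```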


Figure 19.3: Average symbol error probability P_E^s versus S/N_norm for uncoded |A|-PAM, under
the assumption that |A| is large. Horizontally S/N_norm in dB, vertically log₁₀ P_E^s.

19.5 The promise of coding<br />

It can be observed from figure 19.3 that a S/N norm of approximately 9 dB is needed to achieve<br />

a reliable performance when PAM is used. It is more or less common practice to assume that<br />

P_E^s ≈ 10⁻⁶ corresponds to reliable transmission.

Shannon’s results imply that reliable transmission is possible for S/N norm ≈ 1. Therefore<br />

for |A|-PAM the S/N norm is roughly 9 dB, a factor of eight, larger than necessary. There exist<br />

methods that perform up to 9 dB better than our baseline method, i.e. up to 9 dB better than<br />

uncoded PAM.<br />

Why do we prefer small signal-to-noise ratios? In a radio transmission system a smaller S/N<br />

gives the opportunity to use less transmitter power or to apply simpler and/or smaller antennas.<br />

Using less transmitter power makes smaller batteries possible, results in a smaller bill from the<br />

electricity distribution company, or makes dissipation in circuits lower, etc. In line transmission<br />

using smaller transmitter power is also advantageous for the same reasons. Moreover we have<br />

the effect of cross-talk between different users. We can not combat the noise that results from<br />

this cross-talk by increasing the transmitter power. In this case it is better to achieve a better error<br />

performance given the S/N that occurs there.



19.6 Union bound<br />

19.6.1 Sequence error probability<br />

In what follows we will investigate several coding methods and evaluate their performance. Since<br />

we want to compare the performance of the proposed codes with the baseline performance it<br />

would be nice to know the symbol error probabilities of the proposed codes. But since it is not<br />

clear yet what this means, we consider the (sequence) error probability P E of a coding method<br />

first. The union bound can be used to give a good estimate of this error probability. This bound<br />

is based on evaluating the pair-wise error probabilities between pairs of coded sequences.

On the AWGN channel the received sequence r = s + n is the sum of the transmitted coded<br />

sequence (signal) s and the noise vector n consisting of N independent components each having<br />

mean zero and variance σ 2 = N 0 /2. Now consider two coded sequences s ∈ S and s ′ ∈ S<br />

where S is the set of sequences (signals). They differ by the Euclidean distance ‖s − s ′ ‖. If the<br />

sequence s is transmitted, the probability that the received vector r = s + n will be closer to s ′<br />

than to s is given by<br />

Q( (‖s − s′‖/2) / σ ). (19.19)

The probability that r will be closer to s′ than to s for any s′ ≠ s in S, which is precisely the
probability of error P_E(s) for minimum-distance decoding when s is sent, is therefore upper-bounded
by

P_E(s) ≤ ∑_{s′≠s} Q( (‖s − s′‖/2) / σ ). (19.20)

The error probability over all coded sequences s is now upper-bounded by

P_E = ∑_{s∈S} (1/|S|) P_E(s) ≤ ∑_{s∈S} (1/|S|) ∑_{s′≠s} Q( (‖s − s′‖/2) / σ ) = ∑_d K_d Q( (d/2) / σ ), (19.21)

where

K_d = ∑_{s∈S} (# of s′ such that ‖s − s′‖ = d) / |S|, (19.22)

i.e. the average number of signals at distance d from a coded sequence. Inequality (19.21) is<br />

referred to as the union bound for the sequence error probability.<br />

The function Q(x) decays exponentially, not slower than exp(−x²/2). This follows from
appendix A. Therefore, when K_d does not increase too rapidly with d, for small σ the union
bound is dominated by its first term, which is called the union bound estimate:

P_E ≈ K_min Q( (d_min/2) / σ ), (19.23)

where d min is the minimum Euclidean distance between any two sequences in S and K min the<br />

average number of signals at distance d min from a coded sequence. K min is called the error<br />

coefficient.
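For a small signal set, d_min, K_min and the union bound estimate (19.23) can be computed by brute force. A minimal sketch (the four-signal set below is a made-up toy example):

```python
import numpy as np
from scipy.stats import norm

def union_bound_estimate(signals, sigma):
    """Return d_min, the error coefficient K_min, and the estimate (19.23) for a signal set."""
    S = np.asarray(signals, dtype=float)
    dists = np.linalg.norm(S[:, None, :] - S[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)
    d_min = dists.min()
    K_min = np.isclose(dists, d_min).sum(axis=1).mean()   # average number of nearest neighbors
    return d_min, K_min, K_min * norm.sf(d_min / 2.0 / sigma)

signals = [[1, 0], [-1, 0], [0, 1], [0, -1]]   # hypothetical biorthogonal signals in 2 dimensions
print(union_bound_estimate(signals, sigma=0.3))
```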



19.6.2 Symbol error probability<br />

To compare a coded system with the uncoded baseline system, we would like to know, instead<br />

of the sequence error probability P E for this coded system, the symbol error probability P s E .<br />

A first problem now appears: What are the symbols? Usually in a coded system a bunch of<br />

bits is transmitted in a single sequence during N transmissions. In the baseline system however,<br />

a symbol is the amount of information that is sent in a single transmission. Therefore in the<br />

coded case we assume also that in each transmission a symbol is sent. Such an abstract symbol<br />

corresponds to a fraction 1/N of the information bits. Thus the coded sequence contains N of<br />

these symbols.<br />

Can we now express the sequence error probability P_E in terms of the error probability P̃_E^s
for these symbols¹, and vice versa? Now we get a second problem: How do these symbol errors
depend on each other? To make life easy we assume that the N symbol errors are independent
of each other. Now we easily obtain

P_E = 1 − (1 − P̃_E^s)^N ≈ N P̃_E^s (19.24)

or

P̃_E^s ≈ P_E / N. (19.25)

If we now use the union bound estimate (19.23) obtained in the previous subsection we get

P̃_E^s ≈ P_E/N ≈ (K_min/N) Q( (d_min/2) / σ ). (19.26)

The ratio K min /N is called the normalized error coefficient (see [6], p. 2399). The symbol error<br />

probability P̃_E^s, which is also referred to as sequence error probability per transmission, can now

be used for comparison with the uncoded PAM case.<br />

19.7 Extended Hamming codes<br />

To construct sets of coded sequences we can make use of a binary error-correcting code. Such a<br />

code C is a set of |C| codewords {c 1 , c 2 , · · · , c |C| }. Each codeword c is a sequence consisting of<br />

N symbols c 1 , c 2 , · · · , c N taking values in the alphabet {0, 1}.<br />

A code can be linear. In that case all codewords are linear combinations of K independent<br />

codewords 2 . Such a linear code contains 2 K codewords.<br />

Definition 19.2 The minimum Hamming distance of a code is defined as

d_{H,min} = min_{c≠c′} d_H(c, c′), (19.27)

where d H (c, c ′ ) is the Hamming distance between two codewords c and c ′ , i.e. the number of<br />

positions at which they differ.<br />

1 We write ˜P E s with a tilde since we are not dealing with concrete symbols.<br />

2 The sum c of two codewords c ′ and c ′′ is the component-wise mod 2 sum of both codewords. A codeword c<br />

times a scalar is the codeword c itself if the scalar is 1 or the all-zero codeword 0, 0, · · · , 0 if the scalar is 0.


So-called extended Hamming codes exist and have parameters

(N, K, d_{H,min}) = (2^q, 2^q − q − 1, 4),   for q = 2, 3, · · · . (19.28)

We will use these codes in the next sections of this chapter, just because it is convenient to do so.<br />

Example 19.2 For q = 3 we obtain a code with parameters (N, K, d H,min ) = (8, 4, 4). Indeed this code<br />

can be constructed from the following K = 4 independent codewords of length N = 8:<br />

c_1 = (1, 0, 0, 0, 0, 1, 1, 1),
c_2 = (0, 1, 0, 0, 1, 0, 1, 1),
c_3 = (0, 0, 1, 0, 1, 1, 0, 1),
c_4 = (0, 0, 0, 1, 1, 1, 1, 0). (19.29)

The code consists of the 16 linear combinations of these 4 codewords. Why is the minimum Hamming<br />

distance d H,min of this code 4? To see this, note that the difference of two codewords c ̸= c ′ is a non-zero<br />

codeword itself. The Hamming distance between two codewords is the number of ones i.e. the Hamming<br />

weight of this difference. The minimal Hamming weight of a nonzero codeword is at least 4. This can be<br />

checked easily.<br />
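The check can indeed be done easily, for instance with the following sketch (illustrative Python) that generates all 16 codewords from (19.29) and determines the minimum Hamming weight:

```python
import numpy as np
from itertools import product

# The four independent codewords of (19.29); the (8,4,4) code is their linear span.
G = np.array([[1, 0, 0, 0, 0, 1, 1, 1],
              [0, 1, 0, 0, 1, 0, 1, 1],
              [0, 0, 1, 0, 1, 1, 0, 1],
              [0, 0, 0, 1, 1, 1, 1, 0]])

code = {tuple(np.dot(b, G) % 2) for b in product([0, 1], repeat=4)}
min_weight = min(sum(c) for c in code if any(c))
print(len(code), min_weight)   # 16 codewords, minimum Hamming weight (= d_H,min) 4
```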

19.8 Coding by set-partitioning<br />

19.8.1 Construction of the signal set<br />

The idea of “mapping by set-partitioning” comes from Ungerboeck [22]. He proposed to partition<br />

the constellation A into subsets A 0 and A 1 in such a way that the minimum distance of<br />

the elementary symbols in the subsets becomes larger than that of the elementary symbols in<br />

the original constellation. E.g. an 8-PAM constellation A can be partitioned into a subset A 0<br />

Figure 19.4: Partitioning of the 8-PAM set A = {−7, −5, −3, −1, 1, 3, 5, 7} into the two subsets
A_0 = {−7, −3, 1, 5} and A_1 = {−5, −1, 3, 7}, each containing four elementary symbols and
having a minimum within-subset distance twice as large as that of the original PAM set.

and a subset A 1 both containing four elementary symbols, and such that the minimum distance<br />

between two elementary symbols in A 0 or two symbols in A 1 , i.e. the minimum within-subset



distance, is 4, which is twice as large as the minimum distance between two elementary symbols<br />

in A, which is 2 (see figure 19.4). Note that the minimum distance between any symbol from A 0<br />

and any symbol from A 1 is still 2.<br />

How can set-partitioning lead to more reliable transmission? To see this, we consider the

example-encoder shown in figure 19.5. This encoder transforms 20 binary information digits<br />

b 1 , b 2 , · · · , b 20 into 8 elementary 8-PAM symbols a 1 , a 2 , · · · , a 8 . The digits b 1 , b 2 , b 3 , b 4 are<br />

Figure 19.5: An encoder that maps by set-partitioning.

transformed into a codeword c 1 , c 2 , · · · , c 8 from an (8,4,4) extended Hamming code. Digit c i<br />

together with two information digits b 2i+3 and b 2i+4 are used to form the PAM symbol a i . Therefore<br />

the following table can be used. The table lists a i for all combinations of c i and b 2i+3 , b 2i+4 .<br />

c_i    b_{2i+3} b_{2i+4} = 00    01    11    10
0                         −7    −3    +1    +5
1                         −5    −1    +3    +7

Note that according to this table essentially one out of the four elementary symbols from the set<br />

A_0 is chosen if c_i = 0, or one out of the four elementary symbols from A_1 if c_i = 1. The binary

digits b 2i+3 and b 2i+4 determine which symbol is picked.<br />
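The complete encoder of figure 19.5 then amounts to one codeword computation and eight table look-ups. A minimal sketch (illustrative; it reuses the generator of example 19.2, and the bit pattern in the last line is arbitrary):

```python
import numpy as np

G = np.array([[1, 0, 0, 0, 0, 1, 1, 1],      # generator of the (8,4,4) extended Hamming code
              [0, 1, 0, 0, 1, 0, 1, 1],
              [0, 0, 1, 0, 1, 1, 0, 1],
              [0, 0, 0, 1, 1, 1, 1, 0]])
A0 = {(0, 0): -7, (0, 1): -3, (1, 1): +1, (1, 0): +5}   # subset used when c_i = 0
A1 = {(0, 0): -5, (0, 1): -1, (1, 1): +3, (1, 0): +7}   # subset used when c_i = 1

def encode(bits):
    """Map 20 information bits to 8 elementary 8-PAM symbols, as in figure 19.5."""
    b = list(bits)
    c = np.dot(b[:4], G) % 2                             # b1..b4 -> codeword c1..c8
    return [(A1 if c[i] else A0)[(b[4 + 2*i], b[5 + 2*i])] for i in range(8)]

print(encode([1, 0, 1, 1] + [0, 1] * 8))
```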

19.8.2 Distance between the signals<br />

How about the minimum distance d min between the signals s = s 1 , s 2 , · · · , s 8 that is achieved in<br />

this way? Note that there are 2 20 such signals. Just like before each signal (vector)<br />

s = (d 0 /2)a, (19.30)<br />

thus each signal s is a scaled version of an elementary signal (vector) a = a 1 , a 2 , · · · , a 8 . To determine<br />

the minimum Euclidean distance between two signals s ̸= s ′ we determine the minimum<br />

distance between two elementary signals a ̸= a ′ first. We consider two cases.



1. First consider all 2 16 elementary sequences a that result from a fixed codeword c from<br />

the extended Hamming code. For such a sequence each component a i ∈ A ci for i =<br />

1, 2, · · · , 8. The within-subset distance in subsets A 0 and A 1 is 4. Therefore the minimum<br />

distance between two elementary sequences a and a ′ resulting from the same codeword c<br />

is 4.<br />

2. Next consider two elementary sequences a and a ′ that correspond to two different codewords<br />

c and c ′ from the extended Hamming code. Then there are at least d H,min = 4<br />

positions i ∈ {1, 2, · · · , 8} at which c_i ≠ c_i′. Note that at these positions the elementary
symbol a_i ∈ A_{c_i} while the symbol a_i′ ∈ A_{c_i′}. The minimum distance between any symbol
from A_0 and any symbol from A_1 is 2, and thus at these positions |a_i − a_i′| ≥ 2 and hence

‖a − a′‖² = ∑_{i=1,…,8} (a_i − a_i′)² ≥ 4 · 4 = 16. (19.31)

Therefore the distance between two elementary sequences a and a ′ that correspond to two<br />

different codewords c and c ′ is at least 4.<br />

We conclude that the Euclidean distance between two elementary sequences a and a ′ is at least<br />

4. Therefore the Euclidean distance between two sequences s and s ′ is at least 4(d 0 /2) = 2d 0 .<br />

Hence d min = 2d 0 , which is a factor of two larger than in the uncoded case.<br />

19.8.3 Asymptotic coding gain<br />

We have seen in the previous section that the minimum distance between any two coded sequences<br />

for the Ungerboeck coding method is d_min = 2d_0. In uncoded PAM, our baseline method,

the distance between any two symbols was at least d 0 . We may conclude that by using the idea<br />

of Ungerboeck the minimum Euclidean distance has increased by a factor of two.<br />

Let us now compute the sequence error probability for our Ungerboeck coding technique. Is it<br />

indeed smaller than the baseline behavior for the symbol error probability for the uncoded PAM<br />

case that was described by (19.18)? To answer this, we first note that by the union bound estimate

(19.23) we can write for our coding method that<br />

P_E ≈ K_min Q( (d_min/2) / σ ). (19.32)

Now we express P_E in terms of E_N by noting that again

E_N = (d_0²/4) · (|A|² − 1)/3, (19.33)

where it is important to note that although we use coding, the PAM symbols remain equiprobable.<br />

Moreover<br />

d_min²/(4σ²) = (d_min²/d_0²) · d_0²/(4σ²) = (d_min²/d_0²) · 3/(|A|² − 1) · E_N/(N_0/2)
             = (d_min²/d_0²) · 3/(|A|² − 1) · S/N. (19.34)


Therefore

P_E = K_min Q(√( (d_min²/d_0²) · 3/(|A|² − 1) · S/N ))
    = K_min Q(√( (d_min²/d_0²) · (2^{2R_N} − 1)/(|A|² − 1) · 3 S/N_norm )), (19.35)

where R_N is the rate of the signal set S, i.e.

R_N = log₂|S| / N = K/N + log₂(|A|/2)
    = (2^q − q − 1)/2^q + log₂|A| − 1 = log₂|A| − (q + 1)/2^q. (19.36)

Note that the rate R N approaches log 2 |A| for q → ∞, i.e. if we increase the complexity of the<br />

extended Hamming code.<br />

The argument of the Q(√·)-function in terms of S/N_norm was 3 S/N_norm in the uncoded
PAM case. If we use Ungerboeck coding as we described here, it is
(d_min²/d_0²) · (2^{2R_N} − 1)/(|A|² − 1) · 3 S/N_norm, hence we have achieved an increase of the
argument by a factor G_cod^asymp of

G_cod^asymp = (d_min²/d_0²) · (2^{2R_N} − 1)/(|A|² − 1) = 4 · (2^{2R_N} − 1)/(|A|² − 1). (19.37)

This factor is called the asymptotic (or nominal) coding gain of our coding system with respect<br />

to the baseline performance.<br />

The factor d_min²/d_0² = 4 corresponds to the increase of the squared Euclidean distance
between the sequences. This factor is referred to as distance gain. This increase of the distance,
however, was achieved because we have introduced redundancy by using an error-correcting code.
The factor (2^{2R_N} − 1)/(|A|² − 1) that corresponds to this loss is therefore called redundancy

loss.<br />

Example 19.3 The encoder shown in figure 19.5 is based on the (8, 4, 4) extended Hamming code. The
rate of this extended Hamming code, corresponding to q = 3, is

R_ext.Hamm. = K/N = (2^q − q − 1)/2^q = (2³ − 3 − 1)/2³ = 1/2. (19.38)

Therefore the total rate R_N = R_ext.Hamm. + 2 = 5/2 bits/transmission. The additional 2 bits come from the
fact that the subsets A_0 and A_1 each contain four symbols.

The distance gain of this method is

d_min²/d_0² = 4, (19.39)

which is 10 log₁₀ 4 = 6.02 dB. The redundancy loss however is

(2^{2R_N} − 1)/(|A|² − 1) = (2^{2·5/2} − 1)/(8² − 1) = 31/63, (19.40)

which is 10 log₁₀(31/63) = −3.08 dB. Therefore the asymptotic coding gain for this code is

G_cod^asymp = 10 log₁₀(4 · 31/63) = 6.02 dB − 3.08 dB = 2.94 dB. (19.41)



19.8.4 Effective coding gain<br />

In the previous subsection we have considered only the asymptotic coding gain, i.e. we have<br />

only concentrated on the argument of the Q-function (in the expressions of both P E and ˜P s E )<br />

and ignored the coefficient of it. How large are these coefficients? It can be shown that for the<br />

extended Hamming code with parameter q<br />

K_min(q) = [ (2^q choose 4) + (2^q − 1) · (2^{q−1} choose 2) ] / 2^q (19.42)

is the number of codewords at Hamming distance 4 from a codeword. We have already seen that<br />

the minimum Euclidean distance of our Ungerboeck code is 2d 0 . How many sequences have<br />

distance 2d_0 to a given sequence, in other words how large is our K_min? To see this, suppose that
a sequence s results from the extended Hamming codeword c. Note first that there are at most 2N
signals s′ that result from the same codeword c and have distance 2d_0 from s. Moreover for each
extended Hamming codeword c′ at Hamming distance 4 from c there are at most 2⁴ sequences s′
having distance 2d_0 from s. Therefore in total

K_min ≤ 2N + 2⁴ K_min(q). (19.43)

Note that the normalized error coefficient is K min /N. This normalized error coefficient<br />

K min /N is larger than the coefficient 2 in the uncoded case. What is the effect of this larger<br />

coefficient in dB? In other words, how much do we have to increase the S/N norm , to compensate<br />

for the larger K min /N? Note that we are interested in achieving a symbol error probability<br />

of 10 −6 . To achieve the baseline performance the argument α of Q( √·) has to satisfy<br />

2Q(√α) = 10⁻⁶, (19.44)

in other words α = 23.928. For an alternative coding method with normalized error coefficient
K_min/N we need an argument β that satisfies

(K_min/N) Q(√β) = 10⁻⁶, (19.45)

or an S/N_norm that is a factor β/α larger, to get the same performance. Therefore the coding loss
L that we have to take into account is

L = 10 log₁₀(β/α). (19.46)

In practice it turns out that if the normalized error coefficient K min /N is increased by a factor of<br />

two, we have to accept an additional loss of roughly 0.2dB (at error rate 10 −6 ).<br />

The resulting coding gain, called the effective coding gain, is the difference of the asymptotic<br />

coding gain and the loss, hence<br />

G_cod^eff = G_cod^asymp − L = G_cod^asymp − 10 log₁₀(β/α). (19.47)


Figure 19.6: Rate R_N (bits/transmission), asymptotic coding gain (in dB), normalized error coefficient
K_min/N, and effective coding gain (in dB) versus q.

Example 19.4 Again consider the encoder shown in figure 19.5 which is based on the (8, 4, 4) extended
Hamming code. We can show that K_min(q), i.e. the number of "nearest neighbors" in the extended Hamming
code, for q = 3 is

K_min(3) = [ (2³ choose 4) + (2³ − 1) · (2^{3−1} choose 2) ] / 2³ = 14. (19.48)

This results in the following bound for the error coefficient in the set S of coded sequences:

K_min ≤ 2N + 2⁴ K_min(3) = 2 · 8 + 16 · 14 = 240. (19.49)

The normalized error coefficient K_min/N is therefore at most 240/8 = 30. Now we have to solve

(K_min/N) Q(√β) = 30 · Q(√β) = 10⁻⁶. (19.50)

It turns out that β = 29.159, and thus the loss is 10 log₁₀(29.159/23.928) = 0.86 dB. Therefore the
effective coding gain G_cod^eff = G_cod^asymp − 0.86 dB = 2.94 dB − 0.86 dB = 2.08 dB.
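The numbers in this example, and the table of the next subsection, can be reproduced with a short script (an illustrative sketch; α and β are found numerically from (19.44) and (19.45)):

```python
from math import comb, log2, log10, sqrt
from scipy.stats import norm
from scipy.optimize import brentq

def coding_gains(q, A_size=8, target=1e-6):
    """Rate, asymptotic/effective coding gain of the Ungerboeck/extended-Hamming scheme."""
    N = 2 ** q
    R_N = log2(A_size) - (q + 1) / N                                     # eq. (19.36)
    G_asymp = 10 * log10(4 * (2 ** (2 * R_N) - 1) / (A_size ** 2 - 1))   # eq. (19.37)
    K_min_code = (comb(N, 4) + (N - 1) * comb(N // 2, 2)) / N            # eq. (19.42)
    K_norm = (2 * N + 16 * K_min_code) / N                               # normalized error coefficient
    alpha = brentq(lambda a: 2 * norm.sf(sqrt(a)) - target, 1, 100)
    beta = brentq(lambda b: K_norm * norm.sf(sqrt(b)) - target, 1, 100)
    loss = 10 * log10(beta / alpha)                                      # eq. (19.46)
    return R_N, G_asymp, K_norm, loss, G_asymp - loss

for q in range(2, 7):
    print(q, ["%.2f" % v for v in coding_gains(q)])
```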

19.8.5 Coding gains for more complex codes<br />

For the values 2, 3, · · · , 6 of the parameter q that determines the extended Hamming code we<br />

have computed rates, normalized error coefficients, and asymptotic and effective coding gains of<br />

our Ungerboeck codes. These data can be found in the table below.


q    N    R_N      G_cod^asymp    K_min/N    L (dB)    G_cod^eff
2    4    2.250    1.38 dB        6          0.37      1.01 dB
3    8    2.500    2.94 dB        30         0.86      2.08 dB
4    16   2.687    4.10 dB        142        1.29      2.81 dB
5    32   2.813    4.87 dB        622        1.66      3.21 dB
6    64   2.891    5.35 dB        2606       1.99      3.36 dB

The results from the table are also shown in figure 19.6. It can be observed that the rate R N of<br />

our code tends to 3 bits/transmission if q increases. Moreover we see that the asymptotic coding<br />

gain approaches 6dB. The effective coding gain is considerably smaller however. The reason for<br />

this is that the normalized error coefficient increases very quickly with q.<br />

19.9 Remark<br />

What we did not do is the following. To keep things simple we have only discussed one-dimensional
signal structures. Ungerboeck's method of set-partitioning also applies to two- or
more-dimensional structures.

19.10 Exercises<br />

1. Consider instead of the extended Hamming code a so-called single-parity-check code of<br />

length 4. This code contains 8 codewords and has distance d H,min = 2. A codeword<br />

now has four binary components c 1 , c 2 , c 3 , c 4 and is determined by three information bits<br />

b 1 , b 2 , b 3 .<br />

Figure 19.7: Partitioning a 16-QAM constellation into two subsets A_0 and A_1.

As before we use set-partitioning, but now we partition a 16-QAM constellation into two<br />

subsets A 0 and A 1 (see figure). Note that the signals in these sets are two-dimensional.



Each component c i , i = 1, 2, 3, 4 of the chosen codeword c determines the used subset<br />

A ci , i.e. if c i = 0 then A 0 is used, if c i = 1 we use A 1 . Three binary digits<br />

b 3i+1 , b 3i+2 , b 3i+3 then determine which one of the (two-dimensional) signal points<br />

a 2i−1 , a 2i from the subset A ci is transmitted. Note that N = 8, since four two-dimensional<br />

signals are transmitted.<br />

As before the elementary signal a = a 1 , a 2 , · · · , a 8 is multiplied by d 0 /2 to get the actual<br />

signal s.<br />

(a) What is the minimum Euclidean distance d min now? What is the rate R N of this code<br />

in bits per transmission? Give the asymptotic coding gain of this code (relative to the<br />

baseline performance).


Part VI<br />

Appendices<br />



Appendix A<br />

An upper bound for the Q-function<br />

Note that the Q-function is defined as<br />

Q(x) = ∫_x^∞ (1/√(2π)) exp(−α²/2) dα. (A.1)

The lemma below gives a useful upper bound for Q(x). In figure A.1 the Q-function and the<br />

derived upper bound are plotted.<br />

LEMMA A.1 For x ≥ 0 the Q-function is bounded as

Q(x) ≤ (1/2) exp(−x²/2). (A.2)

Proof: Note that for α ≥ x, and since x ≥ 0,

α² = (α − x)² + 2αx − x²
   ≥ (α − x)² + 2x² − x²
   = (α − x)² + x². (A.3)

Therefore

Q(x) = ∫_x^∞ (1/√(2π)) exp(−α²/2) dα
     ≤ ∫_x^∞ (1/√(2π)) exp(−((α − x)² + x²)/2) dα
     = exp(−x²/2) ∫_x^∞ (1/√(2π)) exp(−(α − x)²/2) dα
     = (1/2) exp(−x²/2). (A.4)                                        ✷


Figure A.1: The Q-function and the derived upper bound.
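A quick numerical check of lemma A.1 (illustrative only; Q is evaluated via the normal survival function):

```python
import numpy as np
from scipy.stats import norm

x = np.linspace(0.0, 5.0, 6)
Q = norm.sf(x)                          # the Q-function
bound = 0.5 * np.exp(-x**2 / 2.0)       # upper bound of lemma A.1
assert np.all(Q <= bound)
for xi, q, b in zip(x, Q, bound):
    print(f"x = {xi:.0f}:  Q(x) = {q:.3e}  <=  bound = {b:.3e}")
```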


Appendix B<br />

The Fourier transform<br />

B.1 Definition<br />

THEOREM B.1 If the signal x(t) satisfies the Dirichlet conditions, that is,

1. The signal x(t) is absolutely integrable on the real line, i.e. ∫_{−∞}^{∞} |x(t)| dt < ∞.

2. The number of maxima and minima of x(t) in any finite interval on the real line is finite.

3. The number of discontinuities of x(t) in any finite interval on the real line is finite.

4. When x(t) is discontinuous at t then x(t) = (x(t⁺) + x(t⁻))/2,

then the Fourier transform (or Fourier integral) of x(t), defined by

X(f) = ∫_{−∞}^{∞} x(t) exp(−j2πft) dt, (B.1)

exists and the original signal can be obtained from its Fourier transform by

x(t) = ∫_{−∞}^{∞} X(f) exp(j2πft) df. (B.2)

B.2 Some properties<br />

B.2.1 Parseval’s relation<br />

RESULT B.2 If the Fourier transforms of the signals f (t) and g(t) are denoted by F( f ) and<br />

G( f ) respectively, then<br />

∫_{−∞}^{∞} f(t) g(t) dt = ∫_{−∞}^{∞} F(f) G^*(f) df. (B.3)

184


Proof: Note that for a real signal g(t) the Fourier transform G(f) satisfies G(-f) = G^*(f). Then

    \int_{-\infty}^{\infty} f(t) g(t) dt
      = \int_{-\infty}^{\infty} \left[ \int_{-\infty}^{\infty} F(f) \exp(j2\pi f t) df \right] \left[ \int_{-\infty}^{\infty} G(f) \exp(j2\pi f t) df \right] dt
      = \int_{-\infty}^{\infty} \left[ \int_{-\infty}^{\infty} F(f) \exp(j2\pi f t) df \right] \left[ \int_{-\infty}^{\infty} G^*(f') \exp(-j2\pi f' t) df' \right] dt
      = \int_{-\infty}^{\infty} F(f) \left[ \int_{-\infty}^{\infty} G^*(f') \left[ \int_{-\infty}^{\infty} \exp(j2\pi t (f - f')) dt \right] df' \right] df.    (B.4)

We know that

    \int_{-\infty}^{\infty} \exp(j2\pi t (f - f')) dt = \delta(f - f'),    (B.5)

therefore

    \int_{-\infty}^{\infty} f(t) g(t) dt = \int_{-\infty}^{\infty} F(f) \left[ \int_{-\infty}^{\infty} G^*(f') \delta(f - f') df' \right] df = \int_{-\infty}^{\infty} F(f) G^*(f) df.    (B.6)
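Parseval's relation can also be checked numerically once the integrals are replaced by sums over samples. The fragment below is a sketch: the two test signals are arbitrary assumptions, the Fourier integrals are approximated with numpy's FFT using the scaling stated in the comments, and the two sides of (B.3) then agree up to discretization error.

```python
# Discrete sanity check of Parseval's relation (B.3).
# With F = fft(f)*dt and df = 1/(N*dt), sum(F*conj(G))*df approximates the
# frequency-domain integral of F(f)G*(f).
import numpy as np

dt = 1e-3
t = np.arange(0.0, 1.0, dt)
f_sig = np.exp(-((t - 0.5) ** 2) / 0.01)            # a smooth pulse f(t)
g_sig = np.cos(2 * np.pi * 5 * t) * np.exp(-t)      # a damped cosine g(t)

time_side = np.sum(f_sig * g_sig) * dt

F = np.fft.fft(f_sig) * dt
G = np.fft.fft(g_sig) * dt
df = 1.0 / (len(t) * dt)
freq_side = np.real(np.sum(F * np.conj(G)) * df)

print(time_side, freq_side)   # the two values agree up to discretization error
```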


Appendix C

Impulse signal, filters

C.1 The impulse or delta signal

In the mathematical sense (see e.g. [17]) the delta signal δ(t) is not a function but a distribution or a generalized function. A distribution is defined in terms of its effect on another function (usually called a "test function"). The impulse distribution can be defined by its effect on the test function f(t), which is supposed to be continuous at the origin, by the relation

    \int_{-\infty}^{\infty} f(t) \delta(t) dt = f(0).    (C.1)

We call this property the sifting property of the impulse signal. Note that we defined the impulse signal δ(t) by describing its action on the test function f(t) and not by specifying its value for different values of t.

C.2 Linear time-invariant systems, filters

Definition C.1 A system L is linear if and only if for any two legitimate input signals x_1(t) and x_2(t) and any two scalars α_1 and α_2, the linear combination α_1 x_1(t) + α_2 x_2(t) is again a legitimate input, and

    L[\alpha_1 x_1(t) + \alpha_2 x_2(t)] = \alpha_1 L[x_1(t)] + \alpha_2 L[x_2(t)].    (C.2)

A system that does not satisfy this relation is nonlinear.

Definition C.2 The impulse response h(t) of a system L is the response of the system to an impulse input δ(t), thus

    h(t) = L[\delta(t)].    (C.3)

The response of a time-invariant system to a unit impulse applied at time τ, i.e. δ(t - τ), is obviously h(t - τ).

Now we can determine the response of a linear time-invariant system L to an input signal x(t) as follows:

    y(t) = L[x(t)] = L\left[ \int_{-\infty}^{\infty} x(\tau) \delta(t - \tau) d\tau \right] = \int_{-\infty}^{\infty} x(\tau) L[\delta(t - \tau)] d\tau = \int_{-\infty}^{\infty} x(\tau) h(t - \tau) d\tau.    (C.4)

The final integral is called the convolution of the signal x(t) and the impulse response h(t). A linear time-invariant system is often called a filter.
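In discrete time the convolution integral (C.4) becomes a sum, which numpy.convolve evaluates directly. The sketch below uses an illustrative rectangular input and an assumed exponentially decaying impulse response; neither appears in the text.

```python
import numpy as np

dt = 0.01
t = np.arange(0.0, 2.0, dt)
x = (t < 0.5).astype(float)           # a rectangular input pulse x(t)
h = np.exp(-5.0 * t)                  # an assumed impulse response h(t)

# discrete approximation of y(t) = integral of x(tau) h(t - tau) dtau
y = np.convolve(x, h)[: len(t)] * dt

print(y.max())                        # the output rises during the pulse and then decays
```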


Appendix D

Correlation functions, power spectra

We will investigate here what the influence is of a filter with impulse response h(t) on the spectrum S_x(f) of the input random process X(t). Then we study the consequences of our findings. Consider figure D.1. The filter produces the random process Y(t) at its output while the input process is X(t).

Figure D.1: Filtering the noise process X(t): the input x(t) passes through the filter h(t), which produces the output y(t).

D.1 Expectation of an integral

For the expected value of the random process Y(t) we can write¹

    m_y(t) = E[Y(t)] = E\left[ \int_{-\infty}^{\infty} X(\alpha) h(t - \alpha) d\alpha \right] = \int_{-\infty}^{\infty} E[X(\alpha)] h(t - \alpha) d\alpha.    (D.1)

The autocorrelation function of the random process is

    R_y(t, s) = E[Y(t)Y(s)] = E\left[ \int_{-\infty}^{\infty} X(\alpha) h(t - \alpha) d\alpha \int_{-\infty}^{\infty} X(\beta) h(s - \beta) d\beta \right]
      = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} E[X(\alpha) X(\beta)] h(t - \alpha) h(s - \beta) d\alpha d\beta
      = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} R_x(\alpha, \beta) h(t - \alpha) h(s - \beta) d\alpha d\beta.    (D.2)

¹ Note that we do not assume (yet) that the process X(t) is Gaussian.


D.2 Power spectrum

To study the effect of filtering a random process we assume that R_x(t, s) = R_x(τ) with τ = t - s, i.e. the autocorrelation function R_x(t, s) of the input random process depends only on the difference in time t - s between the sample times t and s. If this condition is satisfied we can investigate the distribution of the mean power of a random process as a function of frequency. Now

    R_y(t, s) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} R_x(\alpha - \beta) h(t - \alpha) h(s - \beta) d\alpha d\beta
              = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} R_x(t - s + \mu - \nu) h(\nu) h(\mu) d\nu d\mu,    (D.3)

with ν = t - α and µ = s - β. Observe that now also R_y(t, s) only depends on the time difference τ = t - s, hence R_y(t, s) = R_y(τ).

Next consider the Fourier transforms S_x(f) and S_y(f) of R_x(τ) and R_y(τ) respectively (see appendix B):

    S_x(f) = \int_{-\infty}^{\infty} R_x(\tau) \exp(-j2\pi f \tau) d\tau   and   R_x(\tau) = \int_{-\infty}^{\infty} S_x(f) \exp(j2\pi f \tau) df,
    S_y(f) = \int_{-\infty}^{\infty} R_y(\tau) \exp(-j2\pi f \tau) d\tau   and   R_y(\tau) = \int_{-\infty}^{\infty} S_y(f) \exp(j2\pi f \tau) df.    (D.4)

Note that these Fourier transforms can only be defined if the correlation functions depend only on the time-difference τ = t - s. Next we obtain

    R_y(\tau) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} R_x(\tau + \mu - \nu) h(\nu) h(\mu) d\nu d\mu
      = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \left[ \int_{-\infty}^{\infty} S_x(f) \exp(j2\pi f (\tau + \mu - \nu)) df \right] h(\nu) h(\mu) d\nu d\mu
      = \int_{-\infty}^{\infty} S_x(f) \left[ \int_{-\infty}^{\infty} h(\nu) \exp(-j2\pi f \nu) d\nu \right] \left[ \int_{-\infty}^{\infty} h(\mu) \exp(j2\pi f \mu) d\mu \right] \exp(j2\pi f \tau) df
      = \int_{-\infty}^{\infty} S_x(f) H(f) H^*(f) \exp(j2\pi f \tau) df,    (D.5)

where in the last step we used that h(t) is real, so that the second bracketed integral equals H^*(f). Hence

    S_y(f) = S_x(f) H(f) H^*(f) = S_x(f) |H(f)|^2.    (D.6)

D.3 Interpretation

First note that the mean square value of the filter output process Y(t) is time-independent if R_x(t, s) = R_x(t - s). This follows from

    E[Y^2(t)] = R_y(t, t) = R_y(t - t) = R_y(0).    (D.7)


Therefore R_y(0) is the expected value of the power that is dissipated in a resistor of 1 Ω connected to the output of the filter, at any time instant. Next note that

    R_y(0) = \int_{-\infty}^{\infty} S_y(f) df = \int_{-\infty}^{\infty} S_x(f) |H(f)|^2 df.    (D.8)

Consider a filter with transfer function H(f) shown in figure D.2. This is a bandpass filter and

    H(f) = 1 for f_1 \leq |f| \leq f_2, and H(f) = 0 elsewhere.    (D.9)

Figure D.2: An ideal bandpass filter: H(f) equals 1 for f_1 ≤ |f| ≤ f_2 and 0 elsewhere.

Then we obtain

    R_y(0) = \int_{-f_2}^{-f_1} S_y(f) df + \int_{f_1}^{f_2} S_y(f) df.    (D.10)

Now let f_1 = f and f_2 = f + Δf. The filter H(f) now only transfers components of X(t) in the frequency band (f, f + Δf) and stops all other components. Since a power spectrum is always even, see section D.6, the expected power at the output of the filter is approximately 2 S_x(f) Δf. This implies that S_x(f) is the distribution of average power in the process X(t) over the frequencies. Therefore we call S_x(f) the power spectral density function of X(t).
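The relation S_y(f) = S_x(f)|H(f)|^2 and this power interpretation can be illustrated numerically. The sketch below filters white noise with an assumed short FIR impulse response and estimates the output spectrum by averaging periodograms; the filter taps, block length and number of blocks are all arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
h = np.array([0.25, 0.5, 0.25])            # an assumed lowpass impulse response
N, blocks = 1024, 200

psd_est = np.zeros(N)
for _ in range(blocks):
    x = rng.standard_normal(N)             # white noise samples, S_x(f) = 1
    y = np.convolve(x, h, mode="same")     # the filtered process
    psd_est += np.abs(np.fft.fft(y)) ** 2 / N
psd_est /= blocks

H = np.fft.fft(h, N)
# the averaged periodogram approaches S_x(f)|H(f)|^2 = |H(f)|^2;
# the deviation printed below shrinks as more blocks are averaged
print(np.max(np.abs(psd_est - np.abs(H) ** 2)))
```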

D.4 Wide-sense stationarity

Definition D.1 A random process Z(t) is called wide-sense stationary (WSS) if and only if

    m_z(t) = constant   and   R_z(t, s) = R_z(t - s).    (D.11)

For a WSS process Z(t) it is meaningful to consider its power spectral density S_z(f). For a WSS process Y(t) the expected power can be expressed as in equation (D.7). When a WSS process X(t) is the input of a filter then the filter output process Y(t) is also WSS and its power spectral density S_y(f) is related to the input power spectral density S_x(f) as given by (D.6). These consequences already justify the concept of wide-sense stationarity.


D.5 Gaussian processes

Recall that a Gaussian process Z(t) is completely specified by its mean m_z(t) and autocorrelation function R_z(t, s) for all t and s. Therefore a WSS (wide-sense stationary) Gaussian process Z(t) is also (strict-sense) stationary.

Furthermore note that when a Gaussian process X(t) is the input of a filter then the filter output process Y(t) is also Gaussian. Remember that the output process is WSS when the input process is WSS.

D.6 Properties of S_x(f) and R_x(τ)

Let X(t) be a WSS random process again.

a) First observe that R_x(τ) is a (real) even function of τ. This follows from

    R_x(-\tau) = E[X(t - \tau) X(t)] = E[X(t) X(t - \tau)] = R_x(\tau).    (D.12)

b) Next we show that S_x(f) is a real even function of f. Since R_x(τ) is an even function and sin(2πfτ) an odd function of τ,

    \int_{-\infty}^{\infty} R_x(\tau) \sin(2\pi f \tau) d\tau = 0.    (D.13)

Therefore

    S_x(f) = \int_{-\infty}^{\infty} R_x(\tau) \exp(-j2\pi f \tau) d\tau = \int_{-\infty}^{\infty} R_x(\tau) [\cos(2\pi f \tau) - j \sin(2\pi f \tau)] d\tau = \int_{-\infty}^{\infty} R_x(\tau) \cos(2\pi f \tau) d\tau,    (D.14)

which is even and real.

c) The power spectral density S_x(f) is a non-negative function of the frequency f. To see why this statement is true suppose that S_x(f) is negative for f_1 < |f| < f_2. Then consider an ideal bandpass filter H(f) as given by (D.9). The expected power of the output Y(t) of this filter when the input is the random process X(t) is given by

    E[Y^2(t)] = R_y(0) = 2 \int_{f_1}^{f_2} S_x(f) df < 0,    (D.15)

which yields a contradiction.

d) Without proof we give another property of R_x(τ). It can be shown that

    |R_x(\tau)| \leq R_x(0).    (D.16)


Appendix E

Gram-Schmidt procedure, proof, example

We will only prove result 6.2 for signal sets with signals s_m(t) ≢ 0 for all m ∈ M. The statement should then also hold for sets that do contain all-zero signals.

E.1 Proof of result 6.2

The proof is based on induction.

1. First we show that the induction hypothesis holds for one signal. Therefore consider s_1(t) and take

    \phi_1(t) = \frac{s_1(t)}{\sqrt{E_{s1}}}   with   E_{s1} = \int_0^T s_1^2(t) dt.    (E.1)

Note that by doing so

    \int_0^T \phi_1^2(t) dt = \frac{1}{E_{s1}} \int_0^T s_1^2(t) dt = 1,    (E.2)

i.e. ϕ_1(t) has unit energy (is normal). Since s_1(t) = \sqrt{E_{s1}} \phi_1(t) the coefficient s_{11} = \sqrt{E_{s1}}.

2. Now suppose that the induction hypothesis holds for m - 1 ≥ 1 signals. Thus there exists an orthonormal basis ϕ_1(t), ϕ_2(t), ..., ϕ_{n-1}(t) for the signals s_1(t), s_2(t), ..., s_{m-1}(t) with n - 1 ≤ m - 1. Now consider the next signal s_m(t) and an auxiliary signal θ_m(t) which is defined as follows:

    \theta_m(t) = s_m(t) - \sum_{i=1,n-1} s_{mi} \phi_i(t)    (E.3)

with

    s_{mi} = \int_0^T s_m(t) \phi_i(t) dt   for i = 1, n - 1.    (E.4)

We can distinguish between two cases now:

(a) If θ_m(t) ≡ 0 then s_m(t) = \sum_{i=1,n-1} s_{mi} \phi_i(t) and the induction hypothesis also holds for m signals in this case.

(b) If on the other hand θ_m(t) ≢ 0 then take a new building-block waveform

    \phi_n(t) = \frac{\theta_m(t)}{\sqrt{E_{\theta m}}}   with   E_{\theta m} = \int_0^T \theta_m^2(t) dt.    (E.5)

By doing so

    \int_0^T \phi_n^2(t) dt = \frac{1}{E_{\theta m}} \int_0^T \theta_m^2(t) dt = 1,    (E.6)

i.e. also ϕ_n(t) has unit energy. Moreover for all i = 1, n - 1

    \int_0^T \phi_n(t) \phi_i(t) dt = \frac{1}{\sqrt{E_{\theta m}}} \int_0^T \theta_m(t) \phi_i(t) dt
      = \frac{1}{\sqrt{E_{\theta m}}} \left( \int_0^T s_m(t) \phi_i(t) dt - \sum_{j=1,n-1} s_{mj} \int_0^T \phi_j(t) \phi_i(t) dt \right)
      = \frac{1}{\sqrt{E_{\theta m}}} \left( s_{mi} - \sum_{j=1,n-1} s_{mj} \delta_{ji} \right) = 0,    (E.7)

and therefore ϕ_n(t) is orthogonal to all ϕ_i(t). Note that now

    s_m(t) = \theta_m(t) + \sum_{i=1,n-1} s_{mi} \phi_i(t) = \sqrt{E_{\theta m}} \phi_n(t) + \sum_{i=1,n-1} s_{mi} \phi_i(t) = \sum_{i=1,n} s_{mi} \phi_i(t)    (E.8)

with s_{mn} = \sqrt{E_{\theta m}}. Thus also in this case for m signals there exists an orthonormal basis with n ≤ m building-block waveforms. Therefore the induction hypothesis also holds for m signals.

It should be noted that, when using the Gram-Schmidt procedure, any ordering of the signals other than s_1(t), s_2(t), ..., s_{|M|}(t) will also yield a basis, i.e. a set of building-block waveforms, of the smallest possible dimensionality (see exercise 1 in chapter 6), however in general with different building-block waveforms.
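The procedure above maps directly onto sampled waveforms. The sketch below is a minimal Python implementation; the three piecewise-constant signals are an assumption chosen so that the resulting coefficients reproduce (E.19) of the example in section E.2 below, and they are not copied literally from figure E.1.

```python
import numpy as np

def gram_schmidt(signals, dt=1.0, tol=1e-9):
    """Orthonormalize sampled waveforms; return the basis and the coefficients s_mi."""
    basis, coeffs = [], []
    for s in signals:
        proj = [float(np.sum(s * phi) * dt) for phi in basis]        # projections s_mi, cf. (E.4)
        theta = s - sum(c * phi for c, phi in zip(proj, basis))      # auxiliary signal, cf. (E.3)
        energy = float(np.sum(theta ** 2) * dt)
        if energy > tol:                                             # theta not identically zero
            basis.append(theta / np.sqrt(energy))                    # new building-block waveform, cf. (E.5)
            proj.append(float(np.sqrt(energy)))                      # s_mn = sqrt(E_theta)
        coeffs.append(proj)
    return basis, coeffs

# piecewise-constant signals, one sample per unit interval (dt = 1)
signals = [np.array([2.0, 2.0, -2.0]),
           np.array([-1.0, -3.0, -1.0]),
           np.array([-1.0, 1.0, 3.0])]
basis, coeffs = gram_schmidt(signals)
print(len(basis))   # 2 building-block waveforms suffice for the 3 signals
print(coeffs)       # (2*sqrt(3)), (-sqrt(3), 2*sqrt(2)), (-sqrt(3), -2*sqrt(2))
```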

E.2 An example<br />

We will next discuss an example in which we actually carry out the Gram-Schmidt procedure for<br />

a given set of waveforms.


Figure E.1: An example of the Gram-Schmidt procedure: the waveforms s_1(t), s_2(t), s_3(t), the auxiliary signals θ_2(t) and θ_3(t) ≡ 0, and the building-block waveforms ϕ_1(t) and ϕ_2(t), all piecewise constant on 0 ≤ t ≤ 3.

Example E.1 Consider the three waveforms s_m(t), m = 1, 2, 3, shown in figure E.1. Note that T = 3. We first determine the energy E_1 of the first signal s_1(t):

    E_1 = 4 + 4 + 4 = 12.    (E.9)

Therefore the first building-block waveform ϕ_1(t) is

    \phi_1(t) = \frac{s_1(t)}{\sqrt{E_1}} = \frac{s_1(t)}{2\sqrt{3}},   and thus   s_{11} = 2\sqrt{3}.    (E.10)

Now we continue with the second signal s_2(t). First we determine the projection s_{21} of s_2(t) on the first dimension, i.e. the dimension that corresponds to ϕ_1(t):

    s_{21} = \int_0^T s_2(t) \phi_1(t) dt = \frac{1}{\sqrt{3}} (-1 - 3 + 1) = -\sqrt{3},    (E.11)


hence the auxiliary signal θ_2(t) is now

    \theta_2(t) = s_2(t) + \sqrt{3} \phi_1(t).    (E.12)

Next we determine the energy E_{θ2} of the auxiliary signal θ_2(t):

    E_{\theta 2} = 4 + 4 = 8,    (E.13)

and the second building-block waveform ϕ_2(t) is

    \phi_2(t) = \frac{\theta_2(t)}{\sqrt{E_{\theta 2}}} = \frac{\theta_2(t)}{2\sqrt{2}}.    (E.14)

Now we obtain for the signal s_2(t):

    s_2(t) = \theta_2(t) - \sqrt{3} \phi_1(t) = 2\sqrt{2} \phi_2(t) - \sqrt{3} \phi_1(t),   thus   s_{22} = 2\sqrt{2}.    (E.15)

For the third signal s_3(t) we can now determine the projections s_{31} and s_{32} and the auxiliary signal θ_3(t):

    s_{31} = \int_0^T s_3(t) \phi_1(t) dt = \frac{1}{\sqrt{3}} (-1 + 1 - 3) = -\sqrt{3},
    s_{32} = \int_0^T s_3(t) \phi_2(t) dt = \frac{1}{\sqrt{2}} (-1 - 3) = -2\sqrt{2},   hence
    \theta_3(t) = s_3(t) + \sqrt{3} \phi_1(t) + 2\sqrt{2} \phi_2(t) \equiv 0.    (E.16)

Since this auxiliary signal is always 0 we can express the third signal s_3(t) in terms of ϕ_1(t) and ϕ_2(t) as follows:

    s_3(t) = -\sqrt{3} \phi_1(t) - 2\sqrt{2} \phi_2(t),    (E.17)

hence the coefficients corresponding to s_3(t) are

    s_{31} = -\sqrt{3}   and   s_{32} = -2\sqrt{2}.    (E.18)

We have now obtained three vectors of coefficients, one for each signal:

    s_1 = (s_{11}, s_{12}) = (2\sqrt{3}, 0),
    s_2 = (s_{21}, s_{22}) = (-\sqrt{3}, 2\sqrt{2}),   and
    s_3 = (s_{31}, s_{32}) = (-\sqrt{3}, -2\sqrt{2}).    (E.19)

These vectors are depicted in figure E.2.

Figure E.2: A vector representation of the signals s_m(t), m = 1, 2, 3, of figure E.1, in the plane spanned by ϕ_1 and ϕ_2.


Appendix F

Schwarz inequality

LEMMA F.1 (Schwarz inequality) For two finite-energy waveforms a(t) and b(t) the inequality

    \left( \int_{-\infty}^{\infty} a(t) b(t) dt \right)^2 \leq \int_{-\infty}^{\infty} a^2(t) dt \int_{-\infty}^{\infty} b^2(t) dt    (F.1)

holds. Equality is obtained only if b(t) ≡ C a(t) for some constant C.

Proof: Form an orthonormal expansion for a(t) and b(t), i.e.

    a(t) = a_1 \phi_1(t) + a_2 \phi_2(t),
    b(t) = b_1 \phi_1(t) + b_2 \phi_2(t),    (F.2)

with \int_{-\infty}^{\infty} \phi_i(t) \phi_j(t) dt = \delta_{ij} for i = 1, 2 and j = 1, 2.

Then with a = (a_1, a_2) and b = (b_1, b_2) and the Parseval results (a · b) = \int_{-\infty}^{\infty} a(t) b(t) dt, (a · a) = \int_{-\infty}^{\infty} a^2(t) dt, and (b · b) = \int_{-\infty}^{\infty} b^2(t) dt, we find for the vectors a and b that

    \frac{(a · b)}{\|a\| \|b\|} = \frac{\int_{-\infty}^{\infty} a(t) b(t) dt}{\sqrt{\int_{-\infty}^{\infty} a^2(t) dt} \sqrt{\int_{-\infty}^{\infty} b^2(t) dt}}.    (F.3)

If we now can prove that (a · b)^2 ≤ \|a\|^2 \|b\|^2 we get the Schwarz inequality.

We start by stating the triangle inequality for a and b:

    \|a + b\| \leq \|a\| + \|b\|.    (F.4)

Moreover

    \|a + b\|^2 = (a + b) · (a + b) = \|a\|^2 + \|b\|^2 + 2(a · b),    (F.5)

hence

    \|a\|^2 + \|b\|^2 + 2(a · b) \leq \|a\|^2 + \|b\|^2 + 2\|a\|\|b\|,    (F.6)

and therefore

    (a · b) \leq \|a\| \|b\|.    (F.7)

From this it also follows that

    -(a · b) = (a · (-b)) \leq \|a\| \|-b\| = \|a\| \|b\|,    (F.8)

or

    (a · b) \geq -\|a\| \|b\|.    (F.9)

We therefore may conclude that -\|a\|\|b\| ≤ (a · b) ≤ \|a\|\|b\|, or (a · b)^2 ≤ \|a\|^2 \|b\|^2, and this proves the Schwarz inequality.

Equality in the Schwarz inequality is achieved only when equality in the triangle inequality is obtained. This is the case when (first part of the proof) b = Ca or (second part) -b = Ca for some non-negative C.
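A quick numerical illustration of (F.1) for two sampled finite-energy waveforms is given below; both test waveforms are arbitrary assumptions, and the integrals are approximated by sums over samples.

```python
import numpy as np

dt = 1e-3
t = np.arange(0.0, 1.0, dt)
a = np.sin(2 * np.pi * 3 * t)                    # an arbitrary finite-energy waveform a(t)
b = np.exp(-t) * np.cos(2 * np.pi * 7 * t)       # an arbitrary finite-energy waveform b(t)

lhs = (np.sum(a * b) * dt) ** 2
rhs = (np.sum(a ** 2) * dt) * (np.sum(b ** 2) * dt)
print(lhs <= rhs, lhs, rhs)                      # equality would require b(t) = C a(t)
```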


Appendix G

Bound error probability orthogonal signaling

Note that the probability of a correct decision satisfies

    P_C = \int_{-\infty}^{\infty} p(\mu - b) (1 - Q(\mu))^{|M|-1} d\mu,    (G.1)

where b = \sqrt{2E_s/N_0}, hence

    P_E = 1 - P_C = \int_{-\infty}^{\infty} p(\mu - b) [1 - (1 - Q(\mu))^{|M|-1}] d\mu.    (G.2)

The term in square brackets is the probability that at least one of the |M| - 1 noise components exceeds µ. By the union bound this probability is not larger than the sum of the probabilities that individual components exceed µ, thus

    1 - (1 - Q(\mu))^{|M|-1} \leq (|M| - 1) Q(\mu) \leq |M| Q(\mu).    (G.3)

Moreover, since this term is a probability we can also write

    1 - (1 - Q(\mu))^{|M|-1} \leq 1.    (G.4)

When µ is small Q(µ) is large and then the unity bound is tight. On the other hand for large µ the bound |M|Q(µ) is tighter. Therefore we split the integration range into two parts, (-∞, a) and (a, ∞):

    P_E \leq \int_{-\infty}^{a} p(\mu - b) d\mu + |M| \int_{a}^{\infty} p(\mu - b) Q(\mu) d\mu.    (G.5)

If we take a ≥ 0 we can use the upper bound Q(µ) ≤ exp(-µ²/2), see the proof in appendix A, and we obtain

    P_E \leq \int_{-\infty}^{a} p(\mu - b) d\mu + |M| \int_{a}^{\infty} p(\mu - b) \exp\left(-\frac{\mu^2}{2}\right) d\mu.    (G.6)


If we denote the first integral by P_1 and the second by P_2, we get

    P_E \leq P_1 + |M| P_2.    (G.7)

To minimize the bound we choose a such that the derivative of (G.6) with respect to a is equal to 0, i.e.

    0 = \frac{d}{da} [P_1 + |M| P_2] = p(a - b) - |M| p(a - b) \exp\left(-\frac{a^2}{2}\right).    (G.8)

This results in

    \exp\left(\frac{a^2}{2}\right) = |M|.    (G.9)

Therefore the value a = \sqrt{2 \ln |M|} achieves (at least a local) minimum of P_E. Note that a ≥ 0 as was required. Now we can consider the separate terms. For the first term we can write

    P_1 = \int_{-\infty}^{a} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{(\mu - b)^2}{2}\right) d\mu
        = \int_{-\infty}^{a-b} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{\gamma^2}{2}\right) d\gamma
        = Q(b - a) \overset{(*)}{\leq} \exp\left(-\frac{(b - a)^2}{2}\right)   if 0 \leq a \leq b.    (G.10)

Here the inequality (*) follows from appendix A. For the second term we get

    P_2 = \int_{a}^{\infty} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{(\mu - b)^2}{2}\right) \exp\left(-\frac{\mu^2}{2}\right) d\mu
        \overset{(**)}{=} \exp\left(-\frac{b^2}{4}\right) \int_{a}^{\infty} \frac{1}{\sqrt{2\pi}} \exp\left(-(\mu - b/2)^2\right) d\mu
        = \exp\left(-\frac{b^2}{4}\right) \frac{1}{\sqrt{2}} \int_{a\sqrt{2}}^{\infty} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{(\gamma - (b/2)\sqrt{2})^2}{2}\right) d\gamma
        = \exp\left(-\frac{b^2}{4}\right) \frac{1}{\sqrt{2}} Q\left(a\sqrt{2} - (b/2)\sqrt{2}\right),    (G.11)

where in the third step we substituted γ = µ√2. Here the equality (**) follows from

    \frac{1}{2}(\mu - b)^2 + \frac{\mu^2}{2} = \frac{\mu^2}{2} - \mu b + \frac{b^2}{2} + \frac{\mu^2}{2} = \mu^2 - \mu b + \frac{b^2}{4} + \frac{b^2}{4} = \left(\mu - \frac{b}{2}\right)^2 + \frac{b^2}{4}.    (G.12)

From (G.11) we get

    P_2 \leq \exp(-b^2/4)                        for 0 \leq a \leq b/2,
    P_2 \leq \exp(-b^2/4 - (a - b/2)^2)          for a \geq b/2.    (G.13)

If we now collect the bounds (G.10) and (G.13) and substitute the optimal value of a found in (G.9) we get

    P_E \leq \exp(-(b - a)^2/2) + \exp(a^2/2) \exp(-b^2/4)                        for 0 \leq a < b/2,
    P_E \leq \exp(-(b - a)^2/2) + \exp(a^2/2) \exp(-b^2/4 - (a - b/2)^2)          for b/2 \leq a \leq b.    (G.14)

Now we note that

    \frac{(a - b)^2}{2} - \left(\frac{b^2}{4} - \frac{a^2}{2}\right) = \left(a - \frac{b}{2}\right)^2 \geq 0.    (G.15)

Therefore for 0 ≤ a < b/2 the first term is not larger than the second one, while for b/2 ≤ a ≤ b both terms are equal. This leads to

    P_E \leq 2 \exp(-b^2/4 + a^2/2)      for 0 \leq a < b/2,
    P_E \leq 2 \exp(-(b - a)^2/2)        for b/2 \leq a \leq b.    (G.16)

Now we rewrite both a and b as

    a = \sqrt{2 \ln |M|} = \sqrt{2 \ln 2} \cdot \sqrt{\log_2 |M|},
    b = \sqrt{2E_s/N_0} = \sqrt{2E_b/N_0} \cdot \sqrt{\log_2 |M|},    (G.17)

where we used the following definition for E_b, i.e. the energy per transmitted bit of information:

    E_b = \frac{E_s}{\log_2 |M|}.    (G.18)

This finally leads to

    P_E \leq 2 \exp(-\log_2 |M| [E_b/(2N_0) - \ln 2])                        for E_b/N_0 \geq 4 \ln 2,
    P_E \leq 2 \exp(-\log_2 |M| [\sqrt{E_b/N_0} - \sqrt{\ln 2}]^2)            for \ln 2 \leq E_b/N_0 \leq 4 \ln 2.    (G.19)
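As a numerical sketch, the exact error probability (G.2) can be evaluated on a grid and compared with the final bound (G.19). In the fragment below p(·) is taken to be the unit-variance Gaussian density, consistent with this appendix, and the chosen values of |M| and E_b/N_0 are arbitrary assumptions.

```python
import numpy as np
from math import erfc, exp, log, sqrt

def Q(x):
    return 0.5 * erfc(x / sqrt(2.0))

def exact_PE(M, EbN0):
    # numerical evaluation of (G.2)
    b = sqrt(2.0 * EbN0 * np.log2(M))            # b = sqrt(2 Es/N0) with Es = Eb*log2|M|
    mu = np.linspace(b - 10.0, b + 10.0, 20001)  # integration grid around the mean
    p = np.exp(-0.5 * (mu - b) ** 2) / sqrt(2.0 * np.pi)
    inner = 1.0 - (1.0 - np.array([Q(u) for u in mu])) ** (M - 1)
    return float(np.sum(p * inner) * (mu[1] - mu[0]))

def bound_PE(M, EbN0):
    # the bound (G.19); EbN0 is Eb/N0 on a linear scale, valid for EbN0 >= ln 2
    k = np.log2(M)
    if EbN0 >= 4.0 * log(2.0):
        return 2.0 * exp(-k * (EbN0 / 2.0 - log(2.0)))
    return 2.0 * exp(-k * (sqrt(EbN0) - sqrt(log(2.0))) ** 2)

for M in [16, 64]:
    for EbN0 in [2.0, 4.0]:
        print(M, EbN0, exact_PE(M, EbN0), bound_PE(M, EbN0))
```

In every printed row the bound exceeds the exact value, as it must, and both decrease rapidly once E_b/N_0 is well above ln 2.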


Appendix H

Decibels

Definition H.1 An energy ratio X > 0 can be expressed in decibels as follows:

    X|_{in dB} = 10 \log_{10} X.    (H.1)

It is easy to work with decibels since by doing so we transform multiplications into additions. Table H.1 shows for some energy ratios X the equivalent ratio in dB. Note that 1 dB = 0.1 Bel.

        X      X| in dB
       0.01    -20.0 dB
       0.1     -10.0 dB
       1.0       0.0 dB
       2.0       3.0 dB
       3.0       4.8 dB
       5.0       7.0 dB
      10.0      10.0 dB
     100.0      20.0 dB

Table H.1: Decibels.
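Definition H.1 is a one-liner in code; the short sketch below reproduces the entries of table H.1.

```python
from math import log10

# Definition H.1 applied to the entries of table H.1
for X in [0.01, 0.1, 1.0, 2.0, 3.0, 5.0, 10.0, 100.0]:
    print(f"X = {X:6.2f}  ->  {10.0 * log10(X):6.1f} dB")
```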


Appendix I

Whitening filters

Suppose that the power spectral density of the noise is rational, and can be written as

    S_n(f) = k \frac{N(f)}{D(f)}.    (I.1)

Here the numerator polynomial N(f) is assumed to have degree n, its complex roots are ν_1, ν_2, ..., ν_n, and the coefficient of f^n is one. The roots of N(f) are the zeros of S_n(f). We assume that the denominator polynomial D(f) has degree d, its complex roots are π_1, π_2, ..., π_d, and the coefficient of f^d is one. The roots of D(f) are the poles of S_n(f). Finally k is some constant. Note that if S_n(f) is not rational it can be approximated as a rational function of f.

A power density function is real and even (see appendix D). Therefore the zeros and poles of S_n(f) come in pairs, i.e. each zero ν = a + jb has a counterpart ν' = a - jb and each pole π = a + jb has a counterpart π' = a - jb (see figure I.1). Note that both n and d are even.

Figure I.1: Poles and zeros of S_n(f) in the complex plane: they occur in complex-conjugate pairs.

To construct the whitening filter we number the roots (poles and zeros) of S_n(f) in such a way that the odd-numbered roots have a positive imaginary part and the even-numbered roots have a negative imaginary part. Now we define

    N_{pos}(f) = (f - \nu_1)(f - \nu_3) \cdots (f - \nu_{n-1}),
    N_{neg}(f) = (f - \nu_2)(f - \nu_4) \cdots (f - \nu_n),
    D_{pos}(f) = (f - \pi_1)(f - \pi_3) \cdots (f - \pi_{d-1}),
    D_{neg}(f) = (f - \pi_2)(f - \pi_4) \cdots (f - \pi_d).    (I.2)

The whitening filter G(f) can now be specified by

    G(f) = \sqrt{\frac{N_0/2}{k}} \frac{D_{pos}(f)}{N_{pos}(f)},    (I.3)

thus the numerator of G(f) corresponds to the poles of S_n(f) with a positive imaginary part and the denominator of G(f) corresponds to the zeros of S_n(f) with a positive imaginary part.

How does this filter change the spectral density of the noise? The power spectral density at the filter output is S_n(f)|G(f)|^2 and

    |G(f)|^2 = G(f) G^*(f) = \frac{N_0/2}{k} \frac{D_{pos}(f) D_{pos}^*(f)}{N_{pos}(f) N_{pos}^*(f)} = \frac{N_0/2}{k} \frac{D_{pos}(f) D_{neg}(f)}{N_{pos}(f) N_{neg}(f)} = \frac{N_0/2}{k} \frac{D(f)}{N(f)} = \frac{N_0/2}{S_n(f)},    (I.4)

thus the output power spectral density S_n^o(f) = N_0/2, i.e. the output process is white Gaussian noise.

We next have to verify that both filters G(f) and G^{-1}(f) = 1/G(f) are realizable. The transfer functions of both these filters contain only factors [f - (a + jb)] with b ≥ 0. Let s = j2πf denote the complex frequency; then we can express such a factor as

    [s/(j2\pi) - (a + jb)] = \frac{1}{j2\pi} [s + 2\pi b - j2\pi a].    (I.5)

Note that the roots of these factors are s = -2πb + j2πa and thus have non-positive real part. Therefore none of the poles of the two filters lies in the right-half s-plane. Consequently both G(f) and G^{-1}(f) are realizable, and hence our whitening filter G(f) is reversible.
G( f ) and G −1 ( f ) are realizable and our hence whitening filter G( f ) is reversible.


Appendix J

Elementary energy E_pam of an |A|-PAM constellation

The elementary energy E_pam of an |A|-PAM constellation was defined as

    E_{pam} = \frac{1}{|A|} \sum_{s = -|A|+1, -|A|+3, \ldots, |A|-1} s^2.    (J.1)

We want to express E_pam in terms of the number of symbols |A|. We will only consider the case where |A| is even. A similar treatment can be given for odd |A|. For even |A|

    E_{pam} = \frac{2}{|A|} \sum_{k=1,|A|/2} (2k - 1)^2.    (J.2)

By induction we will show that now

    \sum_{k=1,H} (2k - 1)^2 = \frac{4H^3 - H}{3}.    (J.3)

Note that the induction hypothesis holds for H = 1. Next suppose that it holds for H = h ≥ 1. Now it turns out that

    \sum_{k=1,h+1} (2k - 1)^2 = \sum_{k=1,h} (2k - 1)^2 + (2h + 1)^2
      = \frac{4h^3 - h}{3} + (2h + 1)^2
      = \frac{4h^3 - h + 12h^2 + 12h + 3}{3}
      = \frac{4h^3 + 12h^2 + 12h + 4 - h - 1}{3}
      = \frac{4(h + 1)^3 - (h + 1)}{3},    (J.4)

and thus the induction hypothesis also holds for h + 1 and thus for all H. Therefore we obtain

    E_{pam} = \frac{2}{|A|} \sum_{k=1,|A|/2} (2k - 1)^2 = \frac{2}{|A|} \cdot \frac{4(|A|/2)^3 - |A|/2}{3} = \frac{|A|^2 - 1}{3}.    (J.5)
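The closed form (J.5) is easy to verify against the defining sum (J.1); the short sketch below does this for a few even values of |A|.

```python
def E_pam_direct(A):
    # the defining sum (J.1) over the odd levels -|A|+1, -|A|+3, ..., |A|-1
    return sum(s * s for s in range(-A + 1, A, 2)) / A

for A in [2, 4, 8, 16, 64]:
    print(A, E_pam_direct(A), (A * A - 1) / 3)   # both columns agree
```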


Bibliography

[1] A.R. Calderbank, T.A. Lee, and J.E. Mazo, "Baseband Trellis Codes with a Spectral Null at Zero," IEEE Trans. Inform. Theory, vol. IT-34, pp. 425-434, May 1988.

[2] G.A. Campbell, U.S. Patent 1,227,113, May 22, 1917, "Basic Types of Electric Wave Filters."

[3] J.R. Carson, "Notes on the Theory of Modulation," Proc. IRE, vol. 10, pp. 57-64, February 1922. Reprinted in Proc. IEEE, vol. 51, pp. 893-896, June 1963.

[4] T.M. Cover, "Enumerative Source Encoding," IEEE Trans. Inform. Theory, vol. IT-19, pp. 73-77, Jan. 1973.

[5] G.D. Forney, Jr., R.G. Gallager, G.R. Lang, F.M. Longstaff, and S.U. Qureshi, "Efficient Modulation for Band-Limited Channels," IEEE Journ. Select. Areas Comm., vol. SAC-2, pp. 632-647, Sept. 1984.

[6] G.D. Forney, Jr., and G. Ungerboeck, "Modulation and Coding for Linear Gaussian Channels," IEEE Trans. Inform. Theory, vol. 44, pp. 2384-2415, October 1998.

[7] R.G. Gallager, Information Theory and Reliable Communication, Wiley and Sons, 1968.

[8] R.V.L. Hartley, "Transmission of Information," Bell Syst. Tech. J., vol. 7, pp. 535-563, July 1928.

[9] C.W. Helstrom, Elements of Signal Detection and Estimation, Prentice Hall, 1995.

[10] S. Haykin, Communication Systems, John Wiley & Sons, 4th edition, 2000.

[11] V.A. Kotelnikov, The Theory of Optimum Noise Immunity, Dover Publications, 1960.

[12] R. Laroia, N. Farvardin, and S. Tretter, "On Optimal Shaping of Multidimensional Constellations," IEEE Trans. Inform. Theory, vol. IT-40, pp. 1044-1056, July 1994.

[13] E.A. Lee and D.G. Messerschmitt, Digital Communication, 2nd Edition, Kluwer Academic, 1994.

[14] D.O. North, Analysis of the Factors which determine Signal/Noise Discrimination in Radar, RCA Laboratories, Princeton, Technical Report PTR-6C, June 1943. Reprinted in Proc. IEEE, vol. 51, July 1963.

[15] H. Nyquist, "Certain factors affecting telegraph speed," Bell Syst. Tech. J., vol. 3, pp. 324-346, April 1924.

[16] B.M. Oliver, J.R. Pierce, and C.E. Shannon, "The Philosophy of PCM," Proc. IRE, vol. 36, pp. 1324-1331, Nov. 1948.

[17] J.G. Proakis and M. Salehi, Communication Systems Engineering, Prentice Hall, 1994.

[18] J.G. Proakis, Digital Communications, McGraw-Hill, 4th Edition, 2001.

[19] J.P.M. Schalkwijk, "An Algorithm for Source Coding," IEEE Trans. Inform. Theory, vol. IT-18, pp. 395-399, May 1972.

[20] C.E. Shannon, "A Mathematical Theory of Communication," Bell Syst. Tech. J., vol. 27, pp. 379-423 and 623-656, July and October 1948. Reprinted in the Key Papers on Information Theory.

[21] C.E. Shannon, "Communication in the Presence of Noise," Proc. IRE, vol. 37, pp. 10-21, January 1949. Reprinted in the Key Papers on Information Theory.

[22] G. Ungerboeck, "Channel Coding with Multilevel/Phase Signals," IEEE Trans. Inform. Theory, vol. IT-28, pp. 55-67, Jan. 1982.

[23] S. Verdu, Multiuser Detection, Cambridge University Press, 1998.

[24] J.H. Van Vleck and D. Middleton, "A Theoretical Comparison of Visual, Aural, and Meter Reception of Pulsed Signals in the Presence of Noise," Journal of Applied Physics, vol. 17, pp. 940-971, November 1946.

[25] J.M. Wozencraft and I.M. Jacobs, Principles of Communication Engineering, Wiley, New York, 1965.

[26] S.T.M. Ackermans and J.H. van Lint, Algebra en Analyse, Academic Service, Den Haag, 1976.


Index<br />

a-posteriori message probability, 18<br />

a-priori message probability, 15<br />

additive Gaussian noise channel<br />

scalar, 27<br />

additive Gaussian noise vector channel, 38<br />

aliased spectrum, 120<br />

Armstrong, E.H., 11<br />

asymptotic coding gain, 175<br />

autocorrelation function<br />

of white noise, 47<br />

available energy<br />

per dimension, 104<br />

bandlimited transmission, 104<br />

bandwidth-limited case, 116<br />

Bardeen J., Brattain W., and Shockley W., 10<br />

Bayes rule, 18<br />

Bell, A.G., 9<br />

bi-orthogonal signaling, 80<br />

binary modulation<br />

frequency shift keying (FSK), 85<br />

phase shift keying (PSK), 85<br />

Branly, E., 9<br />

Braun, K.F., 10<br />

building-block waveforms, 53<br />

Campbell, G.A., 9, 11<br />

capacity<br />

of bandlimited waveform channel, 112<br />

of wideband waveform channel, 114<br />

per dimension, 104, 111<br />

carrier-phase, 137<br />

Carson, J.R., 11<br />

center of gravity of the signal structure, 85<br />

channel<br />

additive Gaussian noise<br />

scalar, 27<br />

bandpass, 127<br />

binary symmetric, 23<br />

capacity, 13<br />

discrete, 15<br />

discrete scalar, 20<br />

discrete vector, 20<br />

real scalar, 25<br />

real vector, 35, 36<br />

waveform, 46<br />

Clarke, A.C., 10<br />

code<br />

error-correcting, 172<br />

coding gain<br />

asymptotic, 175<br />

coherent demodulation, 137<br />

correlation, 50<br />

correlation receiver, 50<br />

cross-talk, 170<br />

decibel definition, 202<br />

decision intervals, 29<br />

decision region, 37<br />

definition for real vector channel, 37<br />

decision rule, 16<br />

maximum a-posteriori<br />

for discrete channel, 17<br />

for real scalar channel, 27<br />

maximum likelihood<br />

for discrete channel, 19<br />

for real scalar channel, 27<br />

minimax, 22<br />

minimum Euclidean distance, 39<br />

decision variables<br />

for discrete channel, 18<br />

for real scalar channel, 26<br />


for real vector channel, 36<br />

for the discrete vector channel, 21<br />

DeForest, L., 10<br />

demodulator, 61<br />

destination, 16<br />

diode, 10<br />

direct receiver, 66<br />

discrete, 16<br />

distance gain, 176<br />

dot product, 38<br />

Dudley, H., 12<br />

energy<br />

of vector, 69<br />

envelope detection, 142<br />

equally likely messages, 19<br />

error probability, 16<br />

minimum, for the scalar AGN channel, 30<br />

symbol, 168<br />

union bound, 88<br />

error-correcting code, 172<br />

estimate, 16, 21<br />

excess bandwidth, 122<br />

extended Hamming codes, 172<br />

Faraday, M., 9<br />

Fleming, J.A., 10<br />

frequency shift keying, 75<br />

FSK, 75<br />

Galvani, L., 8<br />

Gauss, C.F. and Weber, W.E., 9<br />

Gram-Schmidt procedure, 55, 192<br />

Hadamard matrix, 79<br />

Hamming codes<br />

extended, 172<br />

Hamming distance, 172<br />

minimum, 172<br />

Hamming-distance, 23<br />

Hartley, R., 12<br />

Henry, J., 8<br />

Hertz, H., 9<br />

hypersphere

interior volume of, 107<br />

information source, 15<br />

inner product, 38<br />

integrated circuit, 10<br />

interval<br />

observation, 47<br />

irrelevant data (noise), 58<br />

Kilby J. and Noyce R., 10<br />

Kotelnikov, V., 12<br />

Kronecker delta function, 50<br />

likelihood, 19<br />

Lodge, O., 9<br />

MAP decision rule<br />

for the discrete vector channel, 21<br />

for discrete channel, 17<br />

for real scalar channel, 27<br />

for the scalar AGN channel, 28<br />

MAP decoding<br />

for real vector channel, 36<br />

Marconi, G., 10<br />

matched filter, 12<br />

matched filters, 64<br />

matched-filter receiver, 64<br />

Maxwell, J.C., 9<br />

message, 15<br />

message sequences, 92<br />

minimax decision rule, 22<br />

minimum Hamming distance, 172<br />

ML decision rule<br />

for discrete channel, 19<br />

for real scalar channel, 27<br />

modulation<br />

amplitude (AM), 11<br />

double sideband suppressed carrier (DSB-<br />

SC), 137<br />

frequency (FM), 11<br />

pulse amplitude (PAM), 119<br />

pulse code (PCM), 12<br />

pulse position, 75<br />

quadrature amplitude (QAM), 127, 135



modulation interval, 120, 125<br />

modulator, 56, 60<br />

Morse<br />

code, 8<br />

telegraph, 8<br />

Morse, S., 8<br />

multi-pulse transmission, 125<br />

normalized signal-to-noise-ratio, 168<br />

North, D.O., 12, 70<br />

Nyquist criterion, 120<br />

Nyquist, H., 11<br />

observation interval, 47<br />

observation space, 37<br />

Oersted, H.K., 8<br />

optimum decision rule, 16<br />

for real channel, 26<br />

for the discrete channel, 18<br />

optimum receiver, 16<br />

for the AGN vector channel, 39<br />

for the scalar AGN channel, 28<br />

orthogonal signaling, 75<br />

capacity, 77<br />

energy, 88<br />

error probability, 76<br />

optimum receiver for, 76<br />

signal structure, 75<br />

orthonormal, 47<br />

orthonormal functions, 47<br />

Parseval relation, 68<br />

PCM, 12<br />

Pierce, J.R., 10<br />

Poisson distribution, 23<br />

Popov, A., 10<br />

power spectral density<br />

of white noise, 47<br />

power-limited case, 116<br />

PPM, 75<br />

probability of correct decision, 16<br />

probability of error<br />

definition, 16<br />

pulse amplitude modulation, 167<br />

pulse transmission, 120<br />

Pupin, M., 9<br />

Q-function, 30<br />

definition of, 30<br />

table of, 31<br />

upper bound for, 182<br />

quadrature amplitude modulation, 127, 135<br />

quadrature multiplexing, 127<br />

random code, 104, 108<br />

real vector channel, 36<br />

receiver, 16<br />

direct, 66<br />

redundancy loss, 176<br />

Reeves, A., 12<br />

reversibility<br />

theorem of, 42<br />

Righi, A., 9<br />

satellites, 10<br />

Schwarz inequality, 197<br />

serial pulse transmission, 120<br />

set-partitioning, 173<br />

Shannon, C.E., 13<br />

shaping gain, 136<br />

signal, 15<br />

signal constellation, 167<br />

signal energy, 83<br />

signal space, 56<br />

signal structure, 56, 61<br />

average energy of, 84<br />

rotation of, 83<br />

translation of, 83<br />

signal vector, 20<br />

signal-to-noise ratio, 66, 115<br />

at output of matched filter, 67<br />

normalized, 168<br />

signaling<br />

antipodal, 85<br />

bit-by-bit, 104<br />

block-orthogonal, 104<br />

orthogonal binary, 85<br />

source, 15



sphere hardening<br />

noise vector, 105<br />

received vector, 106<br />

sphere-packing argument, 107<br />

sufficient statistic, 52<br />

sufficient statistics, 53<br />

symbol error probability, 168<br />

telegraph, 9<br />

telephone, 9<br />

Telstar I, 10<br />

transistor, 10<br />

transmitter, 15<br />

triode, 10<br />

uncoded transmission, 167<br />

union bound, 89, 110<br />

van der Bijl, H., Hartley, R., and Heising, R.,<br />

11<br />

vector channel, 20<br />

vector receiver, 21<br />

vector transmitter, 20<br />

Volta, A., 8<br />

water-filling procedure, 113<br />

waveform channel<br />

relation with vector channel, 60<br />

Wiener, N., 12<br />

zero-forcing (ZF), 124<br />

zero-forcing criterion, 120
