COMMUNICATION THEORY - Signal Processing Systems
COMMUNICATION THEORY<br />
Frans M.J. Willems 1<br />
Spring 2005<br />
1 Group Signal Processing Systems, Electrical Engineering Department, Eindhoven University of Technology.
Contents<br />
I Some History 7<br />
1 Historical facts related to communication 8<br />
1.1 Telegraphy and telephony . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8<br />
1.2 Wireless communications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9<br />
1.3 Electronics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10<br />
1.4 Classical modulation methods . . . . . . . . . . . . . . . . . . . . . . . . . . . 11<br />
1.5 Fundamental bounds and concepts . . . . . . . . . . . . . . . . . . . . . . . . . 11<br />
II Optimum Receiver Principles 14<br />
2 Decision rules for discrete channels 15<br />
2.1 System description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15<br />
2.2 Decision rules, an example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16<br />
2.3 The “maximum a-posteriori” decision rule . . . . . . . . . . . . . . . . . . . . . 18<br />
2.4 The “maximum likelihood” decision rule . . . . . . . . . . . . . . . . . . . . . . 19<br />
2.5 Scalar and vector channels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20<br />
2.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21<br />
3 Decision rules for the real scalar channel 25<br />
3.1 System description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25<br />
3.2 MAP and ML decision rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26<br />
3.3 The scalar additive Gaussian noise channel . . . . . . . . . . . . . . . . . . . . 27<br />
3.3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27<br />
3.3.2 The MAP decision rule . . . . . . . . . . . . . . . . . . . . . . . . . . . 28<br />
3.3.3 The Q-function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30<br />
3.3.4 Probability of error . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30<br />
3.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32<br />
4 Real vector channels 35<br />
4.1 System description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35<br />
4.2 Decision variables, MAP decoding . . . . . . . . . . . . . . . . . . . . . . . . . 36<br />
4.3 Decision regions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37<br />
CONTENTS 2<br />
4.4 The additive Gaussian noise vector channel . . . . . . . . . . . . . . . . . . . . 38<br />
4.4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38<br />
4.4.2 Minimum Euclidean distance decision rule . . . . . . . . . . . . . . . . 39<br />
4.4.3 Error probabilities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39<br />
4.5 Theorem of reversibility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42<br />
4.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42<br />
5 Waveform channels, correlation receiver 46<br />
5.1 System description, additive white Gaussian noise . . . . . . . . . . . . . . . . . 46<br />
5.2 Waveform expansions in series of orthonormal functions . . . . . . . . . . . . . 47<br />
5.3 An equivalent vector channel . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49<br />
5.4 A correlation receiver is optimum . . . . . . . . . . . . . . . . . . . . . . . . . 50<br />
5.5 Sufficient statistic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52<br />
5.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52<br />
6 Waveform channels, building-blocks 53<br />
6.1 Another sufficient statistic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53<br />
6.2 The Gram-Schmidt orthogonalization procedure . . . . . . . . . . . . . . . . . . 55<br />
6.3 Transmitting vectors instead of waveforms . . . . . . . . . . . . . . . . . . . . . 55<br />
6.4 Signal space, signal structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56<br />
6.5 Receiving vectors instead of waveforms . . . . . . . . . . . . . . . . . . . . . . 58<br />
6.6 The additive Gaussian vector channel . . . . . . . . . . . . . . . . . . . . . . . 58<br />
6.7 Processing the received vector . . . . . . . . . . . . . . . . . . . . . . . . . 60<br />
6.8 Relation between waveform and vector channel . . . . . . . . . . . . . . . . . . 60<br />
6.9 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61<br />
7 Matched filters 64<br />
7.1 Matched-filter receiver . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64<br />
7.2 Direct receiver . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66<br />
7.3 Signal-to-noise ratio . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66<br />
7.4 Parseval relation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68<br />
7.5 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70<br />
7.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70<br />
8 Orthogonal signaling 75<br />
8.1 Orthogonal signal structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75<br />
8.2 Optimum receiver . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76<br />
8.3 Error probability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76<br />
8.4 Capacity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77<br />
8.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
III Transmitter Optimization 82<br />
9 Signal energy considerations 83<br />
9.1 Translation and rotation of signal structures . . . . . . . . . . . . . . . . . . . . 83<br />
9.2 Signal energy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83<br />
9.3 Translating a signal structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84<br />
9.4 Comparison of orthogonal and antipodal signaling . . . . . . . . . . . . . . . . . 85<br />
9.5 Energy of |M| orthogonal signals . . . . . . . . . . . . . . . . . . . . . . . . . 88<br />
9.6 Upper bound on the error probability . . . . . . . . . . . . . . . . . . . . . . . . 88<br />
9.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89<br />
10 Signaling for message sequences 92<br />
10.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92<br />
10.2 Definitions, problem statement . . . . . . . . . . . . . . . . . . . . . . . . . . . 92<br />
10.3 Bit-by-bit signaling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93<br />
10.3.1 Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93<br />
10.3.2 Probability of error considerations . . . . . . . . . . . . . . . . . . . . . 96<br />
10.4 Block-orthogonal signaling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97<br />
10.4.1 Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97<br />
10.4.2 Probability of error considerations . . . . . . . . . . . . . . . . . . . . . 97<br />
10.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98<br />
11 Time, bandwidth, and dimensionality 99<br />
11.1 Number of dimensions for bit-by-bit and block-orthogonal signaling . . . . . . . 99<br />
11.2 Dimensionality as a function of T . . . . . . . . . . . . . . . . . . . . . . . . . 99<br />
11.3 Remark . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103<br />
11.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103<br />
12 Bandlimited transmission 104<br />
12.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104<br />
12.2 Sphere hardening (noise vector) . . . . . . . . . . . . . . . . . . . . . . . . . . 105<br />
12.3 Sphere hardening (received vector) . . . . . . . . . . . . . . . . . . . . . . . . . 106<br />
12.4 The interior volume of a hypersphere . . . . . . . . . . . . . . . . . . . . . . . . 107<br />
12.5 An upper bound for the number |M| of signal vectors . . . . . . . . . . . . . . . 107<br />
12.6 A lower bound to the number |M| of signal vectors . . . . . . . . . . . . . . . . 108<br />
12.6.1 Generating a random code . . . . . . . . . . . . . . . . . . . . . . . . . 108<br />
12.6.2 Error probability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108<br />
12.6.3 Achievability result . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111<br />
12.7 Capacity of the bandlimited channel . . . . . . . . . . . . . . . . . . . . . . . . 111<br />
12.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
13 Wideband transmission 114<br />
13.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114<br />
13.2 Capacity of the wideband channel . . . . . . . . . . . . . . . . . . . . . . . . . 114<br />
13.3 Relation between capacities, signal-to-noise ratio . . . . . . . . . . . . . . . . . 115<br />
13.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116<br />
IV Channel Models 118<br />
14 Pulse amplitude modulation 119<br />
14.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119<br />
14.2 Orthonormal pulses: the Nyquist criterion . . . . . . . . . . . . . . . . . . . . . 120<br />
14.2.1 The Nyquist result . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120<br />
14.2.2 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121<br />
14.2.3 Proof of the Nyquist result . . . . . . . . . . . . . . . . . . . . . . . . . 122<br />
14.2.4 Receiver implementation . . . . . . . . . . . . . . . . . . . . . . . . . . 124<br />
14.3 Multi-pulse transmission . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125<br />
14.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126<br />
15 Bandpass channels 127<br />
15.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127<br />
15.2 Quadrature multiplexing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128<br />
15.3 Optimum receiver for quadrature multiplexing . . . . . . . . . . . . . . . . . . . 132<br />
15.4 Transmission of a complex signal . . . . . . . . . . . . . . . . . . . . . . . . . . 134<br />
15.5 Dimensions per second . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134<br />
15.6 Capacity of the bandpass channel . . . . . . . . . . . . . . . . . . . . . . . . . . 135<br />
15.7 Quadrature amplitude modulation . . . . . . . . . . . . . . . . . . . . . . . . . 135<br />
15.8 Serial quadrature amplitude modulation . . . . . . . . . . . . . . . . . . . . . . 135<br />
15.9 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136<br />
16 Random carrier-phase 137<br />
16.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137<br />
16.2 Optimum incoherent reception . . . . . . . . . . . . . . . . . . . . . . . . . . . 138<br />
16.3 <strong>Signal</strong>s with equal energy, receiver implementation . . . . . . . . . . . . . . . . 140<br />
16.4 Envelope detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142<br />
16.5 Probability of error for two orthogonal signals . . . . . . . . . . . . . . . . . . . 144<br />
16.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148<br />
17 Colored noise 150<br />
17.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150<br />
17.2 Another result on reversibility . . . . . . . . . . . . . . . . . . . . . . . . . . . 150<br />
17.3 Additive nonwhite Gaussian noise . . . . . . . . . . . . . . . . . . . . . . . . . 151<br />
17.4 Receiver implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
17.4.1 Correlation type receiver . . . . . . . . . . . . . . . . . . . . . . . . . . 152<br />
17.4.2 Matched-filter type receiver . . . . . . . . . . . . . . . . . . . . . . . . 153<br />
17.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155<br />
18 Capacity of the colored noise waveform channel 157<br />
18.1 Problem statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157<br />
18.2 The sub-channels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158<br />
18.3 Capacity of parallel channels, water-filling . . . . . . . . . . . . . . . . . . . . . 159<br />
18.4 Back to the non-white waveform channel again . . . . . . . . . . . . . . . . . . 161<br />
18.5 Capacity of the linear filter channel . . . . . . . . . . . . . . . . . . . . . . . . . 162<br />
18.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163<br />
V Coded Modulation 165<br />
19 Coding for bandlimited channels 166<br />
19.1 AWGN channel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166<br />
19.2 |A|-Pulse amplitude modulation . . . . . . . . . . . . . . . . . . . . . . . . . . 167<br />
19.3 Normalized S/N . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168<br />
19.4 Baseline performance in terms of S/N_norm . . . . . . . . . . . . . . . . . . . . 169<br />
19.5 The promise of coding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170<br />
19.6 Union bound . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171<br />
19.6.1 Sequence error probability . . . . . . . . . . . . . . . . . . . . . . . . . 171<br />
19.6.2 Symbol error probability . . . . . . . . . . . . . . . . . . . . . . . . . . 172<br />
19.7 Extended Hamming codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172<br />
19.8 Coding by set-partitioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173<br />
19.8.1 Construction of the signal set . . . . . . . . . . . . . . . . . . . . . . . . 173<br />
19.8.2 Distance between the signals . . . . . . . . . . . . . . . . . . . . . . . . 174<br />
19.8.3 Asymptotic coding gain . . . . . . . . . . . . . . . . . . . . . . . . . . 175<br />
19.8.4 Effective coding gain . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177<br />
19.8.5 Coding gains for more complex codes . . . . . . . . . . . . . . . . . . . 178<br />
19.9 Remark . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179<br />
19.10 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179<br />
VI Appendices 181<br />
A An upper bound for the Q-function 182<br />
B The Fourier transform 184<br />
B.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184<br />
B.2 Some properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184<br />
B.2.1 Parseval’s relation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184
C Impulse signal, filters 186<br />
C.1 The impulse or delta signal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186<br />
C.2 Linear time-invariant systems, filters . . . . . . . . . . . . . . . . . . . . . . . . 186<br />
D Correlation functions, power spectra 188<br />
D.1 Expectation of an integral . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188<br />
D.2 Power spectrum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189<br />
D.3 Interpretation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189<br />
D.4 Wide-sense stationarity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190<br />
D.5 Gaussian processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191<br />
D.6 Properties of S_x(f) and R_x(τ) . . . . . . . . . . . . . . . . . . . . . . . . . . 191<br />
E Gram-Schmidt procedure, proof, example 192<br />
E.1 Proof of result 6.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192<br />
E.2 An example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193<br />
F Schwarz inequality 197<br />
G Bound error probability orthogonal signaling 199<br />
H Decibels 202<br />
I Whitening filters 203<br />
J Elementary energy E_pam of an |A|-PAM constellation 205
Part I<br />
Some History<br />
Chapter 1<br />
Historical facts related to communication<br />
SUMMARY: In this chapter we mention some historical facts related to telegraphy and<br />
telephony, wireless communication, electronics, and modulation. Furthermore we briefly<br />
discuss some classical fundamental bounds and concepts relevant to digital communication.<br />
1.1 Telegraphy and telephony<br />
1800 Alessandro Volta (Italy 1745 - 1827) Announcement of the electric battery (Volta’s pile).<br />
This battery made constant-current electricity possible. Motivated by the “frog-leg” experiments<br />
of Luigi Galvani (Italy 1737 - 1798), anatomy professor at the University of<br />
Bologna, Volta discovered that a battery could be constructed from a plate of silver<br />
and a plate of zinc separated by spongy matter impregnated with a saline solution, with<br />
this cell repeated thirty or forty times.<br />
1819 Hans Christian Oersted (Denmark 1777 - 1851) Discovery of the fact that an electric<br />
current generates a magnetic field. This can be regarded as the first electro-magnetic result.<br />
It took twenty years after Volta’s invention to realize and prove that this effect exists,<br />
although it was already known at the time that a stroke of lightning could magnetize objects.<br />
1827 Joseph Henry (U.S. 1797 - 1878) Constructed electromagnets using coils of wire.<br />
1838 Samuel Morse (U.S. 1791 - 1872) Demonstration of the electric telegraph in Morristown,<br />
New Jersey. The first telegraph line linked Washington with Baltimore. It became operational<br />
in May 1844. By 1848 every state east of the Mississippi was linked by telegraph<br />
lines. The alphabet that was designed by Morse (a portrait-painter) transforms letters into<br />
variable-length sequences (code words) of dots and dashes (see table 1.1). A dash should<br />
last roughly three times as long as a dot. Frequent letters (e.g. E) get short code words,<br />
less frequently occurring letters (e.g. J, Q, and Y) are represented by longer words. The<br />
first transcontinental telegraph line was completed in 1861. A transatlantic cable became<br />
operational in 1866; an earlier cable, completed in 1858, had failed after a<br />
few weeks.<br />
CHAPTER 1. HISTORICAL FACTS RELATED TO <strong>COMMUNICATION</strong> 9<br />
A .-      G --.     M --      S ...     Y -.--<br />
B -...    H ....    N -.      T -       Z --..<br />
C -.-.    I ..      O ---     U ..-     period .-.-.-<br />
D -..     J .---    P .--.    V ...-    ? ..--..<br />
E .       K -.-     Q --.-    W .--<br />
F ..-.    L .-..    R .-.     X -..-<br />
Table 1.1: Morse code.<br />
It should be mentioned that in 1833 C.F. Gauss and W.E. Weber demonstrated an electromagnetic<br />
telegraph in Göttingen, Germany.<br />
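The variable-length idea behind Morse’s alphabet can be sketched in a few lines of code. The letter subset below is transcribed from table 1.1; the dictionary and function names are our own illustration, not part of the historical record.

```python
# Partial Morse alphabet from table 1.1 (dot = ".", dash = "-").
# Frequent letters (E, T) get short code words; rare ones (J, Q, Y) get long ones.
MORSE = {
    "E": ".", "T": "-", "A": ".-", "N": "-.",
    "J": ".---", "Q": "--.-", "Y": "-.--",
}

def encode(text):
    """Encode a string letter by letter, separating code words with spaces."""
    return " ".join(MORSE[c] for c in text.upper())

print(encode("eta"))   # frequent letters -> short code words
print(encode("jay"))   # rarer letters -> longer code words
```

This is exactly the principle later formalized in source coding: matching code-word length to letter frequency shortens the average transmission.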
1875 Alexander Graham Bell (Scotland 1847 - 1922 U.S.) Invention of the telephone. Bell<br />
was a teacher of the deaf. His invention was patented in 1876 (electro-magnetic telephone).<br />
In 1877 Bell established the Bell Telephone Company. Early versions provided<br />
service over several hundred miles. Advances in quality resulted e.g. from the invention<br />
of the carbon microphone.<br />
1900 Michael Pupin (Yugoslavia 1858 - 1935 U.S.) Obtained a patent on loading coils for the<br />
improvement of telephone communications. Adding these Pupin-coils at specified intervals<br />
along a telephone line reduced attenuation significantly. Campbell disclosed the same<br />
invention two days after Pupin. Pupin sold his patent for $455,000 to<br />
AT&T Co. Pupin (and Campbell) were the first to set up a theory of transmission lines.<br />
The implications of this theory were not at all obvious to “experimentalists”.<br />
1.2 Wireless communications<br />
1831 Michael Faraday (England 1791 - 1867) Discovery of the fact that a changing magnetic<br />
field induces an electric current in a conducting circuit. This is the famous law of induction.<br />
1873 James Clerk Maxwell (Scotland 1831 - 1879, England) The publication of “Treatise on<br />
Electricity and Magnetism”. Maxwell combined the results obtained by Oersted and Faraday<br />
into a single theory. From this theory he could predict the existence of electromagnetic<br />
waves.<br />
1886 Heinrich Hertz (Germany 1857 - 1894) Demonstration of the existence of electromagnetic<br />
waves. In his laboratory at the university of Karlsruhe, Hertz used a spark-transmitter<br />
to generate the waves and a resonator to detect them.<br />
1890 Edouard Branly (France 1844 - 1940) Invention of the coherer, a receiving device<br />
for electromagnetic waves based on the principle that, while most powdered metals are<br />
poor direct-current conductors, metallic powder becomes conductive when a high-frequency<br />
current is applied. The coherer was further refined by Augusto Righi (Italy 1850 - 1920)<br />
at the University of Bologna and by Oliver Lodge (England 1851 - 1940).<br />
1896 Aleksander Popov (Russia 1859 - 1906) Wireless telegraph transmission between two<br />
buildings (200 meters) was shown to be possible. Marconi did similar experiments at<br />
roughly the same time.<br />
1901 Guglielmo Marconi (Italy 1874 - 1937) attended the lectures of Righi in Bologna and<br />
built a first radio telegraph in 1895. He used Hertz’s spark transmitter, and Lodge’s coherer<br />
and added antennas to improve performance. In 1898 he developed a tuning device<br />
and was now able to use two radio circuits simultaneously. The first cross-Atlantic radio<br />
link was operated by Marconi in 1901. A radio signal originating from Cornwall, England,<br />
1700 miles away, was received at St. John’s, Newfoundland. Importantly,<br />
Marconi applied directional antennas.<br />
1955 John R. Pierce (U.S. 1910 - 2002) Pierce (Bell Laboratories) proposed the use of satellites<br />
for communications and did pioneering work in this area. The idea originally came<br />
from Arthur C. Clarke, who already in 1945 suggested using earth-orbiting satellites<br />
as relay points between earth stations. The first active communications satellite, Telstar I, built by Bell Laboratories,<br />
was launched in 1962. It served as a relay station for TV programs across the<br />
Atlantic. Pierce also headed the Bell-Labs team that developed the transistor and suggested<br />
the name for it.<br />
1.3 Electronics<br />
1874 Karl Ferdinand Braun (Germany 1850 - 1918) Observation of rectification at metal contacts<br />
to galena (lead sulfide). These semiconductor devices were the forerunners of the<br />
“cat’s whisker” diodes used for the detection of radio signals (crystal detectors).<br />
1904 John Ambrose Fleming (England 1849 - 1945) Invention of the thermionic diode, “the<br />
valve”, that could be used as rectifier. A rectifier can convert trains of high-frequency<br />
oscillations into trains of intermittent but unidirectional current. A telephone can then be<br />
used to produce a sound with the frequency of the trains of the sparks.<br />
1906 Lee DeForest (U.S. 1873 - 1961) Invention of the vacuum triode, a diode with grid control.<br />
This invention made practical the cascade amplifier, the triode<br />
oscillator and the regenerative feedback circuit. As a result, transcontinental telephone transmission became<br />
operational in 1915. A transatlantic telephone cable was not laid until 1953.<br />
1948 John Bardeen, Walter Brattain and William Shockley Development of the theory of the<br />
junction transistor. This transistor was first fabricated in 1950. The point contact transistor<br />
was invented in December 1947 at Bell Telephone Labs.<br />
1958 Jack Kilby and Robert Noyce (U.S.) Invention of the integrated circuit.
1.4 Classical modulation methods<br />
Beginning of the 20th century Carrier-frequency currents were generated by electric arcs or<br />
by high-frequency alternators. These currents were modulated by carbon transmitters and<br />
demodulated (or converted to audio) by a crystal detector or another kind of rectifier.<br />
1909 George A. Campbell (U.S. 1870 - 1954) Internal AT&T memorandum on the “Electric<br />
Wave Filter”. Campbell there discussed band-pass filters that would reject frequencies<br />
other than those in a narrow band. A patent application filed in 1915 was awarded [2].<br />
1915 Edwin Howard Armstrong (U.S. 1890 - 1954) Invention of regenerative amplifiers and<br />
oscillators. Regenerative amplification (using positive feedback to obtain a nearly oscillating<br />
circuit) considerably increased the sensitivity of the receiver. DeForest claimed the<br />
same invention.<br />
1915 Hendrik van der Bijl, Ralph V.L. Hartley, and Raymond A. Heising In a patent that<br />
was filed in 1915 van der Bijl showed how the non-linear portion of a vacuum tube could be<br />
utilized to modulate and demodulate. Heising participated in a project (in Arlington, VA) in<br />
which a vacuum-tube transmitter based on the van der Bijl modulator was designed. Heising<br />
invented the constant-current modulator. Hartley also participated in the Arlington<br />
project and designed the receiver. He invented an oscillator that was named after him. In a<br />
paper published in 1923 Hartley explained how suppression of the carrier signal and using<br />
only one of the sidebands could economize transmitter power and reduce interference.<br />
1915 John R. Carson Publication of a mathematical analysis of the modulation and demodulation<br />
process. Description of single-sideband and suppressed carrier methods.<br />
1918 Edwin Howard Armstrong Invention of the superheterodyne radio receiver. The combination<br />
of the received signal with a local oscillation resulted in an audio beat-note.<br />
1922 John R. Carson “Notes on the Theory of Modulation [3]”. Carson compared amplitude<br />
modulation (AM) and frequency modulation (FM). He concluded that FM<br />
was inferior to AM from the perspective of bandwidth requirements. He overlooked the<br />
possibility that wide-band FM might offer advantages over AM, as was later demonstrated<br />
by Armstrong.<br />
1933 Edwin Howard Armstrong Demonstration of an FM system to RCA. A patent was granted<br />
to Armstrong. Despite the larger bandwidth that is needed, FM gives considerably<br />
better performance than AM.<br />
1.5 Fundamental bounds and concepts<br />
1924 Harry Nyquist (Sweden 1889 - 1976 U.S.) Showed that the number of resolvable (noninterfering)<br />
pulses that can be transmitted per second over a bandlimited channel is proportional<br />
to the channel bandwidth [15]. If the bandwidth is W Hz the number of pulses<br />
[Figure 1.1: The communication problem considered by Wiener. The signal s(t) is corrupted by additive noise n(t); the sum passes through a linear filter whose output is the estimate ŝ(t).]<br />
per second cannot exceed 2W. This is strongly related to the sampling theorem, which says<br />
that time is essentially discrete: if a time-continuous signal has bandwidth not larger than<br />
W Hz, then it is completely specified by samples taken from it at discrete time instants<br />
1/(2W) seconds apart.<br />
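The sampling theorem can be checked numerically. The sketch below reconstructs a bandlimited signal from samples taken 1/(2W) seconds apart by (truncated) sinc interpolation; the test signal, the grid size, and all names are our own choices, not from the text.

```python
import numpy as np

W = 4.0           # bandwidth in Hz
Ts = 1 / (2 * W)  # Nyquist sampling interval: 1/(2W) seconds

def x(t):
    # A test signal whose spectral content (1 Hz and 3 Hz) lies below W.
    return np.sin(2 * np.pi * 1.0 * t) + 0.5 * np.sin(2 * np.pi * 3.0 * t)

# Samples on a wide grid (the ideal interpolation sum is infinite; we truncate it).
n = np.arange(-400, 401)
samples = x(n * Ts)

def reconstruct(t):
    # Shannon/Whittaker interpolation: sum_n x(nTs) * sinc((t - nTs)/Ts).
    return np.sum(samples * np.sinc((t - n * Ts) / Ts))

t0 = 0.123  # a point between sampling instants
print(abs(reconstruct(t0) - x(t0)))  # small truncation error
```

At the sampling instants themselves the sinc terms reduce to a single sample, so the reconstruction is exact there; in between, the error shrinks as more terms are kept.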
1928 Ralph V.L. Hartley (U.S. 1888 - 1970) Determined the number of distinguishable pulse<br />
amplitudes [8]. Suppose that the amplitudes are confined to the interval [−A, +A], and<br />
that the receiver can estimate an amplitude reliably only to an accuracy of ±Δ. Then<br />
the number of distinguishable pulses is roughly 2(A + Δ)/(2Δ) = A/Δ + 1. Clearly Hartley was<br />
concerned with digital communication. He realized that inaccuracy (due to noise) limited<br />
the amount of information that could be transmitted.<br />
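Hartley’s count can be illustrated with a toy computation; the particular values of A and Δ below are invented for the example.

```python
import math

def hartley_levels(A, delta):
    """Number of amplitudes distinguishable to accuracy +/- delta in [-A, +A]:
    2(A + delta) / (2 delta) = A/delta + 1."""
    return A / delta + 1

A, delta = 7.0, 1.0
levels = hartley_levels(A, delta)   # 8 distinguishable amplitudes
bits = math.log2(levels)            # information conveyed per pulse
print(levels, bits)
```

Halving Δ (a less noisy receiver) roughly doubles the number of levels and so adds about one bit per pulse, which is exactly the noise-limits-information point Hartley was making.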
1938 Alec Reeves Invention of Pulse Code Modulation (PCM) for digital encoding of speech<br />
signals. In World War II PCM-encoding allowed transmission of encrypted speech (Bell<br />
Laboratories). In [16] Oliver, Pierce and Shannon compared PCM to classical modulation<br />
systems.<br />
1939 Homer Dudley Description of a vocoder. This system overthrew the idea that communication<br />
requires a bandwidth at least as wide as that of the signal to be communicated.<br />
1942 Norbert Wiener (U.S. 1894 - 1964) Investigated the problem shown in figure 1.1. The<br />
linear filter is to be chosen such that its output ŝ(t) is the best mean-square approximation<br />
to s(t), given the statistical properties of the processes S(t) and N(t). A drawback of<br />
Wiener’s result was that modulation did not fit into his model and could not be<br />
analyzed.<br />
1943 Dwight O. North Discovery of the matched filter. This filter was shown to be the optimum<br />
detector of a known signal in additive white noise.<br />
1947 Vladimir A. Kotelnikov (Russia 1908 - ) Analysis of several modulation systems. His<br />
noise immunity work [11] dealt with how to design receivers (but not how to choose the<br />
transmitted signals) to minimize error probability in the case of digital signals or mean<br />
square error in the case of analog signals. Kotelnikov had, independently of Nyquist,<br />
discovered the sampling theorem in 1933.<br />
Figure 1.2: Claude E. Shannon, founder of Information Theory. Photo IEEE-IT Soc. Newsl.,<br />
Summer 1998.<br />
1948 Claude E. Shannon (U.S. 1916 - 2001) Publication of “A Mathematical Theory of Communication<br />
[20].” Shannon (see figure 1.2) showed that noise does not place an inescapable<br />
restriction on the accuracy of communication. He showed that noise properties, channel<br />
bandwidth, and restricted signal magnitude can be incorporated into a single parameter C<br />
which he called the channel capacity. If the cardinality |M| of the message set M grows as a<br />
function of the signal duration T such that<br />
(1/T) log₂ |M| ≈ C − ɛ,<br />
for some small ɛ > 0, then arbitrarily high communication accuracy is possible by increasing<br />
T, i.e. by taking longer and longer signals (code words). Conversely, Shannon showed<br />
that reliable communication is not possible when<br />
(1/T) log₂ |M| ≈ C + ɛ,<br />
for any ɛ > 0. Therefore a channel can be considered as a pipe through which bits<br />
can reliably be transmitted up to rate C.<br />
Other results of Shannon include the application of Boole’s algebra to switching circuits<br />
(M.S. thesis, MIT, 1938). It was also Shannon who introduced the sampling theorem to<br />
the engineering community in 1949.
Part II<br />
Optimum Receiver Principles<br />
Chapter 2<br />
Decision rules for discrete channels<br />
SUMMARY: In this first chapter we investigate a discrete channel, i.e. a channel with a<br />
discrete input and output alphabet. For this discrete channel we determine the optimum<br />
receiver, i.e. the receiver that minimizes the error probability P E . The so-called maximum<br />
a-posteriori (MAP) receiver turns out to be optimum. When all messages are equally likely<br />
the optimum receiver reduces to a maximum-likelihood (ML) receiver. In the last section<br />
of this chapter we distinguish between discrete scalar and vector channels.<br />
2.1 System description<br />
[Figure: block diagram — source →(m)→ transmitter →(s_m)→ discrete channel →(r)→ receiver →(ˆm)→ destination.]
Figure 2.1: Elements in a communication system based on a discrete channel.
We start by considering a very simple communication system based on a discrete channel. It<br />
consists of the following elements (see figure 2.1):<br />
SOURCE An information source produces message m ∈ M = {1, 2, · · · , |M|}, one out of<br />
|M| alternatives. Message m occurs with probability Pr{M = m} for m ∈ M. This<br />
probability is called the a-priori message probability and M is the name of the random<br />
variable associated with this mechanism.<br />
TRANSMITTER The transmitter sends a signal s m if message m is to be transmitted. This<br />
signal is input to the channel. It assumes values from the discrete channel input alphabet<br />
S. The random variable corresponding to the signal is denoted by S. The collection of<br />
used signals is s 1 , s 2 , · · · , s |M| .<br />
CHAPTER 2. DECISION RULES FOR DISCRETE CHANNELS 16<br />
DISCRETE CHANNEL The channel produces an output r that assumes a value from a discrete<br />
alphabet R. If the channel input is signal s ∈ S the output r ∈ R occurs with conditional<br />
probability Pr{R = r|S = s}. This channel output is directed to the receiver. The random<br />
variable associated with the channel output is denoted by R.<br />
When message m ∈ M occurs, the transmitter chooses s m as channel input. Therefore<br />
Pr{R = r|M = m} = Pr{R = r|S = s m } for all r ∈ R. (2.1)<br />
These conditional probabilities describe the behavior of the transmitter followed by the<br />
channel.<br />
RECEIVER The receiver forms an estimate ˆm of the transmitted message (or signal) by looking<br />
at the received channel output r ∈ R, hence ˆm = f (r). The random variable corresponding<br />
to this estimate is called ˆM. We assume here that ˆm ∈ M = {1, 2, · · · , |M|}, thus the receiver has to choose one of the possible messages (and cannot, for example, declare an error).
DESTINATION The destination accepts the estimate ˆm.<br />
Note that we call our system discrete because the channel is discrete. It has a discrete input<br />
and output alphabet.<br />
The performance of our communication system is evaluated by considering the probability<br />
of error. This performance depends on the receiver, i.e. on the mapping f (·).<br />
Definition 2.1 The mapping f (·) is called the decision rule.<br />
Definition 2.2 The probability of error is defined as
P_E = Pr{ˆM ≠ M}. (2.2)
We are now interested in choosing a decision rule f (·) that minimizes P E . A decision rule that<br />
minimizes the error probability P E is called optimum. The corresponding receiver is called an<br />
optimum receiver.<br />
The probability of correct (right) decision is denoted as P C and obviously P E + P C = 1.<br />
2.2 Decision rules, an example<br />
To get more familiar with the properties of our system we study an example first.<br />
Example 2.1 We assume that |M| = 2, i.e. there are two possible messages. Their a-priori probabilities<br />
can be found in the following table:<br />
m Pr{M = m}<br />
1 0.4<br />
2 0.6
The two signals corresponding to the messages are s 1 and s 2 and the conditional probabilities Pr{R =<br />
r|S = s m } are given in the table below for the values of r and m that can occur. There are three values that<br />
r can assume, i.e. R = {a, b, c}.<br />
m Pr{R = a|S = s m } Pr{R = b|S = s m } Pr{R = c|S = s m }<br />
1 0.5 0.4 0.1<br />
2 0.1 0.3 0.6<br />
Note that the row sums are equal to one.<br />
Now suppose that we use the decision rule f (·) that is given by:<br />
r a b c<br />
f (r) 1 1 2<br />
This means that the receiver outputs the estimate ˆm = 1 if the channel output is a or b and ˆm = 2 if the<br />
channel output is c. To determine the probability of error P E that is achieved by this rule we first compute<br />
the joint probabilities<br />
Pr{M = m, R = r} = Pr{M = m} Pr{R = r|M = m}<br />
= Pr{M = m} Pr{R = r|S = s m }, (2.3)<br />
where we used (2.1). We list these joint probabilities in the following table:<br />
m Pr{M = m, R = a} Pr{M = m, R = b} Pr{M = m, R = c}<br />
1 0.20 0.16 0.04<br />
2 0.06 0.18 0.36<br />
Now we can determine the probability of correct decision P C = 1 − P E which is:<br />
P C = Pr{M = 1, R = a} + Pr{M = 1, R = b} + Pr{M = 2, R = c}<br />
= 0.20 + 0.16 + 0.36 = 0.72. (2.4)<br />
This decision rule is not optimum. To see why, note that for every r ∈ R a certain message ˆm ∈<br />
M is chosen. In other words the decision rule selects in each column exactly one joint probability.<br />
These selected probabilities are added together to form P C . Our decision rule selects the joint probability<br />
Pr{M = 1, R = b} = 0.16 in the column that corresponds to R = b. A larger P C is obtained if for<br />
output R = b instead of message 1 the message 2 is chosen. In that case the larger joint probability<br />
Pr{M = 2, R = b} = 0.18 is selected. Now the probability of correct decision becomes<br />
P C = Pr{M = 1, R = a} + Pr{M = 2, R = b} + Pr{M = 2, R = c}<br />
= 0.20 + 0.18 + 0.36 = 0.74. (2.5)<br />
We may conclude that, in general, to maximize P C , we have to choose the m that achieves the largest<br />
Pr{M = m, R = r} for the r that was received. This will be made more precise in the following section.
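The bookkeeping of this example is easy to automate. Below is a minimal sketch (Python; the variable names are ours, not from the text) that rebuilds the joint-probability table of Example 2.1 and evaluates P_C for the two decision rules just discussed:

```python
# Joint probabilities Pr{M=m, R=r} of Example 2.1, built from the
# a-priori probabilities and the channel transition probabilities.
prior = {1: 0.4, 2: 0.6}
trans = {1: {'a': 0.5, 'b': 0.4, 'c': 0.1},   # Pr{R=r|S=s_1}
         2: {'a': 0.1, 'b': 0.3, 'c': 0.6}}   # Pr{R=r|S=s_2}

joint = {(m, r): prior[m] * trans[m][r] for m in prior for r in 'abc'}

def prob_correct(rule):
    # A decision rule picks one joint probability in every column;
    # P_C is the sum of the selected entries.
    return sum(joint[(rule[r], r)] for r in 'abc')

print(round(prob_correct({'a': 1, 'b': 1, 'c': 2}), 2))  # 0.72, rule of (2.4)
print(round(prob_correct({'a': 1, 'b': 2, 'c': 2}), 2))  # 0.74, improved rule
```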
2.3 The “maximum a-posteriori” decision rule<br />
We can upper bound the probability of correct decision as follows:
P_C = Σ_r Pr{M = f(r), R = r} ≤ Σ_r max_m Pr{M = m, R = r}. (2.6)
This upper bound is achieved if for all r the estimate f (r) corresponds to the message m that<br />
achieves the maximum in max m Pr{M = m, R = r}, hence an optimum decision rule for the<br />
discrete channel achieves<br />
Pr{M = f (r), R = r} ≥ Pr{M = m, R = r}, for all m ∈ M, (2.7)<br />
for all r ∈ R that can be received, i.e. that have Pr{R = r} > 0. Note that it is possible that for<br />
some channel outputs r ∈ R more than one decision f (r) is optimum.<br />
Definition 2.3 For a communication system based on a discrete channel the joint probabilities<br />
Pr{M = m, R = r} = Pr{M = m} Pr{R = r|M = m} = Pr{M = m} Pr{R = r|S = s_m}, (2.8)
which, given the received output value r, are indexed by m ∈ M, are called the decision variables. An optimum decision rule f(·) is based on these variables.
RESULT 2.1 (MAP decision rule) To minimize the probability of error P E , the decision rule<br />
f (·) should produce for each received r a message m having the largest decision variable. Hence<br />
for r ∈ R that actually can occur, i.e. that have Pr{R = r} > 0,<br />
f(r) = arg max_{m∈M} Pr{M = m} Pr{R = r|S = s_m}. (2.9)
For such r we can divide all decision variables by the channel output probability Pr{R = r} = Σ_{m∈M} Pr{M = m} Pr{R = r|M = m}. This results in the a-posteriori probabilities of the messages m ∈ M given the channel output r. By Bayes' rule
Pr{M = m} Pr{R = r|S = s_m} / Pr{R = r} = Pr{M = m|R = r}, for all m ∈ M, (2.10)
therefore the optimum receiver chooses as ˆm a message that has maximum a-posteriori probability (MAP). This decision rule is called the MAP decision rule, and the receiver is called a MAP receiver.
Example 2.2 Below are the a-posteriori probabilities that correspond to the example in the previous section<br />
(obtained from Pr{R = a} = 0.26, Pr{R = b} = 0.34, and Pr{R = c} = 0.4). Note that the<br />
probabilities in a column add up to one now.<br />
m Pr{M = m|R = a} Pr{M = m|R = b} Pr{M = m|R = c}<br />
1 20/26 16/34 4/40<br />
2 6/26 18/34 36/40<br />
From this table it is clear that the optimum decision rule is<br />
This leads to an error probability<br />
r a b c<br />
f (r) 1 2 2<br />
P_E = Σ_m Pr{M = m} Pr{ˆM ≠ m|M = m} = 0.4(0.4 + 0.1) + 0.6 · 0.1 = 0.26, (2.11)
where Pr{ˆM ≠ m|M = m} = Σ_{r: f(r)≠m} Pr{R = r|S = s_m} is the probability of error conditional on the fact that message m was sent.
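The MAP rule of Example 2.2 can be verified numerically. The sketch below (Python; illustrative names, not from the text) selects, for each output, the message with the largest decision variable and recomputes the error probability:

```python
# MAP decision rule (2.9) for Example 2.1/2.2: for each output r, pick
# the message maximizing the decision variable Pr{M=m} Pr{R=r|S=s_m}.
prior = {1: 0.4, 2: 0.6}
trans = {1: {'a': 0.5, 'b': 0.4, 'c': 0.1},
         2: {'a': 0.1, 'b': 0.3, 'c': 0.6}}

rule = {r: max(prior, key=lambda m: prior[m] * trans[m][r]) for r in 'abc'}
print(rule)  # {'a': 1, 'b': 2, 'c': 2}

# Error probability (2.11): sum the non-selected joint probabilities.
p_error = sum(prior[m] * trans[m][r]
              for m in prior for r in 'abc' if rule[r] != m)
print(round(p_error, 2))  # 0.26
```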
2.4 The “maximum likelihood” decision rule<br />
When all messages are equally likely, i.e. when
Pr{M = m} = 1/|M| for all m ∈ M = {1, 2, · · · , |M|}, (2.12)
we get for the joint probabilities
Pr{M = m, R = r} = Pr{M = m} Pr{R = r|M = m} = (1/|M|) Pr{R = r|S = s_m}. (2.13)
Considering (2.7) we obtain:<br />
RESULT 2.2 (ML decision rule) If the a-priori message probabilities are all equal, in order to<br />
minimize the error probability P E , a decision rule f (·) has to be applied that satisfies<br />
f(r) = arg max_{m∈M} Pr{R = r|S = s_m}, (2.14)
for r ∈ R that can occur, i.e. for r with Pr{R = r} > 0. Such a decision rule is called a maximum-likelihood (ML) decision rule; the resulting receiver is a maximum-likelihood receiver.
Given the received channel output r ∈ R the receiver chooses a message ˆm ∈ M for which<br />
the received r has maximum likelihood. The transition probabilities Pr{R = r|S = s m } for<br />
r ∈ R and m ∈ M are called likelihoods. A receiver that operates like this (no matter whether<br />
or not the messages are indeed equally likely) is called a maximum likelihood receiver. It should<br />
be noted that such a receiver is less complex than a maximum a-posteriori receiver since it does<br />
not have to multiply the transition probabilities Pr{R = r|S = s m } by the a-priori probabilities<br />
Pr{M = m}.
Example 2.3 Consider again the transition probabilities of the example in section 2.2:<br />
m Pr{R = a|S = s m } Pr{R = b|S = s m } Pr{R = c|S = s m }<br />
1 0.5 0.4 0.1<br />
2 0.1 0.3 0.6<br />
From this table follows the maximum-likelihood decision rule which is<br />
r a b c<br />
f (r) 1 1 2<br />
When the messages 1 and 2 both have a-priori probability 1/2 the probability of correct decision is
P_C = Σ_m Pr{M = m} Pr{ˆM = m|M = m} = (1/2)(0.5 + 0.4) + (1/2) · 0.6 = 0.75. (2.15)
Here Pr{ˆM = m|M = m} = Σ_{r: f(r)=m} Pr{R = r|S = s_m} is the probability of a correct decision conditioned on the fact that message m was sent. Note that the maximum-likelihood decision rule is optimum here since both messages have the same a-priori probability.
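Likewise, the ML rule of Example 2.3 follows by maximizing the likelihoods alone; a small sketch (Python, illustrative names):

```python
# ML decision rule (2.14): maximize the likelihood Pr{R=r|S=s_m} only.
trans = {1: {'a': 0.5, 'b': 0.4, 'c': 0.1},
         2: {'a': 0.1, 'b': 0.3, 'c': 0.6}}

rule = {r: max(trans, key=lambda m: trans[m][r]) for r in 'abc'}
print(rule)  # {'a': 1, 'b': 1, 'c': 2}

# With equal a-priori probabilities the ML rule is optimum; P_C as in (2.15).
p_correct = sum(0.5 * trans[m][r] for m in trans for r in 'abc' if rule[r] == m)
print(round(p_correct, 2))  # 0.75
```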
2.5 Scalar and vector channels<br />
[Figure: block diagram — source →(m)→ vector transmitter →(s_m = (s_m1, …, s_mN))→ discrete vector channel →(r = (r_1, …, r_N))→ vector receiver →(ˆm)→ destination.]
Figure 2.2: Elements in a communication system based on a discrete vector channel.
So far we have considered in this chapter a discrete channel that accepts a single input s ∈ S<br />
and produces a single output r ∈ R. This channel could be called a discrete scalar channel.<br />
There is also a vector variant of this channel, see figure 2.2. The components of a discrete vector<br />
system are described below.<br />
TRANSMITTER Let N be a positive integer. Then the vector transmitter sends a signal vector s_m = (s_m1, s_m2, · · · , s_mN) of N components from the discrete alphabet S if message m is to be conveyed. This signal vector is input to the discrete vector channel. The random variable corresponding to the signal vector is denoted by S. The collection of used signal vectors is s_1, s_2, · · · , s_|M|.
DISCRETE VECTOR CHANNEL This channel produces an output vector r = (r_1, r_2, · · · , r_N) with N components, all assuming a value from a discrete alphabet R. If the channel input was signal s ∈ S^N the output r ∈ R^N occurs with conditional probability Pr{R =
r|S = s}. This channel output vector is directed to the vector receiver. The random<br />
variable associated with the channel output is denoted by R.<br />
When message m ∈ M occurs, s m is chosen by the transmitter as channel input vector.<br />
Then the conditional probabilities<br />
Pr{R = r|M = m} = Pr{R = r|S = s_m} for all r ∈ R^N, (2.16)
describe the behavior of the vector transmitter followed by the discrete vector channel.<br />
RECEIVER The vector receiver forms an estimate ˆm of the transmitted message (or signal) by<br />
looking at the received channel output vector r ∈ R N , hence ˆm = f (r). The mapping<br />
f (·) is again called the decision rule.<br />
It will not be a big surprise that we define the decision variables for the discrete vector channel<br />
as follows:<br />
Definition 2.4 For a communication system based on a discrete vector channel the decision<br />
variables are again the joint probabilities<br />
Pr{M = m, R = r} = Pr{M = m} Pr{R = r|M = m} = Pr{M = m} Pr{R = r|S = s_m}, (2.17)
which are, given the received output vector r, indexed by m ∈ M.
The following result tells us what the optimum decision rule should be for the discrete vector<br />
channel:<br />
RESULT 2.3 (MAP decision rule for the discrete vector channel) To minimize the probability<br />
of error P E , the decision rule f (·) should produce for each received vector r a message m<br />
having the largest decision variable. Hence for vectors r ∈ R N that actually can occur, i.e. that<br />
have Pr{R = r} > 0,
f(r) = arg max_{m∈M} Pr{M = m} Pr{R = r|S = s_m}. (2.18)
This decision rule is the MAP decision rule.
There is obviously also a "maximum-likelihood version" of this result.
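As a sketch of the vector case, assume, purely for illustration (the text allows arbitrary joint transition probabilities), a memoryless vector channel whose transition probability factorizes over the N components. The MAP rule (2.18) then multiplies per-component likelihoods:

```python
import itertools, math

# Hypothetical memoryless vector channel: Pr{R=r|S=s} is assumed to
# factorize over components (an illustrative assumption, not from the text).
P_comp = {(0, 0): 0.9, (0, 1): 0.1,   # Pr{r_i|s_i=0}
          (1, 0): 0.2, (1, 1): 0.8}   # Pr{r_i|s_i=1}

signals = {1: (0, 0, 0), 2: (1, 1, 1)}  # code words s_1 and s_2 (illustrative)
prior = {1: 0.5, 2: 0.5}

def likelihood(r, s):
    # Product of per-component transition probabilities.
    return math.prod(P_comp[(si, ri)] for si, ri in zip(s, r))

def map_decision(r):
    # MAP rule (2.18): maximize Pr{M=m} Pr{R=r|S=s_m} over m.
    return max(signals, key=lambda m: prior[m] * likelihood(r, signals[m]))

print(map_decision((0, 1, 0)))  # 1: two of three components favor s_1
```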
2.6 Exercises<br />
1. Consider the noisy discrete communication channel illustrated in figure 2.3.<br />
(a) If Pr{M = 1} = 0.7 and Pr{M = 2} = 0.3, determine the optimum decision rule<br />
(assignment of a, b or c to 1, 2) and the resulting probability of error P E .
[Figure: channel diagram — inputs 1 and 2, outputs a, b, c; the branch labels are the transition probabilities: 0.7, 0.2, 0.1 on the branches leaving input 1 and 0.3, 0.2, 0.5 on the branches leaving input 2.]
Figure 2.3: A noisy discrete communication system with input M and output R.
(b) There are eight decision rules. Plot the probability of error for each decision rule<br />
versus Pr{M = 1} on one graph.<br />
(c) Each decision rule has a maximum probability of error which occurs for some least favorable a-priori probability Pr{M = 1}. The decision rule whose maximum probability of error is smallest is called the minimax decision rule. Which of the eight rules is minimax?
(Exercise 2.12 from Wozencraft and Jacobs [25].)<br />
2. Consider a communication system based on a discrete channel. The number of messages is five, i.e. M = {1, 2, 3, 4, 5}. The a-priori probabilities are Pr{M = 1} = Pr{M = 5} =
0.1, Pr{M = 2} = Pr{M = 4} = 0.2, and Pr{M = 3} = 0.4. The signal corresponding to<br />
message m is equal to m, thus s m = m, for m ∈ M. This signal is the input for the discrete<br />
channel.<br />
The channel adds the noise n to the signal, hence the channel output<br />
r = s m + n. (2.19)<br />
The noise variable N is independent of the channel input and can have the values −1, 0,<br />
and +1 only. It is given that Pr{N = 0} = 0.4 and Pr{N = −1} = Pr{N = +1} = 0.3.<br />
(a) Note that the channel output assumes values in {0, 1, 2, 3, 4, 5, 6}. Determine the<br />
probability Pr{R = 2}. For each message m ∈ M compute the a-posteriori probability<br />
Pr{M = m|R = 2}.<br />
(b) Note that an optimum receiver minimizes the probability of error P E . Give for each<br />
possible channel output the estimate ˆm that an optimum receiver will make. Determine<br />
the resulting error probability.<br />
(c) Consider a maximum-likelihood receiver. Give for each possible channel output the<br />
estimate ˆm that a maximum-likelihood receiver will make. Again determine the resulting<br />
probability of error.<br />
(Exam Communication Principles, July 5, 2002.)
3. Consider an optical communication system. A binary message is transmitted by switching the intensity λ of a laser "on" or "off" within the time interval [0, T]. For message m = 1 ("off") the intensity λ = λ_off ≥ 0. For message m = 2 ("on") the intensity λ = λ_on > λ_off.
The laser produces photons that are counted by a detector. The number K of photons is a<br />
non-negative random variable, and the corresponding probabilities are<br />
Pr{K = k} = ((λT)^k / k!) exp(−λT), for k = 0, 1, 2, · · · .
This probability distribution is known as the Poisson distribution.
The a-priori message probabilities are Pr{M = 1} = p off and Pr{M = 2} = p on . It is<br />
assumed that 0 < p on = 1 − p off < 1.<br />
The detector forms the message estimate ˆm based on the number k of photons that are<br />
counted.<br />
(a) Suppose that the detector chooses ˆm such that the probability of error P_E = Pr{ˆM ≠ M} is minimized. For what values of k does the detector choose ˆm = 1 and when
does it choose ˆm = 2?<br />
(b) Consider a maximum-likelihood detector. For what values of k does this detector<br />
choose ˆm = 1 and when does it choose ˆm = 2 now?<br />
(c) Assume that λ off = 0. Consider a maximum-a-posteriori detector. For what values<br />
of p on does the detector choose ˆm = 2 no matter what k is? What is in that case the<br />
error probability P E ?<br />
(Exam Communication Principles, July 8, 2003.)<br />
[Figure: binary symmetric channel — each input (0 or 1) is received unchanged with probability 1 − p and flipped with probability p.]
Figure 2.4: A binary symmetric channel with cross-over probability p.
4. Consider transmission over a binary symmetric channel with cross-over probability 0 < p < 1/2 (see figure 2.4). All messages m ∈ M are equally likely. Each message m now corresponds to a signal vector (codeword) s_m = (s_m1, s_m2, · · · , s_mN) consisting of N binary digits, hence s_mi ∈ {0, 1} for i = 1, 2, · · · , N. Such a codeword (sequence) is transmitted over the binary symmetric channel, and the resulting channel output sequence is denoted by r.
The Hamming-distance d H (x, y) between two sequences x and y, both consisting of N<br />
binary digits, is the number of positions at which they differ.
(a) Show that the optimum receiver should choose the codeword that has minimum Hamming<br />
distance to the received channel output sequence.<br />
(b) What is the optimum receiver for 1/2 < p < 1?<br />
(c) Assume that the messages are not equally likely and that the cross-over probability p = 1/2. What is the minimum error probability now? What does the corresponding
receiver do?<br />
(Exam Communication Principles, October 6, 2003.)<br />
5. The weather type in Paris is a random variable that is denoted by P. Similarly the weather<br />
type in Marseille is a random variable denoted by M. Both random variables can assume<br />
the values {s, c, r}. Here s stands for sunny, c for cloudy, and r for rainy. We assume that<br />
the weather type in Marseille M depends on the weather type in Paris P. The joint probability<br />
of a certain weather type in Paris together with a certain weather type in Marseille<br />
can be found in the table below.<br />
p Pr{P = p, M = s} Pr{P = p, M = c} Pr{P = p, M = r}<br />
s 0.25 0.05 0.00<br />
c 0.20 0.10 0.10<br />
r 0.05 0.15 0.10<br />
The table shows e.g. that the joint probability of cloudy weather in Paris and sunny weather<br />
in Marseille is 0.20.<br />
(a) Suppose that you have to predict a combination of weather types for Paris and Marseille<br />
such that your prediction is correct with a probability as large as possible. What<br />
combination do you choose and what is the probability that your prediction will be<br />
wrong?<br />
(b) Suppose that you have to guess the weather type in Marseille knowing the weather<br />
type in Paris. How do you decide for each of the three possible types of weather in<br />
Paris if you want to be correct with the highest possible probability? How large is the<br />
probability that your guess is wrong?<br />
(c) Suppose that you know the weather type in Marseille and that you must suggest two<br />
types of weather in Paris such that the probability that at least one of your suggestions<br />
is correct is maximal. What is your pair of suggestions for each of the three possible<br />
types of weather in Marseille? Compute the probability that both your suggestions<br />
are wrong also for this case.<br />
(Exam Communication Theory, January 12, 2005)
Chapter 3<br />
Decision rules for the real scalar channel<br />
SUMMARY: Here we will consider transmission of information over a channel with a single<br />
real-valued input and output. Again the optimum receiver is determined. As an example<br />
we investigate the additive Gaussian noise channel, i.e. the channel that adds Gaussian<br />
noise to the input signal.<br />
3.1 System description<br />
[Figure: block diagram — source →(m)→ transmitter →(s_m)→ real channel →(r)→ receiver →(ˆm)→ destination.]
Figure 3.1: Elements in a communication system based on a real scalar channel.
A communication system based on a channel with real-valued input and output alphabets, i.e. a real scalar channel, does not differ very much from a system based on a discrete channel (see figure 3.1).
SOURCE An information source produces the message m ∈ M = {1, 2, · · · , |M|} with a-priori probability Pr{M = m} for m ∈ M. Again M is the name of the random variable
associated with this mechanism.<br />
TRANSMITTER The transmitter now sends a real-valued scalar signal s m if message m is<br />
to be transmitted. The scalar signal is input to the channel. It assumes a value in the<br />
range (−∞, ∞). The random variable corresponding to the signal is denoted by S. The<br />
collection of used signals is s 1 , s 2 , · · · , s |M| .<br />
REAL SCALAR CHANNEL The channel now produces an output r in the range (−∞, ∞).<br />
When the input signal is the real-valued scalar s the channel output is generated according<br />
CHAPTER 3. DECISION RULES FOR THE REAL SCALAR CHANNEL 26<br />
to the conditional probability density function p R (r|S = s) and thus R is a real-valued<br />
scalar random variable. The probability of receiving a channel output r ≤ R < r + dr is<br />
equal to p R (r|S = s)dr for an infinitely small interval dr.<br />
When message m ∈ M occurs, signal s m is chosen by the transmitter as channel input.<br />
Then the conditional probability density function<br />
p R (r|M = m) = p R (r|S = s m ) for all r ∈ (−∞, ∞), (3.1)<br />
describes the behavior of the transmitter followed by the real scalar channel.<br />
RECEIVER The receiver forms an estimate ˆm of the transmitted message (or signal) based on<br />
the received real-valued scalar channel output r, hence ˆm = f (r). The mapping f (·) is<br />
called the decision rule.<br />
DESTINATION The destination accepts the estimate ˆm.<br />
3.2 MAP and ML decision rules<br />
For the real scalar channel we can write the probability of correct decision as
P_C = ∫_{−∞}^{∞} Pr{M = f(r)} p_R(r|M = f(r)) dr. (3.2)
An optimum decision rule is obtained if, after receiving the scalar R = r, the decision f (r) is<br />
taken in such a way that<br />
Pr{M = f (r)}p R (r|M = f (r)) ≥ Pr{M = m}p R (r|M = m) for all m ∈ M. (3.3)<br />
This leads to the definition of the decision variables given below.<br />
Alternatively we can assume that a value r ≤ R < r + dr was received. Then the decision<br />
variables would be<br />
Pr{M = m, r ≤ R < r + dr} = Pr{M = m} Pr{r ≤ R < r + dr|S = s m }<br />
= Pr{M = m}p R (r|S = s m )dr. (3.4)<br />
Also this reasoning leads to the definition of the decision variables below.<br />
Definition 3.1 For a system based on a real scalar channel the products<br />
Pr{M = m}p R (r|M = m) = Pr{M = m}p R (r|S = s m ) (3.5)<br />
which are, given the received output r, indexed by m ∈ M, are called the decision variables.<br />
The optimum decision rule f(·) is again based on these variables. It is possible that for certain channel outputs r more than one decision f(r) is optimum.
RESULT 3.1 (MAP) To minimize the error probability P E , the decision rule f (·) should be<br />
such that for each received r a message m is chosen with the largest decision variable. Hence<br />
for r that can be received, an optimum decision rule f (·) should satisfy<br />
f(r) = arg max_{m∈M} Pr{M = m} p_R(r|S = s_m). (3.6)
Both sides of the inequality (3.3) can be divided by p_R(r) = Σ_{m∈M} Pr{M = m} p_R(r|S = s_m), i.e. the density for the value of r that actually did occur. Then we obtain
f(r) = arg max_{m∈M} Pr{M = m|R = r}, (3.7)
for r for which p_R(r) > 0. This rule is again called the maximum a-posteriori (MAP) decision rule.
RESULT 3.2 (ML) When all messages have equal a-priori probabilities, i.e. when Pr{M = m} = 1/|M| for all m ∈ M, we observe from (3.6) that the optimum receiver has to choose
f(r) = arg max_{m∈M} p_R(r|S = s_m), (3.8)
for all r with p_R(r) > 0. This rule is referred to as the maximum-likelihood (ML) decision rule.
3.3 The scalar additive Gaussian noise channel<br />
3.3.1 Introduction<br />
Gaussian noise is probably the most important kind of impairment. Therefore we will investigate<br />
a simple communication situation based on a channel that adds a Gaussian noise sample n to the<br />
real scalar channel input signal s (see figure 3.2).<br />
[Figure: adder diagram — the input s and the noise n are summed to give the output r = s + n; the noise density is p_N(n) = (1/√(2πσ²)) exp(−n²/(2σ²)).]
Figure 3.2: Scalar additive Gaussian noise channel.
Definition 3.2 The scalar additive Gaussian noise (AGN) channel adds Gaussian noise N to<br />
the input signal S. This Gaussian noise N has variance σ 2 and mean 0. The probability density<br />
function of the noise is defined to be<br />
p_N(n) = (1/√(2πσ²)) exp(−n²/(2σ²)). (3.9)
The noise variable N is assumed to be independent of the signal S.
3.3.2 The MAP decision rule<br />
We assume e.g. that |M| = 2, i.e. there are two messages and M can be either 1 or 2. The<br />
corresponding two signals s 1 and s 2 are the inputs of our channel. Without loss of generality let<br />
s 1 > s 2 .<br />
The conditional probability density function of receiving R = r given the message m depends<br />
only on the value n that the noise variable N gets. In order to receive R = r when the signal is<br />
s m , the noise variable N should have value r − s m . The noise N is independent of the signal S<br />
(and the message M). Therefore (assuming that dr is infinitely small)<br />
p_R(r|S = s_m) = Pr{r ≤ R < r + dr|S = s_m}/dr
= Pr{r − s_m ≤ N < r − s_m + dr}/dr
= p_N(r − s_m)
= (1/√(2πσ²)) exp(−(r − s_m)²/(2σ²)), for m = 1, 2. (3.10)
We obtain an optimum (maximum a-posteriori) receiver when ˆm = 1 is chosen if<br />
Pr{M = 1} (1/√(2πσ²)) exp(−(r − s_1)²/(2σ²)) ≥ Pr{M = 2} (1/√(2πσ²)) exp(−(r − s_2)²/(2σ²)), (3.11)
and ˆm = 2 otherwise. The following inequalities are all equivalent to (3.11):<br />
ln Pr{M = 1} − (r − s_1)²/(2σ²) ≥ ln Pr{M = 2} − (r − s_2)²/(2σ²),
2σ² ln Pr{M = 1} − (r − s_1)² ≥ 2σ² ln Pr{M = 2} − (r − s_2)²,
2σ² ln Pr{M = 1} + 2rs_1 − s_1² ≥ 2σ² ln Pr{M = 2} + 2rs_2 − s_2²,
2rs_1 − 2rs_2 ≥ 2σ² ln(Pr{M = 2}/Pr{M = 1}) + s_1² − s_2². (3.12)
RESULT 3.3 (Optimum receiver for the scalar additive Gaussian noise channel) A receiver<br />
that decides ˆm = 1 if
r ≥ r* = (σ²/(s_1 − s_2)) ln(Pr{M = 2}/Pr{M = 1}) + (s_1 + s_2)/2, (3.13)
and ˆm = 2 otherwise, is optimum. When the a-priori probabilities Pr{M = 1} and Pr{M = 2} are equal the optimum threshold is
r* = (s_1 + s_2)/2. (3.14)
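The threshold (3.13) can be turned into a one-line function; the sketch below (Python, hypothetical function name) reproduces the thresholds for the parameters used in Example 3.1:

```python
import math

# Threshold r* from (3.13) for the scalar AGN channel.
def threshold(sigma2, s1, s2, p1, p2):
    return sigma2 / (s1 - s2) * math.log(p2 / p1) + (s1 + s2) / 2

# Parameters of Example 3.1: sigma^2 = 1, s1 = +1, s2 = -1.
print(threshold(1.0, 1.0, -1.0, 0.5, 0.5))    # 0.0 (equal priors, midpoint)
print(threshold(1.0, 1.0, -1.0, 0.75, 0.25))  # -(ln 3)/2, about -0.5493
```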
Example 3.1 Assume that σ² = 1, s_1 = +1 and s_2 = −1. If the a-priori message probabilities are equal, i.e. when Pr{M = 1} = Pr{M = 2}, we obtain an optimum receiver if we decide for ˆm = 1 only for r ≥ r* = (s_1 + s_2)/2. The decision changes exactly halfway between s_1 and s_2. The intervals (−∞, r*) and
Figure 3.3: Decision variables for equal a-priori probabilities as a function of r.<br />
Figure 3.4: Decision variables as a function of r for a-priori probabilities 1/4 and 3/4.<br />
(r ∗ , ∞) are called decision intervals. In our case, since s 2 = −s 1 , the threshold r ∗ = 0. This also can be<br />
seen from figure 3.3 where the decision variables<br />
Pr{M = 1} (1/√(2πσ²)) exp(−(r − s_1)²/(2σ²)) and Pr{M = 2} (1/√(2πσ²)) exp(−(r − s_2)²/(2σ²)) (3.15)
are plotted as a function of r assuming that Pr{M = 1} = Pr{M = 2} = 1/2.<br />
Next assume that the a-priori probabilities are not equal but let Pr{M = 1} = 3/4 and Pr{M = 2} = 1/4. Now the decision variables change (see figure 3.4) and we must also change the decision rule. It turns out that for r ≥ r* = −(ln 3)/2 ≈ −0.5493 the optimum receiver should choose ˆm = 1 (see again figure 3.4). The threshold r* has moved away from the more probable signal s_1.
Note that, no matter how much the a-priori probabilities differ, there is always a value of r, the threshold r*, for which (3.11) is satisfied with equality. For r > r* the left side of (3.11) is larger than the right side; for r < r* the right side is the larger.
3.3.3 The Q-function<br />
To compute the probability of error we introduce the so-called Q-function (see figure 3.5).<br />
Definition 3.3 (Q-function) This function of x ∈ (−∞, ∞) is defined as
Q(x) = ∫_{x}^{∞} (1/√(2π)) exp(−α²/2) dα. (3.16)
It is the probability that a Gaussian random variable with mean 0 and variance 1 assumes a value<br />
larger than x.<br />
Figure 3.5: Gaussian probability density function for mean 0 and variance 1. The shaded area<br />
corresponds to Q(1).<br />
A useful property of the Q-function is<br />
Q(x) + Q(−x) = 1. (3.17)<br />
In appendix A an upper bound for Q(x) is derived. In table 3.1 we tabulated Q(x) for several<br />
values of x.<br />
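The Q-function has no closed form, but it can be evaluated through the complementary error function as Q(x) = (1/2) erfc(x/√2). A minimal sketch in Python (the helper name `qfunc` is ours, not part of these notes):

```python
import math

def qfunc(x: float) -> float:
    """Tail probability Q(x) = Pr{ N(0,1) > x }, via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

# Reproduce a few entries of table 3.1.
print(qfunc(0.0))   # 0.5
print(qfunc(1.0))   # ~0.15866
print(qfunc(3.0))   # ~1.3499e-3

# Property (3.17): Q(x) + Q(-x) = 1.
assert abs(qfunc(1.7) + qfunc(-1.7) - 1.0) < 1e-12
```

The identity Q(x) = (1/2) erfc(x/√2) follows directly from definition (3.16) by the substitution α = √2·u.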
3.3.4 Probability of error<br />
We now want to find an expression for the error probability of our scalar system with two messages.<br />
We write<br />
P E = Pr{M = 1} Pr{R < r ∗ |M = 1} + Pr{M = 2} Pr{R ≥ r ∗ |M = 2}, (3.18)
x Q(x) x Q(x) x Q(x)<br />
0.00 0.50000 1.5 6.6681·10^−2 5.0 2.8665·10^−7<br />
0.25 0.40129 2.0 2.2750·10^−2 6.0 9.8659·10^−10<br />
0.50 0.30854 2.5 6.2097·10^−3 7.0 1.2798·10^−12<br />
0.75 0.22663 3.0 1.3499·10^−3 8.0 6.2210·10^−16<br />
1.00 0.15866 4.0 3.1671·10^−5 10.0 7.6200·10^−24<br />
Table 3.1: Table of Q(x) for some values of x.<br />
where<br />
Pr{R ≥ r^*|M = 2} = ∫_{r=r^*}^∞ (1/√(2πσ²)) exp(−(r − s_2)²/(2σ²)) dr<br />
= ∫_{r=r^*}^∞ (1/√(2π)) exp(−((r/σ) − s_2/σ)²/2) d(r/σ)<br />
= ∫_{μ=r^*/σ−s_2/σ}^∞ (1/√(2π)) exp(−μ²/2) dμ = Q(r^*/σ − s_2/σ), (3.19)<br />
and similarly<br />
Pr{R < r^*|M = 1} = ∫_{−∞}^{r^*} (1/√(2πσ²)) exp(−(r − s_1)²/(2σ²)) dr<br />
= ∫_{−∞}^{r^*} (1/√(2π)) exp(−((r/σ) − s_1/σ)²/2) d(r/σ)<br />
= ∫_{−∞}^{μ=r^*/σ−s_1/σ} (1/√(2π)) exp(−μ²/2) dμ<br />
= 1 − Q(r^*/σ − s_1/σ) (∗)<br />
= Q(s_1/σ − r^*/σ). (3.20)<br />
Note that the equality (∗) follows from (3.17).<br />
RESULT 3.4 (Minimum probability of error) If we combine (3.18), (3.19), and (3.20), we obtain<br />
P_E = Pr{M = 1} Q((s_1 − r^*)/σ) + Pr{M = 2} Q((r^* − s_2)/σ). (3.21)<br />
If the a-priori message probabilities Pr{M = 1} and Pr{M = 2} are equal then, according to<br />
(3.14), we get r^* = (s_1 + s_2)/2 and<br />
(s_1 − r^*)/σ = (s_1 − (s_1 + s_2)/2)/σ = (s_1 − s_2)/(2σ),<br />
(r^* − s_2)/σ = ((s_1 + s_2)/2 − s_2)/σ = (s_1 − s_2)/(2σ), (3.22)<br />
hence for the minimum probability of error we obtain<br />
P_E = Q((s_1 − s_2)/(2σ)). (3.23)<br />
Example 3.2 For s_1 = +1, s_2 = −1, and σ² = 1, we obtain if Pr{M = 1} = Pr{M = 2} that<br />
P_E = Q(1) ≈ 0.1587. (3.24)<br />
For Pr{M = 1} = 3/4 and Pr{M = 2} = 1/4 we get that r^* = −(ln 3)/2 ≈ −0.5493 and<br />
P_E = (3/4) Q(1 + (ln 3)/2) + (1/4) Q(1 − (ln 3)/2)<br />
≈ 0.75 Q(1.5493) + 0.25 Q(0.4507)<br />
≈ 0.75 · 0.0607 + 0.25 · 0.3261 ≈ 0.1270. (3.25)<br />
Note that this is smaller than Q(1) ≈ 0.1587, which would be the error probability after ML-detection<br />
(which is suboptimal here).<br />
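Example 3.2 can be checked numerically. The sketch below (Python, with a small `qfunc` helper of our own) computes the MAP threshold by equating the two decision variables and then evaluates (3.21):

```python
import math

def qfunc(x):
    """Q(x) = Pr{ N(0,1) > x }."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

s1, s2, sigma = 1.0, -1.0, 1.0
p1, p2 = 3 / 4, 1 / 4

# Solve p1*exp(-(r-s1)^2/(2s^2)) = p2*exp(-(r-s2)^2/(2s^2)) for r:
r_star = (s1 + s2) / 2 + sigma**2 * math.log(p2 / p1) / (s1 - s2)
print(r_star)   # -0.5493... = -(ln 3)/2

# Error probability (3.21).
PE = p1 * qfunc((s1 - r_star) / sigma) + p2 * qfunc((r_star - s2) / sigma)
print(PE)       # ~0.1270
```

For equal priors the log-term vanishes and the threshold falls back to the midpoint (s_1 + s_2)/2, recovering (3.23).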
3.4 Exercises<br />
Figure 3.6: Conditional probability density functions.<br />
1. A communication system is used to transmit one of two equally likely messages, 1 and 2.<br />
The channel output is a real-valued random variable R, the conditional density functions of<br />
which are shown in figure 3.6. Determine the optimum receiver decision rule and compute<br />
the resulting probability of error.<br />
(Exercise 2.23 from Wozencraft and Jacobs [25].)<br />
2. The noise n in figure 3.7a is Gaussian with zero mean, i.e. E[N] = 0. If one of two<br />
equally likely messages is transmitted, using the signals of 3.7b, an optimum receiver<br />
yields P E = 0.01.<br />
(a) What is the minimum attainable probability of error P_E^min when the channel of 3.7a<br />
is used with three equally likely messages and the signals of 3.7c? And with four<br />
equally likely messages and the signals of 3.7d?<br />
(b) How do the answers to the previous questions change if it is known that E[N] = 1<br />
instead of 0?
Figure 3.7: (a) the channel r = s + n; (b) two signals at −2 and +2; (c) three signals at −4, 0, +4; (d) four signals at −4, 0, +4, +8.<br />
(Exercise 4.1 from Wozencraft and Jacobs [25].)<br />
3. Show that p(n) = (1/√(2π)) exp(−n²/2) for −∞ < n < ∞ is a probability density function<br />
(non-negative, integrating to 1).<br />
4. Consider a communication channel with an input s that can only assume positive values.<br />
The probability density function of the output R when the input is s is given by<br />
p_R(r|S = s) = (s − |r|)/s² for 0 ≤ |r| ≤ s, and 0 for |r| > s,<br />
see figure 3.8.<br />
Figure 3.8: The probability density function p_R(r|S = s): a triangle with peak 1/s at r = 0 and support [−s, s].<br />
There are two messages i.e. M = {1, 2} and the corresponding signals are s 1 = 1/2 and<br />
s 2 = 2. Let the a-priori probabilities Pr{M = 1} = Pr{M = 2} = 1/2.<br />
(a) Sketch the decision variables as a function of the output r in a single figure. For what<br />
values of r does an optimum receiver decide ˆM = 1?
(b) Determine the corresponding error probability P E .<br />
(c) If the a-priori probability Pr{M = 1} is small enough, an optimum receiver will<br />
choose ˆM = 2 for all r. What is the largest value of Pr{M = 1} for which this<br />
happens? What is the error probability for this value of Pr{M = 1}?<br />
(Exam Communication Theory, November 18, 2003)<br />
5. Consider a communication channel with an input s that can only assume positive values.<br />
The probability density function of the output R when the input is s is given by<br />
p_R(r|S = s) = (1/√(2πs²)) exp(−r²/(2s²)).<br />
Note that conditionally on the input s, this R is Gaussian, with mean zero and variance s 2 .<br />
In figure 3.9 this density is plotted for s = 1.<br />
Figure 3.9: The probability density function p R (r|S = 1).<br />
There are two messages i.e. M = {1, 2} and the corresponding signals are s 1 = 1 and<br />
s 2 = √ e. Let the a-priori probabilities Pr{M = 1} = Pr{M = 2} = 1/2.<br />
(a) Sketch both probability density functions p_R(r|S = s_1) and p_R(r|S = s_2) in a single<br />
figure. For what values of r does an optimum receiver decide ˆM = 1?
(b) Determine the corresponding error probability P E . Express it in terms of the Q(·)<br />
function.<br />
(c) If the a-priori probability Pr{M = 1} is small enough, an optimum receiver will<br />
choose ˆM = 2 for all r. What is the largest value of Pr{M = 1} for which this<br />
happens? What is the error probability for this value of Pr{M = 1}?<br />
(Exam Communication Theory, January 23, 2004)
Chapter 4<br />
Real vector channels<br />
SUMMARY: In this chapter we discuss the channel whose input and output are vectors of<br />
real-valued components. An important example of such a real vector channel is the additive<br />
Gaussian noise (AGN) vector channel. This channel adds zero-mean Gaussian noise to each<br />
input signal component. All noise samples are assumed to be independent of each other and<br />
have the same variance. For the AGN vector channel we determine the optimum receiver.<br />
If all messages are equally likely this receiver chooses the signal vector that is closest to the<br />
received vector in Euclidean sense. We also investigate the error behavior of this channel.<br />
The last part of this chapter deals with the theorem of reversibility.<br />
4.1 System description<br />
Figure 4.1: A communication system based on a real vector channel.<br />
In a communication system based on a real vector channel (see figure 4.1) the channel input<br />
and output are vectors with real-valued components.<br />
SOURCE The information source generates the message m ∈ M = {1, 2, · · · , |M|} with a-<br />
priori probability Pr{M = m} for m ∈ M. M is the random variable associated with this<br />
mechanism.<br />
TRANSMITTER The transmitter sends a vector-signal s m = (s m1 , s m2 , · · · , s m N ) consisting<br />
of N components if message m is to be transmitted. Each component assumes a value in<br />
the range (−∞, ∞). The random variable corresponding to the signal vector is denoted<br />
by S. The vector-signal is input to the channel. The collection of used signal vectors is<br />
s 1 , s 2 , · · · , s |M| .<br />
REAL VECTOR CHANNEL The channel produces an output vector r = (r 1 , r 2 , · · · , r N )<br />
consisting of N components. We assume here that all these components assume values<br />
from (−∞, ∞). The conditional probability density function of the channel output vector<br />
r when the input vector s is transmitted is p R (r|S = s). Thus, for S = s, the probability<br />
of receiving an N-dimensional channel output vector R with components R n for n = 1, N<br />
satisfying r n ≤ R n < r n + dr n is equal to p R (r|S = s)dr for an infinitely small dr =<br />
dr 1 dr 2 · · · dr N and r = (r 1 , r 2 , · · · , r N ).<br />
When message m ∈ M is sent, signal s m is chosen by the transmitter as channel input<br />
vector. Then the conditional probability density function<br />
p R (r|M = m) = p R (r|S = s m ) for all r ∈ (−∞, ∞) N , (4.1)<br />
describes the behavior of the cascade of the vector transmitter and the vector channel.<br />
RECEIVER The receiver forms ˆm based on the received real-valued channel output vector r,<br />
hence ˆm = f (r). Mapping f (·) is called the decision rule.<br />
DESTINATION The destination accepts ˆm.<br />
4.2 Decision variables, MAP decoding<br />
The optimum receiver upon receiving r determines which one of the possible messages has<br />
maximum a-posteriori probability. It therefore chooses the decision rule f (r) such that for all r<br />
that actually can be received<br />
Pr{M = f (r)}p R (r|M = f (r)) ≥ Pr{M = m}p R (r|M = m), for all m ∈ M. (4.2)<br />
In other words, upon receiving r, the optimum receiver produces an estimate ˆm that corresponds<br />
to a largest decision variable. Given r, the decision variables for the real vector channel are<br />
Pr{M = m}p R (r|M = m) = Pr{M = m}p R (r|S = s m ) for all m ∈ M. (4.3)<br />
That this is MAP-decoding follows from the Bayes rule, which says that<br />
Pr{M = m|R = r} = Pr{M = m} p_R(r|M = m) / p_R(r) (4.4)<br />
for r with p_R(r) = ∑_{m∈M} Pr{M = m} p_R(r|M = m) > 0. Note that p_R(r) > 0 for an output<br />
vector r that has actually been received.
4.3 Decision regions<br />
The optimum receiver, upon receiving r, determines the maximum of all decision variables which<br />
are given in (4.3). To compute these decision variables the a-priori probabilities Pr{M = m}<br />
must be known to the receiver and also the conditional density functions p R (r|S = s m ) for all<br />
m ∈ M. This calculation can be carried out for every r in the observation space, i.e. the set of<br />
all possible output vectors r. Each point r in the observation space is therefore assigned to one of<br />
the possible messages m ∈ M, the message that is actually chosen by the receiver. This results<br />
in a partitioning of the observation space in at most |M| regions.<br />
Definition 4.1 Given the decision rule f(·) we can write<br />
I_m = {r | f(r) = m}, (4.5)<br />
where I_m is called the decision region that corresponds to message m ∈ M.<br />
Note that in the example in subsection 3.3.2 of the previous chapter we considered decision<br />
intervals. A decision region is an N-dimensional generalization of a (one-dimensional) decision<br />
interval.<br />
Figure 4.2: Three two-dimensional signal vectors and their decision regions.<br />
Example 4.1 Suppose (see figure 4.2) that we have three messages, i.e. M = {1, 2, 3}, and the corresponding<br />
signal-vectors are two-dimensional, i.e. s_1 = (1, 2), s_2 = (2, 1), and s_3 = (1, −3). A possible<br />
partitioning of the observation space into the three decision regions I_1, I_2, and I_3 is shown in the figure.
4.4 The additive Gaussian noise vector channel<br />
4.4.1 Introduction<br />
The actual shape of the decision regions is determined by the a-priori message probabilities<br />
Pr{M = m}, the signals s m , and the conditional density functions p R (r|S = s m ) for all m ∈<br />
M. A relatively simple but again quite important situation is the case where the channel adds<br />
Gaussian noise to the signal components (see figure 4.3).<br />
Figure 4.3: Additive Gaussian noise vector channel: r = s + n with p_N(n) = (1/(2πσ²)^{N/2}) exp(−‖n‖²/(2σ²)).<br />
Definition 4.2 For the output of the additive Gaussian noise (AGN) vector channel we have<br />
that<br />
r = s + n, (4.6)<br />
where n = (n 1 , n 2 , · · · , n N ) is an N-dimensional noise vector. We denote this random noise<br />
vector by N. The noise vector N is assumed to be independent of the signal vector S. Moreover<br />
the N components of the noise vector are assumed to be independent of each other. All noise<br />
components have mean 0 and the same variance σ 2 . Therefore the joint probability density<br />
function of the noise vector is given by<br />
p_N(n) = ∏_{i=1,N} (1/√(2πσ²)) exp(−n_i²/(2σ²)) = (1/(2πσ²)^{N/2}) exp(−(1/(2σ²)) ∑_{i=1,N} n_i²). (4.7)<br />
The notation in the definition can be contracted by noting that<br />
∑_{i=1,N} n_i² = (n · n) = ‖n‖², (4.8)<br />
where (a · b) = ∑_{i=1,N} a_i b_i is the dot (inner) product of the vectors a = (a_1, a_2, · · · , a_N) and<br />
b = (b_1, b_2, · · · , b_N). Thus<br />
p_N(n) = (1/(2πσ²)^{N/2}) exp(−‖n‖²/(2σ²)). (4.9)<br />
4.4.2 Minimum Euclidean distance decision rule<br />
If the channel output is r and the channel input was s m then the noise vector must have been<br />
n = r − s m . This and the independence of the noise vector and the signal vector yields that<br />
p R (r|M = m) = p R (r|S = s m ) = p N (r − s m |S = s m ) = p N (r − s m ), (4.10)<br />
similar to (3.10) in the previous chapter. Now we can easily determine the decision variables,<br />
one for each m ∈ M:<br />
Pr{M = m} p_R(r|M = m) = Pr{M = m} p_N(r − s_m) = Pr{M = m} (1/(2πσ²)^{N/2}) exp(−‖r − s_m‖²/(2σ²)). (4.11)<br />
Note that the factor (2πσ²)^{N/2} is independent of m. Hence maximizing over the decision variables<br />
in (4.11) is equivalent to minimizing<br />
‖r − s_m‖² − 2σ² ln Pr{M = m} (4.12)<br />
over all m ∈ M. We obtain a very simple decision rule if all messages are equally likely.<br />
RESULT 4.1 (Minimum Euclidean distance decision rule) If the a-priori message probabilities<br />
are all equal, the optimum receiver has to minimize the squared Euclidean distance<br />
‖r − s_m‖², over all m ∈ M. (4.13)<br />
In other words the receiver has to choose the message ˆm with corresponding signal vector s ˆm<br />
that is closest in Euclidean sense to the received vector r. Note that this result holds for the<br />
additive Gaussian noise vector channel.<br />
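Both the general rule (4.12) and its equal-prior special case (4.13) translate directly into code. A sketch in Python/NumPy (function and variable names are ours), using the signal set of Example 4.1:

```python
import numpy as np

def map_decide(r, signals, priors, sigma2):
    """MAP rule (4.12) for the AGN vector channel:
    choose the m that minimizes ||r - s_m||^2 - 2*sigma^2*ln Pr{M=m}."""
    metrics = [np.sum((r - s) ** 2) - 2.0 * sigma2 * np.log(p)
               for s, p in zip(signals, priors)]
    return int(np.argmin(metrics)) + 1   # messages are numbered 1..|M|

signals = [np.array([1.0, 2.0]), np.array([2.0, 1.0]), np.array([1.0, -3.0])]
priors = [1 / 3, 1 / 3, 1 / 3]   # equal priors: reduces to minimum Euclidean distance

print(map_decide(np.array([1.5, 1.6]), signals, priors, sigma2=1.0))  # 1
```

With equal priors the term −2σ² ln Pr{M = m} is the same for every m, so the rule indeed degenerates to picking the signal vector closest to r in Euclidean sense.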
Example 4.2 Consider again (see figure 4.4) three two-dimensional signal vectors s_1 = (1, 2), s_2 = (2, 1), and s_3 = (1, −3).<br />
The optimum decision regions can be found by drawing the perpendicular bisectors of the sides of the<br />
signal triangle. These are the boundaries of the decision regions I 1 , I 2 , and I 3 (see figure). Note that the<br />
three bisectors have a single point in common.<br />
4.4.3 Error probabilities<br />
The error probability of an additive Gaussian noise vector channel is determined by the location<br />
of the signals s_1, s_2, · · · , s_|M| and, more importantly, by the hyperplanes that separate these<br />
signal vectors. An error occurs if the noise “pushes” a signal point to the wrong side of a hyperplane. In<br />
general it is quite difficult to determine the exact error probability; however, it is easy to compute<br />
the probability that the received signal is on the wrong side of a hyperplane.<br />
To study this behavior we investigate a simple example in two dimensions, i.e. N = 2.<br />
Consider a signal vector s = (a, b), see figure 4.5. This signal vector is corrupted by the additive
Figure 4.4: Three two-dimensional signal vectors and the corresponding optimum decision regions<br />
for the additive Gaussian noise channel.<br />
Figure 4.5: A signal point and a line.
Gaussian noise vector n = (n 1 , n 2 ) with independent components that both have mean zero and<br />
variance σ 2 . What is now the probability P I that the received vector r = (r 1 , r 2 ) = (a, b) +<br />
(n 1 , n 2 ) is in region I, the region above the line, see the figure?<br />
To solve this problem we have to change the coordinate system. Let<br />
r_1 = a + u_1 cos θ − u_2 sin θ, (4.14)<br />
r_2 = b + u_1 sin θ + u_2 cos θ, (4.15)<br />
where (a, b) is the center of a Cartesian system with coordinates u_1 and u_2, and θ is the rotation<br />
angle. Coordinate u_1 is perpendicular to the line in the figure and coordinate u_2 is parallel to it.<br />
Note that<br />
P_I = ∫_I (1/(2πσ²)) exp(−((r_1 − a)² + (r_2 − b)²)/(2σ²)) dr_1 dr_2. (4.16)<br />
Elementary calculus (see e.g. [26], p. 386, theorem 7.7.13) tells us that<br />
∫_I f(r_1, r_2) dr_1 dr_2 = ∫_I f(r_1(u_1, u_2), r_2(u_1, u_2)) |J| du_1 du_2, (4.17)<br />
where |J| is the determinant of the Jacobian J, which is<br />
J = ( ∂r_1/∂u_1 ∂r_2/∂u_1 ; ∂r_1/∂u_2 ∂r_2/∂u_2 ) = ( cos θ sin θ ; −sin θ cos θ ). (4.18)<br />
Note that |J| = cos²θ + sin²θ = 1 and (r_1 − a)² + (r_2 − b)² = u_1² + u_2². Let Δ denote the<br />
distance from (a, b) to the line; in the new coordinates the region I is described by u_1 ≥ Δ. Therefore<br />
P_I = ∫_I (1/(2πσ²)) exp(−(u_1² + u_2²)/(2σ²)) du_1 du_2<br />
= ∫_{u_1=Δ}^∞ ∫_{u_2=−∞}^∞ (1/(2πσ²)) exp(−(u_1² + u_2²)/(2σ²)) du_1 du_2<br />
= ∫_{u_1=Δ}^∞ (1/√(2πσ²)) exp(−u_1²/(2σ²)) du_1 · ∫_{u_2=−∞}^∞ (1/√(2πσ²)) exp(−u_2²/(2σ²)) du_2<br />
= Q(Δ/σ) · 1 = Q(Δ/σ). (4.19)<br />
The conclusion is that the probability depends only on the distance Δ from the signal point (a, b) to<br />
the line. This result carries over to more than two dimensions:<br />
RESULT 4.2 For the additive Gaussian noise vector channel, the probability that the noise<br />
pushes a signal to the wrong side of a hyperplane is<br />
P_I = Q(Δ/σ), (4.20)<br />
where Δ is the distance from the signal-point to the hyperplane and σ² is the variance of each<br />
noise component. All noise components are assumed to be zero-mean.
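Result 4.2 can be checked by simulation. The sketch below (Python/NumPy, our own naming) places a signal point at the origin, draws two-dimensional Gaussian noise, and counts how often the noisy point lands beyond a line at distance Δ with an arbitrary orientation; the fraction should match Q(Δ/σ) regardless of the angle:

```python
import numpy as np
from math import erfc, sqrt

rng = np.random.default_rng(1)
sigma, delta, phi = 1.0, 1.5, 0.7   # noise std, distance to line, line orientation

# Signal at the origin; hyperplane {x : x . u = delta} with unit normal u.
u = np.array([np.cos(phi), np.sin(phi)])
n = rng.normal(0.0, sigma, size=(200000, 2))   # AGN noise samples
p_hat = np.mean(n @ u >= delta)                # fraction pushed across the line

q = 0.5 * erfc(delta / (sigma * sqrt(2.0)))    # Q(delta/sigma) = Q(1.5) ~ 0.0668
print(p_hat, q)
```

Because u has unit norm, the projection n·u is itself Gaussian with standard deviation σ, which is exactly the rotation-invariance argument made with (4.14)-(4.19).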
4.5 Theorem of reversibility<br />
It is interesting to investigate what kind of processing of the output of a channel is allowed, in the<br />
sense that it does not degrade the performance. An important result corresponding to this issue<br />
is stated below.<br />
THEOREM 4.3 (Theorem of reversibility) The minimum attainable probability of error is not<br />
affected by the introduction of a reversible operation at the output of a channel, see figure 4.6.<br />
Figure 4.6: If G is reversible then an optimum decision can also be made from r ′ = G(r).<br />
Proof: Consider the receiver for r ′ that is depicted in figure 4.7. This receiver first recovers<br />
r = G −1 (r ′ ). This is possible since the mapping G from r to r ′ is reversible. Then an optimum<br />
receiver for r is used to determine ˆm. The receiver constructed in this way for r ′ yields the same<br />
error probability as an optimum receiver that observes r directly, thus a reversible operation does<br />
not (necessarily) lead to an increase of P E .<br />
✷<br />
Figure 4.7: An optimum receiver for r′: first G⁻¹ recovers r, then an optimum receiver for r produces ˆm.<br />
4.6 Exercises<br />
1. One of four equally likely messages is to be communicated over a vector channel which<br />
adds a (different) statistically independent zero-mean Gaussian random variable with variance<br />
N 0 /2 to each transmitted vector component. Assume that the transmitter uses the<br />
signal vectors shown in figure 4.8 and express the P E produced by an optimum receiver in<br />
terms of the function Q(x).<br />
(Exercise 4.2 from Wozencraft and Jacobs [25].)<br />
2. In the communication system diagrammed in figure 4.9 the transmitted signal s m and the<br />
noises n 1 and n 2 are all random voltages and all statistically independent. Assume that
Figure 4.8: Signal structure (four signal vectors s_1, s_2, s_3, s_4 with scale √(E_s/2); the constellation itself did not survive the extraction).<br />
|M| = 2, i.e. M = {1, 2}, and that<br />
Pr{M = 1} = Pr{M = 2} = 1/2,<br />
s_1 = √(2E_s),<br />
s_2 = 0,<br />
p_{N_1}(n) = p_{N_2}(n) = (1/√(2πσ²)) exp(−n²/(2σ²)). (4.21)<br />
Figure 4.9: Communication system: the transmitted signal s_m is observed through two noisy branches, r_1 = s_m + n_1 and r_2 = s_m + n_2, both feeding the optimum receiver.<br />
Figure 4.10: Detector structure: the sum r_1 + r_2 is applied to a threshold device that outputs ˆm.<br />
(a) Show that the optimum receiver can be realized as diagrammed in figure 4.10. Determine<br />
the optimum threshold setting and the value ˆm for r 1 + r 2 larger than the<br />
threshold.
Figure 4.11: Communication system: r_1 = s_m + n_1 and r_2 = n_1 + n_2 both feed the optimum receiver.<br />
Figure 4.12: Detector structure: r_1 + a·r_2 is applied to a threshold device that outputs ˆm.<br />
(b) Express the resulting P E in terms of Q(·).<br />
(c) By what factor would E_s have to be increased to yield this same probability of error<br />
if the receiver were restricted to observing only r_1?<br />
(Exam Communication Principles, July 4, 2001.)<br />
3. In the communication system diagrammed in figure 4.11 the transmitted signal s and the<br />
noises n 1 and n 2 are all random voltages and all statistically independent. Assume that<br />
|M| = 2 i.e. M = {1, 2} and that<br />
Pr{M = 1} = Pr{M = 2} = 1/2,<br />
s_1 = −s_2 = √E_s,<br />
p_{N_1}(n) = p_{N_2}(n) = (1/√(2πσ²)) exp(−n²/(2σ²)). (4.22)<br />
(a) Show that the optimum receiver can be realized as diagrammed in figure 4.12 where<br />
a is an appropriately chosen constant.<br />
(b) What is the optimum value for a?<br />
(c) What is the optimum threshold setting?<br />
(d) Express the resulting P E in terms of Q(x).
(e) By what factor would E_s have to be increased to yield this same probability of error<br />
if the receiver were restricted to observing only r_1?<br />
(Exercise 4.6 from Wozencraft and Jacobs [25].)<br />
4. Consider a set of signal vectors {s 1 , s 2 , · · · , s |M| }. Result 4.1 says that if the a-priori<br />
probabilities of the corresponding messages are equal, the minimum Euclidean distance<br />
decision rule is optimum if the channel is an additive Gaussian noise vector channel (definition<br />
4.2). In this case the decision regions can be described by a set of hyperplanes (one<br />
for each message pair). Show that this is also the case if the a-priori probabilities are not<br />
all equal.
Chapter 5<br />
Waveform channels, correlation receiver<br />
SUMMARY: All topics discussed so far are necessary to analyze the additive white Gaussian<br />
noise waveform channel. In a waveform channel continuous-time waveforms are transmitted<br />
over a channel that adds continuous-time white Gaussian noise to them. We show in<br />
this chapter that an optimum receiver for such a waveform channel correlates the received<br />
waveform with all possible transmitted waveforms and makes a decision based on all these<br />
correlations. Together these correlations form a sufficient statistic.<br />
5.1 System description, additive white Gaussian noise<br />
In a communication system based on an additive white Gaussian noise waveform channel (see<br />
figure 5.1) the channel input and output signals are waveforms.<br />
TRANSMITTER The transmitter chooses the waveform s m (t) for 0 ≤ t < T , if message m is<br />
to be transmitted. The set of used waveforms is therefore {s 1 (t), s 2 (t), · · · , s |M| (t)}. The<br />
chosen waveform is input to the channel. We assume here that all these waveforms have<br />
finite energy, i.e.<br />
E_{s_m} = ∫_0^T s_m²(t) dt < ∞ for all m ∈ M, (5.1)<br />
where E_{s_m} is the energy of waveform s_m(t). To convince yourself that this is a reasonable<br />
definition, assume that s_m(t) is the voltage across (or the current through) a resistor of 1 Ω.<br />
The total energy that is dissipated in this resistor is then equal to E_{s_m}.<br />
Figure 5.1: Communication over an additive white Gaussian noise waveform channel.<br />
WAVEFORM CHANNEL The channel accepts the input waveform s m (t) and adds white Gaussian<br />
noise n w (t) to it, i.e. for the received waveform r(t) we have that<br />
r(t) = s m (t) + n w (t). (5.2)<br />
The noise n_w(t) is supposed to be generated by the random process N_w(t) which is zero-mean<br />
(i.e. E[N_w(t)] = 0 for all t), stationary, white, and Gaussian 1 . The autocorrelation<br />
function of the noise is<br />
R_{N_w}(t, s) = E[N_w(t) N_w(s)] = (N_0/2) δ(t − s), (5.3)<br />
and depends only on the time difference t − s (by the stationarity). The noise has power<br />
spectral density<br />
S_{N_w}(f) = N_0/2 for −∞ < f < ∞, (5.4)<br />
i.e. it is white, see appendix D. Note that f is the frequency in Hz.<br />
RECEIVER The receiver forms the message estimate ˆm based on the observed waveform<br />
r(t), 0 ≤ t < T . The interval [0, T ) is called the observation interval.<br />
The transmission of messages over a waveform channel is what we actually want to study in<br />
the first part of these course notes. It is a model for many digital communication systems (e.g.<br />
a telephone line including the modems on both sides, wireless communication links, recording<br />
channels). In the next sections we will show that the optimum receiver should correlate the<br />
received signal r(t) with all possible transmitted waveforms s m (t), m ∈ M to form a good<br />
message estimate. In a second chapter on waveform communication we will study alternative<br />
implementations of optimum receivers.<br />
GAUSSIAN NOISE is always and everywhere present. It originates in the thermal fluctuations of all<br />
matter in the universe. These fluctuations create randomly varying electromagnetic fields that will produce a<br />
fluctuating voltage across the terminals of an antenna. The random movements of electrons in<br />
e.g. resistors will create random voltages in the electronics of the receiver. Since these voltages<br />
are the sum of many minuscule effects, by the central-limit theorem, they are Gaussian.<br />
5.2 Waveform expansions in series of orthonormal functions<br />
To determine the optimum receiver some new techniques have to be introduced. It is our aim<br />
to transform the waveform channel into vector channel first and then use our knowledge of the<br />
vector channel to determine an optimum receiver. Therefore we first expand the noise process<br />
N w (t) and the signals s m (t) in terms of a countable set of orthonormal functions (waveforms).<br />
Consider a set of functions { f_i(t), i = 1, 2, · · · } that are orthonormal over the interval [0, T)<br />
in the sense that<br />
∫_0^T f_i(t) f_j(t) dt = 1 if i = j, and 0 if i ≠ j. (5.5)<br />
1 One can argue that white noise is always Gaussian, see Helstrom [9] p.36.
Figure 5.2: Example of orthonormal functions: four time-translated pulses.<br />
Some well-known sets of orthonormal functions are given in the example that follows.<br />
Example 5.1 In figure 5.2 four functions are shown that are time-translated orthonormal pulses. Note<br />
that the energy in a pulse, i.e. ∫_0^T f_i²(t) dt, equals 1 for i = 1, . . . , 4. In figure 5.3 five functions are shown on<br />
a one-second time interval: a pulse with amplitude 1 and four sine and cosine waves whose amplitude is<br />
√2. All these functions are again mutually orthogonal and have unit energy. Note that functions like these<br />
form the first building blocks that are used in Fourier series.<br />
Now assume² that all outcomes n_w(t) of the random process N_w(t) can be represented in<br />
terms of the orthonormal functions f_1(t), f_2(t), · · · as<br />
n_w(t) = Σ_{i=1}^∞ n_i f_i(t). (5.6)<br />
Then the coefficients n_i, i = 1, 2, · · · , will satisfy<br />
n_i = ∫_0^T n_w(t) f_i(t) dt. (5.7)<br />
To see why, substitute (5.6) in the right side of (5.7) and integrate, making use of (5.5). Next<br />
assume that also all signals can be expressed in terms of the same set of orthonormal functions,<br />
i.e.<br />
s_m(t) = Σ_{i=1}^∞ s_mi f_i(t), for m ∈ {1, 2, · · · , |M|}, (5.8)<br />
2 In this discussion we will focus only on the main concepts and we do not bother very much about mathematical<br />
details such as the existence of such a “complete set” of orthonormal functions, extra conditions on the random<br />
process N_w(t) and on the waveforms s_m(t), the convergence of series, the interchanging of orders of summation and<br />
integration, etc. For a more precise treatment we refer e.g. to Gallager [7], Helstrom [9], or Lee and Messerschmitt<br />
[13].<br />
[Figure 5.3 shows five panels on [0, 1]: the constant 1, √2 cos(2πt), √2 cos(4πt), √2 sin(2πt), and √2 sin(4πt).]<br />
Figure 5.3: Example of orthonormal functions: first five functions in a Fourier series.<br />
where the coefficients s_mi, m ∈ M, i = 1, 2, · · · , are given by<br />
s_mi = ∫_0^T s_m(t) f_i(t) dt. (5.9)<br />
Then also the received signal r(t) = s_m(t) + n_w(t) can be expanded as<br />
r(t) = Σ_{i=1}^∞ r_i f_i(t), (5.10)<br />
with coefficients r_i, i = 1, 2, · · · , that satisfy<br />
r_i = ∫_0^T r(t) f_i(t) dt<br />
    = ∫_0^T s_m(t) f_i(t) dt + ∫_0^T n_w(t) f_i(t) dt = s_mi + n_i. (5.11)<br />
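The projection formulas (5.7) and (5.9) can be illustrated numerically. The sketch below (ours, not from the notes) synthesizes a waveform from known coefficients in the translated-pulse basis of figure 5.2 and recovers the coefficients by correlating with the basis functions.

```python
import numpy as np

# Hypothetical sketch (ours, not from the notes) of the projection formulas
# (5.7)/(5.9): basis of four time-translated unit-energy pulses as in
# figure 5.2, T = 4, f_i(t) = 1 on [i-1, i) and zero elsewhere.
T, n = 4.0, 4000
dt = T / n
t = (np.arange(n) + 0.5) * dt
basis = np.stack([((t >= i) & (t < i + 1)).astype(float) for i in range(4)])

coeffs = np.array([0.5, -1.0, 2.0, 0.25])     # some coefficients s_mi
s = coeffs @ basis                            # s_m(t) = sum_i s_mi f_i(t)
recovered = basis @ s * dt                    # s_mi = int_0^T s_m(t) f_i(t) dt
print(recovered)                              # ~ [0.5, -1.0, 2.0, 0.25]
```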
5.3 An equivalent vector channel<br />
An optimum receiver can now make a decision based on the vector r^∞ = (r_1, r_2, r_3, · · · ) instead<br />
of the waveform r(t). By the theorem of reversibility (Theorem 4.3) this does not affect the<br />
smallest achievable error probability P_E.<br />
Similarly we can assume that instead of the signal s_m(t) the vector s_m^∞ = (s_m1, s_m2, s_m3, · · · )<br />
is transmitted in order to convey the message m ∈ M; the reason for this is that s_m^∞ and s_m(t)<br />
are equivalent.<br />
All this implies that we have transformed our waveform channel into a vector channel (with<br />
an infinite number of dimensions however). Since<br />
r^∞ = s_m^∞ + n^∞, (5.12)<br />
with n^∞ = (n_1, n_2, n_3, · · · ), the resulting vector channel adds noise to the input vector. What<br />
can we say now about the noise components N_i, i = 1, 2, · · · ?<br />
Note first that each component results from linear operations on the Gaussian process N w (t).<br />
Therefore the noise components are jointly Gaussian.<br />
For the mean of component N_i for i = 1, 2, · · · , we find that<br />
E[N_i] = E[∫_0^T N_w(t) f_i(t) dt] = ∫_0^T E[N_w(t)] f_i(t) dt = 0. (5.13)<br />
For the correlation of component N_i, i = 1, 2, · · · , and component N_j, j = 1, 2, · · · , we obtain<br />
E[N_i N_j] = E[∫_0^T ∫_0^T N_w(α)N_w(β) f_i(α) f_j(β) dα dβ]<br />
           = ∫_0^T ∫_0^T E[N_w(α)N_w(β)] f_i(α) f_j(β) dα dβ<br />
           = ∫_0^T ∫_0^T (N_0/2) δ(α − β) f_i(α) f_j(β) dα dβ<br />
           = ∫_0^T (N_0/2) f_i(α) f_j(α) dα = (N_0/2) δ_ij, (5.14)<br />
and thus the noise variables all have variance N_0/2 and are uncorrelated; since they are jointly<br />
Gaussian they are therefore also independent of each other. In our derivation we used the Kronecker<br />
delta function, which is defined as<br />
δ_ij = { 1 if i = j, 0 if i ≠ j. (5.15)<br />
As a consequence of all this our vector channel is an additive Gaussian noise vector channel<br />
as described in definition 4.2.<br />
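The statistics derived in (5.13) and (5.14) can be checked by simulation. The sketch below (ours, not from the notes) approximates white Gaussian noise of PSD N_0/2 by i.i.d. samples with variance N_0/(2·dt), projects it on two orthonormal functions, and verifies that the coefficients have variance N_0/2 and are uncorrelated.

```python
import numpy as np

# Hypothetical simulation (ours, not from the notes): discrete-time stand-in
# for white Gaussian noise of PSD N0/2 (i.i.d. samples, variance N0/(2*dt)),
# projected on two orthonormal functions; per (5.13)-(5.14) the projection
# coefficients N_i should have mean 0, variance N0/2, and be uncorrelated.
rng = np.random.default_rng(0)
N0, T, n = 2.0, 1.0, 200
dt = T / n
t = (np.arange(n) + 0.5) * dt
f1 = np.sqrt(2) * np.cos(2 * np.pi * t)   # orthonormal pair on [0, T), T = 1
f2 = np.sqrt(2) * np.sin(2 * np.pi * t)

trials = 20_000
nw = rng.normal(0.0, np.sqrt(N0 / (2 * dt)), size=(trials, n))
n1 = nw @ f1 * dt                          # n_i = int n_w(t) f_i(t) dt
n2 = nw @ f2 * dt
print(np.var(n1), np.var(n2), np.mean(n1 * n2))  # ~ N0/2, N0/2, 0
```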
5.4 A correlation receiver is optimum<br />
In the previous subsection we have seen that our waveform channel is equivalent to an additive<br />
Gaussian noise vector channel (with an infinite number of dimensions). From (4.12) we know<br />
what the decision variables are for this situation. The decision variable for m ∈ M is<br />
‖r^∞ − s_m^∞‖² − 2σ² ln Pr{M = m} = ‖r^∞ − s_m^∞‖² − N_0 ln Pr{M = m}, (5.16)<br />
where we have substituted N_0/2 for the variance σ². The receiver has to minimize (5.16) over<br />
m ∈ M. With (a^∞ · b^∞) = Σ_{i=1}^∞ a_i b_i for a^∞ = (a_1, a_2, · · · ) and b^∞ = (b_1, b_2, · · · ), we<br />
rewrite<br />
‖r^∞ − s_m^∞‖² = ((r^∞ − s_m^∞) · (r^∞ − s_m^∞))<br />
              = (r^∞ · r^∞) − 2(r^∞ · s_m^∞) + (s_m^∞ · s_m^∞)<br />
              = ‖r^∞‖² − 2(r^∞ · s_m^∞) + ‖s_m^∞‖². (5.17)<br />
Since ‖r^∞‖² does not depend on m, the optimum receiver should maximize<br />
(r^∞ · s_m^∞) + c_m over m ∈ M = {1, 2, · · · , |M|}, (5.18)<br />
with<br />
c_m = (N_0/2) ln Pr{M = m} − ‖s_m^∞‖²/2. (5.19)<br />
Now it would be nice if the receiver could use some simple method to determine the dot<br />
product (r^∞ · s_m^∞) and the squared vector length ‖s_m^∞‖², which are both sums of an infinite<br />
number of components. We will see next that this is possible and in fact quite easy. For the dot<br />
product we find<br />
∫_0^T r(t) s_m(t) dt = ∫_0^T r(t) Σ_{i=1}^∞ s_mi f_i(t) dt<br />
                    = Σ_{i=1}^∞ s_mi ∫_0^T r(t) f_i(t) dt = Σ_{i=1}^∞ s_mi r_i = (r^∞ · s_m^∞). (5.20)<br />
Similarly we get for the squared vector length<br />
E_sm = ∫_0^T s_m²(t) dt = ∫_0^T s_m(t) Σ_{i=1}^∞ s_mi f_i(t) dt<br />
     = Σ_{i=1}^∞ s_mi ∫_0^T s_m(t) f_i(t) dt = Σ_{i=1}^∞ s_mi² = ‖s_m^∞‖². (5.21)<br />
This leads to an important result.<br />
RESULT 5.1 The optimum receiver for transmission of the signals s_m(t), m ∈ M, over an<br />
additive white Gaussian noise waveform channel has to maximize<br />
∫_0^T r(t) s_m(t) dt + c_m over m ∈ M = {1, 2, · · · , |M|}. (5.22)<br />
Here<br />
c_m = (N_0/2) ln Pr{M = m} − E_sm/2. (5.23)<br />
This receiver is called a correlation receiver (see figure 5.4).<br />
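RESULT 5.1 translates directly into a few lines of code. The sketch below (ours, not from the notes; integrals approximated by Riemann sums over sampled waveforms) computes ∫_0^T r(t)s_m(t) dt + c_m for every message and picks the maximizing one.

```python
import numpy as np

# Hypothetical sketch (ours, not from the notes) of the correlation receiver
# of RESULT 5.1: compute int_0^T r(t) s_m(t) dt + c_m for every m, with
# c_m = (N0/2) ln Pr{M=m} - E_sm/2, and pick the maximizing message.
def correlation_receiver(r, signals, priors, N0, dt):
    corr = signals @ r * dt                       # int r(t) s_m(t) dt, all m
    energies = np.sum(signals**2, axis=1) * dt    # E_sm
    c = 0.5 * N0 * np.log(priors) - 0.5 * energies
    return int(np.argmax(corr + c)) + 1           # message index in M

# toy antipodal example: s_2(t) = -s_1(t), equal priors
T, n, N0 = 1.0, 200, 0.1
dt = T / n
t = (np.arange(n) + 0.5) * dt
s1 = np.sin(2 * np.pi * t)
signals = np.stack([s1, -s1])
priors = np.array([0.5, 0.5])
rng = np.random.default_rng(1)
r = s1 + rng.normal(0.0, np.sqrt(N0 / (2 * dt)), n)  # message 1 was sent
print(correlation_receiver(r, signals, priors, N0, dt))
```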
[Figure 5.4: the received waveform r(t) is multiplied by each s_m(t) and integrated to give ∫_0^T r(t)s_m(t) dt; the constant c_m is added, and the largest of the |M| sums determines m̂.]<br />
Figure 5.4: Correlation receiver, an optimum receiver for waveforms in white Gaussian noise.<br />
5.5 Sufficient statistic<br />
In the previous section we have seen that the receiver can make an optimum decision by looking<br />
only at the correlations<br />
∫_0^T r(t) s_m(t) dt, for all m ∈ M. (5.24)<br />
These |M| correlations together form a so-called sufficient statistic for making an optimum decision.<br />
It is unnecessary to store the received waveform r(t) after having determined the correlations.<br />
Or, to put it another way, given the correlations the received waveform r(t) is no longer<br />
relevant.<br />
5.6 Exercises<br />
1. Consider a waveform communication system in which one of two equally likely messages<br />
is transmitted, hence M = {1, 2} and Pr{M = 1} = Pr{M = 2} = 1/2. The waveforms<br />
s_1(t) and s_2(t) corresponding to the messages both have energy E, i.e.<br />
∫_0^T s_1²(t) dt = ∫_0^T s_2²(t) dt = E,<br />
where [0, T ) is the observation interval. The channel adds zero-mean additive Gaussian<br />
noise with power spectral density S_Nw( f ) = 1 to the transmitted waveform.<br />
(a) Find an expression for the error probability obtained by a correlation receiver when<br />
s_1(t) = −s_2(t).<br />
(b) What is the error probability of a correlation receiver when ∫_0^T s_1(t)s_2(t) dt = 0?<br />
Chapter 6<br />
Waveform channels, building-blocks<br />
SUMMARY: In this chapter we show that when signal waveforms are linear combinations<br />
of a set of orthonormal waveforms (building-block waveforms), the waveform channel<br />
can be transformed into an AGN vector channel without loss of optimality. To each input<br />
waveform there corresponds a vector in what is called the signal space. It is possible to<br />
make an optimum decision based on the correlations of the channel output waveform with<br />
the building-block waveforms. The difference between this vector of correlations and the<br />
transmitted vector is a noise vector which is Gaussian and spherically distributed.<br />
6.1 Another sufficient statistic<br />
Suppose that the number |M| of waveforms s_m(t) is large. Then the receiver has to determine<br />
|M| correlations, which is quite complex. The question that now pops up is whether an optimum<br />
decision can be based on a smaller number of correlations. This turns out to be true when<br />
all waveforms s_m(t), m ∈ M, can be represented as linear combinations of N orthonormal functions<br />
with N < |M|. We call these orthonormal functions building-block waveforms or building<br />
blocks.<br />
The decomposition of waveforms into building blocks is also essential when we are interested<br />
in evaluating the error probability of a system.<br />
Definition 6.1 The building-block waveforms {ϕ_i(t), i = 1, N} satisfy by definition<br />
∫_0^T ϕ_i(t) ϕ_j(t) dt = { 1 if i = j, 0 if i ≠ j. (6.1)<br />
Note that such a set of building blocks is just a set of functions orthonormal over the interval<br />
[0, T ).<br />
We say that a basis of N building-block waveforms exists for a collection s_1(t), s_2(t), · · · ,<br />
s_|M|(t) of transmitted waveforms if we can decompose these transmitted waveforms as<br />
s_m(t) = Σ_{i=1,N} s_mi ϕ_i(t), for m ∈ M = {1, 2, · · · , |M|}. (6.2)<br />
Note that the coefficient s_mi corresponding to a building block ϕ_i(t) satisfies<br />
∫_0^T s_m(t) ϕ_i(t) dt = ∫_0^T Σ_{j=1,N} s_mj ϕ_j(t) ϕ_i(t) dt = Σ_{j=1,N} s_mj ∫_0^T ϕ_j(t) ϕ_i(t) dt = s_mi. (6.3)<br />
Now the correlation of r(t) and the signal s_m(t) for m ∈ M can be expressed as<br />
∫_0^T r(t) s_m(t) dt = ∫_0^T r(t) Σ_{i=1,N} s_mi ϕ_i(t) dt<br />
                    = Σ_{i=1,N} s_mi ∫_0^T r(t) ϕ_i(t) dt = Σ_{i=1,N} s_mi r_i, (6.4)<br />
with<br />
r_i = ∫_0^T r(t) ϕ_i(t) dt for i = 1, N. (6.5)<br />
Observe that all the correlations ∫_0^T r(t) s_m(t) dt, m ∈ M, can be computed from the N<br />
correlations ∫_0^T r(t) ϕ_i(t) dt, i = 1, N. Therefore these N correlations are also<br />
a sufficient statistic, since from these N numbers an optimum decision can be made.<br />
This is summarized in the following theorem.<br />
RESULT 6.1 If the transmitted waveforms s_m(t), m ∈ M, are linear combinations of the N<br />
building-block waveforms ϕ_i(t), i = 1, N, i.e. if<br />
s_m(t) = Σ_{i=1,N} s_mi ϕ_i(t), for m ∈ M, (6.6)<br />
then the receiver can make an optimum decision from<br />
r_i = ∫_0^T r(t) ϕ_i(t) dt, for i = 1, N. (6.7)<br />
The numbers r_1, r_2, · · · , r_N therefore form a sufficient statistic. The optimum receiver produces<br />
as estimate the message m that maximizes Σ_{i=1}^N r_i s_mi + c_m over m ∈ M. Note that c_m is given<br />
by (5.23).<br />
All this results in the receiver that is shown in figure 6.1. There we first see a bank of N<br />
multipliers and integrators. This bank yields the correlations r_i = ∫_0^T r(t) ϕ_i(t) dt, i = 1, N.<br />
Then a matrix multiplication yields Σ_{i=1,N} r_i s_mi = ∫_0^T r(t) s_m(t) dt for all m ∈ M. Adding the<br />
constants c_m and choosing as message estimate the m that achieves the largest sum yields again<br />
an optimum receiver.<br />
[Figure 6.1: a bank of N multipliers and integrators produces r_1, · · · , r_N; a weighting matrix forms Σ_{i=1}^N r_i s_mi for each m; the constants c_m are added and the largest sum determines m̂.]<br />
Figure 6.1: Correlation receiver based on building blocks.<br />
6.2 The Gram-Schmidt orthogonalization procedure<br />
It is not so difficult to construct an orthonormal basis, i.e. a set of N building-block waveforms,<br />
for a given set of |M| finite-energy signaling waveforms. The Gram-Schmidt procedure can be<br />
used.<br />
RESULT 6.2 (Gram-Schmidt) For an arbitrary signal collection, i.e. a collection of waveforms<br />
{s_1(t), s_2(t), · · · , s_|M|(t)} on the interval [0, T ), we can construct a set of N ≤ |M|<br />
building-block waveforms {ϕ_1(t), ϕ_2(t), · · · , ϕ_N(t)} and find coefficients s_mi such that for m =<br />
1, 2, · · · , |M| the signals can be synthesized as s_m(t) = Σ_{i=1,N} s_mi ϕ_i(t).<br />
The proof of this result can be found in appendix E. There is also an example there in which we<br />
determine a set of building blocks for three signals.<br />
A given set of signals can be expanded in many different sets of building-block waveforms,<br />
in general also in sets with a larger dimensionality. What remains the same however is the geometrical<br />
configuration of the vector representations of the signals (see exercise 3 in this chapter).<br />
The Gram-Schmidt procedure always yields a basis with the smallest possible dimension.<br />
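The procedure of RESULT 6.2 is easy to sketch on sampled waveforms. The code below (ours; the notes prove the result in appendix E) subtracts from each signal its projections on the building blocks found so far, keeps the normalized residual only when its energy is nonzero, and so never produces more than the minimum number of dimensions.

```python
import numpy as np

# Hypothetical discretized Gram-Schmidt (ours, not from the notes).
# Waveforms are rows of a matrix of samples on [0, T); inner products
# int_0^T a(t) b(t) dt are approximated by Riemann sums.
def gram_schmidt(signals, dt, tol=1e-9):
    basis = []
    for s in signals:
        theta = s.copy()
        for phi_k in basis:                    # subtract projections
            theta -= (s @ phi_k * dt) * phi_k
        e = theta @ theta * dt                 # residual energy E_theta
        if e > tol:                            # keep only new directions
            basis.append(theta / np.sqrt(e))
    return np.array(basis)                     # N x n array, N <= |M|

T, n = 1.0, 1000
dt = T / n
t = (np.arange(n) + 0.5) * dt
s1 = np.ones(n)
s2 = 2 * np.ones(n)                            # multiple of s1: adds nothing
s3 = np.sin(2 * np.pi * t)
phi = gram_schmidt(np.stack([s1, s2, s3]), dt)
print(phi.shape[0])                            # dimension N = 2 here
```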
6.3 Transmitting vectors instead of waveforms<br />
In the previous section we have seen that for each collection of signals s_1(t), s_2(t), · · · , s_|M|(t)<br />
we can construct a finite orthonormal basis such that each waveform s_m(t) for m ∈ M is a linear<br />
combination of the building-block waveforms ϕ_1(t), ϕ_2(t), · · · , ϕ_N(t) that form the basis, i.e.<br />
s_m(t) = Σ_{i=1,N} s_mi ϕ_i(t). (6.8)<br />
[Figure 6.2: each coefficient s_mi multiplies the building-block waveform ϕ_i(t); summing the N products yields s_m(t).]<br />
Figure 6.2: A modulator forms the waveform s m (t) from the vector s m .<br />
Hence to each waveform s_m(t) there corresponds a vector consisting of N coefficients<br />
s_m = (s_m1, s_m2, · · · , s_mN). (6.9)<br />
Now, instead of the waveform s_m(t), we can say that the vector s_m is transmitted. A<br />
modulator (see figure 6.2) performs operation (6.8), transforming a vector into the transmitted<br />
waveform.<br />
6.4 <strong>Signal</strong> space, signal structure<br />
The next result applies to the set of all transmitted waveforms.<br />
RESULT 6.3 If the waveforms s_1(t), s_2(t), · · · , s_|M|(t) are linear combinations of N building<br />
blocks, the collection of waveforms can be represented as a collection of vectors s_1, s_2, · · · , s_|M|<br />
in an N-dimensional space. This space is called the signal space. The collection of vectors<br />
s_1, s_2, · · · , s_|M| is called the signal structure.<br />
Example 6.1 Consider for m = 1, 2, 3, 4 the set of phase-modulated transmitter waveforms<br />
s_m(t) = √(2E_s/T) cos(2π f_0 t + mπ/2) for 0 ≤ t < T and 0 elsewhere, (6.10)<br />
where f_0 is an integral multiple of 1/T.<br />
Note that<br />
cos(2π f_0 t + mπ/2) = cos(2π f_0 t) cos(mπ/2) − sin(2π f_0 t) sin(mπ/2). (6.11)<br />
[Figure 6.3: the four signal vectors lie on the coordinate axes: s_4 and s_2 on the positive and negative ϕ_1 axis, s_3 and s_1 on the positive and negative ϕ_2 axis.]<br />
Figure 6.3: Four signals represented as vectors in a two-dimensional space. The length of all<br />
signal vectors is √ E s .<br />
If we now take as building-block waveforms<br />
ϕ_1(t) = √(2/T) cos(2π f_0 t), and<br />
ϕ_2(t) = √(2/T) sin(2π f_0 t), (6.12)<br />
for 0 ≤ t < T and zero elsewhere, we obtain the following vector representations for the signals:<br />
s_1 = (0, −√E_s),<br />
s_2 = (−√E_s, 0),<br />
s_3 = (0, √E_s),<br />
s_4 = (√E_s, 0). (6.13)<br />
These vectors are depicted in figure 6.3. Note that the four waveforms shown in figure 6.4 have the same<br />
signal structure (with respect to the base also shown there) as the signals defined in (6.10). We will learn<br />
later in this chapter that both sets of waveforms also have identical error behavior.<br />
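The signal structure (6.13) can be recovered numerically by projecting the waveforms (6.10) on the building blocks (6.12). The sketch below (ours, not from the notes) does exactly that with Riemann sums; the chosen values E_s = 1 and f_0 = 4/T are illustrative.

```python
import numpy as np

# Hypothetical numerical check of example 6.1 (ours, not from the notes):
# project the phase-modulated waveforms (6.10) onto the building blocks
# (6.12) and recover the signal structure (6.13).
Es, T, f0, n = 1.0, 1.0, 4.0, 20_000   # f0 an integer multiple of 1/T
dt = T / n
t = (np.arange(n) + 0.5) * dt
phi1 = np.sqrt(2 / T) * np.cos(2 * np.pi * f0 * t)
phi2 = np.sqrt(2 / T) * np.sin(2 * np.pi * f0 * t)

structure = []
for m in (1, 2, 3, 4):
    s_m = np.sqrt(2 * Es / T) * np.cos(2 * np.pi * f0 * t + m * np.pi / 2)
    structure.append([s_m @ phi1 * dt, s_m @ phi2 * dt])   # (s_m1, s_m2)
structure = np.array(structure)
print(np.round(structure, 4))   # rows ~ (0,-1), (-1,0), (0,1), (1,0)
```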
[Figure 6.4: two rectangular building blocks ϕ_1(t) and ϕ_2(t) of amplitude √(1/τ) on [0, τ) and [τ, 2τ), and four waveforms s_1(t), · · · , s_4(t) taking the amplitudes ±√(E_s/τ) on these intervals.]<br />
Figure 6.4: Another set of waveforms that leads to the vector diagram in figure 6.3.
6.5 Receiving vectors instead of waveforms<br />
In the previous sections we have shown that a waveform-transmitter actually sends a vector out of<br />
an N-dimensional signal space. In this section we will see that the optimum waveform-receiver<br />
as derived in section 6.1, which operates with sufficient statistics corresponding to the building-blocks,<br />
actually receives a vector in this N-dimensional signal space.<br />
[Figure 6.5: a bank of N multipliers with ϕ_i(t) and integrators maps r(t) to the components r_1, · · · , r_N.]<br />
Figure 6.5: Recovering the vector r = (r 1 , r 2 , · · · , r N ) from the received waveform r(t).<br />
Consider the first part (demodulator) of the optimum receiver for waveform communication<br />
working with sufficient statistics (see figure 6.5). This receiver observes<br />
r(t) = s_m(t) + n_w(t), (6.14)<br />
where s_m(t) for some m ∈ M is the transmitted waveform and n_w(t) is a realization of a white<br />
Gaussian noise process. The first part of the receiver determines the N-dimensional vector r =<br />
(r_1, r_2, · · · , r_N) whose components are defined as (see figure 6.5)<br />
r_i = ∫_0^T r(t) ϕ_i(t) dt for i = 1, 2, · · · , N. (6.15)<br />
The components of this vector are the sufficient statistics mentioned in section 6.1. So we can<br />
say that the optimum receiver makes a decision based on the vector r instead of on the<br />
waveform r(t).<br />
6.6 The additive Gaussian vector channel<br />
In the previous sections we have seen that without losing optimality we may assume that an<br />
N-dimensional vector s m for some m ∈ M is transmitted and an N-dimensional vector r is
received. We will now consider the vector channel that has s_m as input and r as output. Since<br />
r(t) = s_m(t) + n_w(t) we have that<br />
r_i = ∫_0^T r(t) ϕ_i(t) dt = ∫_0^T s_m(t) ϕ_i(t) dt + ∫_0^T n_w(t) ϕ_i(t) dt = s_mi + n_i, (6.16)<br />
with, for i = 1, N,<br />
n_i = ∫_0^T n_w(t) ϕ_i(t) dt. (6.17)<br />
Hence again, as in section 5.3, we get<br />
r = s_m + n, (6.18)<br />
with r = (r_1, r_2, · · · , r_N), s_m = (s_m1, s_m2, · · · , s_mN), and n = (n_1, n_2, · · · , n_N). Note that the<br />
dimension N of the vectors is the dimension of the signal space (the number of building-block<br />
waveforms), which is now finite. In section 5.3 the dimension of the vectors was infinite.<br />
What can we say about the noise vector n that disturbs the transmitted vector s_m? Just<br />
like before, in section 5.3, the noise components are jointly Gaussian: they result from linear<br />
operations on the Gaussian process N_w(t). For the mean of component N_i, i = 1, N, we again<br />
find that<br />
E[N_i] = E[∫_0^T N_w(t) ϕ_i(t) dt] = ∫_0^T E[N_w(t)] ϕ_i(t) dt = 0. (6.19)<br />
By the orthonormality of the building-block waveforms we get for the correlation of component<br />
N_i, i = 1, N, and component N_j, j = 1, N, that<br />
E[N_i N_j] = E[∫_0^T ∫_0^T N_w(α)N_w(β) ϕ_i(α) ϕ_j(β) dα dβ]<br />
           = ∫_0^T ∫_0^T E[N_w(α)N_w(β)] ϕ_i(α) ϕ_j(β) dα dβ<br />
           = ∫_0^T ∫_0^T (N_0/2) δ(α − β) ϕ_i(α) ϕ_j(β) dα dβ<br />
           = ∫_0^T (N_0/2) ϕ_i(α) ϕ_j(α) dα = (N_0/2) δ_ij, (6.20)<br />
and thus all N noise variables have variance N_0/2 and are uncorrelated and, since they are jointly<br />
Gaussian, also independent of each other.<br />
RESULT 6.4 The joint density function of the N-dimensional noise vector n, that is added to<br />
the input of a vector channel which is derived from a waveform communication system, is<br />
p_N(n) = (1/(π N_0)^{N/2}) exp(−‖n‖²/N_0), (6.21)<br />
hence the noise is spherically symmetric: it depends on the magnitude but not on the direction<br />
of n. The noise projected on each dimension i = 1, N has variance N_0/2.<br />
We may conclude that our channel with input s and output r is an additive Gaussian noise<br />
(AGN) vector channel as was described in definition 4.2.<br />
6.7 <strong>Processing</strong> the received vector<br />
Note (see 4.12) that after having determined the vector r, an optimum vector receiver should<br />
search for the message m that minimizes<br />
‖r − s_m‖² − 2σ² ln Pr{M = m}. (6.22)<br />
Note that by result 6.4 we must substitute N_0/2 for σ² here. Inspection now shows that an<br />
optimum vector receiver should maximize<br />
(r · s_m) + (N_0/2) ln Pr{M = m} − ‖s_m‖²/2 = Σ_{i=1,N} r_i s_mi + (N_0/2) ln Pr{M = m} − ‖s_m‖²/2 (6.23)<br />
over m ∈ M. If we note that<br />
E_sm = ∫_0^T s_m²(t) dt = ∫_0^T s_m(t) Σ_{i=1,N} s_mi ϕ_i(t) dt<br />
     = Σ_{i=1,N} s_mi ∫_0^T s_m(t) ϕ_i(t) dt = Σ_{i=1,N} s_mi² = ‖s_m‖², (6.24)<br />
we can observe that the back end of the receiver shown in figure 6.1 performs exactly as we<br />
expect.<br />
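The decision rule (6.23) operates entirely in the N-dimensional signal space. A small sketch (ours, not from the notes) makes this concrete for the signal structure of figure 6.3; with equal priors and equal energies the rule reduces to choosing the closest signal vector.

```python
import numpy as np

# Hypothetical sketch (ours, not from the notes) of the vector receiver
# implied by (6.23): given the received vector r it maximizes
# (r . s_m) + (N0/2) ln Pr{M=m} - ||s_m||^2 / 2 over m.
def vector_receiver(r, vectors, priors, N0):
    metric = (vectors @ r
              + 0.5 * N0 * np.log(priors)
              - 0.5 * np.sum(vectors**2, axis=1))
    return int(np.argmax(metric)) + 1          # message index in M

# signal structure of figure 6.3 with Es = 1 and equal priors
vectors = np.array([[0., -1.], [-1., 0.], [0., 1.], [1., 0.]])
priors = np.full(4, 0.25)
print(vector_receiver(np.array([0.1, 0.8]), vectors, priors, N0=1.0))  # -> 3
```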
6.8 Relation between waveform and vector channel<br />
We have seen that to each set of waveforms there corresponds a finite set of building-block waveforms<br />
such that each waveform is a linear combination of these building-block waveforms. We<br />
have observed that instead of the waveform s m (t) the vector s m is transmitted over the channel.<br />
We have also seen that an optimum receiver only needs to consider the projections (correlations)<br />
of the received signal r(t) onto the building-block waveforms. The receiver can base its<br />
decision only on the vector r, a sufficient statistic. It is important to note that these conclusions<br />
depend on the fact that the channel adds white Gaussian noise to the input waveform.<br />
From the above we may conclude that transmission over an additive white Gaussian waveform<br />
channel can actually be considered as transmission over an additive Gaussian noise vector<br />
channel (see figure 6.6). A vector transmitter produces the vector s m when message m is generated<br />
by the source. This vector is transformed into the waveform s m (t) by a modulator. The
[Figure 6.6: source → vector transmitter → modulator → waveform channel → demodulator → vector receiver; the chain modulator - waveform channel - demodulator forms the vector channel.]<br />
Figure 6.6: Reduction of waveform channel to vector channel.<br />
waveform s m (t) is input to the channel that adds white Gaussian noise. The demodulator operates<br />
on the received waveform r(t) and forms the vector r. A vector receiver determines the<br />
estimate ˆm of the message that was sent from r. The combination modulator - waveform channel<br />
- demodulator is an additive Gaussian noise vector channel. In the previous chapter we have<br />
seen how an optimum receiver for such a channel can be constructed. The decision function that<br />
applies to this situation can be found in (4.12).<br />
We may conclude that, by converting the problem of designing an optimum receiver for<br />
waveform communication to designing an optimum receiver for an additive white Gaussian noise<br />
vector channel, we have solved it. We want to end this chapter with a very important observation:<br />
RESULT 6.5 The error behavior of the waveform communication system is determined only by<br />
the collection of signal vectors (the signal structure) and N_0/2. We should realize that, depending<br />
on the chosen building-block waveforms, different sets of waveforms having the same signal structure<br />
yield the same error performance. Which building-block waveforms are actually chosen by<br />
the communication engineer may depend on other factors, such as bandwidth requirements,<br />
as we will see in later parts of these course notes.<br />
6.9 Exercises<br />
1. For each set of waveforms s_1(t), s_2(t), · · · , s_|M|(t) we can apply the Gram-Schmidt procedure<br />
to construct an orthonormal basis such that each waveform s_m(t) for m ∈ M is a<br />
linear combination of the building-block waveforms ϕ_1(t), ϕ_2(t), · · · , ϕ_N(t). Show that<br />
for a given set of waveforms the Gram-Schmidt procedure results in a basis with the smallest<br />
possible dimension N.<br />
2. If we apply the Gram-Schmidt procedure we must assume that the waveforms s_m(t), for<br />
m ∈ M, have finite energy. In the construction it is essential that the energy E_θm corresponding<br />
to the auxiliary signal θ_m(t) is finite. Show that<br />
E_θm ≤ ∫_0^T s_m²(t) dt < ∞. (6.25)<br />
3. Show that, independent of the choice of the building-block waveforms, the length of any<br />
signal vector s_m for m ∈ M is constant. Moreover show that the Euclidean distance<br />
between two signal vectors s_m and s_m′ for m, m′ ∈ M is constant. What does this imply<br />
for the energy and expected error probability of the system?<br />
4. (a) Calculate P_E^min when the signal sets (a), (b), and (c) specified in figure 6.7 are used to<br />
communicate one of two equally likely messages over a channel disturbed by additive<br />
Gaussian noise with S_n( f ) = 0.15. Note that the third set is specified by the Fourier<br />
transforms (see appendix B) of the waveforms. Use Parseval’s equation.<br />
(b) Also calculate P_E^min for the same three sets when the a-priori message probabilities<br />
are 1/4, 3/4.<br />
(Based on exercise 4.8 from Wozencraft and Jacobs [25].)<br />
[Figure 6.7: sets (a) and (b) show pairs of rectangular time-domain pulses s_1(t), s_2(t); set (c) specifies the pair by their real Fourier transforms S_1( f ) and S_2( f ).]<br />
Figure 6.7: S_i( f ), the Fourier transform of s_i(t), is real.<br />
5. One of three equally likely messages is to be transmitted over an additive white Gaussian<br />
noise channel with S_n( f ) = N_0/2 = 1 by means of pulse amplitude modulation. Specifically<br />
s_1(t) = +2√2 sin(2πt),<br />
s_2(t) = 0,<br />
s_3(t) = −2√2 sin(2πt), (6.26)<br />
for 0 ≤ t ≤ 1 and zero elsewhere.<br />
(a) What mathematical operations are performed by the optimum receiver?<br />
(b) Express the resulting average error probability in terms of Q(·).<br />
(c) Calculate the minimum attainable average error probability if instead of being equal<br />
the message probabilities satisfy<br />
Pr{M = 1} = Pr{M = 3} = 1/(e + 2) and Pr{M = 2} = e/(e + 2), (6.27)<br />
where e is the base of the natural logarithm.<br />
(Exam Communication Principles, July 5, 2002.)<br />
6. One of two equally likely messages is to be transmitted over an additive white Gaussian<br />
noise channel with S_n( f ) = N_0/2 = 2 using two waveforms s_1(t) and s_2(t). Specifically<br />
s_1(t) = 2√2 φ_1(t), and<br />
s_2(t) = 2√2 φ_2(t), (6.28)<br />
where φ_1(t) and φ_2(t) are two orthonormal waveforms, i.e.<br />
∫_{−∞}^{∞} φ_1²(t) dt = ∫_{−∞}^{∞} φ_2²(t) dt = 1 and ∫_{−∞}^{∞} φ_1(t)φ_2(t) dt = 0. (6.29)<br />
(a) What mathematical operations are performed by an optimum receiver?<br />
(b) Express the resulting average error probability in terms of Q(·).<br />
(c) Calculate the minimum attainable average error probability if instead of being equal<br />
the message probabilities satisfy<br />
Pr{M = 1} = 1/(e² + 1) and Pr{M = 2} = e²/(e² + 1), (6.30)<br />
where e is the base of the natural logarithm.<br />
(Exam Communication Principles, July 8, 2003.)<br />
Chapter 7<br />
Matched filters<br />
SUMMARY: Here we discuss matched-filter receiver structures for optimum waveform<br />
communication. The optimum receiver can be implemented as a matched-filter receiver,<br />
so a matched filter not only maximizes the signal-to-noise ratio but it can also be used to<br />
minimize the error probability. In the last part of this chapter we consider the Parseval<br />
relation between correlation and dot product.<br />
7.1 Matched-filter receiver<br />
[Figure 7.1: r(t) enters a filter with impulse response h_i(t) = ϕ_i(T − t), producing output u_i(t).]<br />
Figure 7.1: A filter matched to the building-block waveform ϕ i (t).<br />
Note that for all i = 1, N the building-block waveforms are such that ϕ_i(t) ≡ 0 for t < 0<br />
and t ≥ T.<br />
We can now replace the N multipliers and integrators by N matched filters and samplers.<br />
This can be advantageous in analog implementations, since accurate multipliers are in that case<br />
more expensive than filters. Note that appendix C contains some elementary material on filters.<br />
Consider (see figures 7.1 and 7.2) filters with an impulse response h_i(t) = ϕ_i(T − t) for<br />
i = 1, N. Note that h_i(t) ≡ 0 for t ≤ 0 (i.e. the filter is causal) and also for t > T. For the<br />
output of such a filter we can write<br />
u_i(t) = r(t) ∗ h_i(t) = ∫_{−∞}^{∞} r(α) h_i(t − α) dα = ∫_{−∞}^{∞} r(α) ϕ_i(T − t + α) dα. (7.1)<br />
[Figure 7.2: a building-block waveform ϕ_i(t) on [0, T ) and its time-reversed, delayed version ϕ_i(T − t).]<br />
Figure 7.2: A building-block waveform ϕ i (t) and the impulse response ϕ i (T − t) of the corresponding<br />
matched filter.<br />
At t = T the matched-filter output is<br />
u_i(T) = ∫_{−∞}^{∞} r(α) ϕ_i(α) dα = ∫_0^T r(α) ϕ_i(α) dα = r_i, (7.2)<br />
hence we can determine the i-th component of the vector r in this way. Note that r i is the i-th<br />
component of the sufficient statistic corresponding to the building-blocks.<br />
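The equivalence in (7.2) between correlating with ϕ_i(t) and sampling the matched-filter output at t = T can be checked numerically. The following discrete-time Python sketch is our own illustration; the sample spacing dt, the building-block waveform, and the test waveform r(t) are arbitrary choices, and sums times dt stand in for the integrals:

```python
import math

dt = 1e-3                    # sample spacing; sums times dt approximate integrals
T = 1.0
n = int(T / dt)
ts = [k * dt for k in range(n)]

# Example building-block waveform on [0, T): one sine period, unit energy
phi = [math.sqrt(2 / T) * math.sin(2 * math.pi * t / T) for t in ts]

# Example received waveform: 3 * phi(t) plus an orthogonal component
r = [3.0 * p + math.cos(2 * math.pi * t / T) for p, t in zip(phi, ts)]

# Correlation receiver: r_i = integral of r(t) phi_i(t) dt over [0, T)
r_corr = sum(a * b for a, b in zip(r, phi)) * dt

# Matched filter h(t) = phi(T - t); convolution output sampled at t = T
h = phi[::-1]
u_T = sum(r[j] * h[(n - 1) - j] for j in range(n)) * dt

print(r_corr, u_T)           # identical values: both recover the coefficient 3.0
```

Sampling the matched filter at t = T and correlating with ϕ_i(t) are, sample for sample, the same computation.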
Figure 7.3: Matched-filter receiver based on building-blocks. The received waveform r(t) enters N filters with impulse responses ϕ_1(T − t), . . . , ϕ_N(T − t); their outputs u_1(t), . . . , u_N(t) are sampled at t = T, yielding r_1, . . . , r_N; a weighting matrix forms the dot products (r · s_m) = ∑_i r_i s_mi; the constants c_m are added; and the largest sum determines m̂.<br />
Now we have an alternative method for computing the vector r. This results in the receiver<br />
shown in figure 7.3. Note that for all m ∈ M<br />
(r · s_m) = ∑_{i=1,N} r_i s_mi = ∫_0^T r(t) s_m(t) dt. (7.3)
A filter whose impulse response is a delayed, time-reversed version of a signal ϕ_i(t) is called<br />
“matched to” ϕ_i(t). A receiver that is equipped with such filters is called a matched-filter receiver.<br />
7.2 Direct receiver<br />
Note that, just like the building-blocks, for all m ∈ M the transmitted waveforms are such that<br />
s m (t) ≡ 0 for t < 0 and t ≥ T .<br />
Now for all m ∈ M consider a filter with impulse response s m (T − t), let the waveform<br />
channel output r(t) be the input of all these filters and sample the |M| filter outputs at t = T .<br />
Then we obtain<br />
∫_{−∞}^{∞} r(α) s_m(T − t + α) dα |_{t=T} = ∫_{−∞}^{∞} r(α) s_m(α) dα = ∫_0^T r(α) s_m(α) dα. (7.4)<br />
This gives another method to form an optimum receiver (see figure 7.4). This receiver is called a<br />
direct receiver since the filters are matched directly to the signals {s m (t), m ∈ M}.<br />
Note that a direct receiver is usually more expensive than a receiver with filters matched to<br />
the building-block waveforms, since always |M| ≥ N and in practice often |M| ≫ N. However,<br />
the weighting-matrix operations are not needed here.<br />
Figure 7.4: Direct matched-filter receiver. The received waveform r(t) enters |M| filters with impulse responses s_1(T − t), . . . , s_|M|(T − t); the outputs are sampled at t = T; the constants c_m are added; and the largest sum determines m̂.<br />
7.3 Signal-to-noise ratio<br />
We have seen in the previous section that a matched-filter receiver is optimum, i.e. minimizes<br />
the error probability P E . Not only in this sense is the matched filter optimum, we will show
Figure 7.5: A matched filter maximizes S/N. The input r(t) = s(t) + n_w(t) is filtered by h(t), and the output u_s(T) + u_n(T) is obtained by sampling at t = T.<br />
next that it also maximizes the signal-to-noise ratio. To see what we mean by this, consider<br />
the communication situation shown in figure 7.5. A signal s(t) is assumed to be non-zero only<br />
for 0 ≤ t < T . This signal is observed in additive white noise, i.e. the observer receives<br />
r(t) = s(t)+n w (t). The process N w (t) is a zero-mean white Gaussian noise process with power<br />
spectral density S w ( f ) = N 0 /2 for all −∞ < f < ∞.<br />
The problem is now e.g. to decide whether the signal s(t) was present in the noise or not<br />
(or in a similar setting to decide whether s(t) or −s(t) was seen). Therefore the observer uses a<br />
linear time-invariant filter with impulse response h(t) and samples the filter output at time t = T .<br />
For the sampled filter output at time t = T we can write<br />
u(T) = r(t) ∗ h(t)|_{t=T} = ∫_{−∞}^{∞} r(T − α) h(α) dα = u_s(T) + u_n(T), (7.5)<br />
with<br />
u_s(T) = ∫_{−∞}^{∞} s(T − α) h(α) dα, and (7.6)<br />
u_n(T) = ∫_{−∞}^{∞} n_w(T − α) h(α) dα, (7.7)<br />
where u_s(T) is the signal component and u_n(T) the noise component in the sampled filter<br />
output.<br />
Definition 7.1 We can now define the signal-to-noise ratio as<br />
S/N = u_s²(T) / E[U_n²(T)], (7.8)<br />
i.e. the ratio between the signal energy and the noise variance.<br />
The noise variance can be expressed as<br />
E[U_n²(T)] = E[∫_{−∞}^{∞} N_w(T − α) h(α) dα ∫_{−∞}^{∞} N_w(T − β) h(β) dβ]<br />
= ∫_{−∞}^{∞} ∫_{−∞}^{∞} E[N_w(T − α) N_w(T − β)] h(α) h(β) dα dβ<br />
= (N_0/2) ∫_{−∞}^{∞} ∫_{−∞}^{∞} δ(β − α) h(α) h(β) dα dβ = (N_0/2) ∫_{−∞}^{∞} h²(α) dα. (7.9)
RESULT 7.1 If we substitute (7.9) and (7.6) in (7.8) we obtain for the maximum attainable<br />
signal-to-noise ratio<br />
S/N = [∫_{−∞}^{∞} s(T − α) h(α) dα]² / ((N_0/2) ∫_{−∞}^{∞} h²(α) dα)<br />
(∗)≤ (∫_{−∞}^{∞} s²(T − α) dα · ∫_{−∞}^{∞} h²(α) dα) / ((N_0/2) ∫_{−∞}^{∞} h²(α) dα)<br />
= ∫_{−∞}^{∞} s²(T − α) dα / (N_0/2) = ∫_0^T s²(T − α) dα / (N_0/2). (7.10)<br />
The inequality (∗) in this derivation follows from the Schwarz inequality (see appendix F). Equality<br />
is obtained if and only if h(t) = Cs(T − t) for some constant C, i.e. if the filter h(t) is matched<br />
to the signal s(t).<br />
Note that the maximum signal-to-noise ratio depends only on the energy of the waveform<br />
s(t) and not on its specific shape.<br />
It should be noted that in this section we have demonstrated a weaker form of optimality for<br />
the matched filter than the one that we have obtained in the previous chapter. The matched filter<br />
is not only the filter that maximizes signal-to-noise ratio, but can be used for optimum detection<br />
as well.<br />
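The claim of result 7.1 is easy to check numerically: with h(t) = s(T − t) the ratio (7.10) attains E_s/(N_0/2), and no other filter does better. A discrete-time Python sketch (our own illustration; the signal, N_0, and the random competitor filters are arbitrary choices):

```python
import math, random

random.seed(1)
dt, T, N0 = 1e-2, 1.0, 0.5
n = int(T / dt)
s = [math.sin(2 * math.pi * k * dt / T) for k in range(n)]   # example signal on [0, T)
s_rev = s[::-1]                                              # samples of s(T - t)

def snr(h):
    # S/N from (7.10): [int s(T-a) h(a) da]^2 / ((N0/2) int h^2(a) da)
    num = sum(a * b for a, b in zip(s_rev, h)) * dt
    return num * num / ((N0 / 2) * sum(x * x for x in h) * dt)

E_s = sum(x * x for x in s) * dt
snr_matched = snr(s_rev)                 # the filter matched to s(t)
print(snr_matched, E_s / (N0 / 2))       # equal: the maximum of (7.10)

# The Schwarz inequality: every other filter gives a smaller ratio
for _ in range(200):
    h = [random.gauss(0, 1) for _ in range(n)]
    assert snr(h) <= snr_matched + 1e-9
```

Scaling h(t) by a constant C leaves the ratio unchanged, in agreement with the equality condition h(t) = Cs(T − t).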
7.4 Parseval relation<br />
Equation (7.3) demonstrates that a correlation can be expressed as the dot product of the corresponding<br />
vectors. We will investigate this phenomenon more closely here. To this end, consider a<br />
set of orthonormal waveforms {ϕ_i(t), i = 1, 2, · · · , N} over the interval [0, T) and two waveforms<br />
f (t) and g(t) that can be expressed in terms of these building-blocks, i.e.<br />
f(t) = ∑_{i=1,N} f_i ϕ_i(t),<br />
g(t) = ∑_{i=1,N} g_i ϕ_i(t). (7.11)<br />
The vector representations that correspond to f(t) and g(t) are<br />
f = (f_1, f_2, · · · , f_N) and<br />
g = (g_1, g_2, · · · , g_N). (7.12)<br />
Now:
RESULT 7.2 (Parseval relation)<br />
∫_0^T f(t) g(t) dt = ∫_0^T ∑_{i=1,N} f_i ϕ_i(t) ∑_{j=1,N} g_j ϕ_j(t) dt<br />
= ∑_{i=1,N} ∑_{j=1,N} f_i g_j ∫_0^T ϕ_i(t) ϕ_j(t) dt<br />
= ∑_{i=1,N} ∑_{j=1,N} f_i g_j δ_ij = ∑_{i=1,N} f_i g_i = (f · g). (7.13)<br />
This result says that the correlation of f(t) and g(t), which is defined as the integral of their<br />
product, is equal to the dot product of the corresponding vectors. Note that this result can be<br />
regarded as an analogue of the Parseval relation in Fourier analysis (see appendix B), which says<br />
that<br />
∫_{−∞}^{∞} f(t) g(t) dt = ∫_{−∞}^{∞} F(f) G*(f) df, (7.14)<br />
with<br />
F(f) = ∫_{−∞}^{∞} f(t) exp(−j2πft) dt, and<br />
G(f) = ∫_{−∞}^{∞} g(t) exp(−j2πft) dt. (7.15)<br />
Here G*(f) is the complex conjugate of G(f).<br />
The consequences of result 7.2 are:<br />
• Take g(t) ≡ f(t); then<br />
∫_0^T f²(t) dt = (f · f) = ‖f‖². (7.16)<br />
This means that the energy of waveform f(t) is simply the square of the length of the<br />
corresponding vector f. We therefore also call the squared length of a vector its energy.<br />
We have seen before that<br />
E_{s_m} = ∫_0^T s_m²(t) dt = ‖s_m‖², (7.17)<br />
the energy corresponding to the waveform s_m(t), for m ∈ M = {1, 2, · · · , |M|}.<br />
• Consider<br />
∫_0^T r(t) s_m(t) dt = ∫_0^T r(t) ∑_{i=1,N} s_mi ϕ_i(t) dt<br />
= ∑_{i=1,N} s_mi ∫_0^T r(t) ϕ_i(t) dt = ∑_{i=1,N} s_mi r_i = (s_m · r). (7.18)<br />
This result is similar to the Parseval result 7.2 but not identical, since r(t) ≠ ∑_{i=1,N} r_i ϕ_i(t),<br />
i.e. r(t) cannot be expressed as a linear combination of building-block waveforms. Note<br />
that (7.18) is identical to equation (7.3).
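Result 7.2 can be verified numerically. The Python sketch below (our own illustration) builds two orthonormal waveforms on [0, 1) and checks that the integral of f(t)g(t) equals (f · g); the building blocks and the coefficients f_vec, g_vec are arbitrary example choices:

```python
import math

dt = 1e-3
n = int(1 / dt)
# Two orthonormal building blocks on [0, 1): sqrt(2) sin and sqrt(2) cos
phi1 = [math.sqrt(2) * math.sin(2 * math.pi * k * dt) for k in range(n)]
phi2 = [math.sqrt(2) * math.cos(2 * math.pi * k * dt) for k in range(n)]

def integral(x, y):
    # Riemann-sum approximation of the correlation integral over [0, 1)
    return sum(a * b for a, b in zip(x, y)) * dt

f_vec, g_vec = (2.0, -1.0), (0.5, 3.0)      # coefficients as in (7.11)
f = [f_vec[0] * a + f_vec[1] * b for a, b in zip(phi1, phi2)]
g = [g_vec[0] * a + g_vec[1] * b for a, b in zip(phi1, phi2)]

dot = f_vec[0] * g_vec[0] + f_vec[1] * g_vec[1]
print(integral(f, g), dot)                  # both equal -2.0 (Parseval, (7.13))
```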
7.5 Notes<br />
Figure 7.6: Dwight O. North, inventor of the matched filter. Photo IEEE-IT Soc. Newsl., Dec.<br />
1998.<br />
The matched filter as a filter for maximizing the signal-to-noise ratio was invented by North<br />
[14] in 1943. The result was published in a classified report at RCA Labs in Princeton. The<br />
name “matched filter” was coined by Van Vleck and Middleton who independently published<br />
the result a year later in a Harvard Radio Research Lab report [24].<br />
7.6 Exercises<br />
1. One of two equally likely messages is to be transmitted over an additive white Gaussian<br />
noise channel with S_n(f) = 0.05 by means of binary pulse-position modulation. Specifically<br />
s_1(t) = p(t) and s_2(t) = p(t − 2), (7.19)<br />
in which the pulse p(t) is shown in figure 7.7.<br />
(a) What mathematical operations are performed by the optimum receiver?<br />
(b) What is the resulting average error probability?<br />
(c) Indicate two methods of implementing the receiver, each of which uses a single linear<br />
filter followed by a sampler and comparison device. Method I requires that two samples<br />
from the filter output be fed into the comparison device. Method II requires that<br />
just one sample be used. For each method sketch the impulse response of the appropriate<br />
filter and its response to p(t). Which of the methods is most easily extended to<br />
M-ary pulse position modulation, where s m (t) = p(t − 2m + 2), m = 1, 2, · · · , M?
Figure 7.7: A triangular pulse p(t) with peak value 2, nonzero for 0 ≤ t < 2.<br />
(d) Suggest another pair of waveforms that require the same energy as the binary pulse-position<br />
waveforms and yield the same average error probability, and a pair that<br />
achieves smaller error probability.<br />
(e) Calculate the minimum attainable average error probability if<br />
s_1(t) = p(t) and s_2(t) = p(t − 1). (7.20)<br />
Repeat this for<br />
s_1(t) = p(t) and s_2(t) = −p(t − 1). (7.21)<br />
(Exercise 4.10 from Wozencraft and Jacobs [25].)<br />
Figure 7.8: The waveforms s_1(t) and s_2(t), both defined for 0 ≤ t < 7, with level √(E_s/2) indicated for s_1(t) and levels ±√(E_s/7) for s_2(t).<br />
2. Specify a matched filter for each of the signals shown in figure 7.8 and sketch each filter<br />
output as a function of time when the signal matched to it is the input. Sketch the output<br />
of the filter matched to s 2 (t) when the input is s 1 (t).<br />
(Exercise 4.14 from Wozencraft and Jacobs [25].)<br />
3. Consider a transmitter that sends the signal<br />
s_a(t) = ∑_{k=1,K} a_k h(t − (k − 1)τ),
Figure 7.9: Pulse transmission and detection with a sampled-matched-filter receiver. Shown are the impulses a_1, · · · , a_K, the pulse h(t), the transmitted signal s_a(t), and the receiver: white noise n_w(t) is added to s_a(t), the sum is filtered by h(T − t), and the filter output is sampled at t = T + (k − 1)τ. Signals are also shown.<br />
when the message a = a_1, a_2, · · · , a_K has to be conveyed to the decoder (i.e. pulse transmission,<br />
see figure 7.9). Assume that a_k ∈ {−1, +1} for k = 1, K, thus there are 2^K<br />
messages. The signal is sent over an additive white Gaussian noise waveform channel.<br />
Assume that h(t) = 0 for t < 0 and t ≥ T . The receiver now uses a matched filter<br />
h(T − t) and samples its output at t = T + (k − 1)τ, for k = 1, K . These K samples are<br />
then processed to form an estimate of the transmitted message (sequence).<br />
Show that this receiver is optimum if processing of the samples is done in the right way.<br />
What is optimum processing here?<br />
4. One of two equally likely messages is to be transmitted over an additive white Gaussian<br />
noise channel with S n ( f ) = N 0 /2 = 1 by means of binary pulse-position modulation.<br />
Specifically<br />
s 1 (t) = p(t),<br />
s 2 (t) = p(t − 2),<br />
in which the pulse p(t) is shown in figure 7.10.<br />
(a) Describe (and sketch) an optimum receiver for this case. Express the resulting error<br />
probability in terms of Q(·).<br />
(b) Give the implementation of an optimum receiver which uses a single linear filter<br />
followed by a sampler and comparison device. Assume that two samples from the<br />
filter output are fed into the comparison device. Sketch the impulse response of the
Figure 7.10: A rectangular pulse p(t), nonzero for 0 ≤ t < 2.<br />
appropriate filter. What is the output of the filter at both sample moments when the<br />
filter input is s 1 (t)? What are these outputs for filter input s 2 (t)?<br />
(c) Calculate the minimum attainable average error probability if<br />
s 1 (t) = p(t) and s 2 (t) = p(t − 1).<br />
(Exam Communication Principles, October 6, 2003.)<br />
5. In a communication system based on an additive white Gaussian noise waveform channel<br />
six signals (waveforms) are used. All signals are zero for t < 0 and t ≥ 8. For 0 ≤ t < 8<br />
the signals are<br />
s 1 (t) = 0<br />
s 2 (t) = +2 cos(πt/2)<br />
s 3 (t) = +2 cos(πt/2) + 2 sin(πt/2)<br />
s 4 (t) = +2 cos(πt/2) + 4 sin(πt/2)<br />
s 5 (t) = +4 sin(πt/2)<br />
s 6 (t) = +2 sin(πt/2)<br />
The messages corresponding to the signals all have probability 1/6. The power spectral<br />
density of the noise process N w (t) is N 0 /2 = 4/9 for all f . The receiver observes the<br />
received waveform r(t) = s m (t) + n w (t) in the time interval 0 ≤ t < 8.<br />
(a) Determine a set of building-block waveforms for these six signals. Sketch these<br />
building-block waveforms. Show that they are orthonormal over [0, 8). Give the<br />
vector representations of all six signals and sketch the resulting signal structure.<br />
(b) Describe for what received vectors r an optimum receiver chooses ˆM = 1, ˆM =<br />
2, · · · , and ˆM = 6. Sketch the corresponding decision regions I 1 , I 2 , · · · , I 6 . Give<br />
an expression for the error probability P E obtained by an optimum receiver. Use the<br />
Q(·)-function.<br />
(c) Sketch and specify a matched-filter implementation of an optimum receiver for the<br />
six signals (all occurring with equal a-priori probability). Use only two matched<br />
filters.
(d) Compute the average energy of the signals. We can translate the signal structure<br />
over a vector a such that the average signal energy is minimal. Determine this vector<br />
a. What is the minimal average signal energy? Specify the modified signals<br />
s′_1(t), s′_2(t), · · · , s′_6(t) that correspond to this translated signal structure.<br />
(Exam Communication Theory, November 18, 2003.)
Chapter 8<br />
Orthogonal signaling<br />
SUMMARY: Here we investigate orthogonal signaling, in which the waveforms that<br />
correspond to the messages are orthogonal. We determine the average error probability P_E<br />
for these signals. It turns out that P_E can be made arbitrarily small by increasing the number<br />
of waveforms, as long as the energy per transmitted bit exceeds N_0 ln 2. Note that this is<br />
a capacity result!<br />
8.1 Orthogonal signal structure<br />
Consider |M| signals s m (t), or in vector representation s m , with a-priori probabilities Pr{M =<br />
m} = 1/|M| for m ∈ M = {1, 2, · · · , |M|}. We now define an orthogonal signal set in the<br />
following way:<br />
Definition 8.1 All signals in an orthogonal set are assumed to have equal energy and to be<br />
orthogonal, i.e.<br />
s_m = √E_s ϕ_m for m ∈ M, (8.1)<br />
where ϕ_m is the unit vector corresponding to dimension m. There are as many building-block<br />
waveforms ϕ_m(t) and dimensions in the signal space as there are messages.<br />
The signals that we have defined are now orthogonal since<br />
∫_{−∞}^{∞} s_m(t) s_k(t) dt = (s_m · s_k) = E_s (ϕ_m · ϕ_k) = E_s δ_mk for m ∈ M and k ∈ M. (8.2)<br />
Note that all signals have energy equal to E s .<br />
So far we have not been very explicit about the actual signals. However we can e.g. think of<br />
(disjoint) shifts of a pulse (pulse-position modulation, PPM) or sines and cosines with an integer<br />
number of periods over [0, T ) (frequency-shift keying, FSK).<br />
8.2 Optimum receiver<br />
How does the optimum receiver decide when it receives the vector r = (r_1, r_2, · · · , r_|M|)? It has<br />
to choose the message m ∈ M that minimizes the squared Euclidean distance between s_m and<br />
r, i.e.<br />
‖r − s_m‖² = ‖r‖² + ‖s_m‖² − 2(r · s_m) = ‖r‖² + E_s − 2√E_s r_m. (8.3)<br />
RESULT 8.1 Since only the term −2 √ E s r m depends on m an optimum receiver has to choose<br />
ˆm such that<br />
r ˆm ≥ r m for all m ∈ M. (8.4)<br />
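Result 8.1 leads directly to a simulation. The Python sketch below is our own illustration; the values of |M|, E_s, and N_0 are arbitrary choices. It generates received vectors in the |M|-dimensional signal space (noise variance N_0/2 per dimension) and applies the pick-the-largest-component rule:

```python
import math, random

random.seed(7)
M, Es, N0 = 8, 4.0, 1.0
sigma = math.sqrt(N0 / 2)           # noise standard deviation per dimension

def receive(m):
    # r = s_m + n with s_m = sqrt(Es) on axis m (definition 8.1)
    r = [random.gauss(0.0, sigma) for _ in range(M)]
    r[m] += math.sqrt(Es)
    return r

trials, errors = 20000, 0
for _ in range(trials):
    m = random.randrange(M)
    r = receive(m)
    m_hat = max(range(M), key=lambda i: r[i])    # result 8.1: largest r_m wins
    errors += (m_hat != m)

print(errors / trials)              # Monte-Carlo estimate of P_E
```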
Now we can find an expression for the error probability.<br />
8.3 Error probability<br />
To determine the error probability for orthogonal signaling we may assume, because of symmetry,<br />
that signal s_1(t) was actually sent. Then<br />
r_1 = √E_s + n_1,<br />
r_m = n_m, for m = 2, 3, · · · , |M|, (8.5)<br />
with p_N(n) = ∏_{m=1,|M|} (1/√(πN_0)) exp(−n_m²/N_0).<br />
Note that the noise vector n = (n_1, n_2, · · · , n_|M|) consists of |M| independent components, all<br />
with mean 0 and variance N_0/2.<br />
Suppose that the first component of the received vector is α. Then we can write for the correct<br />
probability, conditional on the fact that message 1 was sent and that the first component of r is<br />
α,<br />
Pr{M̂ = 1|M = 1, R_1 = α} = Pr{N_2 < α, N_3 < α, · · · , N_|M| < α}<br />
= (Pr{N_2 < α})^{|M|−1} = (∫_{−∞}^{α} p_N(β) dβ)^{|M|−1}, (8.6)<br />
therefore the correct probability is<br />
P_C = ∫_{−∞}^{∞} p_N(α − √E_s) (∫_{−∞}^{α} p_N(β) dβ)^{|M|−1} dα. (8.7)
We rewrite this correct probability as follows:<br />
P_C = ∫_{−∞}^{∞} (1/√(πN_0)) exp(−(α − √E_s)²/N_0) (∫_{−∞}^{α} · · · dβ)^{|M|−1} dα<br />
= ∫_{−∞}^{∞} (1/√(2π)) exp(−(α/√(N_0/2) − √E_s/√(N_0/2))²/2) (∫_{−∞}^{α} · · · dβ)^{|M|−1} dα/√(N_0/2)<br />
= ∫_{−∞}^{∞} (1/√(2π)) exp(−(µ − √(2E_s/N_0))²/2) (∫_{−∞}^{µ√(N_0/2)} · · · dβ)^{|M|−1} dµ, (8.8)<br />
with µ = α/√(N_0/2). Furthermore<br />
∫_{−∞}^{µ√(N_0/2)} (1/√(πN_0)) exp(−β²/N_0) dβ = ∫_{−∞}^{µ√(N_0/2)} (1/√(2π)) exp(−(β/√(N_0/2))²/2) dβ/√(N_0/2)<br />
= ∫_{−∞}^{µ} (1/√(2π)) exp(−λ²/2) dλ, (8.9)<br />
with λ = β/√(N_0/2). Therefore<br />
P_C = ∫_{−∞}^{∞} p(µ − b) (∫_{−∞}^{µ} p(λ) dλ)^{|M|−1} dµ, (8.10)<br />
with p(γ) = (1/√(2π)) exp(−γ²/2) and b = √(2E_s/N_0). Note that p(γ) is the probability density<br />
function of a Gaussian random variable with mean zero and variance 1.<br />
From (8.10) we conclude that for a given |M| the correct probability P_C depends only on<br />
b, i.e. on the ratio E_s/N_0. This can be regarded as a signal-to-noise ratio, since E_s is the signal<br />
energy and N_0 is twice the variance of the noise in each dimension.<br />
Example 8.1 In figure 8.1 the error probability P E = 1 − P C is depicted for values of log 2 |M| =<br />
1, 2, · · · , 16 and as a function of E s /N 0 .<br />
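Equation (8.10) can be evaluated numerically; note that the inner integral is simply the standard normal distribution function Φ(µ). The Python sketch below is our own numerical illustration (the step count and the test value E_s/N_0 = 4 are arbitrary choices). For |M| = 2 the two signal vectors are at distance √(2E_s), so the binary result P_E = Q(√(E_s/N_0)) serves as a check, and the growth of P_E with |M| matches figure 8.1:

```python
import math

def p(x):                      # standard normal density, as in (8.10)
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def Phi(x):                    # standard normal distribution function
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def Q(x):
    return 1 - Phi(x)

def P_C(M, Es_over_N0, steps=4000):
    # Riemann-sum evaluation of (8.10); the inner integral of p(lambda) is Phi(mu)
    b = math.sqrt(2 * Es_over_N0)
    lo, h = b - 10, 20 / steps
    return sum(p(lo + k * h - b) * Phi(lo + k * h) ** (M - 1)
               for k in range(steps)) * h

# |M| = 2: two orthogonal vectors at distance sqrt(2 Es), so P_E = Q(sqrt(Es/N0))
pe2 = 1 - P_C(2, 4.0)
print(pe2, Q(2.0))                                # both approximately 0.0228

# P_E grows with |M| at fixed Es/N0, as in figure 8.1
print([round(1 - P_C(2 ** k, 4.0), 4) for k in (1, 2, 3, 4)])
```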
8.4 Capacity<br />
Consider the following experiment. We keep increasing |M| and want to know how we should<br />
increase E_s/N_0 such that the error probability P_E gets smaller and smaller. It turns out that it is<br />
the energy per bit that counts. We define E_b, the energy per transmitted bit of information, as<br />
E_b = E_s / log_2 |M|. (8.11)<br />
Reliable communication is possible if E_b > N_0 ln 2. More precisely:
Figure 8.1: Error probability P_E for |M| orthogonal signals as a function of E_s/N_0 in dB, for log_2 |M| = 1, 2, · · · , 16. Note that P_E increases with |M| for fixed E_s/N_0.<br />
RESULT 8.2 The error probability for orthogonal signaling satisfies<br />
P_E ≤ 2 exp(−log_2 |M| [E_b/(2N_0) − ln 2]) for E_b/N_0 ≥ 4 ln 2,<br />
P_E ≤ 2 exp(−log_2 |M| [√(E_b/N_0) − √(ln 2)]²) for ln 2 ≤ E_b/N_0 ≤ 4 ln 2. (8.12)<br />
The proof of result 8.2 can be found in appendix G.<br />
The consequence of (8.12) is that if E_b, i.e. the energy per bit, is larger than N_0 ln 2, we can<br />
get an arbitrarily small error probability by increasing |M|, the dimensionality of the signal space.<br />
This is a capacity result!<br />
Note that the number of bits that we transmit each time is log 2 |M| and this number grows<br />
much slower than |M| itself.<br />
Example 8.2 In figure 8.2 we have plotted the error probability P E as a function of the ratio E b /N 0 . It<br />
appears that for ratios larger than ln 2 = −1.5917 dB (this number is called the Shannon limit) the error<br />
probability decreases by making |M| larger.
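The bound of result 8.2 and the Shannon limit can be explored numerically. A Python sketch (our own illustration; the test value E_b/N_0 = 1 > ln 2 is an arbitrary choice):

```python
import math

LN2 = math.log(2)

def pe_bound(log2M, EbN0):
    # The two regimes of (8.12); EbN0 denotes the ratio Eb/N0
    if EbN0 >= 4 * LN2:
        return 2 * math.exp(-log2M * (EbN0 / 2 - LN2))
    return 2 * math.exp(-log2M * (math.sqrt(EbN0) - math.sqrt(LN2)) ** 2)

# The Shannon limit Eb/N0 = ln 2, expressed in dB
print(10 * math.log10(LN2))                    # -1.5917... dB

# For any Eb/N0 above ln 2 the bound decays exponentially in log2 |M|
# (it is vacuous while above 1, but keeps shrinking toward 0)
bounds = [pe_bound(k, 1.0) for k in (4, 16, 64, 256)]
print(bounds)                                  # strictly decreasing
```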
Figure 8.2: Error probability for |M| = 2, 4, 8, · · · , 32768, now as a function of the ratio E_b/N_0 in dB.<br />
8.5 Exercises<br />
1. A Hadamard matrix is a matrix whose elements are ±1. When n is a power of 2, an n × n<br />
Hadamard matrix is constructed by means of the recursion<br />
H_2 = ( +1  +1 )<br />
      ( +1  −1 ),<br />
H_2n = ( +H_n  +H_n )<br />
       ( +H_n  −H_n ). (8.13)<br />
Let n be a power of 2 and M = {1, 2, · · · , n}. Consider for m ∈ M the signal vectors<br />
s_m = √(E_s/n) h_m, where h_m is the m-th row of the Hadamard matrix H_n.<br />
(a) Show that the signal set {s m , m ∈ M} consists of orthogonal vectors all having<br />
energy E s .<br />
(b) What is the error probability P_E if the signal vectors correspond to waveforms that<br />
are transmitted over a waveform channel with additive white Gaussian noise having<br />
spectral density N_0/2?<br />
(c) What is the advantage of using the Hadamard signal set over the orthogonal set from<br />
definition 8.1 if we assume that the building-block waveforms are in both cases time-shifts<br />
of a pulse? And the disadvantage?<br />
(d) The 2n × n matrix<br />
H*_n = ( +H_n )<br />
       ( −H_n ) (8.14)<br />
defines a set of 2n signals which is called bi-orthogonal. Determine the error probability<br />
of this signal set if the energy of each signal is E_s.<br />
A bi-orthogonal “code” with n = 32 was used for an early deep-space mission (Mariner,<br />
1969). A fast Hadamard transform was used as decoding method [6].<br />
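The recursion (8.13) is straightforward to implement. The Python sketch below (our own illustration) builds H_8 and verifies that the rows, and hence the signal vectors s_m = √(E_s/n) h_m, are orthogonal:

```python
def hadamard(n):
    # Recursion (8.13): start from H_1 = [1] and double the size until n
    H = [[1]]
    while len(H) < n:
        H = ([row + row for row in H] +                 # ( +H_n  +H_n )
             [row + [-x for x in row] for row in H])    # ( +H_n  -H_n )
    return H

H8 = hadamard(8)
# Rows satisfy (h_m . h_k) = n delta_mk, so s_m = sqrt(Es/n) h_m gives
# (s_m . s_k) = Es delta_mk: an orthogonal, equal-energy signal set.
for m in range(8):
    for k in range(8):
        dot = sum(a * b for a, b in zip(H8[m], H8[k]))
        assert dot == (8 if m == k else 0)
print("H_8 rows are orthogonal")
```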
2. Consider a communication system based on frequency-shift keying (FSK). There are 8<br />
equiprobable messages, hence Pr{M = m} = 1/8 for m ∈ {1, 2, · · · , 8}. The signal<br />
waveform corresponding to message m is<br />
s m (t) = A √ 2 cos(2πmt), for 0 ≤ t < 1.<br />
For t < 0 and t ≥ 1 all signals are zero. The signals are transmitted over an additive white<br />
Gaussian noise waveform channel. The power spectral density of the noise is S n ( f ) =<br />
N 0 /2 for all frequencies f .<br />
(a) First show that the signals s m (t), m ∈ {1, 2, · · · , 8} are orthogonal 1 . Give the energies<br />
of the signals. What are the building-block waveforms ϕ 1 (t), ϕ 2 (t), · · · , ϕ 8 (t)<br />
that result in the signal vectors<br />
s 1 = (A, 0, 0, 0, 0, 0, 0, 0),<br />
s 2 = (0, A, 0, 0, 0, 0, 0, 0),<br />
· · ·<br />
s 8 = (0, 0, 0, 0, 0, 0, 0, A)?<br />
(b) The optimum receiver first determines the correlations r_i = ∫_0^1 r(t) ϕ_i(t) dt for i =<br />
1, 2, · · · , 8. Here r(t) is the received waveform, hence r(t) = s_m(t) + n_w(t). For<br />
what values of the vector r = (r 1 , r 2 , · · · , r 8 ) does the receiver decide that message<br />
m was transmitted?<br />
(c) Give an expression for the error probability P E obtained by the optimum receiver.<br />
(d) Next consider a system with 16 messages all having the same a-priori probability.<br />
The signal waveform corresponding to message m is now given by<br />
s m (t) = A √ 2 cos(2πmt), for 0 ≤ t < 1 for m = 1, 2, · · · , 8,<br />
s m (t) = −A √ 2 cos(2π(m − 8)t), for 0 ≤ t < 1 for m = 9, 10, · · · , 16.<br />
For t < 0 and t ≥ 1 these signals are again zero. What are the signal vectors<br />
s_1, s_2, · · · , s_16 if we again use the building-block waveforms mentioned in part (a)?<br />
1 Hint: 2 cos(a) cos(b) = cos(a − b) + cos(a + b).
(e) The optimum receiver again determines the correlations r_i = ∫_0^1 r(t) ϕ_i(t) dt for i =<br />
1, 2, · · · , 8. For what values of the vector r = (r 1 , r 2 , · · · , r 8 ) does the receiver now<br />
decide that message m was transmitted? Give an expression for the error probability<br />
P E obtained by the optimum receiver now.<br />
(Exam Communication Principles, October 6, 2003.)
Part III<br />
Transmitter Optimization<br />
Chapter 9<br />
Signal energy considerations<br />
SUMMARY: The collection of vectors that corresponds to the signal waveforms is called<br />
the signal structure. If the signal structure is translated or rotated, the error probability<br />
need not change. In this chapter we determine the translation vector that minimizes the<br />
average signal energy. After that we demonstrate that orthogonal signaling needs twice<br />
as much energy as antipodal signaling to achieve a given error performance. The energy<br />
loss of orthogonal sets with many signals can be neglected, however.<br />
9.1 Translation and rotation of signal structures<br />
In the additive white 1 Gaussian noise case, if a signal structure, i.e. the collection of vectors<br />
s 1 , s 2 , · · · , s |M| , is translated or rotated the error probability P E need not change. To see why,<br />
just assume that the corresponding decision regions I 1 , I 2 , · · · , I |M| (which need not be optimum)<br />
are translated or rotated in the same way. Then, because the additive white Gaussian<br />
noise-vector is spherically symmetric in all dimensions in the signal space, and since distances<br />
between decoding regions and signal points did not change, the error probability remains the<br />
same. However, in general, the average signal energy changes if a signal structure is translated.<br />
Rotation about the origin has no effect on the average signal energy.<br />
9.2 Signal energy<br />
Here we want to stress again that<br />
E_{s_m} = ∫_0^T s_m²(t) dt = ‖s_m‖², (9.1)<br />
1 Although we are concerned with an AGN vector channel, we use the word “white” to emphasize that this vector<br />
channel is equivalent to a waveform channel with additive white Gaussian noise.<br />
therefore we only need to know the collection of signal vectors, the signal structure, and the<br />
message probabilities to determine the average signal energy, which is defined as<br />
E_av = ∑_{m∈M} Pr{M = m} E_{s_m} = ∑_{m∈M} Pr{M = m} ‖s_m‖² = E[‖S‖²]. (9.2)<br />
9.3 Translating a signal structure<br />
The smallest possible average error probability of a waveform communication system only depends<br />
on the signal structure, i.e. on the collection of vectors s 1 , s 2 , · · · , s |M| . It does not<br />
change if the entire signal structure is translated. The optimum decision regions simply translate<br />
too. Translation may affect the average signal energy, however. We next determine the translation<br />
vector that minimizes the average signal energy.<br />
Figure 9.1: Translation of a signal structure, moving the origin of the coordinate system to a.<br />
Consider a certain signal structure. Let E_av(a) denote the average signal energy when the<br />
structure is translated over −a, or equivalently when the origin of the coordinate system is moved<br />
to a (see figure 9.1); then<br />
E_av(a) = ∑_{m∈M} Pr{M = m} ‖s_m − a‖² = E[‖S − a‖²]. (9.3)<br />
We now have to find out how we should choose the translation vector a to minimize<br />
E_av(a). It turns out that the best choice is<br />
a = ∑_{m=1,|M|} Pr{M = m} s_m = E[S]. (9.4)<br />
This follows from considering an alternative translation vector b. The energy of the signal<br />
structure after translation by b is<br />
E_av(b) = E[‖S − b‖²] = E[‖(S − a) + (a − b)‖²]<br />
= E[‖S − a‖² + 2(S − a) · (a − b) + ‖a − b‖²]<br />
= E[‖S − a‖²] + 2(E[S] − a) · (a − b) + ‖a − b‖²<br />
= E[‖S − a‖²] + ‖a − b‖²<br />
≥ E[‖S − a‖²], (9.5)<br />
where the cross term in the third line vanishes because a = E[S]. Observe that E_av(b) is<br />
minimized only for b = a = E[S].<br />
If we do not translate, i.e. when b = 0, the average signal energy is<br />
E_av(0) = E[‖S − a‖²] + ‖a‖², (9.6)<br />
hence we can save ‖a‖² if we move the origin of the coordinate system to a.<br />
RESULT 9.1 This implies that to minimize the average signal energy we should choose the<br />
center of gravity of the signal structure as the origin of the coordinate system. If the center of<br />
gravity of the signal structure a ̸= 0 we can decrease the average signal energy by ‖a‖ 2 by doing<br />
an optimal translation of the coordinate system.<br />
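A small numerical illustration of result 9.1 (the signal structure and message probabilities below are our own example):

```python
# Example signal structure (our own choice) and equiprobable messages
signals = [(1.0, 1.0), (3.0, 1.0), (1.0, 3.0), (3.0, 3.0)]
probs = [0.25, 0.25, 0.25, 0.25]

def avg_energy(sigs, pr):
    # E_av = sum_m Pr{M=m} ||s_m||^2, equation (9.2)
    return sum(p * sum(x * x for x in s) for p, s in zip(pr, sigs))

# Center of gravity a = E[S], equation (9.4)
a = tuple(sum(p * s[i] for p, s in zip(probs, signals)) for i in range(2))
translated = [tuple(x - ai for x, ai in zip(s, a)) for s in signals]

E0 = avg_energy(signals, probs)        # E_av(0) = 10.0 here
Ea = avg_energy(translated, probs)     # E_av(a) = 2.0, the minimum
norm_a_sq = sum(x * x for x in a)      # ||a||^2 = 8.0
print(E0, Ea, norm_a_sq)               # E_av(0) = E_av(a) + ||a||^2, as in (9.6)
```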
In what follows we will discuss some examples.<br />
9.4 Comparison of orthogonal and antipodal signaling<br />
Let M = {1, 2} and Pr{M = 1} = Pr{M = 2} = 1/2. Consider a first signal set of two<br />
orthogonal waveforms (see the two sub-figures in the top row in figure 9.2)<br />
s 1 (t) = √ 2E s sin(10πt)<br />
s 2 (t) = √ 2E s sin(12πt), for 0 ≤ t < 1, (9.7)<br />
and a second signal set of two antipodal signals (see the bottom row sub-figures in figure 9.2)<br />
s_1(t) = √(2E_s) sin(10πt),<br />
s_2(t) = −√(2E_s) sin(10πt), also for 0 ≤ t < 1. (9.8)<br />
Note that binary FSK (frequency-shift keying) is the same as orthogonal signaling, while binary<br />
PSK (phase-shift keying) is identical to antipodal signaling.<br />
The vector representations of the signal sets are<br />
s 1 = ( √ E s , 0),<br />
s 2 = (0, √ E s ), (9.9)
Figure 9.2: The two signal sets of (9.7) and (9.8) in waveform representation.<br />
Figure 9.3: Vector representation of the orthogonal (a) and the antipodal signal set (b).<br />
Figure 9.4: Probability of error for binary antipodal and orthogonal signaling as function of<br />
E_s/N_0 in dB. It is assumed that Pr{M = 1} = Pr{M = 2} = 1/2. Observe the difference of 3<br />
dB between both curves.<br />
for the first (orthogonal) set, and<br />
s 1 = ( √ E s , 0),<br />
s 2 = (− √ E s , 0), (9.10)<br />
for the second (antipodal) set.<br />
The average signal energy for both sets is equal to E s . For the error probabilities in the case<br />
of additive white Gaussian noise with power spectral density N 0 /2 we get<br />
P_E^orthog. = Q( √(2E_s) / (2√(N_0/2)) ) = Q(√(E_s/N_0)),<br />
P_E^antipod. = Q( 2√(E_s) / (2√(N_0/2)) ) = Q(√(2E_s/N_0)). (9.11)<br />
Note that the distance d between the signal points differs by a factor √2 and P_E = Q(d/(2σ)).<br />
These error probabilities are depicted in figure 9.4. We observe a difference in the error probabilities<br />
of 3.0 dB. For a description of the meaning of decibels etc. we refer to appendix H. By<br />
this we mean that, in order to achieve a certain value of P E , we have to make the signal energy<br />
twice as large for orthogonal signaling as for antipodal signaling. The better performance of<br />
antipodal signaling relative to orthogonal signaling is best explained by the fact that for antipodal<br />
signaling the center of gravity is the origin of the coordinate system, while for orthogonal<br />
signaling this is certainly not the case.<br />
We may conclude that binary PSK modulation achieves a better error-performance than binary<br />
FSK. An advantage of FSK modulation is however that efficient FSK receivers can be<br />
designed that do not recover the phase. PSK on the other hand requires coherent demodulation.<br />
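The 3 dB gap can also be seen numerically. A small Python sketch of (9.11), using the standard identity Q(x) = erfc(x/√2)/2 (the sample E_s/N_0 values are arbitrary):<br />

```python
from math import erfc, sqrt, log10

def Q(x):
    # Gaussian tail function: Q(x) = 0.5 * erfc(x / sqrt(2))
    return 0.5 * erfc(x / sqrt(2.0))

def pe_orthogonal(es_over_n0):   # (9.11): Q(sqrt(Es/N0))
    return Q(sqrt(es_over_n0))

def pe_antipodal(es_over_n0):    # (9.11): Q(sqrt(2*Es/N0))
    return Q(sqrt(2.0 * es_over_n0))

# Antipodal signaling reaches the same P_E with half the energy (3 dB gain):
for snr in [1.0, 2.0, 5.0, 10.0]:
    assert abs(pe_antipodal(snr) - pe_orthogonal(2.0 * snr)) < 1e-15
    print(f"Es/N0 = {10 * log10(snr):5.2f} dB: "
          f"orthogonal {pe_orthogonal(snr):.3e}, antipodal {pe_antipodal(snr):.3e}")
```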
9.5 Energy of |M| orthogonal signals<br />
The average energy E_av of the signals in an orthogonal set (all signals having energy E_s) is<br />
E_av = E[‖S‖²] = ∑_{m∈M} Pr{M = m} ‖s_m‖² = E_s. (9.12)<br />
The center of gravity of our orthogonal signal structure is<br />
a = (1/|M|, 1/|M|, · · · , 1/|M|) √E_s. (9.13)<br />
We know from (9.6) that<br />
E_av(0) = E_av(a) + ‖a‖², (9.14)<br />
thus, since E_av(0) = E_av,<br />
E_s = E_av(a) + E_s/|M|, (9.15)<br />
or<br />
E_av(a) = E_s (1 − 1/|M|). (9.16)<br />
Observe that E_av(a) is the average signal energy after translating the coordinate system such that<br />
its origin is the center of gravity of the signal structure. For |M| → ∞ the difference between<br />
E_av(a) and E_s can be neglected. Therefore we conclude that an orthogonal signal set is not<br />
optimal in the sense of requiring, for a given error performance, the smallest possible average<br />
signal energy, but the difference diminishes as |M| → ∞.<br />
9.6 Upper bound on the error probability<br />
In general the probability of error P_E of an optimum receiver cannot be computed easily. However<br />
we can use the “union bound” to obtain an upper bound for it. To see how this works we<br />
assume that the a-priori message probabilities are all equal. Then for the error probability P_E^1<br />
when message 1 is sent, we can write<br />
P_E^1 = Pr{ ∪_{m∈M, m≠1} (‖R − s_m‖ ≤ ‖R − s_1‖) | M = 1}<br />
≤ ∑_{m∈M, m≠1} Pr{‖R − s_m‖ ≤ ‖R − s_1‖ | M = 1}. (9.17)<br />
Note that in the last step we used the union bound Pr{∪ i E i } ≤ ∑ Pr{E i }, where E i is an event<br />
indexed by i.<br />
Now the total error probability can be upper bounded by<br />
P_E ≤ ∑_{m∈M} Pr{M = m} P_E^m ≤ ∑_{m∈M} (1/|M|) ∑_{m′∈M, m′≠m} Pr{‖R − s_m′‖ ≤ ‖R − s_m‖ | M = m}. (9.18)<br />
Also when the a priori probabilities are not all equal we can use the union bound to upper bound<br />
the error probability.<br />
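For additive white Gaussian noise each pairwise term in (9.18) reduces to Q(d/(2σ)) with d the distance between the two signal points and σ² = N_0/2, so the union bound is easy to evaluate numerically. A Python sketch (the binary antipodal test set of (9.10) is chosen because there the bound coincides with the exact P_E):<br />

```python
from math import erfc, sqrt

def Q(x):
    # Gaussian tail function: Q(x) = 0.5 * erfc(x / sqrt(2))
    return 0.5 * erfc(x / sqrt(2.0))

def union_bound(signals, n0):
    """Evaluate the right-hand side of (9.18) for equiprobable signals in
    additive white Gaussian noise: each pairwise term equals Q(d / (2*sigma))
    with d the distance between the two signal points and sigma^2 = N0/2."""
    sigma = sqrt(n0 / 2.0)
    M = len(signals)
    total = 0.0
    for m, sm in enumerate(signals):
        for mp, smp in enumerate(signals):
            if mp == m:
                continue
            d = sqrt(sum((a - b) ** 2 for a, b in zip(sm, smp)))
            total += Q(d / (2.0 * sigma)) / M
    return total

# Sanity check on the binary antipodal set of (9.10): with only one pairwise
# term per message, the bound equals the exact error probability of (9.11).
es, n0 = 1.0, 0.5
bound = union_bound([(sqrt(es), 0.0), (-sqrt(es), 0.0)], n0)
assert abs(bound - Q(sqrt(2.0 * es / n0))) < 1e-15
```

For larger signal sets the bound is no longer tight, but it remains a valid upper bound on P_E.<br />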
9.7 Exercises<br />
1. Either of the two waveform sets illustrated in figures 9.5 and 9.6 may be used to communicate<br />
one of four equally likely messages over an additive white Gaussian noise channel.<br />
Figure 9.5: Signal set (a): waveforms s_1(t), . . . , s_4(t) with amplitude √(E_s/2) on the interval 0 ≤ t ≤ 4.<br />
(a) Show that both sets use the same energy.<br />
(b) Exploit the union bound to show that the set of figure 9.6 uses energy almost 3 dB<br />
more effectively than the set of figure 9.5 when a small P E is required.<br />
(Exercise 4.18 from Wozencraft and Jacobs [25].)<br />
2. In the communication system diagrammed in figure 9.7 the transmitted signal-vectors<br />
(s m1 , s m2 ) and the noises n 1 and n 2 are all statistically independent. Assume that |M| = 2
Figure 9.6: Signal set (b): waveforms s_1(t), . . . , s_4(t) with amplitude √E_s on the interval 0 ≤ t ≤ 4.<br />
i.e. M = {1, 2} and that<br />
Pr{M = 1} = Pr{M = 2} = 1/2,<br />
(s_11, s_12) = (3√E, 2√E),<br />
(s_21, s_22) = (−√E, −2√E),<br />
p_N1(n) = p_N2(n) = (1/√(2πσ²)) exp(−n²/(2σ²)). (9.19)<br />
Figure 9.7: Communication system: the transmitter maps message m to (s_m1, s_m2), the channel<br />
adds the independent noises so that r_1 = s_m1 + n_1 and r_2 = s_m2 + n_2, and an optimum<br />
receiver produces the estimate ˆm.<br />
(a) Show that the optimum receiver can be realized as diagrammed in figure 9.8. Determine<br />
α, the optimum threshold setting, and the value ˆm for r 1 + αr 2 larger than the<br />
threshold.<br />
(b) Express the resulting P E in terms of Q(·).
Figure 9.8: Detector structure: the receiver forms r_1 + α r_2 and feeds it to a threshold device<br />
that produces the estimate ˆm.<br />
(c) What is the average energy E_av of the signals? By translation of the signal structure<br />
we can reduce this average energy without changing the error probability. Determine<br />
the smallest possible average signal energy that can be obtained in this way.<br />
(Exam Communication Principles, July 5, 2002.)
Chapter 10<br />
Signaling for message sequences<br />
SUMMARY: In this chapter we will describe the problem of transmitting a continuous<br />
stream of messages. How should we use the available signal power P s in such a way that we<br />
achieve a large information rate R together with a small error probability P E ?<br />
10.1 Introduction<br />
In the previous part we have considered transmission of a single randomly-chosen message<br />
m ∈ M, over a waveform channel during the time-interval 0 ≤ t < T . The problem that was to<br />
be solved there was to determine the optimum receiver, i.e. the receiver that minimizes the error<br />
probability P E for a channel with additive white Gaussian noise. We have found the following<br />
solution:<br />
1. First form an orthonormal base for the set of signals.<br />
2. The optimum receiver determines the finite-dimensional vector representation of the received<br />
waveform, i.e. it projects the received waveform onto the orthonormal signal base.<br />
3. Then using this vector representation of the received waveform, it determines the message<br />
that has the (or a) largest a-posteriori probability. This is done by minimizing expression<br />
(4.12) over all m ∈ M, hence by acting as an optimum vector receiver.<br />
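The three steps above can be sketched in Python. We cannot reproduce expression (4.12) here, so the sketch uses the standard MAP metric ‖r − s_m‖² − 2σ² ln Pr{M = m} for a Gaussian vector channel (with equal priors it reduces to minimum-distance decoding); the signal points are hypothetical:<br />

```python
from math import log

def optimum_vector_receiver(r, signals, priors, sigma_sq):
    """Decide the message with maximum a-posteriori probability.

    For AWGN with noise variance sigma_sq per dimension this amounts to
    minimizing ||r - s_m||^2 - 2*sigma_sq*ln Pr{M = m} over m; with equal
    priors this is minimum-distance decoding."""
    best_m, best_metric = None, float("inf")
    for m, (s, p) in enumerate(zip(signals, priors), start=1):
        metric = sum((ri - si) ** 2 for ri, si in zip(r, s)) \
                 - 2.0 * sigma_sq * log(p)
        if metric < best_metric:
            best_m, best_metric = m, metric
    return best_m

# Hypothetical 2-dimensional signal set with equal priors:
signals = [(1.0, 0.0), (0.0, 1.0)]
priors = [0.5, 0.5]
assert optimum_vector_receiver((0.9, 0.2), signals, priors, sigma_sq=0.5) == 1
assert optimum_vector_receiver((0.1, 0.8), signals, priors, sigma_sq=0.5) == 2
```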
In the present part we shall investigate the more practical situation where we have a continuous<br />
stream of messages that are to be transmitted over our additive white Gaussian noise<br />
waveform channel.<br />
10.2 Definitions, problem statement<br />
Definition 10.1 We assume that every T seconds one out of |M| messages is to be transmitted.<br />
The messages are equally likely 1 , i.e. Pr{M = m} = 1/|M| for all m ∈ M.<br />
1 Only for the sake of simplicity we assume that all messages are equally likely. This is not essential however.<br />
Figure 10.1: A continuous stream of messages: m_1 ∈ M, m_2 ∈ M, m_3 ∈ M, · · · , transmitted<br />
in the intervals [0, T), [T, 2T), [2T, 3T), · · · .<br />
The waveform that corresponds to message m ∈ M is s m (t). It is non-zero only for 0 ≤<br />
t < T . If the message in the k-th interval [(k − 1)T, kT ) is m, the signal in that interval is<br />
s k,m (t) = s m (t − (k − 1)T ).<br />
Definition 10.2 For each signal s_m(t) for m ∈ M the available energy is E_s, hence<br />
E_s ≥ ∫_0^T s_m²(t) dt Joule. (10.1)<br />
Therefore the available power is<br />
P_s = E_s/T Joule/second. (10.2)<br />
Definition 10.3 The transmission rate R is defined as<br />
R = (log_2 |M|)/T bit/second. (10.3)<br />
Definition 10.4 Now the available energy per transmitted bit is defined as<br />
E_b = E_s / log_2 |M| Joule/bit. (10.4)<br />
From the above definitions we can deduce for the available energy per transmitted bit that<br />
E_b = (E_s/T) · (T/log_2 |M|) = P_s/R, (10.5)<br />
hence P_s = R E_b.<br />
Our problem is now to determine the maximum rate at which we can communicate reliably<br />
over a waveform channel when the available power is P s . What are the signals that are to<br />
be used to achieve this maximum rate? In the next sections we will first consider two extremal<br />
situations, namely bit-by-bit signaling and block-orthogonal signaling.<br />
10.3 Bit-by-bit signaling<br />
10.3.1 Description<br />
Suppose that in T seconds, we want to transmit K binary digits b_1 b_2 · · · b_K. Then<br />
|M| = 2^K, and R = (log_2 |M|)/T = K/T. (10.6)<br />
Figure 10.2: A bit-by-bit waveform s(t) for K = 5 and message 11010, built from shifted copies<br />
of a pulse x(t) of duration τ with ∫_{−∞}^{∞} x²(t) dt = E_b.<br />
We can realize this by transmitting a signal s(t) that is composed out of K pulses x(t) that<br />
are shifted in time (see figure 10.2). More precisely<br />
s(t) = ∑_{i=1,K} s_i x(t − (i − 1)τ), with s_i = −1 if b_i = 0 and s_i = +1 if b_i = 1. (10.7)<br />
Here x(t) is a pulse with energy E_b and duration τ.<br />
To evaluate the performance of bit-by-bit signaling we first have to determine the building-block<br />
waveforms that are the base of our signals. It will be clear that if we take for i = 1, 2, · · · , K<br />
ϕ_i(t) = x(t − (i − 1)τ) / √E_b, (10.8)<br />
then for i = 1, 2, · · · , K and j = 1, 2, · · · , K<br />
∫_{−∞}^{∞} ϕ_i(t) ϕ_j(t) dt = δ_ij, (10.9)<br />
and we have an orthonormal base. Hence the building-block waveforms are the time-shifts over<br />
multiples of τ of the normalized pulse x(t)/ √ E b as is shown in figure 10.3. Now we can determine<br />
the signal structures that correspond to all the signals. For K = 1, 2, 3 we have shown the<br />
signal structures for bit-by-bit signaling in figure 10.4. The structure is always a K-dimensional<br />
hypercube.<br />
To see what the decision regions look like, note first that<br />
s_1 = (−√E_b, −√E_b, · · · , −√E_b). (10.10)<br />
Figure 10.3: Our K = 5 building-block waveforms ϕ_1(t), . . . , ϕ_5(t).<br />
Figure 10.4: Bit-by-bit signal structure for K = 1, 2, 3: the signal points form a K-dimensional<br />
hypercube with edge length 2√E_b.<br />
We claim 2 that the optimum receiver decides ˆm = 1 if<br />
r_i < 0 for all i = 1, K. (10.11)<br />
For messages other than m = 1, similar arguments show that the decision regions are<br />
separated by the hyperplanes r_i = 0 for i = 1, K.<br />
To find an expression for P E note that the signal hypercube is symmetrical and assume that<br />
s 1 was transmitted. No error occurs if r i = − √ E b + n i < 0 for all i = 1, K or, in other words,<br />
if<br />
n i < √ E b for all i = 1, K , (10.12)<br />
hence, noting that σ² = N_0/2, we get<br />
P_C = (1 − Q(√(2E_b/N_0)))^K. (10.13)<br />
Therefore<br />
P_E = 1 − (1 − Q(√(2E_b/N_0)))^K. (10.14)<br />
Observe that to estimate bit b i for i = 1, K , an optimum receiver only needs to consider the<br />
received signal r i in dimension i.<br />
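A quick numerical check of (10.14) in Python (the E_b/N_0 value of 4.0, about 6 dB, is an arbitrary choice) shows how the block error probability deteriorates with K at a fixed energy per bit:<br />

```python
from math import erfc, sqrt

def Q(x):
    # Gaussian tail function: Q(x) = 0.5 * erfc(x / sqrt(2))
    return 0.5 * erfc(x / sqrt(2.0))

def pe_bit_by_bit(eb_over_n0, K):
    # (10.14): P_E = 1 - (1 - Q(sqrt(2*Eb/N0)))^K
    return 1.0 - (1.0 - Q(sqrt(2.0 * eb_over_n0))) ** K

# With Eb/N0 fixed, P_E grows toward 1 as K increases:
pes = [pe_bit_by_bit(4.0, K) for K in (1, 10, 100, 1000, 10000)]
assert all(a < b for a, b in zip(pes, pes[1:]))   # strictly increasing in K
assert pes[-1] > 0.99                             # error becomes almost certain
```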
10.3.2 Probability of error considerations<br />
Note that for bit-by-bit signaling K = RT (see equation (10.6)) and that E b = P s /R (see<br />
equation (10.5)) and substitute this in (10.14), our expression for P_E. We then obtain<br />
P_E = 1 − (1 − Q(√(2P_s/(R N_0))))^{RT}. (10.15)<br />
Based on this expression for P E we can now investigate what we should do to improve the<br />
reliability of our system.<br />
First note that K = RT ≥ 1. For K = 1 we get P_E = Q(√(2P_s/(R N_0))). There are two<br />
conclusions that can be drawn now:<br />
• For fixed average power P s and fixed rate R the average error probability increases and<br />
approaches 1 if we increase T. We therefore cannot improve reliability by increasing T.<br />
• For fixed T the average error probability P E can only be decreased by increasing the power<br />
P s or by decreasing the rate R.<br />
This seemed to be “the end of the story” for communication engineers before Shannon presented<br />
his ideas in [20] and [21].<br />
2 To see why, consider a signal s m for some m ̸= 1. Consider dimensions i for which s mi = √ E b . Note that<br />
s 1i = − √ E b hence for r i < 0 we have (r i − s mi ) 2 = (r i − √ E b ) 2 > (r i + √ E b ) 2 = (r i − s 1i ) 2 . For dimensions<br />
i with s mi = − √ E b = s 1i we get (r i − s mi ) 2 = (r i − s 1i ) 2 . Taking together all dimensions i = 1, K we obtain<br />
‖r − s m ‖ 2 > ‖r − s 1 ‖ 2 and therefore as claimed ˆm = 1 is chosen by an optimum receiver if r i < 0 for all i = 1, K .
10.4 Block-orthogonal signaling<br />
10.4.1 Description<br />
Figure 10.5: A block-orthogonal waveform for K = 5 and message 11010, i.e. m = 27: the<br />
unit-energy pulse ϕ(t) of duration τ (∫_{−∞}^{∞} ϕ²(t) dt = 1) and the signal s_27(t) with<br />
∫_{−∞}^{∞} s_27²(t) dt = E_s = T P_s.<br />
Next we will consider block-orthogonal signaling. We again want to transmit K bits in T<br />
seconds. We do this by sending one out of 2^K orthogonal pulses every T seconds. If we use<br />
pulse-position modulation (PPM) the |M| = 2^K signals are<br />
s_m(t) = √E_s ϕ(t − (m − 1)τ), for m = 1, 2^K, (10.16)<br />
where ϕ(t) has energy 1 and duration not more than τ with<br />
τ = T/2^K. (10.17)<br />
All signals within the block [0, T ) are orthogonal and have energy E s . That is why we call our<br />
signaling method block-orthogonal.<br />
10.4.2 Probability of error considerations<br />
To determine the error probability P E for block-orthogonal signaling we assume that<br />
E_b/N_0 = (1 + ɛ)² ln 2, for 0 ≤ ɛ ≤ 1, (10.18)<br />
i.e. we are willing to spend slightly more than N 0 ln 2 Joule per transmitted bit of information.<br />
Now we obtain from (8.12) that<br />
P_E ≤ 2 exp(− log_2 |M| [√((1 + ɛ)² ln 2) − √(ln 2)]²) = 2 exp(− log_2 |M| ɛ² ln 2). (10.19)<br />
Finally substitution of log_2 |M| = RT (see (10.3)) yields<br />
P_E ≤ 2 exp(−ɛ² RT ln 2). (10.20)<br />
What does (10.18) imply for the rate R? Note that E_b = P_s/R (see (10.5)). This results in<br />
E_b/N_0 = P_s/(R N_0) = (1 + ɛ)² ln 2, (10.21)<br />
in other words<br />
R = (1/(1 + ɛ)²) · P_s/(N_0 ln 2). (10.22)<br />
Based on (10.20) and (10.22) we can now investigate what we should do to improve the reliability<br />
of our system. This leads to the following result.<br />
THEOREM 10.1 For an available average power P_s we can achieve rates R smaller than but<br />
arbitrarily close to<br />
C_∞ = P_s/(N_0 ln 2) bit/second, (10.23)<br />
while the error probability P_E can be made arbitrarily small by increasing T. Observe that C_∞,<br />
the capacity, depends only on the available power P_s and the power spectral density N_0/2 of the<br />
noise.<br />
This is a Shannon-type of result. The reliability can be increased not only by increasing the<br />
power P s or decreasing the rate R but also by increasing the “codeword-lengths” T . It is also<br />
important to note that only rates up to the capacity C ∞ can be achieved reliably. We will see<br />
later that rates larger than C ∞ are indeed not achievable. Therefore it is correct to use the term<br />
capacity here.<br />
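The interplay of (10.20), (10.22) and (10.23) can be sketched numerically in Python (the power, noise and ɛ values are arbitrary examples):<br />

```python
from math import exp, log

def capacity_inf(ps, n0):
    # (10.23): C_inf = Ps / (N0 ln 2) bit/second
    return ps / (n0 * log(2.0))

def rate(ps, n0, eps):
    # (10.22): R = (1 / (1 + eps)^2) * Ps / (N0 ln 2)
    return ps / ((1.0 + eps) ** 2 * n0 * log(2.0))

def pe_bound(eps, R, T):
    # (10.20): P_E <= 2 exp(-eps^2 R T ln 2)
    return 2.0 * exp(-(eps ** 2) * R * T * log(2.0))

ps, n0, eps = 1.0, 0.01, 0.1     # arbitrary example values
R = rate(ps, n0, eps)
assert R < capacity_inf(ps, n0)  # the achieved rate stays below C_inf

# At fixed power and rate, the bound on P_E improves as T grows:
bounds = [pe_bound(eps, R, T) for T in (1.0, 10.0, 100.0)]
assert bounds[0] > bounds[1] > bounds[2]
```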
One might get the feeling that this could finish our investigations. We have found the capacity<br />
of the waveform channel with additive white Gaussian noise. In the next chapter we will see that<br />
so far we have ignored an important property of signal sets: their dimensionality.<br />
10.5 Exercises<br />
1. Rewrite the bound in (8.12) on P_E in terms of R, T and C_∞. Consider 3<br />
E(R) = lim_{T→∞} −(1/T) ln P_E(T, R), (10.24)<br />
for the best possible signaling methods with rate not smaller than R having signals with<br />
duration T . The function E(R) of R is called the reliability function.<br />
Now compute a lower bound on the reliability function E(R) and draw a plot of this lower<br />
bound. It can be shown that E(R) is equal to the lower bound that we have just derived<br />
(see Gallager [7]).<br />
3 Just assume that the limit in this definition exists.
Chapter 11<br />
Time, bandwidth, and dimensionality<br />
SUMMARY: Here we first discuss the fact that, to obtain a small error probability P E ,<br />
block-orthogonal signaling requires a huge number of dimensions per second. Then we<br />
show that a bandwidth W can only accommodate roughly 2W dimensions per second.<br />
Hence block-orthogonal signaling is only practical when the available bandwidth is very<br />
large.<br />
11.1 Number of dimensions for bit-by-bit and block-orthogonal<br />
signaling<br />
We again assume that in an interval of duration T we have to transmit K bits.<br />
Then for bit-by-bit signaling we need K = RT dimensions per block. For block-orthogonal<br />
signaling we need 2^K = 2^{RT} dimensions per block.<br />
Therefore for bit-by-bit signaling we need K/T = R dimensions per second. For block-orthogonal<br />
signaling we need 2^K/T = 2^{RT}/T dimensions per second.<br />
Note that for block-orthogonal signaling the number of dimensions per second explodes as we<br />
increase T. In the next section we will see that a channel with a finite bandwidth cannot<br />
accommodate all these dimensions. Increasing T is however necessary to improve the reliability<br />
of a block-orthogonal system hence finite bandwidth creates a problem.<br />
11.2 Dimensionality as a function of T<br />
The following theorem will not be proved. For more information on this subject see Wozencraft<br />
and Jacobs [25]. The Fourier transform is described briefly in appendix B.<br />
RESULT 11.1 (Dimensionality theorem) Let {φ_i(t), for i = 1, N} denote any set of orthogonal<br />
waveforms such that for all i = 1, N,<br />
1. φ_i(t) = 0 for t outside [−T/2, T/2),<br />
2. and also<br />
∫_{−W}^{W} |Φ_i(f)|² df ≥ (1 − η_W²) ∫_{−∞}^{∞} |Φ_i(f)|² df (11.1)<br />
for the spectrum Φ_i(f) of φ_i(t). The parameter W is called bandwidth (in Hz).<br />
Then the dimensionality N of the set of orthogonal waveforms is upper-bounded by<br />
N ≤ 2WT / (1 − η_W²). (11.2)<br />
The theorem says that if we require almost all spectral energy of the waveforms to be in the<br />
frequency range [−W, W ], the number of waveforms cannot be much more than roughly 2W T .<br />
In other words the number of dimensions is not much more than 2W per second.<br />
The “definition” of bandwidth in the dimensionality theorem may seem somewhat arbitrary,<br />
but what is important is that the number of dimensions grows not faster than linear in T .<br />
Note that instead of [0, T) as in previous chapters, we have restricted ourselves here to the<br />
interval [−T/2, T/2). The reason for this is that the analysis is easier this way.<br />
Next we want to show the converse statement, i.e. that the number of dimensions N can grow<br />
linearly with T, with a coefficient roughly equal to 2W. Therefore consider 2K + 1 orthogonal<br />
waveforms that are always zero except for −T/2 ≤ t < T/2. In that case the waveforms are<br />
defined as<br />
φ_0(t) = 1,<br />
φ_1^c(t) = √2 cos(2π t/T) and φ_1^s(t) = √2 sin(2π t/T),<br />
φ_2^c(t) = √2 cos(4π t/T) and φ_2^s(t) = √2 sin(4π t/T),<br />
· · ·<br />
φ_K^c(t) = √2 cos(2Kπ t/T) and φ_K^s(t) = √2 sin(2Kπ t/T). (11.3)<br />
We now determine the spectra of all these waveforms.<br />
1. The spectrum of the first waveform φ_0(t) is<br />
Φ_0(f) = ∫_{−T/2}^{T/2} exp(−j2πft) dt<br />
= (−1/(j2πf)) ∫_{−T/2}^{T/2} d exp(−j2πft)<br />
= (−1/(j2πf)) (exp(−j2πf T/2) − exp(j2πf T/2))<br />
= sin(πfT)/(πf). (11.4)<br />
Note that this is the so-called sinc-function. In figure 11.1 the signal φ_0(t) and the corresponding<br />
spectrum Φ_0(f) are shown for T = 1.<br />
Figure 11.1: The signal φ_0(t) and the corresponding spectrum Φ_0(f) for T = 1.<br />
2. The spectrum of the cosine-waveform φ_k^c(t) for a specified k can be determined by observing<br />
that<br />
φ_k^c(t) = φ_0(t) √2 cos(2πkt/T) = φ_0(t) √2 (1/2) (exp(j2πkt/T) + exp(−j2πkt/T)), (11.5)<br />
hence (see appendix B)<br />
Φ_k^c(f) = (1/√2) (Φ_0(f − k/T) + Φ_0(f + k/T)). (11.6)<br />
In figure 11.2 the signal φ_5^c(t) and the corresponding spectrum Φ_5^c(f) are shown, again for<br />
T = 1.<br />
3. Similarly we can determine the spectrum of the sine-waveforms φ_k^s(t) for specified k.<br />
We now want to find out how much energy is in certain frequency bands. E.g. MATLAB tells<br />
us that<br />
∫_{−∞}^{∞} (sin(πfT)/(πf))² df = T and ∫_{−1/T}^{1/T} (sin(πfT)/(πf))² df = 0.9028 T, (11.7)<br />
Figure 11.2: The signal φ_5^c(t) and the corresponding spectrum Φ_5^c(f) for T = 1.<br />
hence less than 10 % of the energy of the sinc-function is outside the frequency band [−1/T, 1/T].<br />
Numerical analysis shows that never more than 12 % of the energy of Φ_k^c(f) is outside the<br />
frequency band [−(k + 1)/T, (k + 1)/T] for k = 1, 2, · · · . Similarly we can show that the spectrum<br />
of the sine-waveform φ_k^s(t) has never more than 5 % of its energy outside the frequency band<br />
[−(k + 1)/T, (k + 1)/T] for such k. For large k both percentages approach roughly 5 %.<br />
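The in-band energy figure of (11.7) can be checked with a simple numerical integration instead of MATLAB. A Python sketch using a plain midpoint Riemann sum (step size and truncation frequency are arbitrary choices):<br />

```python
from math import sin, pi

def phi0_sq(f, T):
    # |Phi_0(f)|^2 = (sin(pi f T) / (pi f))^2, cf. (11.4); value T^2 at f = 0
    if f == 0.0:
        return T * T
    return (sin(pi * f * T) / (pi * f)) ** 2

def inband_fraction(T, W, df=1e-3, fmax=200.0):
    """Midpoint Riemann-sum estimate of the fraction of the energy of Phi_0
    inside [-W, W]; the integrand is even, so positive frequencies suffice."""
    inband = total = 0.0
    f = df / 2.0
    while f < fmax:
        e = phi0_sq(f, T) * df
        total += e
        if f < W:
            inband += e
        f += df
    return inband / total

frac = inband_fraction(T=1.0, W=1.0)
assert 0.89 < frac < 0.92   # close to the 0.9028 quoted in (11.7)
```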
The frequency band needed to accommodate most of the energy of all 2K + 1 waveforms is<br />
[−(K + 1)/T, (K + 1)/T]. Now suppose that the available bandwidth is W. Then, to have not<br />
more than 12 % out-of-band energy, K should satisfy<br />
W ≥ (K + 1)/T. (11.8)<br />
If, for a certain bandwidth W, we take the largest such K, the number N of orthogonal waveforms is<br />
N = 2K + 1 = 2⌊WT − 1⌋ + 1 > 2(WT − 2) + 1 = 2WT − 3. (11.9)<br />
RESULT 11.2 There are at least N = 2WT − 3 orthogonal waveforms over [−T/2, T/2] such that<br />
less than 12 % of their spectral energy is outside the frequency band [−W, W]. This implies that<br />
a fixed bandwidth W can accommodate 2W dimensions per second for large T.<br />
The dimensionality theorem shows that block-orthogonal signaling has a very unpleasant<br />
property that concerns bandwidth. This is demonstrated by the next example.<br />
Example 11.1 For block-orthogonal signaling the number of required dimensions in an interval of T<br />
seconds is 2^{RT}. If the channel bandwidth is W, the number of available dimensions is roughly 2WT,<br />
hence the bandwidth W should be such that W ≥ 2^{RT}/(2T).<br />
Now consider subsection 10.4.2 and take ɛ = 0.1. Then (see (10.22)) R = (100/121) C_∞. To get an<br />
acceptable P_E we have to take RT ≫ 100 (see (10.20)), hence<br />
W ≥ 2^{RT}/(2T) = (2^{RT}/(2RT)) R ≫ (2^{100}/200) R. (11.10)<br />
Even for very small values of the rate R, e.g. 1 bit per second, this causes a big problem. We can also<br />
conclude that only very small spectral efficiencies R/W can be realized 1 .<br />
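The bandwidth requirement of (11.10) can be evaluated directly (Python; R and T are chosen to match the RT ≈ 100 regime discussed above):<br />

```python
def required_bandwidth(R, T):
    # Block-orthogonal signaling needs W >= 2^(RT) / (2T), cf. (11.10)
    return 2.0 ** (R * T) / (2.0 * T)

R = 1.0      # a very modest rate of 1 bit/second
T = 100.0    # so that RT = 100, the order needed for an acceptable P_E
W = required_bandwidth(R, T)
print(f"required bandwidth: at least {W:.2e} Hz")
assert W > 1e27   # astronomically large: block-orthogonal signaling is impractical
```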
11.3 Remark<br />
In this chapter we have studied building-block waveforms that are time-limited. As a consequence<br />
these waveforms have a spectrum which is not frequency-limited. We can also investigate<br />
waveforms that are frequency-limited. However this implies that these waveforms are not<br />
time-limited anymore. Chapter 14 on pulse modulation deals with such building blocks.<br />
11.4 Exercises<br />
1. Consider the Gaussian pulse x(t) = (√(2πσ²))^{−1} exp(−t²/(2σ²)) and signals such as<br />
s_m(t) = ∑_{i=1,N} s_mi x(t − iτ), for m = 1, 2, · · · , |M|, (11.11)<br />
constructed from successive τ-second translates of x(t). Constrain the inter-pulse interference<br />
by requiring that<br />
∫_{−∞}^{∞} x(t − kτ) x(t − lτ) dt ≤ 0.05 ∫_{−∞}^{∞} x²(t) dt, for all k ≠ l, (11.12)<br />
and constrain the signal bandwidth W by requiring that x(t) have no more than 10 % of<br />
its energy outside the frequency interval [−W, W]. Determine the largest possible value<br />
of the coefficient c in the equation N = cTW when N ≫ 1.<br />
(Exercise 5.4 from Wozencraft and Jacobs [25].)<br />
1 Note however that e.g. for ɛ = 1 the required number of dimensions can be acceptable. However then the rate<br />
R is only one fourth of the capacity P s /(N 0 ln 2).
Chapter 12<br />
Bandlimited transmission<br />
SUMMARY: Here we determine the capacity of the bandlimited additive white Gaussian<br />
noise waveform channel. We approach this problem in a geometrical way. The key idea is<br />
random code generation (Shannon, 1948).<br />
12.1 Introduction<br />
In the previous chapter we have seen that for a waveform channel with bandwidth W , the number<br />
of available dimensions per second is roughly 2W. We have also seen that reliable block-orthogonal<br />
signaling requires, already for very modest rates, many more dimensions per second.<br />
The reason for this was that, for rates close to capacity, although the error probability decreases<br />
exponentially with T, the number of necessary dimensions increases exponentially with T. On<br />
the other hand bit-by-bit signaling requires as many dimensions as there are bits to be transmitted,<br />
which is acceptable from the bandwidth perspective. But bit-by-bit signaling can only be made<br />
reliable by increasing the transmitter power P_s or by decreasing the rate R.<br />
This raises the question whether we can achieve reliable transmission at certain rates R by<br />
increasing T , when both the bandwidth W and available power P s are fixed. The following<br />
sections show that this is indeed possible. We will describe the results obtained by Shannon<br />
[21]. His analysis has a strongly geometrical flavor. These early results may not be as strong as<br />
possible but they certainly are very surprising and give the right intuition.<br />
We will determine the capacity per dimension, not per unit of bandwidth since the definition<br />
of bandwidth is somewhat arbitrary. We will work with vectors as we learned in the previous<br />
part of these course notes. Note that all messages are equiprobable.<br />
Definition 12.1 The energies of all signals are upper-bounded by E_s, which is defined to be the<br />
number of dimensions N times E_N, i.e.<br />
‖s_m‖² = ∑_{i=1,N} s_mi² ≤ E_s = N E_N, for all m ∈ M. (12.1)<br />
E_N is therefore the energy available per dimension.<br />
Figure 12.1: Hardening of the noise sphere around a signal point: the received vector r′ is the<br />
sum of the signal point s′_m and the noise vector n′.<br />
It has advantages to consider here normalized versions of the signal, noise and received vectors.<br />
These normalized versions are defined in the following way:<br />
Definition 12.2 The normalized version r′ of r is defined by<br />
r′ = r/√N.<br />
As usual N is the number of components in r, i.e. the number of dimensions. Consequently also<br />
R′ = R/√N for the random variables R′ and R that correspond to the realizations r′ and r.<br />
Similar definitions hold for normalized signal vectors s′_m, for m ∈ M, and for the normalized<br />
noise vector n′.<br />
12.2 Sphere hardening (noise vector)<br />
The random noise vector N has N components, each with mean 0 and variance σ² = N_0/2. The<br />
normalized noise vector is N′ = N/√N. Its expected squared length is<br />
E[‖N′‖²] = E[(1/N) ∑_{i=1,N} N_i²] = (1/N) ∑_{i=1,N} E[N_i²] = (1/N) ∑_{i=1,N} N_0/2 = N_0/2. (12.2)<br />
For the variance of the squared length of N′ we find (using σ² = N_0/2) that<br />
var[‖N′‖²] = (1/N²) var[∑_{i=1,N} N_i²] = (1/N²) ∑_{i=1,N} var[N_i²]<br />
= (1/N²) ∑_{i=1,N} ( E[N_i⁴] − (E[N_i²])² )<br />
= (1/N²) ∑_{i=1,N} ( 3σ⁴ − (σ²)² ) = 2σ⁴/N = (2/N)(N_0/2)². (12.3)<br />
The second equality in this derivation holds since var[∑_{i=1,N} X_i] = ∑_{i=1,N} var[X_i] for independent<br />
random variables X_1, X_2, ..., X_N. Moreover we have used that E[N_i⁴] = 3σ⁴, see<br />
exercise 1 at the end of this chapter. Finally observe that (2/N)(N_0/2)² approaches zero for N → ∞.<br />
Chebyshev’s inequality can be used to show that the probability of observing a normalized<br />
noise vector with squared length smaller than N_0/2 − ε or larger than N_0/2 + ε, for any fixed<br />
ε > 0, approaches zero for N → ∞. More precisely<br />
Pr{ |‖N′‖² − N_0/2| > ε } ≤ var[‖N′‖²]/ε² = (2/N)(N_0/2)²/ε². (12.4)<br />
RESULT 12.1 We have demonstrated that the normalized received vector r′ is (roughly speaking)<br />
on the surface of a hypersphere with radius √(N_0/2) around the normalized signal vector s′_m. Small<br />
fluctuations are possible, however for N → ∞ these fluctuations diminish.<br />
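The sphere-hardening effect is easy to check numerically. The following sketch (plain Python, illustrative only; σ² = N_0/2 = 0.5 is an arbitrary choice) draws Gaussian noise vectors and shows that the squared length of N′ concentrates around N_0/2 as N grows:

```python
import math
import random

def sq_len_normalized(n_dim, sigma2, rng):
    # squared length of one normalized noise vector N' = N / sqrt(N)
    return sum(rng.gauss(0.0, math.sqrt(sigma2)) ** 2 for _ in range(n_dim)) / n_dim

rng = random.Random(1)
sigma2 = 0.5  # sigma^2 = N0/2, so here N0/2 = 0.5
stats = {}
for n_dim in (10, 100, 2000):
    samples = [sq_len_normalized(n_dim, sigma2, rng) for _ in range(200)]
    mean = sum(samples) / len(samples)
    var = sum((s - mean) ** 2 for s in samples) / len(samples)
    stats[n_dim] = (mean, var)  # mean stays near N0/2, var shrinks like 2(N0/2)^2/N
    print(n_dim, round(mean, 4), round(var, 6))
```

The empirical variance of ‖N′‖² should drop roughly by a factor of ten for each tenfold increase in N, in agreement with (12.3).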
12.3 Sphere hardening (received vector)<br />
The received vector r = s_m + n has N components. The normalized received vector is r′ =<br />
s′_m + n′ where s′_m = s_m/√N. The squared length of the normalized received vector can now be<br />
expressed as<br />
‖r′‖² = ‖s′_m + n′‖² = ‖s′_m‖² + 2(s′_m · n′) + ‖n′‖²<br />
= ‖s′_m‖² + (2/N) ∑_{i=1,N} s_mi n_i + ‖n′‖². (12.5)<br />
Note that the term ‖s′_m‖² is “non-random” for a given message m while the other two terms in<br />
(12.5) depend on N and are therefore random variables. For the expected squared length of the<br />
random normalized received vector R′ we therefore obtain that<br />
E[‖R′‖²] = ‖s′_m‖² + (2/N) ∑_{i=1,N} s_mi E[N_i] + E[‖N′‖²]<br />
= ‖s′_m‖² + N_0/2 ≤ E_N + N_0/2. (12.6)<br />
Here the inequality follows from the fact that ‖s′_m‖² = ‖s_m‖²/N ≤ E_s/N = E_N by definition 12.1. It can<br />
be shown that also the variance of the squared length of R′ approaches zero for N → ∞. See<br />
also exercise 2 at the end of this chapter.<br />
RESULT 12.2 . We have demonstrated that the normalized received vector r ′ is roughly speaking<br />
within a sphere with radius √ E N + N 0 /2 and center at the origin of the coordinate system.<br />
For a normalized signal vector with energy equal to E N the received vectors are however on the<br />
surface of this hypersphere.
Figure 12.2: A hypersphere with radius r − ɛ concentric within a hypersphere of radius r, the<br />
difference is a shell with thickness ɛ.<br />
12.4 The interior volume of a hypersphere<br />
Consider in N dimensions a hypersphere with radius r. The volume of this sphere is V_r = B_N r^N<br />
where B_N is some constant depending on N (see e.g. appendix 5D in Wozencraft and Jacobs<br />
[25]). E.g. B_2 = π and B_3 = 4π/3.<br />
Now also consider a hypersphere with the same center but with a slightly smaller radius<br />
r − ɛ for some ɛ > 0, see figure 12.2. The volume of this smaller hypersphere is now V r−ɛ =<br />
B N (r − ɛ) N . This leads to the following result.<br />
RESULT 12.3 The ratio of the volume of the “interior” of a hypersphere in N dimensions and<br />
the volume of the hypersphere itself is<br />
V_{r−ε} / V_r = ( (r − ε)/r )^N (12.7)<br />
which approaches zero for N → ∞. Consequently for N → ∞ the shell volume V_r − V_{r−ε}<br />
constitutes the entire volume V_r of the hypersphere, no matter how small the thickness ε > 0 of<br />
the shell is.<br />
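Result 12.3 can be verified with a one-line computation (an illustrative sketch; r = 1 and ε = 0.01 are arbitrary choices, and the constant B_N cancels out entirely):

```python
def interior_fraction(n_dim, eps, r=1.0):
    # V_{r-eps} / V_r = ((r - eps)/r)^N, the ratio from result 12.7;
    # the constant B_N cancels, so it never enters the computation
    return ((r - eps) / r) ** n_dim

for n_dim in (10, 100, 1000):
    print(n_dim, interior_fraction(n_dim, 0.01))
```

Even with a shell of thickness only one percent of the radius, virtually none of the volume remains in the interior once N reaches a thousand dimensions.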
12.5 An upper bound for the number |M| of signal vectors<br />
For reliable transmission, the (disjoint) decision regions I_m, m ∈ M, should not be essentially<br />
smaller than a hypersphere with radius (slightly larger than) √(N_0/2). This follows from result<br />
12.1. Note that we are referring to normalized vectors. On the other hand we know that all<br />
received vectors are inside a hypersphere of radius (slightly larger than) √(E_N + N_0/2). This was<br />
a consequence of result 12.2. To obtain an upper bound on the number |M| of signal vectors that<br />
allow reliable communication, we can now apply a so-called sphere-packing argument.<br />
RESULT 12.4 For reliable transmission, the number of signals |M| can not be larger than the<br />
volume of the hypersphere containing the received vectors, divided by the volume of the noise<br />
hyperspheres, hence<br />
|M| ≤ B_N (E_N + N_0/2)^{N/2} / ( B_N (N_0/2)^{N/2} ) = ( (E_N + N_0/2) / (N_0/2) )^{N/2}. (12.8)<br />
Here B_N is again the constant in the formula V_r = B_N r^N for the volume V_r of an N-dimensional<br />
hypersphere with radius r.<br />
We now have our upper bound on the number |M| of signal vectors. It should be noted that<br />
in the book of Wozencraft and Jacobs [25] a more detailed proof can be found.<br />
12.6 A lower bound to the number |M| of signal vectors<br />
Shannon [21] constructed an existence proof to show that there are signal sets that allow reliable<br />
communication and that contain surprisingly many signals. His argument involved random<br />
coding.<br />
12.6.1 Generating a random code<br />
Fix the number of signal vectors |M| and their length N. To obtain a signal set, choose |M|<br />
normalized signal vectors s′_1, s′_2, ..., s′_|M| at random, independently of each other, uniformly<br />
within a hypersphere centered at the origin having radius √E_N. Consider the ensemble of all<br />
signal sets that can be chosen in this way.<br />
12.6.2 Error probability<br />
We are now interested in P_E^av, i.e. the error probability P_E averaged over the ensemble of signal<br />
sets, i.e.<br />
P_E^av = ∫ p(s′_1, s′_2, ..., s′_|M|) P_E(s′_1, s′_2, ..., s′_|M|) d(s′_1, s′_2, ..., s′_|M|). (12.9)<br />
The averaging corresponds to the choice of the signal sets {s′_1, s′_2, ..., s′_|M|}. Once we know<br />
P_E^av we claim that there exists at least one signal set {s′_1, s′_2, ..., s′_|M|} with error probability<br />
P_E(s′_1, s′_2, ..., s′_|M|) ≤ P_E^av. Therefore we will first show that P_E^av is small enough.<br />
Consider figure 12.3. Suppose that the signal s′_m for some fixed m ∈ M was actually sent.<br />
The noise vector n′ is added to the signal vector and hence the received vector turns out to be<br />
r′ = s′_m + n′. Now an optimum receiver will decode the message m only if there are no other<br />
signals s′_k, k ≠ m, inside a hypersphere with radius ‖n′‖ around r′. This is a consequence of result<br />
4.1 that says that minimum Euclidean distance decoding is optimum if all messages are equally<br />
likely.<br />
Figure 12.3: Random coding situation
• Note that result 12.3 implies that, with probability approaching one for N → ∞, the signal<br />
s′_m was chosen on the surface of a sphere with radius √E_N. This is a consequence of the<br />
selection procedure of the signal vectors which is uniform over the hypersphere with radius<br />
√E_N. Therefore we may assume that ‖s′_m‖² = E_N.<br />
• By result 12.2 the normalized received vector r′ is now, with probability approaching one,<br />
on the surface of a sphere with radius √(E_N + N_0/2) centered at the origin, and thus ‖r′‖² =<br />
E_N + N_0/2.<br />
• Moreover the normalized noise vector n′ is on the surface of a sphere with radius √(N_0/2),<br />
hence ‖n′‖² = N_0/2, by the sphere-hardening argument (see result 12.1).<br />
First we have to determine the probability that the signal corresponding to the fixed message<br />
k ≠ m is inside the sphere with center r′ and radius ‖n′‖. Therefore we need to know the volume<br />
of the “lens” in figure 12.3. Determining this volume is not so easy but we know that it is not<br />
larger than the volume of a sphere with radius h, see figure 12.3, and center coinciding with the<br />
center C of the lens. This hypersphere contains the lens. Our next problem is therefore to find<br />
out how large h is.<br />
Observing the lengths of s′_m, n′, and r′, we may conclude that the angle between the normalized<br />
signal vector s′_m and the normalized noise vector n′ is π/2 (Pythagoras). Now we can use<br />
simple geometry to determine the length h:<br />
h = √(E_N · N_0/2) / √(E_N + N_0/2), (12.10)<br />
and consequently<br />
V_lens ≤ B_N h^N = B_N ( (E_N · N_0/2) / (E_N + N_0/2) )^{N/2}. (12.11)<br />
The probability that a signal s′_k for a specific k ≠ m was selected inside the lens is (by the<br />
uniform distribution over the hypersphere with radius √E_N) now given by the volume of the<br />
lens divided by the volume of the sphere. This ratio can be upper-bounded as follows<br />
V_lens / ( B_N E_N^{N/2} ) ≤ ( (N_0/2) / (E_N + N_0/2) )^{N/2}. (12.12)<br />
By the union bound¹ the probability that any signal s′_k with k ≠ m is chosen inside the lens is at<br />
most |M| − 1 times as large as the ratio (probability) considered in (12.12). Therefore we obtain<br />
for the error probability averaged over the ensemble of signal sets<br />
P_E^av ≤ |M| ( (N_0/2) / (E_N + N_0/2) )^{N/2}. (12.13)<br />
¹ For the K events E_1, E_2, ..., E_K the probability Pr{E_1 ∪ E_2 ∪ ... ∪ E_K} ≤ Pr{E_1} + Pr{E_2} + ... + Pr{E_K}.<br />
12.6.3 Achievability result<br />
RESULT 12.5 Fix any δ > 0. Suppose that the number of signals |M| as a function of the<br />
number of dimensions N is given by<br />
|M| = 2^{−δN} ( (E_N + N_0/2) / (N_0/2) )^{N/2}, (12.14)<br />
then<br />
lim_{N→∞} P_E^av ≤ lim_{N→∞} 2^{−δN} = 0. (12.15)<br />
This implies the existence of signal sets, one for each value of N, with |M| as in (12.14), for<br />
which lim_{N→∞} P_E = 0.<br />
Note that relation (12.14) makes reliable transmission possible. Since better methods could<br />
exist, it serves as a lower bound on |M| when reliable transmission is required.<br />
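Plugging the choice (12.14) for |M| into the bound (12.13) can be checked numerically as well (a sketch; the values E_N = 3, N_0/2 = 1, N = 100 and δ = 0.1 are arbitrary):

```python
def random_coding_bound(e_n, n0_half, n_dim, delta):
    # |M| from (12.14); substituting it into (12.13) must give 2^{-delta N}
    m_size = 2.0 ** (-delta * n_dim) * ((e_n + n0_half) / n0_half) ** (n_dim / 2)
    pe_av = m_size * (n0_half / (e_n + n0_half)) ** (n_dim / 2)
    return m_size, pe_av

m_size, pe_av = random_coding_bound(e_n=3.0, n0_half=1.0, n_dim=100, delta=0.1)
print(m_size, pe_av)  # analytically pe_av = 2**(-10)
```

Even though the ensemble contains an astronomically large number of codewords here (about 2⁹⁰), the averaged error-probability bound is already below 10⁻³.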
12.7 Capacity of the bandlimited channel<br />
We can now combine results 12.4 and 12.5. Define<br />
C_N = (1/2) log₂( (E_N + N_0/2) / (N_0/2) ), (12.16)<br />
then result 12.4 (upper bound) shows that reliable communication is only possible if the rate R_N<br />
per dimension satisfies<br />
R_N = log₂|M| / N ≤ C_N. (12.17)<br />
On the other hand result 12.5 (lower bound) demonstrates the existence of signal sets that allow<br />
reliable communication with rates<br />
R_N ≥ C_N − δ, (12.18)<br />
for any value of δ > 0. Since we can take δ as small as we like we get the following theorem:<br />
THEOREM 12.6 (Shannon, 1949) Reliable communication is possible over a waveform channel,<br />
when the available energy per dimension is E_N and the power spectral density of the noise<br />
is N_0/2, for all rates R_N satisfying<br />
R_N < C_N bit/dimension. (12.19)<br />
Rates R_N larger than C_N are not realizable with reliable methods. The quantity C_N is called the<br />
capacity of the waveform channel in bit per dimension.<br />
Example 12.1 In figure 12.4 we have depicted the behavior of randomly generated codes. In a MATLAB<br />
program we have generated codes (signal sets) with R_N = 1 bit/dimension and vector-lengths N =<br />
4, 8, 12, and 16. Note that reliable transmission of R_N = 1 codes is only possible for signal-to-noise<br />
ratios E_N/(N_0/2) ≥ 3, or 4.7 dB. These codes are therefore tested with E_N/(N_0/2) between 3 and 11 dB.<br />
The resulting error probabilities for vector-lengths 4, 8, 12, and 16 are listed in the figure. For each experiment<br />
we have generated 32 codes, and each code was tested with 256 randomly-generated messages. The<br />
resulting number of errors was divided by 8192 to give an estimate of the error probability.<br />
Figure 12.4: Error probability (vertically) as a function of E_N/(N_0/2) for codes with length N = 4<br />
(∗), 8 (◦), 12 (△), and 16 (✷).<br />
The figure clearly shows that increasing N results in decreasing error-probabilities for E_N/(N_0/2) ≥ 5 dB.<br />
Simulations for N ≥ 20 turned out to be unfeasible in MATLAB.<br />
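The MATLAB program itself is not reproduced in these notes. The following Python sketch is our own reconstruction of such an experiment (with smaller ensemble sizes than in the example, and with codewords placed on the surface of the E_N-sphere rather than uniformly inside it, which sphere hardening suggests is an adequate approximation):

```python
import math
import random

def estimate_pe(n_dim, rate, snr_db, n_codes=8, n_msgs=64, seed=0):
    # Monte-Carlo error-rate estimate for random codes with R_N bit/dimension.
    # N0/2 is normalized to 1, so E_N = 10**(snr_db/10) with snr_db = E_N/(N0/2) in dB.
    rng = random.Random(seed)
    m_size = round(2.0 ** (rate * n_dim))
    e_n = 10.0 ** (snr_db / 10.0)
    errors = 0
    for _ in range(n_codes):
        code = []
        for _ in range(m_size):
            v = [rng.gauss(0.0, 1.0) for _ in range(n_dim)]
            scale = math.sqrt(n_dim * e_n) / math.sqrt(sum(x * x for x in v))
            code.append([x * scale for x in v])  # codeword energy ||s_m||^2 = N * E_N
        for _ in range(n_msgs):
            m = rng.randrange(m_size)
            r = [s + rng.gauss(0.0, 1.0) for s in code[m]]  # noise variance N0/2 = 1
            # minimum Euclidean distance decoding (optimum for equiprobable messages)
            dec = min(range(m_size),
                      key=lambda k: sum((ri - si) ** 2 for ri, si in zip(r, code[k])))
            errors += (dec != m)
    return errors / (n_codes * n_msgs)

pe_low_snr = estimate_pe(n_dim=8, rate=1.0, snr_db=3.0)   # below the 4.7 dB threshold
pe_high_snr = estimate_pe(n_dim=8, rate=1.0, snr_db=11.0)
print(pe_low_snr, pe_high_snr)
```

Below the 4.7 dB threshold the error rate stays large; well above it the estimated error probability is small, mirroring the behavior reported in the example.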
In the previous chapter we have seen that a waveform channel with bandwidth W can accommodate<br />
(roughly) at most 2W dimensions per second. With 2W dimensions per second, the<br />
available energy per dimension is E_N = P_s/(2W) if P_s is the available transmitter power. In that<br />
case the channel capacity in bit per dimension is<br />
C_N = (1/2) log₂( (P_s/(2W) + N_0/2) / (N_0/2) ) = (1/2) log₂( 1 + P_s/(W N_0) ). (12.20)<br />
Therefore we obtain the following result:<br />
THEOREM 12.7 For a waveform channel with spectral noise density N_0/2, frequency bandwidth<br />
W, and available transmitter power P_s, the capacity in bit per second is<br />
C = 2W C_N = W log₂( 1 + P_s/(W N_0) ) bit/second. (12.21)<br />
Thus reliable communication is possible for rates R in bit per second smaller than C, while rates<br />
larger than C are not realizable with arbitrarily small P_E.<br />
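As a concrete illustration of (12.21) (the numbers below are our own, not from the text): a channel with W = 3400 Hz and S/N = P_s/(W N_0) = 1000, i.e. 30 dB, has a capacity close to classical telephone-line modem rates.

```python
import math

def capacity_bps(p_s, w, n_0):
    # Shannon capacity C = W log2(1 + P_s/(W N_0)) of theorem 12.7, in bit/second
    return w * math.log2(1.0 + p_s / (w * n_0))

# hypothetical telephone-like channel: W = 3400 Hz, S/N = 1000 (30 dB)
c = capacity_bps(p_s=1000.0, w=3400.0, n_0=1.0 / 3400.0)
print(round(c))  # roughly 33.9 kbit/s
```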
12.8 Exercises<br />
1. Let N be a random variable with density p_N(n) = (1/√(2πσ²)) exp(−n²/(2σ²)). Show that<br />
E[N⁴] = 3σ⁴.<br />
2. Show that the variance of the squared length of the normalized received vector R ′ approaches<br />
zero for N → ∞.<br />
Figure 12.5: Water-filling situation<br />
3. A transmitter having power P_s = 10 Watt is connected to a receiver by two waveform<br />
channels (see figure 12.5). Each of these channels adds white Gaussian noise to its input<br />
waveform. The power densities of the two noise processes are N_0^a/2 = 1 and N_0^b/2 = 2<br />
for −∞ < f < ∞, while the frequency bandwidth W = 1 Hz. What is the total capacity<br />
of this parallel channel in bit per second? What happens if P_s = 1 Watt? The power<br />
allocation procedure that should be used here is called “water-filling”.<br />
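Exercise 3 mentions water-filling; the sketch below shows the general procedure for two parallel AWGN channels (with made-up numbers, deliberately different from the exercise, so the exercise itself is left to the reader). Power max(0, μ − N_i) is poured into channel i, where N_i is the noise power in band i, until the total budget is used:

```python
import math

def waterfill_two(p_s, n1, n2, w=1.0):
    # n1, n2: noise powers in the two bands of bandwidth w (an assumption:
    # both bands are equally wide); find the water level mu with
    # max(0, mu - n1) + max(0, mu - n2) = p_s by bisection
    lo, hi = min(n1, n2), min(n1, n2) + p_s
    for _ in range(200):
        mu = 0.5 * (lo + hi)
        if max(0.0, mu - n1) + max(0.0, mu - n2) > p_s:
            hi = mu
        else:
            lo = mu
    p1, p2 = max(0.0, mu - n1), max(0.0, mu - n2)
    c = w * math.log2(1.0 + p1 / n1) + w * math.log2(1.0 + p2 / n2)
    return p1, p2, c

p1, p2, c = waterfill_two(p_s=6.0, n1=1.0, n2=3.0)
print(p1, p2, c)  # both channels active here: p1 = 4, p2 = 2
```

The allocation equalizes noise-plus-signal power across the active bands; with a small enough budget only the quieter band receives any power at all.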
Chapter 13<br />
Wideband transmission<br />
SUMMARY: We determine the capacity of the additive white Gaussian noise waveform<br />
channel in the case of unlimited bandwidth resources. The result that is obtained in this<br />
way shows that block-orthogonal signaling is optimal.<br />
13.1 Introduction<br />
We have seen that with block-orthogonal signaling we can achieve reliable communication at<br />
rates<br />
R < P_s/(N_0 ln 2) bit/second, (13.1)<br />
if we only had access to as much bandwidth as we would want. We will prove in this chapter<br />
that rates larger than P s /(N 0 ln 2) are not realizable with reliable signaling schemes. Therefore<br />
we may indeed call P s /(N 0 ln 2) the capacity of the wideband waveform channel.<br />
13.2 Capacity of the wideband channel<br />
In the previous chapter the capacity of a waveform channel with frequency bandwidth limited to<br />
W was shown to be<br />
W log₂( 1 + P_s/(W N_0) ) bit/second, (13.2)<br />
see theorem 12.7. We can easily determine the wideband capacity by letting W → ∞.<br />
RESULT 13.1 The capacity C_∞ of the wideband waveform channel with additive white Gaussian<br />
noise with power density N_0/2 for −∞ < f < ∞, when the transmitter power is P_s, follows<br />
Figure 13.1: The ratio ln(1 + S/N)/(S/N) and ln(1 + S/N) as a function of S/N.<br />
from:<br />
C_∞ = lim_{W→∞} W log₂( 1 + P_s/(W N_0) )<br />
= lim_{W→∞} P_s/(N_0 ln 2) · ln( 1 + P_s/(W N_0) ) / ( P_s/(W N_0) ) = P_s/(N_0 ln 2). (13.3)<br />
Expressed in nats per second this capacity is equal to P_s/N_0.<br />
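The limit in (13.3) can be observed numerically (a sketch with P_s = N_0 = 1, an arbitrary normalization): C(W) increases monotonically with W and approaches C_∞ = 1/ln 2 ≈ 1.4427 bit/second from below.

```python
import math

p_s, n_0 = 1.0, 1.0
c_inf = p_s / (n_0 * math.log(2.0))  # wideband capacity, result 13.1

def c_of_w(w):
    # bandlimited capacity (13.2) for the chosen P_s and N_0
    return w * math.log2(1.0 + p_s / (w * n_0))

vals = [c_of_w(10.0 ** k) for k in range(5)]  # W = 1, 10, ..., 10^4
print([round(v, 5) for v in vals], round(c_inf, 5))
```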
13.3 Relation between capacities, signal-to-noise ratio<br />
Note that we can express the capacity of the bandlimited channel as<br />
C = W log₂( 1 + P_s/(W N_0) )<br />
= P_s/(N_0 ln 2) · ln(1 + S/N)/(S/N)<br />
= C_∞ · ln(1 + S/N)/(S/N), with S/N = P_s/(W N_0). (13.4)<br />
Here S/N is the signal-to-noise ratio of the waveform channel. Note that the total noise power<br />
is equal to the frequency band 2W times the power spectral density N 0 /2 of the noise. We have<br />
plotted the ratio ln(1 + S/N )/S/N in figure 13.1.
RESULT 13.2 If we note that C = W log₂(1 + S/N) then we can distinguish two cases:<br />
C ≈ C_∞ if S/N ≪ 1, and C ≈ W log₂(S/N) if S/N ≫ 1. (13.5)<br />
The case S/N ≪ 1 is called the power-limited case. There is enough bandwidth. When on the<br />
other hand S/N ≫ 1 we speak about bandwidth-limited channels. Increasing the power by a<br />
factor of two does not double the capacity but increases it only by one bit per second per Hz.<br />
Finally we give some other ways to express the signal-to-noise ratio (under the assumption<br />
that the number of dimensions per second is exactly 2W ):<br />
S/N = P_s/(W N_0) = E_N/(N_0/2) = 2R_N · E_b/N_0 = (R/W) · E_b/N_0. (13.6)<br />
13.4 Exercises<br />
1. To obtain theorem 12.7 we have assumed that the number of dimensions that a channel with<br />
bandwidth W can accommodate per second is 2W. What would happen with the wideband<br />
capacity if instead the channel would give us αW dimensions per second?<br />
2. Find out whether a telephone line channel (W = 3400 Hz, C = 34000 bit/sec) is power- or<br />
bandwidth-limited by determining its S/N. Also determine E_b/N_0 and R_N.<br />
3. For a fixed ratio E_b/N_0, for some values of R/W reliable transmission is possible, for<br />
other values of R/W this is not possible. Therefore the E_b/N_0 × R/W plane can be<br />
divided in a region in which reliable transmission is possible and a region where this is<br />
not possible. Find out what function of R/W describes the boundary between these two<br />
regions.<br />
(The ratio R/W is sometimes called the bandwidth efficiency. Note that R/W = 2R_N,<br />
where R_N is the rate per dimension. Hence a similar partitioning is possible for the<br />
E_b/N_0 × R_N plane.)<br />
4. A receiver is connected to a transmitter by two channels, a bandlimited channel and a<br />
wide-band channel. The transmitter’s signaling power P_s = 90 Watt. The transmitter can<br />
use all its power to communicate only via the bandlimited channel, only via the wide-band<br />
channel, or it may distribute its power over the bandlimited channel and the wide-band<br />
channel.<br />
(a) The bandlimited channel has bandwidth W = 200 Hz. The power spectral density of<br />
the noise in this channel is N′_0/2 = 0.015 (Watt/Hz). Assume that the transmitter uses<br />
all of its power to communicate over the bandlimited channel. First determine the<br />
capacity C′_N of this channel in bits per dimension. Then determine the capacity C′ of<br />
the bandlimited channel in bits per second.<br />
Figure 13.2: Transmitter, two parallel channels (bandlimited and wide-band channel), and receiver.<br />
(b) The spectral density of the noise in the wide-band channel is N″_0/2 = 0.06 (Watt/Hz).<br />
Assume now that the transmitter only communicates via the wide-band channel to<br />
the receiver. What is the capacity C″ of this wide-band channel in bits per second?<br />
(c) How should the transmitter split up its power over the bandlimited channel and the<br />
wide-band channel to achieve the maximum total capacity in bits per second? How<br />
large is this maximum total capacity C_tot?<br />
(Exam Communication Theory, January 12, 2005.)
Part IV<br />
Channel Models<br />
Chapter 14<br />
Pulse amplitude modulation<br />
SUMMARY: In this chapter we discuss serial pulse amplitude modulation. This method<br />
provides us with a new channel dimension every T seconds if we choose the pulse according<br />
to the so-called Nyquist criterion. The Nyquist criterion implies that the required bandwidth<br />
W is larger than 1/(2T ). Multi-pulse transmission is also considered in this chapter.<br />
Just as in the single-pulse case a bandwidth of 1/(2T ) Hz per used pulse is required. This<br />
corresponds again to 2W = 1/T dimensions per second.<br />
14.1 Introduction<br />
Figure 14.1: A serial pulse-amplitude modulation system.<br />
In chapter 11 we could observe that a channel with a bandwidth of W Hz can accommodate<br />
roughly 2W T dimensions each T seconds (for large enough T ). We considered there building-block<br />
waveforms of finite duration, therefore the bandwidth constraint could only be met approximately.<br />
Here we will investigate whether it is possible to get a new dimension every 1/(2W )<br />
seconds. We allow the building blocks to have a non-finite duration. Moreover all these building-block<br />
waveforms are time shifts of a “pulse”. The subject of this chapter is therefore serial pulse<br />
transmission.<br />
Consider figure 14.1. We there assume that the transmitter sends a signal s_a(t) that consists<br />
of amplitude-modulated time shifts of a pulse p(t) by an integer multiple k of the so-called<br />
modulation interval T, hence<br />
s_a(t) = ∑_{k=0,K−1} a_k p(t − kT). (14.1)<br />
The vector of amplitudes a = (a_0, a_1, ..., a_{K−1}) consists of symbols a_k, k = 0, K − 1, taking<br />
values in the alphabet A. We call this modulation method serial pulse-amplitude modulation<br />
(PAM).<br />
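A small sketch of (14.1), with T = 1 and the sinc pulse of section 14.2 chosen here for concreteness: at the instants t = kT all other pulses vanish, so sampling the noiseless PAM signal returns a_k · p(0) = a_k/√T.

```python
import math

def sinc_pulse(t, big_t=1.0):
    # unit-energy sinc pulse; p(0) = 1/sqrt(T)
    x = math.pi * t / big_t
    return 1.0 / math.sqrt(big_t) if x == 0.0 else math.sin(x) / (math.sqrt(big_t) * x)

def pam_signal(t, amps, big_t=1.0):
    # serial PAM signal s_a(t) = sum_k a_k p(t - kT) of (14.1)
    return sum(a * sinc_pulse(t - k * big_t, big_t) for k, a in enumerate(amps))

amps = [1.0, -1.0, 1.0, 1.0]
samples = [pam_signal(k * 1.0, amps) for k in range(len(amps))]
print(samples)  # with T = 1 these equal the amplitudes, up to rounding
```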
14.2 Orthonormal pulses: the Nyquist criterion<br />
14.2.1 The Nyquist result<br />
If we have the possibility to choose the pulse p(t) ourselves we can choose it in such a way that<br />
all time shifts of the pulse form an orthonormal base. In that case the pulse p(t) has to satisfy<br />
∫_{−∞}^{∞} p(t − kT) p(t − k′T) dt = 1 if k = k′, and 0 if k ≠ k′, (14.2)<br />
for integer k and k′. This is equivalent to<br />
∫_{−∞}^{∞} p(τ) p(τ − kT) dτ = p(t) ∗ p(−t)|_{t=kT} = h(kT) = 1 if k = 0, and 0 if k ≠ 0, (14.3)<br />
where h(t) = p(t) ∗ p(−t). This time-domain restriction on the pulse p(t) is called the zero-forcing<br />
(ZF) criterion. Later we will see why.<br />
THEOREM 14.1 The frequency-domain equivalent to (14.3), which is known as the Nyquist<br />
criterion for zero intersymbol interference, is<br />
Z(f) = (1/T) ∑_{m=−∞}^{∞} H(f + m/T) = (1/T) ∑_{m=−∞}^{∞} ‖P(f + m/T)‖² = 1 for all f,<br />
where H(f) = P(f)P*(f) = ‖P(f)‖² is the Fourier transform of h(t) = p(t) ∗ p(−t). Note<br />
that ‖P(f)‖ is the modulus of P(f). Moreover Z(f) is called the 1/T-aliased spectrum of<br />
H(f).<br />
Later in this section we will give the proof of this theorem. First we will discuss it however<br />
and consider an important consequence of it.
Figure 14.2: The spectrum H( f ) = ‖P( f )‖ 2 corresponding to a pulse p(t) that does not satisfy<br />
the Nyquist criterion.<br />
Figure 14.3: The ideally bandlimited spectrum H( f ) = ‖P( f )‖ 2 . Note that P( f ) is a spectrum<br />
with the smallest possible bandwidth satisfying the Nyquist criterion.<br />
14.2.2 Discussion<br />
• Since p(t) is a real signal, the real part of its spectrum P( f ) is even in f , and the imaginary<br />
part of this spectrum is odd. Therefore the modulus ‖P( f )‖ of P( f ) is an even function<br />
of the frequency f .<br />
• If the bandwidth W of the pulse p(t) is strictly smaller than 1/(2T ) (see figure 14.2), then<br />
the Nyquist criterion, which is based on H( f ) = ‖P( f )‖ 2 , can not be satisfied. Thus<br />
no pulse p(t) that satisfies a bandwidth-W constraint, can lead to orthogonal signaling if<br />
W < 1/(2T ).<br />
• The smallest possible bandwidth W of a pulse that satisfies the Nyquist criterion is 1/(2T ).<br />
The “basic” pulse with bandwidth 1/(2T ) for which the Nyquist criterion holds has a so-called<br />
ideally bandlimited spectrum, which is given by<br />
P(f) = √T if |f| < 1/(2T), and 0 if |f| > 1/(2T), (14.4)<br />
and √T /2 for |f| = 1/(2T). The corresponding H(f) = ‖P(f)‖² and 1/T-aliased<br />
spectrum Z(f) can be found in figure 14.3.<br />
The basic pulse p(t) that corresponds to the ideally bandlimited spectrum is shown in<br />
figure 14.4. It is the well-known sinc-pulse<br />
p(t) = (1/√T) · sin(πt/T)/(πt/T). (14.5)<br />
Figure 14.4: The sinc-pulse that corresponds to T = 1, i.e. p(t) = sin(πt)/(πt).<br />
Figure 14.5: A spectrum H( f ) with a positive excess bandwidth satisfying the Nyquist criterion.<br />
• A pulse with a bandwidth larger than 1/(2T ) can also satisfy the Nyquist criterion as can<br />
be seen in figure 14.5. Note that the so-called excess bandwidth, i.e. the bandwidth minus<br />
1/(2T ), can be larger than 1/(2T ). Square-root raised-cosine pulses p(t) have a spectrum<br />
H( f ) = |P( f )| 2 that satisfies the Nyquist criterion and an excess bandwidth that can be<br />
controlled.<br />
At the end of this subsection we state the implication of the previous discussion.<br />
RESULT 14.2 The smallest possible bandwidth W of a pulse that satisfies the Nyquist criterion<br />
is W = 1/(2T). The sinc-pulse p(t) = (1/√T) · sin(πt/T)/(πt/T) has this property. Note that this way of serial<br />
pulse transmission leads to exactly 1/T = 2W dimensions per second.<br />
Moreover observe that unlike before in chapter 11, our pulses and signals are not time-limited!<br />
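The Nyquist criterion of theorem 14.1 can be checked directly for a raised-cosine spectrum H(f), the family behind the square-root raised-cosine pulses mentioned above (a sketch; T = 1 and roll-off β = 0.5 are arbitrary choices): the 1/T-aliased spectrum must be flat.

```python
import math

def raised_cosine_h(f, big_t=1.0, beta=0.5):
    # H(f) = ||P(f)||^2 for a square-root raised-cosine pulse with roll-off beta
    af = abs(f)
    f1 = (1.0 - beta) / (2.0 * big_t)
    f2 = (1.0 + beta) / (2.0 * big_t)
    if af <= f1:
        return big_t
    if af <= f2:
        return 0.5 * big_t * (1.0 + math.cos(math.pi * big_t / beta * (af - f1)))
    return 0.0

def aliased_z(f, big_t=1.0, beta=0.5):
    # Z(f) = sum_m H(f + m/T); theorem 14.1 demands Z(f)/T = 1 for all f
    return sum(raised_cosine_h(f + m / big_t, big_t, beta) for m in range(-3, 4))

zs = [aliased_z(f / 100.0) for f in range(-50, 51)]
print(min(zs), max(zs))  # both equal T = 1 up to rounding
```

The rolled-off tail of each spectral copy exactly fills the gap left by its neighbor, which is why the excess bandwidth can be chosen freely without violating orthonormality.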
14.2.3 Proof of the Nyquist result<br />
We will next give the proof of theorem 14.1, the Nyquist result.<br />
Proof: Since H(f) is the Fourier transform of h(t), the condition (14.3) can be rewritten as<br />
h(kT) = ∫_{−∞}^{∞} H(f) exp(j2π f kT) df = 1 if k = 0, and 0 if k ≠ 0. (14.6)<br />
We now break up the integral in parts, a part for each integer m, and obtain<br />
h(kT ) =<br />
=<br />
=<br />
=<br />
∞∑<br />
∫ (2m+1)/2T<br />
m=−∞ (2m−1)/2T<br />
∫ 1/2T<br />
∞∑<br />
m=−∞<br />
∫ 1/2T<br />
−1/2T<br />
∫ 1/2T<br />
−1/2T<br />
−1/2T<br />
[ ∞<br />
∑<br />
m=−∞<br />
H( f ) exp( j2π f kT )d f<br />
H( f + m ) exp( j2π f kT )d f<br />
T<br />
H( f + m T ) ]<br />
exp( j2π f kT )d f<br />
Z( f ) exp( j2π f kT )d f (14.7)<br />
where we have defined Z( f ) by<br />
Z( f ) =<br />
∞∑<br />
m=−∞<br />
H( f + m ). (14.8)<br />
T<br />
Observe that Z( f ) is a periodic function in f with period 1/T . Therefore it can be expanded in<br />
terms of its Fourier series coefficients · · · , z_{−1}, z_0, z_1, z_2, · · · as<br />
Z( f ) = Σ_{k=−∞}^{∞} z_k exp( j2πk f T ), (14.9)<br />
where<br />
z_k = T ∫_{−1/(2T )}^{1/(2T )} Z( f ) exp(− j2π f kT ) d f. (14.10)<br />
If we now combine (14.7) and (14.10) we obtain that<br />
T h(−kT ) = z_k , (14.11)<br />
for all integers k. Condition (14.3) now tells us that z_0 = T while all other z_k are zero. This<br />
implies that<br />
Z( f ) = T, (14.12)<br />
or equivalently<br />
Σ_{m=−∞}^{∞} H( f + m/T ) = T. (14.13)<br />
✷
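The folding condition (14.13) can be illustrated numerically. In the following sketch we fold a raised-cosine spectrum, a standard example of a spectrum satisfying the Nyquist criterion; the values T = 1 and roll-off β = 0.35 are arbitrary choices of ours, not taken from the text:<br />

```python
import numpy as np

T, beta = 1.0, 0.35          # assumed modulation interval and roll-off

def H(f):
    # raised-cosine spectrum: flat up to (1-beta)/(2T), cosine taper up to (1+beta)/(2T)
    f = np.abs(f)
    f1, f2 = (1 - beta)/(2*T), (1 + beta)/(2*T)
    taper = (T/2) * (1 + np.cos(np.pi*T/beta * (f - f1)))
    return np.where(f <= f1, T, np.where(f <= f2, taper, 0.0))

f = np.linspace(-1/(2*T), 1/(2*T), 101, endpoint=False)
Z = sum(H(f + m/T) for m in range(-2, 3))    # folded spectrum of (14.13)
print(bool(np.allclose(Z, T)))               # True: the folded spectrum is flat
```

The cosine tapers on both sides of 1/(2T ) add up to T exactly, which is why the excess bandwidth does not violate the criterion.<br />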
[Figure: r(t) → matched filter p(T_p − t) → sampler at t = T_p + kT → r_k ]<br />
Figure 14.6: The optimum receiver front-end for detection of serially transmitted orthonormal<br />
pulses.<br />
14.2.4 Receiver implementation<br />
If we use serial PAM with orthonormal pulses p(t − kT ), k = 0, K − 1, then we can use a<br />
single matched-filter m(t) = p(T p − t) for the computations of the correlations of the received<br />
waveform r(t) with all building-block waveforms, i.e. with all pulses. Assume that the delay T p<br />
is chosen in such a way that m(t) is (effectively 1 ) causal. The output of this filter m(t) when r(t)<br />
is the input signal is<br />
u(t) = ∫_{−∞}^{∞} r(α)m(t − α)dα = ∫_{−∞}^{∞} r(α)p(T_p − t + α)dα. (14.14)<br />
At time t = T_p + kT we see at the filter output<br />
r_k = u(T_p + kT ) = ∫_{−∞}^{∞} r(α)p(α − kT )dα, (14.15)<br />
which is what the optimum receiver should determine, i.e. the correlation of r(t) with the pulse<br />
p(t − kT ). This leads to the very simple receiver structure shown in figure 14.6. Processing the<br />
samples r_k , k = 1, K , should be done in the usual way.<br />
When there is no noise r(t) = Σ_{k=0,K−1} a_k p(t − kT ). Then at time t = T_p + kT we see at<br />
the filter output<br />
r_k = u(T_p + kT ) = ∫_{−∞}^{∞} r(α)p(α − kT )dα<br />
= Σ_{k′=0,K−1} a_{k′} ∫_{−∞}^{∞} p(α − k′T )p(α − kT )dα = a_k , (14.16)<br />
by the orthonormality of the pulses. The conclusion is that there is no intersymbol interference<br />
present in the samples. In other words, the Nyquist criterion forces the intersymbol interference<br />
to be zero (zero-forcing (ZF)).<br />
1 Note that p(t) has a non-finite duration in general.
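The zero-forcing property derived above can be demonstrated with a small simulation. The sketch below (our own toy setup: five amplitudes a_k, sinc-pulses with T = 1, a noiseless channel, and a truncated integration range) implements the correlations (14.15) and recovers the amplitudes:<br />

```python
import numpy as np

T, dt = 1.0, 0.01
a = np.array([1.0, -1.0, 3.0, -3.0, 1.0])     # example amplitudes a_k (our choice)
K = len(a)
t = np.arange(-60.0, 60.0 + K*T, dt)

def p(t):
    return (1.0/np.sqrt(T)) * np.sinc(t/T)    # orthonormal sinc pulses

r = sum(ak * p(t - k*T) for k, ak in enumerate(a))          # noiseless r(t)
# r_k = integral of r(alpha) p(alpha - kT) d(alpha), eq. (14.15)
r_hat = np.array([np.trapz(r * p(t - k*T), t) for k in range(K)])
print(np.round(r_hat, 2))                     # the amplitudes come back: no ISI
```

With noise added, each sample would simply be a_k plus an independent Gaussian term, which is what makes the per-sample processing "in the usual way" possible.<br />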
14.3 Multi-pulse transmission<br />
Instead of using shifted versions of a single pulse we can apply the shifts of a set of orthonormal<br />
pulses. Consider J pulses p 1 (t), p 2 (t), · · · , p J (t) which are orthonormal. The shifted versions<br />
of these pulses can be used to convey a stream of messages. Again the shifts are over multiples<br />
of T and T is called the modulation interval again. Thus a signal s(t) can be written as<br />
s(t) = Σ_{k=0,K−1} Σ_{j=1,J} a_{kj} p_j (t − kT ), (14.17)<br />
for a_{kj} ∈ A. If we want all pulses and their time-shifts to form an orthonormal basis, then the<br />
time-shifts of all pulses should be orthogonal to the pulses p_j (t), j = 1, J . In other words, we<br />
require the J pulses to satisfy<br />
∫_{−∞}^{∞} p_j (t − kT )p_{j′}(t − k′T )dt = { 1 if j = j′ and k = k′; 0 elsewhere }, (14.18)<br />
for all j = 1, J, j ′ = 1, J, and all integer k and k ′ . A set of pulses p j (t) for j = 1, 2, · · · , J<br />
that satisfies this restriction is<br />
p_j (t) = (1/√T ) · (sin(πt/(2T ))/(πt/(2T ))) · cos((2 j − 1)πt/(2T )). (14.19)<br />
For J = 4, and assuming that T = 1, these pulses and their spectra are shown in figure 14.7.<br />
Observe that in this example the total bandwidth of the J pulses is J/(2T ).<br />
RESULT 14.3 The smallest possible bandwidth that is occupied by orthogonal multi-pulse signaling<br />
with J pulses every T seconds is J/(2T ). Note therefore that bandwidth-efficient<br />
serial multi-pulse transmission also leads to exactly 2W dimensions per second.<br />
Proof: (Lee and Messerschmitt [13], p. 266) Just like in the proof of theorem 14.1 we can show<br />
that the pulse-spectra P_j ( f ) for j = 1, J must satisfy<br />
(1/T ) Σ_{m=−∞}^{∞} P_j ( f + m/T )P*_{j′}( f + m/T ) = δ_{j j′}. (14.20)<br />
Now fix a frequency f ∈ [−1/(2T ), +1/(2T )) and define for each j = 1, J the vector<br />
P_j = (· · · , P_j ( f − 1/T ), P_j ( f ), P_j ( f + 1/T ), P_j ( f + 2/T ), · · · ). (14.21)<br />
Here P_j ( f + m/T ) is the component at position m. With this definition condition (14.20) implies<br />
that the vectors P_j , j = 1, J are orthogonal. Assume for a moment that there are fewer than J<br />
positions m for which at least one component P_j ( f + m/T ) for some j = 1, J is non-zero. This<br />
would imply that the J vectors P_j , j = 1, J are linearly dependent. Contradiction!<br />
Thus for each frequency interval d f ∈ [−1/(2T ), 1/(2T )) we may conclude that the J pulses<br />
”fill” at least J disjoint intervals of size d f of the frequency spectrum. In total the J pulses fill a<br />
part J/T of the spectrum. The minimally occupied bandwidth is therefore J/(2T ). ✷<br />
Note that the optimum receiver for multi-pulse transmission can be implemented with J<br />
matched filters that are sampled each T seconds.
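Condition (14.18) can be verified numerically for the pulse set (14.19). The sketch below (with our own choices T = 1, J = 4, and a truncated integration range, since the pulses are not time-limited) computes the Gram matrix of the four pulses and one time-shifted correlation:<br />

```python
import numpy as np

T, J, dt = 1.0, 4, 0.005
t = np.arange(-400.0, 400.0, dt)    # wide truncation: the pulses decay like 1/t

def p(j, t):
    # p_j(t) = (1/sqrt(T)) sinc(t/(2T)) cos((2j-1) pi t/(2T)), eq. (14.19)
    return (1.0/np.sqrt(T)) * np.sinc(t/(2*T)) * np.cos((2*j - 1)*np.pi*t/(2*T))

# Gram matrix of the J pulses, and one correlation with a shift by T
G = np.array([[np.trapz(p(j, t)*p(jp, t), t) for jp in range(1, J + 1)]
              for j in range(1, J + 1)])
shift = np.trapz(p(1, t)*p(1, t - T), t)
print(np.round(G, 2))               # close to the 4x4 identity matrix
print(round(float(shift), 2))       # close to 0
```

In the frequency domain each p_j occupies the disjoint band ±[( j − 1)/(2T ), j/(2T )], which is exactly why the Gram matrix is the identity.<br />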
Figure 14.7: Time domain and frequency domain plots of the pulses in (14.19) for j = 1, 2, 3, 4<br />
and T = 1.<br />
14.4 Exercises<br />
1. Determine the spectra of the pulses p j (t) for j = 1, 2, · · · , J defined in (14.19). Show<br />
that these pulses satisfy (14.18). Determine the total bandwidth of these J pulses.
Chapter 15<br />
Bandpass channels<br />
SUMMARY: Here we consider transmission over an ideal bandpass channel with bandwidth<br />
2W . We show that building-block waveforms for transmission over a bandpass channel<br />
can be constructed from baseband building-block waveforms having a bandwidth not<br />
larger than W . A first set of orthonormal baseband building-block waveforms can be modulated<br />
on a cosine carrier, a second orthonormal set on a sine carrier, and the resulting<br />
bandpass waveforms are all orthogonal to each other. This technique is called quadrature<br />
multiplexing. We determine the optimum receiver for this signaling method and the capacity<br />
of the bandpass channel. We finally discuss quadrature amplitude modulation (QAM).<br />
15.1 Introduction<br />
So far we have only considered baseband communication. In baseband communication the channel<br />
allows signaling only in the frequency band [−W, W ]. But what should we do if the channel<br />
only accepts signals with a spectrum in the frequency band ±[ f 0 − W, f 0 + W ]? In the present<br />
chapter we will describe a method that can be used for signaling over such a channel. We start<br />
by discussing this so-called bandpass channel. First we give a definition of the ideal bandpass<br />
channel.<br />
Definition 15.1 Consider figure 15.1. The input signal s(t) to the bandpass channel is first sent<br />
through a filter W 0 ( f ). This filter W 0 ( f ) is an ideal bandpass filter with bandwidth 2W and<br />
[Figure: s(t) → bandpass filter W_0( f ) → + (adding n_w(t)) → r(t); the filter passes f_0 − W < | f | < f_0 + W ]<br />
Figure 15.1: Ideal bandpass channel with additive white Gaussian noise.<br />
CHAPTER 15. BANDPASS CHANNELS 128<br />
center frequency f_0 > W . It is defined by the transfer function<br />
W_0( f ) = { 1 if f_0 − W < | f | < f_0 + W ; 0 elsewhere }. (15.1)<br />
The noise process N_w(t) is assumed to be stationary, Gaussian, zero-mean, and white. The noise<br />
has power spectral density function<br />
S_{N_w}( f ) = N_0/2 for −∞ < f < ∞, (15.2)<br />
and autocorrelation function R_{N_w}(t, s) = E[N_w(t)N_w(s)] = (N_0/2)δ(t − s). The noise-waveform<br />
n w (t) is added to the output of the bandpass filter W 0 ( f ) which results in r(t), the output of the<br />
bandpass channel.<br />
In the next section we will describe a signaling method for a bandpass channel. This method<br />
is called quadrature multiplexing.<br />
15.2 Quadrature multiplexing<br />
In quadrature multiplexing we combine two waveforms with a baseband spectrum into a waveform<br />
with a spectrum that matches with the bandpass channel. We assume that the two baseband<br />
waveforms together contain a single message.<br />
Consider a first set of baseband waveforms {s^c_1(t), s^c_2(t), · · · , s^c_{|M|}(t)} with waveform s^c_m(t),<br />
for all m ∈ {1, 2, · · · , |M|}, having bandwidth smaller than W , i.e. with a spectrum S^c_m( f ) ≡ 0<br />
for | f | ≥ W . To this set of baseband-waveforms there corresponds a set of building-block waveforms<br />
{φ_i (t), i = 1, 2, · · · , N_c}. For all i ∈ {1, 2, · · · , N_c} the spectrum of the corresponding<br />
building-block waveform Φ_i ( f ) ≡ 0 for | f | ≥ W since the building-block waveforms are linear<br />
combinations of the signal waveforms (Gram-Schmidt procedure, see appendix E).<br />
Also consider a second set of baseband waveforms {s^s_1(t), s^s_2(t), · · · , s^s_{|M|}(t)} with waveform<br />
s^s_m(t) for all m ∈ {1, 2, · · · , |M|} having bandwidth smaller than W , i.e. S^s_m( f ) ≡ 0<br />
for | f | ≥ W . To this second set of waveforms there corresponds a set of building-block waveforms<br />
{ψ_j (t), j = 1, 2, · · · , N_s}. For all j ∈ {1, 2, · · · , N_s} the spectrum of the corresponding<br />
building-block waveform Ψ_j ( f ) ≡ 0 for | f | ≥ W .<br />
Now, to obtain a passband signal, the two waveforms sm c (t) and ss m (t) are combined into a<br />
signal s m (t) by modulating them on a cosine resp. sine wave with frequency f 0 and then adding<br />
the resulting signals together (see figure 15.2), thus<br />
s m (t) = s c m (t)√ 2 cos(2π f 0 t) + s s m (t)√ 2 sin(2π f 0 t). (15.3)<br />
We now make (15.3) more precise. Note that to each message m ∈ M there corresponds a<br />
waveform s^c_m(t) = Σ_{i=1,N_c} s^c_{mi} φ_i (t) and a waveform s^s_m(t) = Σ_{j=1,N_s} s^s_{mj} ψ_j (t). Therefore we<br />
Figure 15.2: Quadrature multiplexing.<br />
write:<br />
s_m(t) = s^c_m(t)√2 cos(2π f_0 t) + s^s_m(t)√2 sin(2π f_0 t)<br />
= Σ_{i=1,N_c} s^c_{mi} φ_i (t)√2 cos(2π f_0 t) + Σ_{j=1,N_s} s^s_{mj} ψ_j (t)√2 sin(2π f_0 t). (15.4)<br />
This means that the signal s m (t) can be regarded as a linear combination of “building-block”<br />
waveforms φ i (t) √ 2 cos(2π f 0 t) and ψ j (t) √ 2 sin(2π f 0 t). To see whether these waveforms are<br />
actually building blocks we have to check whether they form an orthonormal set.<br />
To this end consider the Fourier transforms of the building block waveforms 1 φ_i (t) and ψ_j (t):<br />
Φ_i ( f ) = ∫_{−∞}^{∞} φ_i (t) exp(− j2π f t)dt,<br />
Ψ_j ( f ) = ∫_{−∞}^{∞} ψ_j (t) exp(− j2π f t)dt. (15.5)<br />
Since both sets {φ_i (t), i = 1, N_c} and {ψ_j (t), j = 1, N_s} form an orthonormal base, using the<br />
Parseval relation, we have for all i and j and all i′ and j′ that<br />
δ_{i,i′} = ∫_{−∞}^{∞} φ_i (t)φ_{i′}(t)dt = ∫_{−∞}^{∞} Φ_i ( f )Φ*_{i′}( f )d f,<br />
δ_{j, j′} = ∫_{−∞}^{∞} ψ_j (t)ψ_{j′}(t)dt = ∫_{−∞}^{∞} Ψ_j ( f )Ψ*_{j′}( f )d f. (15.6)<br />
1 Note that we do not assume here that the baseband building blocks are limited in time as in the chapters on<br />
waveform communication.
[Figure: baseband spectra Φ_i ( f ) and Ψ_j ( f ) on [−W, W ], and the modulated spectra Φ_{c,i}( f ) and Ψ_{s, j}( f ) shifted to ± f_0 ]<br />
Figure 15.3: Spectra after modulation with √ 2 cos 2π f 0 t and √ 2 sin 2π f 0 t. For simplicity it is<br />
assumed that the baseband spectra are real.<br />
Now we are ready to determine the spectrum Φ_{c,i}( f ) of a cosine or in-phase building-block<br />
waveform<br />
φ_{c,i}(t) = φ_i (t)√2 cos 2π f_0 t (15.7)<br />
and the spectrum Ψ_{s, j}( f ) of a sine or quadrature building-block waveform<br />
ψ_{s, j}(t) = ψ_j (t)√2 sin 2π f_0 t. (15.8)<br />
We express these spectra in terms of the baseband spectra Φ_i ( f ) and Ψ_j ( f ) (see also figure<br />
15.3):<br />
Φ_{c,i}( f ) = ∫_{−∞}^{∞} φ_i (t)√2 cos(2π f_0 t) exp(− j2π f t)dt<br />
= (1/√2)(Φ_i ( f − f_0) + Φ_i ( f + f_0)),<br />
Ψ_{s, j}( f ) = ∫_{−∞}^{∞} ψ_j (t)√2 sin(2π f_0 t) exp(− j2π f t)dt<br />
= (1/( j√2))(Ψ_j ( f − f_0) − Ψ_j ( f + f_0)). (15.9)
Note (see again figure 15.3) that the spectra c,i ( f ) and s, j ( f ) are zero outside the passband<br />
±[ f 0 − W, f 0 + W ].<br />
We now first consider the correlation between φ_{c,i}(t) and φ_{c,i′}(t) for all i and i′. This leads<br />
to<br />
∫_{−∞}^{∞} φ_{c,i}(t)φ_{c,i′}(t)dt = ∫_{−∞}^{∞} Φ_{c,i}( f )Φ*_{c,i′}( f )d f<br />
= (1/2) ∫_{−∞}^{∞} [Φ_i ( f − f_0) + Φ_i ( f + f_0)][Φ*_{i′}( f − f_0) + Φ*_{i′}( f + f_0)]d f<br />
= (1/2) ∫_{−∞}^{∞} Φ_i ( f − f_0)Φ*_{i′}( f − f_0)d f + (1/2) ∫_{−∞}^{∞} Φ_i ( f − f_0)Φ*_{i′}( f + f_0)d f<br />
+ (1/2) ∫_{−∞}^{∞} Φ_i ( f + f_0)Φ*_{i′}( f − f_0)d f + (1/2) ∫_{−∞}^{∞} Φ_i ( f + f_0)Φ*_{i′}( f + f_0)d f<br />
(a) = (1/2) ∫_{−∞}^{∞} Φ_i ( f − f_0)Φ*_{i′}( f − f_0)d f + (1/2) ∫_{−∞}^{∞} Φ_i ( f + f_0)Φ*_{i′}( f + f_0)d f<br />
(b) = (1/2)δ_{i,i′} + (1/2)δ_{i,i′} = δ_{i,i′}. (15.10)<br />
Here equality (a) follows from the fact that f 0 > W and the observation that for all i = 1, N c<br />
the spectra i ( f ) ≡ 0 for | f | ≥ W . Therefore the cross-terms are zero, see also figure 15.3.<br />
Equality (b) follows from (15.6) which holds since the baseband building-blocks φ i (t) for i =<br />
1, N c , form an orthonormal base.<br />
In a similar way we can show that for all j and j′<br />
∫_{−∞}^{∞} ψ_{s, j}(t)ψ_{s, j′}(t)dt = δ_{j, j′}. (15.11)<br />
What remains to be investigated are the correlations between all in-phase (cosine) building-block<br />
waveforms φ_{c,i}(t) for i = 1, N_c and all quadrature (sine) building-block waveforms<br />
ψ_{s, j}(t) for j = 1, N_s :<br />
∫_{−∞}^{∞} φ_{c,i}(t)ψ_{s, j}(t)dt = ∫_{−∞}^{∞} Φ_{c,i}( f )Ψ*_{s, j}( f )d f<br />
= ( j/2) ∫_{−∞}^{∞} [Φ_i ( f − f_0) + Φ_i ( f + f_0)][Ψ*_j ( f − f_0) − Ψ*_j ( f + f_0)]d f<br />
= ( j/2) ∫_{−∞}^{∞} Φ_i ( f − f_0)Ψ*_j ( f − f_0)d f − ( j/2) ∫_{−∞}^{∞} Φ_i ( f − f_0)Ψ*_j ( f + f_0)d f<br />
+ ( j/2) ∫_{−∞}^{∞} Φ_i ( f + f_0)Ψ*_j ( f − f_0)d f − ( j/2) ∫_{−∞}^{∞} Φ_i ( f + f_0)Ψ*_j ( f + f_0)d f<br />
(c) = ( j/2) ∫_{−∞}^{∞} Φ_i ( f − f_0)Ψ*_j ( f − f_0)d f − ( j/2) ∫_{−∞}^{∞} Φ_i ( f + f_0)Ψ*_j ( f + f_0)d f<br />
= 0, (15.12)<br />
since each of the two remaining integrals equals ∫_{−∞}^{∞} Φ_i ( f )Ψ*_j ( f )d f.
Here equality (c) follows from the fact that f 0 > W and the observation that for all i = 1, N c<br />
the spectra i ( f ) ≡ 0 for | f | ≥ W and for all j = 1, N s the spectra j ( f ) ≡ 0 for | f | ≥ W .<br />
Therefore the cross-terms are zero. See figure 15.3.<br />
RESULT 15.1 We have shown that all in-phase building-block waveforms φ c,i (t) for i = 1, N c<br />
and all quadrature building-block waveforms ψ s, j (t) for j = 1, N s together form an orthonormal<br />
base.<br />
Moreover the spectra Φ_{c,i}( f ) and Ψ_{s, j}( f ) of all these building-block waveforms are zero outside<br />
the passband ±[ f_0 − W, f_0 + W ]. Therefore none of these building-block waveforms is hindered<br />
by the bandpass filter W_0( f ) when they are sent over our bandpass channel.<br />
It is important to note that the baseband building-block waveforms φ i (t) and ψ j (t), for any<br />
i and j, need not be orthogonal. Multiplication of these baseband waveforms by √ 2 cos 2π f 0 t<br />
resp. √ 2 sin 2π f 0 t results in the orthogonality of the bandpass building-block waveforms φ c,i (t)<br />
and ψ s, j (t).<br />
Also note that all the bandpass building block waveforms have unit energy. This follows from<br />
(15.10) and (15.11). Therefore the energy of s_m(t) is equal to the squared length of the vector<br />
s_m = (s^c_m , s^s_m ) = (s^c_{m1}, s^c_{m2}, · · · , s^c_{m N_c}, s^s_{m1}, s^s_{m2}, · · · , s^s_{m N_s}). (15.13)<br />
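The central claim of this section, that modulating even the same baseband pulse on a cosine and on a sine carrier yields two orthonormal bandpass waveforms, can be checked numerically. In the sketch below the values W = 1, f_0 = 10 and the unit-energy sinc pulse are our own assumptions:<br />

```python
import numpy as np

W, f0, dt = 1.0, 10.0, 0.001        # baseband bandwidth and carrier, f0 > W
t = np.arange(-200.0, 200.0, dt)

phi = np.sqrt(2*W) * np.sinc(2*W*t)               # unit-energy pulse, bandwidth W
phi_c = phi * np.sqrt(2) * np.cos(2*np.pi*f0*t)   # in-phase building block
psi_s = phi * np.sqrt(2) * np.sin(2*np.pi*f0*t)   # quadrature building block

e_c = np.trapz(phi_c**2, t)         # energy of the cosine waveform, ~1
e_s = np.trapz(psi_s**2, t)         # energy of the sine waveform, ~1
x = np.trapz(phi_c*psi_s, t)        # cross-correlation, ~0 despite equal pulses
print(round(e_c, 2), round(e_s, 2), round(x, 2))
```

The cross-correlation vanishes because 2 cos(2π f_0 t) sin(2π f_0 t) = sin(4π f_0 t) lies far outside the band of the pulse product whenever f_0 > W.<br />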
15.3 Optimum receiver for quadrature multiplexing<br />
The result of the previous section actually states that by applying quadrature multiplexing for a<br />
bandpass channel we obtain a “normal” waveform channel communication problem. Quadrature<br />
multiplexing apparently yields the right building-block waveforms! The building-block waveforms<br />
have spectra that are matched to the bandpass channel. Each message m ∈ M results in<br />
a signal s m (t) that is a linear combination of N c cosine and N s sine building-block waveforms.<br />
Therefore the ideal bandpass filter at the input of the bandpass channel has no effect on the<br />
transmission of the signals.<br />
We have seen before that a waveform channel is actually equivalent to a vector channel. Also<br />
we know how the optimum receiver for a waveform channel should be constructed. If we restrict<br />
ourselves for a moment to the receiver that correlates the received signal r(t) with the building<br />
blocks, we know that it should start with N c + N s multipliers followed by integrators.<br />
∫_{−∞}^{∞} r(t)φ_{c,i}(t)dt = ∫_{−∞}^{∞} r(t)√2 cos(2π f_0 t)φ_i (t)dt = r^c_i , and<br />
∫_{−∞}^{∞} r(t)ψ_{s, j}(t)dt = ∫_{−∞}^{∞} r(t)√2 sin(2π f_0 t)ψ_j (t)dt = r^s_j . (15.14)<br />
After having determined the received vector r = (r^c , r^s ) = (r^c_1 , r^c_2 , · · · , r^c_{N_c}, r^s_1 , r^s_2 , · · · , r^s_{N_s}) we<br />
can form the dot-products (r · s_m ) where the signal vector s_m = (s^c_m , s^s_m ) = (s^c_{m1}, s^c_{m2}, · · · , s^c_{m N_c}, s^s_{m1}, s^s_{m2}, · · · , s^s_{m N_s}):<br />
(r · s_m ) = Σ_{i=1,N_c} r^c_i s^c_{mi} + Σ_{j=1,N_s} r^s_j s^s_{mj}. (15.15)<br />
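Once the vector r is available, the receiver of figure 15.4 reduces to a few vector operations: form the dot-products (15.15), add the constants c_m, and select the largest metric. The sketch below works directly on the vector channel; the signal vectors, priors, and noise level are invented for illustration:<br />

```python
import numpy as np

rng = np.random.default_rng(1)
N0 = 0.01                              # low noise level (our choice)
S = np.array([[1.0,  1.0, 0.0, 0.0],   # hypothetical signal vectors s_m
              [1.0, -1.0, 0.0, 0.0],
              [0.0,  0.0, 2.0, 0.0]])
prior = np.array([0.25, 0.25, 0.5])    # hypothetical priors Pr{M = m}

m = 2                                  # transmitted message
r = S[m] + rng.normal(0.0, np.sqrt(N0/2), size=4)    # received vector

c = (N0/2)*np.log(prior) - 0.5*np.sum(S**2, axis=1)  # constants c_m
metric = S @ r + c                     # (r . s_m) + c_m for every m
m_hat = int(np.argmax(metric))
print(m_hat)                           # 2: the transmitted message is recovered
```

The matrix product `S @ r` plays the role of the "weighting matrix" block in the figure.<br />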
Figure 15.4: Optimum receiver for quadrature-multiplexed signals that are transmitted over a<br />
bandpass channel.
Adding the constants c_m is now the last step before selecting ˆm as the message m that maximizes<br />
(r · s_m ) + c_m where (see result (6.1))<br />
c_m = (N_0/2) ln Pr{M = m} − ‖s_m‖²/2, (15.16)<br />
with<br />
‖s_m‖² = ‖s^c_m‖² + ‖s^s_m‖² = Σ_{i=1,N_c} (s^c_{mi})² + Σ_{j=1,N_s} (s^s_{mj})²<br />
= ∫_{−∞}^{∞} (s^c_m(t))²dt + ∫_{−∞}^{∞} (s^s_m(t))²dt. (15.17)<br />
Note that ‖s_m‖² = ∫_{−∞}^{∞} s²_m(t)dt.<br />
All this leads to the receiver implementation shown in figure 15.4.<br />
15.4 Transmission of a complex signal<br />
After the previous section we may think of quadrature multiplexing as a method for transmitting<br />
two real-valued signals s^c_m(t) and s^s_m(t). However we can also see quadrature multiplexing as a<br />
technique to transmit a single complex waveform<br />
s^{cs}_m(t) = s^c_m(t) − j s^s_m(t). (15.18)<br />
Then we can write<br />
s_m(t) = R[s^{cs}_m(t)√2 exp( j2π f_0 t)]<br />
= s^c_m(t)√2 cos(2π f_0 t) + s^s_m(t)√2 sin(2π f_0 t), (15.19)<br />
where R[x] denotes the real part of x.<br />
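Identity (15.19) is easy to confirm numerically. In the sketch below the carrier frequency and the two baseband waveforms are arbitrary choices of ours:<br />

```python
import numpy as np

f0 = 5.0                        # arbitrary carrier frequency
t = np.linspace(0.0, 2.0, 2001)
sc = np.cos(np.pi*t)            # example in-phase baseband waveform s_c(t)
ss = np.sin(np.pi*t)            # example quadrature baseband waveform s_s(t)

s_cs = sc - 1j*ss               # complex waveform of (15.18)
lhs = np.real(s_cs * np.sqrt(2) * np.exp(1j*2*np.pi*f0*t))
rhs = sc*np.sqrt(2)*np.cos(2*np.pi*f0*t) + ss*np.sqrt(2)*np.sin(2*np.pi*f0*t)
print(bool(np.allclose(lhs, rhs)))   # True: (15.19) holds pointwise
```

The minus sign in (15.18) is what turns the imaginary part of the carrier into a positive sine term.<br />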
The notion of energy of the complex signal s^{cs}_m(t) is in accordance with the energy that we<br />
have used in expressing the constants c_m , hence<br />
∫_{−∞}^{∞} ‖s^{cs}_m(t)‖²dt = ∫_{−∞}^{∞} (s^c_m(t))²dt + ∫_{−∞}^{∞} (s^s_m(t))²dt. (15.20)<br />
15.5 Dimensions per second<br />
By applying quadrature multiplexing we cannot obtain more than 4W dimensions per second<br />
from our bandpass channel. By choosing both the baseband building-block waveform sets<br />
{φ_i (t), i = 1, N_c} and {ψ_j (t), j = 1, N_s} as large (rich) as possible given the bandwidth constraint<br />
of W Hz, i.e. by taking 2W dimensions per second for each of the building-block waveform<br />
sets, we obtain the upper bound of 4W dimensions per second. Note however that, as usual,<br />
this corresponds to 2 dimensions per Hz per second.<br />
15.6 Capacity of the bandpass channel<br />
In the previous section we saw that the number of dimensions in the case of bandpass transmission<br />
is at most 4W per second. Moreover there exist baseband building-block sets that achieve<br />
this maximum.<br />
RESULT 15.2 Therefore the capacity in bit per dimension of the bandpass channel with bandwidth<br />
±[ f_0 − W, f_0 + W ] is<br />
C_N = (1/2) log_2 (1 + P_s/(2N_0 W )), (15.21)<br />
if the transmitter power is P_s and the noise spectral density is N_0/2 for all f . Consequently the<br />
capacity per second is<br />
C = 4W C_N = 2W log_2 (1 + P_s/(2N_0 W )) bit/second. (15.22)<br />
15.7 Quadrature amplitude modulation<br />
In general we take N c = N s = N and the baseband in-phase and quadrature building-block<br />
waveforms equal, i.e. φ i (t) = ψ i (t) for all i = 1, N. Then the two dimensions corresponding<br />
to the passband building-blocks φ i (t) √ 2 cos(2π f 0 t) and ψ i (t) √ 2 sin(2π f 0 t) for i = 1, N are<br />
in some way linked together. This will become clear in chapter 16 on random carrier-phase<br />
communication. We refer to this method as quadrature amplitude modulation (QAM).<br />
15.8 Serial quadrature amplitude modulation<br />
It is important to realize that we could use serial pulse amplitude modulation with a pulse p(t)<br />
satisfying the Nyquist criterion (see result 14.1) on both carriers (cosine and sine). A Nyquist<br />
pulse corresponds to an orthonormal basis consisting of time shifts of this pulse. Two such sets<br />
(based on the same pulse p(t)) can be used in a QAM scheme. Note that this signaling method<br />
leads to an extremely simple receiver structure.<br />
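A serial QAM link along these lines can be sketched end-to-end. Below we place three amplitudes on each carrier using shifted sinc pulses as in (15.23) and recover them by correlating with the bandpass building blocks (noiseless, with our own values W = 0.5 and f_0 = 8, and a truncated integration range):<br />

```python
import numpy as np

W, f0, dt = 0.5, 8.0, 0.002     # our choices; f0 > W as required
Tsym = 1.0/(2*W)                # one pulse every 1/(2W) seconds
ac = np.array([1.0, -3.0, 3.0])      # in-phase amplitudes
aq = np.array([-1.0, 1.0, -3.0])     # quadrature amplitudes
t = np.arange(-150.0, 150.0, dt)

def pulse(t, k):                # shifted orthonormal sinc pulse, cf. (15.23)
    return np.sqrt(2*W) * np.sinc(2*W*(t - k*Tsym))

sc = sum(a*pulse(t, k) for k, a in enumerate(ac))
ss = sum(a*pulse(t, k) for k, a in enumerate(aq))
s = sc*np.sqrt(2)*np.cos(2*np.pi*f0*t) + ss*np.sqrt(2)*np.sin(2*np.pi*f0*t)

# receiver: correlate with the bandpass building blocks (noiseless)
rc = [np.trapz(s*np.sqrt(2)*np.cos(2*np.pi*f0*t)*pulse(t, k), t) for k in range(3)]
rq = [np.trapz(s*np.sqrt(2)*np.sin(2*np.pi*f0*t)*pulse(t, k), t) for k in range(3)]
print(np.round(rc, 2), np.round(rq, 2))   # both amplitude streams come back
```

In practice the correlations would of course be computed with a matched filter and a sampler per carrier, as in figure 14.6.<br />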
Example 15.1 QAM modulation results in two-dimensional signal structures. In serial QAM we can take<br />
as baseband building blocks<br />
φ_1(t) = ψ_1(t) = √(2W ) · sin(2πW t)/(2πW t), (15.23)<br />
and all other baseband building blocks are shifts over multiples of 1/(2W ) seconds of this sinc-pulse.<br />
Therefore the frequency bandwidth of all building blocks is W . After modulation a bandpass spectrum<br />
is obtained, thus the spectra of the signals fit into ±[ f_0 − W, f_0 + W ]. In each pair of dimensions<br />
corresponding to a shift over (k − 1)/(2W ) seconds we can observe a two-dimensional part (s^c_k , s^s_k ) of the<br />
signal vector. This part assumes values in a two-dimensional signal structure.<br />
In figure 15.5 two signal structures are shown. The signal points are placed on a rectangular grid.<br />
This implies that their minimum Euclidean distance is not smaller than a certain value d. It is important to<br />
place the required number of signal points in such a way on the grid that their average energy is minimal.
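For the 16-QAM structure of figure 15.5(a) the average energy is easy to compute. The sketch below builds the 4 × 4 grid for an assumed minimum distance d = 2 and checks the minimum Euclidean distance:<br />

```python
import numpy as np
from itertools import product

d = 2.0   # assumed minimum distance of the grid
# 16-QAM: a 4x4 grid with coordinates at odd multiples of d/2
pts = np.array(list(product([-3, -1, 1, 3], repeat=2))) * (d/2)

energy = np.mean(np.sum(pts**2, axis=1))            # average signal energy
diff = pts[:, None, :] - pts[None, :, :]
dist = np.sqrt(np.sum(diff**2, axis=-1))
dmin = dist[dist > 0].min()                         # minimum Euclidean distance
print(len(pts), energy, dmin)                       # 16 10.0 2.0
```

For 32-QAM one would instead keep the 32 grid points of smallest energy, which is why that structure is not square.<br />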
Figure 15.5: Two QAM signal structures, (a) 16-QAM, (b) 32-QAM. Note that 32-QAM is not<br />
square!<br />
15.9 Exercises<br />
1. A signal structure must contain M signal points. These points are to be chosen on an integer<br />
grid.<br />
Now consider a hypercube and a hypersphere in N dimensions. Both have their center in<br />
the origin of the coordinate system.<br />
We can either choose our signal structure as all the grid points inside the sphere or all the<br />
grid points inside the cube. Assume that the dimensions of the sphere and cube are such<br />
that they contain equally many signal points.<br />
Let M ≫ 1 so that the signal points can be assumed to be uniformly distributed over the<br />
sphere and the cube.<br />
(a) Let N = 2. Find the ratio between the average signal energies of the sphere-structure<br />
and that of the cube-structure. What is the difference in dB?<br />
(b) Let N be even. What is then the ratio for N → ∞? In dB?<br />
Hint: The volume of an N-dimensional sphere is π N/2 R N /(N/2)! for even N.<br />
The computed ratios are called shaping gains.
Chapter 16<br />
Random carrier-phase<br />
SUMMARY: Here we consider carrier modulation under the assumption that the carrier-phase<br />
is not known to the receiver. Only one baseband signal s_b(t) will be transmitted. First<br />
we determine the optimum receiver for this case. Then we consider the implementation of<br />
the optimum receiver when all signals have equal energy. We investigate envelope detection<br />
and work out an example in which one out of two orthogonal signals is transmitted.<br />
16.1 Introduction<br />
Because of oscillator drift or differences in propagation time it is not always reasonable to assume<br />
that the receiver knows the phase of the wave that is used as carrier of the message when<br />
quadrature multiplexing is used. In that case coherent demodulation is not possible.<br />
To investigate this situation we assume that the transmitter modulates with the 1 baseband<br />
signal s_b(t) a wave with a random phase θ, i.e.<br />
s(t) = s_b(t)√2 cos(2π f_0 t − θ), (16.1)<br />
where the random phase is assumed to be uniform over [0, 2π) hence<br />
p_Θ(θ) = 1/(2π), for 0 ≤ θ < 2π. (16.2)<br />
More precisely, for each message m ∈ M, let the signal s^b_m(t) correspond to the vector<br />
s_m = (s_{m1}, s_{m2}, · · · , s_{m N}) hence<br />
s^b_m(t) = Σ_{i=1,N} s_{mi} φ_i (t). (16.3)<br />
We assume as before that the spectrum of the signals s^b_m(t) for all m ∈ M is zero outside<br />
[−W, W ]. Note that the same holds for the building-block waveforms φ_i (t) for i = 1, N. This<br />
1 Note that there is only one baseband signal involved here. The type of modulation is called double sideband<br />
suppressed carrier (DSB-SC) modulation.<br />
CHAPTER 16. RANDOM CARRIER-PHASE 138<br />
is since the building-block waveforms are linear combinations of the signals. If we now use the<br />
goniometric identity cos(a − b) = cos a cos b + sin a sin b we obtain<br />
s_m(t) = s^b_m(t)√2 cos(2π f_0 t − θ)<br />
= s^b_m(t) cos θ √2 cos(2π f_0 t) + s^b_m(t) sin θ √2 sin(2π f_0 t)<br />
= Σ_{i=1,N} s_{mi} cos θ φ_i (t)√2 cos(2π f_0 t) + Σ_{i=1,N} s_{mi} sin θ φ_i (t)√2 sin(2π f_0 t). (16.4)<br />
If we now turn to the vector approach we can say that, although θ is unknown, this is equivalent<br />
to quadrature multiplexing where vector-signals s^c_m and s^s_m are transmitted satisfying<br />
s^c_m = s_m cos θ,<br />
s^s_m = s_m sin θ. (16.5)<br />
After receiving the signal r(t) = s_m(t) + n_w(t) an optimum-receiver forms the vectors r^c and r^s<br />
for which<br />
r^c = s^c_m + n^c = s_m cos θ + n^c ,<br />
r^s = s^s_m + n^s = s_m sin θ + n^s , (16.6)<br />
where both n c and n s are independent Gaussian vectors with independent components all having<br />
mean zero and variance N 0 /2. In the next section we will further determine the optimum receiver<br />
for this situation.<br />
16.2 Optimum incoherent reception<br />
To determine the optimum receiver we first write out the decision variable for message m, i.e.<br />
Pr{M = m}p_{R^c,R^s}(r^c , r^s |M = m)<br />
= Pr{M = m} ∫_0^{2π} p_Θ(θ)p_{R^c,R^s}(r^c , r^s |M = m, Θ = θ)dθ. (16.7)<br />
Note that we average over the unknown phase θ. Assume first that all messages are equally<br />
likely. Therefore only the integral is relevant. The informative term inside the integral is<br />
p_{R^c,R^s}(r^c , r^s |M = m, Θ = θ)<br />
= p_{N^c,N^s}(r^c − s_m cos θ, r^s − s_m sin θ)<br />
= (1/(π N_0)^N) exp(−(‖r^c − s_m cos θ‖² + ‖r^s − s_m sin θ‖²)/N_0). (16.8)<br />
Therefore we investigate<br />
‖r^c − s_m cos θ‖² + ‖r^s − s_m sin θ‖²<br />
= ‖r^c‖² − 2(r^c · s_m ) cos θ + ‖s_m‖² cos² θ + ‖r^s‖² − 2(r^s · s_m ) sin θ + ‖s_m‖² sin² θ<br />
= ‖r^c‖² + ‖r^s‖² − 2(r^c · s_m ) cos θ − 2(r^s · s_m ) sin θ + ‖s_m‖², (16.9)<br />
and note that both ‖r^c‖² and ‖r^s‖² do not depend on the message m and can be ignored. Therefore<br />
the relevant part of the decision variable is<br />
∫_θ p_Θ(θ) exp( (2/N_0)[(r^c · s_m ) cos θ + (r^s · s_m ) sin θ] ) dθ · exp(−E_m/N_0), (16.10)<br />
where E_m = ‖s_m‖². Note that we have to maximize the decision variable over all m ∈ M.<br />
We next consider (r^c · s_m ) cos θ + (r^s · s_m ) sin θ. Therefore, for each m ∈ M, we first make<br />
We next consider (r c · s m ) cos θ + (r s · s m ) sin θ. Therefore, for each m ∈ M, we first make<br />
Figure 16.1: Polar transformation of matched-filter outputs.<br />
a transformation of the matched-filter outputs (r c · s m ) and (r s · s m ) from rectangular into the<br />
polar coordinates (amplitude X m and phase γ m ). Define (see figure 16.1):<br />
X_m = √((r_c · s_m)² + (r_s · s_m)²) (16.11)
and let the angle γ_m be such that
(r_c · s_m) = X_m cos γ_m,
(r_s · s_m) = X_m sin γ_m. (16.12)
Therefore
(r_c · s_m) cos θ + (r_s · s_m) sin θ = X_m cos γ_m cos θ + X_m sin γ_m sin θ = X_m cos(θ − γ_m), (16.13)
and
∫_0^{2π} p_Θ(θ) exp((2/N_0)[(r_c · s_m) cos θ + (r_s · s_m) sin θ]) dθ
= ∫_0^{2π} (1/2π) exp((2X_m/N_0) cos(θ − γ_m)) dθ
(a)= ∫_0^{2π} (1/2π) exp((2X_m/N_0) cos θ) dθ = I_0(2X_m/N_0). (16.14)
Here (a) follows from the periodicity of the cosine. Note that therefore the angle γ_m is not relevant! The integral denoted by I_0(·) is called the “zero-order modified Bessel function of the first kind” (see figure 16.2).
Figure 16.2: A plot of I_0(x) = (1/2π) ∫_0^{2π} exp(x cos θ) dθ as a function of x.
RESULT 16.1 Combining everything we obtain that the optimum receiver for “random-phase transmission” (incoherent detection) has to choose the message m ∈ M that maximizes
I_0(2X_m/N_0) exp(−E_m/N_0). (16.15)
16.3 Signals with equal energy, receiver implementation
If all the signals have the same energy, the optimum receiver only has to maximize I_0(2X_m/N_0). However, since I_0(x) is a monotonically increasing function of x, this is equivalent to maximizing X_m or its square
X_m² = (r_c · s_m)² + (r_s · s_m)². (16.16)
How can we now determine e.g. (r_c · s_m)? We can use the standard approach as shown in figure 15.4 and correlate r(t) with the bandpass building-block waveforms φ_i(t)√2 cos(2π f_0 t) to obtain the components of r_c and then form the dot product, but there is also a more direct way.
Note that
(r_c · s_m) = Σ_{i=1,N} r_i^c s_{mi}
= Σ_{i=1,N} (∫_{−∞}^{∞} r(t) φ_i^c(t) dt) s_{mi}
= Σ_{i=1,N} (∫_{−∞}^{∞} r(t) φ_i(t)√2 cos(2π f_0 t) dt) s_{mi}
= ∫_{−∞}^{∞} r(t)√2 cos(2π f_0 t) Σ_{i=1,N} s_{mi} φ_i(t) dt
= ∫_{−∞}^{∞} r(t)√2 cos(2π f_0 t) s_m^b(t) dt. (16.17)
A similar result holds for the product (r_s · s_m). All this suggests the implementation shown in figure 16.3. As usual there is also a matched-filter version of this incoherent receiver.
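The maximization of (16.16) is easy to sketch in code. The two orthogonal vectors below are a toy signal set of our own choosing; any equal-energy set s_1, ..., s_|M| in vector representation would do:

```python
import numpy as np

Es = 4.0
signals = np.sqrt(Es) * np.eye(2)   # rows are the signal vectors s_1, s_2

def incoherent_receiver(r_c, r_s):
    """Return the index m maximizing X_m^2 = (r_c·s_m)^2 + (r_s·s_m)^2, eq. (16.16)."""
    X2 = (signals @ r_c) ** 2 + (signals @ r_s) ** 2
    return int(np.argmax(X2))

# Without noise, r_c = s_m cos θ and r_s = s_m sin θ: the decision is correct
# for every value of the unknown carrier phase θ.
ok = all(
    incoherent_receiver(signals[m] * np.cos(th), signals[m] * np.sin(th)) == m
    for m in (0, 1)
    for th in np.linspace(0.0, 2.0 * np.pi, 17)
)
```

Note that θ never enters the receiver itself, only the simulated channel; that is the point of incoherent detection.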
[Correlation receiver: r(t) is multiplied by √2 cos(2π f_0 t) and by √2 sin(2π f_0 t); each product is correlated with s_m^b(t) to obtain (r_c · s_m) and (r_s · s_m), these are squared and added to form X_m², and the largest X_m² determines m̂.]
Figure 16.3: Correlation receiver for equal-energy signals with random phase.
16.4 Envelope detection
Fix an m ∈ M. The matched filter for the (bandpass) signal s_m(t) = s_m^b(t)√2 cos(2π f_0 t − θ) has impulse response s_m(T − t) = s_m^b(T − t)√2 cos(2π f_0 (T − t) − θ). Here it is assumed that the signal s_m(t) is zero outside [0, T]. Since the phase θ of the transmitted signal is unknown to the receiver, the impulse response
h_m(t) = s_m^b(T − t)√2 cos(2π f_0 t) (16.18)
is probably a good matched filter. Let’s investigate this!
The response of this filter to r(t) is
u_m(t) = ∫_{−∞}^{∞} r(α) h_m(t − α) dα
= ∫_{−∞}^{∞} r(α) s_m^b(T − t + α)√2 cos(2π f_0 (t − α)) dα
= ∫_{−∞}^{∞} r(α) s_m^b(T − t + α)√2 [cos(2π f_0 t) cos(2π f_0 α) + sin(2π f_0 t) sin(2π f_0 α)] dα
= cos(2π f_0 t) ∫_{−∞}^{∞} r(α) s_m^b(T − t + α)√2 cos(2π f_0 α) dα + sin(2π f_0 t) ∫_{−∞}^{∞} r(α) s_m^b(T − t + α)√2 sin(2π f_0 α) dα
= u_m^c(t) cos(2π f_0 t) + u_m^s(t) sin(2π f_0 t), (16.19)
with
u_m^c(t) = ∫_{−∞}^{∞} r(α)√2 cos(2π f_0 α) s_m^b(T − t + α) dα,
u_m^s(t) = ∫_{−∞}^{∞} r(α)√2 sin(2π f_0 α) s_m^b(T − t + α) dα. (16.20)
The signals u_m^c(t) and u_m^s(t) are baseband signals. Why? The reason is that we can regard e.g. u_m^c(t) as the output of the filter with impulse response s_m^b(T − t) when the excitation is r(t)√2 cos(2π f_0 t). It is clear that signal components outside the frequency band [−W, W] cannot pass the filter s_m^b(T − t) since s_m^b(t) is a baseband signal.
We now use the trick of section 16.2 again. Write the matched-filter output as
u_m(t) = X_m(t) cos[2π f_0 t − γ_m(t)], (16.21)
where
X_m(t) = √((u_m^c(t))² + (u_m^s(t))²) (16.22)
and the angle γ_m(t) is such that
u_m^c(t) = X_m(t) cos γ_m(t),
u_m^s(t) = X_m(t) sin γ_m(t). (16.23)
Now X_m(t) is a slowly-varying “envelope” (amplitude) and γ_m(t) a slowly-varying “phase” of the output u_m(t) of the matched filter. By slowly-varying we mean that the variations are confined to the frequency band [−W, W].
Next let us consider what happens at t = T. Observe from equations (16.17) that
u_m^c(T) = (r_c · s_m),
u_m^s(T) = (r_s · s_m), (16.24)
hence
X_m(T) = √((r_c · s_m)² + (r_s · s_m)²). (16.25)
Therefore, for equal-energy signals, we can construct an optimum receiver by sampling the envelopes of the outputs of the matched filters h_m(t) = s_m^b(T − t)√2 cos(2π f_0 t) for all m ∈ M and comparing the samples. This leads to the implementation shown in figure 16.4.
[Envelope-detector receiver: r(t) is applied to the matched filters h_m(t); each output u_m(t) is fed to an envelope detector producing X_m(t); the envelopes are sampled at t = T and the largest sample X_m determines m̂.]
Figure 16.4: Envelope-detector receiver for equal energy signals with random phase.<br />
Note that instead of using a matched filter with (passband) impulse response s_m^b(T − t)√2 cos(2π f_0 t) we can first multiply the received signal r(t) by cos(2π f_0 t) and then use a matched filter with (baseband) impulse response s_m^b(T − t).
Example 16.1 For some m ∈ M assume that s_m^b(t) = 1 for 0 ≤ t ≤ 1 and zero elsewhere. If the random phase turns out to be θ = π/2 then
s_m(t) = s_m^b(t)√2 cos(2π f_0 t − π/2) = s_m^b(t)√2 sin(2π f_0 t). (16.26)
For the matched-filter impulse response, noting that T = 1, we can write
h_m(t) = s_m^b(1 − t)√2 cos(2π f_0 t). (16.27)
If we assume that there is no noise, hence r(t) = s_m(t), the output u_m(t) of the matched filter will be
u_m(t) = ∫_{−∞}^{∞} r(α) h_m(t − α) dα
= ∫_{−∞}^{∞} s_m^b(α)√2 sin(2π f_0 α) s_m^b(1 − t + α)√2 cos(2π f_0 (t − α)) dα. (16.28)
All these signals are shown in figure 16.5 for f_0 = 7 Hz. Note that the envelope of u_m(t) has the triangular shape that a matched filter for a rectangular pulse produces at its output when its input is that same rectangular pulse.
Figure 16.5: Signals s_m^b(t), s_m(t), and u_m(t), and impulse response h_m(t).
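Example 16.1 can also be reproduced numerically; the sampling step below is our own discretization choice. With θ = π/2 and no noise, the quadrature correlations (16.20) evaluated at t = T give u_m^c(T) ≈ 0 and u_m^s(T) ≈ 1, so the envelope sample equals the pulse energy regardless of the unknown phase:

```python
import numpy as np

f0, T, dt = 7.0, 1.0, 1e-4
t = np.arange(0.0, T, dt)
r = np.sqrt(2.0) * np.sin(2.0 * np.pi * f0 * t)   # s_m(t) for θ = π/2, with s_m^b = 1 on [0, 1]

# Quadrature correlations (16.20) at t = T, where s_m^b(T - T + α) = 1 on [0, 1].
u_c = np.sum(r * np.sqrt(2.0) * np.cos(2.0 * np.pi * f0 * t)) * dt
u_s = np.sum(r * np.sqrt(2.0) * np.sin(2.0 * np.pi * f0 * t)) * dt
X_T = np.hypot(u_c, u_s)   # envelope sample (16.25); equals the pulse energy 1
```

Repeating this with any other θ moves the split between u_c and u_s but leaves X_T unchanged.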
16.5 Probability of error for two orthogonal signals<br />
An incoherent receiver does not use phase information. Therefore it cannot achieve as small an error probability as a coherent receiver. To illustrate this we will now calculate the probability of error for white Gaussian noise when one of two equally likely messages is communicated over a system utilizing an incoherent receiver and DSB-SC modulated equal-energy orthogonal baseband signals
s_1^b(t) = √E_s φ_1(t), hence s_1 = (√E_s, 0), and
s_2^b(t) = √E_s φ_2(t), hence s_2 = (0, √E_s). (16.29)
An optimum receiver (see result 16.1) now chooses m̂ = 1 if and only if
(r_c · s_1)² + (r_s · s_1)² > (r_c · s_2)² + (r_s · s_2)². (16.30)
Here all vectors are two-dimensional; more precisely
r_c = s_m cos θ + n_c,
r_s = s_m sin θ + n_s. (16.31)
When M = 1 the vector components of r_c = (r_c1, r_c2) and r_s = (r_s1, r_s2) are
r_c1 = √E_s cos θ + n_c1,   r_c2 = n_c2,
r_s1 = √E_s sin θ + n_s1,   r_s2 = n_s2, (16.32)
where n_c = (n_c1, n_c2) and n_s = (n_s1, n_s2). Now by s_1 = (√E_s, 0) and s_2 = (0, √E_s) the optimum receiver decodes m̂ = 1 if and only if
(r_c1)² + (r_s1)² > (r_c2)² + (r_s2)². (16.33)
The noise components are all statistically independent Gaussian variables with density function
p_N(n) = (1/√(πN_0)) exp(−n²/N_0). (16.34)
Now fix some θ and assume that (r_c1)² + (r_s1)² = ρ². What is the probability of error given that M = 1 and Θ = θ? Consider
Pr{M̂ = 2 | Θ = θ, M = 1, (R_c1)² + (R_s1)² = ρ²} = Pr{(R_c2)² + (R_s2)² ≥ ρ²}
= ∫∫_{α²+β²≥ρ²} p_N(α) p_N(β) dα dβ
= ∫∫_{α²+β²≥ρ²} (1/(πN_0)) exp(−(α² + β²)/N_0) dα dβ
= ∫_ρ^∞ ∫_0^{2π} (1/(πN_0)) exp(−r²/N_0) r dθ dr
= exp(−ρ²/N_0). (16.35)
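The closed form at the end of (16.35) is easy to confirm by simulation; N_0, ρ and the number of trials below are our own choices:

```python
import numpy as np

rng = np.random.default_rng(0)
N0, rho, trials = 2.0, 1.5, 200_000

# (R_c2)^2 + (R_s2)^2 with the two noise components i.i.d. N(0, N0/2).
n_c = rng.normal(0.0, np.sqrt(N0 / 2.0), trials)
n_s = rng.normal(0.0, np.sqrt(N0 / 2.0), trials)
empirical = np.mean(n_c**2 + n_s**2 >= rho**2)
closed_form = np.exp(-rho**2 / N0)   # eq. (16.35)
```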
We obtain the error probability for the case where M = 1 by averaging over all R_c1 and R_s1, hence
Pr{M̂ = 2 | Θ = θ, M = 1}
= ∫_{−∞}^{∞} ∫_{−∞}^{∞} p_{R_c1,R_s1}(r_c1, r_s1 | Θ = θ, M = 1) exp(−((r_c1)² + (r_s1)²)/N_0) dr_c1 dr_s1
= ∫_{−∞}^{∞} ∫_{−∞}^{∞} p_{R_c1}(r_c1 | Θ = θ, M = 1) p_{R_s1}(r_s1 | Θ = θ, M = 1) exp(−((r_c1)² + (r_s1)²)/N_0) dr_c1 dr_s1
= (∫_{−∞}^{∞} p_{R_c1}(r_c1 | Θ = θ, M = 1) exp(−(r_c1)²/N_0) dr_c1) · (∫_{−∞}^{∞} p_{R_s1}(r_s1 | Θ = θ, M = 1) exp(−(r_s1)²/N_0) dr_s1). (16.36)
Consider the first factor. Note that r_c1 = √E_s cos θ + n_c1. Therefore
∫_{−∞}^{∞} p_{R_c1}(r_c1 | Θ = θ, M = 1) exp(−(r_c1)²/N_0) dr_c1
= ∫_{−∞}^{∞} (1/√(πN_0)) exp(−(α − √E_s cos θ)²/N_0) exp(−α²/N_0) dα. (16.37)
With m = √E_s cos θ this integral becomes
∫_{−∞}^{∞} (1/√(πN_0)) exp(−(α − m)²/N_0) exp(−α²/N_0) dα
= ∫_{−∞}^{∞} (1/√(πN_0)) exp(−(2α² − 2αm + m²)/N_0) dα
= ∫_{−∞}^{∞} (1/√(πN_0)) exp(−(2(α² − αm + m²/4) + m²/2)/N_0) dα
= (exp(−m²/(2N_0))/√2) ∫_{−∞}^{∞} (1/√(πN_0/2)) exp(−(α − m/2)²/(N_0/2)) dα
= exp(−m²/(2N_0))/√2
= exp(−E_s cos²θ/(2N_0))/√2. (16.38)
Combining this with a similar result for the second factor we obtain
Pr{M̂ = 2 | Θ = θ, M = 1} = (exp(−E_s cos²θ/(2N_0))/√2) · (exp(−E_s sin²θ/(2N_0))/√2)
= (1/2) exp(−E_s/(2N_0)). (16.39)
Although we have fixed θ, this probability is independent of θ. Averaging over θ therefore yields
Pr{M̂ = 2 | M = 1} = (1/2) exp(−E_s/(2N_0)). (16.40)
By symmetry we obtain a similar result for Pr{M̂ = 1 | M = 2}.
RESULT 16.2 Therefore we obtain for the error probability of an incoherent receiver for two equally likely orthogonal signals, both having energy E_s, that
P_E = (1/2) exp(−E_s/(2N_0)). (16.41)
Note that for coherent reception of two equally likely orthogonal signals we obtained before that
P_E = Q(√(E_s/N_0)) ≤ (1/2) exp(−E_s/(2N_0)). (16.42)
Here we have used the bound on the Q-function that was derived in appendix A. Figure 16.6 shows, however, that the two error probabilities are comparable.
Figure 16.6: Probability of error for coherent and incoherent reception of two equally likely orthogonal signals of energy E_s as a function of E_s/N_0 in dB. Note that the probability of error for incoherent reception is the larger one.
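The two curves of figure 16.6 come directly from (16.41) and (16.42); a small script of our own evaluates both and confirms the ordering:

```python
from math import erfc, exp, sqrt

def Q(x):
    """Gaussian tail probability, via the complementary error function."""
    return 0.5 * erfc(x / sqrt(2.0))

for snr_db in range(-15, 16, 5):
    snr = 10.0 ** (snr_db / 10.0)          # E_s/N_0 on a linear scale
    pe_coherent = Q(sqrt(snr))             # eq. (16.42)
    pe_incoherent = 0.5 * exp(-snr / 2.0)  # eq. (16.41)
    assert pe_coherent <= pe_incoherent    # consistent with the Q-function bound
```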
Another disadvantage of incoherent transmission of two orthogonal signals is that we lose 3 dB relative to antipodal signaling. Also the bandwidth efficiency of random-phase transmission of two orthogonal signals is a factor of two smaller than that of coherent transmission: a transmitted dimension spreads out over two received dimensions.
16.6 Exercises<br />
1. Let X and Y be two independent Gaussian random variables with common variance σ². The mean of X is m and Y is a zero-mean random variable. We define the random variable V as V = √(X² + Y²). Show that
p_V(v) = (v/σ²) I_0(mv/σ²) exp(−(v² + m²)/(2σ²)), for v > 0, (16.43)
and 0 for v ≤ 0. Here I_0(·) is the modified Bessel function of the first kind and zero order. The distribution of V is known as the Rician distribution. In the special case where m = 0, the Rician distribution simplifies to the Rayleigh distribution.
(Exercise 3.31 from Proakis and Salehi [17].)
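This does not solve the exercise, but the claimed density can be checked by simulation; the values of m, σ and the tail point v_0 below are our own choices:

```python
import numpy as np

rng = np.random.default_rng(7)
m, sigma, trials = 1.0, 0.8, 300_000
V = np.hypot(rng.normal(m, sigma, trials), rng.normal(0.0, sigma, trials))

# Tail probability Pr{V > v0} from the stated Rician density (16.43),
# integrated numerically with the rectangle rule.
v0 = 1.5
v = np.linspace(v0, m + 12.0 * sigma, 20_000)
pdf = (v / sigma**2) * np.i0(m * v / sigma**2) * np.exp(-(v**2 + m**2) / (2.0 * sigma**2))
tail = float(np.sum(pdf) * (v[1] - v[0]))
empirical = float(np.mean(V > v0))
```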
2. In on-off keying of a carrier-modulated signal, the two possible signals are
s_1^b(t) = √E φ(t), hence s_1 = √E,
s_2^b(t) = 0, hence s_2 = 0. (16.44)
Note that s_1 and s_2 are signals in one-dimensional “vector”-notation. The signals are equiprobable, i.e. Pr{M = 1} = Pr{M = 2} = 1/2.
These signals are transmitted over a bandpass channel with a carrier with random phase Θ, where p_Θ(θ) = 1/(2π) for 0 ≤ θ < 2π. The power spectral density of the noise is N_0/2 for all frequencies. An optimum receiver first determines r_c and r_s, for which we can write
r_c = s_m cos θ + n_c,
r_s = s_m sin θ + n_s, (16.45)
where both n_c and n_s are independent zero-mean Gaussian variables with variance N_0/2. Now let E/N_0 = 10 dB.
(a) Determine the optimum decision regions I_m for m = 1, 2.
Hint: Note that I_0(12.1571) = 22025.
(b) Determine the expected error probability that is achieved by the optimum receiver.
Hint: Note that ∫_0^{2.7184} x exp(−x²/2) I_0(x√20) dx = 635.72.
3. In on-off keying of a carrier-modulated signal the two baseband signals are
s_1^b(t) = Aφ(t), hence s_1 = A,
s_2^b(t) = 0, hence s_2 = 0.
Here φ(t) is a baseband building-block waveform and s_1 and s_2 are the baseband signals in vector representation (one-dimensional). Both signals are equiprobable, hence Pr{M = 1} = Pr{M = 2} = 1/2.
The baseband signals are modulated with a carrier having frequency f_0 and random phase Θ, hence
s_m(t) = s_m^b(t)√2 cos(2π f_0 t − θ).
The resulting signal s_1(t) or s_2(t) is transmitted over a bandpass channel; the power spectral density of the additive white Gaussian noise is N_0/2 for all frequencies f. Assume that A²/N_0 = ln(13/5).
An optimum receiver correlates the received signal r(t) = s_m(t) + n_w(t) with the bandpass building-block waveforms φ(t)√2 cos(2π f_0 t) and φ(t)√2 sin(2π f_0 t) to obtain the received vector (r_c, r_s).
(a) Fix some θ. Express the received vector (r_c, r_s) as a signal vector (s_c, s_s) plus a noise vector (n_c, n_s). What is the signal vector (s_c, s_s) as a function of θ? What are the statistical properties of the noise vector (n_c, n_s)?
(b) The random variable Θ can take the values +π/2 and −π/2 only, with Pr{Θ = +π/2} = Pr{Θ = −π/2} = 1/2. What are now the possible signal vectors (s_c, s_s)? Give the decision variables as a function of (r_c, r_s) and specify for what (r_c, r_s) an optimum receiver decides M̂ = 2. (Hint: note that x = ln 5 satisfies 26/5 = exp(x) + exp(−x).)
(c) Give an expression for the error probability P_E that is achieved by an optimum receiver.
Chapter 17<br />
Colored noise<br />
SUMMARY: So far we have only considered additive white Gaussian noise. In this chapter<br />
we will see that we should use a whitening filter if the channel noise is nonwhite. We will<br />
discuss the corresponding receiver structures.<br />
17.1 Introduction<br />
[m → transmitter → s_m(t) → + (noise n(t)) → r(t) = s_m(t) + n(t) → receiver → m̂]
Figure 17.1: Communication over an additive waveform channel. The noise n(t) is nonwhite and Gaussian.
Consider figure 17.1. There the transmitter sends, depending on the message index m, one of the signals (waveforms) s_1(t), s_2(t), ..., s_|M|(t) over a waveform channel. The channel adds wide-sense-stationary zero-mean non-white Gaussian noise n(t) with power spectral density S_n(f) to this waveform¹. Therefore the output of the channel is r(t) = s_m(t) + n(t).
How should an optimum receiver process this output signal r(t)? We will show next that we can use a so-called whitening filter to make the colored noise white. Then we proceed as usual.
17.2 Another result on reversibility<br />
We could try to construct an optimum receiver in two steps (see figure 17.2). First an operation is performed on the channel output r(t). This yields a new channel output r^o(t). Then we construct
1 See appendix D.<br />
CHAPTER 17. COLORED NOISE 151<br />
[s_m(t) → channel → r(t) → operation → r^o(t) → receiver → m̂]
Figure 17.2: Insertion of an operation between channel output and receiver.
an optimum receiver for the channel with input s_m(t) and output r^o(t).
It will be clear that such a two-step procedure cannot result in a smaller average error probability P_E than an optimum one-step procedure would achieve.
If, however, an inverse operation exists which permits r(t) to be reconstructed from r^o(t), then theorem 4.3 on reversibility states that the average error probability need not be increased by the two-step procedure. The second step could in that case consist of the calculation of r(t) from r^o(t), followed by the optimum receiver for r(t). The theorem of reversibility in its most general form states:
RESULT 17.1 A reversible operation, transforming r(t) into one or more waveforms, may be<br />
inserted between the channel output and the receiver without affecting the minimum attainable<br />
average error probability.<br />
17.3 Additive nonwhite Gaussian noise<br />
Suppose that we use a filter with impulse response g(t) and transfer function G(f) to change the power spectral density S_n(f) of the Gaussian noise process N(t). Consider figure 17.3. The filter produces the Gaussian noise process N^o(t) at its output.
From equation (D.6) in appendix D we know that
S_{n^o}(f) = S_n(f) G(f) G^*(f) = S_n(f)|G(f)|². (17.1)
If we use the filter g(t) to whiten the noise, i.e. to produce noise at the output with power spectral density S_{n^o}(f) = N_0/2, the filter should be such that S_n(f)|G(f)|² = N_0/2. In appendix I we have shown how to determine a realizable linear filter with transfer function G(f), which has a realizable inverse and for which
|G(f)|² = (N_0/2) · (1/S_n(f)). (17.2)
The method in the appendix is applicable whenever S_n(f) can be expressed (or approximated) as a ratio of two polynomials in f. We will work out an example next.
[n(t) → whitening filter → n^o(t)]
Figure 17.3: Filtering the noise process N(t).
Example 17.1 Consider (see figure 17.4) noise with power spectral density
S_n(f) = (f² + 4)/(f² + 1). (17.3)
Observe that if we take as transfer function of our whitening filter
G(f) = (f − j)/(f − 2j), (17.4)
then both G(f) and its inverse are realizable (see appendix I).
Figure 17.4: The power spectrum S_n(f) = (f² + 4)/(f² + 1) of the noise and the squared modulus (f² + 1)/(f² + 4) of the transfer function G(f) of the corresponding whitening filter.
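A small check of example 17.1 (assuming the normalization N_0/2 = 1 that the example implies): multiplying S_n(f) by |G(f)|² should give a perfectly flat output spectrum.

```python
import numpy as np

f = np.linspace(-10.0, 10.0, 2001)
S_n = (f**2 + 4.0) / (f**2 + 1.0)     # eq. (17.3)
G = (f - 1j) / (f - 2j)               # eq. (17.4)
S_out = S_n * np.abs(G)**2            # whitened output spectrum, identically 1
```

Here |f − j|² = f² + 1 and |f − 2j|² = f² + 4, which is exactly the reciprocal of S_n(f).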
17.4 Receiver implementation<br />
17.4.1 Correlation type receiver<br />
When the whitening filter g(t) has been determined we can construct the optimum receiver. Note that we actually place the whitening filter at the channel output r(t), as in the left part of figure 17.5, but this is equivalent to the circuit containing two filters g(t) in the right part of figure 17.5. Therefore the effect of applying the whitening filter g(t) is that we have transformed the channel into an additive white Gaussian noise waveform channel, and that the signals s_m(t) are changed by the whitening filter g(t) into the signals
s_m^o(t) = s_m(t) ∗ g(t) = ∫_{−∞}^{∞} s_m(α) g(t − α) dα for m ∈ M. (17.5)
[Left: colored noise n(t) is added to s_m(t); the sum r(t) then passes through g(t), giving r^o(t). Right, equivalently: s_m(t) passes through g(t), giving s_m^o(t), to which white noise n_w(t) is added, again giving r^o(t).]
Figure 17.5: Two equivalent channel/whitening-filter circuits.
This observation leads to the optimum receiver shown in figure 17.6. This receiver correlates the output r^o(t) of the whitening filter g(t) with the signals s_m^o(t) for all m ∈ M. The signals s_m^o(t) are obtained at the output of filters with impulse response g(t) when s_m(t) is the input. Note that the constants c_m^o for m ∈ M are now
c_m^o = (N_0/2) ln Pr{M = m} − (1/2) ∫_{−∞}^{∞} [s_m^o(t)]² dt, (17.6)
since the signal s_m(t) has changed into s_m^o(t).

17.4.2 Matched-filter type receiver
To determine the matched-filter type receiver here, we will do some frequency-domain investigations. Suppose that for some m ∈ M the Fourier spectrum of the waveform s_m(t) is denoted by S_m(f). Assume that s_m(t) = 0 except for 0 ≤ t ≤ T. The transfer function of the corresponding matched filter h_m(t) = s_m(T − t) is then given by
H_m(f) = ∫_{−∞}^{∞} s_m(T − t) exp(−j2π f t) dt
= ∫_{−∞}^{∞} s_m(α) exp(−j2π f (T − α)) dα
= exp(−j2π f T) ∫_{−∞}^{∞} s_m(α) exp(j2π f α) dα = exp(−j2π f T) S_m^*(f), (17.7)
where S_m^*(f) is the complex conjugate of S_m(f).
Now what is the matched filter corresponding to the signal s_m^o(t)? First note that the spectrum of s_m^o(t) is
S_m^o(f) = G(f) S_m(f). (17.8)
Suppose that g(t) is only non-zero for 0 ≤ t ≤ T_1. Moreover, let all s_m(t) be zero outside the time interval 0 ≤ t ≤ T_2. Then s_m^o(t) = ∫_{−∞}^{∞} s_m(α) g(t − α) dα can only be non-zero if an α
[Correlation receiver: r(t) passes through the whitening filter g(t); its output r^o(t) is correlated with each reference signal s_m^o(t) (obtained by filtering s_m(t) with g(t)), the bias c_m^o is added, and the largest result determines m̂.]
Figure 17.6: Correlation receiver for the additive nonwhite Gaussian noise channel (filtered-reference receiver).
exists such that 0 ≤ α ≤ T_2 and 0 ≤ t − α ≤ T_1, hence if 0 ≤ t ≤ T_1 + T_2. The transfer function H_m^o(f) of the filter matched to s_m^o(t) should therefore be
H_m^o(f) = [S_m^o(f)]^* exp(−j2π f (T_1 + T_2))
= G^*(f) S_m^*(f) exp(−j2π f (T_1 + T_2))
= G^*(f) exp(−j2π f T_1) · S_m^*(f) exp(−j2π f T_2). (17.9)
This corresponds to a cascade of the filter g(T_1 − t) followed by the filter s_m(T_2 − t), one for each m ∈ M. Thus the filter matched to s_m^o(t) consists of a part matched to the channel (thus to the whitening filter) and a part that is matched to the signal s_m(t). This leads to the receiver structure in figure 17.7.
It is interesting to see what the front-end of this receiver does. It consists of the cascade of the whitening filter g(t) and the part of the matched filter matched to the channel (the whitening filter). The transfer function of this cascade is
G(f) G^*(f) exp(−j2π f T_1) = |G(f)|² exp(−j2π f T_1) = ((N_0/2)/S_n(f)) exp(−j2π f T_1). (17.10)
This cascade may be implemented as a single filter. The effect of this filter is to pass energy in frequency bands where the noise power is small and to suppress energy in frequency bands where it is strong.
[Matched-filter receiver: r(t) passes through g(t) and then through g(T_1 − t); the result is applied to the filters s_m(T_2 − t), whose outputs are sampled at t = T_1 + T_2; the bias c_m^o is added and the largest result determines m̂.]
Figure 17.7: Matched filter receiver for the additive nonwhite Gaussian noise channel (filtered-signal receiver).
17.5 Exercises<br />
1. A communication system uses the {ϕ_j(t)} shown in figure 17.8 to transmit one of two equally likely messages by means of the signals s_1(t) = √E_s ϕ_1(t), s_2(t) = √E_s ϕ_2(t). The channel is also illustrated in figure 17.8.
(a) What is the probability of error for an optimum receiver when E_s/N_0 = 5?
(b) Give a detailed block diagram, including waveshapes in the absence of noise, of the optimum correlation receiver (filtered-signal receiver).
(Exercise 7.1 from Wozencraft and Jacobs [25].)
[Figure 17.8 shows the unit-height building-block waveforms ϕ_1(t) and ϕ_2(t), the channel in which s(t) passes through h(t) and white noise n_w(t) is added to give r(t), and the channel impulse response h(t).]
Figure 17.8: Building-block waveforms, communication system, and the channel impulse response h(t).
Chapter 18<br />
Capacity of the colored noise waveform<br />
channel<br />
SUMMARY: In this chapter we determine the capacity of an additive non-white Gaussian<br />
noise waveform channel. We have to find out how to distribute signal power over a number<br />
of parallel channels, each with their own noise power. These channels partition the<br />
spectrum in different frequency bands. It turns out that a ”water-filling” method achieves<br />
capacity.<br />
18.1 Problem statement<br />
[transmitter → s(t) → + (noise n(t)) → r(t) = s(t) + n(t) → receiver]
Figure 18.1: Communication system based on an additive non-white Gaussian noise channel.
Consider the communication system shown in figure 18.1. The channel adds stationary zero-mean Gaussian noise to the input waveform s(t). The power spectral density S_n(f) of the noise process N(t) is assumed to be non-white, see figure 18.2. Our problem now is to determine the capacity of this channel when the available transmitter power is P_s. We will solve this problem by decomposing the colored-noise channel into sub-channels. Each of these sub-channels is a bandpass channel. We then have to find out how the signal power P_s is distributed over these sub-channels in an optimal way, i.e. so as to maximize the total capacity of all the sub-channels.
CHAPTER 18. CAPACITY OF THE COLORED NOISE WAVEFORM CHANNEL 158<br />
Figure 18.2: Power spectral density S_n(f) of colored noise.
Figure 18.3: Decomposition of our channel into sub-channels, each having bandwidth Δf.
18.2 The sub-channels<br />
Suppose that we divide our channel into K sub-channels, each sub-channel being a bandpass channel (see figure 18.3). We assume that channel k has a passband ±[(k − 1)Δf, kΔf] for k = 1, ..., K. For the moment we assume that Δf and K are suitably chosen.
Assume that the transmitter power used in sub-channel k is P_k. The spectral density of the noise is not constant within the passband of a sub-channel. Therefore we make a staircase approximation of the spectral density (see figure 18.3). We assume that the spectral density of the noise over the entire passband of sub-channel k is S_n(f_k), where f_k = (k − 1/2)Δf is the center frequency of the passband. Now the capacity of sub-channel k is
C_k = Δf log₂(1 + P_k/(S_n(f_k) 2Δf)) bit/second. (18.1)
This is a consequence of result 15.2. In the next section we investigate how we should distribute the signal power over all the sub-channels.
Figure 18.4: K parallel channels with different noise powers.
18.3 Capacity of parallel channels, water-filling

Consider K parallel channels (see figure 18.4). Each channel has the same bandwidth Δf. We assume that the signal power used in channel k is P_k. The noise power in that channel is N_k = S_n(f_k)2Δf. The total available signal power is P_s, hence we obtain the condition

∑_{k=1,K} P_k ≤ P_s. (18.2)

Our problem is to distribute the signal power P_s over the sub-channels such that the total capacity is maximal. In other words

C = max_{P_1,P_2,···,P_K} ∑_{k=1,K} Δf log₂(1 + P_k/N_k), (18.3)

where condition (18.2) should be satisfied. We can solve this maximization problem using the following result.

RESULT 18.1 (Gallager [7], p. 87). Let f(α) be a convex-∩ function of α = (α_1, α_2, ···, α_K) over the region R of probability vectors α. Assume that the partial derivatives ∂f(α)/∂α_k are defined and continuous over the region R, with the possible exception that lim_{α_k↓0} ∂f(α)/∂α_k may be +∞. Then the conditions

∂f(α)/∂α_k = λ for all k such that α_k > 0, (18.4)

∂f(α)/∂α_k ≤ λ for all k such that α_k = 0, (18.5)

are necessary and sufficient conditions on a probability vector α to maximize f(α) over the region R.
To find the capacity of a collection of K parallel channels we take the function

f(α) = ∑_{k=1,K} Δf log₂(1 + α_k P_s/N_k), (18.6)

i.e. the total capacity C. This is a convex-∩ function of the probability vector α, and R is the region of all such vectors. The partial derivatives of f(α) are

∂f(α)/∂α_k = (Δf/ln 2) · P_s/(N_k + α_k P_s). (18.7)

Result 18.1 now implies that a probability vector α achieves the maximum if and only if

N_k + α_k P_s = L for all k such that α_k > 0,
N_k ≥ L for all k such that α_k = 0, (18.8)
for some power level L. This suggests the following procedure.

RESULT 18.2 The procedure is called "water-filling" (see figure 18.5). Before the water-filling procedure begins, each channel is filled with noise power only: channel k contains noise power N_k. Then we start pouring signal power into the channels. The signal power starts filling the channel with the smallest noise power first, i.e. channel 1 here. When the water level reaches the second-smallest noise power level N_2, and we keep pouring signal power into the channels, both channel 1 and channel 2 are being filled, etc. Observe that, depending on the total amount of signal power, some channels may not get any signal power at all (channel 4 here).

Figure 18.5: Water-filling with four parallel channels.

This procedure leads to the optimal distribution of signal power P_k = α_k P_s over the channels. From these powers the total capacity C can be computed.
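The water-filling procedure of result 18.2 is easy to carry out numerically. The following sketch (our own illustration, not part of the original text; the function name and the bisection tolerance are our choices) finds the common water level L by bisection and returns the powers P_k = max(0, L − N_k).

```python
def water_fill(noise_powers, total_power, tol=1e-12):
    """Distribute total_power over parallel channels with the given noise
    powers N_k, following result 18.2: find a level L such that the powers
    P_k = max(0, L - N_k) sum to total_power."""
    lo = min(noise_powers)                    # below this level no power is used
    hi = max(noise_powers) + total_power      # this level certainly uses too much
    while hi - lo > tol:
        level = (lo + hi) / 2
        used = sum(max(0.0, level - n) for n in noise_powers)
        if used > total_power:
            hi = level
        else:
            lo = level
    level = (lo + hi) / 2
    return [max(0.0, level - n) for n in noise_powers], level
```

For instance, with noise powers (1, 2, 4, 9) as in figure 18.5 and P_s = 8, the level settles at L = 5, so channels 1–3 receive powers 4, 3 and 1, while channel 4 (N_4 = 9 > L) receives none.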
18.4 Back to the non-white waveform channel again

Remember that we have decomposed the non-white waveform channel into K sub-channels, all having bandwidth Δf. We have seen how we should divide the signal power over these sub-channels (water-filling procedure, result 18.2).

Instead of specifying the signal power in each sub-channel, it has advantages to consider the spectral density P_s(f_k) of the signal power in sub-channel k, hence

P_s(f_k) = P_k/(2Δf) = α_k P_s/(2Δf). (18.9)

Moreover, note that we can express the noise power in sub-channel k as N_k = S_n(f_k)2Δf. Therefore a signal power distribution {P_s(f_k), k = 1, ..., K} achieves the maximum if and only if

S_n(f_k) + P_s(f_k) = L′ for all k such that P_s(f_k) > 0,
S_n(f_k) ≥ L′ for all k such that P_s(f_k) = 0, (18.10)

for some power level L′ such that

∑_{k=1,K} P_s(f_k)2Δf = P_s. (18.11)

Note that L′ relates to the power level L of the previous section as L′2Δf = L.
Now we are ready to let the bandwidth of the sub-channels Δf ↓ 0 and the number of sub-channels K → ∞. This leads to the main result of this chapter.

RESULT 18.3 A spectral density function P_s(f) of the signal power achieves the maximum if and only if

S_n(f) + P_s(f) = L” for all f such that P_s(f) > 0,
S_n(f) ≥ L” for all f such that P_s(f) = 0, (18.12)

for some power level L” such that

∫_{−∞}^{∞} P_s(f) df = P_s. (18.13)
This suggests a water-filling procedure in the spectral domain (see figure 18.6).
Figure 18.6: Water-filling in the spectral domain.
For the capacity of the non-white waveform channel we can now write

C = ∫_{f∈B} (1/2) log₂(1 + P_s(f)/S_n(f)) df = ∫_{f∈B} (1/2) log₂(L”/S_n(f)) df bit/second, (18.14)

where

B = {f | P_s(f) > 0}. (18.15)
Note that L” is the power spectral density level that determines how the spectral power is distributed<br />
over the frequency spectrum.<br />
18.5 Capacity of the linear filter channel

Figure 18.7: Creating a flat channel changes the spectral density of the noise.
Suppose that the channel filters the input signal s(t) with a filter with impulse response h(t)<br />
and then adds (possibly non-white) Gaussian noise n(t) to the filter output s(t) ∗ h(t). Then,
at the receiver side, the overall channel can be made flat by filtering the channel output r(t) = s(t) ∗ h(t) + n(t) with an equalizing filter with response g(t) that satisfies

G(f)H(f) = 1, (18.16)

thus g(t) is the inverse¹ filter to h(t). The effect of this filter is however that the noise n(t) is also filtered by the equalizing filter, hence

r_o(t) = r(t) ∗ g(t) = (s(t) ∗ h(t) + n(t)) ∗ g(t) = s(t) ∗ h(t) ∗ g(t) + n(t) ∗ g(t) = s(t) + n(t) ∗ g(t). (18.17)

The power spectral density of the noise n_o(t) = n(t) ∗ g(t) in r_o(t) is now given by

S_{n_o}(f) = S_n(f)|G(f)|² = S_n(f)/|H(f)|². (18.18)
Note that from the previous sections we know how to determine the capacity of this channel now.<br />
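As a small numerical illustration of (18.18) (our own example, not from the text: the first-order filter H(f) = 1/(1 + j2πfRC) is an assumption chosen for concreteness), white input noise of density N_0/2 appears after equalization with density (N_0/2)/|H(f)|², which grows with |f|:

```python
import cmath

def equalized_noise_psd(f, n0=2.0, rc=1.0):
    """Noise PSD after the equalizer g(t) with G(f) = 1/H(f), for a
    hypothetical first-order channel filter H(f) = 1/(1 + j*2*pi*f*RC)
    and white input noise of density N0/2 (equation 18.18)."""
    h = 1.0 / (1.0 + 1j * 2 * cmath.pi * f * rc)
    return (n0 / 2) / abs(h) ** 2
```

At f = 0 the filter is transparent and the density equals N_0/2; away from f = 0 the equalized noise density increases, so the water-filling of result 18.3 would concentrate the signal power near f = 0.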
18.6 Exercises<br />
1. Determine the capacity C of the channel with noise spectrum S n ( f ) = | f | + 1 depicted in<br />
figure 18.8 as a function of the available signal power P s .<br />
Figure 18.8: A noise spectrum S_n(f) = |f| + 1.
2. Consider a waveform channel with additive non-white Gaussian noise. The output waveform<br />
r(t) = s(t) + n(t) where s(t) is the input waveform and n(t) the noise.<br />
The power spectral density of the noise is given by S n ( f ) = exp(| f |) for −∞ < f < ∞<br />
(see figure 18.9). Determine the signal power P s as a function of the capacity C of this<br />
channel. As usual C is expressed in bit per second.<br />
(Exam July 4, 2001.)<br />
1 We assume here that this inverse filter is realizable. Therefore the theorem of reversibility 4.3 applies.
Figure 18.9: Power spectral density of the noise S_n(f) = exp(|f|).
3. Consider a waveform channel with additive non-white Gaussian noise. The output waveform r(t) = s(t) + n(t), where s(t) is the input waveform and n(t) the noise. The power spectral density of the noise (see figure 18.10) is given by

S_n(f) = 1 for |f| ≤ 1,
         2 for 1 < |f| ≤ 2,
         4 for 2 < |f| ≤ 3,
         ∞ otherwise. (18.19)
Figure 18.10: Power spectral density of the noise S_n(f) in Watt/Hz as a function of the frequency f in Hz.
(Exam July 5, 2002.)
Part V<br />
Coded Modulation<br />
Chapter 19<br />
Coding for bandlimited channels<br />
SUMMARY: Modulation makes it possible to transmit sequences of symbols over a waveform<br />
channel. Here we will investigate coding, i.e. we will study which sequences of symbols<br />
should be used to achieve efficient transmission. We study only coding methods, i.e. techniques<br />
that create distance between sequences. Shaping techniques, i.e. methods that aim<br />
at achieving spherical signal structures will not be discussed.<br />
19.1 AWGN channel<br />
Consider transmission over an additive white Gaussian noise (AWGN) channel. For this channel<br />
Figure 19.1: Additive white Gaussian noise channel: r_i = s_i + n_i, where the noise density is p_N(n_i) = (1/√(πN_0)) exp(−n_i²/N_0).
the i-th channel output r i is equal to the i-th input s i to which noise n i is added, hence<br />
r i = s i + n i , for i = 1, 2, · · · . (19.1)<br />
We assume that the noise variables N i are independent and normally distributed, with mean 0<br />
and variance σ 2 = N 0 /2.<br />
This AWGN-channel models transmission over a waveform channel with additive white<br />
Gaussian noise with power density S Nw ( f ) = N 0 /2 for −∞ < f < ∞ as we have seen<br />
before (see section 6.8).<br />
Note that i is the index of the dimensions. If e.g. the waveform channel has bandwidth W, we can get a new dimension every 1/(2W) seconds (see chapter 14). This determines the relation between the actual time t and the dimension index i, but we will ignore this in what follows.
In this chapter we call a dimension a transmission. We assume that the energy “invested” in<br />
each transmission (dimension) is E N . It is our objective to investigate coding techniques for this<br />
channel. First we consider uncoded transmission.<br />
19.2 |A|-Pulse amplitude modulation<br />
In this section we will consider pulse amplitude modulation (PAM) again. In chapter 14 the modulation<br />
properties of this technique were discussed. Here we will investigate the error behavior<br />
of this method which can be regarded as uncoded transmission. We will use (uncoded) PAM as<br />
a reference for evaluating alternative signaling methods, methods that do involve coding.<br />
In the case of PAM, the channel input symbols s_i for i = 1, 2, ··· assume values in the symbol alphabet (or signal structure or signal constellation)

A = {−|A| + 1, −|A| + 3, ···, |A| − 1}, (19.2)

scaled by a factor d_0/2. We write

s_i = (d_0/2)a_i where a_i ∈ A. (19.3)

We will here call a_i an elementary PAM symbol.
The constellation A is called an |A|-PAM constellation. It contains |A| equidistant elementary<br />
symbols centered on the origin and having minimum Euclidean distance 2. Therefore d 0 is<br />
the minimum Euclidean distance between the symbols s i . We assume in this chapter that |A| is<br />
even. In figure 19.2 an |A|-PAM constellation is shown.<br />
Figure 19.2: An |A|-PAM constellation (|A| even) with the |A| elementary symbols −|A| + 1, ···, −5, −3, −1, 1, 3, 5, ···, |A| − 1.
Each symbol value has a probability of occurrence of 1/|A|. Therefore the information rate<br />
per transmission is<br />
R N = log 2 |A| bits per transmission. (19.4)<br />
For the average symbol energy E_N per transmission we now have

E_N = (d_0²/4) E_pam, (19.5)

where E_pam is the elementary energy of an |A|-PAM constellation, defined as

E_pam = (1/|A|) ∑_{a=−|A|+1,−|A|+3,···,|A|−1} a². (19.6)
RESULT 19.1 In appendix J it is shown that

E_pam = (|A|² − 1)/3. (19.7)

The result holds both for even and odd |A|.

It now follows from the above result that, since s_i = (d_0/2)a_i, the average symbol energy

E_N = (d_0²/4) · (|A|² − 1)/3. (19.8)

Example 19.1 Consider the 4-PAM constellation

A = {−3, −1, +1, +3}. (19.9)
The rate corresponding to this constellation is R_N = log₂ 4 = 2 bits per transmission. For the elementary energy per transmission we obtain

E_pam = (1/4)((−3)² + (−1)² + (+1)² + (+3)²) = (4² − 1)/3 = 5. (19.10)
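Result 19.1 is easy to check numerically. The following quick sketch (ours, not part of the original text) evaluates (19.6) directly and compares it with the closed form (19.7):

```python
def e_pam(m):
    """Elementary energy (19.6) of an m-PAM constellation
    A = {-m+1, -m+3, ..., m-1}."""
    symbols = range(-m + 1, m, 2)
    return sum(a * a for a in symbols) / m
```

For m = 4 this returns 5, matching (19.10), and e_pam(m) equals (m² − 1)/3 for every m, even or odd, as result 19.1 states.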
Apart from the rate R_N and the average symbol energy E_N, a third important parameter is the symbol error probability P_E^s. This error probability is Q((d_0/2)/σ) for the two outer signal points and 2Q((d_0/2)/σ) for the inner signal points. Therefore

P_E^s = ((2(|A| − 2) + 2)/|A|) Q((d_0/2)/σ) = (2(|A| − 1)/|A|) Q((d_0/2)/σ). (19.11)
By (19.8) and the fact that σ² = N_0/2, the square of the argument of the Q-function can be written as

d_0²/(4σ²) = (3/(|A|² − 1)) · E_N/(N_0/2). (19.12)

Therefore we can express the symbol error probability as

P_E^s = (2(|A| − 1)/|A|) Q(√((3/(|A|² − 1)) · E_N/(N_0/2))) = (2(|A| − 1)/|A|) Q(√((3/(|A|² − 1)) S/N)), (19.13)

if we note that the signal-to-noise ratio S/N = E_N/(N_0/2), i.e. the average symbol energy divided by the variance of the noise in a transmission.
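Equation (19.13) is straightforward to evaluate with the standard identity Q(x) = erfc(x/√2)/2 (the sketch below is our own illustration, not part of the original text):

```python
import math

def q_func(x):
    """Tail probability Q(x) of the standard normal distribution,
    via the identity Q(x) = erfc(x / sqrt(2)) / 2."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def pam_symbol_error(m, snr):
    """Symbol error probability (19.13) of uncoded m-PAM at
    signal-to-noise ratio snr = E_N / (N0/2)."""
    return 2 * (m - 1) / m * q_func(math.sqrt(3 * snr / (m * m - 1)))
```

For m = 2 the expression reduces to Q(√(S/N)), and for a fixed S/N the error probability grows with the alphabet size, as (19.13) predicts.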
19.3 Normalized S/N

We now want to look more closely at signal-to-noise ratios. Note that reliable transmission is only possible if the rate R_N (in bits per transmission, i.e. per dimension) is smaller than the capacity C_N per dimension. If we write the capacity per transmission (see 12.19) as a function of S/N = E_N/(N_0/2) we obtain

R_N < C_N = (1/2) log₂(1 + S/N). (19.14)

Rewriting this inequality in the reverse direction we obtain the following lower bound on S/N as a function of the rate R_N:

S/N > 2^{2R_N} − 1 = S/N_min. (19.15)
Definition 19.1 This immediately suggests defining the normalized signal-to-noise ratio as

S/N_norm = (S/N)/(S/N_min) = (S/N)/(2^{2R_N} − 1). (19.16)
This definition allows us to compare coding methods with different rates with each other.<br />
Now it follows from (19.15) that for reliable transmission S/N norm > 1 must hold. Good<br />
signaling methods achieve small error probabilities for S/N norm close to 1 while lousy methods<br />
need a larger S/N norm to realize reliable transmission. It is the objective of the communication<br />
engineer to design systems that achieve acceptable error probabilities for a S/N norm as close as<br />
possible to 1.<br />
19.4 Baseline performance in terms of S/N_norm

In a previous section we determined the symbol error probability of |A|-PAM as a function of S/N. This symbol error probability can be expressed in terms of S/N_norm in a very simple way, as we will now see.

Since we are considering bandlimited transmission in this chapter, we may assume that S/N ≫ 1. We are aiming at achieving

log₂ |A| = R_N ≈ C_N = (1/2) log₂(1 + S/N), (19.17)

or |A| ≈ √(1 + S/N), and hence we make the assumption that also |A| ≫ 1. Therefore we may conclude for the symbol error probability of |A|-PAM that

P_E^s ≈ 2Q(√((3/(|A|² − 1)) S/N)) = 2Q(√((3/(2^{2R_N} − 1)) S/N)) = 2Q(√(3 S/N_norm)) (19.18)
by definition 19.1 of the normalized signal-to-noise ratio. This symbol error probability for PAM<br />
is depicted in figure 19.3. It will serve as reference performance for the alternative methods that<br />
will be investigated soon. We call it therefore the baseline performance. Note once more that the<br />
baseline performance corresponds to uncoded PAM transmission.
Figure 19.3: Average symbol error probability P_E^s versus S/N_norm for uncoded |A|-PAM, under the assumption that |A| is large. Horizontally S/N_norm in dB, vertically log₁₀ P_E^s.
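The curve of figure 19.3 is just expression (19.18). The following sketch (ours, for illustration) evaluates it and confirms the claim of the next section that P_E^s ≈ 10⁻⁶ requires an S/N_norm of roughly 9 dB:

```python
import math

def baseline_error(snr_norm_db):
    """Baseline symbol error probability (19.18), i.e. 2*Q(sqrt(3*SNRnorm)),
    with SNRnorm given in dB; uses 2*Q(x) = erfc(x / sqrt(2))."""
    snr_norm = 10.0 ** (snr_norm_db / 10.0)
    x = math.sqrt(3.0 * snr_norm)
    return math.erfc(x / math.sqrt(2.0))
```

Evaluating near 9 dB indeed gives an error probability on the order of 10⁻⁶, the level commonly taken to mean reliable transmission.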
19.5 The promise of coding<br />
It can be observed from figure 19.3 that a S/N norm of approximately 9 dB is needed to achieve<br />
a reliable performance when PAM is used. It is more or less common practice to assume that<br />
P s E ≈ 10−6 corresponds to reliable transmission.<br />
Shannon’s results imply that reliable transmission is possible for S/N norm ≈ 1. Therefore<br />
for |A|-PAM the S/N norm is roughly 9 dB, a factor of eight, larger than necessary. There exist<br />
methods that perform up to 9 dB better than our baseline method, i.e. up to 9 dB better than<br />
uncoded PAM.<br />
Why do we prefer small signal-to-noise ratios? In a radio transmission system a smaller S/N<br />
gives the opportunity to use less transmitter power or to apply simpler and/or smaller antennas.<br />
Using less transmitter power makes smaller batteries possible, results in a smaller bill from the<br />
electricity distribution company, or makes dissipation in circuits lower, etc. In line transmission<br />
using smaller transmitter power is also advantageous for the same reasons. Moreover we have<br />
the effect of cross-talk between different users. We can not combat the noise that results from<br />
this cross-talk by increasing the transmitter power. In this case it is better to achieve a better error<br />
performance given the S/N that occurs there.
19.6 Union bound<br />
19.6.1 Sequence error probability<br />
In what follows we will investigate several coding methods and evaluate their performance. Since<br />
we want to compare the performance of the proposed codes with the baseline performance it<br />
would be nice to know the symbol error probabilities of the proposed codes. But since it is not<br />
clear yet what this means, we consider the (sequence) error probability P E of a coding method<br />
first. The union bound can be used to give a good estimate of this error probability. This bound<br />
is based on evaluating the pair-wise error probabilities between pairs of coded sequences.
On the AWGN channel the received sequence r = s + n is the sum of the transmitted coded<br />
sequence (signal) s and the noise vector n consisting of N independent components each having<br />
mean zero and variance σ 2 = N 0 /2. Now consider two coded sequences s ∈ S and s ′ ∈ S<br />
where S is the set of sequences (signals). They differ by the Euclidean distance ‖s − s ′ ‖. If the<br />
sequence s is transmitted, the probability that the received vector r = s + n will be closer to s ′<br />
than to s is given by<br />
Q((‖s − s′‖/2)/σ). (19.19)

The probability that r will be closer to s′ than to s for some s′ ≠ s in S, which is precisely the probability of error P_E(s) for minimum-distance decoding when s is sent, is therefore upper-bounded by

P_E(s) ≤ ∑_{s′≠s} Q((‖s − s′‖/2)/σ). (19.20)
The error probability averaged over all coded sequences s is now upper-bounded by

P_E = ∑_{s∈S} (1/|S|) P_E(s) ≤ ∑_{s∈S} (1/|S|) ∑_{s′≠s} Q((‖s − s′‖/2)/σ) = ∑_d K_d Q((d/2)/σ), (19.21)

where

K_d = ∑_{s∈S} (# of s′ such that ‖s − s′‖ = d)/|S|, (19.22)

i.e. the average number of signals at distance d from a coded sequence. Inequality (19.21) is referred to as the union bound for the sequence error probability.
The function Q(x) decays exponentially, at least as fast as exp(−x²/2); this follows from appendix A. Therefore, when K_d does not increase too rapidly with d, for small σ the union bound is dominated by its first term, which is called the union bound estimate

P_E ≈ K_min Q((d_min/2)/σ), (19.23)

where d_min is the minimum Euclidean distance between any two sequences in S and K_min is the average number of signals at distance d_min from a coded sequence. K_min is called the error coefficient.
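Given a distance spectrum {d : K_d}, both the full union bound (19.21) and the first-term estimate (19.23) are one-liners. The sketch below is our own illustration; the spectrum passed to it in the usage note is hypothetical:

```python
import math

def q_func(x):
    """Standard normal tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def union_bound(spectrum, sigma):
    """Union bound (19.21): sum over distances d of K_d * Q((d/2)/sigma).
    spectrum maps Euclidean distance d -> average multiplicity K_d."""
    return sum(k * q_func(d / (2 * sigma)) for d, k in spectrum.items())

def union_bound_estimate(spectrum, sigma):
    """First-term estimate (19.23): K_min * Q((d_min/2)/sigma)."""
    d_min = min(spectrum)
    return spectrum[d_min] * q_func(d_min / (2 * sigma))
```

For a toy spectrum such as {2: 1, 4: 10}, the ratio of the full bound to the estimate tends to 1 as σ decreases, illustrating why the first term dominates.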
19.6.2 Symbol error probability<br />
To compare a coded system with the uncoded baseline system, we would like to know, instead<br />
of the sequence error probability P E for this coded system, the symbol error probability P s E .<br />
A first problem now appears: What are the symbols? Usually in a coded system a bunch of<br />
bits is transmitted in a single sequence during N transmissions. In the baseline system however,<br />
a symbol is the amount of information that is sent in a single transmission. Therefore in the<br />
coded case we assume also that in each transmission a symbol is sent. Such an abstract symbol<br />
corresponds to a fraction 1/N of the information bits. Thus the coded sequence contains N of<br />
these symbols.<br />
Can we now express the sequence error probability P_E in terms of the error probability P̃_E^s for these symbols¹, and vice versa? Here we encounter a second problem: how do these symbol errors depend on each other? To make life easy we assume that the N symbol errors are independent of each other. Then we easily obtain

P_E = 1 − (1 − P̃_E^s)^N ≈ N P̃_E^s (19.24)

or

P̃_E^s ≈ P_E/N. (19.25)

If we now use the union bound estimate (19.23) obtained in the previous subsection we get

P̃_E^s ≈ P_E/N ≈ (K_min/N) Q((d_min/2)/σ). (19.26)

The ratio K_min/N is called the normalized error coefficient (see [6], p. 2399). The symbol error probability P̃_E^s, which is also referred to as the sequence error probability per transmission, can now be used for comparison with the uncoded PAM case.
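The approximation in (19.24) is tight whenever P̃_E^s is small, as a quick numeric check shows (our own sketch, not from the text):

```python
def sequence_error(p_symbol, n):
    """Exact sequence error probability 1 - (1 - p)^n under the
    independence assumption behind (19.24)."""
    return 1.0 - (1.0 - p_symbol) ** n
```

For p = 10⁻⁶ and n = 8 the exact value and the linear approximation N·p agree to within a relative error far below one percent.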
19.7 Extended Hamming codes<br />
To construct sets of coded sequences we can make use of a binary error-correcting code. Such a<br />
code C is a set of |C| codewords {c 1 , c 2 , · · · , c |C| }. Each codeword c is a sequence consisting of<br />
N symbols c 1 , c 2 , · · · , c N taking values in the alphabet {0, 1}.<br />
A code can be linear. In that case all codewords are linear combinations of K independent<br />
codewords 2 . Such a linear code contains 2 K codewords.<br />
Definition 19.2 The minimum Hamming distance of a code is defined as

d_H,min = min_{c≠c′} d_H(c, c′), (19.27)

where d_H(c, c′) is the Hamming distance between two codewords c and c′, i.e. the number of positions at which they differ.
1 We write ˜P E s with a tilde since we are not dealing with concrete symbols.<br />
2 The sum c of two codewords c ′ and c ′′ is the component-wise mod 2 sum of both codewords. A codeword c<br />
times a scalar is the codeword c itself if the scalar is 1 or the all-zero codeword 0, 0, · · · , 0 if the scalar is 0.
So-called extended Hamming codes exist and have parameters<br />
(N, K, d H,min ) = (2 q , 2 q − q − 1, 4), for q = 2, 3, · · · . (19.28)<br />
We will use these codes in the next sections of this chapter, just because it is convenient to do so.<br />
Example 19.2 For q = 3 we obtain a code with parameters (N, K, d H,min ) = (8, 4, 4). Indeed this code<br />
can be constructed from the following K = 4 independent codewords of length N = 8:<br />
c_1 = (1, 0, 0, 0, 0, 1, 1, 1),
c_2 = (0, 1, 0, 0, 1, 0, 1, 1),
c_3 = (0, 0, 1, 0, 1, 1, 0, 1),
c_4 = (0, 0, 0, 1, 1, 1, 1, 0). (19.29)
The code consists of the 16 linear combinations of these 4 codewords. Why is the minimum Hamming<br />
distance d H,min of this code 4? To see this, note that the difference of two codewords c ̸= c ′ is a non-zero<br />
codeword itself. The Hamming distance between two codewords is the number of ones i.e. the Hamming<br />
weight of this difference. The minimal Hamming weight of a nonzero codeword is at least 4. This can be<br />
checked easily.<br />
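The claim at the end of example 19.2 can indeed be checked easily by brute force: generate all 16 mod-2 linear combinations of the four generators (19.29) and inspect the nonzero Hamming weights (a verification sketch of ours, not part of the original text):

```python
from itertools import product

GENERATORS = [
    (1, 0, 0, 0, 0, 1, 1, 1),
    (0, 1, 0, 0, 1, 0, 1, 1),
    (0, 0, 1, 0, 1, 1, 0, 1),
    (0, 0, 0, 1, 1, 1, 1, 0),
]

def codewords():
    """All 2^4 = 16 codewords of the (8,4,4) extended Hamming code,
    i.e. all mod-2 linear combinations of the generators (19.29)."""
    words = set()
    for coeffs in product((0, 1), repeat=4):
        word = [0] * 8
        for coef, gen in zip(coeffs, GENERATORS):
            if coef:
                word = [(w + g) % 2 for w, g in zip(word, gen)]
        words.add(tuple(word))
    return words

def min_hamming_distance(words):
    """Minimum weight of a nonzero codeword; for a linear code this
    equals the minimum Hamming distance of definition 19.2."""
    return min(sum(w) for w in words if any(w))
```

Running this confirms that the 16 codewords are distinct and that the minimum Hamming distance is 4, as claimed for the (8, 4, 4) code.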
19.8 Coding by set-partitioning<br />
19.8.1 Construction of the signal set<br />
The idea of “mapping by set-partitioning” comes from Ungerboeck [22]. He proposed to partition<br />
the constellation A into subsets A 0 and A 1 in such a way that the minimum distance of<br />
the elementary symbols in the subsets becomes larger than that of the elementary symbols in<br />
the original constellation. E.g. an 8-PAM constellation A can be partitioned into a subset A 0<br />
Figure 19.4: Partitioning of an 8-PAM set into two subsets of four elementary symbols each, A_0 = {−7, −3, 1, 5} and A_1 = {−5, −1, 3, 7}, both having a minimum within-subset distance twice as large as that of the original PAM set.
and a subset A_1, both containing four elementary symbols, such that the minimum distance between two elementary symbols in A_0 or two symbols in A_1, i.e. the minimum within-subset distance, is 4, which is twice as large as the minimum distance between two elementary symbols in A, which is 2 (see figure 19.4). Note that the minimum distance between any symbol from A_0 and any symbol from A_1 is still 2.
How can set-partitioning lead to more reliable transmission? To see this, consider the example encoder shown in figure 19.5. This encoder transforms 20 binary information digits b_1, b_2, ···, b_20 into 8 elementary 8-PAM symbols a_1, a_2, ···, a_8. The digits b_1, b_2, b_3, b_4 are
Figure 19.5: An encoder that maps by set-partitioning.
transformed into a codeword c 1 , c 2 , · · · , c 8 from an (8,4,4) extended Hamming code. Digit c i<br />
together with two information digits b_{2i+3} and b_{2i+4} are used to form the PAM symbol a_i, according to the following table, which lists a_i for all combinations of c_i and b_{2i+3}, b_{2i+4}:

c_i \ b_{2i+3}b_{2i+4}    00   01   11   10
0                         −7   −3   +1   +5
1                         −5   −1   +3   +7

Note that according to this table one of the four elementary symbols from the set A_0 is chosen if c_i = 0, or one of the four elementary symbols from A_1 if c_i = 1. The binary digits b_{2i+3} and b_{2i+4} determine which symbol is picked.
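The table above translates directly into a small mapping function (an illustrative sketch of ours; the names MAP and map_symbol are our own, and the bit-pair ordering 00, 01, 11, 10 follows the table):

```python
# a_i as a function of the coded bit c_i and the uncoded pair (b_{2i+3}, b_{2i+4}),
# exactly as in the table: row c_i = 0 selects from A0 = {-7, -3, +1, +5},
# row c_i = 1 selects from A1 = {-5, -1, +3, +7}.
MAP = {
    (0, (0, 0)): -7, (0, (0, 1)): -3, (0, (1, 1)): +1, (0, (1, 0)): +5,
    (1, (0, 0)): -5, (1, (0, 1)): -1, (1, (1, 1)): +3, (1, (1, 0)): +7,
}

def map_symbol(c_i, b_pair):
    """Pick the elementary 8-PAM symbol a_i from subset A_{c_i}."""
    return MAP[(c_i, tuple(b_pair))]
```

Note that the coded bit c_i selects the subset, while the two uncoded bits select the symbol within it, which is exactly the set-partitioning idea.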
19.8.2 Distance between the signals

What is the minimum distance d_min between the signals s = s_1, s_2, ···, s_8 that is achieved in this way? Note that there are 2^20 such signals. Just like before, each signal (vector)

s = (d_0/2)a, (19.30)

thus each signal s is a scaled version of an elementary signal (vector) a = a_1, a_2, ···, a_8. To determine the minimum Euclidean distance between two signals s ≠ s′ we first determine the minimum distance between two elementary signals a ≠ a′. We consider two cases.
1. First consider all 2^16 elementary sequences a that result from a fixed codeword c from the extended Hamming code. For such a sequence each component a_i ∈ A_{c_i} for i = 1, 2, ···, 8. The within-subset distance in subsets A_0 and A_1 is 4. Therefore the minimum distance between two elementary sequences a and a′ resulting from the same codeword c is 4.
2. Next consider two elementary sequences a and a′ that correspond to two different codewords c and c′ from the extended Hamming code. Then there are at least d_H,min = 4 positions i ∈ {1, 2, ···, 8} at which c_i ≠ c′_i. At these positions the elementary symbol a_i ∈ A_{c_i} while a′_i ∈ A_{c′_i}. The minimum distance between any symbol from A_0 and any symbol from A_1 is 2, thus at these positions |a_i − a′_i| ≥ 2 and hence

‖a − a′‖² = ∑_{i=1,8} (a_i − a′_i)² ≥ 4 · 4 = 16. (19.31)

Therefore the distance between two elementary sequences a and a′ that correspond to two different codewords c and c′ is at least 4.
We conclude that the Euclidean distance between two elementary sequences a and a ′ is at least<br />
4. Therefore the Euclidean distance between two sequences s and s ′ is at least 4(d 0 /2) = 2d 0 .<br />
Hence d min = 2d 0 , which is a factor of two larger than in the uncoded case.<br />
19.8.3 Asymptotic coding gain<br />
We have seen in the previous section that the minimum distance between any two coded sequences for the Ungerboeck coding method is d_min = 2d_0. In uncoded PAM, our baseline method, the distance between any two symbols was at least d_0. We may conclude that by using Ungerboeck's idea the minimum Euclidean distance has increased by a factor of two.
Let us now compute the sequence error probability for our Ungerboeck coding technique. Is it indeed smaller than the baseline symbol error probability for the uncoded PAM case, described by (19.18)? We first note that by the union bound estimate (19.23) we can write for our coding method

P_E ≈ K_min Q((d_min/2)/σ). (19.32)

Now we express P_E in terms of E_N by noting that again

E_N = (d_0²/4) · (|A|² − 1)/3, (19.33)

where it is important to note that although we use coding, the PAM symbols remain equiprobable. Moreover

d_min²/(4σ²) = (d_min²/d_0²) · d_0²/(4σ²) = (d_min²/d_0²) · (3/(|A|² − 1)) · E_N/(N_0/2) = (d_min²/d_0²) · (3/(|A|² − 1)) · S/N. (19.34)
CHAPTER 19. CODING FOR BANDLIMITED CHANNELS 176<br />
Therefore<br />
P_E = K_min Q(√[(d_min²/d_0²) · (3/(|A|² − 1)) · S/N]) = K_min Q(√[(d_min²/d_0²) · ((2^{2R_N} − 1)/(|A|² − 1)) · 3 S/N_norm]), (19.35)<br />
where R_N is the rate of the signal set S, i.e.<br />
R_N = log₂|S|/N = K/N + log₂(|A|/2) = (2^q − q − 1)/2^q + log₂|A| − 1 = log₂|A| − (q + 1)/2^q. (19.36)<br />
Note that the rate R N approaches log 2 |A| for q → ∞, i.e. if we increase the complexity of the<br />
extended Hamming code.<br />
The argument of the Q(√·)-function in terms of S/N_norm was 3 S/N_norm in the uncoded<br />
PAM case. If we use Ungerboeck coding as we described here, it is (d_min²/d_0²) · ((2^{2R_N} − 1)/(|A|² − 1)) · 3 S/N_norm, hence<br />
we have achieved an increase of the argument by a factor G_cod^asymp of<br />
G_cod^asymp = (d_min²/d_0²) · (2^{2R_N} − 1)/(|A|² − 1) = 4 · (2^{2R_N} − 1)/(|A|² − 1). (19.37)<br />
This factor is called the asymptotic (or nominal) coding gain of our coding system with respect<br />
to the baseline performance.<br />
The factor d_min²/d_0² = 4 corresponds to the increase of the squared Euclidean distance<br />
between the sequences. This factor is referred to as the distance gain. This increase of the distance,<br />
however, was achieved by introducing redundancy via an error-correcting code. The factor<br />
(2^{2R_N} − 1)/(|A|² − 1) that corresponds to this loss is therefore called the redundancy loss.<br />
Example 19.3 The encoder shown in figure 19.5 is based on the (8, 4, 4) extended Hamming code. The<br />
rate of this extended Hamming code, corresponding to q = 3, is<br />
R_ext.Hamm. = K/N = (2^q − q − 1)/2^q = (2³ − 3 − 1)/2³ = 1/2. (19.38)<br />
Therefore the total rate R_N = R_ext.Hamm. + 2 = 5/2 bits/transmission. The additional 2 bits come from the<br />
fact that the subsets A_0 and A_1 each contain four symbols.<br />
The distance gain of this method is<br />
d_min²/d_0² = 4, (19.39)<br />
which is 10 log₁₀ 4 = 6.02 dB. The redundancy loss however is<br />
(2^{2R_N} − 1)/(|A|² − 1) = (2^{2·5/2} − 1)/(8² − 1) = 31/63, (19.40)<br />
which is 10 log₁₀(31/63) = −3.08 dB. Therefore the asymptotic coding gain for this code is<br />
G_cod^asymp = 10 log₁₀(4 · 31/63) = 6.02 dB − 3.08 dB = 2.94 dB. (19.41)
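The numbers in this example are easy to check. The sketch below (the function name is ours) evaluates the rate (19.36) and the asymptotic coding gain (19.37) for q = 3 and |A| = 8:<br />

```python
import math

def asymptotic_coding_gain_db(q, A):
    """Asymptotic coding gain (19.37) for the Ungerboeck scheme built on the
    (2^q, 2^q - q - 1, 4) extended Hamming code and a PAM alphabet of size A."""
    r_n = math.log2(A) - (q + 1) / 2 ** q       # rate (19.36), bits/transmission
    distance_gain = 4.0                          # d_min^2 / d_0^2
    redundancy_loss = (2 ** (2 * r_n) - 1) / (A ** 2 - 1)
    return 10 * math.log10(distance_gain * redundancy_loss), r_n

gain_db, rate = asymptotic_coding_gain_db(q=3, A=8)
# rate = 2.5 bits/transmission and gain_db is about 2.94 dB, as in example 19.3
```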
19.8.4 Effective coding gain<br />
In the previous subsection we have considered only the asymptotic coding gain, i.e. we have<br />
only concentrated on the argument of the Q-function (in the expressions for both P_E and P̃_E^s)<br />
and ignored its coefficient. How large are these coefficients? It can be shown that for the<br />
extended Hamming code with parameter q<br />
K_min(q) = [ (2^q choose 4) + (2^q − 1) · (2^{q−1} choose 2) ] / 2^q (19.42)<br />
is the number of codewords at Hamming distance 4 from a codeword. We have already seen that<br />
the minimum Euclidean distance of our Ungerboeck code is 2d_0. How many sequences have<br />
distance 2d_0 to a given sequence, in other words, how large is our K_min? To see this, suppose that<br />
a sequence s is the result of extended Hamming codeword c. Note first that there are at most 2N<br />
signals s′ that result from the same codeword c and have distance 2d_0 from s. Moreover, for each<br />
extended Hamming codeword c′ at Hamming distance 4 from c there are at most 2⁴ sequences s′<br />
having distance 2d_0 from s. Therefore in total<br />
K_min ≤ 2N + 2⁴ · K_min(q). (19.43)<br />
Note that the normalized error coefficient is K_min/N. This normalized error coefficient<br />
K_min/N is larger than the coefficient 2 in the uncoded case. What is the effect of this larger<br />
coefficient in dB? In other words, by how much do we have to increase S/N_norm to compensate<br />
for the larger K_min/N? Recall that we are interested in achieving a symbol error probability<br />
of 10⁻⁶. To achieve the baseline performance the argument α of Q(√·) has to satisfy<br />
2 Q(√α) = 10⁻⁶, (19.44)<br />
in other words α = 23.928. For an alternative coding method with normalized error coefficient<br />
K_min/N we need an argument β that satisfies<br />
K_min/N · Q(√β) = 10⁻⁶, (19.45)<br />
or an S/N_norm that is a factor β/α larger, to get the same performance. Therefore the coding loss<br />
L that we have to take into account is<br />
L = 10 log₁₀(β/α). (19.46)<br />
In practice it turns out that if the normalized error coefficient K min /N is increased by a factor of<br />
two, we have to accept an additional loss of roughly 0.2dB (at error rate 10 −6 ).<br />
The resulting coding gain, called the effective coding gain, is the difference of the asymptotic<br />
coding gain and the loss, hence<br />
G_cod^eff = G_cod^asymp − L = G_cod^asymp − 10 log₁₀(β/α). (19.47)
[Figure 19.6 consists of four panels plotted versus q = 2, …, 6: the rate R_N (bits/transm.), the asymptotic coding gain (dB), the normalized error coefficient K_min/N, and the effective coding gain (dB).]<br />
Figure 19.6: Rate, asymptotic coding gain (in dB), normalized error coefficient, and effective<br />
coding gain (in dB) versus q.<br />
Example 19.4 Again consider the encoder shown in figure 19.5 which is based on the (8, 4, 4) extended<br />
Hamming code. We can show that K_min(q), i.e. the number of “nearest neighbors” in the extended Hamming<br />
code, for q = 3, is<br />
K_min(3) = [ (2³ choose 4) + (2³ − 1) · (2^{3−1} choose 2) ] / 2³ = 14. (19.48)<br />
This results in the following bound for the error coefficient in the set S of coded sequences:<br />
K_min ≤ 2N + 2⁴ · K_min(3) = 2 · 8 + 16 · 14 = 240. (19.49)<br />
The normalized error coefficient K_min/N is therefore at most 240/8 = 30. Now we have to solve<br />
K_min/N · Q(√β) = 30 · Q(√β) = 10⁻⁶. (19.50)<br />
It turns out that β = 29.159 and thus the loss is 10 log₁₀(29.159/23.928) = 0.86 dB. Therefore the<br />
effective coding gain G_cod^eff = G_cod^asymp − 0.86 dB = 2.94 dB − 0.86 dB = 2.08 dB.
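The numbers in this example can be reproduced by solving (19.44) and (19.50) numerically. In the sketch below the bisection helper is ours, not from the text; Q is expressed through the standard identity Q(x) = ½ erfc(x/√2):<br />

```python
import math

def Q(x):
    """Gaussian tail function via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def solve_argument(coeff, target=1e-6):
    """Find beta such that coeff * Q(sqrt(beta)) = target, by bisection.
    (This helper is ours, not part of the text.)"""
    lo, hi = 0.0, 100.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if coeff * Q(math.sqrt(mid)) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

alpha = solve_argument(2.0)     # baseline (19.44): 2 Q(sqrt(alpha)) = 1e-6
beta = solve_argument(30.0)     # coded (19.50):  30 Q(sqrt(beta))  = 1e-6
loss_db = 10 * math.log10(beta / alpha)   # coding loss (19.46), about 0.86 dB
g_eff = 2.94 - loss_db                    # effective gain (19.47), about 2.08 dB
```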
19.8.5 Coding gains for more complex codes<br />
For the values 2, 3, · · · , 6 of the parameter q that determines the extended Hamming code we<br />
have computed rates, normalized error coefficients, and asymptotic and effective coding gains of<br />
our Ungerboeck codes. These data can be found in the table below.
q   N    R_N     G_cod^asymp   K_min/N   L (dB)   G_cod^eff<br />
2   4    2.250   1.38 dB       6         0.37     1.01 dB<br />
3   8    2.500   2.94 dB       30        0.86     2.08 dB<br />
4   16   2.687   4.10 dB       142       1.29     2.81 dB<br />
5   32   2.813   4.87 dB       622       1.66     3.21 dB<br />
6   64   2.891   5.35 dB       2606      1.99     3.36 dB<br />
The results from the table are also shown in figure 19.6. It can be observed that the rate R N of<br />
our code tends to 3 bits/transmission if q increases. Moreover we see that the asymptotic coding<br />
gain approaches 6dB. The effective coding gain is considerably smaller however. The reason for<br />
this is that the normalized error coefficient increases very quickly with q.<br />
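The rows of the table above can be reproduced from (19.36), (19.37), (19.42) and (19.43). The sketch below (the function name is ours) recomputes N, the rate, the asymptotic coding gain, and the bound on the normalized error coefficient:<br />

```python
from math import comb, log2, log10

def table_row(q, A=8):
    """One row of the coding-gain table: (N, R_N, G_asymp in dB, K_min/N bound)."""
    n = 2 ** q                                   # code length N = 2^q
    r_n = log2(A) - (q + 1) / n                  # rate (19.36)
    g_asymp = 10 * log10(4 * (2 ** (2 * r_n) - 1) / (A ** 2 - 1))   # (19.37)
    k_min_q = (comb(n, 4) + (n - 1) * comb(n // 2, 2)) // n          # (19.42)
    k_norm = (2 * n + 2 ** 4 * k_min_q) // n     # bound (19.43), per symbol
    return n, r_n, g_asymp, k_norm

for q in range(2, 7):
    print(q, table_row(q))
```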
19.9 Remark<br />
To keep things simple we have only discussed one-dimensional signal structures here. Ungerboeck’s<br />
method of set-partitioning, however, also applies to two- and more-dimensional structures.<br />
19.10 Exercises<br />
1. Consider instead of the extended Hamming code a so-called single-parity-check code of<br />
length 4. This code contains 8 codewords and has distance d H,min = 2. A codeword<br />
now has four binary components c 1 , c 2 , c 3 , c 4 and is determined by three information bits<br />
b 1 , b 2 , b 3 .<br />
Figure 19.7: Partitioning a 16-QAM constellation into two subsets.<br />
As before we use set-partitioning, but now we partition a 16-QAM constellation into two<br />
subsets A 0 and A 1 (see figure). Note that the signals in these sets are two-dimensional.
Each component c_i, i = 1, 2, 3, 4, of the chosen codeword c determines the subset A_{c_i} that<br />
is used, i.e. if c_i = 0 then A_0 is used, and if c_i = 1 we use A_1. Three binary digits<br />
b_{3i+1}, b_{3i+2}, b_{3i+3} then determine which one of the (two-dimensional) signal points<br />
(a_{2i−1}, a_{2i}) from the subset A_{c_i} is transmitted. Note that N = 8, since four two-dimensional<br />
signals are transmitted.<br />
As before the elementary signal a = a 1 , a 2 , · · · , a 8 is multiplied by d 0 /2 to get the actual<br />
signal s.<br />
(a) What is the minimum Euclidean distance d min now? What is the rate R N of this code<br />
in bits per transmission? Give the asymptotic coding gain of this code (relative to the<br />
baseline performance).
Part VI<br />
Appendices<br />
Appendix A<br />
An upper bound for the Q-function<br />
Note that the Q-function is defined as<br />
Q(x) = ∫_x^∞ (1/√(2π)) exp(−α²/2) dα. (A.1)<br />
The lemma below gives a useful upper bound for Q(x). In figure A.1 the Q-function and the<br />
derived upper bound are plotted.<br />
LEMMA A.1 For x ≥ 0 the Q-function is bounded as<br />
Q(x) ≤ (1/2) exp(−x²/2). (A.2)<br />
Proof: Note that for α ≥ x, and since x ≥ 0,<br />
α² = (α − x)² + 2αx − x² ≥ (α − x)² + 2x² − x² = (α − x)² + x². (A.3)<br />
Therefore<br />
Q(x) = ∫_x^∞ (1/√(2π)) exp(−α²/2) dα<br />
≤ ∫_x^∞ (1/√(2π)) exp(−((α − x)² + x²)/2) dα<br />
= exp(−x²/2) ∫_x^∞ (1/√(2π)) exp(−(α − x)²/2) dα<br />
= (1/2) exp(−x²/2). (A.4)<br />
✷<br />
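The bound of lemma A.1 is easy to verify numerically; the sketch below expresses Q through the standard identity Q(x) = ½ erfc(x/√2):<br />

```python
import math

def Q(x):
    """Gaussian tail function Q(x), via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def Q_upper_bound(x):
    """The bound (A.2): (1/2) exp(-x^2 / 2), valid for x >= 0."""
    return 0.5 * math.exp(-x * x / 2.0)

# The bound holds with equality at x = 0 and becomes loose for large x.
for x in [0.0, 0.5, 1.0, 2.0, 3.0, 4.0, 5.0]:
    assert Q(x) <= Q_upper_bound(x)
```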
APPENDIX A. AN UPPER BOUND FOR THE Q-FUNCTION 183<br />
Figure A.1: The Q-function and the derived upper bound (logarithmic vertical axis from 10⁰ down to 10⁻⁷, horizontal axis 0 ≤ x ≤ 5).<br />
Appendix B<br />
The Fourier transform<br />
B.1 Definition<br />
THEOREM B.1 If the signal x(t) satisfies the Dirichlet conditions, that is,<br />
1. The signal x(t) is absolutely integrable on the real line, i.e. ∫_{−∞}^{∞} |x(t)| dt < ∞.<br />
2. The number of maxima and minima of x(t) in any finite interval on the real line is finite.<br />
3. The number of discontinuities of x(t) in any finite interval on the real line is finite.<br />
4. Where x(t) is discontinuous at t it satisfies x(t) = (x(t⁺) + x(t⁻))/2,<br />
then the Fourier transform (or Fourier integral) of x(t), defined by<br />
X(f) = ∫_{−∞}^{∞} x(t) exp(−j2π f t) dt (B.1)<br />
exists and the original signal can be obtained from its Fourier transform by<br />
x(t) = ∫_{−∞}^{∞} X(f) exp(j2π f t) df. (B.2)<br />
B.2 Some properties<br />
B.2.1 Parseval’s relation<br />
RESULT B.2 If the Fourier transforms of the signals f(t) and g(t) are denoted by F(f) and<br />
G(f) respectively, then<br />
∫_{−∞}^{∞} f(t)g(t) dt = ∫_{−∞}^{∞} F(f)G*(f) df. (B.3)<br />
APPENDIX B. THE FOURIER TRANSFORM 185<br />
Proof: Note that for a real signal g(t) the Fourier transform G(f) satisfies G(−f) = G*(f).<br />
Then<br />
∫_{−∞}^{∞} f(t)g(t) dt = ∫_{−∞}^{∞} [∫_{−∞}^{∞} F(f) exp(j2π f t) df] [∫_{−∞}^{∞} G(f) exp(j2π f t) df] dt<br />
= ∫_{−∞}^{∞} [∫_{−∞}^{∞} F(f) exp(j2π f t) df] [∫_{−∞}^{∞} G*(f′) exp(−j2π f′ t) df′] dt<br />
= ∫_{−∞}^{∞} F(f) [∫_{−∞}^{∞} G*(f′) [∫_{−∞}^{∞} exp(j2π t (f − f′)) dt] df′] df. (B.4)<br />
We know that<br />
∫_{−∞}^{∞} exp(j2π t (f − f′)) dt = δ(f − f′), (B.5)<br />
therefore<br />
∫_{−∞}^{∞} f(t)g(t) dt = ∫_{−∞}^{∞} F(f) [∫_{−∞}^{∞} G*(f′) δ(f − f′) df′] df = ∫_{−∞}^{∞} F(f)G*(f) df. (B.6)<br />
✷
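Parseval's relation has an exact discrete counterpart that can be checked with the FFT: with the DFT convention X_k = ∑_t x_t exp(−j2πkt/n) it reads ∑_t f_t g_t = (1/n) ∑_k F_k G_k*. A minimal sketch:<br />

```python
import numpy as np

rng = np.random.default_rng(0)
n = 256
f = rng.standard_normal(n)   # real example "signals" sampled at unit rate
g = rng.standard_normal(n)

F = np.fft.fft(f)
G = np.fft.fft(g)

time_side = np.sum(f * g)                    # sum_t f_t g_t
freq_side = np.sum(F * np.conj(G)).real / n  # (1/n) sum_k F_k G_k^*
```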
Appendix C<br />
Impulse signal, filters<br />
C.1 The impulse or delta signal<br />
In the mathematical sense (see e.g. [17]) the delta signal δ(t) is not a function but a distribution or a<br />
generalized function. A distribution is defined in terms of its effect on another function (usually<br />
called a “test function”). The impulse distribution can be defined by its effect on the test function<br />
f(t), which is supposed to be continuous at the origin, by the relation<br />
∫_{−∞}^{∞} f(t)δ(t) dt = f(0). (C.1)<br />
We call this property the sifting property of the impulse signal. Note that we defined the impulse<br />
signal δ(t) by describing its action on the test function f(t) and not by specifying its value for<br />
different values of t.<br />
C.2 Linear time-invariant systems, filters<br />
Definition C.1 A system L is linear if and only if for any two legitimate input signals x 1 (t)<br />
and x 2 (t) and any two scalars α 1 and α 2 , the linear combination α 1 x 1 (t) + α 2 x 2 (t) is again a<br />
legitimate input, and<br />
L[α 1 x 1 (t) + α 2 x 2 (t)] = α 1 L[x 1 (t)] + α 2 L[x 2 (t)].<br />
(C.2)<br />
A system that does not satisfy this relation is nonlinear.<br />
Definition C.2 The impulse response h(t) of a system L is the response of the system to an<br />
impulse input δ(t), thus<br />
h(t) = L[δ(t)]. (C.3)<br />
The response of a time-invariant system to a unit impulse applied at<br />
time τ, i.e. δ(t − τ), is obviously h(t − τ).<br />
APPENDIX C. IMPULSE SIGNAL, FILTERS 187<br />
Now we can determine the response of a linear time-invariant system L to an input signal<br />
x(t) as follows:<br />
y(t) = L[x(t)] = L[∫_{−∞}^{∞} x(τ)δ(t − τ) dτ]<br />
= ∫_{−∞}^{∞} x(τ) L[δ(t − τ)] dτ<br />
= ∫_{−∞}^{∞} x(τ) h(t − τ) dτ. (C.4)<br />
The final integral is called the convolution of the signal x(t) and the impulse response h(t).<br />
A linear time-invariant system is often called a filter.
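In discrete time the convolution (C.4) and the defining properties of C.1 and C.2 can be illustrated directly; the impulse response and input signals below are arbitrary examples:<br />

```python
import numpy as np

def lti_response(x, h):
    """Discrete analogue of (C.4): the output is the convolution of the input
    with the impulse response."""
    return np.convolve(x, h)

h = np.array([0.5, 0.3, 0.2])           # an example impulse response
delta = np.array([1.0, 0.0, 0.0, 0.0])  # discrete impulse

# By definition C.2, an impulse input reproduces h at the output.
y_impulse = lti_response(delta, h)

# Linearity (definition C.1): the response to a1*x1 + a2*x2 equals
# a1*L[x1] + a2*L[x2].
x1 = np.array([1.0, 2.0, 3.0])
x2 = np.array([0.0, -1.0, 1.0])
lhs = lti_response(2.0 * x1 + 3.0 * x2, h)
rhs = 2.0 * lti_response(x1, h) + 3.0 * lti_response(x2, h)
```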
Appendix D<br />
Correlation functions, power spectra<br />
We will investigate here the influence of a filter with impulse response h(t) on the spectrum<br />
S_x(f) of the input random process X(t). Then we study the consequences of our findings.<br />
Consider figure D.1. The filter produces the random process Y (t) at its output while the input<br />
process is X (t).<br />
Figure D.1: Filtering the noise process X(t).<br />
D.1 Expectation of an integral<br />
For the expected value of the random process Y(t) we can write¹<br />
m_y(t) = E[Y(t)] = E[∫_{−∞}^{∞} X(α)h(t − α) dα] = ∫_{−∞}^{∞} E[X(α)] h(t − α) dα. (D.1)<br />
The autocorrelation function of the random process is<br />
R_y(t, s) = E[Y(t)Y(s)] = E[∫_{−∞}^{∞} X(α)h(t − α) dα ∫_{−∞}^{∞} X(β)h(s − β) dβ]<br />
= ∫_{−∞}^{∞} ∫_{−∞}^{∞} E[X(α)X(β)] h(t − α)h(s − β) dα dβ<br />
= ∫_{−∞}^{∞} ∫_{−∞}^{∞} R_x(α, β) h(t − α)h(s − β) dα dβ. (D.2)<br />
¹ Note that we do not assume (yet) that the process X(t) is Gaussian.<br />
APPENDIX D. CORRELATION FUNCTIONS, POWER SPECTRA 189<br />
D.2 Power spectrum<br />
To study the effect of filtering a random process we assume that R x (t, s) = R x (τ) with τ =<br />
t − s, i.e. the autocorrelation function R x (t, s) of the input random process depends only on the<br />
difference in time t − s between the sample times t and s. If this condition is satisfied we can<br />
investigate the distribution of the mean power of a random process as a function of frequency.<br />
Now<br />
R_y(t, s) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} R_x(α − β) h(t − α)h(s − β) dα dβ<br />
= ∫_{−∞}^{∞} ∫_{−∞}^{∞} R_x(t − s + µ − ν) h(ν)h(µ) dν dµ, (D.3)<br />
with ν = t − α and µ = s − β. Observe that now also R_y(t, s) only depends on the time<br />
difference τ = t − s, hence R_y(t, s) = R_y(τ).<br />
Next consider the Fourier transforms S_x(f) and S_y(f) of R_x(τ) and R_y(τ) respectively (see<br />
appendix B):<br />
S_x(f) = ∫_{−∞}^{∞} R_x(τ) exp(−j2π f τ) dτ and R_x(τ) = ∫_{−∞}^{∞} S_x(f) exp(j2π f τ) df,<br />
S_y(f) = ∫_{−∞}^{∞} R_y(τ) exp(−j2π f τ) dτ and R_y(τ) = ∫_{−∞}^{∞} S_y(f) exp(j2π f τ) df. (D.4)<br />
Note that these Fourier transforms can only be defined if the correlation functions depend only<br />
on the time difference τ = t − s. Next we obtain<br />
R_y(τ) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} R_x(τ + µ − ν) h(ν)h(µ) dν dµ<br />
= ∫_{−∞}^{∞} ∫_{−∞}^{∞} [∫_{−∞}^{∞} S_x(f) exp(j2π f (τ + µ − ν)) df] h(ν)h(µ) dν dµ<br />
= ∫_{−∞}^{∞} S_x(f) [∫_{−∞}^{∞} h(ν) exp(−j2π f ν) dν] [∫_{−∞}^{∞} h(µ) exp(j2π f µ) dµ] exp(j2π f τ) df<br />
= ∫_{−∞}^{∞} S_x(f) H(f) H*(f) exp(j2π f τ) df, (D.5)<br />
where we used that, for a real impulse response h(t), ∫_{−∞}^{∞} h(µ) exp(j2π f µ) dµ = H(−f) = H*(f). Hence<br />
S_y(f) = S_x(f)H(f)H*(f) = S_x(f)|H(f)|². (D.6)<br />
D.3 Interpretation<br />
First note that the mean square value of the filter output process Y (t) is time-independent if<br />
R x (t, s) = R x (t − s). This follows from<br />
E[Y 2 (t)] = R y (t, t) = R y (t − t) = R y (0).<br />
(D.7)
Figure D.2: An ideal bandpass filter (|H(f)| = 1 for f_1 ≤ |f| ≤ f_2 and 0 elsewhere).<br />
Therefore R_y(0) is the expected value of the power that is dissipated in a resistor of 1 Ω connected<br />
to the output of the filter, at any time instant. Next note that<br />
R_y(0) = ∫_{−∞}^{∞} S_y(f) df = ∫_{−∞}^{∞} S_x(f)|H(f)|² df. (D.8)<br />
Consider a filter with transfer function H(f) shown in figure D.2. This is a bandpass filter and<br />
H(f) = 1 for f_1 ≤ |f| ≤ f_2, and 0 elsewhere. (D.9)<br />
Then we obtain<br />
R_y(0) = ∫_{−f_2}^{−f_1} S_y(f) df + ∫_{f_1}^{f_2} S_y(f) df. (D.10)<br />
Now let f_1 = f and f_2 = f + Δf. The filter H(f) now only transfers components of X(t)<br />
in the frequency band (f, f + Δf) and stops all other components. Since a power spectrum<br />
is always even, see section D.6, the expected power at the output of the filter is approximately<br />
2 S_x(f) Δf. This implies that S_x(f) describes the distribution of the average power in the process X(t)<br />
over the frequencies. Therefore we call S_x(f) the power spectral density function of X(t).<br />
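Equation (D.6) can be illustrated in discrete time. For a white input (R_x(τ) = δ(τ), so S_x(f) = 1) the output autocorrelation is the deterministic autocorrelation of the impulse response, and its transform has magnitude |H(f)|². The filter taps below are an arbitrary example:<br />

```python
import numpy as np

h = np.array([0.4, 0.3, 0.2, 0.1])   # arbitrary real impulse response
nfft = 64

# White input: R_x(tau) = delta(tau), S_x(f) = 1.  Then (D.2) gives
# R_y(tau) = sum_t h(t) h(t - tau), the autocorrelation of h:
r_y = np.convolve(h, h[::-1])

# (D.6) predicts S_y(f) = S_x(f) |H(f)|^2 = |H(f)|^2.  The FFT of r_y carries
# a linear phase from the index offset of the convolution, so we compare
# magnitudes:
S_y = np.abs(np.fft.fft(r_y, nfft))
H = np.fft.fft(h, nfft)
```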
D.4 Wide-sense stationarity<br />
Definition D.1 A random process Z(t) is called wide-sense stationary (WSS) if and only if<br />
m_z(t) = constant and R_z(t, s) = R_z(t − s). (D.11)<br />
For a WSS process Z(t) it is meaningful to consider its power spectral density S z ( f ). For<br />
a WSS process Y (t) the expected power can be expressed as in equation (D.7). When a WSS<br />
process X (t) is the input of a filter then the filter output process Y (t) is also WSS and its power<br />
spectral density S y ( f ) is related to the input power spectral density S x ( f ) as given by (D.8).<br />
These consequences already justify the concept of wide-sense stationarity.
D.5 Gaussian processes<br />
Recall that a Gaussian process Z(t) is completely specified by its mean m z (t) and autocorrelation<br />
function R z (t, s) for all t and s. Therefore a WSS (wide-sense stationary) Gaussian process Z(t)<br />
is also (strict-sense) stationary.<br />
Furthermore note that when a Gaussian process X (t) is the input of a filter then the filter<br />
output process Y(t) is also Gaussian. Remember that the output process is WSS when the input<br />
process is WSS.<br />
D.6 Properties of S x ( f ) and R x (τ)<br />
Let X (t) be a WSS random process again.<br />
a) First observe that R x (τ) is a (real) even function of τ. This follows from<br />
R x (−τ) = E[X (t − τ)X (t)] = E[X (t)X (t − τ)] = R x (τ).<br />
(D.12)<br />
b) Next we show that S_x(f) is a real even function of f. Since R_x(τ) is an even function and<br />
sin(2π f τ) an odd function of τ,<br />
∫_{−∞}^{∞} R_x(τ) sin(2π f τ) dτ = 0. (D.13)<br />
Therefore<br />
S_x(f) = ∫_{−∞}^{∞} R_x(τ) exp(−j2π f τ) dτ<br />
= ∫_{−∞}^{∞} R_x(τ)[cos(2π f τ) − j sin(2π f τ)] dτ<br />
= ∫_{−∞}^{∞} R_x(τ) cos(2π f τ) dτ, (D.14)<br />
which is even and real.<br />
c) The power spectral density S x ( f ) is a non-negative function of the frequency f . To see<br />
why this statement is true suppose that S x ( f ) is negative for f 1 < | f | < f 2 . Then consider an<br />
ideal bandpass filter H( f ) as given by (D.9). The expected power of the output Y (t) of this filter<br />
when the input is the random process X (t) is given by<br />
E[Y²(t)] = R_y(0) = 2 ∫_{f_1}^{f_2} S_x(f) df < 0, (D.15)<br />
which yields a contradiction.<br />
d) Without proof we give another property of R x (τ). It can be shown that<br />
|R x (τ)| ≤ R x (0). (D.16)
Appendix E<br />
Gram-Schmidt procedure, proof, example<br />
We will only prove result 6.2 for signal sets with signals s_m(t) ≢ 0 for all m ∈ M. The statement<br />
then also holds for sets that do contain all-zero signals.<br />
E.1 Proof of result 6.2<br />
The proof is based on induction.<br />
1. First we will show that the induction hypothesis holds for one signal. To this end consider<br />
s_1(t) and take<br />
ϕ_1(t) = s_1(t)/√(E_{s1}) with E_{s1} = ∫_0^T s_1²(t) dt. (E.1)<br />
Note that by doing so<br />
∫_0^T ϕ_1²(t) dt = (1/E_{s1}) ∫_0^T s_1²(t) dt = 1, (E.2)<br />
i.e. ϕ_1(t) has unit energy (is normal). Since s_1(t) = √(E_{s1}) ϕ_1(t) the coefficient s_11 = √(E_{s1}).<br />
2. Now suppose that the induction hypothesis holds for m − 1 ≥ 1 signals. Thus there exists an<br />
orthonormal basis ϕ_1(t), ϕ_2(t), …, ϕ_{n−1}(t) for the signals s_1(t), s_2(t), …, s_{m−1}(t) with<br />
n − 1 ≤ m − 1. Now consider the next signal s_m(t) and an auxiliary signal θ_m(t) which is<br />
defined as follows:<br />
θ_m(t) = s_m(t) − ∑_{i=1,…,n−1} s_{mi} ϕ_i(t) (E.3)<br />
with<br />
s_{mi} = ∫_0^T s_m(t)ϕ_i(t) dt for i = 1, …, n − 1. (E.4)<br />
We can distinguish between two cases now:<br />
(a) If θ_m(t) ≡ 0 then s_m(t) = ∑_{i=1,…,n−1} s_{mi}ϕ_i(t) and the induction hypothesis also holds<br />
for m signals in this case.<br />
APPENDIX E. GRAM-SCHMIDT PROCEDURE, PROOF, EXAMPLE 193<br />
(b) If on the other hand θ_m(t) ≢ 0 then take a new building-block waveform<br />
ϕ_n(t) = θ_m(t)/√(E_{θm}) with E_{θm} = ∫_0^T θ_m²(t) dt. (E.5)<br />
By doing so<br />
∫_0^T ϕ_n²(t) dt = (1/E_{θm}) ∫_0^T θ_m²(t) dt = 1, (E.6)<br />
i.e. also ϕ_n(t) has unit energy. Moreover for all i = 1, …, n − 1<br />
∫_0^T ϕ_n(t)ϕ_i(t) dt = (1/√(E_{θm})) ∫_0^T θ_m(t)ϕ_i(t) dt<br />
= (1/√(E_{θm})) (∫_0^T s_m(t)ϕ_i(t) dt − ∑_{j=1,…,n−1} s_{mj} ∫_0^T ϕ_j(t)ϕ_i(t) dt)<br />
= (1/√(E_{θm})) (s_{mi} − ∑_{j=1,…,n−1} s_{mj} δ_{ji}) = 0, (E.7)<br />
and therefore ϕ_n(t) is orthogonal to all ϕ_i(t).<br />
Note that now<br />
s_m(t) = θ_m(t) + ∑_{i=1,…,n−1} s_{mi}ϕ_i(t) = √(E_{θm}) ϕ_n(t) + ∑_{i=1,…,n−1} s_{mi}ϕ_i(t) = ∑_{i=1,…,n} s_{mi}ϕ_i(t) (E.8)<br />
with s_{mn} = √(E_{θm}). Thus also in this case for m signals there exists an orthonormal<br />
basis with n ≤ m building-block waveforms. Therefore the induction hypothesis also<br />
holds for m signals.<br />
It should be noted that, when using the Gram-Schmidt procedure, any ordering of the<br />
signals other than s_1(t), s_2(t), …, s_{|M|}(t) will also yield a basis, i.e. a set of building-block<br />
waveforms, of the smallest possible dimensionality (see exercise 1 in chapter 6), however<br />
in general with different building-block waveforms.<br />
E.2 An example<br />
We will next discuss an example in which we actually carry out the Gram-Schmidt procedure for<br />
a given set of waveforms.
Figure E.1: An example of the Gram-Schmidt procedure (the waveforms s_1(t), s_2(t), s_3(t), the building-block waveforms ϕ_1(t), ϕ_2(t), and the auxiliary signals θ_2(t) and θ_3(t) ≡ 0).<br />
Example E.1 Consider the three waveforms s_m(t), m = 1, 2, 3, shown in figure E.1. Note that T = 3.<br />
We first determine the energy E_1 of the first signal s_1(t):<br />
E_1 = 4 + 4 + 4 = 12. (E.9)<br />
Therefore the first building-block waveform ϕ_1(t) is<br />
ϕ_1(t) = s_1(t)/√(E_1) = s_1(t)/(2√3), and thus s_11 = 2√3. (E.10)<br />
Now we continue with the second signal s_2(t). First we determine the projection s_21 of s_2(t) on the first<br />
dimension, i.e. the dimension that corresponds to ϕ_1(t):<br />
s_21 = ∫_0^T s_2(t)ϕ_1(t) dt = (1/√3)(−1 − 3 + 1) = −√3, (E.11)
Figure E.2: A vector representation of the signals s_m(t), m = 1, 2, 3, of figure E.1.<br />
hence the auxiliary signal θ_2(t) is now<br />
θ_2(t) = s_2(t) + √3 ϕ_1(t). (E.12)<br />
Next we determine the energy E_{θ2} of the auxiliary signal θ_2(t):<br />
E_{θ2} = 4 + 4 = 8, (E.13)<br />
and the second building-block waveform ϕ_2(t) is<br />
ϕ_2(t) = θ_2(t)/√(E_{θ2}) = θ_2(t)/(2√2). (E.14)<br />
Now we obtain for the signal s_2(t):<br />
s_2(t) = θ_2(t) − √3 ϕ_1(t) = 2√2 ϕ_2(t) − √3 ϕ_1(t), thus s_22 = 2√2. (E.15)<br />
For the third signal s_3(t) we can now determine the projections s_31 and s_32 and the auxiliary signal θ_3(t):<br />
s_31 = ∫_0^T s_3(t)ϕ_1(t) dt = (1/√3)(−1 + 1 − 3) = −√3,<br />
s_32 = ∫_0^T s_3(t)ϕ_2(t) dt = (1/√2)(−1 − 3) = −2√2, hence<br />
θ_3(t) = s_3(t) + √3 ϕ_1(t) + 2√2 ϕ_2(t) ≡ 0. (E.16)<br />
Since this auxiliary signal is identically 0 we can express the third signal s_3(t) in terms of ϕ_1(t) and ϕ_2(t) as<br />
follows:<br />
s_3(t) = −√3 ϕ_1(t) − 2√2 ϕ_2(t), (E.17)<br />
hence the coefficients corresponding to s_3(t) are<br />
s_31 = −√3 and s_32 = −2√2. (E.18)<br />
We have now obtained three vectors of coefficients, one for each signal:<br />
s_1 = (s_11, s_12) = (2√3, 0),<br />
s_2 = (s_21, s_22) = (−√3, 2√2), and<br />
s_3 = (s_31, s_32) = (−√3, −2√2). (E.19)<br />
These vectors are depicted in figure E.2.
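The whole procedure of example E.1 can be reproduced numerically. Since the waveforms are piecewise constant on unit intervals, the integral inner products reduce to sums over three samples; note that the sample values below are read off figure E.1 and are an assumption here, chosen to match the inner products used in the example:<br />

```python
import math

# Piecewise-constant values of the waveforms of figure E.1 on the unit
# intervals [0,1), [1,2), [2,3).  These values are assumptions read off the
# figure; they reproduce the inner products of example E.1.
signals = [
    [2.0, 2.0, 2.0],     # s_1(t): energy 4 + 4 + 4 = 12
    [-1.0, -3.0, 1.0],   # s_2(t)
    [-1.0, 1.0, -3.0],   # s_3(t)
]

def inner(u, v):
    # With unit-length intervals the integral inner product is a plain sum.
    return sum(a * b for a, b in zip(u, v))

basis = []      # building-block waveforms phi_i
coeffs = []     # coefficient vector of each signal
for s_m in signals:
    c = [inner(s_m, phi) for phi in basis]                 # projections (E.4)
    theta = [x - sum(ci * phi[k] for ci, phi in zip(c, basis))
             for k, x in enumerate(s_m)]                   # auxiliary signal (E.3)
    energy = inner(theta, theta)
    if energy > 1e-9:                                      # theta not identically 0
        basis.append([x / math.sqrt(energy) for x in theta])   # (E.5)
        c.append(math.sqrt(energy))                        # s_mn = sqrt(E_theta)
    coeffs.append(c)
# coeffs reproduces (E.19): [2*sqrt(3)], [-sqrt(3), 2*sqrt(2)], [-sqrt(3), -2*sqrt(2)]
```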
Appendix F<br />
Schwarz inequality<br />
LEMMA F.1 (Schwarz inequality) For two finite-energy waveforms a(t) and b(t) the inequality<br />
(∫_{−∞}^{∞} a(t)b(t) dt)² ≤ ∫_{−∞}^{∞} a²(t) dt · ∫_{−∞}^{∞} b²(t) dt (F.1)<br />
holds. Equality is obtained only if b(t) ≡ C a(t) for some constant C.<br />
Proof: Form an orthonormal expansion for a(t) and b(t), i.e.<br />
a(t) = a_1 φ_1(t) + a_2 φ_2(t),<br />
b(t) = b_1 φ_1(t) + b_2 φ_2(t), (F.2)<br />
with ∫_{−∞}^{∞} φ_i(t)φ_j(t) dt = δ_{ij} for i = 1, 2 and j = 1, 2.<br />
Then with a = (a_1, a_2) and b = (b_1, b_2) and the Parseval results (a · b) = ∫_{−∞}^{∞} a(t)b(t) dt,<br />
(a · a) = ∫_{−∞}^{∞} a²(t) dt, and (b · b) = ∫_{−∞}^{∞} b²(t) dt, we find for the vectors a and b that<br />
(a · b)/(‖a‖‖b‖) = ∫_{−∞}^{∞} a(t)b(t) dt / (√(∫_{−∞}^{∞} a²(t) dt) √(∫_{−∞}^{∞} b²(t) dt)). (F.3)<br />
If we now can prove that (a · b)² ≤ ‖a‖²‖b‖² we get the Schwarz inequality.<br />
We start by stating the triangle inequality for a and b:<br />
‖a + b‖ ≤ ‖a‖ + ‖b‖. (F.4)<br />
Moreover<br />
‖a + b‖² = (a + b) · (a + b) = ‖a‖² + ‖b‖² + 2(a · b), (F.5)<br />
hence<br />
‖a‖² + ‖b‖² + 2(a · b) ≤ ‖a‖² + ‖b‖² + 2‖a‖‖b‖, (F.6)<br />
and therefore<br />
(a · b) ≤ ‖a‖‖b‖. (F.7)
APPENDIX F. SCHWARZ INEQUALITY 198<br />
From this it also follows that<br />
−(a · b) = (a · (−b)) ≤ ‖a‖‖−b‖ = ‖a‖‖b‖, (F.8)<br />
or<br />
(a · b) ≥ −‖a‖‖b‖. (F.9)<br />
We therefore may conclude that −‖a‖‖b‖ ≤ (a · b) ≤ ‖a‖‖b‖, or (a · b)² ≤ ‖a‖²‖b‖², and this<br />
proves the Schwarz inequality.<br />
Equality in the Schwarz inequality is achieved only when equality in the triangle inequality is<br />
obtained. This is the case when b = Ca (first part of the proof) or −b = Ca (second part) for<br />
some non-negative C.<br />
✷
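By (F.3) it suffices to check the inequality for coefficient vectors; the sketch below uses arbitrary example vectors:<br />

```python
import math

def inner(u, v):
    """Plain inner product of coefficient vectors, cf. (a . b)."""
    return sum(x * y for x, y in zip(u, v))

# Arbitrary example vectors:
a = [1.0, -2.0, 3.0, 0.5]
b = [2.0, 0.0, -1.0, 4.0]

lhs = inner(a, b) ** 2
rhs = inner(a, a) * inner(b, b)     # (F.1): lhs <= rhs

# Equality holds when b is a scalar multiple of a:
c = [3.0 * x for x in a]
eq_lhs = inner(a, c) ** 2
eq_rhs = inner(a, a) * inner(c, c)
```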
Appendix G<br />
Bound error probability orthogonal<br />
signaling<br />
Note that the probability of a correct decision satisfies<br />
P_C = ∫_{−∞}^{∞} p(µ − b)(1 − Q(µ))^{|M|−1} dµ, (G.1)<br />
where b = √(2E_s/N_0), hence<br />
P_E = 1 − P_C = ∫_{−∞}^{∞} p(µ − b)[1 − (1 − Q(µ))^{|M|−1}] dµ. (G.2)<br />
The term in square brackets is the probability that at least one of |M| − 1 noise components<br />
exceeds µ. By the union bound this probability is not larger than the sum of the probabilities that<br />
individual components exceed µ, thus<br />
1 − (1 − Q(µ))^{|M|−1} ≤ (|M| − 1)Q(µ) ≤ |M|Q(µ). (G.3)<br />
Moreover, since this term is a probability we can also write<br />
1 − (1 − Q(µ))^{|M|−1} ≤ 1. (G.4)<br />
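Both bounds are easy to verify numerically; the signal-set size |M| = 16 below is an arbitrary example:<br />

```python
import math

def Q(x):
    """Gaussian tail function via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

M = 16   # an example signal-set size (assumed)
checks = []
for mu in [-3.0, -1.0, 0.0, 1.0, 3.0, 5.0]:
    # Probability that at least one of M - 1 noise components exceeds mu:
    p = 1.0 - (1.0 - Q(mu)) ** (M - 1)
    union_ok = p <= (M - 1) * Q(mu) + 1e-15   # union bound (G.3)
    unity_ok = p <= 1.0                        # unity bound (G.4)
    checks.append(union_ok and unity_ok)
```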
When µ is small, Q(µ) is large and then the unity bound is tight. On the other hand for large µ<br />
the bound |M|Q(µ) is tighter. Therefore we split the integration range into two parts, (−∞, a)<br />
and (a, ∞):<br />
P_E ≤ ∫_{−∞}^{a} p(µ − b) dµ + |M| ∫_{a}^{∞} p(µ − b)Q(µ) dµ. (G.5)<br />
If we take a ≥ 0 we can use the upper bound Q(µ) ≤ exp(−µ²/2), see the proof in appendix<br />
A, and we obtain<br />
P_E ≤ ∫_{−∞}^{a} p(µ − b) dµ + |M| ∫_{a}^{∞} p(µ − b) exp(−µ²/2) dµ. (G.6)<br />
If we denote the first integral by P_1 and the second by P_2, we get
P_E ≤ P_1 + |M| P_2. (G.7)
To minimize the bound we choose a such that the derivative of (G.6) with respect to a is equal to 0, i.e.
0 = (d/da)[P_1 + |M| P_2] = p(a − b) − |M| p(a − b) exp(−a²/2). (G.8)
This results in
exp(a²/2) = |M|. (G.9)
Therefore the value a = √(2 ln |M|) achieves an (at least local) minimum of P_E. Note that a ≥ 0, as was required. Now we can consider the separate terms. For the first term we can write
P_1 = ∫_{−∞}^{a} (1/√(2π)) exp(−(µ − b)²/2) dµ
    = ∫_{−∞}^{a−b} (1/√(2π)) exp(−γ²/2) dγ
    = Q(b − a)
    ≤(∗) exp(−(b − a)²/2)   if 0 ≤ a ≤ b. (G.10)
Here the inequality (∗) follows from appendix A. For the second term we get<br />
P_2 = ∫_{a}^{∞} (1/√(2π)) exp(−(µ − b)²/2) exp(−µ²/2) dµ
    =(∗∗) exp(−b²/4) ∫_{a}^{∞} (1/√(2π)) exp(−(µ − b/2)²) dµ
    = exp(−b²/4) (1/√2) ∫_{a}^{∞} (1/√(2π)) exp(−(µ√2 − (b/2)√2)²/2) d(µ√2)
    = exp(−b²/4) (1/√2) ∫_{a√2}^{∞} (1/√(2π)) exp(−(γ − (b/2)√2)²/2) dγ
    = exp(−b²/4) (1/√2) Q(a√2 − (b/2)√2). (G.11)
Here the equality (∗∗) follows from
(µ − b)²/2 + µ²/2 = µ²/2 − µb + b²/2 + µ²/2 = µ² − µb + b²/4 + b²/4 = (µ − b/2)² + b²/4. (G.12)
From (G.11) we get
P_2 ≤ exp(−b²/4)                 for 0 ≤ a ≤ b/2,
P_2 ≤ exp(−b²/4 − (a − b/2)²)    for a ≥ b/2. (G.13)
If we now collect the bounds (G.10) and (G.13) and substitute the optimal value of a found in<br />
(G.9) we get<br />
P_E ≤ exp(−(b − a)²/2) + exp(a²/2) exp(−b²/4)                for 0 ≤ a < b/2,
P_E ≤ exp(−(b − a)²/2) + exp(a²/2) exp(−b²/4 − (a − b/2)²)   for b/2 ≤ a ≤ b. (G.14)
Now we note that
(a − b)²/2 − (b²/4 − a²/2) = (a − b/2)² ≥ 0. (G.15)
Therefore for 0 ≤ a < b/2 the first term is not larger than the second one while for b/2 ≤ a ≤ b<br />
both terms are equal. This leads to<br />
P_E ≤ 2 exp(−b²/4 + a²/2)    for 0 ≤ a < b/2,
P_E ≤ 2 exp(−(b − a)²/2)     for b/2 ≤ a ≤ b. (G.16)
Now we rewrite both a and b as
a = √(2 ln |M|) = √(2 ln 2) · √(log_2 |M|),
b = √(2E_s/N_0) = √(2E_b/N_0) · √(log_2 |M|), (G.17)
where we used the following definition for E_b, i.e. the energy per transmitted bit of information:
E_b = E_s / log_2 |M|. (G.18)
This finally leads to
P_E ≤ 2 exp(−log_2 |M| [E_b/(2N_0) − ln 2])       for E_b/N_0 ≥ 4 ln 2,
P_E ≤ 2 exp(−log_2 |M| [√(E_b/N_0) − √(ln 2)]²)   for ln 2 ≤ E_b/N_0 ≤ 4 ln 2. (G.19)
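As a sanity check, the exact error probability (G.2) can be integrated numerically and compared with the bound above; the values |M| = 32 and E_b/N_0 = 3 in the sketch below are assumed purely for illustration:

```python
import math

def Q(x):
    # Gaussian tail probability
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def p(x):
    # standard Gaussian density
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

M = 32       # |M| orthogonal signals (assumed example)
EbN0 = 3.0   # E_b/N_0, here in the regime E_b/N_0 >= 4 ln 2
b = math.sqrt(2.0 * EbN0 * math.log2(M))  # b = sqrt(2 E_s / N_0)

# exact P_E from (G.2), via a simple Riemann sum over a wide range
dx = 0.001
PE = dx * sum(p(mu - b) * (1.0 - (1.0 - Q(mu)) ** (M - 1))
              for mu in (i * dx - 20.0 for i in range(40000)))

# bound (G.19) for E_b/N_0 >= 4 ln 2
bound = 2.0 * math.exp(-math.log2(M) * (EbN0 / 2.0 - math.log(2.0)))
assert 0.0 < PE <= bound
```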
Appendix H<br />
Decibels<br />
Definition H.1 An energy ratio X > 0 can be expressed in decibels as follows:
X|_in dB = 10 log_10 X. (H.1)
It is easy to work with decibels since by doing so we transform multiplications into additions.<br />
Table H.1 shows for some energy ratios X the equivalent ratio in dB.<br />
Note that 1 dB = 0.1 Bel.<br />
X X| in dB<br />
0.01 -20.0 dB<br />
0.1 -10.0 dB<br />
1.0 0.0 dB<br />
2.0 3.0 dB<br />
3.0 4.8 dB<br />
5.0 7.0 dB<br />
10.0 10.0 dB<br />
100.0 20.0 dB<br />
Table H.1: Decibels.<br />
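The conversion (H.1), and the fact that multiplication of energy ratios becomes addition of decibels, can be sketched as follows:

```python
import math

def to_dB(x):
    # energy ratio to decibels, definition (H.1): X|dB = 10 log10 X
    return 10.0 * math.log10(x)

assert to_dB(1.0) == 0.0
assert abs(to_dB(100.0) - 20.0) < 1e-12
assert round(to_dB(2.0), 1) == 3.0  # the familiar "3 dB" for a factor of 2
# multiplication of ratios becomes addition in dB
assert abs(to_dB(2.0 * 5.0) - (to_dB(2.0) + to_dB(5.0))) < 1e-12
```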
Appendix I<br />
Whitening filters<br />
Suppose that the power spectral density of the noise is rational, and can be written as
S_n(f) = k N(f)/D(f). (I.1)
Here the numerator polynomial N(f) is assumed to have degree n, its complex roots are ν_1, ν_2, · · · , ν_n, and the coefficient of f^n is one. The roots of N(f) are the zeros of S_n(f). We assume that the denominator polynomial D(f) has degree d, its complex roots are π_1, π_2, · · · , π_d, and the coefficient of f^d is one. The roots of D(f) are the poles of S_n(f). Finally, k is some constant.
Note that if S n ( f ) is not rational it can be approximated as a rational function of f .<br />
A power density function is real and even (see appendix D). Therefore the zeros and poles of S_n(f) come in pairs, i.e. each zero ν = a + jb has a counterpart ν′ = a − jb and each pole π = a + jb has a counterpart π′ = a − jb (see figure I.1). Note that both n and d are even.
To construct the whitening filter we number the roots (poles and zeros) of S n ( f ) in such a<br />
way that odd-numbered roots have a positive imaginary part and the even-numbered roots have<br />
a negative imaginary part. Now we define<br />
N_pos(f) = (f − ν_1)(f − ν_3) · · · (f − ν_{n−1}),
N_neg(f) = (f − ν_2)(f − ν_4) · · · (f − ν_n),
D_pos(f) = (f − π_1)(f − π_3) · · · (f − π_{d−1}),
D_neg(f) = (f − π_2)(f − π_4) · · · (f − π_d). (I.2)
The whitening filter G(f) can now be specified by
G(f) = √(N_0/(2k)) · D_pos(f)/N_pos(f), (I.3)
thus the numerator of G(f) corresponds to the poles of S_n(f) with a positive imaginary part and the denominator of G(f) corresponds to the zeros of S_n(f) with a positive imaginary part.
[Figure I.1: Poles (π_1, · · · , π_6) and zeros (ν_1, · · · , ν_4) of S_n(f), appearing as complex-conjugate pairs in the complex f-plane with axes α and jβ.]
How does this filter change the spectral density of the noise? The power spectral density at the filter output is S_n(f)|G(f)|² and
|G(f)|² = G(f)G*(f) = (N_0/2)/k · D_pos(f)D*_pos(f) / (N_pos(f)N*_pos(f))
        = (N_0/2)/k · D_pos(f)D_neg(f) / (N_pos(f)N_neg(f))
        = (N_0/2)/k · D(f)/N(f) = (N_0/2)/S_n(f), (I.4)
thus the output power spectral density S_n^o(f) = N_0/2, i.e. the output process is white Gaussian noise.
We next have to verify that both filters G(f) and G⁻¹(f) = 1/G(f) are realizable. The transfer functions of both these filters contain only factors [f − (a + jb)] with b ≥ 0. Let s = j2πf denote the complex frequency; then we can express such a factor as
[s/(j2π) − (a + jb)] = (1/(j2π))[s + 2πb − j2πa]. (I.5)
Note that the roots of these factors are s = −2πb + j2πa and thus have non-positive real part. Therefore none of the poles of the two filters occur in the right half of the s-plane. Consequently both G(f) and G⁻¹(f) are realizable, and hence our whitening filter G(f) is reversible.
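The whitening property (I.4) can be illustrated numerically. The toy spectrum below, with zeros at ±j, poles at ±2j, and assumed constants k and N_0, is not from the text; it merely shows that building G(f) from the "positive" pole and zero as in (I.3) flattens the output spectrum to N_0/2:

```python
import cmath

N0 = 2.0    # noise level, so that N0/2 = 1 (assumed)
k = 3.0     # spectrum scale constant (assumed)
nu = 1j     # zero of S_n(f) with positive imaginary part
pi_ = 2j    # pole of S_n(f) with positive imaginary part

def S_n(f):
    # rational spectrum with zeros at +-j and poles at +-2j; real and even for real f
    return k * (f - nu) * (f + nu) / ((f - pi_) * (f + pi_))

def G(f):
    # whitening filter (I.3): "positive" pole in the numerator, "positive" zero in the denominator
    return cmath.sqrt(N0 / (2.0 * k)) * (f - pi_) / (f - nu)

for f in (-3.0, -1.0, 0.0, 0.5, 2.0, 10.0):
    out = S_n(f) * abs(G(f)) ** 2   # output power spectral density
    assert abs(out - N0 / 2.0) < 1e-12  # flat at N0/2: white noise
```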
Appendix J<br />
Elementary energy E pam of an |A|-PAM<br />
constellation<br />
The elementary energy E_pam of an |A|-PAM constellation was defined as
E_pam = (1/|A|) Σ_{s = −|A|+1, −|A|+3, · · · , |A|−1} s². (J.1)
We want to express E_pam in terms of the number of symbols |A|. We will only consider the case where |A| is even. A similar treatment can be given for odd |A|. For even |A|
E_pam = (2/|A|) Σ_{k = 1, · · · , |A|/2} (2k − 1)². (J.2)
By induction we will show that now
Σ_{k = 1, · · · , H} (2k − 1)² = (4H³ − H)/3. (J.3)
Note that the induction hypothesis holds for H = 1. Next suppose that it holds for H = h ≥ 1. Now it turns out that
Σ_{k = 1, · · · , h+1} (2k − 1)² = Σ_{k = 1, · · · , h} (2k − 1)² + (2h + 1)²
    = (4h³ − h)/3 + (2h + 1)²
    = (4h³ − h + 12h² + 12h + 3)/3
    = (4h³ + 12h² + 12h + 4 − h − 1)/3
    = (4(h + 1)³ − (h + 1))/3, (J.4)
and thus the induction hypothesis also holds for h + 1 and thus for all H. Therefore we obtain
E_pam = (2/|A|) Σ_{k = 1, · · · , |A|/2} (2k − 1)² = (2/|A|) · (4(|A|/2)³ − |A|/2)/3 = (|A|² − 1)/3. (J.5)
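The closed form (J.5) is easy to confirm by direct summation over the constellation; the sketch below also checks a few odd values of |A|, for which, as noted above, a similar treatment yields the same formula:

```python
def E_pam(A):
    # average energy of the |A|-PAM constellation {-(|A|-1), -(|A|-3), ..., |A|-1}
    return sum(s * s for s in range(-A + 1, A, 2)) / A

for A in (2, 4, 8, 16, 64):   # even |A|, the case treated above
    assert E_pam(A) == (A * A - 1) / 3.0
for A in (3, 5, 7):           # odd |A|: the same closed form holds
    assert E_pam(A) == (A * A - 1) / 3.0
```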
Bibliography<br />
[1] A.R. Calderbank, T.A. Lee, and J.E. Mazo, “Baseband Trellis Codes with a Spectral Null at<br />
Zero,” IEEE Trans. Inform. Theory, vol. IT-34, pp. 425-434, May 1988.<br />
[2] G.A. Campbell, U.S. Patent 1,227,113, May 22, 1917, “Basic Types of Electric Wave Filters.”<br />
[3] J.R. Carson, “Notes on the Theory of Modulation,” Proc. IRE, vol. 10, pp. 57-64, February<br />
1922. Reprinted in Proc. IEEE, Vol. 51, pp. 893-896, June 1963.<br />
[4] T.M. Cover, “Enumerative Source Encoding,” IEEE Trans. Inform. Theory, vol. IT-19, pp.<br />
73-77, Jan. 1973.<br />
[5] G.D. Forney, Jr., R.G. Gallager, G.R. Lang, F.M. Longstaff, and S.U. Quereshi, “Efficient<br />
Modulation for Band-Limited Channels,” IEEE Journ. Select. Areas Comm., vol. SAC-2, pp.<br />
632-647, Sept. 1984.<br />
[6] G.D. Forney, Jr., and G. Ungerboeck, “Modulation and Coding for Linear Gaussian Channels,”<br />
IEEE Trans. Inform. Theory, vol. 44, pp. 2384 - 2415, October 1998.<br />
[7] R.G. Gallager, Information Theory and Reliable Communication, Wiley and Sons, 1968.<br />
[8] R.V.L. Hartley, “Transmission of Information,” Bell Syst. Tech. J., vol. 7, pp. 535 - 563, July<br />
1928.<br />
[9] C.W. Helstrom, Elements of Signal Detection and Estimation, Prentice Hall, 1995.
[10] S. Haykin, Communication Systems. John Wiley & Sons, 4th edition, 2000.
[11] V.A. Kotelnikov, The Theory of Optimum Noise Immunity. Dover Publications, 1960.
[12] R. Laroia, N. Farvardin, and S. Tretter, "On Optimal Shaping of Multidimensional Constellations," IEEE Trans. Inform. Theory, vol. IT-40, pp. 1044-1056, July 1994.
[13] E.A. Lee and D.G. Messerschmitt, Digital Communication, 2nd Edition, Kluwer Academic,<br />
1994.<br />
[14] D.O. North, Analysis of the Factors which determine Signal/Noise Discrimination in Radar. RCA Laboratories, Princeton, Technical Report PTR-6C, June 1943. Reprinted in Proc. IEEE, vol. 51, July 1963.
[15] H. Nyquist, “Certain factors affecting telegraph speed,” Bell Syst. Tech. J., vol. 3, pp. 324 -<br />
346, April 1924.<br />
[16] B.M. Oliver, J.R. Pierce, and C.E. Shannon, “The Philosophy of PCM,” Proc. IRE, vol. 36,<br />
pp.1324-1331, Nov. 1948.<br />
[17] J.G. Proakis and M. Salehi, Communication <strong>Systems</strong> Engineering. Prentice Hall, 1994.<br />
[18] J.G. Proakis, Digital Communications. McGraw-Hill, 4th Edition, 2001.<br />
[19] J.P.M. Schalkwijk, “An Algorithm for Source Coding,” IEEE Trans. Inform. Theory, vol.<br />
IT-18, pp. 395-399, May 1972.<br />
[20] C.E. Shannon, “A Mathematical Theory of Communication,” Bell Syst. Tech. J., vol. 27, pp.<br />
379 - 423 and 623 - 656, July and October 1948. Reprinted in the Key Papers on Information<br />
Theory.<br />
[21] C.E. Shannon, “Communication in the Presence of Noise,” Proc. IRE, vol. 37, pp. 10 - 21,<br />
January 1949. Reprinted in the Key Papers on Information Theory.<br />
[22] G. Ungerboeck, "Channel Coding with Multilevel/Phase Signals," IEEE Trans. Inform. Theory, vol. IT-28, pp. 55-67, Jan. 1982.
[23] S. Verdu, Multi-user detection, Cambridge, 1998.<br />
[24] J.H. Van Vleck and D. Middleton, “A Theoretical Comparison of Visual, Aural, and Meter<br />
Reception of Pulsed Signals in the Presence of Noise," Journal of Applied Physics, vol. 17,
pp. 940-971, November 1946.<br />
[25] J.M. Wozencraft and I.M. Jacobs, Principles of Communication Engineering, Wiley, New<br />
York, 1965.<br />
[26] S.T.M. Ackermans and J.H. van Lint, Algebra en Analyse, Academic Service, Den Haag,<br />
1976.
Index<br />
a-posteriori message probability, 18<br />
a-priori message probability, 15<br />
additive Gaussian noise channel<br />
scalar, 27<br />
additive Gaussian noise vector channel, 38<br />
aliased spectrum, 120<br />
Armstrong, E.H., 11<br />
asymptotic coding gain, 175<br />
autocorrelation function<br />
of white noise, 47<br />
available energy<br />
per dimension, 104<br />
bandlimited transmission, 104<br />
bandwidth-limited case, 116<br />
Bardeen J., Brattain W., and Shockley W., 10<br />
Bayes rule, 18<br />
Bell, A.G., 9<br />
bi-orthogonal signaling, 80<br />
binary modulation<br />
frequency shift keying (FSK), 85<br />
phase shift keying (PSK), 85<br />
Branly, E., 9<br />
Braun, K.F., 10<br />
building-block waveforms, 53<br />
Campbell, G.A., 9, 11<br />
capacity<br />
of bandlimited waveform channel, 112<br />
of wideband waveform channel, 114<br />
per dimension, 104, 111<br />
carrier-phase, 137<br />
Carson, J.R., 11<br />
center of gravity of the signal structure, 85<br />
channel<br />
additive Gaussian noise<br />
scalar, 27<br />
bandpass, 127<br />
binary symmetric, 23<br />
capacity, 13<br />
discrete, 15<br />
discrete scalar, 20<br />
discrete vector, 20<br />
real scalar, 25<br />
real vector, 35, 36<br />
waveform, 46<br />
Clarke, A.C., 10<br />
code<br />
error-correcting, 172<br />
coding gain<br />
asymptotic, 175<br />
coherent demodulation, 137<br />
correlation, 50<br />
correlation receiver, 50<br />
cross-talk, 170<br />
decibel definition, 202<br />
decision intervals, 29<br />
decision region, 37<br />
definition for real vector channel, 37<br />
decision rule, 16<br />
maximum a-posteriori<br />
for discrete channel, 17<br />
for real scalar channel, 27<br />
maximum likelihood<br />
for discrete channel, 19<br />
for real scalar channel, 27<br />
minimax, 22<br />
minimum Euclidean distance, 39<br />
decision variables<br />
for discrete channel, 18<br />
for real scalar channel, 26<br />
for real vector channel, 36<br />
for the discrete vector channel, 21<br />
DeForest, L., 10<br />
demodulator, 61<br />
destination, 16<br />
diode, 10<br />
direct receiver, 66<br />
discrete, 16<br />
distance gain, 176<br />
dot product, 38<br />
Dudley, H., 12<br />
energy<br />
of vector, 69<br />
envelope detection, 142<br />
equally likely messages, 19<br />
error probability, 16<br />
minimum, for the scalar AGN channel, 30<br />
symbol, 168<br />
union bound, 88<br />
error-correcting code, 172<br />
estimate, 16, 21<br />
excess bandwidth, 122<br />
extended Hamming codes, 172<br />
Faraday, M., 9<br />
Fleming, J.A., 10<br />
frequency shift keying, 75<br />
FSK, 75<br />
Galvani, L., 8<br />
Gauss, C.F. and Weber, W.E., 9<br />
Gram-Schmidt procedure, 55, 192<br />
Hadamard matrix, 79<br />
Hamming codes<br />
extended, 172<br />
Hamming distance, 172<br />
minimum, 172<br />
Hamming-distance, 23<br />
Hartley, R., 12<br />
Henry, J., 8<br />
Hertz, H., 9<br />
hypersphere
interior volume of, 107<br />
information source, 15<br />
inner product, 38<br />
integrated circuit, 10<br />
interval<br />
observation, 47<br />
irrelevant data (noise), 58<br />
Kilby J. and Noyce R., 10<br />
Kotelnikov, V., 12<br />
Kronecker delta function, 50<br />
likelihood, 19<br />
Lodge, O., 9<br />
MAP decision rule<br />
for the discrete vector channel, 21<br />
for discrete channel, 17<br />
for real scalar channel, 27<br />
for the scalar AGN channel, 28<br />
MAP decoding<br />
for real vector channel, 36<br />
Marconi, G., 10<br />
matched filter, 12<br />
matched filters, 64<br />
matched-filter receiver, 64<br />
Maxwell, J.C., 9<br />
message, 15<br />
message sequences, 92<br />
minimax decision rule, 22<br />
minimum Hamming distance, 172<br />
ML decision rule<br />
for discrete channel, 19<br />
for real scalar channel, 27<br />
modulation<br />
amplitude (AM), 11<br />
double sideband suppressed carrier (DSB-SC), 137
frequency (FM), 11<br />
pulse amplitude (PAM), 119<br />
pulse code (PCM), 12<br />
pulse position, 75<br />
quadrature amplitude (QAM), 127, 135
modulation interval, 120, 125<br />
modulator, 56, 60<br />
Morse<br />
code, 8<br />
telegraph, 8<br />
Morse, S., 8<br />
multi-pulse transmission, 125<br />
normalized signal-to-noise-ratio, 168<br />
North, D.O., 12, 70<br />
Nyquist criterion, 120<br />
Nyquist, H., 11<br />
observation interval, 47<br />
observation space, 37<br />
Oersted, H.K., 8<br />
optimum decision rule, 16<br />
for real channel, 26<br />
for the discrete channel, 18<br />
optimum receiver, 16<br />
for the AGN vector channel, 39<br />
for the scalar AGN channel, 28<br />
orthogonal signaling, 75<br />
capacity, 77<br />
energy, 88<br />
error probability, 76<br />
optimum receiver for, 76<br />
signal structure, 75<br />
orthonormal, 47<br />
orthonormal functions, 47<br />
Parseval relation, 68<br />
PCM, 12<br />
Pierce, J.R., 10<br />
Poisson distribution, 23<br />
Popov, A., 10<br />
power spectral density<br />
of white noise, 47<br />
power-limited case, 116<br />
PPM, 75<br />
probability of correct decision, 16<br />
probability of error<br />
definition, 16<br />
pulse amplitude modulation, 167<br />
pulse transmission, 120<br />
Pupin, M., 9<br />
Q-function, 30<br />
definition of, 30<br />
table of, 31<br />
upper bound for, 182<br />
quadrature amplitude modulation, 127, 135<br />
quadrature multiplexing, 127<br />
random code, 104, 108<br />
real vector channel, 36<br />
receiver, 16<br />
direct, 66<br />
redundancy loss, 176<br />
Reeves, A., 12<br />
reversibility<br />
theorem of, 42<br />
Righi, A., 9<br />
satellites, 10<br />
Schwarz inequality, 197<br />
serial pulse transmission, 120<br />
set-partitioning, 173<br />
Shannon, C.E., 13<br />
shaping gain, 136<br />
signal, 15<br />
signal constellation, 167<br />
signal energy, 83<br />
signal space, 56<br />
signal structure, 56, 61<br />
average energy of, 84<br />
rotation of, 83<br />
translation of, 83<br />
signal vector, 20<br />
signal-to-noise ratio, 66, 115<br />
at output of matched filter, 67<br />
normalized, 168<br />
signaling<br />
antipodal, 85<br />
bit-by-bit, 104<br />
block-orthogonal, 104<br />
orthogonal binary, 85<br />
source, 15
sphere hardening<br />
noise vector, 105<br />
received vector, 106<br />
sphere-packing argument, 107<br />
sufficient statistic, 52<br />
sufficient statistics, 53<br />
symbol error probability, 168<br />
telegraph, 9<br />
telephone, 9<br />
Telstar I, 10<br />
transistor, 10<br />
transmitter, 15<br />
triode, 10<br />
uncoded transmission, 167<br />
union bound, 89, 110<br />
van der Bijl, H., Hartley, R., and Heising, R., 11
vector channel, 20<br />
vector receiver, 21<br />
vector transmitter, 20<br />
Volta, A., 8<br />
water-filling procedure, 113<br />
waveform channel<br />
relation with vector channel, 60<br />
Wiener, N., 12<br />
zero-forcing (ZF), 124<br />
zero-forcing criterion, 120