
FOUNDATIONS OF QUANTUM MECHANICS


FOUNDATIONS OF QUANTUM MECHANICS<br />

JOS UFFINK<br />

INSTITUTE FOR HISTORY AND FOUNDATIONS OF SCIENCE<br />

UTRECHT UNIVERSITY<br />

SEPTEMBER 2010


PREFACE<br />

These lecture notes serve as a support for the course on Foundations of Quantum Mechanics, provided<br />

by the Institute for History and Foundations of Science of the University of Utrecht. Although<br />

the text has been revised repeatedly, efforts to improve can sometimes bring along new imperfections,<br />

making revision a never-ending process. The current version, the 11th, is slightly modified<br />

with respect to the previous one. Many thanks are due to Anne van Weerden for help in the English<br />

translation.<br />

Remarks and comments remain very welcome.<br />

Jos Uffink<br />

Utrecht, August 2010


CONTENTS<br />

I CONCEPTUAL PROBLEMS<br />

I. 1 Introduction<br />

I. 2 Incompleteness and locality<br />

II THE FORMALISM<br />

II. 1 Finite-dimensional Hilbert spaces<br />

II. 2 Operators<br />

II. 3 Eigenvalue problem and spectral theorem<br />

II. 3. 1 Appendix<br />

II. 4 Functions of normal operators<br />

II. 5 Direct sum and direct product<br />

II. 5. 1 Direct sum<br />

II. 5. 2 Direct product<br />

II. 6 Addendum: Infinite-dimensional Hilbert spaces<br />

II. 6. 1 The structure of vector spaces<br />

II. 6. 2 Operators<br />

II. 6. 2. 1 Unbounded operators<br />

II. 6. 2. 2 Continuous spectra<br />

II. 6. 2. 3 Spectral theorem<br />

II. 6. 3 Dirac<br />

II. 6. 4 Summary<br />

III THE POSTULATES<br />

III. 1 Von Neumann’s postulates<br />

III. 2 Pure and mixed states<br />

III. 3 The interpretation of mixed states<br />

III. 4 Composite systems<br />

III. 4. 1 Summary<br />

III. 5 Proper and improper mixtures<br />

III. 6 Spin 1/2 particles<br />

III. 6. 1 Spin 1/2 and rotations in spin space<br />

III. 6. 2 Mixed spin 1/2 states<br />

III. 6. 3 Two spin 1/2 particles<br />

III. 6. 3. 1 Singlet and triplet states<br />

III. 6. 3. 2 Correlations<br />

III. 6. 3. 3 Conditional probabilities<br />

III. 6. 3. 4 Example of a mixed state of two spin 1/2 particles<br />

IV THE COPENHAGEN INTERPRETATION<br />

IV. 1 Heisenberg and the uncertainty principle<br />

IV. 1. 1 Remarks<br />

IV. 2 Bohr and complementarity<br />

IV. 2. 1 Complementary phenomena<br />

IV. 2. 2 Remarks and problems<br />

IV. 2. 3 Agreement and difference between Heisenberg and Bohr<br />

IV. 3 Debate between Einstein and Bohr<br />

IV. 3. 1 Introduction<br />

IV. 3. 2 The photon box<br />

IV. 3. 3 Einstein, Podolsky and Rosen<br />

IV. 3. 4 Heisenberg, Bohr and Einstein, Podolsky and Rosen<br />

IV. 4 Neutron interferometry<br />

IV. 5 The uncertainty relations<br />

IV. 5. 1 Introduction<br />

IV. 5. 2 The standard uncertainty relations<br />

IV. 5. 3 Single slit experiment<br />

IV. 5. 4 Time and energy<br />

IV. 5. 5 Double slit experiment<br />

IV. 5. 6 A new uncertainty measure<br />

IV. 5. 7 Interpretation<br />

V HIDDEN VARIABLES<br />

V. 1 Hidden reality<br />

V. 2 Non-contextual hidden variables<br />

V. 3 Kochen and Specker’s theorem<br />

V. 3. 1 Summary<br />

V. 4 Contextual hidden variables<br />

VI BOHMIAN MECHANICS<br />

VI. 1 Introduction<br />

VI. 2 The quantum potential<br />

VI. 3 Composite systems<br />

VI. 4 Remarks and problems<br />

VI. 5 The Hamilton-Jacobi equation<br />

VII BELL’S INEQUALITIES<br />

VII. 1 Local deterministic hidden variables<br />

VII. 1. 1 Derivation of the first Bell inequality<br />

VII. 1. 2 The Bell inequality of Clauser, Horne, Shimony and Holt<br />

VII. 1. 3 Violation of the Bell inequalities by quantum mechanics<br />

VII. 1. 4 The Bell inequality in a non-contextual, local deterministic HVT<br />

VII. 2 Local deterministic contextual hidden variables<br />

VII. 3 Wigner’s derivation<br />

VII. 4 The derivation of Eberhard and Stapp<br />

VII. 4. 1 Counterfactual conditional statements and indeterminism<br />

VII. 5 Stochastic hidden variables<br />

VII. 5. 1 Outcome, parameter and source independence<br />

VII. 5. 2 Quantum mechanics as a stochastic HVT<br />

VII. 6 An algebraic proof without inequalities<br />

VII. 7 Miscellanea<br />

VII. 7. 1 Locality and relativity<br />

VII. 7. 2 Locality versus conditional independence<br />

VII. 7. 3 Determinism<br />

VIII THE MEASUREMENT PROBLEM<br />

VIII. 1 Introduction<br />

VIII. 2 Measurement according to classical physics<br />

VIII. 3 Measurement according to quantum mechanics<br />

VIII. 4 The measurement problem in the narrow sense<br />

VIII. 4. 1 The projection postulate and consciousness<br />

VIII. 4. 2 Bohmian mechanics<br />

VIII. 4. 3 Spontaneous collapse<br />

VIII. 4. 4 Many worlds<br />

VIII. 4. 5 Superselection rules<br />

VIII. 4. 6 Irreversibility of measurement<br />

VIII. 4. 7 Modal interpretation<br />

VIII. 4. 8 Decoherence<br />

VIII. 5 Incompatible quantities<br />

VIII. 6 Comments on the theory of measurement<br />

A GLEASON’S THEOREM<br />

A. 1 Introduction<br />

A. 2 Conversion to a 3-dimensional real problem<br />

A. 2. 1 Step 1<br />

A. 3 Formulation of the problem on the surface of a sphere<br />

A. 3. 1 Step 2<br />

A. 3. 1. 1 Lemma 1<br />

A. 3. 1. 2 Lemma 2<br />

A. 3. 1. 3 Result of lemma 1 and 2<br />

A. 3. 2 Step 3<br />

A. 4 An analytic lemma<br />

A. 4. 1 Step 4<br />

A. 5 Summary<br />

WORKS CONSULTED<br />

BIBLIOGRAPHY


LIST OF FIGURES<br />

III. 1 A discontinuous measure for dim H = 2<br />

III. 2 A rotated unit vector in the xz-plane<br />

III. 3 Spin up for particle 1 along $\vec{a}$, for particle 2 along $\vec{b}$<br />

IV. 1 Heisenberg’s γ-microscope<br />

IV. 2 The double slit interference experiment (Bohr 1949)<br />

IV. 3 Contexts of measurement in which the interference of the particles is visible, and those in which the recoil of the screen is visible, exclude each other. (Bohr 1949)<br />

IV. 4 Several perfect crystal neutron interferometers (Rauch and Werner 2000)<br />

IV. 5 The interference pattern in the neutron interferometer is acquired by measuring the intensity in the detectors at a variable optical path length difference.<br />

IV. 6 The probability distribution in position for a slit of width 2a<br />

IV. 7 The diffraction pattern for a small slit of width 2a<br />

IV. 8 The probability distribution in position for a double slit, 2a is the width of each slit and 2A the distance between the slits<br />

IV. 9 The interference pattern for the double slit<br />

IV. 10 Moving screen<br />

V. 1 A solution for dim H = 2<br />

V. 2 a) Kochen-Specker diagram b) Conway-Kochen diagram<br />

V. 3 M.C. Escher, Waterfall. Consider the 3 interpenetrating cubes on the top of the left pillar. Each cube has 4 lines from the mutual center to its vertices, 6 lines to the centers of its edges, and 3 lines to the centers of its faces. Three of the lines are shared by all three cubes, giving 3 · (4 + 6 + 3) − 6 = 33 lines. These are Peres’ vectors. (Text Meyer 2003)<br />

V. 4 $\mu(P_i) = \cos^2\theta$<br />

VI. 1 The quantum potential for the two slit system as viewed from the screen, under assumption of a Gaussian distribution at the slits (Bohm 1989)<br />

VI. 2 A simulation of the double slit experiment in Bohmian mechanics. Each particle follows a certain path between the slits and the photographic plate. All particles coming from the upper slit arrive at the upper half of the photographic plate, likewise for the lower slit and lower half of the plate. The twists in the paths are caused by the quantum potential U. (Vigier et al. 1987)<br />

VII. 1 Thought experiment of Einstein, Podolsky and Rosen on the singlet<br />

VII. 2 A configuration in which the spin quantities violate the Bell inequality<br />

VII. 3 The Bell inequality violated for every acute angle ϕ<br />

VII. 4 The configuration giving the largest violation of the Bell inequality (all vectors in the same plane)<br />

VII. 5 Unit spheres for $a_n$, $b_n$ and $a_n b_n$. In the shaded areas of the larger sphere $a_n b_n$ is positive, in the unshaded areas $a_n b_n$ is negative.<br />

VII. 6 Comparison of the quantum mechanical expectation values and those for the local deterministic HVT<br />

VII. 7 Violation of the Bell inequality again<br />

VII. 8 The Mermin pentagon<br />

VII. 9 Minkowski diagram of the EPRB experiment, where λ is in the past light cones of both A and B<br />

VIII. 1 Schrödinger’s cat paradox (DeWitt 1970)<br />

A. 1 Construction of a 3-dimensional subspace E<br />

A. 2 Rotation of s to s′ and t to t′ along a great circle around axis r<br />

A. 3 Projection of points on a great circle onto a plane P through the north pole<br />

A. 4 Projection of meridians, circles with constant latitude, and a great circle<br />

A. 5 Spiral representing a projected path from s to t along subsequent great circles, each time starting at their most northern point<br />

A. 6 Path from t to v, having the same longitude<br />

A. 7 A strictly in- or decreasing curve C<br />

A. 8 Great circle C, coordinate system (p, q, t), and rotating pair (s, s⊥)<br />

A. 9 Great circle C and tilted great circles C′ and C″<br />

A. 10 Two continuous curves on $S^2$, intersecting in q


I<br />

CONCEPTUAL PROBLEMS<br />

Anyone who is not shocked by quantum theory has not understood it.<br />

— Niels Bohr<br />

I think it is safe to say that no one understands quantum mechanics.<br />

— Richard Feynman<br />

I. 1 INTRODUCTION<br />

Quantum mechanics emerged at the beginning of the 20th century from an attempt to understand<br />

the interaction between atoms and radiation. The presence of discrete lines in the emission<br />

and absorption spectra of chemical elements indicates that this interaction takes the form of discrete<br />

quanta. When, in the years 1925 and 1926, a coherent theory was developed by the combined efforts of<br />

Werner Heisenberg, Paul Dirac, Max Born, Pascual Jordan, Wolfgang Pauli and Erwin Schrödinger,<br />

and this theory was axiomatized seven years later by John von Neumann, the question about the<br />

physical interpretation of the mathematical symbols of the theory arose.<br />

The central mathematical concept in quantum mechanics is ψ, in the form of a wave function ψ(q)<br />

in Schrödinger’s wave mechanics, or of a vector |ψ⟩ in Hilbert space, à la Von Neumann. According<br />

to Born, its physical meaning is that ψ determines probabilities for results of measurements, and a<br />

key question is then how such probabilities must be interpreted. By means of four examples we will<br />

give an idea of the conceptual problems raised by quantum mechanics.<br />

(i) Consider as a first example the decay of radioactive nuclei of a certain kind, as discussed by<br />

Einstein (P.A. Schilpp 1949, p. 667 ff.). We see the unstable nuclei decay at various times, one almost<br />

immediately, another only after a long time; the α-particles are emitted in ever different directions.<br />

Quantum mechanics describes these nuclei by a non-stationary wave function, and using this function<br />

one can calculate the expected lifetime of the nuclei.<br />
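As a rough illustration of this first example, the sketch below assumes the standard exponential decay law, which the text does not spell out: every nucleus is assigned the same survival probability exp(−t/τ), yet the sampled decay times differ widely. The function name `sample_decay_times`, the value `TAU` and the seed are hypothetical choices made here for illustration. The pseudo-random generator merely stands in for the quantum probabilities; it says nothing about whether anything underlies them.

```python
import random


def sample_decay_times(mean_lifetime, n_nuclei, seed=0):
    """Draw one decay time per nucleus, all from the same exponential law."""
    rng = random.Random(seed)
    # expovariate expects the rate, i.e. 1 / (mean lifetime).
    return [rng.expovariate(1.0 / mean_lifetime) for _ in range(n_nuclei)]


TAU = 1.0  # hypothetical mean lifetime, in arbitrary units
times = sample_decay_times(TAU, 100_000)
sample_mean = sum(times) / len(times)

# Individual decay times are spread over a wide range...
print("earliest:", min(times), "latest:", max(times))
# ...while only their average is tied to the common mean lifetime.
print("sample mean:", sample_mean)
```

Only the average over many nuclei approaches τ, which is the sense in which a single wave function fixes an expected lifetime rather than individual ones.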

A natural reaction is to assume that the nuclei differ from each other, and that this difference is<br />

the cause of their different individual life spans and of the different directions in which the α-particles<br />

are emitted. In this view, the quantum mechanical expectation value would be comparable to<br />

the average life span in a population. However, this does not fit naturally into the quantum<br />

mechanical description. Quantum mechanics describes all nuclei by the same wave function. If this<br />

description is complete, the fact that quantum mechanics gives only expected life spans is not due to<br />

a lack of knowledge. Rather, there simply is nothing more to know concerning the nuclei than their<br />

wave function and the probabilities that follow from it.<br />

On the other hand, we see before our eyes that the nuclei do not behave the same way: they decay<br />

at different times and send the α-particles in ever different directions. This suggests that more can



be known about nuclei than their expected life spans, just like a more thorough investigation of the<br />

individuals of a population enables us to know more than their mere average life span; we would<br />

then be able to make a more detailed statement about their individual life spans. In this view the<br />

quantum mechanical description is not complete: there are extra, hitherto ‘hidden’, variables which<br />

say something about the individual case.<br />

There is a standard answer to this problem, called the ‘Copenhagen interpretation’, after the view<br />

developed by Bohr and his coworkers. This answer is that the idea that the individual nuclei have a<br />

definite life span, independent of the observation of this life span, is incorrect. We can only speak of<br />

an individual life span within the context of an experiment in which this is measured. An experiment<br />

always entails a disturbance of the system. For this reason no conclusions can be drawn concerning<br />

the undisturbed system. It is incorrect to speak of the life span of a nucleus which is not observed.<br />

The statistical spread in the measured individual life spans is due to the quantum character of the<br />

interaction between object and measuring apparatus. As a matter of principle, what happens in this<br />

interaction cannot be described more precisely. This makes every individual measurement into a<br />

unique event.<br />

Characteristic of the Copenhagen interpretation is, furthermore, that one cannot simply combine<br />

the description of the system, obtained within the context of a certain type of experiment, with a<br />

description of the same system, obtained in a different kind of experiment. The best-known examples<br />

of such mutually exclusive experiments are measurements of position and momentum. According to<br />

Bohr, descriptions of a system with terms like ‘position’ or ‘momentum’ are complementary; they are<br />

supplementary to each other, but they can never be united in one picture.<br />

The main point behind the Copenhagen answer is the idea of measurement disturbance. According<br />

to this line of thought quantum mechanics is distinguished from classical physics by the quantization<br />

of the interaction between system and measuring apparatus. Every observation involves an<br />

interaction with, and therefore a disturbance of, the observed system. This disturbance cannot be<br />

made arbitrarily small, because ℏ ≠ 0. Therefore, one cannot identify observation results with properties<br />

the system has independently of the observation. One can only talk meaningfully about observation<br />

results which are created by the measurement. In contrast to classical physics, quantum mechanics<br />

does not deal with what exists, but with what is observed.<br />

At first sight this reasoning seems plausible; it is, however, not without problems. Can<br />

we use the same reasoning if the observed system is macroscopic? And, as a matter of fact, what<br />

exactly is an observation? Is it essential that some conscious being takes notice of the result of the<br />

observation, or is an apparatus registering the outcome sufficient? These problems will appear in the<br />

third and fourth examples.<br />

(ii) The next example is from a letter Einstein wrote to Born in 1948 (Born 1971, pp. 169, 170).<br />

Consider a free particle described by a wave function ψ. According to the quantum mechanical<br />

description, ψ satisfies an uncertainty relation; the statistical deviations of position and momentum<br />

cannot simultaneously be made arbitrarily small. Apparently the outcomes of measurements of position<br />

and momentum of an individual particle cannot both be predicted exactly, and the question arises<br />

how to interpret this situation. Einstein distinguishes two points of view.<br />

(a) The (free) particle really has a definite position and a definite momentum, even if they<br />

cannot both be ascertained by measurement in the same individual case. According to<br />

this point of view, the ψ-function represents an incomplete description of the real state



of affairs. [. . . ] Its acceptance would lead to an attempt to obtain a complete description<br />

of the real state of affairs as well as the incomplete one, and to discover physical laws<br />

for such a description. The theoretical framework of quantum mechanics would then be<br />

exploded.<br />

(b) In reality the particle has neither a definite momentum nor a definite position; the description<br />

by [the] ψ-function is, in principle, a complete description. The strictly defined<br />

position of the particle, obtained by measuring the position, cannot be interpreted as the<br />

position of the particle prior to the measurement. The sharp localization which appears as<br />

a result of the measurement is brought about only as a result of the unavoidable (but not<br />

unimportant) operation of measurement. The result of the measurement depends not only<br />

on the real particle situation but also on the nature of the measuring mechanism, which<br />

in principle is incompletely known. An analogous situation arises when the momentum<br />

or any other observable quantity relating to the particle is measured.<br />

Interpretation (b) is accepted by the majority of physicists, and Einstein admits that<br />

[. . . ] it alone does justice in a natural way to the empirical state of affairs expressed in<br />

Heisenberg’s principle within the framework of quantum mechanics.<br />

Nevertheless, he emphasizes his preference for interpretation (a). His argument is that it is basic<br />

to physics that physical concepts refer to entities, such as particles, fields, etc., that exist independently<br />

of the observer, and are situated in space and time. Interpretation (b) renders this kind of<br />

description impossible. A second argument has to do with composite systems and will be discussed<br />

in section I. 2.<br />
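The uncertainty relation invoked in example (ii) can be made concrete with a standard textbook case, given here as an illustration and not as part of Einstein's letter. The symbols Δq and Δp (standard deviations of position and momentum) and σ (the width of a Gaussian wave packet) are notation assumed here.

$$\Delta q \, \Delta p \geq \frac{\hbar}{2}, \qquad \psi(q) \propto e^{-q^2/4\sigma^2} \;\Rightarrow\; \Delta q = \sigma, \quad \Delta p = \frac{\hbar}{2\sigma}, \quad \Delta q \, \Delta p = \frac{\hbar}{2}.$$

Narrowing the packet in position (smaller σ) widens it in momentum, so no choice of σ makes both spreads vanish at once.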

(iii) The next example, also originating from the correspondence between Einstein and Born<br />

(Born 1971, pp. 188, 208-209), concerns a freely moving macroscopic object, for instance a star.<br />

A simple Schrödinger equation applies to the center of mass of such a body, namely that of a free<br />

particle. Since all wave functions which are solutions of the Schrödinger equation are admissible,<br />

one may consider as a solution a wave function with two peaks of equal size, located far from each<br />

other.<br />

Upon measurement of the position of the center of mass of such a body, the outcome is found at<br />

one peak in about half of the measurements; in the other half the outcome is found at the other peak.<br />

In this case it is tempting to say that in half of these measurements the center of mass was at that<br />

one position, that is, the object was at that position, while in the other half the center of mass was at the<br />

other position. But according to the standard interpretation this is incorrect: prior to the measurement<br />

no position can be assigned to the center of mass. Quantum mechanics applies just as well to the<br />

center of mass of a macroscopic body as to an electron. It is, however, difficult to imagine how a<br />

measurement ‘creates’ the position of the center of mass of a star as a result of a disturbance of the<br />

order of a single quantum.<br />
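The statistics described in example (iii) can be mimicked with a small sketch. It assumes, beyond what the text states, that each peak is a narrow Gaussian and that the two peaks do not overlap, so that the Born rule assigns probability 1/2 to each. The names `measure_positions`, `SEPARATION` and `WIDTH` are hypothetical.

```python
import random


def measure_positions(separation, width, n_runs, seed=0):
    """Sample position outcomes for two equal, non-overlapping Gaussian peaks."""
    rng = random.Random(seed)
    outcomes = []
    for _ in range(n_runs):
        # Equal peaks with negligible overlap: each carries probability 1/2.
        center = separation / 2 if rng.random() < 0.5 else -separation / 2
        # Scatter the outcome around the selected peak.
        outcomes.append(rng.gauss(center, width))
    return outcomes


SEPARATION = 1000.0  # hypothetical distance between the peaks, arbitrary units
WIDTH = 1.0          # hypothetical width of each peak
outcomes = measure_positions(SEPARATION, WIDTH, 20_000)

# Fraction of outcomes found at the right-hand peak.
fraction_right = sum(x > 0 for x in outcomes) / len(outcomes)
print(fraction_right)
```

Note that the sketch selects a peak before it "measures", which is exactly the ignorance reading that the standard interpretation rejects; it reproduces the statistics of the outcomes and nothing more.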

According to Pauli, one of the representatives of the Copenhagen interpretation, this is a creation<br />

outside the laws of nature (ibid., p. 223). The laws of nature only say something about the statistics<br />

of the outcomes. The quantum mechanical probability description does not express our ignorance<br />

concerning the position of the center of mass of the body; the probability description corresponds



to an essential indeterminacy of that position. Pauli states that the question whether the ‘position’<br />

of a body would also exist without observation is fundamentally unanswerable and for this reason<br />

meaningless.<br />

In this example the problem of the transition between the microscopic and the macroscopic levels<br />

arises. Our intuition tells us that somewhere along the way the quantum mechanical probability<br />

description must turn into a classical description of an ensemble, an ensemble of objects that have<br />

properties. But if we accept at the same time that quantum mechanics applies to macroscopic<br />

bodies just as well as to microscopic ones, this expectation is not met. The transition from one type of ensemble<br />

to the other is a problem which invariably emerges in considerations concerning the ‘measurement<br />

problem’. We will come back to this in chapter VIII.<br />

The previous discussion follows rather closely the formulations of Einstein and Pauli in the<br />

years 1948-1954, as can be found in the correspondence between Born and Einstein (Born 1971).<br />

An interesting aspect is that the discussion actually takes place over Born’s head. Born saw Einstein<br />

as the one who had, in his theory of relativity, abolished the idea of absolute simultaneity by means of<br />

the argument that it is meaningless to want to speak about something you cannot measure in principle.<br />

Einstein responds (ibid., p. 188):<br />

There is nothing analogous in relativity to what I call incompleteness of description in<br />

the quantum theory. Briefly it is because the ψ-function is incapable of describing certain<br />

qualities of an individual system, whose ‘reality’ we none of us doubt (such as a<br />

macroscopic parameter).<br />

Moreover, Born continues to believe, despite everything Einstein writes, that Einstein objects<br />

to the indeterministic character of quantum mechanics, i.e., the fact that it only provides probability<br />

statements, instead of objecting to the alleged completeness of quantum mechanics, until Pauli<br />

intervenes in the discussion and explains Einstein’s position to Born (ibid., pp. 217-219).<br />

(iv) The last example is Schrödinger’s notorious cat paradox (Schrödinger 1935b).<br />

One can even set up quite ridiculous cases. A cat is penned up in a steel chamber, along<br />

with the following diabolical device (which must be secured against direct interference<br />

by the cat): in a Geiger counter there is a tiny bit of radioactive substance, so small<br />

that perhaps in the course of one hour one of the atoms decays, but also, with equal<br />

probability, perhaps none; if it happens, the counter tube discharges and through a relay<br />

releases a hammer which shatters a small flask of hydrocyanic acid. If one has left this<br />

entire system to itself for an hour, one would say that the cat still lives if meanwhile no<br />

atom has decayed. The first atomic decay would have poisoned it. The Ψ-function for<br />

the entire system would express this by having in it the living and the dead cat (pardon<br />

the expression) mixed or smeared out in equal parts.<br />

In this example a number of problems are combined. In the first place there is again the difference<br />

between a classical state and a quantum state. If the standard interpretation is extended consistently,<br />

the cat cannot be considered dead or alive as long as the chamber is not opened and the cat is not<br />

observed. (One may wonder what the cat itself thinks of this.)<br />
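In formula form, the state Schrödinger describes after one hour can be sketched as follows. The ket labels are notation assumed here, the equal amplitudes reflect the "equal probability" in the quotation, and the relative phase is set to +1 purely for simplicity.

$$|\Psi\rangle = \frac{1}{\sqrt{2}} \Big( |\text{no decay}\rangle \otimes |\text{cat alive}\rangle + |\text{decay}\rangle \otimes |\text{cat dead}\rangle \Big)$$

Neither term alone describes the system, which is why, on the standard interpretation, the cat cannot be called dead or alive before an observation is made.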

The question whether it is permitted to extend the standard interpretation in this way coincides<br />

with the question whether, and to what extent, the quantum mechanical description can be transferred from



the microscopic to the macroscopic level. Then there is the question of what exactly an observation is.<br />

Are cats observers of their own situation? And if consciousness is essential for an observation, do<br />

cats have the correct type of consciousness?<br />

From the examples above we can isolate the following central concepts:<br />

1. the real state of a system independent of measurement,<br />

2. incompleteness,<br />

3. measurement disturbance,<br />

4. complementarity,<br />

5. the transition from microscopic to macroscopic,<br />

6. consciousness. 1<br />

I. 2 INCOMPLETENESS AND LOCALITY<br />

The previous discussion only served to get the reader in the right mood! In 1935 Albert Einstein,<br />

Boris Podolsky and Nathan Rosen, from now on abbreviated as EPR, came up with an example<br />

which considerably sharpened the discussion (EPR 1935). Using rigorous reasoning they argued that<br />

quantum mechanics is an incomplete theory. As an introduction to their argumentation we will first<br />

examine a simpler argument that Einstein formulated in the same year in a letter to Schrödinger,<br />

as paraphrased by A. Fine (1986, p. 37).<br />

Consider a composite system of two particles which interacted with each other but are so widely<br />

separated in space now that they no longer interact. Suppose they are in a state |ψ⟩ which is an eigenstate<br />

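of the total momentum $P_1 + P_2$ with eigenvalue 0, but is not an eigenstate of $P_1$ or $P_2$ separately,<br />

$$(P_1 + P_2)\,|\psi\rangle = 0 \quad \text{and} \quad P_1 |\psi\rangle \neq a\,|\psi\rangle, \quad P_2 |\psi\rangle \neq b\,|\psi\rangle, \quad \text{for } a, b \in \mathbb{R}. \qquad \text{(I. 1)}$$<br />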

Through a measurement of the momentum of particle 1 we can predict with certainty what the result<br />

will be of a measurement of the momentum of particle 2. Moreover, the measurement of particle 1<br />

has absolutely no physical influence on particle 2. But if it is possible to predict the momentum of<br />

particle 2 with certainty without any interaction with that particle, then particle 2 must already have<br />

this momentum before the measurement, and this must even be the case before the measurement of<br />

particle 1, since the measurement absolutely does not disturb particle 2. However, the value of this<br />

property of particle 2 cannot be derived from the quantum mechanical description using the state |ψ⟩.<br />

Therefore, quantum mechanics is incomplete.<br />
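The step from a measurement on particle 1 to a certain prediction for particle 2 can be sketched by expanding |ψ⟩ in momentum eigenstates, considering one spatial dimension only. The coefficient function c(p) is notation assumed here; the argument only requires that it is not concentrated on a single value of p.

$$|\psi\rangle = \int c(p)\, |p_1 = p\rangle \otimes |p_2 = -p\rangle \, dp, \qquad (P_1 + P_2)\,|\psi\rangle = \int c(p)\,(p - p)\, |p_1 = p\rangle \otimes |p_2 = -p\rangle \, dp = 0.$$

Every term pairs the value p for particle 1 with the value −p for particle 2, so finding the outcome p for $P_1$ fixes the outcome −p for $P_2$, while |ψ⟩ itself is an eigenstate of neither $P_1$ nor $P_2$.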

We see how Einstein succeeds, thanks to the strict correlation between the particles that quantum<br />

mechanics allows for, and thanks to the spatial separation of the particles, in refuting the argument of<br />

the measurement disturbance as a physical process. In the earlier examples we could imagine that the<br />

measurement creates the outcome (although this already seemed a hardly convincing escape in Einstein’s<br />

example of macroscopic bodies), and that this outcome did not exist prior to the measurement<br />

because of the disturbance that comes with the measurement. We now see that we cannot imagine<br />

these measurement disturbances as spatially limited, ‘local’ processes. Einstein spoke of “a spooky<br />

action at a distance” and of “telepathy”.<br />

1 The role of consciousness is regarded as essential by mathematicians and physicists like Von Neumann, London,<br />

Heitler and Wigner. The fact that they felt forced to take this highly unusual step in physical theory illustrates how serious<br />

the situation is.<br />

The case against the completeness of quantum mechanics gained strength with this example.<br />

However, objections can be made. (Later Einstein would be amused by the fact that everyone knew<br />

the argument was not correct, but that everyone had a different reason to think so.) The argument<br />

uses the fact that in quantum mechanics there are eigenstates of $P_1 + P_2$ in which the momentum<br />

of each individual particle is undetermined. It could be objected that such states are perhaps not<br />

physically realizable, that only eigenstates of $P_1 + P_2$ which are at the same time also eigenstates of<br />

both $P_1$ and $P_2$ would be realizable, and that we should therefore replace the state |ψ⟩ by a mixture<br />

of such eigenstates, in which case the argument no longer holds.<br />

The EPR article itself gives a more balanced argumentation that does not have this shortcoming.<br />

The article deviates from the above on two points. First, not only the momentum, but also the position<br />

of the two particles is brought into the consideration. Second, EPR formulate a ‘sufficient condition of<br />

reality’ by means of the term ‘element of physical reality’, which we will call EPR(EPR). As worded<br />

by EPR, p. 777,<br />

EPR(EPR): If, without in any way disturbing a system, we can predict with certainty<br />

(i.e., with probability equal to unity) the value of a physical quantity, then there exists an<br />

element of physical reality corresponding to this physical quantity.<br />

How else could we explain that we are able to predict the outcomes of measurements with certainty?<br />

A necessary, and certainly sufficient, condition for a complete physical theory is, that each<br />

element of physical reality must have a counterpart in the theoretical description,<br />

COMP(T): If a physical theory T is complete, then every element of physical reality<br />

must have a counterpart in the theory T .<br />

It is possible to choose for |ψ⟩ a state which is a simultaneous eigenstate of the commuting operators<br />

P 1 + P 2 and Q 1 − Q 2 . In Dirac - notation, and only considering one spatial dimension, such a<br />

state is written in the ‘p - language’ and in the ‘q - language’, as<br />

$$|\psi\rangle = \int_{\mathbb{R}} |p_1 = p\rangle \otimes |p_2 = -p\rangle \, e^{-i l p} \, dp = \int_{\mathbb{R}} |q_1 = q\rangle \otimes |q_2 = q - l\rangle \, dq,$$ (I. 2)<br />

where l is the eigenvalue of the mutual distance Q 1 − Q 2 and can be chosen arbitrarily large, and<br />

the terms with the ‘cartwheels’ are the direct products, see subsection II. 5. 2, p. 31, of which the first<br />

factor refers to particle 1, and the second to particle 2. The ‘p - language’ and the ‘q - language’ can<br />

be ‘translated’ into each other by means of a Fourier - transformation. 2<br />

2 Without Dirac - notation but in terms of Dirac’s δ - ‘functions’ the wave function has, in ‘p - language’ and in ‘q - language’,<br />

the following form,<br />

$$\psi(p_1, p_2) = e^{-i l p_1} \, \delta(p_1 + p_2) \quad \text{and} \quad \tilde{\psi}(q_1, q_2) = \delta(q_1 - q_2 - l).$$<br />



Although this state |ψ⟩ is an eigenstate of the total momentum P 1 + P 2 of the two particles and<br />

their mutual distance Q 1 − Q 2 , with eigenvalues 0 and l, respectively,<br />

$$(P_1 + P_2) \, |\psi\rangle = 0 \, |\psi\rangle \quad \text{and} \quad (Q_1 - Q_2) \, |\psi\rangle = l \, |\psi\rangle,$$ (I. 3)<br />

it is not an eigenstate of any of the 1 - particle operators P 1 , Q 1 , P 2 or Q 2 . However, given the<br />

outcome of a measurement of P 1 , e.g. a, we can predict the result of a measurement of P 2 with<br />

certainty, namely −a. In the same way, from a measurement of Q 1 with outcome x, the result of a<br />

measurement of Q 2 follows with certainty, namely x − l.<br />
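The existence of a simultaneous eigenstate rests on the fact that P 1 + P 2 and Q 1 − Q 2 commute.<br />
A short check, assuming the canonical commutation relations [Q j , P k ] = iħ δ jk (standard, but not stated in this section):<br />

```latex
\begin{align*}
[P_1 + P_2,\; Q_1 - Q_2]
  &= [P_1, Q_1] - [P_1, Q_2] + [P_2, Q_1] - [P_2, Q_2] \\
  &= -i\hbar - 0 + 0 - (-i\hbar) = 0 .
\end{align*}
```

The two non-vanishing contributions cancel, whereas each of the 1-particle pairs P 1 , Q 1 and P 2 , Q 2 separately does not commute.<br />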

Now the argumentation is as follows. If we were to measure the momentum P 1 of particle 1,<br />

then we could predict the value of P 2 with certainty, without disturbing particle 2. According to<br />

the aforementioned criterion the momentum P 2 of particle 2 must then correspond to an element of<br />

physical reality. On the other hand, if we were to measure the position Q 1 of particle 1, then we could<br />

predict the value of Q 2 with certainty, again without disturbing particle 2. In that case there must<br />

be an element of physical reality which corresponds to Q 2 . Therefore we can, depending on which<br />

measurement we perform on particle 1, assign an element of physical reality to particle 2.<br />

However, because of the absence of physical interaction between the particles there can be no real<br />

change in particle 2 as a result of what is done with particle 1. Consequently, particle 2 must have both<br />

elements of physical reality. But such a simultaneous assignment of exact position and momentum<br />

has no counterpart in the quantum mechanical formalism: there are no wave functions which are<br />

simultaneous eigenfunctions of position and momentum. The conclusion is unavoidable: the answer<br />

to the question in the title of their article ‘Can quantum-mechanical description of physical reality be<br />

considered complete?’ must be negative.<br />

Notice that it is not necessary to perform the measurements of P 1 and Q 1 simultaneously; the<br />

only thing that matters is the possibility to choose whether to predict the position or momentum of<br />

particle 2 with certainty. Because of the absence of interaction between both particles it makes no<br />

difference for particle 2 which choice is made for particle 1. This part of the argumentation relies<br />

on the supposition that the elements of physical reality have a local character. This implicit but<br />

reasonable locality premise runs as follows,<br />

LOC(EPR): Performing a measurement on a physical system S 1 does not have an instantaneous<br />

effect on elements of physical reality belonging to any system S 2 which is spatially<br />

separated from S 1 .<br />

We can thus summarize the argument of EPR schematically: quantum mechanics, QM, together<br />

with EPR(EPR) and LOC(EPR), implies that quantum mechanics is an incomplete theory,<br />

i.e. not COMP(QM). In symbols:<br />

QM ∧ EPR(EPR) ∧ LOC(EPR) → ¬ COMP(QM). (I. 4)<br />
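The logical shape of (I. 4) can be explored with a small truth table. The sketch below treats the four statements as<br />
plain Boolean variables (a simplification made only for illustration; the function names are hypothetical) and checks that (I. 4)<br />
is equivalent to its contrapositive reading: whoever accepts QM and its completeness must give up EPR(EPR) or LOC(EPR).<br />

```python
from itertools import product


def implies(p, q):
    """Material implication p -> q."""
    return (not p) or q


def epr_schema(qm, epr, loc, comp):
    """(I. 4): QM and EPR(EPR) and LOC(EPR) -> not COMP(QM)."""
    return implies(qm and epr and loc, not comp)


def contrapositive(qm, epr, loc, comp):
    """If QM holds and is complete, then EPR(EPR) or LOC(EPR) fails."""
    return implies(qm and comp, (not epr) or (not loc))


assignments = list(product([False, True], repeat=4))

# Both readings agree on every one of the 16 truth assignments.
equivalent = all(epr_schema(*v) == contrapositive(*v) for v in assignments)

# The only assignment excluded by (I. 4) accepts all four statements at once.
violating = [v for v in assignments if not epr_schema(*v)]

print(equivalent, violating)
```

Read this way, the schema does not say which premise a defender of completeness should reject; it only says that at least one must go.<br />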

In comparison to the foregoing, the strength of this argument is, in the first place, the greater precision<br />

with which the argumentation has been set up: the conclusion follows logically from a number<br />

of explicitly formulated premises and conditions. Moreover, we see that we are able to attribute to<br />

particle 2 both position and momentum without interacting with particle 2. This means that we cannot<br />

avoid the argumentation by assuming that for the correct quantum mechanical description the



given wave function ψ must be replaced by a mixture of eigenstates. Such eigenstates of position and<br />

momentum are simply not available in quantum mechanics. The possibility of assigning values to both P 2<br />

and Q 2 strikes at the heart of the complementarity idea.<br />

EPR anticipated the objection that only that which has been measured is real (EPR 1935, p. 780),<br />

Indeed, one would not arrive at our conclusion if one insisted that two or more physical<br />

quantities can be regarded as simultaneous elements of reality only when they can be<br />

simultaneously measured or predicted. On this point of view, since either one or the<br />

other, but not both simultaneously, of the quantities P and Q can be predicted, they are<br />

not simultaneously real. This makes the reality of P and Q depend upon the process of<br />

measurement carried out on the first system, which does not disturb the second system in<br />

any way. No reasonable definition of reality could be expected to permit this.<br />

They conclude their article with the next paragraph,<br />

While we have thus shown that the wave function does not provide a complete description<br />

of physical reality, we left open the question of whether or not such a description exists.<br />

We believe, however, that such a theory is possible.<br />

The problem of whether a complete theory is possible is called the hidden variable problem. The<br />

so-called ‘hidden variable theories’ are attempts to solve this problem. We will come back to this in<br />

chapter V.<br />

Bohr’s (1935a) response to the argument of EPR aims at the question to what extent the condition<br />

for an element of ‘physical reality’, as worded by EPR, is fulfilled in their example. The next quotation<br />

is from Bohr (1935b, p. 700),<br />

From our point of view we now see that the wording of the aforementioned criterion of<br />

physical reality proposed by Einstein, Podolsky and Rosen contains an ambiguity as regards<br />

the meaning of the expression “without in any way disturbing a system.” Of course<br />

there is in a case like that just considered no question of a mechanical disturbance of<br />

the system under investigation during the last critical stage of the measuring procedure.<br />

But even at this stage there is essentially the question of an influence on the very conditions<br />

which define the possible types of predictions regarding the future behavior of<br />

the system. Since these conditions constitute an inherent element of the description of<br />

any phenomenon to which the term ‘physical reality’ can be properly attached, we see<br />

that the argumentation of the mentioned authors does not justify their conclusion that<br />

quantum mechanical description is essentially incomplete. (Emphasis added.)<br />

It is not easy to completely comprehend what Bohr says here. Evidently, he abandons the original<br />

idea that the measurement disturbance creates the measurement results, or, at least, that such a creation<br />

can be understood as a physical process. It is replaced by the idea that applicability of physical concepts<br />

depends on the context of measurement. Performing a measurement on one of the particles<br />

is considered as determinative for the applicability of concepts to the other particle. Bohr says that<br />

the measurement disturbance is not a mechanical disturbance; apparently LOC(EPR) continues to<br />

apply for him if we, using the term ‘influence’, refer to a mechanical interaction, but not if we mean



by ‘influence’ the ‘defining effect’ of the context of measurement. The experimental circumstances<br />

define what you may call physical reality. Physical reality is not defined by experiments you could do,<br />

as is the case according to EPR, but exclusively by experiments you actually do. Under circumstances<br />

as described in the EPR experiment this ‘defining effect’ of the experimental setup also reaches parts<br />

of the system with which the measuring apparatus has no physical interaction.<br />

A distinct difference between Einstein and Bohr is that Einstein wants to visualize reality independent<br />

of observation, whereas Bohr is satisfied with complementary pictures of which the applicability<br />

always remains dependent on the chosen measurement setup. In 1955 Einstein says (Fine 1986, p. 95)<br />

It is basic for physics that one assumes a real world existing independently from any act<br />

of perception. But this we do not know. We take it only as a programme in our scientific<br />

endeavors. This programme is, of course, prescientific and our ordinary language is<br />

already based on it.<br />

And concerning the EPR situation he says (Schilpp 1949, p. 85)<br />

But on one supposition we should, in my opinion, absolutely hold fast: the real factual<br />

situation of the system S 2 is independent of what is done with the system S 1 , which is<br />

spatially separated from the former.<br />

Bohr’s conceptions concerning physical reality are much more difficult to characterize. According<br />

to him there is no independent reality of which the physical theory would have to give an unambiguous<br />

representation. He writes (Schilpp 1949, p. 211)<br />

Thus, a sentence like “we cannot know both the momentum and the position of an atomic<br />

object” immediately raises questions as to the physical reality of two such attributes of the<br />

object, which can be answered only by referring to the conditions for the unambiguous<br />

use of space - time concepts, on the one hand, and dynamical conservation laws, on the<br />

other hand.<br />

An exhaustive description of reality must always use concepts which themselves remain dependent<br />

on mutually excluding contexts. Bohr says (A. Petersen 1963, p. 11)<br />

The word ‘reality’ is also a word, a word which we must learn to use correctly.<br />

He constantly emphasizes the restricted applicability of our physical concepts, which makes the link<br />

between description and reality very complicated. Petersen mentions (ibid., p. 12)<br />

When asked whether the algorithm of quantum mechanics could be considered as somehow<br />

mirroring an underlying quantum world, Bohr would answer, “There is no quantum<br />

world. There is only an abstract quantum physical description. It is wrong to think that<br />

the task of physics is to find out how nature is. Physics concerns what we can say about<br />

nature.”



Einstein’s conceptions are, in a certain way, easier than those of Bohr and correspond to the<br />

intuition of the majority of physicists. When the preponderance of the Copenhagen school started to<br />

wane, in the 1960s, attention for Einstein’s viewpoint revived.<br />

In 1964, John Bell gave a reconstruction of the EPR experiment (see chapter VII) satisfying Einstein’s<br />

requirement that the real, factual situation of physical system S 2 is independent of what is<br />

done with system S 1 , the two systems being spatially separated. He constructed a very general model<br />

and made the surprising discovery that such a model cannot completely reproduce the quantum mechanical<br />

predictions. Especially remarkable are the broad generality of his derivation and the fact that<br />

the differences with quantum mechanics are large enough to be measurable. Sensationally, a<br />

‘philosophical’ issue thus came within the range of experimental physics! Abner Shimony has spoken<br />

in this respect of experimental metaphysics.<br />

Bell’s work is an attempt to solve the completeness problem. Afterwards, attempts were undertaken<br />

to really carry out the EPR experiment, which was thus far only a thought experiment. The first<br />

experiment was done in 1972 by Freedman and Clauser. Later, several other experiments were<br />

done, the highlight of which was, in 1982, the experiment of Alain Aspect and his group in Paris. In<br />

turn, this has been superseded by the experiments of Anton Zeilinger and his groups in Vienna and<br />

Innsbruck (e.g. Weihs 1998). The results of these experiments are in good to excellent agreement<br />

with quantum mechanics, and therefore in conflict with all models meeting Einstein’s requirements.<br />

The latter conclusion applies irrespective of the validity of quantum mechanics.<br />

These results brought about a great number of responses and are one of the main causes of the<br />

revived interest in interpretation problems of quantum mechanics. The discussion focusses on the<br />

question of what exactly the suppositions are that lead to the result of Bell, and whether his model is<br />

indeed the most general model that meets Einstein’s requirements.<br />

The consequences of Bell’s result seem to be considerable. It can be argued that no independent<br />

existence can be granted to objects that at some time interacted, irrespective of how far apart they<br />

now are. This suggests that reality cannot be reduced<br />

to the ‘sum’ of its parts and that a more holistic approach is imperative, making our picture of nature<br />

much more complicated.<br />

Through the discussion of the EPR argument some more basic concepts are added to our list:<br />

7. element of physical reality,<br />

8. separability of physical systems,<br />

9. locality,<br />

10. holism.<br />

These ten concepts play a central role in the research on the foundations of quantum mechanics.


II<br />

THE FORMALISM<br />

As far as the laws of mathematics refer to reality, they are not certain; and as far as they<br />

are certain, they do not refer to reality.<br />

— Albert Einstein<br />

In mathematics you don’t understand things. You just get used to them.<br />

— John von Neumann<br />

The usual mathematical formulation of quantum mechanics was developed by John von Neumann<br />

in 1932 as an operator calculus on a Hilbert space. We will not need all the details of this<br />

calculus and therefore give only a succinct review. For our purposes we can limit ourselves to a<br />

finite - dimensional Hilbert space, a complex vector space with an inner product. We will give an<br />

overview of the elementary concepts of this Hilbert space, and in an addendum concisely summarize<br />

the infinite - dimensional case. For a more extensive treatment of Hilbert spaces we refer<br />

to the first chapters of E. Prugovečki (2006).<br />

II. 1<br />

FINITE - DIMENSIONAL HILBERT SPACES<br />

We start this chapter by defining a space called a Hilbert space, denoted by H. The elements of H are<br />

called vectors. Following Dirac’s ket notation the vectors will be written as |α⟩,|β⟩,|γ⟩,|ϕ⟩,|ψ⟩,|χ⟩, . . . ;<br />

complex numbers will be denoted by the first letters of the alphabet, a, b, c ∈ C.<br />

Vectors can be added, and multiplied by a complex number, also called a scalar; the result remains<br />

in H, i.e., for all |ϕ⟩, |ψ⟩ ∈ H and a, b ∈ C we have<br />

a|ϕ⟩ + b|ψ⟩ ∈ H. (II. 1)<br />

In other words, H is closed under linear combinations.<br />

The addition is commutative and associative,<br />

|ϕ⟩ + |ψ⟩ = |ψ⟩ + |ϕ⟩, (II. 2)<br />

|ϕ⟩ + ( |ψ⟩ + |χ⟩ ) = ( |ϕ⟩ + |ψ⟩ ) + |χ⟩. (II. 3)<br />

We require the existence of a null vector, 0 ∈ H, which is provably unique and has the property<br />

that for all |ϕ⟩ ∈ H<br />

0 + |ϕ⟩ = |ϕ⟩, (II. 4)



and that every vector has an additive inverse, i.e., for every |ϕ⟩ ∈ H there is a vector |ϕ ′ ⟩ ∈ H, also<br />

provably unique, such that<br />

|ϕ⟩ + |ϕ ′ ⟩ = 0 . (II. 5)<br />

The scalar multiplication is distributive and associative,<br />

(a + b) ( |ϕ⟩ + |ψ⟩ ) = a |ϕ⟩ + a |ψ⟩ + b |ϕ⟩ + b |ψ⟩, (II. 6)<br />

a ( b |ϕ⟩ ) = (a b) |ϕ⟩, (II. 7)<br />

and we demand that<br />

1 |ψ⟩ = |ψ⟩. (II. 8)<br />

Incidentally we also write<br />

a |ψ⟩ ≡ |a ψ⟩ ≡ |ψ⟩ a. (II. 9)<br />

EXERCISE 1. Prove (a) 0|ϕ⟩ = 0 ,<br />

(b) the additive inverse of |ϕ⟩ equals −1|ϕ⟩.<br />

An inner product on a vector space is a mapping H × H → C, where the image in C<br />

of ( |ϕ⟩, |ψ⟩ ) ∈ H × H is written as ⟨ϕ | ψ⟩. The inner product has the following properties:<br />

(i) ⟨ϕ | a ψ + b χ⟩ = a ⟨ϕ | ψ⟩ + b ⟨ϕ | χ⟩,<br />

(ii) ⟨ϕ | ψ⟩ = ⟨ψ | ϕ⟩ ∗ ,<br />

(iii) ⟨ϕ | ϕ⟩ ≥ 0, (II. 10)<br />

(iv) ⟨ϕ | ϕ⟩ = 0 iff |ϕ⟩ = 0 .<br />

The value<br />

∥ψ∥ := √ ⟨ψ | ψ⟩ (II. 11)<br />

is called the norm of |ψ⟩ and meets the usual requirements for a norm: its value is positive, except<br />

for the zero vector, which is assigned 0; it is homogeneous, in the sense that ∥aψ∥ = |a|∥ψ∥; and it<br />

satisfies the triangle inequality ∥ψ + ϕ∥ ≤ ∥ψ∥ + ∥ϕ∥. A vector is called a unit vector if the norm<br />

equals 1.



An important inequality is the Cauchy - Schwarz inequality<br />

|⟨ϕ | ψ⟩|² ≤ ⟨ϕ | ϕ⟩ ⟨ψ | ψ⟩. (II. 12)<br />
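As a numerical illustration (not a proof), the properties of the inner product, the norm (II. 11) and the Cauchy-Schwarz<br />
inequality (II. 12) can be checked on C^3. The sketch assumes the standard inner product on C^N, which is antilinear in the first<br />
argument; the vectors and scalars are arbitrary choices made for this example.<br />

```python
import math


def inner(phi, psi):
    """<phi|psi> on C^N: conjugate the first argument, sum the products."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))


def norm(psi):
    """(II. 11): the square root of <psi|psi>, which is real and non-negative."""
    return math.sqrt(inner(psi, psi).real)


phi = [1 + 2j, 0.5j, -1]
psi = [2 - 1j, 3, 1j]
chi = [0, 1, 1 + 1j]
a, b = 0.5 - 1j, 2j

# Property (i): linearity in the second argument.
combo = [a * x + b * y for x, y in zip(psi, chi)]
linear = abs(inner(phi, combo) - (a * inner(phi, psi) + b * inner(phi, chi))) < 1e-12

# Property (ii): conjugate symmetry.
symmetric = abs(inner(phi, psi) - inner(psi, phi).conjugate()) < 1e-12

# (II. 12) and the triangle inequality for this particular pair of vectors.
cauchy_schwarz = abs(inner(phi, psi)) ** 2 <= norm(phi) ** 2 * norm(psi) ** 2
triangle = norm([x + y for x, y in zip(psi, phi)]) <= norm(psi) + norm(phi)

print(linear, symmetric, cauchy_schwarz, triangle)
```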

EXERCISE 2. Prove (a) the Cauchy - Schwarz inequality (II. 12),<br />

(b) the definition of the norm satisfies the standard requirements for a norm.<br />

The n vectors |α 1 ⟩, . . . , |α n ⟩ are called (linearly) independent if it follows from<br />

$$\sum_{i=1}^{n} c_i \, |\alpha_i\rangle = 0$$ (II. 13)<br />

that all coefficients c i are equal to zero, otherwise the vectors are called dependent.<br />

EXERCISE 3. Prove that mutually orthogonal non-zero vectors, i.e. vectors with ⟨α i | α j ⟩ = 0 for i ≠ j, are linearly independent.<br />

A set of vectors |α 1 ⟩, . . . , |α N ⟩ in H is complete 1 if every vector |ψ⟩ ∈ H can be written as a<br />

linear combination of this set,<br />

$$|\psi\rangle = \sum_{i=1}^{N} c_i \, |\alpha_i\rangle.$$ (II. 14)<br />

A complete, independent set of vectors is called a basis. A basis is called orthonormal if<br />

⟨α i | α j ⟩ = δ ij , (II. 15)<br />

where δ ij is the Kronecker delta. It can be proved that every basis of a space H contains the same<br />

number of elements; this number is, by definition, the dimension of H, and is written dim H. The<br />

dimension of a Hilbert space is infinite if every finite set of linearly independent vectors is incomplete.<br />

If |α 1 ⟩, . . . , |α N ⟩ is an orthonormal basis, with N = dim H, then it follows from (II. 15) that the<br />

coefficients in (II. 14) are given by<br />

c i = ⟨α i | ψ⟩, (II. 16)<br />

and the vectors |ψ⟩ can thus be represented in such a basis by columns of N complex numbers.<br />

Therefore, an N-dimensional Hilbert space can also be written as C^N .<br />
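The expansion (II. 14) with coefficients (II. 16) can be tried out on C^2. The basis and the vector below are hypothetical<br />
choices for this sketch; any orthonormal basis would do.<br />

```python
import math


def inner(phi, psi):
    """<phi|psi> on C^N, antilinear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))


s = 1 / math.sqrt(2)
basis = [[s, s], [s, -s]]          # hypothetical orthonormal basis |alpha_1>, |alpha_2>
psi = [1 + 1j, 2 - 3j]             # arbitrary vector, given in the standard basis

# (II. 16): c_i = <alpha_i|psi>
coeffs = [inner(alpha, psi) for alpha in basis]

# (II. 14): rebuild |psi> as the sum of c_i |alpha_i>, component by component.
rebuilt = [sum(c * alpha[k] for c, alpha in zip(coeffs, basis)) for k in range(2)]

print(coeffs, rebuilt)
```

The column of numbers `coeffs` is the representation of the same vector in the new basis; `rebuilt` recovers the original components up to rounding.<br />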

1 The use of the term ‘complete’ for a system of vectors should not be confused with the same phrase as used within the<br />

context of the foundations of quantum mechanics, that is, as a property of a physical theory.



With (II. 16), in an orthonormal basis we have<br />

$$|\psi\rangle = \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_N \end{pmatrix}$$ (II. 17)<br />

and hence ⟨ψ| = (c 1 ∗ , c 2 ∗ , . . . , c N ∗ ), therefore<br />

$$|\psi\rangle \langle\psi| = \begin{pmatrix} c_1 c_1^* & \cdots & c_1 c_N^* \\ \vdots & \ddots & \vdots \\ c_N c_1^* & \cdots & c_N c_N^* \end{pmatrix},$$ (II. 18)<br />

from which it is evident that for the vectors of the orthonormal basis {|α i ⟩} it holds that<br />

$$\sum_{i=1}^{N} |\alpha_i\rangle \langle\alpha_i| = \mathbb{1},$$ (II. 19)<br />

with 𝟙 the identity mapping on H,<br />

𝟙 |ψ⟩ = |ψ⟩ ∀ |ψ⟩ ∈ H. (II. 20)<br />

Using (II. 14) and (II. 16), we see that an orthonormal basis is indeed characterized by the relation<br />

$$|\psi\rangle = \sum_{i=1}^{N} \langle\alpha_i | \psi\rangle \, |\alpha_i\rangle = \sum_{i=1}^{N} |\alpha_i\rangle \langle\alpha_i | \psi\rangle.$$ (II. 21)<br />
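The completeness relation (II. 19) can also be verified entry by entry. The sketch below builds the matrices |α i ⟩ ⟨α i |<br />
as in (II. 18) for a hypothetical orthonormal basis of C^2 with complex components and adds them up.<br />

```python
import math


def outer(phi):
    """Matrix of |phi><phi|: entry (r, c) is phi_r times the conjugate of phi_c."""
    return [[a * b.conjugate() for b in phi] for a in phi]


s = 1 / math.sqrt(2)
basis = [[s, 1j * s], [s, -1j * s]]     # hypothetical orthonormal basis of C^2

# Sum the matrices |alpha_i><alpha_i| entry by entry.
total = [[sum(outer(alpha)[r][c] for alpha in basis) for c in range(2)]
         for r in range(2)]

identity = [[1, 0], [0, 1]]
is_identity = all(abs(total[r][c] - identity[r][c]) < 1e-12
                  for r in range(2) for c in range(2))
print(is_identity)
```

The off-diagonal entries of the two matrices cancel, while the diagonal entries add up to 1.<br />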

The definition of a finite-dimensional Hilbert space is now complete: it is a finite-dimensional<br />

complex vector space with an inner product, which determines the norm by means of (II. 11). A real<br />

finite-dimensional Hilbert space is obtained by replacing C everywhere by R, i.e., the scalars are<br />

taken from R and the inner product is always real. In section II. 6 we will see that for the infinite-dimensional<br />

case the definition must be extended with two requirements, ‘separability’ and ‘completeness’, which<br />

can be proved to hold automatically in the finite-dimensional case.<br />

II. 2<br />

OPERATORS<br />

An operator A on a Hilbert space H is a linear mapping of H into itself,<br />

A : H → H, |ψ⟩ ↦→ A |ψ⟩ with A ( a |ψ⟩ + b |ϕ⟩ ) = a A |ψ⟩ + b A |ϕ⟩. (II. 22)<br />

From (II. 16) we saw that in a given orthonormal basis |α 1 ⟩, . . . , |α N ⟩ the vectors |ψ⟩ ∈ H are<br />

unambiguously represented by columns of N complex numbers c i = ⟨α i | ψ⟩. This corresponds to the<br />



representation of an operator A as an N × N - matrix A in a basis {|α i ⟩}, and the coefficients of the<br />

vector A|ψ⟩ in this basis are, using (II. 19),<br />

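$$\langle\alpha_i | A | \psi\rangle = \langle\alpha_i | A \, \mathbb{1} | \psi\rangle = \sum_{j=1}^{N} \langle\alpha_i | A | \alpha_j\rangle \langle\alpha_j | \psi\rangle = \sum_{j=1}^{N} A_{ij} \, c_j,$$ (II. 23)<br />

with<br />

$$A_{ij} := \langle\alpha_i | A | \alpha_j\rangle.$$ (II. 24)<br />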
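A minimal sketch of (II. 23) and (II. 24) on C^2: the operator is given as a matrix in the standard basis, its matrix<br />
elements are computed in a second orthonormal basis, and the two sides of (II. 23) are compared. The operator, basis and vector<br />
are hypothetical values chosen for illustration.<br />

```python
import math


def inner(phi, psi):
    """<phi|psi> on C^N, antilinear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))


def apply(op, psi):
    """Apply an operator, given as a matrix in the standard basis, to a vector."""
    return [sum(row[c] * psi[c] for c in range(len(psi))) for row in op]


s = 1 / math.sqrt(2)
basis = [[s, s], [s, -s]]          # hypothetical orthonormal basis |alpha_i>
op = [[1, 2j], [3, 4]]             # hypothetical operator A in the standard basis
psi = [1 - 1j, 2]

# (II. 24): A_ij = <alpha_i| A |alpha_j>
A = [[inner(ai, apply(op, aj)) for aj in basis] for ai in basis]

# (II. 16): c_j = <alpha_j|psi>
c = [inner(aj, psi) for aj in basis]

lhs = [inner(ai, apply(op, psi)) for ai in basis]                  # <alpha_i| A |psi>
rhs = [sum(A[i][j] * c[j] for j in range(2)) for i in range(2)]    # sum over j of A_ij c_j

print(all(abs(x - y) < 1e-12 for x, y in zip(lhs, rhs)))
```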

Operators A and B can be added and multiplied,<br />

(A + B) |ψ⟩ := A |ψ⟩ + B |ψ⟩ and (A B) |ψ⟩ := A ( B |ψ⟩ ) . (II. 25)<br />

The adjoint A † of an operator A is defined by the following equation<br />

⟨ψ | A † | ϕ⟩ = ⟨ϕ | A | ψ⟩ ∗ ∀ |ϕ⟩, |ψ⟩ ∈ H. (II. 26)<br />

EXERCISE 4. Show that for the matrix representation in an orthonormal basis it holds that<br />

$$\left( A^\dagger \right)_{ij} = A_{ji}^{*}.$$<br />

Every operator on a finite - dimensional vector space has a unique adjoint, and the following holds<br />

(c A) † = c ∗ A † ,<br />

(A + B) † = A † + B † ,<br />

(A B) † = B † A † ,<br />

$$\left( A^\dagger \right)^\dagger = A.$$ (II. 27)<br />
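The rules (II. 27) can be checked numerically once the adjoint is represented by the conjugate transpose, which is the content<br />
of Exercise 4 and is taken for granted here. The matrices and the scalar are arbitrary illustrative values.<br />

```python
def dagger(m):
    """Conjugate transpose of a square matrix: entry (r, c) is the conjugate of m[c][r]."""
    n = len(m)
    return [[m[c][r].conjugate() for c in range(n)] for r in range(n)]


def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)] for r in range(n)]


def close(a, b, tol=1e-12):
    """Entry-wise comparison of two matrices."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


A = [[1 + 1j, 2], [0, 3 - 2j]]
B = [[0, 1j], [4, 1 - 1j]]
c = 2 - 3j

scalar_rule = close(dagger([[c * x for x in row] for row in A]),
                    [[c.conjugate() * x for x in row] for row in dagger(A)])
sum_rule = close(dagger([[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]),
                 [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(dagger(A), dagger(B))])
product_rule = close(dagger(matmul(A, B)), matmul(dagger(B), dagger(A)))
involution = close(dagger(dagger(A)), A)

print(scalar_rule, sum_rule, product_rule, involution)
```

Note the reversed order in the product rule: comparing with `matmul(dagger(A), dagger(B))` instead would fail for these matrices.<br />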

An operator B is called an inverse of A if<br />

A B = B A = 𝟙. (II. 28)<br />

In this case we write A −1 for B, because the inverse, if it exists, is unique. Not every operator has an<br />

inverse; an example of an operator without an inverse on the Hilbert space C^2 is<br />

$$\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}.$$ (II. 29)<br />

The trace of an operator A is defined as follows,<br />

$$\operatorname{Tr} A := \sum_{i=1}^{N} \langle\gamma_i | A | \gamma_i\rangle,$$ (II. 30)<br />

where |γ 1 ⟩, . . . , |γ N ⟩ is an arbitrary orthonormal basis and N = dim H.



EXERCISE 5. Show that Tr A is independent of the choice of the orthonormal basis.<br />

The trace has the following properties:<br />

Tr A † = (Tr A) ∗ ,<br />

Tr (bA + cB) = b Tr A + c Tr B,<br />

Tr AB = Tr BA. (II. 31)<br />
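The three properties in (II. 31) can be illustrated on 2 × 2 matrices, reading the first one as Tr A † = (Tr A) ∗ .<br />
This is a numerical check on arbitrary example matrices, not the proof asked for in the exercise below.<br />

```python
def trace(m):
    """Sum of the diagonal entries, i.e. (II. 30) in the basis in which m is written."""
    return sum(m[i][i] for i in range(len(m)))


def dagger(m):
    n = len(m)
    return [[m[c][r].conjugate() for c in range(n)] for r in range(n)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)] for r in range(n)]


A = [[1 + 1j, 2], [0, 3 - 2j]]
B = [[0, 1j], [4, 1 - 1j]]
b, c = 2j, 1 - 1j

adjoint_rule = abs(trace(dagger(A)) - trace(A).conjugate()) < 1e-12
combo = [[b * A[r][k] + c * B[r][k] for k in range(2)] for r in range(2)]
linear_rule = abs(trace(combo) - (b * trace(A) + c * trace(B))) < 1e-12
cyclic_rule = abs(trace(matmul(A, B)) - trace(matmul(B, A))) < 1e-12

print(adjoint_rule, linear_rule, cyclic_rule)
```

The products `matmul(A, B)` and `matmul(B, A)` are different matrices here; only their traces coincide.<br />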

EXERCISE 6. Prove the three statements in (II. 31).<br />

We will now list the most important types of operators. An operator A is called normal if it<br />

commutes with its adjoint,<br />

$$\left[ A, A^\dagger \right] := A A^\dagger - A^\dagger A = 0,$$ (II. 32)<br />

where 0 is actually the ‘zero operator’, which maps all vectors to the zero vector 0 . An operator is called<br />

self - adjoint or Hermitian if it is equal to its adjoint,<br />

A † = A, (II. 33)<br />

and with the first statement of (II. 31) we see that the trace of a self-adjoint operator is always real.<br />

Self-adjoint operators are normal, but not all normal operators are self-adjoint. Unitary operators,<br />

for example, are normal but in general not self-adjoint; they are defined by<br />

U † = U −1 . (II. 34)<br />

EXERCISE 7. Prove that a unitary operator preserves the inner product, i.e., for all |ϕ⟩,|ψ⟩ ∈ H<br />

the following holds: if |ϕ ′ ⟩ = U |ϕ⟩ and |ψ ′ ⟩ = U |ψ⟩ then ⟨ψ ′ | ϕ ′ ⟩ = ⟨ψ | ϕ⟩.<br />
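A numerical illustration of a unitary operator on C^2 (it does not replace the proof asked for in Exercise 7). The matrix<br />
below, a rotation multiplied by a phase factor, is a hypothetical example; the angles are arbitrary.<br />

```python
import cmath
import math


def inner(phi, psi):
    """<phi|psi> on C^N, antilinear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))


def apply(op, psi):
    return [sum(row[c] * psi[c] for c in range(len(psi))) for row in op]


theta, alpha = 0.7, 1.3                      # arbitrary illustrative angles
phase = cmath.exp(1j * alpha)
U = [[phase * math.cos(theta), -phase * math.sin(theta)],
     [phase * math.sin(theta), phase * math.cos(theta)]]

phi = [1 + 2j, -1j]
psi = [0.5, 3 - 1j]

before = inner(psi, phi)
after = inner(apply(U, psi), apply(U, phi))
preserved = abs(before - after) < 1e-12

# U is normal but not self-adjoint: it differs from its conjugate transpose.
self_adjoint = all(abs(U[r][c] - U[c][r].conjugate()) < 1e-12
                   for r in range(2) for c in range(2))

print(preserved, self_adjoint)
```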

An operator A is called positive, written A ≥ 0, if<br />

⟨ψ | A | ψ⟩ ≥ 0 ∀ |ψ⟩ ∈ H. (II. 35)<br />

An operator P is called a projection operator, or a projector for short, if it is self - adjoint and<br />

idempotent,<br />

P = P † and P 2 = P. (II. 36)



An example of a projector, apart from the obvious examples of the zero operator 0 and the identity<br />

operator 𝟙, is the mapping P ϕ = |ϕ⟩ ⟨ϕ| which projects on a given unit vector |ϕ⟩,<br />

P ϕ : |ψ⟩ ↦→ ⟨ϕ | ψ⟩ |ϕ⟩ = |ϕ⟩ ⟨ϕ | ψ⟩. (II. 37)<br />

EXERCISE 8. Show that (a) every projector is positive,<br />

(b) if P is a projector, then 𝟙 − P is one also.<br />

Projectors are the workhorses of Hilbert space. Nearly all of our further considerations concerning<br />

quantum mechanics can be formulated in terms of projectors, and therefore we will now discuss their<br />

properties in somewhat more detail.<br />

We write the set of all projectors on a Hilbert space H as P (H). Every projector P can be<br />

characterized by means of its range, i.e. the set<br />

H P := { P |ψ⟩ : |ψ⟩ ∈ H } . (II. 38)<br />

This set is closed under linear combinations and thus forms another Hilbert space by itself, called a<br />

subspace of H. Conversely, every subspace of H corresponds unambiguously to a projector. 2<br />

The subspace corresponding to a projector is also called its eigenspace, and if the dimension of its<br />

eigenspace is N, the projector is called N - dimensional.<br />

Two projectors P 1 and P 2 are called mutually orthogonal, written as P 1 ⊥ P 2 , if<br />

P 1 P 2 = 0. (II. 39)<br />

In that case their eigenspaces are also orthogonal,<br />

$$P_1 \perp P_2 \quad \text{iff} \quad \forall \, |\psi\rangle \in H_{P_1}, \ \forall \, |\varphi\rangle \in H_{P_2} \ \text{it holds that} \ \langle\varphi | \psi\rangle = 0.$$ (II. 40)<br />

EXERCISE 9. Verify that P 1 P 2 = 0 =⇒ P 2 P 1 = 0 holds for projectors.<br />

For two orthogonal projectors P 1 ⊥ P 2 , the sum P 1 + P 2 is also a projector since it is, as can be<br />

seen using (II. 27), self - adjoint, and it is idempotent,<br />

$$(P_1 + P_2)^2 = P_1^2 + P_1 P_2 + P_2 P_1 + P_2^2 = P_1^2 + P_2^2 = P_1 + P_2,$$ (II. 41)<br />

thereby satisfying the requirements (II. 36). The eigenspace of the projector P 1 + P 2 is the linear<br />

space spanned by the vectors in H P1 and H P2 .<br />
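The argument leading to (II. 41) can be followed numerically. In the sketch below two 1-dimensional projectors of the<br />
form (II. 37) are built from hypothetical, mutually orthogonal unit vectors in C^3, and their sum is tested against (II. 36).<br />

```python
import math


def outer(phi):
    """Matrix of the projector |phi><phi| for a unit vector phi."""
    return [[a * b.conjugate() for b in phi] for a in phi]


def matmul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)] for r in range(n)]


def dagger(m):
    n = len(m)
    return [[m[c][r].conjugate() for c in range(n)] for r in range(n)]


def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


s = 1 / math.sqrt(2)
P1 = outer([s, 1j * s, 0])       # projector on a hypothetical unit vector
P2 = outer([s, -1j * s, 0])      # projector on a unit vector orthogonal to the first
zero = [[0] * 3 for _ in range(3)]
S = [[P1[r][c] + P2[r][c] for c in range(3)] for r in range(3)]

orthogonal = close(matmul(P1, P2), zero) and close(matmul(P2, P1), zero)   # (II. 39)
idempotent = close(matmul(S, S), S)                                        # (II. 41)
self_adjoint = close(dagger(S), S)

print(orthogonal, idempotent, self_adjoint)
```

For these two vectors the sum is the projector on the plane spanned by the first two coordinate axes.<br />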

2 In infinite - dimensional Hilbert spaces this only holds for closed subspaces.



A set of projectors P 1 , . . . , P N is called mutually orthogonal if<br />

P i P j = δ ij P i for i, j = 1, . . . , N, (II. 42)<br />

a set of mutually orthogonal projectors is called complete if<br />

$$\sum_{i=1}^{N} P_i = \mathbb{1}.$$ (II. 43)<br />

In particular, in accordance with (II. 19), for an orthonormal basis |α 1 ⟩, . . . , |α N ⟩ it holds that the<br />

associated 1-dimensional projectors form a complete set,<br />

$$\sum_{i=1}^{N} |\alpha_i\rangle \langle\alpha_i| = \mathbb{1}.$$ (II. 44)<br />

II. 3<br />

EIGENVALUE PROBLEM AND SPECTRAL THEOREM<br />

If |β 1 ⟩, . . . , |β N ⟩ is an arbitrary orthonormal basis, an operator A is represented in this basis as<br />

an arbitrary N × N - matrix,<br />

A ij = ⟨β i | A | β j ⟩. (II. 45)<br />

A powerful tool for the study of such matrices is obtained if they can be ‘diagonalized’, i.e., if an<br />

orthonormal basis |α 1 ⟩, . . . , |α N ⟩ can be found where the matrix representation of A is of the form<br />

$$A = \begin{pmatrix} a_1 & & 0 \\ & \ddots & \\ 0 & & a_N \end{pmatrix},$$ (II. 46)<br />

or, equivalently,<br />

A ij = a j δ ij . (II. 47)<br />

For such a basis it holds that<br />

A |α i ⟩ = a i |α i ⟩. (II. 48)<br />

Equation (II. 48) is called the eigenvalue equation of the operator A, the values a i are called the<br />

eigenvalues of A, the set of eigenvalues of A the spectrum of A, written as Spec A, the vectors |α i ⟩<br />

are called the eigenvectors, and the system |α 1 ⟩, . . . , |α N ⟩ an eigenbasis of A. For a self - adjoint<br />

operator it holds that the eigenvalues are all real, and the eigenvalues are not negative if the operator<br />

is positive. For a unitary operator U all eigenvalues u i ∈ C are on the complex unit circle, |u i | = 1;<br />

for a projector the eigenvalues are 0 or 1.<br />

However, an operator does not always have an orthonormal basis of eigenvectors. An example is the operator<br />

(II. 29), whose only eigenvectors are the multiples of a single vector. The conditions under which such a basis exists are given by the next important<br />

theorem, which we mention without proof.<br />



SPECTRAL THEOREM:<br />

Every normal operator A has an orthonormal basis of eigenvectors |α 1 ⟩, . . . , |α N ⟩ and<br />

associated eigenvalues a 1 , . . . , a N , not necessarily distinct, satisfying (II. 48).<br />

The spectral theorem tells us that normal operators can be diagonalized. This can be formulated<br />

more elegantly in Dirac notation, where we must distinguish between the case in which all eigenvalues<br />

differ from each other, and the case in which some eigenvalues are equal. In the first case the operator<br />

is called maximal, in the second case the operator is called degenerate.<br />

Suppose that the operator A is maximal, i.e. all eigenvalues a i differ from each other, a i ≠ a j<br />

if i ≠ j. In this case we often use the eigenvalues as a label for the eigenvectors and write |a i ⟩ instead<br />

of |α i ⟩. This notation is unambiguous, since every eigenvalue now belongs to exactly one eigenvector of the basis.<br />

Now, according to the spectral theorem, there is an orthonormal basis |a 1 ⟩, . . . , |a N ⟩ such that<br />

$$A = \sum_{i=1}^{N} a_i \, |a_i\rangle \langle a_i|,$$ (II. 49)<br />

since, with (II. 44), it holds for all |ψ⟩ ∈ H that<br />

$$A \, |\psi\rangle = A \, \mathbb{1} \, |\psi\rangle = A \sum_{i=1}^{N} |a_i\rangle \langle a_i | \psi\rangle = \sum_{i=1}^{N} a_i \, |a_i\rangle \langle a_i | \psi\rangle.$$ (II. 50)<br />
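The decomposition (II. 49) can be illustrated by going in the opposite direction: start from a hypothetical orthonormal<br />
eigenbasis and a set of distinct eigenvalues, assemble the operator, and verify (II. 48) and (II. 50). The numbers are<br />
arbitrary choices for this sketch.<br />

```python
import math


def inner(phi, psi):
    return sum(a.conjugate() * b for a, b in zip(phi, psi))


def outer(phi):
    return [[a * b.conjugate() for b in phi] for a in phi]


def apply(op, psi):
    return [sum(row[c] * psi[c] for c in range(len(psi))) for row in op]


s = 1 / math.sqrt(2)
eigvals = [2.0, -1.0]                  # hypothetical distinct eigenvalues a_1, a_2
eigvecs = [[s, s], [s, -s]]            # hypothetical orthonormal eigenbasis |a_1>, |a_2>

# (II. 49): A is the sum of a_i |a_i><a_i|
A = [[sum(a * outer(v)[r][c] for a, v in zip(eigvals, eigvecs)) for c in range(2)]
     for r in range(2)]

# (II. 48): A |a_i> = a_i |a_i> for every basis vector.
eigen_ok = all(abs(x - a * y) < 1e-12
               for a, v in zip(eigvals, eigvecs)
               for x, y in zip(apply(A, v), v))

# (II. 50): A |psi> equals the sum of a_i |a_i> <a_i|psi>.
psi = [1 + 1j, 2 - 3j]
expanded = [sum(a * v[k] * inner(v, psi) for a, v in zip(eigvals, eigvecs))
            for k in range(2)]
expansion_ok = all(abs(x - y) < 1e-12 for x, y in zip(apply(A, psi), expanded))

print(eigen_ok, expansion_ok)
```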

If the operator is degenerate there are only M < N distinct eigenvalues a 1 , . . . , a M . For every<br />

eigenvalue a i , there exists a number n i of mutually orthogonal eigenvectors, for which we have<br />

$$\sum_{i=1}^{M} n_i = N . \tag{II. 51}$$<br />

The eigenvalue a i is called n i - fold degenerate. The associated eigenvectors span a n i - dimensional<br />

subspace of eigenvectors for the value a i .<br />

Choose, in this subspace, an orthonormal basis {|α i , j⟩} with j = 1, . . . , n i . Here we can also<br />

use the eigenvalues a i as a label for the basis vectors because the extra label j prevents our notation<br />

from becoming ambiguous. Now the eigenvalue equation (II. 48) becomes<br />

A |a i , j⟩ = a i |a i , j⟩. (II. 52)<br />

Analogous to (II. 49), we find<br />

$$A = \sum_{i=1}^{M} a_i \sum_{j=1}^{n_i} |a_i , j\rangle \langle a_i , j| , \tag{II. 53}$$<br />

which, in terms of the n i - dimensional eigenprojectors<br />

$$P_{a_i} = \sum_{j=1}^{n_i} |a_i , j\rangle \langle a_i , j| , \tag{II. 54}$$



can also be written as<br />

$$A = \sum_{i=1}^{M} a_i P_{a_i} . \tag{II. 55}$$<br />

EXERCISE 10. (a.) Show that P ai in (II. 54) is independent of the choice of the orthonormal<br />

basis |a i , 1⟩, . . . , |a i , n i ⟩. (b.) Show that for P ai as defined in (II. 54) and P ϕ given by (II. 37):<br />

$$\operatorname{Tr} P_{a_i} P_{\phi} = \langle \phi | P_{a_i} | \phi \rangle \tag{II. 56}$$<br />

We summarize the two preceding cases in the following, equivalent, form of the spectral theorem,<br />

formulated in terms of projectors.<br />

SPECTRAL THEOREM:<br />

For every normal operator A a unique set of mutually distinct eigenvalues a 1 , . . . , a M<br />

exists, with M ≤ N, and an associated unique complete set of mutually orthogonal projectors<br />

P a1 , . . . , P aM , such that<br />

$$A = \sum_{i=1}^{M} a_i P_{a_i} , \tag{II. 57}$$<br />

$$\mathbb{1} = \sum_{i=1}^{M} P_{a_i} . \tag{II. 58}$$<br />

If the operator is non - degenerate, all of these projectors are 1 - dimensional; if it is degenerate,<br />

dim P ai gives the degeneracy of eigenvalue a i . Equation (II. 57) is called the spectral decomposition<br />

of A, the set of mutually orthogonal projectors P ai is called the spectral family of A, and (II. 58) a<br />

resolution of identity.<br />
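The spectral decomposition (II. 57) and the resolution of identity (II. 58) can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper name `spectral_family`, the example matrix, and the rule of grouping eigenvalues that agree after rounding are our own illustrative assumptions, not part of the theorem.

```python
import numpy as np

def spectral_family(A, decimals=9):
    """Return {eigenvalue: eigenprojector} for a Hermitian matrix A.

    Eigenvalues that agree after rounding are treated as one degenerate
    eigenvalue (an assumption of this sketch, not part of the theorem).
    """
    eigvals, eigvecs = np.linalg.eigh(A)      # columns are orthonormal eigenvectors
    family = {}
    for a, v in zip(np.round(eigvals, decimals), eigvecs.T):
        key = float(a) + 0.0                  # turns -0.0 into 0.0
        family[key] = family.get(key, 0) + np.outer(v, v.conj())   # add |v><v|
    return family

# Hypothetical example with eigenvalues 0, 2, 2, so that M = 2 < N = 3.
A = np.array([[2, 0, 0],
              [0, 1, 1],
              [0, 1, 1]], dtype=complex)
family = spectral_family(A)

reconstructed = sum(a * P for a, P in family.items())   # right-hand side of (II. 57)
identity = sum(family.values())                         # right-hand side of (II. 58)
degeneracy = {a: int(round(np.trace(P).real)) for a, P in family.items()}
```

Here the trace of each eigenprojector gives the degeneracy of its eigenvalue, in line with the remark that dim P ai gives the degeneracy of a i .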

II. 3. 1<br />

APPENDIX<br />

A formulation of the spectral theorem which is equivalent to the preceding, but is more suitable<br />

for generalizations, can be obtained if we introduce the correspondence between the eigenvalues and<br />

the associated eigenprojectors as a mapping A of all subsets of Spec A ⊂ C to the set P (H) of<br />

projectors on H.<br />

We construct that mapping by demanding<br />

{a i } ↦→ P ai , (II. 59)



and extend this with the condition<br />

{a 1 , a 2 } ↦→ P {a1 , a 2 } := P a1 + P a2 , (II. 60)<br />

or, more generally, if ∆ represents an arbitrary set of eigenvalues, we define<br />

$$\Delta \mapsto P_{\Delta} = \sum_{a \in \Delta} P_a . \tag{II. 61}$$<br />

A mapping A : C → P (H) is called a projection - valued measure if<br />

(i) P ∅ = 0<br />

(ii) P Spec A = 11<br />

(iii) $P_{\cup_i \Delta_i} = \sum_i P_{\Delta_i}$, for all ∆ i mutually disjoint. (II. 62)<br />

EXERCISE 11. Verify that: P ∆ c = 11 − P ∆ where ∆ c = Spec A \ ∆ is the complement of ∆.<br />

The spectral theorem can now again be formulated.<br />

SPECTRAL THEOREM:<br />

Every normal operator A corresponds unambiguously to a projection - valued measure A.<br />
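To make the notion of a projection - valued measure concrete, here is a small sketch, assuming NumPy. The operator diag(1, 1, 2, 3), the dictionary `eigenprojectors` and the function name `P` are hypothetical choices for illustration only.

```python
import numpy as np

# Eigenprojectors of the hypothetical operator A = diag(1, 1, 2, 3).
eigenprojectors = {
    1.0: np.diag([1.0, 1.0, 0.0, 0.0]),
    2.0: np.diag([0.0, 0.0, 1.0, 0.0]),
    3.0: np.diag([0.0, 0.0, 0.0, 1.0]),
}
spectrum = set(eigenprojectors)

def P(delta):
    """Projector assigned to a subset delta of Spec A, as in (II. 61)."""
    total = np.zeros((4, 4))
    for a in delta:
        total = total + eigenprojectors[a]
    return total

empty = P(set())                      # property (i): the zero operator
whole = P(spectrum)                   # property (ii): the identity
additive = np.allclose(P({1.0, 2.0}), P({1.0}) + P({2.0}))   # property (iii)
```

The complement rule of Exercise 11 can be inspected in the same way by comparing `P(spectrum - delta)` with the identity minus `P(delta)`; this is a numerical check, not a proof.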

II. 4<br />

FUNCTIONS <strong>OF</strong> NORMAL OPERATORS<br />

The spectral theorem makes it possible to treat functions of normal operators in a simple manner.<br />

If f is an arbitrary function, real or complex, and A is an operator with spectral decomposition<br />

$$A = \sum_{i=1}^{M} a_i P_{a_i} , \tag{II. 63}$$<br />

then the function f (A) of A is defined as<br />

$$f(A) := \sum_{i=1}^{M} f(a_i) \, P_{a_i} . \tag{II. 64}$$<br />

This means that f (A) always has the same eigenvectors and eigenprojections as A, and only differs<br />

from A in the labeling of its eigenvalues, namely by f (a i ) instead of a i . As an example, consider the<br />

characteristic function χ a of a ∈ C,<br />

$$\chi_a : \mathbb{C} \to \{0, 1\}, \quad x \mapsto \chi_a(x) := \begin{cases} 1 & \text{if } x = a \\ 0 & \text{otherwise} \end{cases} \tag{II. 65}$$



for which, with (II. 64), we have<br />

$$\chi_{a_k}(A) := \sum_{i=1}^{M} \chi_{a_k}(a_i) \, P_{a_i} = P_{a_k} , \tag{II. 66}$$<br />

and we see that the projectors from the spectral decomposition of A, (II. 63), are functions of A.<br />
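Definition (II. 64) translates directly into a short numerical recipe. The sketch below assumes NumPy and a Hermitian matrix; the name `apply_function` and the example operator are our own choices, and the function f is assumed to act elementwise on an array of eigenvalues.

```python
import numpy as np

def apply_function(f, A):
    """Compute f(A) = sum_i f(a_i) |a_i><a_i| for a Hermitian matrix A, cf. (II. 64)."""
    eigvals, V = np.linalg.eigh(A)            # columns of V are orthonormal eigenvectors
    return (V * f(eigvals)) @ V.conj().T      # scales column j by f(a_j), then recombines

A = np.array([[0, 1],
              [1, 0]], dtype=complex)         # hypothetical example, eigenvalues -1 and +1

exp_iA = apply_function(lambda a: np.exp(1j * a), A)

# Characteristic function of the eigenvalue +1, as in (II. 65) and (II. 66).
chi_plus = lambda a: np.isclose(a, 1.0).astype(float)
P_plus = apply_function(chi_plus, A)          # eigenprojector belonging to +1
```

For this particular A one has A squared equal to the identity, so `exp_iA` should agree with cos(1) times the identity plus i sin(1) times A.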

We use the spectral decompositions in the proof of the following theorem.<br />

THEOREM:<br />

If two self - adjoint operators A and B commute, there is a maximal, self - adjoint operator<br />

C of which both A and B are a function.<br />

To prove this theorem we first prove a useful lemma.<br />

LEMMA:<br />

If [A, B] = 0, a basis {|γ i ⟩} exists in which A and B are simultaneously diagonal.<br />

Proof<br />

Let {|a i , j⟩} be an orthonormal eigenbasis of operator A, where j = 1, . . . , n i and n i is the degeneracy<br />

of eigenvalue a i , and we have<br />

⟨a p , q | a i , j⟩ = δ pi δ qj . (II. 67)<br />

Analogously, let there be an orthonormal eigenbasis {|b k , l⟩} for operator B. From [A, B] = 0<br />

and (II. 63) it follows that<br />

A ( B |a i , j⟩ ) = B A |a i , j⟩ = a i B |a i , j⟩, (II. 68)<br />

and B |a i , j⟩ is, apparently, an eigenvector of A with the eigenvalue a i , i.e., B |a i , j⟩ is in the<br />

eigenspace spanned by |a i , 1⟩, . . . , |a i , n i ⟩. Or, equivalently,<br />

$$B\,|a_i , j\rangle = \sum_{k=1}^{n_i} \Lambda^{[i]}_{k,j} \, |a_i , k\rangle \tag{II. 69}$$<br />

holds for certain numbers $\Lambda^{[i]}_{k,j} \in \mathbb{C}$.<br />

By assumption, B is self - adjoint and therefore the matrix Λ [i] must be Hermitian,<br />

as we see from<br />

$$\langle a_k , l | B | a_i , j\rangle = \Lambda^{[i]}_{l,j} \, \delta_{ki} = \Lambda^{[i]}_{l,j} , \tag{II. 70}$$<br />

$$\langle a_k , l | B | a_i , j\rangle^{*} = \Lambda^{[i]\,*}_{l,j} = \langle a_i , j | B^{\dagger} | a_k , l\rangle = \Lambda^{[k]}_{j,l} \, \delta_{ik} = \Lambda^{[i]}_{j,l} , \tag{II. 71}$$<br />

$$\Lambda^{[i]\,*}_{l,j} = \Lambda^{[i]}_{j,l} . \tag{II. 72}$$



Because Λ [i] is self - adjoint, it can be diagonalized by a unitary matrix S [i] ,<br />

$$\Lambda'^{[i]} = S^{[i]\,-1} \, \Lambda^{[i]} \, S^{[i]} . \tag{II. 73}$$<br />

This corresponds to an orthonormal basis transformation within the n i - dimensional subspace<br />

with eigenvalue a i . Carrying out this transformation in each of the subspaces and writing |a i , m ′ ⟩<br />

for the transformed eigenvectors of A, we have<br />

$$|a_i , m'\rangle = \sum_{j=1}^{n_i} S^{[i]}_{j,m'} \, |a_i , j\rangle . \tag{II. 74}$$<br />

In the new basis {|a i , m ′ ⟩} the matrix Λ [i] is diagonalized and therefore<br />

$$B\,|a_i , m'\rangle = \Lambda'^{[i]}_{m',m'} \, \delta_{m' j} \, |a_i , j\rangle = \Lambda'^{[i]}_{m',m'} \, |a_i , m'\rangle . \tag{II. 75}$$<br />

The vectors |a i , m ′ ⟩ are not just eigenvectors of A, but also of B and form, by construction, a<br />

basis. □<br />

Notice that it is not in contradiction to this lemma if non - commuting operators have some eigenvectors<br />

in common. ▹<br />
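The construction used in the proof of the lemma can be followed step by step on a computer: diagonalize A, and then diagonalize the block Λ [i] of B inside each eigenspace of A. This is a sketch under stated assumptions: NumPy, Hermitian commuting matrices, and a tolerance `tol` for deciding when two eigenvalues of A coincide. The function name and the example matrices are hypothetical.

```python
import numpy as np

def simultaneous_eigenbasis(A, B, tol=1e-9):
    """Columns of the result are common eigenvectors of commuting Hermitian A and B."""
    a_vals, V = np.linalg.eigh(A)            # eigenvalues in ascending order
    columns, start, n = [], 0, len(a_vals)
    while start < n:
        stop = start + 1
        while stop < n and abs(a_vals[stop] - a_vals[start]) < tol:
            stop += 1                        # collect one degenerate eigenvalue a_i
        W = V[:, start:stop]                 # orthonormal basis |a_i, j> of the eigenspace
        Lam = W.conj().T @ B @ W             # the Hermitian matrix Lambda^[i]
        _, S = np.linalg.eigh(Lam)           # unitary S^[i] diagonalizing Lambda^[i]
        columns.append(W @ S)                # transformed vectors |a_i, m'>, cf. (II. 74)
        start = stop
    return np.hstack(columns)

# Hypothetical commuting pair: A is degenerate, B acts nontrivially in that eigenspace.
A = np.diag([1.0, 1.0, 2.0])
B = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 5.0]])
G = simultaneous_eigenbasis(A, B)
A_diag = G.conj().T @ A @ G
B_diag = G.conj().T @ B @ G
```

In the new basis both `A_diag` and `B_diag` should come out diagonal up to rounding errors.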

Now we come to the proof of the theorem.<br />

Proof<br />

Define, in the basis {|γ i ⟩} of the lemma<br />

$$A = \sum_i a_i P_{|\gamma_i\rangle} \quad \text{and} \quad B = \sum_i b_i P_{|\gamma_i\rangle} , \tag{II. 76}$$<br />

where the eigenvalues a i and b i are allowed to be degenerate. Next, define a maximal self - adjoint<br />

operator<br />

$$C = \sum_i c_i P_{|\gamma_i\rangle} , \tag{II. 77}$$<br />

with all c i ∈ R distinct; the c i must be real because C is self - adjoint.<br />

Then, according to (II. 66), with χ ci defined analogously to (II. 65),<br />

P |γi⟩ = χ ci (C). (II. 78)<br />

With $f(x) = \sum_i a_i \chi_{c_i}(x)$ and $g(x) = \sum_i b_i \chi_{c_i}(x)$, as defined in (II. 64), we now find<br />

$$A = \sum_i a_i \chi_{c_i}(C) = f(C) \quad \text{and} \quad B = \sum_i b_i \chi_{c_i}(C) = g(C) . \tag{II. 79}$$



Thus, both self - adjoint, and mutually commuting, operators A and B are functions of the maximal,<br />

self - adjoint operator C, which is what we set out to prove. □<br />
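The proof can be illustrated with a small numerical sketch, assuming NumPy. The common eigenbasis, the eigenvalue lists `a`, `b`, `c` and the helper names are hypothetical; the point is only that A and B are recovered from C alone once the functions f and g are fixed.

```python
import numpy as np

def from_spectrum(values, basis):
    """Build sum_i values[i] |gamma_i><gamma_i| from the columns |gamma_i> of basis."""
    return (basis * values) @ basis.conj().T

s = 1 / np.sqrt(2)
basis = np.array([[s,  s, 0],
                  [s, -s, 0],
                  [0,  0, 1]])               # hypothetical common eigenbasis
a = np.array([1.0, 1.0, 2.0])                # eigenvalues of A (degenerate)
b = np.array([1.0, -1.0, 5.0])               # eigenvalues of B
c = np.array([1.0, 2.0, 3.0])                # distinct real eigenvalues of C

A, B, C = (from_spectrum(v, basis) for v in (a, b, c))

def function_of_C(targets):
    """Evaluate f(C) with f = sum_i targets[i] chi_{c_i}, using only C itself."""
    eigvals, V = np.linalg.eigh(C)
    f_vals = np.array([targets[np.argmin(np.abs(c - x))] for x in eigvals])
    return (V * f_vals) @ V.conj().T

A_from_C = function_of_C(a)                  # f(C), should reproduce A
B_from_C = function_of_C(b)                  # g(C), should reproduce B
```

Any other choice of distinct real numbers for `c` would work equally well, which reflects the non - uniqueness of C.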

Note that the choice of C in the above theorem is not unique. Indeed, suppose that<br />

A = f (C 1 ) = g(C 2 ), (II. 80)<br />

where C 1 and C 2 are both maximal. In general, it is not required for C 1 and C 2 to commute.<br />

But they do commute if A itself is maximal. In that case f can be inverted,<br />

$$C_1 = f^{-1}(A) = f^{-1}(g(C_2)) , \tag{II. 81}$$<br />

from which it follows that<br />

[C 1 , C 2 ] = 0. ▹ (II. 82)<br />

II. 5<br />

DIRECT SUM AND DIRECT PRODUCT<br />

There are two ways to construct a new Hilbert space H from two given Hilbert spaces H 1<br />

and H 2 , or vice versa, to divide a given Hilbert space H into smaller spaces.<br />

II. 5. 1<br />

DIRECT SUM<br />

Let H 1 and H 2 be two Hilbert spaces. By definition we call the space H := H 1 ⊕ H 2 the direct<br />

sum space of H 1 and H 2 if the following requirements are satisfied:<br />

(i) The space H 1 ⊕ H 2 contains as its elements all ordered pairs of vectors, written as |ϕ⟩ 1 ⊕ |ψ⟩ 2 ,<br />

with |ϕ⟩ 1 ∈ H 1 and |ψ⟩ 2 ∈ H 2 .<br />

(ii) Addition and scalar multiplication are defined on H 1 ⊕ H 2 , and obey<br />

$$a \big( |\phi\rangle_1 \oplus |\psi\rangle_2 \big) + b \big( |\chi\rangle_1 \oplus |\xi\rangle_2 \big) = \big( a |\phi\rangle_1 + b |\chi\rangle_1 \big) \oplus \big( a |\psi\rangle_2 + b |\xi\rangle_2 \big) . \tag{II. 83}$$<br />

(iii) The inner product is additive,<br />

$$\big( {}_1\langle\phi| \oplus {}_2\langle\phi| \big) \big( |\psi\rangle_1 \oplus |\psi\rangle_2 \big) = {}_1\langle\phi | \psi\rangle_1 + {}_2\langle\phi | \psi\rangle_2 . \tag{II. 84}$$<br />

(iv) H 1 ⊕ H 2 is the smallest Hilbert space spanned by the elements of the form |ϕ⟩ 1 ⊕ |ψ⟩ 2 and<br />

their linear combinations.



A few remarks about this definition are in order. (a) According to (II. 83), an arbitrary linear<br />

combination of elements in H 1 ⊕ H 2 is of the form<br />

$$\sum_i a_i \big( |\phi_i\rangle_1 \oplus |\psi_i\rangle_2 \big) = \sum_i a_i |\phi_i\rangle_1 \oplus \sum_i a_i |\psi_i\rangle_2 . \tag{II. 85}$$<br />

Consequently, with |ϕ⟩ 1 := ∑ i a i |ϕ i ⟩ 1 ∈ H 1 and |ψ⟩ 2 := ∑ i a i |ψ i ⟩ 2 ∈ H 2 , all elements in H 1 ⊕ H 2<br />

are of the form |ϕ⟩ 1 ⊕|ψ⟩ 2 . This means that the requirements (i) and (ii) imply that H 1 ⊕H 2 is closed<br />

under linear combinations.<br />

(b) The subspace of H 1 ⊕H 2 , consisting of all vectors of the form 0 1 ⊕|ψ⟩ 2 , with 0 1 the null vector<br />

in H 1 , and |ψ⟩ 2 ∈ H 2 arbitrary, is isomorphic to H 2 , likewise for |ϕ⟩ 1 ⊕ 0 2 and H 1 . Moreover, these<br />

two subspaces of H 1 ⊕ H 2 are mutually orthogonal, because<br />

$$\big( {}_1\langle\phi| \oplus 0_2 \big) \big( 0_1 \oplus |\psi\rangle_2 \big) = {}_1\langle\phi | 0\rangle_1 + {}_2\langle 0 | \psi\rangle_2 = 0 . \tag{II. 86}$$<br />

Therefore, every vector |χ⟩ ∈ H 1 ⊕ H 2 can be written uniquely as the direct sum of two orthogonal<br />

terms,<br />

|χ⟩ = |ϕ⟩ 1 ⊕ |ψ⟩ 2 = |ϕ⟩ 1 ⊕ 0 2 + 0 1 ⊕ |ψ⟩ 2 . (II. 87)<br />

Vice versa, suppose that H is an arbitrary Hilbert space, and that H 1 is a subspace of H. Now<br />

let H 2 = H 1 ⊥ be the orthocomplement of H 1 , i.e., H 2 contains all vectors in H which are perpendicular<br />

to all vectors in H 1 . Then H = H 1 ⊕ H 2 holds, with the identification<br />

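$$|\phi\rangle_1 \oplus 0_2 \leftrightarrow |\phi\rangle \in \mathcal{H}_1 , \tag{II. 88}$$<br />

$$0_1 \oplus |\psi\rangle_2 \leftrightarrow |\psi\rangle \in \mathcal{H}_2 , \tag{II. 89}$$<br />

and<br />

$$|\phi\rangle \oplus |\psi\rangle = |\phi\rangle + |\psi\rangle . \tag{II. 90}$$<br />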

In this case the direct sum ⊕ is nothing but ordinary addition in H, which was given in (II. 1) as<br />

a general property of H. This means that every Hilbert space can be written as a direct sum of<br />

an arbitrary subspace and its orthocomplement. We also see something that holds generally: the<br />

dimension of H 1 ⊕ H 2 is the sum of the dimensions of H 1 and H 2 ,<br />

dim (H 1 ⊕ H 2 ) = dim H 1 + dim H 2 . (II. 91)<br />
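In coordinates, the direct sum of two finite - dimensional spaces can be modelled by stacking component vectors. The sketch below assumes NumPy; the function name `direct_sum` and the example vectors are hypothetical. Note that `np.vdot` conjugates its first argument, which matches the inner product ⟨ϕ | ψ⟩.

```python
import numpy as np

def direct_sum(phi, psi):
    """Model |phi>_1 (+) |psi>_2 by concatenating the component vectors."""
    return np.concatenate([phi, psi])

phi = np.array([1.0, 1j])             # vector in a 2-dimensional H_1
chi = np.array([0.0, 1.0])            # another vector in H_1
psi = np.array([2.0, 0.0, -1.0])      # vector in a 3-dimensional H_2
xi = np.array([1.0, 1.0, 1.0])        # another vector in H_2

# Additivity of the inner product, requirement (iii).
lhs = np.vdot(direct_sum(phi, psi), direct_sum(chi, xi))
rhs = np.vdot(phi, chi) + np.vdot(psi, xi)

# The two embedded subspaces are mutually orthogonal, cf. (II. 86).
overlap = np.vdot(direct_sum(phi, np.zeros(3)), direct_sum(np.zeros(2), psi))

dimension = len(direct_sum(phi, psi))  # 2 + 3, cf. (II. 91)
```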

II. 5. 2<br />

DIRECT PRODUCT<br />

There is another, actually more important, way to construct a new Hilbert space out of two given<br />

spaces. Again, let H 1 and H 2 be two Hilbert spaces. By definition we call the space H :=<br />

H 1 ⊗ H 2 the direct product space if the following requirements have been satisfied.



(i) The space H 1 ⊗ H 2 has as its elements at least all ordered pairs ( |ϕ⟩ 1 , |ψ⟩ 2 ), with |ϕ⟩ 1 ∈ H 1<br />

and |ψ⟩ 2 ∈ H 2 , which we now write as |ϕ⟩ 1 ⊗ |ψ⟩ 2 .<br />

(ii) The addition and scalar multiplication on H 1 ⊗ H 2 satisfy<br />

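$$|\phi\rangle_1 \otimes |\psi\rangle_2 + |\phi\rangle_1 \otimes |\chi\rangle_2 = |\phi\rangle_1 \otimes \big( |\psi\rangle_2 + |\chi\rangle_2 \big) \tag{II. 92}$$<br />

and<br />

$$a \big( |\phi\rangle_1 \otimes |\psi\rangle_2 \big) = a |\phi\rangle_1 \otimes |\psi\rangle_2 = |\phi\rangle_1 \otimes a |\psi\rangle_2 . \tag{II. 93}$$<br />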

(iii) The inner product is multiplicative,<br />

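$$\big( {}_1\langle\phi| \otimes {}_2\langle\chi| \big) \big( |\psi\rangle_1 \otimes |\xi\rangle_2 \big) = {}_1\langle\phi | \psi\rangle_1 \; {}_2\langle\chi | \xi\rangle_2 . \tag{II. 94}$$<br />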

(iv) H 1 ⊗ H 2 is the smallest Hilbert space spanned by vectors of the form |ϕ⟩ 1 ⊗ |ψ⟩ 2 ∈ H and<br />

their linear combinations.<br />

If |α 1 ⟩ 1 , . . . , |α N1 ⟩ 1 is an orthonormal basis in H 1 , and |β 1 ⟩ 2 , . . . , |β N2 ⟩ 2 is likewise in H 2 ,<br />

with N 1 = dim H 1 , N 2 = dim H 2 , their direct products, i.e. the vectors of the form |α i ⟩ 1 ⊗ |β j ⟩ 2 ,<br />

provide an orthonormal set of vectors in H 1 ⊗ H 2 . Indeed, using (II. 94),<br />

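$$\big( {}_1\langle\alpha_j| \otimes {}_2\langle\beta_k| \big) \big( |\alpha_m\rangle_1 \otimes |\beta_n\rangle_2 \big) = {}_1\langle\alpha_j | \alpha_m\rangle_1 \; {}_2\langle\beta_k | \beta_n\rangle_2 = \delta_{jm} \, \delta_{kn} . \tag{II. 95}$$<br />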

Because orthonormal vectors are independent, the dimension of H 1 ⊗ H 2 cannot be smaller than<br />

the product of the separate dimensions. But furthermore, according to (iv), all vectors in H 1 ⊗ H 2<br />

are obtainable as linear combinations of vectors of the form |ϕ⟩ 1 ⊗ |ψ⟩ 2 , which in turn are linear<br />

combinations of the vectors |α j ⟩ 1 ⊗|β k ⟩ 2 . Therefore, these vectors also span the entire space H 1 ⊗H 2 .<br />

In other words, |α 1 ⟩ 1 ⊗ |β 1 ⟩ 2 , |α 2 ⟩ 1 ⊗ |β 1 ⟩ 2 , . . . , |α N1 ⟩ 1 ⊗ |β N2 ⟩ 2 is also a basis for H 1 ⊗ H 2 . For<br />

the dimension of H 1 ⊗ H 2 we thus find<br />

dim (H 1 ⊗ H 2 ) = dim H 1 · dim H 2 . (II. 96)<br />

Consequently, an arbitrary vector |χ⟩ ∈ H 1 ⊗ H 2 can, in this product basis |α j ⟩ 1 ⊗ |β k ⟩ 2 , be written<br />

as<br />

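$$|\chi\rangle = \sum_{j=1}^{N_1} \sum_{k=1}^{N_2} c_{jk} \, |\alpha_j\rangle_1 \otimes |\beta_k\rangle_2 \quad \text{with} \quad c_{jk} = \big( {}_1\langle\alpha_j| \otimes {}_2\langle\beta_k| \big) |\chi\rangle \in \mathbb{C} . \tag{II. 97}$$<br />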

For vectors of the form |ϕ⟩ 1 ⊗ |ψ⟩ 2 it holds that<br />

$$\sum_{j=1}^{N_1} a_j |\alpha_j\rangle_1 \otimes \sum_{k=1}^{N_2} b_k |\beta_k\rangle_2 = \sum_{j=1}^{N_1} \sum_{k=1}^{N_2} a_j b_k \, |\alpha_j\rangle_1 \otimes |\beta_k\rangle_2 . \tag{II. 98}$$



We see that (II. 98) is a special case of (II. 97), that is, where c jk = a j b k . The special vectors which<br />

can be written as (II. 98), i.e., in the form |ϕ⟩ 1 ⊗|ψ⟩ 2 , are called direct product vectors, or factorizable.<br />

In a direct sum space H 1 ⊕H 2 all vectors can be written in the form |ϕ⟩ 1 ⊕|ψ⟩ 2 , but in a direct product<br />

space H 1 ⊗ H 2 not all vectors can be written in the form |ϕ⟩ 1 ⊗ |ψ⟩ 2 . Further on we will see that<br />

states for which c jk cannot be written as a j b k give rise to typical quantum mechanical behavior, as<br />

in the thought experiment of EPR where composite systems are considered, corresponding to states<br />

on H 1 ⊗ H 2 which cannot be factorized. Such states are called non - factorizable or entangled states.<br />
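Whether the coefficients c jk of a given vector can be written as a j b k can be tested numerically: this is the case exactly when the N 1 × N 2 matrix of coefficients has rank at most one. The following sketch assumes NumPy and the index ordering of `np.kron`; the function name `is_factorizable`, the tolerance and the example states are our own illustrative choices.

```python
import numpy as np

def is_factorizable(chi, n1, n2, tol=1e-10):
    """True if the coefficient matrix c[j, k] of chi has rank at most one."""
    c = np.asarray(chi).reshape(n1, n2)      # c[j, k] multiplies |alpha_j> (x) |beta_k>
    return np.linalg.matrix_rank(c, tol=tol) <= 1

# A direct product vector: c[j, k] = a[j] * b[k].
product = np.kron([1, 2], [3, 1j])

# A singlet-like combination of two product vectors, which does not factorize.
up, down = np.array([1, 0]), np.array([0, 1])
entangled = (np.kron(up, down) - np.kron(down, up)) / np.sqrt(2)

product_ok = is_factorizable(product, 2, 2)
entangled_ok = is_factorizable(entangled, 2, 2)
```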

If A and B are operators on H 1 and H 2 , respectively, the direct product operator A ⊗ B is the<br />

operator on H 1 ⊗ H 2 , defined by<br />

$$(A \otimes B) \big( |\phi\rangle_1 \otimes |\psi\rangle_2 \big) := A |\phi\rangle_1 \otimes B |\psi\rangle_2 . \tag{II. 99}$$<br />

It follows that, with operators C on H 1 and D on H 2 ,<br />

(A ⊗ B) (C ⊗ D) = (A C) ⊗ (B D). (II. 100)<br />

Similar to vectors, operators on the direct product space H 1 ⊗ H 2 are not always factorizable. The<br />

total momentum operator P 1 + P 2 and the distance operator Q 1 − Q 2 of EPR, with P as defined in<br />

section I. 2, (I. 1), and Q likewise, are examples of such non - factorizable direct product operators,<br />

P 1 ⊗ 11 2 + 11 1 ⊗ P 2 and Q 1 ⊗ 11 2 − 11 1 ⊗ Q 2 . (II. 101)<br />

EXERCISE 12. Calculate the commutator of these operators, given that [ P i , Q j ] = −i δ ij .<br />

The following properties of the direct product of operators will, further on, be used frequently:<br />

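$$\begin{aligned}
A \otimes 0 &= 0 \otimes B = 0 , \\
(A_1 + A_2) \otimes B &= (A_1 \otimes B) + (A_2 \otimes B) , \\
\mathbb{1} \otimes \mathbb{1} &= \mathbb{1} , \\
a A \otimes b B &= a b \, (A \otimes B) , \\
(A \otimes B)^{-1} &= A^{-1} \otimes B^{-1} , \\
(A \otimes B)^{\dagger} &= A^{\dagger} \otimes B^{\dagger} , \\
\operatorname{Tr} ( bA \otimes cB ) &= b c \, \operatorname{Tr} A \cdot \operatorname{Tr} B .
\end{aligned} \tag{II. 102}$$<br />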

EXERCISE 13. Prove the properties of ⊗ in (II. 102).



Finally, the matrix A ⊗ B of the operator A ⊗ B in the direct product space H 1 ⊗ H 2 is of the<br />

form<br />

$$A \otimes B = \begin{pmatrix}
a_{11} \begin{pmatrix} b_{11} & \cdots & b_{1 N_2} \\ \vdots & \ddots & \vdots \\ b_{N_2 1} & \cdots & b_{N_2 N_2} \end{pmatrix} & \cdots & a_{1 N_1} B \\
\vdots & a_{22} B & \vdots \\
a_{N_1 1} B & \cdots & a_{N_1 N_1} B
\end{pmatrix} , \tag{II. 103}$$<br />

where a ij = ⟨α i | A | α j ⟩ and b kl = B kl = ⟨β k | B | β l ⟩, as in (II. 24). This matrix is called the<br />

Kronecker product of the matrices A and B.<br />
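Several of the properties in (II. 100) and (II. 102) can be spot - checked numerically with the Kronecker product. This is a sketch assuming NumPy, whose `np.kron` uses the block layout of (II. 103); the matrices A, B, C, D below are arbitrary invertible examples of our own choosing, and a check on examples is of course not a proof.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]], dtype=complex)
B = np.array([[0, 1j, 0],
              [1, 0, 2],
              [0, 1, 1]], dtype=complex)
C = np.array([[0, 1],
              [1, 0]], dtype=complex)
D = np.diag([1, 2, 3]).astype(complex)

AB = np.kron(A, B)                            # 6 x 6 matrix of A (x) B

block_ok = np.allclose(AB[:3, :3], A[0, 0] * B)                       # upper-left block a_11 B
product_ok = np.allclose(AB @ np.kron(C, D), np.kron(A @ C, B @ D))   # (II. 100)
adjoint_ok = np.allclose(AB.conj().T, np.kron(A.conj().T, B.conj().T))
inverse_ok = np.allclose(np.linalg.inv(AB),
                         np.kron(np.linalg.inv(A), np.linalg.inv(B)))
trace_ok = np.isclose(np.trace(AB), np.trace(A) * np.trace(B))
```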

II. 6<br />

ADDENDUM: INFINITE - DIMENSIONAL HILBERT SPACES<br />

This section is intended for interested readers, who wish to gain more in - depth knowledge of<br />

Hilbert spaces.<br />

In physical applications of quantum mechanics we nearly always need infinite - dimensional Hilbert<br />

spaces. Indeed, this already applies to the case of a free particle in one spatial dimension.<br />

The mathematical theory of infinite - dimensional Hilbert spaces is in some aspects more difficult<br />

than that of finite - dimensional ones.<br />

II. 6. 1<br />

THE STRUCTURE <strong>OF</strong> VECTOR SPACES<br />

An infinite - dimensional space H is a space where for every n independent vectors in H, with<br />

n arbitrarily large, it is always possible to find still another vector in H that is independent of these<br />

vectors. In rough approximation it can be said that all formulas of the previous sections remain valid<br />

if we replace the sums from 1 to N by sums from 1 to infinity. But, of course, attention must be given<br />

to the convergence of such sums. This leads to two extra assumptions which were superfluous in the<br />

theory of finite - dimensional spaces.<br />

(i) Separability. A Hilbert space H is called separable if it has a countable basis, i.e., a countable<br />

set of independent vectors |ϕ 1 ⟩, |ϕ 2 ⟩, . . . , |ϕ j ⟩, . . . ∈ H exists such that every vector |ϕ⟩ ∈ H<br />

can, analogously to (II. 14), be written as<br />

$$|\phi\rangle = \sum_{j=1}^{\infty} c_j \, |\phi_j\rangle \quad \text{with} \quad c_j = \langle \phi_j | \phi \rangle . \tag{II. 104}$$<br />

This equation is shorthand for<br />

$$\lim_{m \to \infty} \Big\| \phi - \sum_{j=1}^{m} c_j \phi_j \Big\| = 0 . \tag{II. 105}$$



(ii) Completeness. We require that the space is complete, which means that every Cauchy sequence,<br />

i.e., a sequence of vectors |ϕ 1 ⟩, |ϕ 2 ⟩, . . . , |ϕ j ⟩, . . . ∈ H, for which<br />

$$\lim_{j, k \to \infty} \| \phi_j - \phi_k \| = 0 , \tag{II. 106}$$<br />

has a limit vector |ϕ⟩ in H,<br />

$$\lim_{m \to \infty} \| \phi_m - \phi \| = 0 . \tag{II. 107}$$<br />

For example, in this sense Q, the set of rational numbers, is incomplete, since many Cauchy<br />

sequences of rational terms exist which have no limit in Q, for instance the series expansions of π<br />

and e. If the limiting points of all Cauchy sequences are added to Q, we obtain exactly R. Q is called<br />

a countably infinite set, R is called an uncountably infinite set.<br />
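The incompleteness of Q can be made tangible with exact rational arithmetic. In the sketch below, which uses only the Python standard library, the partial sums of the series for e are all rational and form a Cauchy sequence, while their limit e is not rational. The function name `partial_sum` is our own.

```python
from fractions import Fraction
from math import e, factorial

def partial_sum(m):
    """Rational partial sum 1/0! + 1/1! + ... + 1/m! of the series for e."""
    return sum(Fraction(1, factorial(k)) for k in range(m + 1))

terms = [partial_sum(m) for m in range(20)]

# Successive terms approach each other ...
gap = abs(terms[19] - terms[18])              # equals 1/19!

# ... and the sequence approaches e, which is not itself a rational number.
distance_to_e = abs(float(terms[19]) - e)
```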

Below, we will assume Hilbert spaces to be separable and complete.<br />

EXERCISE 14. Prove that every finite - dimensional complex vector space with an inner product<br />

is separable and complete.<br />

The claim in the above exercise makes clear that in the finite - dimensional case the requirements<br />

of separability and completeness are indeed superfluous.<br />

The next two spaces are well - known examples of infinite - dimensional Hilbert spaces.<br />

(i) The space of all complex, square integrable functions,<br />

$$L^2(\mathbb{R}) := \Big\{ \psi : \mathbb{R} \to \mathbb{C} \;\Big|\; \int_{\mathbb{R}} |\psi(q)|^2 \, dq < \infty \Big\} , \tag{II. 108}$$<br />

with an inner product defined as<br />

$$\langle \psi | \phi \rangle := \int_{\mathbb{R}} \psi^{*}(q) \, \phi(q) \, dq , \tag{II. 109}$$<br />

and likewise for L 2 (R n ) with arbitrary n ∈ N + .<br />

(ii) The space of square summable sequences of complex numbers, defined by Erhard Schmidt,<br />

$$l^2(\mathbb{N}) := \Big\{ c : \mathbb{N} \to \mathbb{C} \;\Big|\; \sum_{j=0}^{\infty} |c_j|^2 < \infty \Big\} , \tag{II. 110}$$<br />

with inner product<br />

$$\langle c | d \rangle := \sum_{j=0}^{\infty} c_j^{*} \, d_j . \tag{II. 111}$$



The proof that these vector spaces are complete is not simple; verifying the remaining<br />

requirements for a Hilbert space, however, is straightforward.<br />

These two spaces correspond to two versions of quantum mechanics, where L 2 (R) corresponds<br />

to Schrödinger’s wave mechanics (1926) and l 2 (N) to the matrix mechanics of Heisenberg, Born, and<br />

Jordan (1925), that is, if we take matrix mechanics in the enriched version of Von Neumann, since<br />

the original version did not contain a ‘state space’. These two versions of quantum mechanics are<br />

mathematically equivalent, see F.A. Muller (1997a, 1997b and 1999) for historical details.<br />

II. 6. 2<br />

OPERATORS<br />

More serious complications occur when introducing operators on infinite - dimensional Hilbert<br />

spaces. First, we will see that such operators are in general ‘unbounded’, which entails that<br />

they cannot be defined on the entire Hilbert space. Consequently, the definition of sum and product<br />

of operators, as well as their adjoints, becomes more cumbersome, and the terms ‘self - adjoint’<br />

and ‘Hermitian’ no longer coincide. Second, these operators do not always have eigenvectors in H.<br />

Therefore it is more difficult to give a useful version of the spectral theorem.<br />

The second problem is independent of the first, i.e., it can also appear for bounded self - adjoint<br />

operators. ▹<br />

For position and momentum both complications occur together which is shown by an example.<br />

EXAMPLE<br />

Consider the position operator<br />

Q : ψ(q) ↦→ q ψ(q), (II. 112)<br />

and the momentum operator<br />

$$P : \psi(q) \mapsto -i \frac{d}{dq} \psi(q) , \tag{II. 113}$$<br />

both acting on L 2 (R).<br />

The first problem is that these operators do not map every vector in L 2 (R) to another vector<br />

in L 2 (R). For instance, every non - differentiable function in L 2 (R) is outside the domain of P .<br />

Likewise for Q: taking, for example, $\psi(q) = (a + |q|)^{-3/2}$ with a > 0, we have ψ ∈ L 2 (R),<br />

but Qψ ∉ L 2 (R).<br />

The second problem is that the eigenvalue equation for momentum, $-i \frac{d}{dq} \psi(q) = p \, \psi(q)$, has<br />

solutions $\psi(q) \propto e^{i p q}$ for p ∈ R, but these functions are not square integrable and therefore<br />

they are not in L 2 (R). Something similar applies to the eigenvalue equation Qψ(q) = q 0 ψ(q)<br />

and its solutions ψ(q) = δ(q − q 0 ).



II. 6. 2. 1<br />

UNBOUNDED OPERATORS<br />

Let us start with a definition: an operator A on Hilbert space H is called bounded if the set of<br />

non - negative numbers $\|A\chi\| = \sqrt{\langle \chi | A^{\dagger} A | \chi \rangle}$ has an upper bound for all unit vectors |χ⟩, where the least<br />

upper bound, or supremum, is called the norm of A,<br />

$$\|A\| = \sup \big\{ \|A\chi\| \in \mathbb{R} \;\big|\; \|\chi\| = 1 \big\} . \tag{II. 114}$$<br />

The set of all bounded operators on H is written as B(H).<br />
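For matrices the norm (II. 114) equals the largest singular value, which NumPy returns as `np.linalg.norm(A, 2)`. The sketch below uses this to illustrate how unboundedness can arise: for the hypothetical family of truncations diag(1, 2, . . . , n) the norm grows without bound as n increases. The helper names and the random test data are our own.

```python
import numpy as np

def operator_norm(A):
    """Largest singular value of A, i.e. the supremum of ||A chi|| over unit vectors."""
    return np.linalg.norm(A, 2)

def truncated(n):
    """Hypothetical n x n truncation of an operator with eigenvalues 1, 2, 3, ..."""
    return np.diag(np.arange(1, n + 1, dtype=float))

norms = [operator_norm(truncated(n)) for n in (2, 10, 100)]   # grows linearly with n

# For any unit vector chi the bound ||A chi|| <= ||A|| holds.
rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
chi = rng.normal(size=4) + 1j * rng.normal(size=4)
chi = chi / np.linalg.norm(chi)
bound_holds = np.linalg.norm(A @ chi) <= operator_norm(A) + 1e-12
```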

In finite - dimensional Hilbert spaces all operators are bounded, but this is not the case in infinite -<br />

dimensional Hilbert spaces. As we want to hold on to the requirement that every vector A|ψ⟩ has a<br />

finite norm, we have to exclude from the domain of A the set of vectors |ϕ⟩ for which<br />

$$\frac{\|A \chi\|}{\|\chi\|} \to \infty \quad \text{if } |\chi\rangle \to |\phi\rangle . \tag{II. 115}$$<br />

Therefore, from now on an operator A is a linear mapping from a subset of H to H. This subset is called<br />

the domain of A, written as Dom A ⊂ H. Hence, an operator is a linear mapping<br />

ψ ∈ Dom A, A : ψ ↦→ A ψ ∈ H. (II. 116)<br />

We will, however, always assume that Dom A is dense in H which means that every vector ϕ in H<br />

can be approximated arbitrarily well by vectors in Dom A. The foregoing implies that also sums and<br />

products of operators are generally defined on a limited domain only,<br />

Dom (A + B) = Dom A ∩ Dom B (II. 117)<br />

$$\operatorname{Dom}(A B) = \big\{ \psi \in \operatorname{Dom} B : B \psi \in \operatorname{Dom} A \big\} . \tag{II. 118}$$<br />

It is more difficult to introduce the adjoint A † of an operator A. The operator is again called<br />

Hermitian if<br />

⟨ϕ | A | ψ⟩ = ⟨ψ | A | ϕ⟩ ∗ ∀ ϕ, ψ ∈ Dom A, (II. 119)<br />

but this definition is no longer sufficient for our purposes, as can be seen in the next example.<br />

EXAMPLE<br />

Consider the operator P from (II. 113), now acting on L 2 ([0, ∞)), and choose as its domain<br />

$$\operatorname{Dom} P = \Big\{ \psi : \int_0^{\infty} |\psi(q)|^2 \, dq < \infty, \; \int_0^{\infty} |P \psi(q)|^2 \, dq < \infty, \; \psi(0) = 0 \Big\} . \tag{II. 120}$$<br />

This operator is indeed Hermitian, which can be checked using integration by parts, where the<br />

non - integral term cancels out because of the boundary condition ψ(0) = 0. But the operator is<br />

not self - adjoint, as we will see in the next exercise.



To introduce the adjoint of an operator we first delimit the domain. Let Dom A † be the set of all<br />

vectors |ϕ⟩ such that a vector |η⟩ exists for which<br />

⟨ϕ | A | ψ⟩ = ⟨η | ψ⟩ ∀ |ψ⟩ ∈ Dom A. (II. 121)<br />

Using the assumption that Dom A is dense in H it is possible to show that if such a vector |η⟩ exists<br />

it is also unique. The adjoint A † of operator A is now, by definition, the mapping<br />

A † : |ϕ⟩ ∈ Dom A † ↦→ |η⟩ := A † |ϕ⟩, (II. 122)<br />

and the operator is called self - adjoint if<br />

A = A † and Dom A = Dom A † . (II. 123)<br />

This requirement is stronger than Hermiticity; it can be shown that in general it holds for Hermitian<br />

operators that Dom A ⊂ Dom A † , instead of (II. 123).<br />

EXERCISE 15. Verify that the domain of P † , with P as in the example above, is indeed larger<br />

than the domain of P .<br />

II. 6. 2. 2<br />

CONTINUOUS SPECTRA<br />

Another aspect in which infinite - dimensional Hilbert spaces deviate from finite - dimensional<br />

ones is the possibility for an operator to have a continuous spectrum, a mathematical impossibility<br />

in the finite - dimensional case since the term ‘spectrum’ was defined as the set of eigenvalues of<br />

operators. Examples of operators with continuous spectra are, again, the position operator and the<br />

momentum operator, whose spectra consist of the entire line of real numbers R. Therefore, the<br />

term ‘spectrum’ needs to be redefined. The spectrum of operator A is now defined as the set of all<br />

values λ ∈ C for which the operator A − λ11 has no inverse operator. To illustrate the deviations from<br />

the finite - dimensional case we give two examples, the angle operator and the angular momentum<br />

operator.<br />

EXAMPLE<br />

Consider the Hilbert space L 2( [0, 2π] ) and the angle operator<br />

Q : ψ(q) ↦→ q ψ(q), 0 ≤ q ≤ 2 π. (II. 124)<br />

This operator has, analogous to (II. 112), eigenfunctions which are not in H, its spectrum is the<br />

interval [0, 2π], but it is bounded, ∥Q∥ = 2π.


The angular momentum operator<br />

$$L : \psi(q) \mapsto -i \frac{d}{dq} \psi(q) , \tag{II. 125}$$<br />

with domain<br />

$$\operatorname{Dom} L = \big\{ \psi : \|L \psi\| < \infty, \; \psi(0) = \psi(2\pi) \big\} , \tag{II. 126}$$<br />

does have normalized eigenfunctions,<br />

$$\psi(q) = \frac{1}{\sqrt{2\pi}} \, e^{i l q} , \tag{II. 127}$$<br />

and a discrete spectrum l ∈ Z. But, since |l| can be arbitrarily large, the operator L is unbounded.<br />
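The eigenfunctions (II. 127) and the growth of ∥Lψ∥ with l can be inspected on a periodic grid. This is a sketch assuming NumPy; the grid size N, the FFT - based derivative (which is exact for the sampled exponentials as long as |l| < N/2) and the helper names are our own choices.

```python
import numpy as np

N = 64
q = 2 * np.pi * np.arange(N) / N             # periodic grid on [0, 2 pi)
k = np.fft.fftfreq(N, d=1.0 / N)             # integer wave numbers 0, 1, ..., -1

def psi(l):
    """Normalized eigenfunction exp(i l q) / sqrt(2 pi), sampled on the grid."""
    return np.exp(1j * l * q) / np.sqrt(2 * np.pi)

def inner(f, g):
    """Riemann-sum approximation of the inner product on L^2([0, 2 pi])."""
    return np.sum(f.conj() * g) * (2 * np.pi / N)

def L(f):
    """Apply -i d/dq via the FFT: each Fourier mode exp(i k q) is multiplied by k."""
    return np.fft.ifft(k * np.fft.fft(f))

orthonormal = np.isclose(inner(psi(2), psi(2)), 1) and np.isclose(inner(psi(2), psi(3)), 0)
norms = [np.sqrt(inner(L(psi(l)), L(psi(l))).real) for l in (1, 5, 20)]
```

The values in `norms` grow in proportion to l although every `psi(l)` is a unit vector, which is the unboundedness referred to above.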

II. 6. 2. 3<br />

SPECTRAL THEOREM<br />

Von Neumann succeeded in proving the spectral theorem, in the version of II. 3. 1, for infinite -<br />

dimensional Hilbert spaces for which we can formulate the theorem now.<br />

SPECTRAL THEOREM:<br />

To every normal operator A, bounded or unbounded, corresponds a unique mapping of<br />

subsets of Spec A to the set P (H) of projectors on H, ∆ ↦→ P A (∆), having the following<br />

properties:<br />

(i) P ∅ = 0<br />

(ii) P C = 11<br />

(iii) $P_{\cup_i \Delta_i} = \sum_i P_{\Delta_i}$ for all ∆ i mutually disjoint. (II. 128)<br />

For the position operator Q we have an explicit expression for the spectral family of eigenprojectors<br />

of Q,<br />

$$P_Q(\Delta) \, \psi(q) = \begin{cases} \psi(q) & \text{if } q \in \Delta \\ 0 & \text{otherwise} \end{cases} , \tag{II. 129}$$<br />

hence, P Q (∆) is in fact a multiplication with the characteristic function of ∆. The spectral family of<br />

the momentum operator is obtained by applying a Fourier transform to the aforementioned expression.<br />

The probability that a measurement of the physical quantity A, which corresponds to<br />

the normal operator A, yields a value a ∈ ∆ ⊂ R when the physical system is in the state ψ ∈ H, is<br />

Prob ψ (A : ∆) = ⟨ψ | P A (∆) | ψ⟩, (II. 130)



which, using (II. 129), yields for the physical quantity position Q<br />

$$\operatorname{Prob}_{\psi}(Q : \Delta) = \langle \psi(q) | P_Q(\Delta) | \psi(q) \rangle = \int_{\Delta} |\psi(q)|^2 \, dq . \tag{II. 131}$$<br />

All empirical statements of quantum mechanics can therefore be expressed in terms of projectors, or,<br />

more precisely, all empirical statements of quantum mechanics concerning physical quantity A can<br />

be expressed in terms of the spectral family of A.<br />
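As an illustration of the probability rule for position, the following sketch evaluates ⟨ψ | P Q (∆) | ψ⟩ on a grid for a Gaussian wave function and compares it with the closed - form value of the integral of |ψ(q)| 2 over ∆. It assumes NumPy; the Gaussian state, the width `sigma`, the grid and the interval ∆ = [−1, 1] are hypothetical choices, and the sum is only a Riemann approximation of the integral.

```python
import numpy as np
from math import erf, sqrt

sigma = 1.0
q = np.linspace(-10.0, 10.0, 20001)
dq = q[1] - q[0]

# Normalized Gaussian wave function, so that |psi|^2 is a normal density of width sigma.
psi = (2 * np.pi * sigma**2) ** -0.25 * np.exp(-q**2 / (4 * sigma**2))

def project(psi, a, b):
    """Apply P_Q(Delta) for Delta = [a, b]: multiply by the characteristic function."""
    return np.where((q >= a) & (q <= b), psi, 0.0)

def probability(psi, a, b):
    """Riemann-sum approximation of <psi | P_Q(Delta) | psi>."""
    return float(np.sum(psi.conj() * project(psi, a, b)).real * dq)

numeric = probability(psi, -1.0, 1.0)
exact = erf(1.0 / (sigma * sqrt(2)))          # integral of the normal density over [-1, 1]
```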

II. 6. 3<br />

DIRAC<br />

Finally we remark that quantum mechanics à la Dirac willingly and knowingly violates Von Neumann’s<br />

postulates by going outside the Hilbert space. Dirac writes (1958, p. 40)<br />

The bra and ket vectors that we now use form a more general space than a Hilbert space.<br />

To make Dirac’s approach mathematically expressible, the French mathematician Laurent Schwartz<br />

developed the theory of distributions, and the Russian mathematician I.M. Gel’fand developed<br />

the theory of rigged Hilbert spaces. Contrary to Schrödinger and Von Neumann, Dirac regarded<br />

wave mechanics as a generalization of matrix mechanics, going from a discrete index to a continuous<br />

index, making a transition from square summable sequences of complex numbers to wave functions,<br />

and from infinite matrices to integral kernels.<br />

II. 6. 4<br />

SUMMARY<br />

A complex Hilbert space is, by definition, a complete, separable complex vector space with an inner<br />

product which is related to the norm by ∥ψ∥ 2 = ⟨ψ | ψ⟩, its dimension is either finite or countably<br />

infinite. Contrary to the infinite - dimensional case, the requirements of separability and completeness<br />

are superfluous in the finite - dimensional case because they are derivable from the other properties of<br />

a Hilbert space, but in the vast majority of physical applications infinite - dimensional Hilbert spaces<br />

and unbounded operators are required.


III<br />

THE POSTULATES<br />

The sciences do not try to explain, they hardly even try to interpret, they mainly make<br />

models. By a model is meant a mathematical construct which, with the addition of certain<br />

verbal interpretations, describes observed phenomena. The justification of such a<br />

mathematical construct is solely and precisely that it is expected to work [. . . ]<br />

— John von Neumann<br />

It would seem that the theory is exclusively concerned about ‘results of measurement’,<br />

and has nothing to say about anything else. [. . . ] To restrict quantum mechanics to be<br />

exclusively about piddling laboratory operations is to betray the great enterprise.<br />

— John Bell<br />

In this chapter we will formulate and discuss Von Neumann’s postulates. Next, we will extend the<br />

quantum mechanical concept of ‘pure’ states by adding ‘mixed’ states, and show how quantum<br />

mechanics treats states of subsystems of composite physical systems. Finally, we apply these<br />

concepts to spin 1/2 particles and we derive some formulas needed in subsequent chapters.<br />

III. 1<br />

VON NEUMANN’S POSTULATES<br />

We are now ready to give, in some cases in simplified fashion, Von Neumann’s postulates of<br />

quantum mechanics, which link the physical concepts of the theory to the mathematical concepts of<br />

its formalism.<br />

1. State postulate, pure states. Every physical system has a corresponding Hilbert space H; the<br />

states of the system are completely described by unit vectors in H. A composite physical<br />

system corresponds to the direct product of the Hilbert spaces of the subsystems.<br />

2. Observables postulate. Every physical quantity A of the system corresponds to a self - adjoint<br />

operator A in H. Dirac called the quantities ‘observables’.<br />

3. Spectrum postulate. The only possible outcomes which can be found upon measurement of a<br />

physical quantity A, corresponding to an operator A, are values from the spectrum of A.<br />

4. Born postulate, discrete case. If the system is in a state |ψ⟩ ∈ H, and a measurement is made<br />

of a physical quantity A, corresponding to an operator A with a discrete spectrum Spec A,<br />

the probability of finding the outcome a i ∈ Spec A is equal to<br />

$$\mathrm{Prob}_{|\psi\rangle}(a_i) = \langle\psi | P_{a_i} | \psi\rangle, \tag{III. 1}$$



where P ai is the projector from the spectral decomposition (II. 57) of A.<br />

5. Schrödinger postulate. As long as no measurements are made on the system, the time evolution<br />

of the system is described by a unitary transformation,<br />

$$|\psi(t)\rangle = U(t, t_0)\, |\psi(t_0)\rangle. \tag{III. 2}$$<br />

6. Projection postulate, discrete case. If the system is in a state |ψ⟩ ∈ H and a measurement is<br />

made on a physical quantity A corresponding to an operator A with discrete spectrum, and the<br />

outcome of the measurement is the eigenvalue a i ∈ Spec A, the system is, immediately after<br />

the measurement, in the eigenstate<br />

$$|\psi\rangle \longrightarrow \frac{P_{a_i} |\psi\rangle}{\| P_{a_i} |\psi\rangle \|}. \tag{III. 3}$$<br />

The first four postulates connect the (undefined) concepts ‘physical system’, ‘state’, ‘quantity’<br />

and ‘measurement’ to mathematical concepts. In the literature the postulates 3 and 4 are sometimes<br />

combined into the so - called measurement postulate. The last two postulates determine the evolution<br />

of the states in time.<br />
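As an illustration of the Born postulate (III. 1) and the projection postulate (III. 3), the following minimal sketch uses a hypothetical observable on C³ with a doubly degenerate eigenvalue. The projectors `P_plus` and `P_minus`, the vector `psi` and all function names are assumptions chosen for this example; they do not appear elsewhere in these notes.<br />

```python
import math

def inner(u, v):
    """Inner product <u|v>, antilinear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def apply(P, v):
    """Apply the matrix P (a list of rows) to the vector v."""
    return [sum(P[r][c] * v[c] for c in range(len(v))) for r in range(len(P))]

def norm(v):
    return math.sqrt(inner(v, v).real)

# Hypothetical observable on C^3: eigenvalue +1 is doubly degenerate
# (eigenspace spanned by e_1, e_2), eigenvalue -1 belongs to e_3.
P_plus = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]
P_minus = [[0, 0, 0], [0, 0, 0], [0, 0, 1]]

# Unit vector |psi> = (1, i, 1) / sqrt(3).
s = 1 / math.sqrt(3)
psi = [s + 0j, 1j * s, s + 0j]

def born_probability(P, state):
    """Prob(a_i) = <psi| P_ai |psi>, cf. (III. 1)."""
    return inner(state, apply(P, state)).real

def projected_state(P, state):
    """Normalized projection P|psi> / ||P|psi>||, cf. (III. 3)."""
    projected = apply(P, state)
    n = norm(projected)
    return [x / n for x in projected]

p_plus = born_probability(P_plus, psi)    # 2/3
p_minus = born_probability(P_minus, psi)  # 1/3
post = projected_state(P_plus, psi)       # (1, i, 0) / sqrt(2)
```

The state after the measurement is again a unit vector and lies entirely in the eigenspace of the outcome that was found, while the two probabilities add up to 1.<br />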

Ad 1. The state postulate implies that systems with the same |ψ⟩ are in the same physical state.<br />

The way in which this state vector |ψ⟩ is produced, is thus unimportant. Also the fact that two systems<br />

which are described by the same |ψ⟩ can, upon measurement, have different outcomes, which is<br />

allowed according to the measurement postulate, is no reason to regard their states as being different.<br />

On the other hand, not every pair of distinct unit vectors represents different states.<br />

Usually it is assumed that vectors whose only difference is a phase factor e^{iθ}, with θ ∈ R, describe<br />

the same physical state, because they predict the same probability distributions for outcomes of all<br />

possible measurements. Such vectors form a so - called unit ray.<br />
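The claim about phase factors can be checked directly with the Born postulate (III. 1). The short calculation below only uses that e^{iθ} is a complex number of modulus 1 and therefore commutes with every operator.<br />

```latex
\mathrm{Prob}_{e^{i\theta}|\psi\rangle}(a_i)
  = \langle\psi |\, e^{-i\theta}\, P_{a_i}\, e^{i\theta} \,| \psi\rangle
  = e^{-i\theta} e^{i\theta}\, \langle\psi | P_{a_i} | \psi\rangle
  = \mathrm{Prob}_{|\psi\rangle}(a_i).
```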

The statement that all unit vectors of H describe physical states also need not be true in general.<br />

Notice that the set of unit vectors is extremely large. Even for a particle in one spatial dimension the<br />

Hilbert space is infinite - dimensional. Furthermore, some types of superposition, linear combinations<br />

of two or more eigenstates, do not occur in nature, for instance superpositions of states with different<br />

charges (electric, baryonic, etc.), or superpositions of states with integer and half - integer spin.<br />

It is possible to prohibit these superpositions in the theory by introducing so - called superselection<br />

rules. The requirement that, for identical particles, only states are allowed which are symmetric or<br />

antisymmetric under permutation of the particles is an example of such a superselection rule. In the<br />

presence of a superselection rule the class of allowed states breaks up into a direct sum of the<br />

eigenspaces of the superselection operator,<br />

$$\mathcal{H} = \bigoplus_{j \geq 1} \mathcal{H}_j. \tag{III. 4}$$<br />

Within one such subspace H j , called a coherent sector, superpositions of all states are allowed.



In absence of superselection rules the entire Hilbert space is one coherent sector. Then the superposition<br />

principle is valid in general, which says that for every two states |ψ⟩ and |ϕ⟩ the linear<br />

combination a|ψ⟩ + b|ϕ⟩, with |a|² + |b|² = 1, is a state too. Because nature apparently imposes<br />

superselection rules, which can sometimes be derived from symmetries as was first shown by Wick,<br />

Wightman and Wigner (1952), the superposition principle only applies for coherent sectors. Since<br />

superpositions of vectors from different coherent sectors do not correspond to physical states, the<br />

state postulate has to be accordingly reformulated.<br />

As far as composite physical systems are concerned, we say that the system is in an entangled<br />

state iff the state vector is not factorizable, see section II. 5. In the thought experiment of EPR such an<br />

entangled state plays the principal part. Schrödinger (1935b) was the first to show that the occurrence<br />

of entanglement is widespread in quantum mechanics and he considered this to be the cardinal distinction<br />

between classical mechanics and quantum mechanics. In section III. 2 we will further extend<br />

the notion of state.<br />

Ad 2. The question whether every self - adjoint operator represents a physical quantity has, according<br />

to some authors, a negative answer. Wigner, for instance, asked how to measure the quantity corresponding<br />

to the self - adjoint operator P + Q. Another example is a projector which projects on<br />

superpositions of vectors from different coherent sectors, as we saw in Ad 1.<br />

Also the reverse question, whether every physically meaningful quantity is represented by a self -<br />

adjoint operator, is controversial. For some physical quantities which correspond to experimentally<br />

clear measuring procedures, such as ‘time of decay’ in case of a radioactive atom, or the ‘phase’ of<br />

a harmonic oscillator, no associated self - adjoint operator can be found. In later generalizations of<br />

the formalism of quantum mechanics this problem is somewhat relieved by considering more general<br />

mathematical constructions, the so - called positive operator valued measures, which are also capable<br />

of representing physical quantities; see for example A.S. Holevo (1982) or Busch, Grabowski and<br />

Lahti (1995).<br />

Another question is which operator exactly corresponds to which quantity. Again, no commonly<br />

accepted recipe is available here. Generally, one starts with demanding that certain classical quantities<br />

are represented by special operators. It is standard procedure to choose position and momentum to<br />

be these quantities and to require that the corresponding operators satisfy the canonical commutation<br />

relation of Born and Jordan (1925), and Dirac (1925),<br />

$$[P, Q] := PQ - QP = -i\, \mathbb{1}. \tag{III. 5}$$<br />

Next, a certain ‘quantization prescription’ is chosen which can be used to construct an operator<br />

corresponding to more general physical quantities. Dirac’s mathematical prescription of replacing<br />

Poisson brackets by commutators is famous. Unfortunately, this prescription is inconsistent. The<br />

alternative prescriptions for quantization which have been presented for this purpose, do not mutually<br />

agree. We will not discuss this problem further.<br />

Ad 4. With P ψ = |ψ⟩ ⟨ψ|, as defined in (II. 37), P ai as in (II. 54), and using the relation (II. 56),<br />

the probability of finding a value a i ∈ Spec A, in a measurement of the physical quantity A with



corresponding operator A, can also be written as<br />

$$\langle\psi | P_{a_i} | \psi\rangle = \sum_{j=1}^{n_i} \langle\psi | a_i, j\rangle \langle a_i, j | \psi\rangle = \sum_{j=1}^{n_i} |\langle a_i, j | \psi\rangle|^2 = \mathrm{Tr}\, P_{a_i} P_\psi. \tag{III. 6}$$<br />

Likewise, the expectation value of A, with A as defined in (II. 55), is<br />

$$\langle A\rangle_\psi = \langle\psi | A | \psi\rangle = \sum_{i=1}^{M} \sum_{j=1}^{n_i} \langle\psi | a_i, j\rangle\, a_i\, \langle a_i, j | \psi\rangle = \sum_{i=1}^{M} \sum_{j=1}^{n_i} a_i\, |\langle a_i, j | \psi\rangle|^2 = \mathrm{Tr}\, A P_\psi. \tag{III. 7}$$<br />

In case there is no degeneracy, (III. 6) takes the simpler form<br />

$$\langle\psi | P_{a_i} | \psi\rangle = |\langle a_i | \psi\rangle|^2 = \mathrm{Tr}\, P_\psi P_{a_i}. \tag{III. 8}$$<br />

We also note that in case A has a continuous spectrum, as discussed in section II. 6, we have (II. 130),<br />

Prob |ψ⟩ (A : ∆) = ⟨ψ | P A (∆) | ψ⟩. ▹ (III. 9)<br />

Ad 5. If the system is invariant under translations in time, the unitary evolution operator U (t, t 0 )<br />

depends only on the time difference t − t 0 , and can be written as U (t − t 0 ). The evolution operators<br />

then form a continuous abelian Lie group, the group of translations in time, satisfying the group multiplication<br />

structure U(t) U(t′) = U(t + t′). According to Stone’s theorem (1932),<br />

they can be written as<br />

$$U(t) = e^{-iHt}, \tag{III. 10}$$<br />

where the self - adjoint operator H, the generator of the group, is unique. H is called the<br />

Hamiltonian. Therefore, the evolution operator U (t − t 0 ) from the Schrödinger postulate can be<br />

written as<br />

$$U(t - t_0) = e^{-iH(t - t_0)}, \tag{III. 11}$$<br />

and the Schrödinger equation is, according to (III. 2),<br />

$$i \frac{d}{dt} |\psi(t)\rangle = i \frac{d}{dt}\, e^{-iH(t - t_0)} |\psi(t_0)\rangle = H |\psi(t)\rangle. \tag{III. 12}$$
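The properties of the evolution operator used in Ad 5 can be checked numerically for a two - level system. The sketch below assumes, purely for illustration, the Hamiltonian H = σ_x (the Pauli matrix, in units where ħ = 1), for which the exponential has the closed form exp(−i σ_x t) = cos(t) 𝟙 − i sin(t) σ_x because σ_x² = 𝟙. All function and variable names are hypothetical.<br />

```python
import math

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(A):
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]

def close(A, B, tol=1e-12):
    n = len(A)
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(n) for j in range(n))

IDENTITY = [[1 + 0j, 0j], [0j, 1 + 0j]]
H = [[0j, 1 + 0j], [1 + 0j, 0j]]  # assumed Hamiltonian: sigma_x

def U(t):
    """exp(-i H t) for H = sigma_x: cos(t) 1 - i sin(t) sigma_x."""
    c, s = math.cos(t), math.sin(t)
    return [[complex(c, 0), complex(0, -s)],
            [complex(0, -s), complex(c, 0)]]

t1, t2 = 0.3, 1.1

# U(t) is unitary and satisfies the group law U(t) U(t') = U(t + t').
is_unitary = close(matmul(U(t1), dagger(U(t1))), IDENTITY)
obeys_group_law = close(matmul(U(t1), U(t2)), U(t1 + t2))

# Central finite difference for i dU/dt = H U(t), the operator form of (III. 12).
h = 1e-6
lhs = [[1j * (U(t1 + h)[i][j] - U(t1 - h)[i][j]) / (2 * h) for j in range(2)]
       for i in range(2)]
obeys_schrodinger = close(lhs, matmul(H, U(t1)), tol=1e-8)
```

The finite - difference check is only approximate, which is why a looser tolerance is used there than for the two exact identities.<br />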



Ad 6. This is the notorious projection postulate. It introduces a second kind of dynamics in<br />

the theory; a projector is, in general, not unitary and therefore it cannot be written in terms of the<br />

Schrödinger postulate. Some authors do not regard the projection postulate to be a part of quantum<br />

mechanics. The problem is then how to account for the measurement process using the other<br />

postulates; this will be discussed further in chapter VIII.<br />

The version of the projection postulate we gave is a stronger version of Von Neumann’s original<br />

formulation and is due to G. Lüders (1951). Von Neumann only required that the state, directly<br />

after a measurement of A which has a i as an outcome, is an (arbitrary) eigenstate with eigenvalue a i .<br />

In Lüders’ version the state directly after the measurement, (III. 3), is the normalized projection of the<br />

original state on the eigenspace of a i . Here the disturbance of the original state is as small as possible,<br />

in the sense that the angle between the original and the final state is as small as possible.<br />

If the operator A is maximal, both versions coincide because in that case P ai is a 1 - dimensional<br />

projector.<br />

III. 2<br />

PURE AND MIXED STATES<br />

A state vector, a unit vector in H, provides a description of the system which is as complete as the<br />

theory allows. In classical mechanics such a description corresponds, for a system of point particles,<br />

to giving all coordinates of position and momentum; (q, p) := (q 1 , . . . , q n ; p 1 , . . . , p n ) is a point<br />

in the phase space Γ. In practice, the value of these coordinates is often not known precisely and a<br />

probability distribution ρ(q, p) is introduced over the phase space Γ. The integral of ρ(q, p) over ∆<br />

is the probability to find the system in the subset ∆ ⊆ Γ. The probabilities have to be positive and<br />

normalized,<br />

$$\rho(q, p) \geq 0 \quad \text{and} \quad \int_\Gamma \rho(q, p)\, dq\, dp = 1. \tag{III. 13}$$<br />

In classical physics it is also customary to extend the notion of state and also call a probability<br />

distribution ρ a (generalized) state of the system. A physical quantity A corresponds to a real function<br />

on the phase space, A : Γ → R, and the expectation value of A in the state ρ is<br />

$$\langle A\rangle_\rho := \int_\Gamma A(q, p)\, \rho(q, p)\, dq\, dp. \tag{III. 14}$$<br />

The states ρ form a convex set, i.e., if ρ 1 and ρ 2 are states on Γ and w 1 and w 2 both are real numbers<br />

satisfying<br />

$$0 \leq w_i \leq 1 \quad \text{and} \quad w_1 + w_2 = 1, \tag{III. 15}$$<br />

then<br />

$$\rho := w_1 \rho_1 + w_2 \rho_2 \tag{III. 16}$$<br />

also satisfies the requirements of (III. 13) and therefore it is also a state on Γ. This convex set of states<br />

is written S (Γ).



A state which cannot be decomposed according to (III. 16) is called a pure state, otherwise it<br />

is called a mixed state. The pure states are the states ρ concentrated on a single point of Γ, the δ -<br />

‘functions’. Generally, the elements of a convex set which cannot be written in the form (III. 16),<br />

with w 1 , w 2 ≠ 0, are called extreme elements of that set, therefore in our case the extreme elements<br />

are the pure states. Every element of a convex set can always be written as a convex sum of extreme<br />

elements. This corresponds to the expansion of ρ in δ - functions,<br />

$$\rho(q, p) = \int_\Gamma \rho(q', p')\, \delta(q - q')\, \delta(p - p')\, dq'\, dp'. \tag{III. 17}$$<br />

The dynamics of an arbitrary state follows from the Hamiltonian equations of motion of the pure<br />

states, which can be derived from the principle of stationary action. This holds for conservative systems, which<br />

the quantum mechanical systems in these lecture notes are assumed to be. We will come back to the<br />

derivation of the equations in section VI. 5.<br />

The Hamiltonian equations of motion are<br />

$$\dot{q} = \frac{\partial H}{\partial p} \quad \text{and} \quad \dot{p} = -\frac{\partial H}{\partial q}. \tag{III. 18}$$<br />

To find the equation of motion in terms of ρ we use Liouville’s theorem, which states that for points<br />

moving in phase space according to the Hamiltonian equations of motion the probability<br />

distribution ρ(q, p, t) is constant along the trajectories. Using (III. 18) and the Poisson brackets<br />

$$\{H, \rho\} := \frac{\partial H}{\partial q} \frac{\partial \rho}{\partial p} - \frac{\partial \rho}{\partial q} \frac{\partial H}{\partial p}, \tag{III. 19}$$<br />

the Liouville equation, the equation of motion for the state ρ,<br />

$$\frac{d\rho}{dt} = \frac{\partial \rho}{\partial t} + \frac{\partial \rho}{\partial q}\, \dot{q} + \frac{\partial \rho}{\partial p}\, \dot{p} = 0, \tag{III. 20}$$<br />

equals<br />

$$\frac{\partial \rho}{\partial t} = \{H, \rho\}. \tag{III. 21}$$<br />
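The step from (III. 20) to (III. 21) can be written out explicitly: solve (III. 20) for the partial time derivative, insert the Hamiltonian equations (III. 18), and compare with the definition (III. 19) of the Poisson bracket.<br />

```latex
\frac{\partial \rho}{\partial t}
  = -\frac{\partial \rho}{\partial q}\, \dot{q} - \frac{\partial \rho}{\partial p}\, \dot{p}
  = -\frac{\partial \rho}{\partial q} \frac{\partial H}{\partial p}
    + \frac{\partial \rho}{\partial p} \frac{\partial H}{\partial q}
  = \frac{\partial H}{\partial q} \frac{\partial \rho}{\partial p}
    - \frac{\partial \rho}{\partial q} \frac{\partial H}{\partial p}
  = \{H, \rho\}.
```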

Now we will consider, analogous to the classical case, a probability distribution of the state vectors<br />

in H. With help of the state ρ we introduce a mapping µ of subsets ∆ of Γ to R,<br />

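$$\mu(\Delta) := \int_\Delta \rho(q, p)\, dq\, dp \quad \text{with} \quad \Delta \subseteq \Gamma. \tag{III. 22}$$<br />

This mapping µ is additive,<br />

$$\mu\Big(\bigcup_i \Delta_i\Big) = \sum_{i \geq 1} \mu(\Delta_i) \tag{III. 23}$$<br />

for every countable sequence of disjoint ∆ i ⊂ Γ. Furthermore,<br />

$$0 \leq \mu(\Delta) \leq 1, \quad \mu(\emptyset) = 0 \quad \text{and} \quad \mu(\Gamma) = 1. \tag{III. 24}$$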



Each mapping which maps a measurable subset of Γ to a number in the interval [0, 1], thereby<br />

satisfying (III. 23) and (III. 24), is called a probability measure. It is not difficult to see that every<br />

probability distribution ρ corresponds univocally to a probability measure and vice versa, this is even<br />

true for δ - functions. Therefore we can also represent a state, in the extended meaning, by a probability<br />

measure on Γ.<br />

Analogous to this reasoning we now aim to let the physical states in quantum mechanics correspond<br />

to probability measures on H. Since we want to preserve the structure of H, we do not consider<br />

arbitrary subsets of H, instead we look at the set P(H) of all subspaces of H generated by orthogonal<br />

projectors, or, equivalently, at the projectors projecting on those subspaces. What we are thus looking<br />

for is a probability measure on P (H), i.e. a mapping<br />

µ : P (H) → [0, 1], (III. 25)<br />

which is additive in the relevant manner; if P 1 , P 2 , . . . , P N is a set of pairwise orthogonal projectors,<br />

P i ⊥ P j for i ≠ j, the following holds,<br />

$$\mu\Big(\sum_j P_j\Big) = \sum_j \mu(P_j), \tag{III. 26}$$<br />

and the mapping satisfies<br />

$$\mu(0) = 0 \quad \text{and} \quad \mu(\mathbb{1}) = 1. \tag{III. 27}$$<br />

In 1957 A.M. Gleason proved the following theorem.<br />

GLEASON’S THEOREM:<br />

Every probability measure µ on P (H) can, under the condition that dim H > 2, be<br />

written as<br />

µ(P ) = Tr P W, (III. 28)<br />

for a certain operator W satisfying the following requirements: 1<br />

(i) W = W † ,<br />

(ii) ⟨ψ | W | ψ⟩ ≥ 0 ∀ |ψ⟩ ∈ H,<br />

(iii) Tr W = 1. (III. 29)<br />

The original proof of Gleason’s theorem is extraordinarily difficult. In the appendix of these<br />

lecture notes, p. 183, ff, we prove a simplified version of this theorem for the interested reader.<br />

1 Conditions (i) and (ii) of (III. 29) are not mutually independent in the complex Hilbert space of the formalism of<br />

quantum mechanics, in this space (i) is in fact superfluous. In a complex Hilbert space all positive operators are self -<br />

adjoint, and an operator A is uniquely defined by all matrix elements of the form ⟨ψ | A | ψ⟩. This is, however, not the case<br />

in a real space, where Gleason’s theorem is also valid. In that case (i) and (ii) are independent.



Here we prove that (III. 28) indeed satisfies the requirements (III. 25), (III. 26) and (III. 27) of a<br />

probability measure.<br />

Proof<br />

Requirement (III. 27) is obvious, and verification of (III. 26) can be done with (II. 31). To<br />

prove (III. 25), i.e.<br />

µ(P ) = Tr P W ∈ [0, 1], (III. 30)<br />

we choose an orthonormal basis of eigenvectors of P ; P |v k ⟩ = |v k ⟩, P |u l ⟩ = 0. Then<br />

$$\mathrm{Tr}\, P W = \sum_k \langle v_k | P W | v_k\rangle + \sum_l \langle u_l | P W | u_l\rangle = \sum_k \langle v_k | W | v_k\rangle \geq 0, \tag{III. 31}$$<br />

due to the positivity of the operator W . If P is a projector, then 𝟙 − P is one also, therefore<br />

$$\mathrm{Tr}\, (\mathbb{1} - P) W \geq 0, \tag{III. 32}$$<br />

such that indeed, with (III. 29) (iii), we see that<br />

$$0 \leq \mathrm{Tr}\, P W \leq \mathrm{Tr}\, P W + \mathrm{Tr}\, (\mathbb{1} - P) W = \mathrm{Tr}\, (P + \mathbb{1} - P) W = \mathrm{Tr}\, W = 1. \qquad \Box \tag{III. 33}$$<br />
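The properties just proved can be illustrated numerically. The sketch below assumes a hypothetical diagonal state operator `W` on C³ and three mutually orthogonal 1 - dimensional projectors; it evaluates µ(P) = Tr P W as in (III. 28) and checks the range (III. 25), the additivity (III. 26) and the normalization (III. 27). The particular numbers are chosen only for this example.<br />

```python
import math

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def projector(v):
    """|v><v| for a unit vector v."""
    return [[a * b.conjugate() for b in v] for a in v]

def add(A, B):
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

def mu(P, W):
    """mu(P) = Tr P W, cf. (III. 28)."""
    return trace(matmul(P, W)).real

# Hypothetical state operator: self-adjoint, positive, trace 1.
W = [[0.5, 0.0, 0.0],
     [0.0, 0.3, 0.0],
     [0.0, 0.0, 0.2]]

# Three pairwise orthogonal 1-dimensional projectors adding up to the identity.
r = 1 / math.sqrt(2)
P1 = projector([r, r, 0.0])
P2 = projector([r, -r, 0.0])
P3 = projector([0.0, 0.0, 1.0])

values = [mu(P, W) for P in (P1, P2, P3)]  # approximately 0.4, 0.4, 0.2
additive = abs(mu(add(P1, P2), W) - (values[0] + values[1])) < 1e-12
normalized = abs(sum(values) - 1.0) < 1e-12
```

Note that P1 and P2 do not project on eigenvectors of `W`, yet the values still lie in [0, 1] and add up correctly, as the proof requires.<br />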

An important aspect of Gleason’s theorem is the fact that the probability measure (III. 28) is<br />

continuous in P . For measures representing pure states, this is proved in the appendix on p. 183, ff.<br />

If dim H = 2, discontinuous probability measures exist on P (H). To see this, consider a real H.<br />

The 1 - dimensional subspaces are lines through the origin connecting opposite points on the circle.<br />

Attaching values as in the diagram, figure III. 1,<br />

[Figure III. 1: A discontinuous measure for dim H = 2. The diagram shows the circle with two orthogonal projectors, labelled P 1 and P 2 , and the values 0 and 1 attached.]<br />

we see that, with µ(0) = 0 and µ(𝟙) = 1, for two arbitrary orthogonal projectors we have<br />

$$\mu(P_1) + \mu(P_2) = 1 = \mu(\mathbb{1}) = \mu(P_1 + P_2). \tag{III. 34}$$



This measure is indeed additive, but we also see that it is not continuous, and consequently, Gleason’s<br />

theorem does not hold for dim H = 2.<br />

The operator W is known as the statistical operator, or as the density matrix, or the state operator.<br />

In analogy with the classical case we extend the notion of state and call W a state of the physical<br />

system. From now on states will be represented by the state operators W .<br />

The state operators W form a set S(H) which is again convex; if W 1 and W 2 are state operators,<br />

then<br />

$$W = w_1 W_1 + w_2 W_2 \quad \text{with} \quad 0 \leq w_i \leq 1 \quad \text{and} \quad w_1 + w_2 = 1 \tag{III. 35}$$<br />

is again a state operator. The most simple example of a state operator is a 1 - dimensional projector.<br />

A higher - dimensional projector is not a state operator.<br />

EXERCISE 16. Why not?<br />

Before showing how the state operators W represent states, we will prove the next theorem.<br />

THEOREM:<br />

The 1 - dimensional projectors in P(H) are the extreme elements of the convex set S(H)<br />

of all state operators W on H.<br />

Proof<br />

To prove this theorem we first have to show that P ψ cannot be written in the form<br />

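$$P_\psi = w\, W_1 + (1 - w)\, W_2, \quad \text{with} \quad 0 < w < 1. \tag{III. 36}$$<br />

Suppose it could be done. Then, using (II. 37), it also has to hold that, for all |ϕ⟩ ⊥ |ψ⟩,<br />

$$\langle\phi | P_\psi | \phi\rangle = 0 = w\, \langle\phi | W_1 | \phi\rangle + (1 - w)\, \langle\phi | W_2 | \phi\rangle, \tag{III. 37}$$<br />

which implies, since both terms are non - negative by (III. 29) (ii),<br />

$$\langle\phi | W_1 | \phi\rangle = \langle\phi | W_2 | \phi\rangle = 0. \tag{III. 38}$$<br />

Now, a positive operator can always be written as the square of a self - adjoint operator, W i = A i ² ,<br />

yielding that for all |ϕ⟩ ⊥ |ψ⟩<br />

$$\langle\phi | W_i | \phi\rangle = \langle\phi | A_i^2 | \phi\rangle = \| A_i |\phi\rangle \|^2 = 0 \;\Rightarrow\; A_i |\phi\rangle = 0 \;\Rightarrow\; W_i |\phi\rangle = 0. \tag{III. 39}$$<br />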

Therefore, W 1 and W 2 map to the 1 - dimensional space spanned by |ψ⟩. They are, according<br />

to (III. 29), therefore, both identical to the projector P ψ ,<br />

W 1 = P ψ = W 2 . (III. 40)<br />

We thus conclude that P ψ cannot be split up into other state operators.



Now we have to show that the 1 - dimensional projectors are the only extreme elements. A state<br />

operator is self - adjoint and has, according to the spectral theorem, p. 26, a complete orthonormal<br />

set of eigenstates |w i , j⟩, where j = 1, . . . , n i labels the degeneracy of the eigenvalue w i , and there are M ∈ N +<br />

different w i . We can write an arbitrary W ∈ S (H) as<br />

$$W = \sum_{i=1}^{M} \sum_{j=1}^{n_i} w_i\, W_{i,j}, \tag{III. 41}$$<br />

where<br />

$$W_{i,j} := |w_i, j\rangle \langle w_i, j|, \quad \text{and} \quad \sum_{i=1}^{M} n_i = \dim \mathcal{H}. \tag{III. 42}$$<br />

For w i it holds that<br />

$$\sum_{i=1}^{M} n_i w_i = 1 \quad \text{and} \quad 0 \leq w_i \leq 1 \tag{III. 43}$$<br />

because, according to (III. 29) (ii) and (III. 29) (iii),<br />

$$w_i = \langle w_i, j | W | w_i, j\rangle \geq 0 \quad \text{and} \quad \mathrm{Tr}\, W = \sum_{i=1}^{M} n_i w_i = 1. \tag{III. 44}$$<br />

Thus we see that the sum (III. 41) is a convex decomposition of W .<br />

A convex decomposition W = w 1 W 1 + w 2 W 2 can always be decomposed further through<br />

expansion of W 1 and W 2 . In case of a bounded convex set the expansion ends on extreme<br />

elements. Therefore, if W is an extreme element, the sum has to reduce to one term. In that case<br />

W is a 1 - dimensional projector, and we see that all extreme elements of S(H) are 1 - dimensional<br />

projectors. □<br />

Physical states which are represented by 1 - dimensional projectors are called pure states, whereas<br />

states which can be divided non - trivially are called mixed states or mixtures. To see that pure states<br />

correspond to the vector states of H, consider W to be the 1 - dimensional projector P ψ projecting<br />

on the vector |ψ⟩. The state defined by this state operator through (III. 28) behaves exactly like the<br />

vector state |ψ⟩; for arbitrary |ϕ⟩ it holds that<br />

$$\mu_W(P_\phi) = \mathrm{Tr}\, P_\phi W = \mathrm{Tr}\, P_\phi P_\psi = \langle\psi | P_\phi | \psi\rangle = |\langle\psi | \phi\rangle|^2, \tag{III. 45}$$<br />

which means that the probability to find the state |ϕ⟩ in the state |ψ⟩ is equal to (III. 6). 2 In particular,<br />

µ(P ψ ) = 1, and if |ϕ⟩ ⊥ |ψ⟩, then µ(P ϕ ) = 0. We see that the state P ψ assigns a<br />

probability to the orthogonal set of vectors of which |ψ⟩ is an element, which is totally concentrated<br />

on the vector |ψ⟩. In this sense P ψ is analogous to a δ - distribution on the classical phase space.<br />

2 ‘The probability to find the state |ϕ⟩’ is shorthand for the probability to find, upon measurement of the quantity corresponding<br />

to the projector |ϕ⟩ ⟨ϕ|, the value 1.<br />

But the 1 - dimensional projectors are, generally, not mutually orthogonal, which means that the<br />

pure state P ψ also assigns a positive probability to P ϕ if ⟨ϕ | ψ⟩ ≠ 0. This is in contrast to the<br />

classical case, where the pure state concentrated on (q 0 , p 0 ), i.e., δ(q − q 0 ) δ(p − p 0 ), always<br />

assigns a zero probability to every other pure state. This is characteristic for quantum mechanics and<br />

is the cause for the radical difference between quantum states and classical states.<br />
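A simple worked instance of this non - orthogonality, assuming the usual spin 1/2 conventions (the labels |z↑⟩, |z↓⟩ and |x↑⟩ are introduced here only for illustration): take |ψ⟩ = |z↑⟩ and |ϕ⟩ = |x↑⟩, where |x↑⟩ is the equal - weight superposition of the orthonormal vectors |z↑⟩ and |z↓⟩. Then, by (III. 45), the pure state P ψ assigns probability 1/2 to the different pure state P ϕ .<br />

```latex
|x{\uparrow}\rangle = \tfrac{1}{\sqrt{2}} \big( |z{\uparrow}\rangle + |z{\downarrow}\rangle \big),
\qquad
\mu_{P_\psi}(P_\phi) = |\langle z{\uparrow} | x{\uparrow}\rangle|^2
  = \Big| \tfrac{1}{\sqrt{2}} \Big|^2 = \tfrac{1}{2}.
```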

In this section we showed that a unique correspondence exists between the pure states, the extreme<br />

elements of the convex set S (H) of state operators, the 1 - dimensional projectors and, up to a phase<br />

factor, the unit vectors in H. We will conclude this section with a formulation of the extended version<br />

of the state postulate (1) and the generalization of the Born postulate (4).<br />

1 ′ State postulate, mixed and pure states. Every physical system has a corresponding Hilbert<br />

space. The mixed physical states of the system uniquely correspond to the state operators<br />

within S (H), the pure physical states of the system uniquely correspond to the state operators<br />

on the boundary ∂ S (H). States of composite physical systems correspond bijectively to state<br />

operators on the direct product space of the state spaces H 1 and H 2 of the subsystems, i.e., with<br />

elements of S (H 1 ⊗ H 2 ).<br />

◃ 3<br />

4 ′ Generalized Born postulate, discrete case. If the system is in the state W ∈ S (H), the probability<br />

to find, upon measurement of quantity A corresponding to an operator A having a discrete<br />

spectrum, an eigenvalue in ∆ ⊆ Spec A, is equal to<br />

Prob W (A : ∆) = Tr P A (∆)W, (III. 46)<br />

where P A (∆) ∈ P (H) projects on the subspace spanned by the eigenvectors having their eigenvalues<br />

in ∆.<br />

III. 3<br />

THE INTERPRETATION OF MIXED STATES<br />

The spectral decomposition (III. 41) suggests an interpretation of the state W . As we saw in (III. 45),<br />

a pure state W = P ψ corresponds to a probability measure µ, which we call concentrated on the<br />

eigenvector |ψ⟩ since µ(P ψ ) = 1. In the same way an arbitrary W corresponds, according to (III. 41),<br />

to a probability measure on its orthonormal set of eigenvectors |w i , j⟩, assigning a probability w i to<br />

the eigenvector |w i , j⟩. With the projector W i,j as in (III. 42), we have<br />

$$\mu_W(W_{i,j}) = \mathrm{Tr}\, W_{i,j} W = \mathrm{Tr} \sum_{k=1}^{M} \sum_{l=1}^{n_k} W_{i,j}\, w_k\, |w_k, l\rangle \langle w_k, l| = \sum_{k=1}^{M} \sum_{l=1}^{n_k} w_k\, |\langle w_k, l | w_i, j\rangle|^2 = \sum_{k=1}^{M} \sum_{l=1}^{n_k} w_k\, \delta_{ik}\, \delta_{jl} = w_i. \tag{III. 47}$$<br />

3 Notice how in this extended version of the state postulate the annoying phase factor has disappeared.



The expectation value of the operator A is, according to (III. 7) with P ψ replaced by W and with the eigenvectors of W also forming<br />

an orthonormal basis,<br />

⟨A⟩ W = Tr AW, (III. 48)<br />

which yields, using again (III. 7) and the spectral decomposition of W , (III. 41),<br />

$$\langle A\rangle_W = \mathrm{Tr} \sum_{i=1}^{M} \sum_{j=1}^{n_i} A\, w_i\, W_{i,j} = \sum_{i=1}^{M} \sum_{j=1}^{n_i} w_i\, \mathrm{Tr}\, A W_{i,j} = \sum_{i=1}^{M} w_i \sum_{j=1}^{n_i} \langle w_i, j | A | w_i, j\rangle. \tag{III. 49}$$<br />

This is exactly the sum of the expectation values of A in the states |w i , j⟩, weighted by the w i .<br />

The above suggests that W describes an ensemble of physical systems each of which is in one<br />

of the pure states |w i , j⟩ and that w i is the fraction of systems in |w i , j⟩. This is the way Von Neumann<br />

originally introduced state operators, in analogy to ensembles in classical statistical mechanics,<br />

hence his terminology statistical operator. But this attractive interpretation, known as the ignorance<br />

interpretation of mixtures, is not without problems as we will show now.<br />

In case of degeneracy the choice of the basis vectors in (III. 41) is not unique, and the projector P i<br />

on the subspace corresponding to the eigenvalue w i can be written in terms of basis states in arbitrarily<br />

many ways,<br />

$$\sum_{j=1}^{n_i} |w_i, j\rangle \langle w_i, j| = \sum_{k=1}^{n_i} |w_i, k\rangle' \, \langle w_i, k|', \tag{III. 50}$$<br />

with { |w i , k⟩′ } another arbitrary orthonormal basis in this subspace. Therefore, given any W we<br />

cannot say of which vector states the ensemble is composed. To see that this is a general phenomenon,<br />

consider the operator<br />

$$W = \sum_{k=1}^{K} p_k\, U_k = \sum_{k=1}^{K} p_k\, |u_k\rangle \langle u_k|. \tag{III. 51}$$<br />

Here K ∈ N + is arbitrary and {|u k ⟩} is an arbitrary set of unit vectors which are, in general,<br />

not orthogonal, but as long as the p k satisfy 0 ≤ p k ≤ 1 and ∑ p k = 1, as required in (III. 35), the<br />

operator W in (III. 51) is still a state operator.<br />

Indeed, equation (III. 51) is an alternative decomposition of W into extreme elements, just like<br />

the spectral decomposition. We see that, in contrast to the classical case, convex decompositions are<br />

not unique.<br />
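The non - uniqueness of convex decompositions can be made concrete for a two - dimensional Hilbert space. The sketch below builds the same state operator from three different ensembles: an orthonormal basis, a second orthonormal basis rotated by 45°, and three non - orthogonal unit vectors at mutual angles of 120° (a hypothetical choice made for this example, as are all names in the code).<br />

```python
import math

def projector(v):
    """|v><v| for a unit vector v."""
    return [[a * b.conjugate() for b in v] for a in v]

def mix(weights, vectors):
    """Return sum_k p_k |u_k><u_k| as a matrix, cf. (III. 51)."""
    n = len(vectors[0])
    W = [[0.0 for _ in range(n)] for _ in range(n)]
    for p, v in zip(weights, vectors):
        P = projector(v)
        for i in range(n):
            for j in range(n):
                W[i][j] += p * P[i][j]
    return W

def close(A, B, tol=1e-12):
    n = len(A)
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(n) for j in range(n))

r = 1 / math.sqrt(2)
h = math.sqrt(3) / 2

first_basis = [[1.0, 0.0], [0.0, 1.0]]
rotated_basis = [[r, r], [r, -r]]
three_vectors = [[1.0, 0.0], [-0.5, h], [-0.5, -h]]  # not mutually orthogonal

W_a = mix([0.5, 0.5], first_basis)
W_b = mix([0.5, 0.5], rotated_basis)
W_c = mix([1 / 3, 1 / 3, 1 / 3], three_vectors)

same_state = close(W_a, W_b) and close(W_a, W_c)
```

All three ensembles yield the state operator 𝟙/2, so no measurement on the system can distinguish them.<br />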

According to the ignorance interpretation, W describes the ensemble as consisting of systems of<br />

which a fraction p k is in the state |u k ⟩, e.g.<br />

$$\langle A\rangle_W = \mathrm{Tr}\, A W = \sum_{k=1}^{K} p_k\, \langle u_k | A | u_k\rangle, \tag{III. 52}$$<br />

but the probability to find the system in |u k ⟩ is<br />

$$\mu_W(U_k) = \mathrm{Tr}\, U_k W = \mathrm{Tr} \sum_{m=1}^{K} U_k\, p_m\, |u_m\rangle \langle u_m| = \sum_{m=1}^{K} p_m\, |\langle u_k | u_m\rangle|^2. \tag{III. 53}$$<br />

Although the result (III. 52) is in accordance with the behavior of an ensemble of systems being in<br />

the state |u k ⟩ with probability p k , we see that for (III. 53), contrary to (III. 47), the outcome, i.e. the<br />

probability to find in (III. 51) the state |u k ⟩, is in general not p k , which is a consequence of the non -<br />

orthogonality of the states |u k ⟩. On the other hand, (III. 51) can always be written in the form (III. 41),<br />

in terms of the orthonormal set of eigenvectors of W , which leads to the conclusion that ensembles<br />

which are interpreted as being physically completely different, are described by the same operator W .<br />
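The difference between (III. 47) and (III. 53) can be seen in a small numerical example. The sketch assumes a hypothetical ensemble of three non - orthogonal unit vectors in C², each with weight 1/3, and compares it with an ensemble built on an orthonormal basis with weights 0.7 and 0.3; the vectors, weights and function names are illustrative assumptions.<br />

```python
import math

def inner(u, v):
    """Inner product <u|v>, antilinear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def prob_to_find(k, weights, states):
    """mu_W(U_k) = sum_m p_m |<u_k|u_m>|^2, cf. (III. 53)."""
    return sum(p * abs(inner(states[k], u)) ** 2
               for p, u in zip(weights, states))

# Non-orthogonal ensemble: three unit vectors at mutual angles of 120 degrees.
h = math.sqrt(3) / 2
three_vectors = [[1.0, 0.0], [-0.5, h], [-0.5, -h]]
p_three = [1 / 3, 1 / 3, 1 / 3]

# Orthonormal ensemble for comparison.
basis = [[1.0, 0.0], [0.0, 1.0]]
p_basis = [0.7, 0.3]

found_three = [prob_to_find(k, p_three, three_vectors) for k in range(3)]
found_basis = [prob_to_find(k, p_basis, basis) for k in range(2)]
```

For the orthonormal ensemble the probabilities reproduce the weights 0.7 and 0.3, whereas for the non - orthogonal ensemble each vector is found with probability 1/2 although its weight is 1/3.<br />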

This can be compared with the fact that a pure state |ψ⟩ can be written in numerous ways as a<br />

superposition of other pure states, which corresponds to different ways of preparation of |ψ⟩ by superposition<br />

of other states, for instance in a tilted Stern - Gerlach apparatus in case of measurement of<br />

spin. We can no longer see if |ψ⟩ is, for example, a superposition of spin up and down in the z - direction,<br />

or of spin up and down in the x - direction.<br />

For pure states this seems completely natural; it is a direct consequence of the state postulate,<br />

according to which the states form a vector space. In case of mixed states the situation is less clear. It can be<br />

maintained that an ensemble, of which each system is in the state |u k ⟩ with probability p k , really<br />

differs from an ensemble of systems which are in a state |w i , j⟩ with probability w i , even though the<br />

expectation values of all physical quantities are equal for both ensembles. In that case, from<br />

$$W = \sum_{i=1}^{M} \sum_{j=1}^{n_i} w_i\, |w_i, j\rangle \langle w_i, j| = \sum_{k=1}^{K} p_k\, |u_k\rangle \langle u_k| \tag{III. 54}$$<br />

it has to be concluded that the state operator W characterizes these ensembles incompletely. There is<br />

no postulate in quantum mechanics by which this is prohibited.<br />

Another view is, however, that the state operator is a complete description of a state, and that the different<br />

possible ways of preparation are not retrievable from the state W . Consequently, the conclusion has<br />

to be that W , in (III. 51), does not characterize an ensemble which consists of a mixture of systems in<br />

pure states |u k ⟩, but that an ensemble characterized by W only presents itself as such an ensemble upon<br />

measurement. Here we see again that, in quantum mechanics, we get in trouble if we speak in terms<br />

of what really exists. In section III. 5 we will return to this discussion in the context of improperly<br />

mixed states.<br />
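The claim that physically different preparations can share one state operator can be checked numerically. The following is a minimal sketch, assuming a spin 1/2 system and NumPy; the helper names `projector` and `state_operator` are illustrative and not part of the notes.<br />

```python
import numpy as np

# Spin-1/2 basis vectors along z and the corresponding eigenvectors along x
z_up = np.array([1, 0], dtype=complex)
z_down = np.array([0, 1], dtype=complex)
x_up = (z_up + z_down) / np.sqrt(2)
x_down = (z_down - z_up) / np.sqrt(2)

def projector(v):
    """Return the 1-dimensional projector |v><v|."""
    return np.outer(v, v.conj())

def state_operator(probs, vectors):
    """W = sum_k p_k |u_k><u_k| for given probabilities and unit vectors."""
    return sum(p * projector(v) for p, v in zip(probs, vectors))

# Two differently prepared ensembles: equal mixtures along z and along x
W_z = state_operator([0.5, 0.5], [z_up, z_down])
W_x = state_operator([0.5, 0.5], [x_up, x_down])

same_operator = np.allclose(W_z, W_x)
```

Both preparations give W = 11/2, so no measurement statistics can distinguish them.<br />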

The dynamics of mixed states follows, as in the classical case, from the pure states. Define,<br />

analogously to (III. 41),<br />

\[ W(t) := \sum_{i=1}^{M} \sum_{j=1}^{n_i} w_i \, W_{i,j}(t). \tag{III. 55} \]<br />

According to the Schrödinger postulate, (III. 2),<br />

|w i , j, t⟩ := U (t − t 0 ) |w i , j, t 0 ⟩, (III. 56)



which yields for (III. 55)<br />

\[ W(t) = \sum_{i=1}^{M} \sum_{j=1}^{n_i} w_i \, U(t - t_0)\, W_{i,j}(t_0)\, U^\dagger(t - t_0), \tag{III. 57} \]<br />

and therefore<br />

W (t) = U (t − t 0 ) W (t 0 ) U † (t − t 0 ). (III. 58)<br />

With (III. 11) we find<br />

\[ i\hbar \frac{d}{dt} W(t) = [H, W(t)], \tag{III. 59} \]<br />

which is the analogue of the Liouville equation of motion, (III. 21), describing the time evolution of<br />

the states ρ. Equation (III. 59) is called the Liouville - Von Neumann equation; it is the generalization<br />

of the Schrödinger equation to an equation for mixed states.<br />
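As a numerical illustration, one can check that the unitarily evolved state operator satisfies the commutator equation. The sketch below assumes a time-independent Hamiltonian with U(t) = exp(−iHt/ħ), units with ħ = 1, and hypothetical 2 × 2 matrices `H` and `W0` chosen only for the example.<br />

```python
import numpy as np

hbar = 1.0  # assumption of this sketch: units with hbar = 1

# Hypothetical Hamiltonian and initial mixed state (Hermitian, unit trace, positive)
H = np.array([[1.0, 0.5], [0.5, -1.0]], dtype=complex)
W0 = np.array([[0.7, 0.2], [0.2, 0.3]], dtype=complex)

def U(t):
    """U(t) = exp(-i H t / hbar), built from the spectral decomposition of H."""
    energies, vecs = np.linalg.eigh(H)
    return vecs @ np.diag(np.exp(-1j * energies * t / hbar)) @ vecs.conj().T

def W(t):
    """Evolved state operator W(t) = U(t) W(0) U(t)^dagger, cf. (III. 58) with t_0 = 0."""
    return U(t) @ W0 @ U(t).conj().T

t, dt = 0.8, 1e-6
lhs = 1j * hbar * (W(t + dt) - W(t - dt)) / (2 * dt)  # i hbar dW/dt by central difference
rhs = H @ W(t) - W(t) @ H                             # commutator [H, W(t)]
max_deviation = np.max(np.abs(lhs - rhs))
```

The deviation is limited only by the finite-difference step, which is consistent with the commutator being evaluated at time t.<br />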

The extensions of the Schrödinger postulate and the projection postulate for mixed states can now<br />

be formulated.<br />

5 ′ Generalized Schrödinger postulate. If no measurements are made on the physical system, the<br />

time evolution of the state of the system is described by a unitary transformation,<br />

W (t) = U (t − t 0 ) W (t 0 ) U † (t − t 0 ). (III. 60)<br />

6 ′ Generalized projection postulate, discrete case. If the system is in a state W when a measurement<br />

is made on a physical quantity A corresponding to an operator A having a discrete spectrum,<br />

and the outcome of the measurement is the eigenvalue a i ∈ R, the system is, directly<br />

after the measurement, in the eigenspace corresponding to the eigenvalue a i ,<br />

\[ W \longrightarrow \frac{P_{a_i} W P_{a_i}}{\mathrm{Tr}\, P_{a_i} W P_{a_i}}. \tag{III. 61} \]<br />

◃ Remark<br />

Remember that, in general, the projectors P ai do not have to be 1 - dimensional. ▹<br />
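The generalized projection postulate can be sketched in a few lines. This is a minimal illustration, assuming NumPy; the function name `project_state` and the 3-dimensional example with a 2-dimensional eigenprojector are hypothetical.<br />

```python
import numpy as np

def project_state(W, P):
    """Return (probability of the outcome, post-measurement state) for projector P.

    The new state is P W P / Tr(P W P), following (III. 61).
    """
    unnormalized = P @ W @ P
    prob = np.trace(unnormalized).real
    return prob, unnormalized / prob

# Hypothetical example: the eigenvalue a_i has a 2-dimensional eigenspace
W = np.diag([0.5, 0.3, 0.2]).astype(complex)
P = np.diag([1.0, 1.0, 0.0]).astype(complex)

prob, W_after = project_state(W, P)
```

Because the projector is 2-dimensional, the state after the measurement is still mixed: the weights inside the eigenspace are rescaled, not collapsed to a single vector.<br />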

Finally, we give a theorem concerning the generalized Schrödinger postulate which is important<br />

for the measurement problem.<br />

VON NEUMANN’S THEOREM A:<br />

The properties ‘pure’ and ‘mixed’ are invariant under a unitary time evolution.



Proof<br />

We know that if W is pure, i.e. equal to a 1-dimensional projector, then W 2 = W .<br />

Now consider the expression (sometimes called the purity of W ):<br />

\[ \mathrm{Tr}\, W^2 = \sum_i w_i^2 \tag{III. 62} \]<br />

since Tr W = 1 → ∑ i w i = 1, and W is pure iff exactly one of the w i is equal to 1, and all<br />

others vanish, we conclude that<br />

\[ \mathrm{Tr}\, W^2 = 1 \ \text{iff } W \text{ is pure}; \qquad \mathrm{Tr}\, W^2 < 1 \ \text{iff } W \text{ is mixed.} \tag{III. 63} \]<br />

But Tr W 2 is invariant under the time evolution (III. 60). Indeed, if we remember that U † (t −<br />

t 0 ) = U −1 (t − t 0 ) and that Tr AB = Tr BA, it follows that<br />

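\[ \begin{aligned} \mathrm{Tr}\, (W(t))^2 &= \mathrm{Tr}\, U(t-t_0)\, W(t_0)\, U^\dagger(t-t_0)\, U(t-t_0)\, W(t_0)\, U^\dagger(t-t_0) \\ &= \mathrm{Tr}\, U(t-t_0)\, W(t_0)\, W(t_0)\, U^\dagger(t-t_0) \\ &= \mathrm{Tr}\, U^\dagger(t-t_0)\, U(t-t_0)\, (W(t_0))^2 = \mathrm{Tr}\, (W(t_0))^2. \end{aligned} \]<br />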

□<br />
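The invariance of Tr W² is easy to test numerically. The following sketch assumes NumPy and uses a random unitary matrix obtained from a QR decomposition as a stand-in for U (t − t 0 ); the states are illustrative choices.<br />

```python
import numpy as np

rng = np.random.default_rng(0)

def random_unitary(n):
    """Unitary matrix from the QR decomposition of a random complex matrix."""
    M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    Q, _ = np.linalg.qr(M)
    return Q

def purity(W):
    """Tr W^2, equal to 1 for pure states and smaller than 1 for mixed states."""
    return np.trace(W @ W).real

W_pure = np.zeros((3, 3), dtype=complex)
W_pure[0, 0] = 1.0                                  # 1-dimensional projector
W_mixed = np.diag([0.5, 0.3, 0.2]).astype(complex)  # purity 0.25 + 0.09 + 0.04

U = random_unitary(3)
purity_pure = purity(U @ W_pure @ U.conj().T)
purity_mixed = purity(U @ W_mixed @ U.conj().T)
```

The purities of the evolved states equal those of the initial states, so a pure state stays pure and a mixed state stays mixed.<br />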

III. 4<br />

COMPOSITE SYSTEMS<br />

Suppose that a system S is composed of two subsystems S I and S II . The Hilbert spaces associated<br />

with S I and S II are H I and H II , with dim H I = N I and dim H II = N II . The Hilbert space<br />

associated with S is the direct product space H = H I ⊗ H II , with dim H = N = N I N II . If |α 1 ⟩, . . . , |α N I ⟩<br />

and |β 1 ⟩, . . . , |β N II ⟩ are bases of the factor spaces H I and H II , then {|α i ⟩ ⊗ |β j ⟩} forms a basis in H. An<br />

arbitrary vector in H is a superposition of such direct products of basis vectors and is generally not of<br />

the form |ψ⟩ ⊗ |ϕ⟩, with |ψ⟩ ∈ H I and |ϕ⟩ ∈ H II . Consequently, one cannot say for such an arbitrary<br />

state in H that the subsystems are in some pure state in H I or H II .<br />

This entanglement of the subsystems, when |Ψ⟩ ̸= |ψ⟩ ⊗ |ϕ⟩, with |Ψ⟩ ∈ H, which is characteristic<br />

for quantum mechanics, has no analogue in classical mechanics. It is a consequence of the formal<br />

requirement that the state space of a composite system is also a vector space. Entanglement is the aspect<br />

of the quantum mechanical description that gives rise to the EPR - paradox and the measurement<br />

problem as we shall see in later chapters.<br />

The quantities of system S correspond to self - adjoint operators in H. We make the supposition<br />

that quantities of the subsystem S I correspond to operators of the form A ⊗ 11 in H, where A is<br />

a self - adjoint operator in H I , and quantities of S II correspond analogously to operators of the<br />

form 11 ⊗ B, with B in H II . A state of S is given by a state operator W in H; W ∈ S (H). In<br />

general, W is not a direct product of operators, but in case W can be written as a direct product, we<br />

write W = W 1 ⊗ W 2 , with W 1 and W 2 state operators in H I and H II , respectively.



EXERCISE 17. Prove the following statements.<br />

(a) W = W 1 ⊗ W 2 is a state operator if W 1 and W 2 are state operators.<br />

(b) The opposite of (a) is not true; give a counterexample.<br />

(c) W = W 1 ⊗ W 2 is pure iff both W 1 and W 2 are pure.<br />

EXERCISE 18. Prove that for all vectors |ψ⟩, |ψ ′ ⟩ ∈ H I and |ϕ⟩, |ϕ ′ ⟩ ∈ H II we have<br />

\[ \big( |\psi\rangle \otimes |\phi\rangle \big) \big( \langle\psi'| \otimes \langle\phi'| \big) = |\psi\rangle \langle\psi'| \otimes |\phi\rangle \langle\phi'|. \tag{III. 65} \]<br />

THEOREM:<br />

If W is a direct product of operators, W = W 1 ⊗ W 2 , the subsystems are mutually<br />

independent, i.e., the probability to find for A⊗11 the value a i and for 11⊗B the value b j<br />

is equal to the product of the separate probabilities. In this case the expectation values<br />

factorize too, such that \( \langle A \otimes B \rangle_{W_1 \otimes W_2} = \langle A \rangle_{W_1} \langle B \rangle_{W_2} \).<br />

Proof<br />

Let a i and b j be eigenvalues of A and B, respectively. Using (III. 65) we see that the projector on<br />

the eigenstate |a i ⟩ ⊗ |b j ⟩ of A ⊗ B is P |ai⟩ ⊗ P |bj⟩. Therefore, with (II. 102), p. 33,<br />

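\[ \begin{aligned} \mu_W \big( P_{|a_i\rangle} \otimes P_{|b_j\rangle} \big) &= \mathrm{Tr}\, \big( P_{|a_i\rangle} \otimes P_{|b_j\rangle} \big) \big( W_1 \otimes W_2 \big) \\ &= \mathrm{Tr}\, \big( P_{|a_i\rangle} W_1 \otimes P_{|b_j\rangle} W_2 \big) \\ &= \mathrm{Tr}\, P_{|a_i\rangle} W_1 \; \mathrm{Tr}\, P_{|b_j\rangle} W_2 \\ &= \mu_{W_1} \big( P_{|a_i\rangle} \big) \, \mu_{W_2} \big( P_{|b_j\rangle} \big) \\ &= \mu_W \big( P_{|a_i\rangle} \otimes \mathbb{1} \big) \, \mu_W \big( \mathbb{1} \otimes P_{|b_j\rangle} \big), \end{aligned} \tag{III. 66} \]<br />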

which proves the first part of the theorem.<br />

For the factorization of the expectation values, we have, analogously,<br />

\[ \langle A \otimes B \rangle_{W_1 \otimes W_2} = \mathrm{Tr}\, (A \otimes B)(W_1 \otimes W_2) = \mathrm{Tr}\, A W_1 \; \mathrm{Tr}\, B W_2 = \langle A \rangle_{W_1} \langle B \rangle_{W_2}, \tag{III. 67} \]<br />

and we see that the expectation values indeed factorize. □<br />

From (III. 67) we also see that, if W = W 1 ⊗ W 2 , then ⟨A ⊗ 11⟩ W = Tr A W 1 = ⟨A⟩ W1<br />

and ⟨11 ⊗ B⟩ W = ⟨B⟩ W2 , but this does not hold for more general statistical operators W .
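The factorization for product states can be illustrated with the Kronecker product. This is a sketch under the assumption that `np.kron` represents the direct product in the product basis; the matrices `W1`, `W2`, `A` and `B` are hypothetical examples.<br />

```python
import numpy as np

# Hypothetical state operators of the two subsystems (Hermitian, unit trace, positive)
W1 = np.array([[0.7, 0.2], [0.2, 0.3]], dtype=complex)
W2 = np.array([[0.4, 0.1j], [-0.1j, 0.6]], dtype=complex)

# Hypothetical observables: sigma_z on subsystem I, sigma_y on subsystem II
A = np.array([[1, 0], [0, -1]], dtype=complex)
B = np.array([[0, -1j], [1j, 0]], dtype=complex)
I2 = np.eye(2)

W = np.kron(W1, W2)                              # product state W1 (x) W2

joint = np.trace(np.kron(A, B) @ W).real         # <A (x) B>_W
exp_A = np.trace(A @ W1).real                    # <A>_{W1}
exp_B = np.trace(B @ W2).real                    # <B>_{W2}
marginal_A = np.trace(np.kron(A, I2) @ W).real   # <A (x) 11>_W
```

Here `joint` equals `exp_A * exp_B`, and `marginal_A` equals `exp_A`, as stated for W = W 1 ⊗ W 2 .<br />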



With (II. 99), for an arbitrary state operator W , hence in general W ≠ W 1 ⊗ W 2 , the expectation<br />

value of A ⊗ 11 is<br />

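\[ \begin{aligned} \langle A \otimes \mathbb{1} \rangle_W &= \mathrm{Tr}\, (A \otimes \mathbb{1}) W \\ &= \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} \big( \langle\alpha_i| \otimes \langle\beta_j| \big) \big( A \otimes \mathbb{1} \big) W \big( |\alpha_i\rangle \otimes |\beta_j\rangle \big) \\ &= \sum_{i=1}^{N_I} \sum_{k=1}^{N_I} \sum_{j=1}^{N_{II}} \big( \langle\alpha_i| \otimes \langle\beta_j| \big) \big( A |\alpha_k\rangle \langle\alpha_k| \otimes \mathbb{1} \big) W \big( |\alpha_i\rangle \otimes |\beta_j\rangle \big) \\ &= \sum_{i=1}^{N_I} \sum_{k=1}^{N_I} \langle\alpha_i| A |\alpha_k\rangle \sum_{j=1}^{N_{II}} \big( \langle\alpha_k| \otimes \langle\beta_j| \big) W \big( |\alpha_i\rangle \otimes |\beta_j\rangle \big). \end{aligned} \tag{III. 68} \]<br />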

To find ⟨A ⊗ 11⟩ W , define the operator W I in H I , called the partial trace of W in relation to H II ,<br />

\[ W_I = \mathrm{Tr}_{II}\, W := \sum_{j=1}^{N_{II}} \langle\beta_j| W |\beta_j\rangle, \qquad W_I \in \mathcal{S}(\mathcal{H}_I). \tag{III. 69} \]<br />

For this partial trace it holds that<br />

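\[ \langle\alpha_k| W_I |\alpha_i\rangle = \sum_{j=1}^{N_{II}} \big( \langle\alpha_k| \otimes \langle\beta_j| \big) W \big( |\alpha_i\rangle \otimes |\beta_j\rangle \big), \qquad \langle\alpha_k| W_I |\alpha_i\rangle \in \mathbb{C}, \tag{III. 70} \]<br />

and substituting (III. 70) in (III. 68) yields<br />

\[ \langle A \otimes \mathbb{1} \rangle_W = \sum_{i=1}^{N_I} \sum_{k=1}^{N_I} \langle\alpha_i| A |\alpha_k\rangle \langle\alpha_k| W_I |\alpha_i\rangle = \mathrm{Tr}\, A W_I = \langle A \rangle_{W_I}. \tag{III. 71} \]<br />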

Analogously, with W II the partial trace of W in relation to H I ,<br />

\[ W_{II} = \mathrm{Tr}_{I}\, W := \sum_{i=1}^{N_I} \langle\alpha_i| W |\alpha_i\rangle, \qquad W_{II} \in \mathcal{S}(\mathcal{H}_{II}), \tag{III. 72} \]<br />

we see that<br />

⟨11 ⊗ B⟩ W = Tr BW II = ⟨B⟩ WII . (III. 73)<br />

EXERCISE 19. Prove that Tr II W and Tr I W are state operators in H I and H II , respectively.
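A partial trace is straightforward to compute once W is given as a matrix in a product basis. The following sketch assumes the basis ordering of `np.kron` (index i of H I varies slowest); the function name `partial_trace` and the random test state are illustrative.<br />

```python
import numpy as np

def partial_trace(W, dim_I, dim_II, over):
    """Partial trace of W on H_I (x) H_II.

    over="II" traces out H_II and returns W_I; over="I" returns W_II.
    """
    # W4[i, j, k, l] = (<a_i| (x) <b_j|) W (|a_k> (x) |b_l>)
    W4 = W.reshape(dim_I, dim_II, dim_I, dim_II)
    if over == "II":
        return np.einsum("ijkj->ik", W4)
    return np.einsum("ijil->jl", W4)

# Random state operator on C^2 (x) C^3: positive, Hermitian, unit trace
rng = np.random.default_rng(1)
M = rng.normal(size=(6, 6)) + 1j * rng.normal(size=(6, 6))
W = M @ M.conj().T
W = W / np.trace(W).real

W_I = partial_trace(W, 2, 3, over="II")
W_II = partial_trace(W, 2, 3, over="I")

A = np.array([[0, 1], [1, 0]], dtype=complex)         # a quantity of subsystem I
lhs = np.trace(np.kron(A, np.eye(3)) @ W).real        # <A (x) 11>_W
rhs = np.trace(A @ W_I).real                          # <A>_{W_I}
```

The two expectation values agree, and both partial traces are again Hermitian, positive and of unit trace, which is the content of Exercise 19 for this example.<br />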



Concerning the expectation values of the quantities of the subsystem S I alone we can replace the<br />

state W by the partial trace, or state operator, Tr II W in H I , analogously for S II . Therefore it is<br />

customary to let the states of the subsystems correspond to the partial traces Tr II W and Tr I W .<br />

For the partial traces it holds that if W is a direct product of state operators W 1 and W 2 in H I<br />

and H II , respectively, W can also be written as a direct product of its partial traces, which we now<br />

show in a lemma.<br />

LEMMA:<br />

If W is a direct product of the form W = W 1 ⊗ W 2 , where W 1 and W 2 are state operators<br />

in H I and H II , respectively, then Tr II W = W 1 and Tr I W = W 2 .<br />

Proof<br />

\[ \mathrm{Tr}_{II}\, W = \mathrm{Tr}_{II}\, (W_1 \otimes W_2) = \sum_{j=1}^{N_{II}} \langle\beta_j| W_1 \otimes W_2 |\beta_j\rangle = W_1 \sum_{j=1}^{N_{II}} \langle\beta_j| W_2 |\beta_j\rangle = W_1 \, \mathrm{Tr}\, W_2 = W_1, \tag{III. 74} \]<br />

likewise,<br />

Tr I (W 1 ⊗ W 2 ) = W 2 . □ (III. 75)<br />

From this lemma we see that W = W 1 ⊗ W 2 = Tr II W ⊗ Tr I W , and with the first theorem<br />

of this section, p. 56, this leads to the conclusion that if W is a direct product of its partial traces, it<br />

can be uniquely reconstructed from its partial traces. Generally, an arbitrary state operator W of the<br />

composite system can not be defined by its partial traces, which was shown by Von Neumann.<br />

VON NEUMANN’S THEOREM B:<br />

The partial traces Tr II W and Tr I W uniquely define W , iff at least one of the partial<br />

traces is pure, in which case W is factorizable,<br />

W = Tr II W ⊗ Tr I W. (III. 76)<br />

Proof<br />

Let {|u i ⟩} be a basis of eigenstates of W I having non - degenerate eigenvalues. Leaving out the<br />

eigenvalues p n and u i which are equal to 0, expand W and Tr II W in their eigenvectors,<br />

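\[ W = \sum_{n=1}^{N} p_n \, |\psi_n\rangle \langle\psi_n| \quad \text{with } |\psi_n\rangle \in \mathcal{H} \tag{III. 77} \]<br />

and<br />

\[ \mathrm{Tr}_{II}\, W = \sum_{i=1}^{N_I} u_i \, |u_i\rangle \langle u_i| \quad \text{with } |u_i\rangle \in \mathcal{H}_I. \tag{III. 78} \]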



◃ Remark<br />

Because the eigenvalues u i = 0 are left out, the eigenvectors |u i ⟩ with eigenvalue 0 do not occur in the<br />

expansion of Tr II W ; however, they do belong to the complete basis {|u i ⟩}. ▹<br />

Let {|v j ⟩} be a basis in H II . Then {|u i ⟩ ⊗ |v j ⟩} is a basis in H, and |ψ n ⟩ can be expanded as<br />

\[ |\psi_n\rangle = \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} \psi^n_{ij} \, |u_i\rangle \otimes |v_j\rangle = \sum_{i=1}^{N_I} |u_i\rangle \otimes |\phi^n_i\rangle \tag{III. 79} \]<br />

where<br />

\[ |\phi^n_i\rangle := \sum_{j=1}^{N_{II}} \psi^n_{ij} \, |v_j\rangle \in \mathcal{H}_{II}. \tag{III. 80} \]<br />

These |ϕi n ⟩ are, in general, not orthogonal. Substituting (III. 79) in (III. 77) we have<br />

\[ W = \sum_{n=1}^{N} p_n \sum_{i=1}^{N_I} \sum_{k=1}^{N_I} |u_i\rangle \langle u_k| \otimes |\phi^n_i\rangle \langle\phi^n_k|. \tag{III. 81} \]<br />

Substitution of (III. 81) in (III. 69) yields<br />

\[ \begin{aligned} \mathrm{Tr}_{II}\, W = \sum_{l=1}^{N_{II}} \langle\beta_l| W |\beta_l\rangle &= \sum_{n=1}^{N} \sum_{i=1}^{N_I} \sum_{k=1}^{N_I} \sum_{l=1}^{N_{II}} p_n \, |u_i\rangle \langle u_k| \, \langle\beta_l | \phi^n_i\rangle \langle\phi^n_k | \beta_l\rangle \\ &= \sum_{n=1}^{N} \sum_{i=1}^{N_I} \sum_{k=1}^{N_I} p_n \, \langle\phi^n_k | \phi^n_i\rangle \, |u_i\rangle \langle u_k|. \end{aligned} \tag{III. 82} \]<br />

With {|ψ i ⟩} a basis, the coefficients in the expansion of an operator of the form ∑ ij c ij |ψ i ⟩⟨ψ j |<br />

are unique, and comparison of (III. 82) with (III. 78) gives<br />

\[ \sum_{n=1}^{N} p_n \, \langle\phi^n_k | \phi^n_i\rangle = u_i \, \delta_{ik}, \tag{III. 83} \]<br />

therefore,<br />

\[ \mathrm{Tr}_{II}\, W = \sum_{i=1}^{N_I} \sum_{k=1}^{N_I} u_i \, \delta_{ik} \, |u_i\rangle \langle u_k| = \sum_{i=1}^{N_I} u_i \, |u_i\rangle \langle u_i|. \tag{III. 84} \]<br />

◃ Remark<br />

In (III. 83) it follows for i = k, due to the positivity of the p n , that if u i = 0 for certain i,<br />

then |ϕ n i ⟩ = 0 for all n and we see that in (III. 79) only the terms appear for which u i ≠ 0.<br />

Consequently, the same terms occur in (III. 79) as in the expansion (III. 78) of Tr II W . ▹



If Tr II W is pure, there is only one term<br />

Tr II W = |u 1 ⟩ ⟨u 1 |, (III. 85)<br />

and substitution in (III. 79) yields<br />

\[ |\psi_n\rangle = |u_1\rangle \otimes |\phi^n_1\rangle. \tag{III. 86} \]<br />

Therefore,<br />

\[ W = \sum_{n=1}^{N} p_n \, |u_1\rangle \langle u_1| \otimes |\phi^n_1\rangle \langle\phi^n_1| = |u_1\rangle \langle u_1| \otimes \sum_{n=1}^{N} p_n \, |\phi^n_1\rangle \langle\phi^n_1|. \tag{III. 87} \]<br />

Analogous to (III. 82) we find for<br />

\[ \mathrm{Tr}_{I}\, W = \sum_{n=1}^{N} \sum_{i=1}^{N_I} \sum_{k=1}^{N_I} p_n \, \langle u_k | u_i\rangle \, |\phi^n_i\rangle \langle\phi^n_k|. \tag{III. 88} \]<br />

With i = k = 1 and ⟨u 1 | u 1 ⟩ = 1 we have<br />

\[ \mathrm{Tr}_{I}\, W = \sum_{n=1}^{N} p_n \, |\phi^n_1\rangle \langle\phi^n_1|. \tag{III. 89} \]<br />

Substituting (III. 89) in (III. 87) we see that W = Tr II W ⊗ Tr I W . Indeed, if one of the partial<br />

traces is pure, W is factorizable, and therefore completely determined, by its partial traces.<br />

To show the ‘only if’ - part of the theorem, namely that Tr II W and Tr I W uniquely define the state W<br />

of the composite system only if at least one of the partial traces is pure, we decompose the partial<br />

traces into orthogonal 1 - dimensional eigenprojectors U i and V j , where the eigenvalues u i , v j ∈ [0, 1]<br />

each sum up to 1, as required in (III. 35) for the partial traces to be state operators,<br />

\[ \mathrm{Tr}_{II}\, W = \sum_{i=1}^{N_I} u_i \, |u_i\rangle \langle u_i| := \sum_{i=1}^{N_I} u_i \, U_i, \tag{III. 90} \]<br />

\[ \mathrm{Tr}_{I}\, W = \sum_{j=1}^{N_{II}} v_j \, |v_j\rangle \langle v_j| := \sum_{j=1}^{N_{II}} v_j \, V_j. \tag{III. 91} \]<br />

It then holds that<br />

\[ \mathrm{Tr}_{II}\, W \otimes \mathrm{Tr}_{I}\, W = \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} u_i \, v_j \, U_i \otimes V_j. \tag{III. 92} \]<br />

Now consider an operator W of the form<br />

\[ W = \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} z_{ij} \, U_i \otimes V_j, \tag{III. 93} \]<br />

which is, in general, not factorizable.<br />

EXERCISE 20. Prove that U i ⊗ V j is a 1 - dimensional projector in H.<br />

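The operator W , (III. 93), is a state operator if<br />

\[ z_{ij} \in [0, 1] \quad \text{and} \quad \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} z_{ij} = 1, \tag{III. 94} \]<br />

furthermore, with (III. 69) and (III. 72) we have<br />

\[ \mathrm{Tr}_{II}\, W = \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} z_{ij} \, U_i \tag{III. 95} \]<br />

and<br />

\[ \mathrm{Tr}_{I}\, W = \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} z_{ij} \, V_j. \tag{III. 96} \]<br />

Comparison with (III. 90) and (III. 91) gives the system of conditions ∑ j z ij = u i and ∑ i z ij = v j .<br />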

This system has an infinite number of solutions for the unknown z ij , unless one of the partial<br />

traces is pure, e.g. Tr II W = U 1 . In that case, according to (III. 95) it has to hold for i = 1<br />

that ∑ j z 1j = 1. But then (III. 35) requires that ∑ j z ij = 0 if i ≠ 1, which means that, because<br />

of the non - negativity of the z ij , it has to hold that z ij = 0 if i ≠ 1. Substituting i = 1 in (III. 93)<br />

yields<br />

\[ W = \sum_{j=1}^{N_{II}} z_{1j} \, U_1 \otimes V_j = U_1 \otimes \sum_{j=1}^{N_{II}} z_{1j} \, V_j = \mathrm{Tr}_{II}\, W \otimes \mathrm{Tr}_{I}\, W, \tag{III. 97} \]<br />

where the last step is in accordance with (III. 96).<br />

We conclude that only if, at least, one of the partial traces is pure, W is factorizable. □<br />
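The non-uniqueness in the proof can be made concrete with two coefficient matrices z ij that share the same row and column sums. The sketch below assumes NumPy and takes U i and V j to be the projectors on the standard basis vectors; `W_from_z` is a hypothetical helper implementing (III. 93).<br />

```python
import numpy as np

def W_from_z(z):
    """W = sum_ij z_ij U_i (x) V_j, with U_i and V_j projectors on standard basis vectors."""
    n_I, n_II = z.shape
    W = np.zeros((n_I * n_II, n_I * n_II))
    for i in range(n_I):
        for j in range(n_II):
            U_i = np.zeros((n_I, n_I))
            U_i[i, i] = 1.0
            V_j = np.zeros((n_II, n_II))
            V_j[j, j] = 1.0
            W += z[i, j] * np.kron(U_i, V_j)
    return W

# Two solutions with the same marginals u = (1/2, 1/2) and v = (1/2, 1/2)
z_correlated = np.array([[0.5, 0.0], [0.0, 0.5]])
z_product = np.array([[0.25, 0.25], [0.25, 0.25]])   # z_ij = u_i v_j

W_correlated = W_from_z(z_correlated)
W_product = W_from_z(z_product)

same_marginals = (
    np.allclose(z_correlated.sum(axis=1), z_product.sum(axis=1))
    and np.allclose(z_correlated.sum(axis=0), z_product.sum(axis=0))
)
same_state = np.allclose(W_correlated, W_product)
```

The two state operators have identical partial traces but are different, so mixed partial traces do not fix W.<br />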

In the foregoing we saw that the state operator W of a composite system is uniquely determined by<br />

its partial traces only if it is factorizable. Contrary to classical physics, in quantum mechanics maximal knowledge of<br />

the state of the subsystems is in general not equivalent to maximal knowledge of the state of the entire



system. Consequently, the state of the entire system can, generally, not be derived from measurements<br />

on the separate subsystems. 4<br />

If the partial traces of W = W 1 ⊗ W 2 are both pure, W is also pure, as we saw in the exercise<br />

on p. 56, and since the pure partial traces each have only one term, W is of the form |u⟩ ⟨u| ⊗ |v⟩ ⟨v|.<br />

On the other hand, a pure state in H is, generally, not factorizable, which we will show in an example.<br />

EXAMPLE<br />

If |u i ⟩ and |v j ⟩ span a basis in H I and H II , respectively, an arbitrary vector |ψ⟩ in H = H I ⊗ H II<br />

is of the form<br />

\[ |\psi\rangle = \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} c_{ij} \, |u_i\rangle \otimes |v_j\rangle. \tag{III. 98} \]<br />

An arbitrary pure state in H is therefore of the form<br />

\[ |\psi\rangle \langle\psi| = \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} \sum_{k=1}^{N_I} \sum_{l=1}^{N_{II}} c^*_{kl} \, c_{ij} \, \big( |u_i\rangle \otimes |v_j\rangle \big) \big( \langle u_k| \otimes \langle v_l| \big). \tag{III. 99} \]<br />

Consider the following pure entangled state in H,<br />

\[ |\Phi\rangle = \tfrac{1}{2} \sqrt{2} \, \big( |u_1\rangle \otimes |v_1\rangle + |u_2\rangle \otimes |v_2\rangle \big). \tag{III. 100} \]<br />

The corresponding W is the 1 - dimensional projector<br />

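\[ \begin{aligned} W = |\Phi\rangle \langle\Phi| = \tfrac{1}{2} \big( & |u_1\rangle \langle u_1| \otimes |v_1\rangle \langle v_1| + |u_1\rangle \langle u_2| \otimes |v_1\rangle \langle v_2| \\ & + |u_2\rangle \langle u_1| \otimes |v_2\rangle \langle v_1| + |u_2\rangle \langle u_2| \otimes |v_2\rangle \langle v_2| \big). \end{aligned} \tag{III. 101} \]<br />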

This pure state W is not factorizable, and cannot be written in the form (III. 93). But although W<br />

is pure, its partial traces are not pure,<br />

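\[ \mathrm{Tr}_{II}\, W = \sum_{j=1}^{N_{II}} \langle v_j | \Phi\rangle \langle\Phi | v_j\rangle = \tfrac{1}{2} \big( |u_1\rangle \langle u_1| + |u_2\rangle \langle u_2| \big), \tag{III. 102} \]<br />

\[ \mathrm{Tr}_{I}\, W = \sum_{i=1}^{N_I} \langle u_i | \Phi\rangle \langle\Phi | u_i\rangle = \tfrac{1}{2} \big( |v_1\rangle \langle v_1| + |v_2\rangle \langle v_2| \big), \tag{III. 103} \]<br />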

and indeed,<br />

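\[ \begin{aligned} W_I \otimes W_{II} = \tfrac{1}{4} \big( & |u_1\rangle \langle u_1| \otimes |v_1\rangle \langle v_1| + |u_1\rangle \langle u_1| \otimes |v_2\rangle \langle v_2| \\ & + |u_2\rangle \langle u_2| \otimes |v_1\rangle \langle v_1| + |u_2\rangle \langle u_2| \otimes |v_2\rangle \langle v_2| \big) \neq W. \end{aligned} \tag{III. 104} \]<br />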
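The example can be reproduced numerically. This is a minimal sketch assuming two 2 - dimensional factor spaces with standard basis vectors for |u i ⟩ and |v j ⟩, and the basis ordering of `np.kron`.<br />

```python
import numpy as np

u1, u2 = np.eye(2)   # basis of H_I
v1, v2 = np.eye(2)   # basis of H_II

# Entangled state (III. 100) and its state operator (III. 101)
Phi = (np.kron(u1, v1) + np.kron(u2, v2)) / np.sqrt(2)
W = np.outer(Phi, Phi.conj())

# Partial traces via the index structure W4[i, j, k, l]
W4 = W.reshape(2, 2, 2, 2)
W_I = np.einsum("ijkj->ik", W4)    # Tr_II W
W_II = np.einsum("ijil->jl", W4)   # Tr_I W

purity_W = np.trace(W @ W).real         # purity of the composite state
purity_W_I = np.trace(W_I @ W_I).real   # purity of the partial trace
factorizes = np.allclose(np.kron(W_I, W_II), W)
```

The composite state has purity 1 while its partial traces have purity 1/2, and the product of the partial traces differs from W, in line with (III. 102) to (III. 104).<br />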

4 This aspect of the quantum mechanical state description is, however, analogous to a classical state description with a<br />

probability distribution. The two - particle distribution function ρ(q 1 , p 1 ; q 2 , p 2 ) is not uniquely defined by the marginal<br />

distribution functions<br />

\[ \rho_1(q_1, p_1) = \int \rho(q_1, p_1; q_2, p_2) \, dq_2 \, dp_2 \quad \text{and} \quad \rho_2(q_2, p_2) = \int \rho(q_1, p_1; q_2, p_2) \, dq_1 \, dp_1; \]<br />

the marginals are, after all, analogous to the partial traces.



III. 4. 1<br />

SUMMARY<br />

1. The state operator W ∈ S (H) of a composite system, whether pure or not, is not factorizable<br />

in general.<br />

2. If W is factorizable, the factors are equal to the partial traces of W ,<br />

W = W 1 ⊗ W 2 implies W 1 = Tr II W and W 2 = Tr I W. (III. 105)<br />

3. The partial traces uniquely define W iff, at least, one of the partial traces is pure, in which<br />

case W is directly factorizable, W = W 1 ⊗ W 2 .<br />

4. The partial traces of W are pure iff W is pure and of the form W = ( |u⟩ ⊗ |v⟩ )( ⟨u| ⊗ ⟨v| ) ,<br />

with |u⟩ ∈ H I and |v⟩ ∈ H II .<br />

III. 5<br />

PROPER AND IMPROPER MIXTURES<br />

The states of composite systems shed new insight on the interpretation of mixtures. Suppose that<br />

W I and W II are the partial traces of an arbitrary state operator W , and, with u i , v j ∈ [0, 1], it holds<br />

that<br />

\[ W_I = \sum_{i=1}^{N_I} u_i \, |u_i\rangle \langle u_i| \quad \text{and} \quad W_{II} = \sum_{j=1}^{N_{II}} v_j \, |v_j\rangle \langle v_j|. \tag{III. 106} \]<br />

W I and W II contain all quantum mechanical information about results of measurements on the subsystems<br />

in H I and H II . The question is whether we can interpret this by assuming that the individual<br />

subsystems are in the pure states |u i ⟩ and |v j ⟩, with probabilities u i and v j , respectively. If this were<br />

the case, the composite system could be divided in subensembles of systems in the states |u i ⟩ ⊗ |v j ⟩<br />

with probabilities depending on possible correlations between the values of i and j. The state would<br />

be of the form<br />

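\[ \begin{aligned} W' &= \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} p_{ij} \, \big( |u_i\rangle \otimes |v_j\rangle \big) \big( \langle u_i| \otimes \langle v_j| \big) \\ &= \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} p_{ij} \, |u_i\rangle \langle u_i| \otimes |v_j\rangle \langle v_j|. \end{aligned} \tag{III. 107} \]<br />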

The coefficients p ij have to satisfy<br />

\[ p_{ij} \in [0, 1], \qquad \sum_{j=1}^{N_{II}} p_{ij} = u_i, \qquad \sum_{i=1}^{N_I} p_{ij} = v_j \qquad \text{and} \qquad \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} p_{ij} = 1, \tag{III. 108} \]



but otherwise they can be chosen freely. As far as being in one of the states |u i ⟩ or |v j ⟩ can be interpreted<br />

as a property the subsystems possess, all correlations between these properties in the total state can<br />

be expressed by the p ij . If there are no correlations, p ij = u i v j .<br />

But we see that W ′ is of the special form (III. 93), and therefore in general not equal to the arbitrary<br />

state operator W we started with. Hence it cannot be said that the individual subsystems are in the pure<br />

states |u i ⟩ and |v j ⟩. Although W I and W II are state operators, they cannot be interpreted as mixtures of<br />

pure states. The mixed states W I and W II are called improper mixtures by B. d’Espagnat (1989, p.61).<br />

Proper mixed states can in principle be taken as an ensemble of systems which are in pure states,<br />

whereas improper mixtures cannot.<br />

The foregoing shows that the concept of mixed states is forced upon us by the theory of composite<br />

systems as a natural extension of the concept of pure states. Even if the composite system is in a pure<br />

state, the subsystems are generally not pure. It is therefore not correct to understand mixed states in general as<br />

simple mixtures of pure states, in the way the mixture of pieces in the box of a game of chess consists<br />

of black and white pieces.<br />

Finally, we make an observation about similar, or identical, particles. A system of similar particles<br />

is described in quantum mechanics by symmetrized states. Consider the following symmetrized<br />

two - particle state<br />

\[ |\Psi(1, 2)\rangle = \tfrac{1}{2} \sqrt{2} \, \big( |u\rangle \otimes |v\rangle \pm |v\rangle \otimes |u\rangle \big), \tag{III. 109} \]<br />

where the first factor in each direct product is related to particle 1, and the second to particle 2. In<br />

this case the two subspaces are identical and |u⟩ and |v⟩ can represent states in both one and the other<br />

subspace. The corresponding state operator is<br />

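\[ \begin{aligned} W = |\Psi(1, 2)\rangle \langle\Psi(1, 2)| = \tfrac{1}{2} \big( & |u\rangle \langle u| \otimes |v\rangle \langle v| \pm |u\rangle \langle v| \otimes |v\rangle \langle u| \\ & \pm |v\rangle \langle u| \otimes |u\rangle \langle v| + |v\rangle \langle v| \otimes |u\rangle \langle u| \big), \end{aligned} \tag{III. 110} \]<br />

and, for orthonormal |u⟩ and |v⟩, the partial traces are<br />

\[ W_I = \mathrm{Tr}_{II}\, W = \tfrac{1}{2} \big( |u\rangle \langle u| + |v\rangle \langle v| \big), \tag{III. 111} \]<br />

and<br />

\[ W_{II} = \mathrm{Tr}_{I}\, W = \tfrac{1}{2} \big( |v\rangle \langle v| + |u\rangle \langle u| \big), \tag{III. 112} \]<br />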

and we see that the partial traces are identical. We have to say that both particles are in the same state;<br />

we certainly cannot say that one particle is in the state |u⟩ and the other in |v⟩. We cannot assign a<br />

pure state to the separate particles, although the state of the composite system is pure.<br />

III. 6<br />

SPIN 1/2 PARTICLES<br />

The time - dependent Schrödinger equation for the wave function Ψ(q, t) is given by<br />

\[ i\hbar \frac{\partial \Psi}{\partial t} = - \frac{\hbar^2}{2m} \nabla^2 \Psi + V \Psi. \tag{III. 113} \]



In this equation<br />

\[ \vec{p} = - i\hbar \vec{\nabla} \tag{III. 114} \]<br />

is the canonical momentum operator, yielding for the components of the angular momentum \( \vec{L} = \vec{q} \times \vec{p} \)<br />

for a system in 3 - dimensional space<br />

\[ L_i = - i\hbar \, \epsilon_{ijk} \, q_j \, \partial_k. \tag{III. 115} \]<br />

These components do not commute,<br />

\[ [L_i, L_j] = i\hbar \, \epsilon_{ijk} \, L_k, \tag{III. 116} \]<br />

but the operator \( \vec{L}^2 = L_x^2 + L_y^2 + L_z^2 \) does commute with \( \vec{L} \), or with any one of its components,<br />

where usually L z is taken.<br />

The simultaneous eigenstates of ⃗ L 2 and L z are written as |l, m⟩, and their eigenvalues are discrete,<br />

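\[ \vec{L}^2 \, |l, m\rangle = \hbar^2 \, l (l + 1) \, |l, m\rangle, \quad \text{with } l = 0, \tfrac{1}{2}, 1, \tfrac{3}{2}, \ldots, \tag{III. 117} \]<br />

\[ L_z \, |l, m\rangle = m \hbar \, |l, m\rangle \quad \text{with } m = -l, -l + 1, \ldots, l - 1, l. \tag{III. 118} \]<br />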

Although the algebraic derivation using the commutation relations allows for half integer values,<br />

for angular momentum ⃗ L the values of l can only be integers to make sense physically. But the half<br />

integer values are included in the description of spin.<br />

Spin ⃗ S is an internal degree of freedom of elementary particles, which cannot easily be described<br />

in classical terms, but is similar to ⃗ L. A main difference is that where the value of the angular momentum<br />

of a particle can vary, the value s of spin of a particle is constant. The similarity is that spin has,<br />

like ⃗ L, a direction ⃗n in 3 - dimensional space, and satisfies the commutation relations of (III. 116).<br />

Writing the simultaneous eigenstates of \( \vec{S}^2 \) and S z as |s, m⟩, we can use (III. 117) and (III. 118)<br />

again, where \( \vec{L}^2 \) and L z are replaced by \( \vec{S}^2 \) and S z , respectively, and l by s. The eigenvalues of \( \vec{S}^2 \)<br />

and S z are<br />

\[ s = 0: \quad \vec{S}^2 = 0, \quad S_z = 0, \tag{III. 119} \]<br />

\[ s = \tfrac{1}{2}: \quad \vec{S}^2 = \tfrac{3}{4} \hbar^2, \quad S_z = - \tfrac{1}{2} \hbar, \ \tfrac{1}{2} \hbar, \tag{III. 120} \]<br />

and so on for s = 0, 1/2, 1, 3/2, . . . . In this section we restrict ourselves to the simplest non - trivial<br />

case, spin 1/2.<br />

For spin 1/2 particles there are only two orthonormal eigenstates, \( |\tfrac{1}{2}, \tfrac{1}{2}\rangle \) and \( |\tfrac{1}{2}, -\tfrac{1}{2}\rangle \), called<br />

‘spin up’ and ‘spin down’, usually written as |↑⟩ and |↓⟩, respectively. Together, these eigenstates<br />

form a basis for a spin space, the 2 - dimensional Hilbert space H = C 2 .<br />

According to the observables postulate, p. 41, the observable spin corresponds uniquely to a self -<br />

adjoint, or Hermitian, operator A in H. Every Hermitian operator in C 2 can be represented in the<br />

aforementioned basis as a 2 × 2 - matrix,<br />

\[ A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} = \begin{pmatrix} a_0 + a_z & a_x - i a_y \\ a_x + i a_y & a_0 - a_z \end{pmatrix} = a_0 \mathbb{1} + a_x \sigma_x + a_y \sigma_y + a_z \sigma_z = a_0 \mathbb{1} + \vec{a} \cdot \vec{\sigma}, \tag{III. 121} \]



with real coefficients a 0 and ⃗a, and ⃗σ defined by the Pauli matrices,<br />

\[ \sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{III. 122} \]<br />

EXERCISE 21. Prove the aforementioned statement.<br />

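Writing the eigenvectors of σ z , \( \begin{pmatrix} 1 \\ 0 \end{pmatrix} \) and \( \begin{pmatrix} 0 \\ 1 \end{pmatrix} \), as |z ↑⟩ and |z ↓⟩, we have<br />

\[ \sigma_z \, |z \uparrow\rangle = |z \uparrow\rangle \quad \text{and} \quad \sigma_z \, |z \downarrow\rangle = - |z \downarrow\rangle. \tag{III. 123} \]<br />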

Analogously, let |x ↑⟩, |x ↓⟩ and |y ↑⟩, |y ↓⟩ denote eigenstates for the eigenvalues ±1 of σ x and σ y .<br />

The Pauli matrices have the following properties:<br />

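\[ \sigma_x^2 = \sigma_y^2 = \sigma_z^2 = \mathbb{1}, \tag{III. 124} \]<br />

\[ \sigma_i \sigma_j = i \, \epsilon_{ijk} \, \sigma_k \quad (i \neq j), \tag{III. 125} \]<br />

\[ \mathrm{Tr}\, \vec{\sigma} = 0. \tag{III. 126} \]<br />

Using the anticommutation relations for the Pauli matrices, \( [\sigma_i, \sigma_j]_+ = 0 \) for i ≠ j, which follow directly<br />

from (III. 125), we find a useful relation,<br />

\[ (\vec{a} \cdot \vec{\sigma}) (\vec{b} \cdot \vec{\sigma}) = (\vec{a} \cdot \vec{b}) \, \mathbb{1} + i \, \vec{\sigma} \cdot (\vec{a} \times \vec{b}) \tag{III. 127} \]<br />

from which it follows that<br />

\[ (\vec{a} \cdot \vec{\sigma})^2 = \mathbb{1} \quad \text{if } \| \vec{a} \| = 1. \tag{III. 128} \]<br />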
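The product relation for the Pauli matrices can be verified numerically for arbitrary real vectors. The sketch below assumes NumPy; the vectors `a` and `b` and the helper `dot_sigma` are illustrative choices.<br />

```python
import numpy as np

# Pauli matrices stacked as sigma[0] = sigma_x, sigma[1] = sigma_y, sigma[2] = sigma_z
sigma = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)

def dot_sigma(a):
    """Return a . sigma = a_x sigma_x + a_y sigma_y + a_z sigma_z for a real 3-vector a."""
    return np.einsum("i,ijk->jk", a, sigma)

a = np.array([0.3, -1.2, 0.5])
b = np.array([2.0, 0.1, -0.7])

lhs = dot_sigma(a) @ dot_sigma(b)
rhs = np.dot(a, b) * np.eye(2) + 1j * dot_sigma(np.cross(a, b))

# Special case: the square of n . sigma for a unit vector n
n = a / np.linalg.norm(a)
square = dot_sigma(n) @ dot_sigma(n)
```

Both sides agree, and for a unit vector the square reduces to the identity, so the eigenvalues of ⃗n · ⃗σ are ±1.<br />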

A 2 × 2 - matrix A has eigenvalues ±1 iff A 2 = 11, and therefore, with ⃗n a unit vector, we see<br />

that the only operators of the form (III. 121) having eigenvalues ±1 are precisely of the form ⃗n · ⃗σ.<br />

This allows us to let spin in the direction ⃗n correspond to the operator<br />

\[ \vec{S} = \tfrac{1}{2} \hbar \, \vec{n} \cdot \vec{\sigma}. \tag{III. 129} \]<br />

We will justify this choice shortly, but first we determine the eigenvectors of the spin operator ⃗n · ⃗σ.<br />

Writing ⃗n in spherical coordinates<br />

\[ \vec{n} = \begin{pmatrix} \sin\theta \cos\phi \\ \sin\theta \sin\phi \\ \cos\theta \end{pmatrix}, \tag{III. 130} \]



we have<br />

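\[ \vec{n} \cdot \vec{\sigma} = \begin{pmatrix} \cos\theta & e^{-i\phi} \sin\theta \\ e^{i\phi} \sin\theta & - \cos\theta \end{pmatrix}, \tag{III. 131} \]<br />

with eigenvectors<br />

\[ |\vec{n}, +\rangle = \begin{pmatrix} e^{-\frac{i}{2}\phi} \cos\tfrac{1}{2}\theta \\ e^{\frac{i}{2}\phi} \sin\tfrac{1}{2}\theta \end{pmatrix} \quad \text{and} \quad |\vec{n}, -\rangle = \begin{pmatrix} - e^{-\frac{i}{2}\phi} \sin\tfrac{1}{2}\theta \\ e^{\frac{i}{2}\phi} \cos\tfrac{1}{2}\theta \end{pmatrix} \tag{III. 132} \]<br />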

for eigenvalues ±1.<br />

EXERCISE 22. Verify (III. 132)<br />
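A numerical check of the eigenvectors can accompany the exercise. This sketch assumes NumPy and arbitrary sample angles; the function names `n_dot_sigma` and `eigvec` are illustrative.<br />

```python
import numpy as np

def n_dot_sigma(theta, phi):
    """Matrix of n . sigma for the unit vector n with spherical angles (theta, phi)."""
    return np.array([
        [np.cos(theta), np.exp(-1j * phi) * np.sin(theta)],
        [np.exp(1j * phi) * np.sin(theta), -np.cos(theta)],
    ])

def eigvec(theta, phi, sign):
    """Eigenvector |n, +> (sign = +1) or |n, -> (sign = -1) as given in (III. 132)."""
    if sign > 0:
        return np.array([np.exp(-0.5j * phi) * np.cos(theta / 2),
                         np.exp(0.5j * phi) * np.sin(theta / 2)])
    return np.array([-np.exp(-0.5j * phi) * np.sin(theta / 2),
                     np.exp(0.5j * phi) * np.cos(theta / 2)])

theta, phi = 0.9, 2.1   # arbitrary sample direction
S = n_dot_sigma(theta, phi)
plus, minus = eigvec(theta, phi, +1), eigvec(theta, phi, -1)

residual_plus = np.max(np.abs(S @ plus - plus))       # eigenvalue +1
residual_minus = np.max(np.abs(S @ minus + minus))    # eigenvalue -1
overlap = np.vdot(plus, minus)                        # orthogonality
```

Both residuals vanish up to rounding, and the two eigenvectors are orthonormal.<br />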

III. 6. 1<br />

SPIN 1/2 AND ROTATIONS IN SPIN SPACE<br />

A rotation over an angle α ∈ [0, π) around an axis in the direction of the unit vector ⃗m,<br />

with ⃗m ∈ R 3 , can be written as a unitary matrix<br />

\[ U(\vec{m}, \alpha) = e^{- \frac{i}{\hbar} \alpha \, (\vec{m} \cdot \vec{J})}, \tag{III. 133} \]<br />

where the total angular momentum \( \vec{J} = \vec{L} + \vec{S} \) is the infinitesimal generator of rotations. With \( \vec{L} = 0 \)<br />

and writing \( S_i = \tfrac{1}{2} \hbar \sigma_i \), which is, using (III. 124), in accordance with (III. 120) and the still unjustified<br />

(III. 129), the Pauli matrices are the generators of rotations in C 2 , leading to<br />

\[ U(\vec{m}, \alpha) = e^{- \frac{i}{2} \alpha \, (\vec{m} \cdot \vec{\sigma})}, \tag{III. 134} \]<br />

where ∥⃗m∥ is again 1. Using Taylor expansions, with (III. 128) we find for (III. 134)<br />

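\[ \begin{aligned} U(\vec{m}, \alpha) &= \sum_{k=0}^{\infty} \frac{(-i)^k \, (\vec{m} \cdot \vec{\sigma})^k}{k!} \left( \tfrac{1}{2} \alpha \right)^k \\ &= \sum_{\substack{k=0 \\ k \,\mathrm{even}}}^{\infty} \frac{(-1)^{\frac{1}{2} k}}{k!} \left( \tfrac{1}{2} \alpha \right)^k \mathbb{1} + i \, (\vec{m} \cdot \vec{\sigma}) \sum_{\substack{k=1 \\ k \,\mathrm{odd}}}^{\infty} \frac{(-1)^{\frac{1}{2}(k+1)}}{k!} \left( \tfrac{1}{2} \alpha \right)^k \\ &= \cos\tfrac{1}{2}\alpha \; \mathbb{1} - i \, (\vec{m} \cdot \vec{\sigma}) \sin\tfrac{1}{2}\alpha. \end{aligned} \tag{III. 135} \]<br />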

It can be verified that, under a rotation around an axis ⃗m over an angle α, with ⃗n R the unit vector<br />

in the rotated direction, the eigenstates of ⃗n · ⃗σ, (III. 132), transform into the eigenstates of ⃗n R · ⃗σ,<br />

obeying the rotational transformation rules<br />

U (⃗m, α) |⃗n, ±⟩ = |⃗n R , ±⟩. (III. 136)



We illustrate (III. 136) using a rotation of ⃗n in the x z - plane, ϕ = 0, over an angle α around<br />

the y - axis as in diagram III. 2.<br />

Figure III. 2: A rotated unit vector in the xz - plane<br />

For ⃗n and ⃗n R we have<br />

\[ \vec{n} = \begin{pmatrix} \sin\theta \\ 0 \\ \cos\theta \end{pmatrix}, \qquad \vec{n}_R = \begin{pmatrix} \sin(\theta + \alpha) \\ 0 \\ \cos(\theta + \alpha) \end{pmatrix}. \tag{III. 137} \]<br />

The eigenstates of ⃗n · ⃗σ, using (III. 132), are<br />

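\[ |\vec{n}, +\rangle = \begin{pmatrix} \cos\tfrac{1}{2}\theta \\ \sin\tfrac{1}{2}\theta \end{pmatrix} = \cos\tfrac{1}{2}\theta \, |z \uparrow\rangle + \sin\tfrac{1}{2}\theta \, |z \downarrow\rangle \tag{III. 138} \]<br />

and<br />

\[ |\vec{n}, -\rangle = \begin{pmatrix} - \sin\tfrac{1}{2}\theta \\ \cos\tfrac{1}{2}\theta \end{pmatrix} = - \sin\tfrac{1}{2}\theta \, |z \uparrow\rangle + \cos\tfrac{1}{2}\theta \, |z \downarrow\rangle. \tag{III. 139} \]<br />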

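Rotating around the y - axis, we have \( \vec{m} = \vec{e}_y \) and therefore<br />

\[ U(\vec{e}_y, \alpha) = \cos\tfrac{1}{2}\alpha \; \mathbb{1} - i \, \vec{e}_y \cdot \vec{\sigma} \, \sin\tfrac{1}{2}\alpha = \begin{pmatrix} \cos\tfrac{1}{2}\alpha & - \sin\tfrac{1}{2}\alpha \\ \sin\tfrac{1}{2}\alpha & \cos\tfrac{1}{2}\alpha \end{pmatrix}, \tag{III. 140} \]<br />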

we have<br />

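\[ U(\vec{e}_y, \alpha) \, |\vec{n}, +\rangle = \begin{pmatrix} \cos\tfrac{1}{2}(\theta + \alpha) \\ \sin\tfrac{1}{2}(\theta + \alpha) \end{pmatrix} = \cos\tfrac{1}{2}(\theta + \alpha) \, |z \uparrow\rangle + \sin\tfrac{1}{2}(\theta + \alpha) \, |z \downarrow\rangle \tag{III. 141} \]<br />

and<br />

\[ U(\vec{e}_y, \alpha) \, |\vec{n}, -\rangle = \begin{pmatrix} - \sin\tfrac{1}{2}(\theta + \alpha) \\ \cos\tfrac{1}{2}(\theta + \alpha) \end{pmatrix} = - \sin\tfrac{1}{2}(\theta + \alpha) \, |z \uparrow\rangle + \cos\tfrac{1}{2}(\theta + \alpha) \, |z \downarrow\rangle, \tag{III. 142} \]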



and we see that (III. 141) and (III. 142) are indeed the eigenstates |⃗n R , +⟩ and |⃗n R , −⟩ of ⃗n R · ⃗σ.<br />

Comparison of these eigenstates with the eigenstates of ⃗n · ⃗σ, (III. 138) and (III. 139), shows<br />

that (III. 136) is satisfied. As can easily be verified, this holds in general, and we conclude that spin<br />

is represented by the spin operator ⃗n · ⃗σ, founding our choice (III. 129).<br />

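Under a rotation around the y-axis over an angle θ the eigenvectors of $\sigma_z$ transform into

\[
U(\vec{e}_y, \theta)\, |z\uparrow\rangle = \left(\cos\tfrac{1}{2}\theta\, \mathbb{1} - i\, \sigma_y \sin\tfrac{1}{2}\theta\right) |z\uparrow\rangle
= \cos\tfrac{1}{2}\theta\, |z\uparrow\rangle + \sin\tfrac{1}{2}\theta\, |z\downarrow\rangle \tag{III. 143}
\]

and, likewise,

\[
U(\vec{e}_y, \theta)\, |z\downarrow\rangle = -\sin\tfrac{1}{2}\theta\, |z\uparrow\rangle + \cos\tfrac{1}{2}\theta\, |z\downarrow\rangle. \tag{III. 144}
\]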

In particular, the eigenvectors of $\sigma_x$ are obtained by rotating the eigenvectors of $\sigma_z$ around the y-axis over $\theta = \tfrac{1}{2}\pi$,

\[
U(\vec{e}_y, \tfrac{1}{2}\pi)\, |z\uparrow\rangle = \tfrac{1}{2}\sqrt{2}\, \bigl( |z\uparrow\rangle + |z\downarrow\rangle \bigr) = |x\uparrow\rangle, \tag{III. 145}
\]

and

\[
U(\vec{e}_y, \tfrac{1}{2}\pi)\, |z\downarrow\rangle = \tfrac{1}{2}\sqrt{2}\, \bigl( |z\downarrow\rangle - |z\uparrow\rangle \bigr) = |x\downarrow\rangle. \tag{III. 146}
\]

EXERCISE 23. Construct, analogously, the states |y ↑⟩ and |y ↓⟩ from |z ↑⟩ and |z ↓⟩ using a<br />

rotation around the x - axis.<br />

Successively rotating over ½π transforms |z ↑⟩ via |x ↑⟩, |z ↓⟩ and |x ↓⟩ into −|z ↑⟩, instead of

into |z ↑⟩, and consequently, we have to rotate |z ↑⟩ over 4π to come back to |z ↑⟩ again. Generally,<br />

a rotation over 2π transforms a state |ϕ⟩ into −|ϕ⟩. This means we cannot simply visualize particles<br />

with spin as tiny spinning tops!<br />
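The sign change under a full turn can be made concrete by applying the quarter-turn matrix repeatedly. This is a small sketch in plain Python, assuming the matrix form of U(⃗e y , α) given in (III. 140); the helper names are illustrative.

```python
import math

def rotation_y(angle):
    """U(e_y, angle) from (III. 140) as a real 2x2 matrix."""
    c, s = math.cos(angle / 2), math.sin(angle / 2)
    return [[c, -s], [s, c]]

def apply(matrix, vector):
    """Product of a 2x2 matrix with a 2-component vector."""
    return [sum(matrix[i][j] * vector[j] for j in range(2)) for i in range(2)]

quarter_turn = rotation_y(math.pi / 2)
state = [1.0, 0.0]            # |z up> in the basis of sigma_z
history = []
for _ in range(8):            # eight quarter turns: total rotation angle 4*pi
    state = apply(quarter_turn, state)
    history.append([round(x, 12) + 0.0 for x in state])

after_2pi = history[3]        # state after four quarter turns
after_4pi = history[7]        # state after eight quarter turns
print(after_2pi, after_4pi)
```

The intermediate entries of `history` pass through the components of |x ↑⟩, |z ↓⟩ and |x ↓⟩ in that order, so the list mirrors the chain described in the text.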

Finally a useful relation holds. Choosing again ⃗e y for ⃗m, we have U (⃗e y , α) as in (III. 140) which<br />

yields for arbitrary ⃗n, (III. 130),<br />

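\[
\begin{aligned}
\langle \vec{n}, +|\, U(\vec{e}_y, \alpha)\, |\vec{n}, +\rangle
&= \cos\tfrac{1}{2}\alpha + \left( e^{-i\varphi} - e^{i\varphi} \right) \cos\tfrac{1}{2}\theta\, \sin\tfrac{1}{2}\theta\, \sin\tfrac{1}{2}\alpha \\
&= \cos\tfrac{1}{2}\alpha - i \sin\varphi\, \sin\theta\, \sin\tfrac{1}{2}\alpha,
\end{aligned} \tag{III. 147}
\]

from which we see that, if $\vec{n}$ and $\vec{n}_R$ are in the xz-plane, ϕ = 0 or ϕ = π,

\[
\langle \vec{n}, + \,|\, \vec{n}_R, +\rangle = \cos\tfrac{1}{2}\alpha_{\vec{n}\vec{n}_R}, \tag{III. 148}
\]

where $\alpha_{\vec{n}\vec{n}_R}$ is the angle between $\vec{n}$ and $\vec{n}_R$. Because ⃗n and α can be chosen arbitrarily, this relation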

holds for any two vectors ⃗n and ⃗n ′ in the xz - plane, and, by freedom of choice of the coordinate<br />

system, it holds whenever ⃗n and ⃗n ′ are in the same plane.



EXERCISE 24. Show that the operator $\tfrac{1}{2}(\mathbb{1} + \vec{n} \cdot \vec{\sigma})$ is the projector on $|\vec{n}, +\rangle$,

\[
\tfrac{1}{2}(\mathbb{1} + \vec{n} \cdot \vec{\sigma}) = |\vec{n}, +\rangle \langle \vec{n}, +|. \tag{III. 149}
\]

◃ Remark<br />

This holds in any matrix representation. ▹<br />

III. 6. 2 MIXED SPIN 1/2 STATES

Every Hermitian 2 × 2 matrix can, as stated before, be written as (III. 121), $A = a_0 \mathbb{1} + \vec{a} \cdot \vec{\sigma}$,
with real coefficients $a_0$ and $\vec{a}$. According to (III. 29), for the corresponding operator A to be a state
operator the trace of A has to be 1, which means that $a_0 = \tfrac{1}{2}$. Furthermore, A has to be positive.

A positive matrix can be written as the square of a Hermitian matrix B,<br />

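\[
B = b_0 \mathbb{1} + \vec{b} \cdot \vec{\sigma} \quad \text{and} \quad B^2 = (b_0^2 + \vec{b}^{\,2})\, \mathbb{1} + 2\, b_0\, \vec{b} \cdot \vec{\sigma}. \tag{III. 150}
\]

Therefore,

\[
a_0 = \tfrac{1}{2} = b_0^2 + \vec{b}^{\,2} \quad \text{and} \quad \vec{a} = 2\, b_0\, \vec{b}. \tag{III. 151}
\]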

The possible values of $b_0$ are limited by (III. 151), $b_0^2 \le \tfrac{1}{2}$, while as soon as $b_0$ is chosen $\vec{b}$ is fixed, $\vec{b} = \frac{1}{2 b_0}\vec{a}$, yielding

\[
\vec{a}^{\,2} = 4\, b_0^2\, \vec{b}^{\,2} = 4\, b_0^2 \left( \tfrac{1}{2} - b_0^2 \right). \tag{III. 152}
\]

Obviously, $\vec{a}^{\,2}$ only depends on $b_0^2$; for $b_0^2$ in the interval $[0, \tfrac{1}{2}]$ its values lie between 0 and $\tfrac{1}{4}$, with the maximum attained at $b_0^2 = \tfrac{1}{4}$. In other words, A is a state operator iff $a_0 = \tfrac{1}{2}$ and $\vec{a}^{\,2} \le \tfrac{1}{4}$, in which
case some $b_0$ and $\vec{b}$ exist, satisfying the requirements (III. 151).

Now an arbitrary state operator is<br />

\[
W = \tfrac{1}{2}(\mathbb{1} + \vec{w} \cdot \vec{\sigma}), \qquad \vec{w}^{\,2} \le 1. \tag{III. 153}
\]

This state operator is characterized by the vector ⃗w, called the polarization vector, which has its<br />

endpoints within or on the surface of the unit sphere, the so - called Bloch sphere. For ∥ ⃗w∥ = 1 the<br />

system is called completely polarized, for ⃗w = 0 it is called unpolarized, and if 0 < ∥ ⃗w∥ < 1 it is<br />

called partially polarized.<br />

The state operators with ⃗w 2 = 1 are the pure states, the 1 - dimensional projectors,<br />

\[
W^2 = \tfrac{1}{4}(\mathbb{1} + 2\, \vec{w} \cdot \vec{\sigma} + \vec{w}^{\,2}\, \mathbb{1}) = \tfrac{1}{2}(\mathbb{1} + \vec{w} \cdot \vec{\sigma}) = W, \tag{III. 154}
\]

the state operators with ⃗w 2 < 1 are mixed states. The set of state operators is a convex set as we<br />

can now easily see. If $\vec{w}_1$ and $\vec{w}_2$ are within or on the surface of the unit sphere, then $\alpha \vec{w}_1 + \beta \vec{w}_2$,
with 0 < α, β < 1 and α + β = 1, is a point on the chord linking $\vec{w}_1$ and $\vec{w}_2$, and this chord lies within the sphere.
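To get a feeling for the Bloch-sphere picture, one can build W from a polarization vector as in (III. 153) and inspect its trace, purity and eigenvalues. The sketch below uses only the Python standard library; the helper names and the sample vectors are assumptions made for illustration.

```python
import math

def state_operator(w):
    """W = (1/2)(1 + w.sigma) as a 2x2 complex matrix, following (III. 153)."""
    wx, wy, wz = w
    return [[0.5 * (1 + wz), 0.5 * (wx - 1j * wy)],
            [0.5 * (wx + 1j * wy), 0.5 * (1 - wz)]]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(m):
    return m[0][0] + m[1][1]

def purity(W):
    """Tr W^2, equal to 1 for pure states and smaller for mixed states."""
    return trace(matmul(W, W)).real

def eigenvalues(W):
    """Eigenvalues of a Hermitian 2x2 matrix from its trace and determinant."""
    tr = trace(W).real
    det = (W[0][0] * W[1][1] - W[0][1] * W[1][0]).real
    root = math.sqrt(max(tr * tr / 4 - det, 0.0))
    return (tr / 2 - root, tr / 2 + root)

def mix(alpha, w1, w2):
    """Point alpha*w1 + (1 - alpha)*w2 on the chord between w1 and w2."""
    return tuple(alpha * a + (1 - alpha) * b for a, b in zip(w1, w2))

pure = state_operator((0.0, 0.0, 1.0))                      # on the sphere
mixed = state_operator(mix(0.5, (0.0, 0.0, 1.0), (1.0, 0.0, 0.0)))  # inside
print(purity(pure), purity(mixed), eigenvalues(mixed))
```

Mixing two pure states with equal weights gives a polarization vector of length below 1, so the purity drops below 1 while the trace stays equal to 1.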



EXERCISE 25. Prove the following statements.<br />

(a) $\langle \vec{\sigma} \rangle_W = \vec{w}$,

(b) $\det W = \tfrac{1}{4}(1 - \vec{w}^{\,2})$,

(c) the eigenvalues of W are $\tfrac{1}{2} \pm \tfrac{1}{2} \| \vec{w} \|$.

EXAMPLES<br />

In the following two examples, consider vectors ⃗w with ∥ ⃗w∥ = 1, thus corresponding to pure<br />

states.<br />

(a) Since in this case ⃗w equals the unit vector ⃗n, for $\vec{w} = (0, 0, 1) \in \mathbb{R}^3$ we have

\[
W = \tfrac{1}{2}(\mathbb{1} + \sigma_z) = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \tag{III. 155}
\]

which is a 1-dimensional projector; it is the matrix representation of $W = |z\uparrow\rangle \langle z\uparrow|$.

Likewise we have<br />

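\[
\begin{aligned}
\vec{w} = (1, 0, 0) &\implies W = \tfrac{1}{2}(\mathbb{1} + \sigma_x) = |x\uparrow\rangle \langle x\uparrow|, \\
\vec{w} = (0, 1, 0) &\implies W = \tfrac{1}{2}(\mathbb{1} + \sigma_y) = |y\uparrow\rangle \langle y\uparrow|,
\end{aligned} \tag{III. 156}
\]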

and we see that generally $W = \tfrac{1}{2}(\mathbb{1} + \vec{n} \cdot \vec{\sigma})$ corresponds to the pure state $|\vec{n}, +\rangle$, as was
already shown in (III. 149).

In the same way, for $|\vec{n}, -\rangle$ we have

\[
\vec{w} = (0, 0, -1) \implies W = \tfrac{1}{2}(\mathbb{1} - \sigma_z) = |z\downarrow\rangle \langle z\downarrow|, \tag{III. 157}
\]

etc.

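(b) For the probability to find spin up in the direction $\vec{n}'$ in the state $|\vec{n}, +\rangle$, with (III. 45)
and (III. 127) we find

\[
\begin{aligned}
\mu_{W_{\vec{n}}}(W_{\vec{n}'}) = \operatorname{Tr} W_{\vec{n}'} W_{\vec{n}}
&= \operatorname{Tr} \left( \tfrac{1}{2}(\mathbb{1} + \vec{n}' \cdot \vec{\sigma}) \cdot \tfrac{1}{2}(\mathbb{1} + \vec{n} \cdot \vec{\sigma}) \right) \\
&= \tfrac{1}{4} \operatorname{Tr} \left( \mathbb{1} + \vec{n}' \cdot \vec{\sigma} + \vec{n} \cdot \vec{\sigma} + (\vec{n}' \cdot \vec{n})\, \mathbb{1} + i\, \vec{\sigma} \cdot (\vec{n}' \times \vec{n}) \right) \\
&= \tfrac{1}{2}(1 + \vec{n}' \cdot \vec{n}) = \tfrac{1}{2}(1 + \cos\theta) = \cos^2 \tfrac{1}{2}\theta,
\end{aligned} \tag{III. 158}
\]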

with θ the angle between ⃗n and ⃗n ′ . This is in accordance with (III. 148).<br />
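The trace formula (III. 158) lends itself to a quick numerical check for directions that are not restricted to a common coordinate plane. The following sketch is written in plain Python; the two directions are arbitrary assumed values and the helper names are illustrative.

```python
import math

def unit_vector(theta, phi):
    """Unit vector with polar angle theta and azimuth phi."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def projector_up(n):
    """(1/2)(1 + n.sigma), the projector on |n,+>, as a 2x2 complex matrix."""
    nx, ny, nz = n
    return [[0.5 * (1 + nz), 0.5 * (nx - 1j * ny)],
            [0.5 * (nx + 1j * ny), 0.5 * (1 - nz)]]

def trace_of_product(a, b):
    """Tr(a b) for two 2x2 matrices."""
    return sum(a[i][k] * b[k][i] for i in range(2) for k in range(2))

n = unit_vector(0.4, 0.3)          # assumed test direction
n_prime = unit_vector(1.3, 2.0)    # assumed test direction

probability = trace_of_product(projector_up(n_prime), projector_up(n)).real
cos_angle = sum(x * y for x, y in zip(n, n_prime))
half_angle_form = math.cos(math.acos(cos_angle) / 2) ** 2
print(probability, 0.5 * (1 + cos_angle), half_angle_form)
```

The three printed numbers correspond to the first, the next-to-last and the last expression in (III. 158).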

The following examples concern mixed state operators W , for which ⃗w has its endpoint somewhere<br />

inside the sphere, ⃗w 2 < 1.<br />

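(c) Choosing ⃗w to be $\tfrac{1}{2}(0, 1, 0)$ yields

\[
W = \tfrac{1}{2}(\mathbb{1} + \tfrac{1}{2} \sigma_y) = \begin{pmatrix} \tfrac{1}{2} & -\tfrac{1}{4} i \\ \tfrac{1}{4} i & \tfrac{1}{2} \end{pmatrix}. \tag{III. 159}
\]

This can, for instance, be factorized as

\[
W = \tfrac{1}{4}\, |z\uparrow\rangle \langle z\uparrow| + \tfrac{1}{4}\, |z\downarrow\rangle \langle z\downarrow| + \tfrac{1}{2}\, |y\uparrow\rangle \langle y\uparrow|, \tag{III. 160}
\]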

which clearly is a mixture.
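The factorization (III. 160) can be verified by summing the weighted projectors explicitly. In the sketch below, written in plain Python, the vector used for |y ↑⟩ is the standard eigenvector of σ y with eigenvalue +1; its overall phase is an assumption, but the projector does not depend on it.

```python
import math

def projector(ket):
    """|ket><ket| for a 2-component complex vector."""
    return [[ket[i] * ket[j].conjugate() for j in range(2)] for i in range(2)]

def weighted_sum(terms):
    """Sum of weight * matrix over a list of (weight, matrix) pairs."""
    total = [[0j, 0j], [0j, 0j]]
    for weight, m in terms:
        for i in range(2):
            for j in range(2):
                total[i][j] += weight * m[i][j]
    return total

r = 1 / math.sqrt(2)
z_up, z_down = [1 + 0j, 0j], [0j, 1 + 0j]
y_up = [r + 0j, 1j * r]        # eigenvector of sigma_y with eigenvalue +1

mixture = weighted_sum([(0.25, projector(z_up)),
                        (0.25, projector(z_down)),
                        (0.5, projector(y_up))])
target = [[0.5, -0.25j], [0.25j, 0.5]]      # the matrix in (III. 159)
max_deviation = max(abs(mixture[i][j] - target[i][j])
                    for i in range(2) for j in range(2))
print(max_deviation)
```

Other sets of weights and projectors reproduce the same matrix, which is why the factorization is introduced with "for instance".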



The next two examples concern the center of the Bloch sphere, ⃗w = 0 .<br />

(d) With ⃗w = 0 , we have<br />

\[
W = \tfrac{1}{2} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{III. 161}
\]

The eigenvalues of this mixed state W are degenerate, and various factorizations are possible,<br />

for example<br />

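\[
\begin{aligned}
W &= \tfrac{1}{2}\, |x\uparrow\rangle \langle x\uparrow| + \tfrac{1}{2}\, |x\downarrow\rangle \langle x\downarrow| \\
&= \tfrac{1}{2}\, |y\uparrow\rangle \langle y\uparrow| + \tfrac{1}{2}\, |y\downarrow\rangle \langle y\downarrow| \\
&= \tfrac{1}{2}\, |z\uparrow\rangle \langle z\uparrow| + \tfrac{1}{2}\, |z\downarrow\rangle \langle z\downarrow|.
\end{aligned} \tag{III. 162}
\]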

(e) Under a rotation R, ⃗w behaves like a vector in R 3 ,<br />

\[
U(R)\, (\vec{w} \cdot \vec{\sigma})\, U^{-1}(R) = \vec{w}_R \cdot \vec{\sigma}, \tag{III. 163}
\]

where U (R) is given by (III. 135). Therefore, the only rotation invariant state for a 1 - particle<br />

system is ⃗w = 0 .<br />

The similarity between the set of density matrices W and the 3 - dimensional unit sphere of polarization<br />

vectors is specific for spin 1/2 particles, in which case every pure state is also the eigenstate<br />

for the spin operator in a certain spin direction. For spin 1 bosons and higher spin particles this no<br />

longer applies.<br />

III. 6. 3 TWO SPIN 1/2 PARTICLES

III. 6. 3. 1 SINGLET AND TRIPLET STATES

Consider a composite system of two spin 1/2 fermions. In the direct product space $\mathbb{C}^2 \otimes \mathbb{C}^2 = \mathbb{C}^4$
a basis is

\[
|z\uparrow\rangle \otimes |z\uparrow\rangle, \quad |z\uparrow\rangle \otimes |z\downarrow\rangle, \quad |z\downarrow\rangle \otimes |z\uparrow\rangle, \quad |z\downarrow\rangle \otimes |z\downarrow\rangle. \tag{III. 164}
\]

From these basis states the simultaneous eigenstates $|s, m\rangle$ of the operators $\vec{S}^{\,2} = (\vec{S}_1 + \vec{S}_2)^2$
and $S_z = S_{1z} + S_{2z}$ can be formed, where s can be 0 or 1. The eigenvalues of $\vec{S}^{\,2}$ are $\hbar^2 s(s + 1)$, the
eigenvalues of $S_z$ are $m\hbar$, as introduced on p. 65.

The singlet state or singlet for short, with s = 0 and therefore m = 0, is the entangled state<br />

\[
|\Psi_0\rangle = |0, 0\rangle = \tfrac{1}{2}\sqrt{2}\, \bigl( |z\uparrow\rangle \otimes |z\downarrow\rangle - |z\downarrow\rangle \otimes |z\uparrow\rangle \bigr), \tag{III. 165}
\]

which looks the same in terms of the eigenstates of S x and S y , having spherical symmetry. The singlet<br />

is a simultaneous eigenstate of S x , S y and S z with eigenvalue 0. Hence the singlet is an eigenstate<br />

of ⃗n · ⃗S with eigenvalue 0, which means that a rotation (III. 133) carries (III. 165) back into itself.



The triplet states, with s = 1 and m = 1, 0, −1 are<br />

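\[
\begin{aligned}
|1, 1\rangle &= |z\uparrow\rangle \otimes |z\uparrow\rangle \\
|1, 0\rangle &= \tfrac{1}{2}\sqrt{2}\, \bigl( |z\uparrow\rangle \otimes |z\downarrow\rangle + |z\downarrow\rangle \otimes |z\uparrow\rangle \bigr) \\
|1, -1\rangle &= |z\downarrow\rangle \otimes |z\downarrow\rangle.
\end{aligned} \tag{III. 166}
\]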

III. 6. 3. 2 CORRELATIONS

In chapter VII we will use the spin correlation function of the singlet,<br />

\[
E_{QM}(\vec{a}, \vec{b}) := \langle 0, 0|\, \vec{a} \cdot \vec{\sigma}_1 \otimes \vec{b} \cdot \vec{\sigma}_2\, |0, 0\rangle, \tag{III. 167}
\]

where $\vec{a}, \vec{b} \in \mathbb{R}^3$ are unit vectors. $E_{QM}(\vec{a}, \vec{b})$ is the expectation value of the product of the outcomes of
spin measurements on particle 1 along $\vec{a}$ and on particle 2 along $\vec{b}$. To find $E_{QM}(\vec{a}, \vec{b})$, first choose the z-axis along $\vec{a}$
as in diagram III. 3, next choose the x-axis in such a way that $\vec{b}$ is in the xz-plane. The spherical
symmetry of the singlet state allows such a choice.

[Figure III. 3: Spin up for particle 1 along $\vec{a}$, for particle 2 along $\vec{b}$; the figure shows $\vec{a}$ along the z-axis and $\vec{b}$ in the xz-plane at angle $\theta_{\vec{a}, \vec{b}}$ from $\vec{a}$]

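With $\vec{a} = \vec{e}_z$, $\vec{b}$ similar to $\vec{n}$ in (III. 137), and $\theta_{\vec{a}, \vec{b}}$ the angle between $\vec{a}$ and $\vec{b}$, we have

\[
E_{QM}(\vec{a}, \vec{b}) = \langle 0, 0|\, \sigma_{1z} \otimes \left( \sin\theta_{\vec{a}, \vec{b}}\, \sigma_{2x} + \cos\theta_{\vec{a}, \vec{b}}\, \sigma_{2z} \right) |0, 0\rangle. \tag{III. 168}
\]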

Now $\sigma_z |z\uparrow\rangle = |z\uparrow\rangle$, $\sigma_x |z\uparrow\rangle = |z\downarrow\rangle$ etc., so that we have, using (II. 100), (III. 165) and (III. 166),

\[
(\sigma_{1z} \otimes \sigma_{2x})\, |0, 0\rangle = \tfrac{1}{2}\sqrt{2}\, \bigl( |1, 1\rangle + |1, -1\rangle \bigr), \tag{III. 169}
\]

which is perpendicular to $|0, 0\rangle$, and

\[
(\sigma_{1z} \otimes \sigma_{2z})\, |0, 0\rangle = - |0, 0\rangle, \tag{III. 170}
\]

from which we see that

\[
E_{QM}(\vec{a}, \vec{b}) = - \cos\theta_{\vec{a}, \vec{b}}. \tag{III. 171}
\]
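The result (III. 171) can be reproduced numerically by building the operator ⃗a · ⃗σ 1 ⊗ ⃗ b · ⃗σ 2 as a 4 × 4 matrix in the basis (III. 164) and taking its expectation value in the singlet. The sketch below uses only the Python standard library; the function names and the test angle are illustrative assumptions.

```python
import math

def spin_along(n):
    """n . sigma as a 2x2 complex matrix."""
    nx, ny, nz = n
    return [[nz, nx - 1j * ny], [nx + 1j * ny, -nz]]

def kron(a, b):
    """Kronecker product of two 2x2 matrices, basis order as in (III. 164)."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def expectation(op, psi):
    """<psi| op |psi> for a 4x4 operator and a 4-component state."""
    return sum(psi[r].conjugate() * op[r][c] * psi[c]
               for r in range(4) for c in range(4))

r = 1 / math.sqrt(2)
singlet = [0j, r + 0j, -r + 0j, 0j]      # components of (III. 165)

def correlation(a, b):
    """E_QM(a, b) of (III. 167) for unit vectors a and b."""
    return expectation(kron(spin_along(a), spin_along(b)), singlet).real

theta = 0.9                               # assumed angle between a and b
a = (0.0, 0.0, 1.0)
b = (math.sin(theta), 0.0, math.cos(theta))
print(correlation(a, b), -math.cos(theta))
```

Because the singlet is rotationally invariant, the same function should return minus the inner product of the two unit vectors for any pair of directions, not only for the choice of axes made in the text.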



III. 6. 3. 3 CONDITIONAL PROBABILITIES

In chapter VII we will also need to know, again in case the particles are in the singlet state, the<br />

probability for the spin of particle 2 to be found in the direction ⃗ b, given that the spin of particle 1 was<br />

found in the direction ⃗a. This conditional probability is, by definition,<br />

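\[
\operatorname{Prob}\left( \vec{b} \cdot \vec{\sigma}_2 = 1 \mid \vec{a} \cdot \vec{\sigma}_1 = 1 \right)
= \frac{\operatorname{Prob}\left( \vec{b} \cdot \vec{\sigma}_2 = 1 \wedge \vec{a} \cdot \vec{\sigma}_1 = 1 \right)}{\operatorname{Prob}\left( \vec{a} \cdot \vec{\sigma}_1 = 1 \right)}. \tag{III. 172}
\]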

Here the joint probability is<br />

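\[
\operatorname{Prob}\left( \vec{b} \cdot \vec{\sigma}_2 = 1 \wedge \vec{a} \cdot \vec{\sigma}_1 = 1 \right)
= \left| \left( \langle \vec{a}\uparrow| \otimes \langle \vec{b}\uparrow| \right) |0, 0\rangle \right|^2, \tag{III. 173}
\]

with $|\vec{a}\uparrow\rangle \otimes |\vec{b}\uparrow\rangle$ the direct product of the eigenstates of $\vec{a} \cdot \vec{\sigma}_1$ and $\vec{b} \cdot \vec{\sigma}_2$ having eigenvalues +1.
Again choosing $\vec{a}$ and $\vec{b}$ as in diagram III. 3, $|\vec{a}\uparrow\rangle = |z\uparrow\rangle$ and $|\vec{b}\uparrow\rangle$ equal to $|\vec{n}, +\rangle$, (III. 138), we find
for the direct product

\[
|\vec{a}\uparrow\rangle \otimes |\vec{b}\uparrow\rangle = |z\uparrow\rangle \otimes \left( \cos\tfrac{1}{2}\theta_{\vec{a}, \vec{b}}\, |z\uparrow\rangle + \sin\tfrac{1}{2}\theta_{\vec{a}, \vec{b}}\, |z\downarrow\rangle \right). \tag{III. 174}
\]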

Therefore, with (III. 165),<br />

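\[
\left( \langle \vec{a}\uparrow| \otimes \langle \vec{b}\uparrow| \right) |0, 0\rangle = \tfrac{1}{2}\sqrt{2}\, \sin\tfrac{1}{2}\theta_{\vec{a}, \vec{b}}, \tag{III. 175}
\]

and we see that the joint probability is

\[
\operatorname{Prob}\left( \vec{b} \cdot \vec{\sigma}_2 = 1 \wedge \vec{a} \cdot \vec{\sigma}_1 = 1 \right) = \tfrac{1}{2} \sin^2 \tfrac{1}{2}\theta_{\vec{a}, \vec{b}}. \tag{III. 176}
\]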

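Likewise, again using (III. 173) with $|\vec{b}\downarrow\rangle$ equal to $|\vec{n}, -\rangle$, (III. 139), we have

\[
\operatorname{Prob}\left( \vec{b} \cdot \vec{\sigma}_2 = -1 \wedge \vec{a} \cdot \vec{\sigma}_1 = 1 \right) = \tfrac{1}{2} \cos^2 \tfrac{1}{2}\theta_{\vec{a}, \vec{b}}. \tag{III. 177}
\]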

This yields for the marginal probability<br />

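\[
\begin{aligned}
\operatorname{Prob}\left( \vec{a} \cdot \vec{\sigma}_1 = 1 \right)
&= \operatorname{Prob}\left( \vec{b} \cdot \vec{\sigma}_2 = 1 \wedge \vec{a} \cdot \vec{\sigma}_1 = 1 \right)
+ \operatorname{Prob}\left( \vec{b} \cdot \vec{\sigma}_2 = -1 \wedge \vec{a} \cdot \vec{\sigma}_1 = 1 \right) \\
&= \tfrac{1}{2} \sin^2 \tfrac{1}{2}\theta_{\vec{a}, \vec{b}} + \tfrac{1}{2} \cos^2 \tfrac{1}{2}\theta_{\vec{a}, \vec{b}} = \tfrac{1}{2},
\end{aligned} \tag{III. 178}
\]

and we see that the conditional probability (III. 172) is

\[
\operatorname{Prob}\left( \vec{b} \cdot \vec{\sigma}_2 = 1 \mid \vec{a} \cdot \vec{\sigma}_1 = 1 \right) = \sin^2 \tfrac{1}{2}\theta_{\vec{a}, \vec{b}}. \tag{III. 179}
\]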

◃ Remark<br />

By definition there is no correlation between the two results of measurements of spin if<br />

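\[
\operatorname{Prob}\left( \vec{b} \cdot \vec{\sigma}_2 = 1 \mid \vec{a} \cdot \vec{\sigma}_1 = 1 \right) = \operatorname{Prob}\left( \vec{b} \cdot \vec{\sigma}_2 = 1 \right), \tag{III. 180}
\]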

which is the case if ⃗a and ⃗ b are perpendicular. ▹



We are now able to calculate the correlation (III. 167) directly, using a well - known formula from<br />

probability theory,<br />

\[
E_{QM}(\vec{a}, \vec{b}) = \sum_{a=-1}^{+1} \sum_{b=-1}^{+1} a\, b\, \operatorname{Prob}(a, b), \tag{III. 181}
\]

where a, b ∈ { −1, 1} are the results of measurements of ⃗a · ⃗σ 1 and ⃗ b · ⃗σ 2 , respectively, and<br />

Prob (a, b) is the joint probability to find a and b at measurements of the respective spin quantities.<br />

Using (III. 176) and (III. 177) and calculating the probabilities with eigenvalues −1 for ⃗a · ⃗σ 1 we<br />

find<br />

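\[
\begin{aligned}
E_{QM}(\vec{a}, \vec{b}) &= \operatorname{Prob}(1, 1) + \operatorname{Prob}(-1, -1) - \operatorname{Prob}(1, -1) - \operatorname{Prob}(-1, 1) \\
&= 2 \cdot \tfrac{1}{2} \sin^2 \tfrac{1}{2}\theta_{\vec{a}, \vec{b}} - 2 \cdot \tfrac{1}{2} \cos^2 \tfrac{1}{2}\theta_{\vec{a}, \vec{b}} \\
&= - \cos\theta_{\vec{a}, \vec{b}}.
\end{aligned} \tag{III. 182}
\]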

This is indeed equal to the earlier result (III. 171).<br />
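The route via probabilities can be retraced numerically: compute the four joint probabilities from (III. 173), then form the marginal, the conditional probability and the sum (III. 181). The following plain-Python sketch assumes the geometry of diagram III. 3 (⃗a along z, ⃗ b in the xz-plane); names and the test angle are illustrative.

```python
import math

def up(theta):
    """|n,+> in the xz-plane, following (III. 138)."""
    return [math.cos(theta / 2), math.sin(theta / 2)]

def down(theta):
    """|n,-> in the xz-plane, following (III. 139)."""
    return [-math.sin(theta / 2), math.cos(theta / 2)]

r = 1 / math.sqrt(2)
singlet = [0.0, r, -r, 0.0]          # real components of (III. 165)

def joint_probability(ket_1, ket_2):
    """|(<ket_1| (x) <ket_2|) |0,0>|^2 for real two-component kets."""
    amplitude = sum(ket_1[i] * ket_2[k] * singlet[2 * i + k]
                    for i in range(2) for k in range(2))
    return amplitude ** 2

theta = 0.9                                   # assumed angle between a and b
a_states = {+1: up(0.0), -1: down(0.0)}       # a along the z-axis
b_states = {+1: up(theta), -1: down(theta)}   # b at angle theta in the xz-plane

prob = {(a, b): joint_probability(a_states[a], b_states[b])
        for a in (+1, -1) for b in (+1, -1)}

marginal_a_up = prob[(1, 1)] + prob[(1, -1)]          # compare (III. 178)
conditional = prob[(1, 1)] / marginal_a_up            # compare (III. 179)
correlation = sum(a * b * p for (a, b), p in prob.items())  # compare (III. 182)
print(marginal_a_up, conditional, correlation)
```

Setting `theta` to a right angle makes the conditional probability equal to the marginal probability for particle 2, which is the uncorrelated case mentioned in the remark above.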

III. 6. 3. 4 EXAMPLE OF A MIXED STATE OF TWO SPIN 1/2 PARTICLES

Consider, analogous to (III. 100), the pure entangled state<br />

\[
|\Phi\rangle = \tfrac{1}{2}\sqrt{2}\, \bigl( |z\uparrow\rangle \otimes |z\uparrow\rangle + |z\downarrow\rangle \otimes |z\downarrow\rangle \bigr), \tag{III. 183}
\]

and the corresponding state W = |Φ⟩ ⟨Φ|, acting in H I ⊗ H II ,<br />

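\[
\begin{aligned}
W = \tfrac{1}{2} \bigl( &|z\uparrow\rangle \langle z\uparrow| \otimes |z\uparrow\rangle \langle z\uparrow| + |z\uparrow\rangle \langle z\downarrow| \otimes |z\uparrow\rangle \langle z\downarrow| \\
{}+{} &|z\downarrow\rangle \langle z\uparrow| \otimes |z\downarrow\rangle \langle z\uparrow| + |z\downarrow\rangle \langle z\downarrow| \otimes |z\downarrow\rangle \langle z\downarrow| \bigr),
\end{aligned} \tag{III. 184}
\]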

where the first factor in the direct product acts in H I , and the second factor in H II .<br />

The representation of W in the corresponding basis (III. 164) of H = H I ⊗ H II is, using the<br />

Kronecker product of matrices, (II. 103),<br />

\[
W = \tfrac{1}{2} \begin{pmatrix} 1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 1 \end{pmatrix}. \tag{III. 185}
\]

This is indeed a pure state, since W is idempotent, a necessary and sufficient condition for bounded,<br />

self - adjoint operators to be a projector.<br />

The partial traces are<br />

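\[
W_{\mathrm{I}} = \tfrac{1}{2}\, |z\uparrow\rangle \langle z\uparrow| + \tfrac{1}{2}\, |z\downarrow\rangle \langle z\downarrow| \in S(H_{\mathrm{I}}), \tag{III. 186}
\]

\[
W_{\mathrm{II}} = \tfrac{1}{2}\, |z\uparrow\rangle \langle z\uparrow| + \tfrac{1}{2}\, |z\downarrow\rangle \langle z\downarrow| \in S(H_{\mathrm{II}}), \tag{III. 187}
\]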



and their matrix representation in the basis of σ z is<br />

\[
W_{\mathrm{I}} = \tfrac{1}{2} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \quad \text{and} \quad W_{\mathrm{II}} = \tfrac{1}{2} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{III. 188}
\]

Although W is a pure state, the direct product of the partial traces W I and W II is not pure,<br />

\[
W_{\mathrm{I}} \otimes W_{\mathrm{II}} = \tfrac{1}{4} \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \neq W. \tag{III. 189}
\]

This conclusion is, of course, in accordance with the conclusion (III. 104) concerning the pure state<br />

operator (III. 100).<br />
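The steps of this example can be followed in a short computation: form |Φ⟩⟨Φ| in the basis (III. 164), check idempotency, take both partial traces and compare their Kronecker product with W. This is a sketch in plain Python; the index convention (row index 2i + k for the product basis) and the helper names are assumptions made for the illustration.

```python
import math

r = 1 / math.sqrt(2)
phi = [r, 0.0, 0.0, r]        # |Phi> of (III. 183) in the basis (III. 164)
W = [[phi[i] * phi[j] for j in range(4)] for i in range(4)]  # |Phi><Phi|, real here

def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace_out_second(m):
    """Partial trace over the second factor, leaving a 2x2 operator on H_I."""
    return [[sum(m[2 * i + k][2 * j + k] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace_out_first(m):
    """Partial trace over the first factor, leaving a 2x2 operator on H_II."""
    return [[sum(m[2 * i + k][2 * i + l] for i in range(2)) for l in range(2)]
            for k in range(2)]

def kron(a, b):
    """Kronecker product of two 2x2 matrices."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def close(a, b, tol=1e-12):
    """Entry-wise comparison of two matrices."""
    return all(abs(x - y) < tol
               for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))

W_I = trace_out_second(W)
W_II = trace_out_first(W)
is_pure = close(matmul(W, W), W)              # idempotency of W
product_is_W = close(kron(W_I, W_II), W)      # does W_I (x) W_II reproduce W?
print(W_I, W_II, is_pure, product_is_W)
```

The two flags make the point of the example explicit: W itself is a projector, while the product of its partial traces is a different operator.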

◃ Remark<br />

Notice that all matrices in this example are indeed Hermitian, positive and have trace 1, the requirements<br />

of Gleason’s theorem, p. 47, for operators W to be state operators. ▹<br />

EXERCISE 26.<br />

(a) In (III. 184), fill in the matrix representations of the projectors in H I and H II , and check<br />

that forming Kronecker products indeed yields (III. 185).<br />

(b) Is the state (III. 184) spherically symmetric?


IV THE COPENHAGEN INTERPRETATION

It is wrong to think that the task of physics is to find out how nature is. Physics concerns<br />

what we can say about nature.<br />

— Niels Bohr<br />

The Heisenberg-Bohr tranquilizing philosophy - or religion? - is so delicately contrived<br />

that, for the time being, it provides a gentle pillow for the true believer from which he<br />

cannot very easily be aroused. So let him lie there.<br />

— Albert Einstein<br />

I know it is not the fault of N. B. that he did not study philosophy. But I deeply regret<br />

that by his authority the brains of two or three generations will be upset and hindered to<br />

think about the problems ‘He’ pretends to have solved.<br />

— Erwin Schrödinger<br />

Because Bohr’s famous institute was located in Copenhagen, the standard interpretation of quantum
mechanics as explained in most textbooks is generally referred to as the Copenhagen Interpretation.
It is, however, worth mentioning that the conceptions of the many supporters of the
Copenhagen Interpretation, Niels Bohr, Werner Heisenberg, Wolfgang Pauli, Rudolf Peierls,
Léon Rosenfeld and John Wheeler, to name some of them, differ from one another on numerous points,

and that some of them, including Bohr himself, modified their conceptions in the course of time,<br />

so that the name ‘Copenhagen Interpretation’ is more a collective noun than the name of one<br />

clearly outlined vision. Moreover, important contributions to the standard interpretation of the<br />

theory have been made by Born and Von Neumann, working independently of the Copenhagen<br />

school. In this chapter we will evaluate the conceptions of Heisenberg and Bohr as the main<br />

representatives of the Copenhagen Interpretation, and consider more closely the debate between<br />

Einstein and Bohr. Finally, we will discuss the exact expression of the uncertainty principle.<br />

IV. 1 HEISENBERG AND THE UNCERTAINTY PRINCIPLE

The history of modern quantum mechanics starts in 1925, when Heisenberg publishes his famous<br />

transitional article ‘Über quantentheoretische Umdeutung kinematischer und mechanischer<br />

Beziehungen’ (‘Quantum - theoretical re - interpretation of kinematic and mechanical relations’). His<br />

summary reads<br />

The present paper seeks to establish a basis for theoretical quantum mechanics founded<br />

exclusively upon relationships between quantities which in principle are observable.



Obviously the theory was only allowed to speak about observable quantities; every attempt to<br />

visualize the inside of an atom had to be avoided. In particular, one could not speak of the orbit<br />

of an electron. Only the transitions between stationary states were ‘observable’ and therefore the<br />

transition quantities could be characterized by two discrete indices. These ideas were developed<br />

by Heisenberg, Born and Jordan into matrix mechanics. They represented all physical quantities<br />

by infinite complex Hermitian matrices. The ‘quantum condition’, the fundamental equation of this<br />

theory, is the commutation relation<br />

\[
P Q - Q P = - i \hbar\, \mathbb{1} \tag{IV. 1}
\]

between the matrices P and Q, which were meant to be the ‘quantum counterparts’ of the canonical<br />

dynamical quantities, momentum and position, of classical mechanics à la Hamilton.<br />

In 1926 matrix mechanics received unexpected competition from wave mechanics, established by

Erwin Schrödinger. He interpreted the electron as a vibrating charge cloud, continuously moving<br />

in space. In his conception the stationary states could be understood as resonances, comparable to<br />

the vibrations of the string of a violin. According to Schrödinger, wave mechanics was to be preferred<br />

over matrix mechanics because wave mechanics offers a graphic picture of what takes place in<br />

microphysical reality. This interpretation foundered on three insoluble problems:<br />

(i) waves of physical systems consisting of more than one particle were defined in the configuration
space $\mathbb{R}^{3N}$ instead of in the three-dimensional space $\mathbb{R}^3$ surrounding us,

(ii) wave packets of free particles eventually fall apart and therefore, the electron cannot remain a<br />

localized entity,<br />

(iii) the wave function can carry complex values.<br />

Nevertheless, the empirical power of wave mechanics eventually turned out to be just as great
as that of matrix mechanics.

The fact that an approach with such radically different starting points also turned out to be possible
impelled Heisenberg to further clarify his starting points. The result of this effort is his ‘uncertainty

principle’, formulated for the first time in his 1927 article ‘Über den anschaulichen Inhalt der<br />

quantentheoretischen Kinematik und Dynamik’, which was translated as ‘The physical content of<br />

quantum kinematics and mechanics’.<br />

In this article Heisenberg wonders how the ‘orbit’ of an electron must be understood in quantum<br />

mechanics. On the one hand, the basic equation (IV. 1) prevents granting numerical values to position<br />

and momentum simultaneously, on the other hand, the path of a particle in, for example, a Wilson<br />

chamber, seems to be directly perceptible. To find a way out of this dilemma, he was inspired by a<br />

statement of Einstein (H.J. Folse 1985, p. 91),<br />

[. . . ] it is the theory finally which decides what can be observed and what can not [. . . ]<br />

Could it be that, if a path cannot be defined in quantum mechanics, it cannot in fact be observed either?

This idea led him to analyze what the theory has to say about observations.



He starts (1927, Eng. tr. p. 64) with linking measuring and defining operationally,<br />

When one wants to be clear about what is to be understood by the words “position of the<br />

object”, for example of the electron, relative to a given frame of reference, then one must<br />

specify definite experiments with whose help one plans to measure the “position of the<br />

electron”, otherwise this word has no meaning.<br />

We will call this the measuring = defining principle.<br />

One could, for example, determine the position of an electron by examining it under a microscope.<br />

According to classical optics a microscope has a limited resolution. The Abbe criterion gives the<br />

smallest distinguishable details as<br />

\[
\delta q \sim \frac{\lambda}{\sin\varepsilon}, \tag{IV. 2}
\]

where λ is the wavelength of light and ε is the aperture, the opening angle of the lens. For a precise<br />

measurement we must therefore use a very short wavelength, i.e. gamma radiation. But in that case<br />

the Compton effect cannot be neglected. The radiation behaves as a flow of particles, with momentum
$p_0 = h/\lambda$, which collides with the electron and causes it to recoil.

Figure IV. 1: Heisenberg’s γ - microscope<br />

To allow for an observation at least one photon has to collide with the electron, which will bring<br />

about a change of momentum. But as we do not know anything more about the direction of the<br />

photon after the collision than that it has gone through the lens, we cannot indicate the size of the<br />

recoil exactly. As can be seen in figure IV. 1, the transfer of momentum remains unknown to an<br />

amount<br />

\[
\delta p \sim p_0 \sin\varepsilon = \frac{h}{\lambda} \sin\varepsilon \tag{IV. 3}
\]

and therefore<br />

δq δp ∼ h. (IV. 4)



The more precisely the position is determined (δq small), the less accurately the momentum is
known afterwards (δp large).
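A few numbers make the trade-off in the γ-microscope tangible. The sketch below evaluates (IV. 2) and (IV. 3) for three wavelengths in plain Python; the wavelengths and the opening angle are assumed example values, and the relations are order-of-magnitude estimates rather than exact bounds.

```python
import math

PLANCK_H = 6.62607015e-34      # Planck constant in J s (exact SI value)

def position_uncertainty(wavelength, aperture):
    """Resolution of the microscope, following (IV. 2)."""
    return wavelength / math.sin(aperture)

def momentum_uncertainty(wavelength, aperture):
    """Unknown part of the photon recoil, following (IV. 3)."""
    return PLANCK_H / wavelength * math.sin(aperture)

aperture = math.radians(30)    # assumed opening angle of the lens
wavelengths = (5e-7, 1e-10, 1e-12)   # visible light, X-ray, gamma ray (assumed values, in m)

products = []
for wavelength in wavelengths:
    dq = position_uncertainty(wavelength, aperture)
    dp = momentum_uncertainty(wavelength, aperture)
    products.append(dq * dp)
    print(f"lambda = {wavelength:.1e} m: dq = {dq:.2e} m, dp = {dp:.2e} kg m/s")
```

Shortening the wavelength shrinks δq and enlarges δp by the same factor, so the product stays of the order of h, as stated in (IV. 4).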

Quoting Heisenberg again (loc. cit.)<br />

At the instant when position is determined - therefore, at the moment when the photon is<br />

scattered by the electron - the electron undergoes a discontinuous change in momentum.<br />

This change is the greater the smaller the wavelength of the light employed - that is, the<br />

more exact the determination of the position. At the instant at which the position of the<br />

electron is known, its momentum therefore can be known up to magnitudes which correspond<br />

to that discontinuous change. Thus, the more precisely the position is determined,<br />

the less precisely the momentum is known, and conversely.<br />

This conclusion is the first formulation of the uncertainty principle. According to Heisenberg’s<br />

own measuring = defining principle this conclusion can, however, not yet be drawn because it also<br />

has to be specified what, in this context, must be understood by the term ‘momentum of the electron’.<br />

In a later discussion (Heisenberg 1930), Heisenberg specifies the reasoning by also discussing the<br />

definition of the momentum of the electron.<br />

This reasoning goes as follows. Suppose that the momentum of the electron has been measured<br />

in advance with an inaccuracy $\delta p_1$. Next, the position is measured with an inaccuracy $\delta q$, then the
momentum is measured again, with inaccuracy $\delta p_2$. We can assume that $\delta p_1 \ll p_1$ and $\delta p_2 \ll p_2$,
so that the momentum is very accurately known before and after the position measurement. Now it
makes sense to speak of the momentum $p_1$ of the electron shortly before the position measurement.

If now the position is measured very precisely, the position and momentum of the electron in the past<br />

are arbitrarily well defined. Heisenberg (1930, p. 20):<br />

[. . . ] if the velocity of the electron is at first known and the position then exactly measured,<br />

the position for times previous to the measurement may be calculated. Then for<br />

these past times δp δq is smaller than the usual limiting value [. . . ]<br />

Apparently, the uncertainty relation does not apply to the past. In the example the uncertainty concerns<br />

the unpredictability of the value of $p_2$ after the position measurement, not the inaccuracy $\delta p_2$
with which $p_2$ can be measured. This unpredictability can be determined by accurately measuring

the momentum before and after the determination of position, and the unpredictability is larger if<br />

the determination of position was more precise. Although it is true that one can speak in a logically<br />

consistent manner of the position and momentum of the electron in the past (loc. cit.),<br />

[. . . ] but this knowledge of the past is of a purely speculative character, since it can never<br />

(because of the unknown change in momentum caused by the position measurement) be<br />

used as an initial condition in any calculation of the future progress of the electron and<br />

thus cannot be subjected to experimental verification. It is a matter of personal belief<br />

whether such a calculation concerning the past history of the electron can be ascribed<br />

any physical reality or not.<br />

For Heisenberg, such a calculation does not describe reality. But then, what is reality to him?<br />

Heisenberg says, (1927, Eng. tr. p. 73),<br />

The “orbit” comes into being only when we observe it.



Apparently, the measurement creates reality, instead of revealing it. This is what we call the measuring<br />

= creating principle.<br />

This leads to the following representation. First, we measure the momentum of the electron<br />

precisely. Not only is the term “the momentum of the electron” hereby defined, now we also can<br />

say, according to the measuring = creating principle, that the value of the momentum, which was<br />

determined in this measurement, is physically real. Next, we measure the position precisely. At<br />

this measurement the electron obtains an exact position. After this measurement the momentum of<br />

the electron has however changed in an unpredictable manner. This can be verified with a second<br />

precise momentum measurement. This unpredictability turns out to be all the larger as the position<br />

measurement is more precise.<br />

Now the question is whether the electron already had this changed momentum before the second momentum
measurement, i.e., whether this value is also physically real before this measurement. According

to Heisenberg this is not the case, because we can only predict the momentum to the order of the<br />

size of the change. Before the second momentum measurement the electron has only a blurred, fuzzy<br />

momentum. Only when the measurement of momentum has been carried out does the electron regain a

sharply defined momentum. ‘Fuzzy’ is meant in the ontological sense, as the sharpness of a property<br />

the electron possesses. As one quantity is measured more precisely, the conjugate quantity becomes<br />

more fuzzy.<br />

◃ Remark<br />

Directly after the measurement of momentum it is meaningful to say that the electron has this momentum,<br />

because in that case the outcome of a next measurement of momentum can, within the accuracy<br />

of measurement, be predicted with certainty. ▹<br />

In later work Heisenberg uses the Aristotelian term potential. A related term by K.R. Popper<br />

is propensity. The electron has a propensity to produce, at measurement, a certain outcome. This<br />

propensity can be understood as a real property of the electron, even if we are not performing a<br />

measurement. The potential and propensity interpretations are therefore ‘realistic’ interpretations, or<br />

at least not in conflict with scientific realism which is, roughly speaking, the thesis that a scientific<br />

theory tells us how (a part of) reality is made up.<br />

IV. 1. 1 REMARKS

(a) Heisenberg derives the uncertainty relation (IV. 4) for the electron from a quantum mechanical<br />

treatment of the photon. What he in fact hereby proves is the consistency of the uncertainty<br />

principle.<br />

(b) Although it is frequently written that the uncertainty relation restricts simultaneous measurements,<br />

simultaneous measurements of position and momentum do not appear in this discussion.<br />

(c) Creation of the sharp value of a quantity upon measurement can, in the terminology of the<br />

projection postulate, p. 42, be described as follows. Upon measurement of p the state transforms<br />

into the proper eigenstate of p. In that state q is unpredictable. If next q is measured, the<br />

state transforms into the proper eigenstate of q and p becomes unpredictable. The uncertainty<br />

principle says that that unpredictability is larger if the preceding measurement of q was more<br />

precise.



(d) Heisenberg (1930) describes the path of an electron in a Wilson chamber as follows. Suppose<br />

that the incoming electron can be described by a wave packet with fairly sharply defined position<br />

and momentum. Upon free development this packet spreads out in the course of time so<br />

that the position becomes less sharp. When the electron ionizes a molecule in the Wilson chamber<br />

a macroscopic droplet is formed, which can be understood as a position measurement. As<br />

a result the wave packet reduces to a packet which is rather sharply located, with a dimension<br />

in the order of a molecule, which again spreads out until a next ionization takes place.<br />

It can be shown that the successive spreading and contraction in position and momentum is,<br />

according to the uncertainty principle, in agreement with the observation of a macroscopic<br />

path. We cannot speak however of the path of an electron in an atom, not even approximately.<br />

An observation of the position of the electron with an accuracy larger than the dimension of the<br />

atom requires such a large recoil that the electron is generally pushed out of the atom entirely.<br />

Therefore, of such an ‘orbit’ no more than one point is observable. Notice that observation plays<br />

a vital role; the path in the Wilson chamber only comes into existence because we observe it.<br />

(e) As a result of Heisenberg’s discussion of the uncertainty principle the term measurement disturbance

was introduced in quantum mechanics. Initially the inclination existed to consider this<br />

as a more or less classical physical process; the momentum of the electron is disturbed by the<br />

collision with a photon. This is also indicated by Heisenberg’s use of the word ‘error’ for δq.<br />

From the beginning, Bohr resisted Heisenberg’s explanation, and he put the emphasis on<br />

the necessity to combine mutually exclusive terms from a wave and particle picture in one description.<br />

Especially because of EPR it later became clear that the ‘measurement disturbance’<br />

cannot be an ordinary error.<br />

IV. 2 BOHR AND COMPLEMENTARITY<br />

The core of the Copenhagen interpretation lies, of course, in Bohr’s work. His articles are characterized<br />

by a style entirely their own. Remarkably, Bohr hardly uses the formalism of the theory; he<br />

generally gives a qualitative argument instead. His difficult, and sometimes obscurely formulated,<br />

long sentences are notorious, full of subordinate clauses and conditional definitions which do not<br />

always clarify his intentions. A careful reconstruction and interpretation of Bohr’s point of view,<br />

and its development in the course of time, has been given by E. Scheibe (1973, chapter 1); another<br />

interpretation is given in the monograph by H.J. Folse (1985).<br />

Central to Bohr’s considerations is the language we use to do physics. Bohr emphasizes that,<br />

regardless of how abstract and refined the terms of modern physics may be, in essence they are only<br />

an extension of everyday language, and they are nothing but the means we use to<br />

communicate observational results to other people. Such an observational result, the outcome of a<br />

measurement on a physical system in certain experimental circumstances, is therefore the basic element<br />

of consideration. For this, Bohr uses the term phenomenon. Every phenomenon is the resultant<br />

of a physical system S, a preparation apparatus P, a measuring apparatus M and their mutual interaction<br />

in a concrete experimental situation.<br />

The description of a phenomenon must always be made in unambiguous terms because of the<br />

requirement of communicability. A statement like, for example, “the object is in a superposition


IV. 2. BOHR AND COMPLEMENTARITY 83<br />

of two different states” is therefore not suitable. In classical physics a sufficient arsenal of terms has<br />

been developed for this purpose.<br />

According to Bohr, characteristic of classical physics is in the first place that the interaction between<br />

object and measuring apparatus can be assumed to be negligibly small. This implies that upon<br />

describing a phenomenon the measuring apparatus can be left out of consideration. Instead of the<br />

statement: “Thermal interaction between a thermometer and a glass of water has, in certain circumstances,<br />

yielded as a result that the mercury column has been found to have a certain length”, we<br />

can also say: “The temperature of water has a certain value”. In this case we can, without objection,<br />

transfer the description of the phenomenon onto the object itself, and speak in terms of its properties.<br />

The essential difference between classical physics and quantum physics is, according to Bohr, that<br />

in quantum physics the interaction is quantized. The interaction between an object and a measuring<br />

apparatus can only consist of the exchange of one or more quanta, and cannot be made arbitrarily small.<br />

Bohr calls this starting point the quantum postulate (Bohr 1928, p. 580).<br />

QUANTUM POSTULATE:<br />

[The] essence [of the quantum theory] may be expressed in the so-called quantum postulate,<br />

which attributes to any atomic process an essential discontinuity, or rather individuality,<br />

completely foreign to the classical theories and symbolized by Planck’s quantum<br />

of action.<br />

In a phenomenon the object, the measuring apparatuses, and their interaction form an indivisible<br />

whole, and the interaction always amounts to at least one quantum h. This postulate unsettles the<br />

procedure to convert the description of a phenomenon into a description of the object itself.<br />

There is however a second element in Bohr’s point of view, which tempers this pessimistic conclusion.<br />

Scheibe called it the buffer postulate (1973, p. 24) because “the function of the postulate is<br />

to use classical physics as a buffer against the quantum-mechanical treatment of a phenomenon”.<br />

BUFFER POSTULATE:<br />

The description of the apparatus and of the results of observation, which forms part of<br />

the description of a quantum phenomenon, must be expressed in the concepts of classical<br />

physics (including those of “everyday life”), eliminating consistently the Planck quantum<br />

of action.<br />

The rationale for this requirement is, again, the need to communicate our experimental findings to other<br />

people. The reasoning is as follows (Bohr 1947, p. 59):<br />

[. . . ] by an experiment we simply understand an event about which we are able in an<br />

unambiguous way to state the conditions necessary for the reproduction of the phenomena.<br />

In the account of these conditions, there can, therefore, be no question of departing<br />

from the Newtonian way of description and, in particular, it may be stressed that by the<br />

[. . . measuring apparatus . . . ], we simply understand some piece of machinery as regards<br />

the working of which classical mechanics can be entirely relied upon and where,<br />

consequently, all quantum effects have to be disregarded.



Bohr assumes that only the language and terms of classical physics are suitable for the description<br />

of observational results. He writes (Bohr 1931, p. 692)<br />

[. . . ] the unambiguous interpretation of any measurement must be essentially framed in<br />

terms of the classical physical theories, and we may say that in this sense the language<br />

of Newton and Maxwell will remain the language of physicists for all time.<br />

This is a particularly radical point of view, and we will return to its motivation later.<br />

The combination of both postulates now leads to the following reasoning. In all phenomena an<br />

interaction exists between the system and the measuring apparatus which has a minimal order of magnitude<br />

h > 0; after all, even the most minute measurements rely on a quantum phenomenon. But<br />

in our description of the phenomenon we are forced to use classical concepts, in which this interaction, h,<br />

cannot occur. The consequence is that in our description the interaction is not analyzable.<br />

At the same time the classical character of the description makes it possible to speak again in<br />

terms of properties of the object itself. Therefore, instead of the statement “the interaction between a<br />

particle and a photographic plate resulted in a little black dot in a certain area of the plate”, we can<br />

also say “the particle has been found at a position in that area”, a statement that no longer refers to the<br />

measuring apparatus.<br />

But the major difference from the classical situation is that, by disregarding the interaction, we<br />

in a certain way make a mistake which remains without consequences within this phenomenon, but<br />

prevents the description from being combined with the information obtained under different experimental<br />

conditions. If the object is coupled to another measuring apparatus there will be another interaction,<br />

which will again not be analyzable. Descriptions of the object that have been obtained under different<br />

measurement arrangements cannot be combined into one picture which covers it all. We will illustrate<br />

this in a more concrete case.<br />

IV. 2. 1 COMPLEMENTARY PHENOMENA<br />

The most important examples of phenomena which give supplementary but mutually exclusive information<br />

on an object are measurements of position and momentum. Bohr (1939, p. 22) writes<br />

[. . . ] any phenomenon in which we are concerned with tracing a displacement of some<br />

atomic object in space and time necessitates the establishment of several coincidences<br />

between the object and the rigidly connected bodies and movable devices which, in serving<br />

as scales and clocks respectively, define the space-time frame of reference to which<br />

the phenomenon in question is referred.<br />

In this case, therefore, the object has an interaction with an apparatus which is firmly bolted down<br />

or anchored, so that its position remains secured. But the consequence is that a possible exchange<br />

of momentum between object and apparatus cannot be analyzed. Such a transfer of momentum<br />

will be absorbed by the fixed parts of the apparatus without leaving behind any trace. Within this<br />

experimental setup we are therefore unable to say anything about the momentum of the object.<br />



The opposite applies to the measurement of momentum (Bohr in Schilpp 1949, p. 219):<br />

In the study of phenomena in the account of which we are dealing with detailed momentum<br />

balance, certain parts of the whole device must naturally be given the freedom to<br />

move independently of others.<br />

Bohr assumes that a measurement of momentum is made by registering the recoil after a collision,<br />

for example, with a test particle. In this way we can, using the conservation laws, retrieve the<br />

momentum of the object. However, the condition that the test particle can move freely means that we<br />

cannot guarantee that it preserves a definite position. It is therefore excluded from being used as part<br />

of a spatial coordinate system, and now we cannot say anything about the position of the object.<br />

In order to perform a position measurement we must therefore put the object in contact with a<br />

part of the measuring apparatus which has been bolted down firmly, while performing a momentum<br />

measurement we must observe the recoil of a freely movable part of the measuring apparatus, and<br />

apply the momentum conservation law. Position and momentum measurements therefore exclude<br />

each other, because a measuring apparatus cannot at the same time be bolted down and freely movable.<br />

In the description of the object we must choose between assigning it a position or a momentum. In the words<br />

of Philipp Frank (1949, p. 163):<br />

Quantum mechanics speaks neither of particles the positions and velocities of which<br />

exist but cannot be accurately observed, nor of particles with indefinite positions and<br />

velocities. Rather, it speaks of experimental arrangements in the description of which the<br />

expressions “position of a particle” and “velocity of a particle” can never be employed<br />

simultaneously.<br />

Bohr calls this characteristic property of quantum mechanics, where two quantities exclude each<br />

other whereas both are necessary to describe all phenomena in which the object can participate, complementarity.<br />

Position and momentum are examples of complementary quantities. Similar considerations<br />

apply to time and energy, such that a general complementarity exists between on the one hand<br />

a space-time description of phenomena, and on the other hand a dynamical description, frequently<br />

called ‘causal’ by Bohr, in which the conservation laws for energy and momentum are applicable.<br />

◃ Remark<br />

The complementarity between quantities like position and momentum, or between descriptions using space-time<br />

coordination or dynamical laws, differs from, and replaces, the contrast which Bohr placed at the center<br />

of his earlier work, namely that between ‘wave’ and ‘particle’, because a classical particle has both position<br />

and momentum, whereas a classical wave has neither. ▹<br />

In Bohr’s view, the uncertainty relations are in the first place symbolic expressions<br />

of the impossibility of defining position and momentum at<br />

the same time when describing an object. In a phenomenon in which the position is determined<br />

sharply, δq = 0, the momentum must be undetermined, δp = ∞, and vice versa. But the relation<br />

δq δp ∼ h is, of course, more general. Bohr (1934, pp. 60, 61) interprets this as follows:<br />

At the same time, however, the general character of this relation makes it possible to<br />

a certain extent to reconcile the conservation laws with the space-time co-ordination<br />

of observations, the idea of a coincidence of well-defined events in a space-time point<br />

being replaced by that of unsharply defined individuals within finite space-time regions.<br />



The meaning Bohr attaches to the uncertainty relations can be summarized this way: the sharper<br />

we can, in a phenomenon, define the position of the object, the fuzzier the momentum must be defined,<br />

and vice versa. The quantities δq and δp in the relation δqδp ∼ h therefore represent the fuzziness in<br />

the definition. Bohr emphasizes the epistemological role of these quantities more strongly than their ontological<br />

role.<br />

IV. 2. 2 REMARKS AND PROBLEMS<br />

Bohr’s supposition that classical language is a definitive means of expression for physical observations,<br />

which cannot be improved upon, is radical and at first sight even rather unacceptable. Language<br />

develops and history teaches us that from time to time new concepts are necessary. Aristotle had,<br />

for example, no momentum concept, Newton knew nothing of energy, Coulomb had no theory of<br />

fields, etc. Is it not obvious that quantum mechanics also calls for new concepts? Bohr,<br />

however, says (ibid., p. 16):<br />

[. . . ] it would be a misconception to believe that the difficulties of the atomic theory may<br />

be evaded by eventually replacing the concepts of classical physics by new conceptual<br />

forms.<br />

Bohr emphasizes that with this point of view he does not reject the introduction of new entities,<br />

e.g. quarks, superstrings or black holes. The aspects of classical language which are the reason<br />

that it cannot be improved upon are, according to him, descriptions in terms of space and time and<br />

descriptions in terms of cause and effect. These are the only categories with which we can describe<br />

observational results.<br />

Another problem with the idea that the classical concepts cannot be improved upon is Bohr’s<br />

immediate conclusion that the quantum of action cannot occur in the description of a phenomenon,<br />

because a statement such as ‘$h = 6.6 \cdot 10^{-34}$ J s’ is also an unambiguous summary of experimental<br />

evidence, although not of one phenomenon. The idea that h cannot appear in the language of observations<br />

is a weak, and in fact untenable, point in his argumentation. The prohibition of the use of h<br />

in the language of observations also brought Bohr to the conclusion that the spin of an electron, $\frac{1}{2}\hbar$,<br />

would be fundamentally unobservable. This conclusion has been proven to be incorrect.<br />

In some articles Bohr gives a more abstract explanation of the quantum postulate and emphasizes<br />

the ‘symbolic’ role of h. It does not so much represent the inevitable interaction, or measurement<br />

disturbance, between object and measuring apparatus, as the fundamental impossibility to make a<br />

sharp distinction between object and observation apparatus. It is, in any case, clear that Bohr does not<br />

regard the formalism of quantum mechanics, with its wave functions and operators, as an extension<br />

or improvement of classical language. He emphasizes that this formalism is purely symbolic and<br />

cannot be taken as a description, as the quantum state of a system is given without reference to the<br />

experimental setup.<br />

It should be noted that Bohr, in emphasizing the applicability of concepts, has more in mind than<br />

the ‘logical’ question of ‘definiteness’. For Bohr a term like ‘position of a particle’ is applicable if we<br />

can in fact control and secure this position, using firmly bolted apparatuses. Bohr’s use of the term<br />

‘determination’ refers both to a measurement and to a state preparation.<br />



Speaking of ‘partially defined positions and momenta’, Bohr regards the uncertainty relation<br />

between position and momentum as offering the possibility of a compromise with the complementarity<br />

between position and momentum. Here we can think of a context of measurement in which the<br />

object interacts with a part of the apparatus which is linked with the rest of the apparatus by means<br />

of a spring with a finite spring constant, an intermediate form between ‘freely movable’ and ‘firmly<br />

bolted’. He did not, however, develop this compromise. This point of view does in fact not fit the<br />

usual mathematical derivation of the uncertainty relations for position and momentum. These make,<br />

for two given (sharp) quantities p and q, a statement about the spread in quantum states, not about the<br />

well-definedness of the quantities. Attempts have been made to establish this compromise mathematically<br />

by the introduction of ‘blurred quantities’, e.g. by Busch, Grabowski and Lahti (1995).<br />
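To make concrete the sense in which the usual derivation is a statement about the spread in a quantum state, the following minimal Python sketch numerically evaluates the standard deviations of position and momentum for a Gaussian wave packet. The function name, the grid parameters and the chosen width are illustrative assumptions and do not come from the text. The check uses the standard-deviation form of the relation, σ_q σ_p ≥ ℏ/2, which a Gaussian packet saturates.<br />

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant in J s


def gaussian_spreads(sigma, n=4001, width=10.0):
    """Return (sigma_q, sigma_p) for the real wave function psi(x) = exp(-x^2 / (4 sigma^2)).

    The integrals are approximated by simple Riemann sums on a grid covering
    [-width * sigma, +width * sigma]; normalisation is handled by dividing by the norm.
    """
    dx = 2.0 * width * sigma / (n - 1)
    norm = var_x = grad_sq = 0.0
    for i in range(n):
        x = -width * sigma + i * dx
        psi = math.exp(-x * x / (4.0 * sigma * sigma))
        dpsi = -x / (2.0 * sigma * sigma) * psi      # analytic derivative d(psi)/dx
        norm += psi * psi * dx                       # integral of |psi|^2
        var_x += x * x * psi * psi * dx              # integral of x^2 |psi|^2 (mean of x is 0)
        grad_sq += dpsi * dpsi * dx                  # integral of |psi'|^2 (mean of p is 0)
    sigma_q = math.sqrt(var_x / norm)
    sigma_p = HBAR * math.sqrt(grad_sq / norm)       # <p^2> = hbar^2 * integral |psi'|^2 / norm
    return sigma_q, sigma_p


# Hypothetical packet width of 0.1 nm, roughly an atomic dimension.
sigma_q, sigma_p = gaussian_spreads(1e-10)
print(f"sigma_q * sigma_p = {sigma_q * sigma_p:.4e} J s")
print(f"hbar / 2          = {HBAR / 2:.4e} J s")
```

Both spreads are properties of the state alone: the quantities p and q themselves remain sharp operators, which is the point at which the usual derivation diverges from Bohr's reading.<br />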

Of fundamental importance in Bohr’s point of view is that a phenomenon involves both an object and an experimental<br />

setup. The setup determines which frame of concepts applies to the object. In<br />

many cases the contrast between object and measuring apparatus coincides with that between a microscopic<br />

and a macroscopic system, respectively. But that is not necessarily so. A macroscopic system<br />

can also be considered as an object while a microscopic system can serve as a measuring apparatus.<br />

We can consider, for example, a macroscopic measuring apparatus to be the object of another measurement.<br />

As soon as we do this the macroscopic system can, according to Bohr, no longer fulfil<br />

its role as a measuring device. It becomes an object itself, to which the quantum formalism must be<br />

applied. This functional contrast between object and measuring apparatus is therefore more essential<br />

than that between microscopic and macroscopic systems.<br />

For a good understanding of Bohr’s position, and Heisenberg’s for that matter, it is important to<br />

notice that measurements do not require the presence of consciousness. Decisive for applicability of<br />

classical concepts is the presence of a measurement context. Therefore, subjectivity does not play<br />

a role in any form; for the applicability of a concept such as ‘momentum’ it does not matter whether a conscious<br />

observer, a computer or another measuring apparatus carries out the momentum measurement.<br />

Also, from Bohr’s refusal to assign a realistic meaning to the quantum mechanical description, the<br />

conclusion cannot be drawn that he supports an anti-realistic or ‘instrumentalist’ view of physics,<br />

where instrumentalism is roughly the thesis that a scientific theory is only an instrument to carry out<br />

calculations of which we compare the outcomes with the indications of measuring apparatuses, in<br />

particular, that a theory is no ‘knowledge of the world’, that it does not provide a faithful picture of<br />

what reality is. An object such as an electron has, besides its quantum mechanical state, more than<br />

enough permanent properties, such as the super-selected quantities mass and charge which are not<br />

subject to complementarity, to conceive it as a real, existing object.<br />

IV. 2. 3 AGREEMENT AND DIFFERENCE BETWEEN HEISENBERG AND BOHR<br />

Both Heisenberg and Bohr emphasize that quantum mechanics is a complete theory which cannot<br />

be extended into a more detailed description with hidden variables. Bohr says (Schilpp 1949, p. 235)<br />

[. . . ] in quantum mechanics, we are not dealing with an arbitrary renunciation of a more<br />

detailed analysis of atomic phenomena, but with a recognition that such an analysis is in<br />

principle excluded.



Heisenberg (1927, p. 83) also expresses himself in this sense. He summarizes the uncertainty relations<br />

as follows:<br />

Even in principle, we cannot know the present in all detail.<br />

He rejects the conception that behind the statistical description of quantum mechanics there is still a<br />

‘real world’ as a “fruitless and senseless speculation” (loc. cit.).<br />

According to both Bohr and Heisenberg, the quantum mechanical description cannot be applied<br />

to the whole world, because a classically described context of measurement is always necessary. The<br />

border between the classical and quantum mechanical description can be moved at will, but cannot<br />

be removed. Therefore, quantum mechanics is not a universal theory in the sense that there exists<br />

something like a ‘wave function of the universe’.<br />

Further agreement between Heisenberg and Bohr is found in the significance they attach to measurement.<br />

The difference is that according to Heisenberg something changes in the object during<br />

measurement; some properties are created, others disappear or become fuzzy. According to Bohr<br />

nothing has to happen in the object. The experimental setup only enables some description of the system<br />

which would not be allowed in another experimental setup. According to Bohr, the uncertainty<br />

relation is a symbolic, as opposed to a descriptive, expression of the impossibility of defining position and<br />

momentum in one phenomenon.<br />

Another difference is that Heisenberg tends, more than Bohr, to a realistic interpretation of the<br />

mathematical quantum formalism. In an interview at the end of his life, Heisenberg admitted that he<br />

never really understood the idea of complementarity.<br />

IV. 3 DEBATE BETWEEN EINSTEIN AND BOHR<br />

IV. 3. 1 INTRODUCTION<br />

Einstein, who contributed to the development of the quantum theory until 1922, never wanted<br />

to accept the Copenhagen interpretation. In his memoirs, Heisenberg mentions how he, on a visit to<br />

Berlin, explained his starting point that the theory may speak exclusively about observable quantities,<br />

and, to his surprise, Einstein would have none of it: “the theory decides what can be<br />

observed”. The main source for the course of the debate between Einstein and Bohr, which we will<br />

review here, is Bohr’s own report ‘Discussion with Einstein on Epistemological Problems in Atomic<br />

Physics’ (Bohr 1949).<br />

The very first time Einstein made his objections public was at the 5th Solvay Conference in<br />

Brussels in 1927 where he suggested there were two conceivable conceptions concerning the quantum<br />

mechanical wave function.<br />

(i) The state ψ gives a description of the individual system which is as complete as possible.<br />

(ii) The state ψ does not characterize an individual system but an ensemble of identically prepared<br />

systems. Therefore, as a description of the individual system ψ is incomplete, ψ is a ‘statistical<br />

quantity’.


IV. 3. DEBATE BETWEEN EINSTEIN AND BOHR 89<br />

Conception (i) was defended by Heisenberg and Bohr. Einstein raised the following objection to this<br />

conception: when a particle travels through a narrow slit, the wave function will, by diffraction, extend<br />

itself over a large part of space. If this is a complete description of the particle, we have to conclude<br />

that it is potentially present everywhere in this area. But after detection of the particle on a photographic<br />

plate it is out of the question that it can still be found elsewhere. Therefore, the wave function<br />

must disappear suddenly there, which would imply a peculiar ‘action at a distance’. This objection<br />

does not apply to conception (ii), because there the detection simply corresponds to the choice of an<br />

element from the ensemble.<br />

In his answer, Bohr emphasized that the diffraction of the wave function by a slit in a firmly bolted<br />

screen originates in the possibility for the particle to exchange momentum with the screen. But<br />

this exchange of momentum is not analyzable within this setup, i.e., without detaching the screen.<br />

The question whether a more detailed description of the individual case is possible found its<br />

temporary culmination in the analysis of the thought experiment with the double slit, which is depicted<br />

in figure IV. 2. When a monochromatic wave travels through a screen with two narrow slits, an interference<br />

pattern is visible on a photographic plate. This is typical of wave behavior, where the waves<br />

from both slits cooperate. An individual particle, however, can only travel through one slit, and the<br />

wave function does not tell us through which slit it travels.<br />

Figure IV. 2: The double slit interference experiment (Bohr 1949)<br />

Einstein now suggested that it was nevertheless possible to obtain information about which<br />

slit the particle travels through, for example by measuring the transfer of momentum to the first screen.<br />

If this screen receives a downward thrust, the particle has chosen the upper slit, and vice versa.<br />

Bohr answered that if we want to measure the momentum transfer to the screen with a precision<br />

sufficient to distinguish the recoils belonging to the paths through the two slits, the momentum<br />

of the screen itself must be known very precisely. If d represents the distance between the slits, and l<br />

represents the distance between the screens, the angle between the two paths is of the order<br />

$$\alpha \simeq \sin\alpha = \frac{d}{l}. \tag{IV. 5}$$<br />

The recoil is of the order<br />

$$p_0 \sin\alpha \simeq \frac{h}{\lambda}\,\frac{d}{l}, \tag{IV. 6}$$<br />



and therefore we have to know the momentum of the screen with an uncertainty<br />

$$\delta p \lesssim \frac{h}{\lambda}\,\frac{d}{l}. \tag{IV. 7}$$<br />

Achieving such precision is, however, only possible if the screen is movable. But in that case it can<br />

no longer fulfil its function as a screen which determines an exact position for the slit. It<br />

is therefore no longer part of the original measuring context, as can be seen in figure IV. 3.<br />

Figure IV. 3: Contexts of measurement in which the interference of the particles is visible, and those<br />

in which the recoil of the screen is visible, exclude each other. (Bohr 1949)<br />

Actually, because now we will perform a measurement on the screen, the screen itself has to be<br />

considered an object. This means that quantum mechanics applies to it, and the screen is, therefore,<br />

also subject to an uncertainty relation<br />

$$\delta q \gtrsim \frac{\lambda\, l}{d}. \tag{IV. 8}$$<br />

But this is an indefiniteness of the same order of magnitude as the distance between the interference<br />

bands. Bohr concludes that under these circumstances interference can no longer be seen.<br />
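Bohr's order-of-magnitude argument can be checked with a few lines of arithmetic. The following Python sketch is only an illustration under stated assumptions: the function name and the numerical values (wavelength, slit distance, screen distance) are hypothetical, the particle momentum is taken to be the de Broglie value h/λ, and the fringe spacing is taken, as in the text, to be of order λ l / d.<br />

```python
import math

H = 6.62607015e-34  # Planck's constant in J s


def double_slit_scales(wavelength, slit_distance, screen_distance):
    """Return (recoil difference, minimal position uncertainty of the screen, fringe spacing), SI units."""
    angle = slit_distance / screen_distance      # alpha ~ d / l (small-angle estimate)
    p0 = H / wavelength                          # de Broglie momentum of the particle
    recoil = p0 * angle                          # difference in recoil between the two paths
    dq_min = H / recoil                          # from dq * dp >~ h with dp equal to the recoil difference
    fringe = wavelength * screen_distance / slit_distance  # fringe spacing, of order lambda * l / d
    return recoil, dq_min, fringe


# Hypothetical numbers: 50 pm wavelength, slits 1 micrometre apart, screens 1 m apart.
recoil, dq_min, fringe = double_slit_scales(5e-11, 1e-6, 1.0)
print(f"recoil difference to resolve : {recoil:.3e} kg m/s")
print(f"position uncertainty of slit : {dq_min:.3e} m")
print(f"fringe spacing               : {fringe:.3e} m")
print("same order of magnitude      :", math.isclose(dq_min, fringe, rel_tol=1e-9))
```

Whatever values are chosen, Planck's constant cancels: a screen whose momentum is known well enough to reveal the path is displaced by about one fringe spacing, which is why the pattern washes out.<br />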

With this reasoning he was able to transform Einstein’s objection into a confirmation of his idea<br />

of complementarity; as soon as we try to carry out a closer analysis of the phenomenon, we have to<br />

modify the experimental setup in such a way that the phenomenon changes beyond recognition. Nowadays<br />

a variant of this thought experiment can actually be carried out in a laboratory, as we will<br />

discuss in section IV. 4.<br />

IV. 3. 2 THE PHOTON BOX<br />

At the 6th Solvay Conference in 1930 in Brussels, Einstein gave another example, which is known<br />

under the name ‘the photon box’. It concerns an isolated box filled with radiation and equipped with<br />

a clock mechanism which opens a shutter during a very short interval. It is assumed that in advance<br />

the box is weighed meticulously.



Upon closure of the shutter we have, according to Einstein, a choice: either we weigh the box<br />

again and determine how much mass has vanished so that we can, using the relation $E = mc^2$,<br />

retrieve the energy of the escaped photon, or we open the box and read off the clock mechanism to<br />

determine when the shutter has been opened, which enables us to predict the time of exit of the photon<br />

and therefore its time of arrival at a remote detector. We can choose between both options long after<br />

the photon has left.<br />

Bohr’s answer is not entirely clear. It may be assumed that he did not understand Einstein’s<br />

intentions correctly.<sup>1</sup> He interprets Einstein’s objection as an attempt to refute the uncertainty relation<br />

between energy and time; he shows that both determinations cannot possibly be made at the same<br />

time.<br />

Bohr reasons as follows. Assume that the box hangs in equilibrium from a spring in a gravitational<br />

field. When in a time interval T a mass δm escapes, the box receives an upward impulse F ∆t of magnitude<br />

$$g\,\delta m\,T. \tag{IV. 9}$$<br />

We can keep T finite by hanging, at some moment, a small weight on the box to compensate for the<br />

loss of mass. Suppose we want to determine the mass of the photon by measuring this momentum<br />

transfer; then, again, the momentum of the box at the start of the experiment must be known precisely,<br />

$$\delta p \lesssim g\,\delta m\,T. \tag{IV. 10}$$<br />

But now the same argument applies as used in the double slit experiment. This precise determination<br />

of momentum is only possible if the fixation of the position of the box is given up. The box itself<br />

must be considered a quantum mechanical object, and therefore the uncertainty relation $\delta p\,\delta q \gtrsim h$<br />

applies to it. The position of the box is unknown with an uncertainty of magnitude<br />

$$\delta q \gtrsim \frac{h}{g\,\delta m\,T}, \tag{IV. 11}$$<br />

from which it follows that the gravitational potential $\varphi_g$ to which the clock is exposed is also uncertain,<br />

$$\delta\varphi_g \simeq g\,\delta q \gtrsim \frac{h}{\delta m\,T}. \tag{IV. 12}$$<br />

But according to the red shift formula from the general theory of relativity (!) the rate of a clock is<br />

influenced by the gravitational potential,<br />

$$\frac{\Delta T}{T} = \frac{\delta\varphi_g}{c^2}, \tag{IV. 13}$$<br />

therefore, the rate of the clock is also uncertain, and consequently the time of opening of the shutter is<br />

unknown. Under the circumstances in which we can determine the energy of the photon, we cannot<br />

retrieve its exit time exactly.<br />
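The last step of Bohr's argument, which the text leaves implicit, can be sketched as a worked equation. It assumes the relations (IV. 12) and (IV. 13) in the form $\delta\varphi_g \gtrsim h/(\delta m\,T)$ and $\Delta T/T = \delta\varphi_g/c^2$, together with $E = mc^2$ for the escaped energy; the symbol $\delta E = \delta m\,c^2$ is introduced here for illustration only.<br />

```latex
\begin{align*}
\Delta T \;=\; T\,\frac{\delta\varphi_g}{c^2}
  \;\gtrsim\; T\,\frac{h}{\delta m\,T\,c^2}
  \;=\; \frac{h}{\delta m\,c^2}
  \;=\; \frac{h}{\delta E},
\qquad\text{so that}\qquad
\Delta T\,\delta E \;\gtrsim\; h .
\end{align*}
```

The weighing time T cancels, so on this reading the uncertainty in the exit time and the uncertainty in the photon energy are tied together by an energy-time uncertainty relation of the usual order of magnitude.<br />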

Although Bohr seems to rebut Einstein with Einstein’s own theory, Bohr’s answer raises, among other<br />

things, the question whether it is appropriate that the correctness of quantum mechanics relies on<br />

the correctness of the general theory of relativity, which is a classical theory and is, strictly speaking,<br />

in contradiction with quantum mechanics.<br />

<sup>1</sup> That Einstein indeed intended to point out the freedom of choice is apparent from a letter to Bohr from Paul<br />

Ehrenfest, who heard the argument from Einstein earlier.



EXERCISE 27. Try, using the uncertainty relation for time and energy, $\delta t\,\delta E \gtrsim h$, to refute<br />

Einstein’s argumentation without appealing to other physical theories.<br />

IV. 3. 3 EINSTEIN, PODOLSKY AND ROSEN<br />

The thought experiment of Einstein, Podolsky and Rosen, which we discussed in section I. 2,<br />

forms the highlight of the debate. Here Einstein’s objections emerge in their purest form.<br />

Given two systems which interacted with each other at some time, but are separated now, consider<br />

two non - commuting quantities A and A ′ of one of the particles, and B and B ′ of the other<br />

particle. Measurement of A allows us to do a certain prediction concerning B of the other particle,<br />

measurement of A ′ allows us, analogously, to make a certain prediction concerning B ′ of the other<br />

particle.<br />

Einstein admits that these two measurements cannot be carried out simultaneously. But we can<br />

choose which measurement we perform while the other particle is very far away. It is not reasonable,<br />

EPR argue, that this other particle will be influenced by this choice. This means that although<br />

only one of the two predictions concerning the other particle can be made with certainty, both predictions<br />

are, at the same time, true, corresponding to properties of the other particle, i.e., to ‘elements of<br />

physical reality’.<br />

IV. 3. 4<br />

HEISENBERG, BOHR AND EINSTEIN, PODOLSKY AND ROSEN<br />

According to Heisenberg, measurement has an essential influence. Some properties of the particle<br />

become sharp, others fuzzy. If this consequence of measurement were understood to be a physical<br />
interaction, it would suggest the following ‘natural’ requirement of locality (M.L.G. Redhead 1987, p. 77):<br />

An unsharp value for an observable cannot be changed into a sharp value by measurements<br />

performed at a distance.<br />

But the analysis of EPR shows that, the particles being far removed from each other, this requirement<br />

has not been met, making Heisenberg’s interpretation much less physically pictorial than it seemed to<br />

be initially. The natural requirement of locality in Bohr’s interpretation reads (loc. cit.)<br />

A previously undefined value for an observable cannot be defined by measurements performed<br />

‘at a distance’.<br />

This requirement has also not been fulfilled.<br />

Bohr’s answer to EPR, and his rejection of the incompleteness claim, amounts to the notion that<br />

the aforementioned requirement of locality can be violated without implying the existence of superluminal<br />

physical effects. The ‘defining’ functioning of measuring apparatuses is not a process that<br />

propagates in space and time and by means of some interaction disturbs particles that are not measured,<br />

or creates values for properties in those particles. It concerns an epistemological role of the<br />

measuring apparatuses. The measuring apparatuses measuring one of a pair of correlated particles<br />

define which classical terms apply to both particles.


IV. 4. NEUTRON INTERFEROMETRY 93<br />

If the position of one of the particles is measured, we are dealing with a phenomenon in which<br />

the term position is applicable. Thus, on the basis of the correlation between these particles the term<br />

‘position’ is also applicable to the other particle. If the position of one of the particles is measured,<br />

a ‘position perspective’ is opened, so to speak, to the world. Likewise, measurement of momentum<br />

on one of the particles makes the other particle accessible to a description with the term ‘momentum’.<br />

Even though there is no physical intervention on this particle, it is still not permitted to speak about<br />

the particles having these properties outside the context of a phenomenon. Therefore, Bohr rejects<br />

Einstein’s reasoning that the other particle, not being disturbed by the measurement, consequently<br />

also possesses the properties ‘position’ and ‘momentum’ independent of measurement.<br />

In fact, this same reasoning can be applied to the double slit experiment, as Bohr showed<br />
in his answer to Einstein. In this experiment we also have a choice: either to do a measurement of<br />
momentum on the screen and in this way determine which path the particle has taken, thereby losing the<br />
interference pattern, or to measure its position, thereby recovering the interference pattern. But<br />

Bohr writes<br />

As repeatedly stressed, the principal point is here that such measurements demand mutually<br />

exclusive experimental arrangements.<br />

IV. 4<br />

NEUTRON INTERFEROMETRY<br />

Nowadays, a variant of the thought experiment with the double slit can be carried out in<br />
the laboratory using a neutron interferometer. A neutron interferometer consists of a massive perfect<br />
silicon crystal, usually with dimensions of approximately 10 × 10 × 50 cm³. After cutting large<br />

notches in the crystal, a basis with upstanding teeth remains, see figure IV. 4.<br />

Figure IV. 4: Several perfect crystal neutron interferometers (Rauch and Werner 2000 )



Using an interferometer with three upstanding teeth, a monochromatic beam of neutrons with<br />

a de Broglie wavelength of approximately 1 Å now hits the first tooth of this crystal. The crystal<br />

lattice acts like a grid and lets the beam pass in very sharply defined directions. Under suitable<br />

conditions there are exactly two emanating beams, one transmitted (T) and one reflected (R), as<br />

shown in figure IV. 5 a.<br />

At the second tooth this process is repeated, and both beams are again split up. Two of them are<br />

now outside the interferometer where they are screened, no longer participating. The remaining two<br />

beams are bent towards each other and meet at the third tooth. Here, both beams are split up again,<br />

and now the transmitted beam of one path is superimposed on the reflected beam of the<br />

other path. Neutron detectors are placed in both emanating beams.<br />

[Figure IV. 5, panels: a) a sketch of the setup, showing paths 1 and 2, the transmitted (T) and reflected (R) beams and the detectors A and B; b) the experimental results (Rauch and Werner 2000)]<br />

Figure IV. 5: The interference pattern in the neutron interferometer is acquired by measuring the<br />

intensity in the detectors at a variable optical path length difference.<br />

If the incoming beam comes from below, and the beams are not manipulated, all neutrons turn out<br />

to end up in the upper beam at detector A, undergoing constructive interference, while the neutrons<br />

in the lower beam extinguish each other. For this phenomenon it is essential that the interferometer<br />

consists of only one crystal, for in that case the waves remain coherent even though, along the way,<br />

the beams have been separated by ‘macroscopic distances’, approximately 5 cm or $\simeq 10^{9}\,\lambda$. When a<br />
neutron has arrived in a detector it can have traveled along either of the two paths.<br />

Upon introducing a phase difference between the two paths by sliding a small piece of aluminium<br />

of variable thickness in one of the paths, the intensity shifts from the upper to the lower detector. This<br />

intensity is a periodic function of the thickness of the piece of aluminium, see figure IV. 5 b. This is<br />

the interference pattern.<br />

Now the question is if we can, in some way, uncover along which path the particle has traveled.<br />

Following Bohr’s line of thought this should be possible by sawing off one of the teeth and measuring<br />

the recoil it receives from the neutron. Such an experiment can, however, not be carried out with the<br />

required experimental exactitude.<br />

Another option is to make use of the fact that the neutron is a spin 1/2 particle and therefore has<br />

an internal degree of freedom. We can carry out such an experiment with a polarized beam, where



all neutrons have, at entry in the interferometer, spin up in the z-direction. We place the complete<br />
setup in a homogeneous magnetic field which ensures that spin up and spin down differ in<br />
energy by $\hbar\omega_0$. In one of the paths we place a ‘spin flipper’, a small coil through which an alternating<br />
current runs having exactly the resonance frequency $\omega_0$. At a suitable choice of the length of the<br />

coil the spin of every neutron which travels through it will be flipped over. Subsequently, we place<br />

spin analyzers in front of the detectors, so that we can not only observe in which emanating beam the<br />

neutron is located but also its spin in the z - direction.<br />

In this setup we can therefore uncover exactly along which path the particle has traveled; spin up<br />

means the path without the spin flipper has been chosen, spin down means the neutron traveled along<br />

the path with the spin flipper. But in this setup no more interference is seen! The intensity is equal in<br />

both detectors and independent of the phase difference.<br />

We can describe this as follows. The wavepath function $|\varphi_0\rangle \in L^2(\mathbb{R}^3)$ of an emanating neutron<br />
consists of four terms,<br />
\[ |\varphi_0\rangle = \tfrac12 \bigl( |\varphi_{1A}\rangle + |\varphi_{1B}\rangle + e^{i\chi}\,|\varphi_{2A}\rangle + e^{i\chi}\,|\varphi_{2B}\rangle \bigr). \tag{IV. 14} \]<br />

Here $\varphi_{iA}$ and $\varphi_{iB}$ represent the wave functions ending up in the detectors A and B, respectively; 1<br />
and 2 refer to the two possible paths through the interferometer, as can be seen in figure IV. 5 a. The<br />
factor $e^{i\chi}$ corresponds to the phase shift by the aluminium. If $\chi = 0$, there is maximum constructive<br />
interference in A and total destructive interference in B, from which it follows that<br />
\[ |\varphi_{1A}\rangle = |\varphi_{2A}\rangle \quad\text{and}\quad |\varphi_{1B}\rangle = -\,|\varphi_{2B}\rangle. \tag{IV. 15} \]<br />

The intensity in detector A is given by the expectation value of a projection $P_A$, where<br />
$P_A |\varphi_{iA}\rangle = |\varphi_{iA}\rangle$ and $P_A |\varphi_{iB}\rangle = 0$, analogously for $P_B$. Therefore, we find for the intensity $I_A$<br />
of the neutron beam that encounters detector A, quantum mechanically expressed as the probability<br />
to find a neutron in detector A,<br />
\[ I_A = \langle\varphi_0| P_A |\varphi_0\rangle = \tfrac14 \bigl( \langle\varphi_{1A}| + \langle\varphi_{2A}|\, e^{-i\chi} \bigr) \bigl( |\varphi_{1A}\rangle + e^{i\chi}\,|\varphi_{2A}\rangle \bigr) = \tfrac12 (1 + \cos\chi), \tag{IV. 16} \]<br />
and likewise for $I_B$,<br />
\[ I_B = \langle\varphi_0| P_B |\varphi_0\rangle = \tfrac14 \bigl( \langle\varphi_{1B}| + \langle\varphi_{2B}|\, e^{-i\chi} \bigr) \bigl( |\varphi_{1B}\rangle + e^{i\chi}\,|\varphi_{2B}\rangle \bigr) = \tfrac12 (1 - \cos\chi). \tag{IV. 17} \]<br />

In this experiment the neutrons are polarized, therefore we can add the spin state to the wavepath<br />

function and thus get a Pauli spinor,<br />

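\[ |\varphi_{i,\mathrm{tot}}\rangle = |\varphi_0\rangle \otimes |z\uparrow\rangle = \varphi(\vec q) \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} \varphi(\vec q) \\ 0 \end{pmatrix} \in L^2(\mathbb{R}^3) \otimes \mathbb{C}^2. \tag{IV. 18} \]<br />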

The functioning of the spin flipper, which we assume to be completely ideal, can now be described as<br />

follows. The component of the state traveling along path 1 does not meet a spin flipper, which means



that it remains unaltered, and we have, leaving out the cartwheels ⊗,<br />

|ϕ 1A ⟩ |z ↑⟩ → |ϕ 1A ⟩ |z ↑⟩ and |ϕ 1B ⟩ |z ↑⟩ → |ϕ 1B ⟩ |z ↑⟩, (IV. 19)<br />

whereas for the components traveling along path 2 the spin direction reverses,<br />

|ϕ 2A ⟩ |z ↑⟩ → |ϕ 2A ⟩ |z ↓⟩ and |ϕ 2B ⟩ |z ↑⟩ → |ϕ 2B ⟩ |z ↓⟩. (IV. 20)<br />

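Therefore, the total final state is<br />
\[ |\varphi_{f,\mathrm{tot}}\rangle = \tfrac12 \bigl( |\varphi_{1A}\rangle |z\uparrow\rangle + |\varphi_{1B}\rangle |z\uparrow\rangle + e^{i\chi}\,|\varphi_{2A}\rangle |z\downarrow\rangle + e^{i\chi}\,|\varphi_{2B}\rangle |z\downarrow\rangle \bigr), \tag{IV. 21} \]<br />
which means that for the intensity we have<br />
\[ I_A = \langle\varphi_{f,\mathrm{tot}}|\, P_A \otimes \mathbb{1}\, |\varphi_{f,\mathrm{tot}}\rangle = \tfrac14 \bigl( \langle\varphi_{1A}| \langle z\uparrow| + e^{-i\chi}\,\langle\varphi_{2A}| \langle z\downarrow| \bigr) \bigl( |\varphi_{1A}\rangle |z\uparrow\rangle + e^{i\chi}\,|\varphi_{2A}\rangle |z\downarrow\rangle \bigr) = \tfrac12, \tag{IV. 22} \]<br />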

and likewise for I B . We see that, because of the orthogonality of the spin states |z ↑⟩ and |z ↓⟩, the<br />

interference term disappears.<br />

With the neutron interferometer we can also illustrate the fact that there is always freedom of<br />

choice because we can, instead of analyzers for spin in the z-direction, place analyzers for spin in<br />
the x-direction.<br />
The eigenvectors for spin in the x-direction are superpositions of those in the z-direction, see<br />
section III. 6, equations (III. 145) and (III. 146),<br />
\[ |x\uparrow\rangle = \tfrac12\sqrt2\, \bigl( |z\uparrow\rangle + |z\downarrow\rangle \bigr) \quad\text{and}\quad |x\downarrow\rangle = \tfrac12\sqrt2\, \bigl( |z\downarrow\rangle - |z\uparrow\rangle \bigr). \tag{IV. 23} \]<br />

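We can calculate the probability to find, e.g., a neutron with spin in the negative x-direction in detector<br />
A, as the expectation value of the projector $P_A \otimes |x\downarrow\rangle\langle x\downarrow|$ in the state $|\varphi_{f,\mathrm{tot}}\rangle$, (IV. 21),<br />
\[ \begin{aligned} \langle\varphi_{f,\mathrm{tot}}| \bigl( P_A \otimes |x\downarrow\rangle\langle x\downarrow| \bigr) |\varphi_{f,\mathrm{tot}}\rangle = \tfrac14 \Bigl( & \langle\varphi_{1A}| P_A |\varphi_{1A}\rangle\, \langle z\uparrow | x\downarrow\rangle \langle x\downarrow | z\uparrow\rangle + e^{i\chi}\, \langle\varphi_{1A}| P_A |\varphi_{2A}\rangle\, \langle z\uparrow | x\downarrow\rangle \langle x\downarrow | z\downarrow\rangle \\ & + e^{-i\chi}\, \langle\varphi_{2A}| P_A |\varphi_{1A}\rangle\, \langle z\downarrow | x\downarrow\rangle \langle x\downarrow | z\uparrow\rangle + \langle\varphi_{2A}| P_A |\varphi_{2A}\rangle\, \langle z\downarrow | x\downarrow\rangle \langle x\downarrow | z\downarrow\rangle \Bigr) = \tfrac14 (1 - \cos\chi), \end{aligned} \tag{IV. 24} \]<br />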

and we see interference again.<br />
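The three results (IV. 16)/(IV. 17), (IV. 22) and (IV. 24) can be checked with a small toy model. The sketch below is not part of the original derivation: it assumes that the components $|\varphi_{1A}\rangle = |\varphi_{2A}\rangle$ and $|\varphi_{1B}\rangle = -|\varphi_{2B}\rangle$ of (IV. 15) may be represented by two orthonormal unit vectors labelled 'A' and 'B', and all function names are hypothetical.

```python
import cmath
import math

def amplitudes(chi, flipper):
    """Amplitudes of the final state in the basis (detector, z-spin).

    Toy model: |phi_1A> = |phi_2A> -> unit vector 'A', |phi_1B> = -|phi_2B> -> unit vector 'B'.
    """
    phase = cmath.exp(1j * chi)
    amp = {(d, s): 0j for d in "AB" for s in ("up", "down")}
    spin_path2 = "down" if flipper else "up"   # an ideal flipper reverses the spin on path 2
    amp[("A", "up")] += 0.5                    # path 1 -> detector A
    amp[("B", "up")] += 0.5                    # path 1 -> detector B
    amp[("A", spin_path2)] += 0.5 * phase      # path 2 -> detector A
    amp[("B", spin_path2)] -= 0.5 * phase      # path 2 -> detector B, sign from (IV. 15)
    return amp

def intensity(amp, detector):
    """Probability to find the neutron in the given detector, any spin."""
    return sum(abs(a) ** 2 for (d, _), a in amp.items() if d == detector)

def prob_x_down(amp, detector):
    """Probability for the given detector with spin down along x, using (IV. 23)."""
    a = (amp[(detector, "down")] - amp[(detector, "up")]) / math.sqrt(2)
    return abs(a) ** 2

if __name__ == "__main__":
    chi = 0.7
    print("no flipper, I_A:", intensity(amplitudes(chi, False), "A"), 0.5 * (1 + math.cos(chi)))
    print("flipper,    I_A:", intensity(amplitudes(chi, True), "A"))
    print("flipper, x-down in A:", prob_x_down(amplitudes(chi, True), "A"), 0.25 * (1 - math.cos(chi)))
```

With the spin flipper in place the z-analysis gives an intensity independent of χ, whereas the x-analysis of the same state shows the χ-dependence again.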

EXERCISE 28. Verify the calculations (IV. 22) and (IV. 24).<br />

In this case we can also choose whether we measure spin in the x-direction or in the z-direction<br />

long after the neutron has left the interferometer, which means that the neutron seems to make the<br />

choice whether to take one of the paths through the interferometer, or to show interference between


IV. 5. THE UNCERTAINTY RELATIONS 97<br />

both paths, after it has left the interferometer. J.A. Wheeler (1978) called such experiments delayed-choice<br />
experiments. Outcomes of measurements in the future seem to determine what has happened<br />
in the past!<br />

Actual confirmation of this freedom of choice was not obtained until 2007, when a group in<br />
Cachan, France, succeeded in carrying out such an experiment using linearly polarized single photons,<br />
a 48 m interferometer and two beamsplitters. In their article (Jacques 2007) they conclude that<br />

Our realization of Wheeler’s delayed-choice gedanken experiment demonstrates that<br />
the behavior of the photon in the interferometer depends on the choice of the observable<br />
that is measured, even when that choice is made at a position and a time such that it is<br />
separated from the entrance of the photon into the interferometer by a space-like interval.<br />

EXERCISE 29. Give, concisely, Bohr’s view on such experiments.<br />

IV. 5<br />

THE UNCERTAINTY RELATIONS<br />

IV. 5. 1<br />

INTRODUCTION<br />

Heisenberg’s original reasoning concerning the uncertainty principle resulted in ‘approximate<br />
inequalities’ for position q and momentum p, and for energy E and time t, of the form<br />
\[ \delta q\,\delta p \sim h \quad\text{and}\quad \delta E\,\delta t \sim h. \tag{IV. 25} \]<br />
In this section we will focus on the mathematical meaning of $\delta q$, $\delta p$, $\delta E$ and $\delta t$ and their interpretation.<br />

In his first article, Heisenberg (1927) gives the Gaussian wave packet as the only quantitative<br />

example. Its Fourier transform is also Gaussian and the widths of these packets are inversely proportional<br />

to each other, a general result of Fourier analysis. A suitable definition of these widths<br />

yields $q_1 p_1 = h$, where $q_1$ and $p_1$ represent the widths in question. Still in the same year E.H. Kennard<br />
derived the following general inequality,<br />
\[ \Delta_\psi Q\; \Delta_\psi P \geqslant \tfrac12 \hbar, \tag{IV. 26} \]<br />
where $\Delta_\psi Q$ and $\Delta_\psi P$ are standard deviations of Q and P in $\psi \in L^2(\mathbb{R})$. In his Chicago lectures,<br />

Heisenberg (1930) considers the Kennard inequality (IV. 26) as the mathematical expression of the<br />

uncertainty principle. We will criticize this still widespread conception shortly, and give a derivation<br />

of the ‘standard uncertainty inequalities’, which are a generalization of the Kennard inequality.<br />

◃ Remark<br />

In his discussions of the uncertainty principle, Bohr exclusively makes use of relations of the<br />

type (IV. 25). ▹



IV. 5. 2<br />

THE STANDARD UNCERTAINTY RELATIONS<br />

If $\psi \in L^2(\mathbb{R})$ is the normalized wave function of a physical system in the q-language,<br />
with $\|\psi\| = 1$, the wave function $\tilde\psi(p)$ in the p-language is its Fourier transform<br />
\[ \tilde\psi(p) = \frac{1}{\sqrt{2\pi\hbar}} \int_{\mathbb{R}} e^{-ipq/\hbar}\, \psi(q)\, dq, \tag{IV. 27} \]<br />
and the inverse Fourier transform is<br />
\[ \psi(q) = \frac{1}{\sqrt{2\pi\hbar}} \int_{\mathbb{R}} e^{ipq/\hbar}\, \tilde\psi(p)\, dp. \tag{IV. 28} \]<br />
The norm is invariant under Fourier transformations, therefore $\|\tilde\psi\| = 1$.<br />

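The standard deviation of position in a state $|\psi\rangle$, $\Delta_\psi Q$, is defined as<br />
\[ (\Delta_\psi Q)^2 = \langle Q^2\rangle_\psi - \langle Q\rangle_\psi^2 = \int_{\mathbb{R}} q^2\, |\psi(q)|^2\, dq - \Bigl( \int_{\mathbb{R}} q\, |\psi(q)|^2\, dq \Bigr)^2. \tag{IV. 29} \]<br />
Likewise, for momentum, $\Delta_\psi P$, we have<br />
\[ \begin{aligned} (\Delta_\psi P)^2 = \langle P^2\rangle_\psi - \langle P\rangle_\psi^2 &= -\hbar^2 \int_{\mathbb{R}} \psi^*(q)\, \frac{d^2\psi(q)}{dq^2}\, dq - \Bigl( -i\hbar \int_{\mathbb{R}} \psi^*(q)\, \frac{d\psi(q)}{dq}\, dq \Bigr)^2 \\ &= \int_{\mathbb{R}} p^2\, |\tilde\psi(p)|^2\, dp - \Bigl( \int_{\mathbb{R}} p\, |\tilde\psi(p)|^2\, dp \Bigr)^2. \end{aligned} \tag{IV. 30} \]<br />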

Without loss of generality we can assume $\langle P\rangle$ and $\langle Q\rangle$ to equal 0, so that<br />
\[ (\Delta_\psi P)^2 = -\hbar^2 \int_{\mathbb{R}} \psi^*(q)\, \frac{d^2\psi(q)}{dq^2}\, dq = \int_{\mathbb{R}} p^2\, |\tilde\psi(p)|^2\, dp. \tag{IV. 31} \]<br />
If the wave function $\psi(q)$ is a Gaussian wave packet, the product $\Delta_\psi Q\, \Delta_\psi P$ takes on the minimum value<br />
of $\tfrac12\hbar$. An example is the ground state of the one-dimensional harmonic oscillator having mass m,<br />
\[ \varphi_0(q) = \Bigl( \frac{m\,\omega_0}{\pi\hbar} \Bigr)^{1/4} e^{-m\,\omega_0\, q^2 / 2\hbar}, \tag{IV. 32} \]<br />
with energy $E_0 = \tfrac12 \hbar\,\omega_0$.<br />
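That the Gaussian (IV. 32) saturates the Kennard inequality can be checked numerically. The sketch below is an illustration under stated assumptions: it evaluates (IV. 29) by a simple Riemann sum and replaces (IV. 31) by the equivalent form $\hbar^2 \int |d\psi/dq|^2\, dq$, obtained by one integration by parts for a wave function vanishing at infinity. The function name, grid size and parameter values are choices made here.

```python
import math

def gaussian_uncertainty_product(m=1.0, omega=1.0, hbar=1.0, n=20001, span=10.0):
    """Return Delta Q * Delta P for the oscillator ground state (IV. 32) on a finite grid."""
    sigma = math.sqrt(hbar / (m * omega))        # natural length scale of the packet
    half_width = span * sigma                    # grid covers [-span*sigma, +span*sigma]
    dq = 2 * half_width / (n - 1)
    norm = (m * omega / (math.pi * hbar)) ** 0.25

    def psi(q):
        return norm * math.exp(-m * omega * q * q / (2 * hbar))

    qs = [-half_width + i * dq for i in range(n)]
    # (IV. 29) with <Q> = 0
    var_q = sum(q * q * psi(q) ** 2 for q in qs) * dq
    # (IV. 31) after integration by parts; derivative by central differences
    var_p = hbar ** 2 * sum(((psi(q + dq) - psi(q - dq)) / (2 * dq)) ** 2 for q in qs) * dq
    return math.sqrt(var_q * var_p)

if __name__ == "__main__":
    print(gaussian_uncertainty_product())                           # close to hbar / 2 = 0.5
    print(gaussian_uncertainty_product(m=2.0, omega=3.0, hbar=0.5)) # close to hbar / 2 = 0.25
```

The product depends only on ħ, not on m or ω₀, as expected for a minimum-uncertainty packet.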

Before interpreting the Kennard inequality (IV. 26), we give a still more general inequality, derived<br />
by Schrödinger (1930). Consider two arbitrary self-adjoint operators A and B acting on a Hilbert<br />
space $\mathcal{H}$. Define, for a pure state $|\psi\rangle \in \mathcal{H}$, the following operators:<br />
\[ A_\psi := A - \langle A\rangle_\psi\, \mathbb{1} \quad\text{and}\quad B_\psi := B - \langle B\rangle_\psi\, \mathbb{1}. \tag{IV. 33} \]<br />
The expectation values of these operators are, in the state $|\psi\rangle$, equal to 0,<br />
\[ \langle A_\psi\rangle_\psi = \langle B_\psi\rangle_\psi = 0. \tag{IV. 34} \]



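The Cauchy-Schwarz inequality (II. 12), p. 19, for the vectors $A_\psi |\psi\rangle$ and $B_\psi |\psi\rangle$ reads<br />
\[ \langle A_\psi \psi \,|\, A_\psi \psi\rangle\, \langle B_\psi \psi \,|\, B_\psi \psi\rangle \geqslant \bigl| \langle A_\psi \psi \,|\, B_\psi \psi\rangle \bigr|^2. \tag{IV. 35} \]<br />
Because $A_\psi$ and $B_\psi$ are self-adjoint, we can also write this inequality as follows,<br />
\[ \langle A_\psi^2\rangle_\psi\, \langle B_\psi^2\rangle_\psi \geqslant \bigl| \langle A_\psi B_\psi\rangle_\psi \bigr|^2. \tag{IV. 36} \]<br />
Using both the commutator $[\cdot\,, \cdot]_-$ and the anti-commutator $[\cdot\,, \cdot]_+$, we find for the right-hand side<br />
of (IV. 36)<br />
\[ \begin{aligned} \bigl| \langle A_\psi B_\psi\rangle_\psi \bigr|^2 &= \bigl| \tfrac12 \langle [A_\psi, B_\psi]_-\rangle_\psi + \tfrac12 \langle [A_\psi, B_\psi]_+\rangle_\psi \bigr|^2 \\ &= \tfrac14 \bigl| \langle [A_\psi, B_\psi]_-\rangle_\psi \bigr|^2 + \tfrac14 \langle [A_\psi, B_\psi]_+\rangle_\psi^2, \end{aligned} \tag{IV. 37} \]<br />
where the cross-term disappears because of<br />
\[ \langle [A_\psi, B_\psi]_-\rangle_\psi^* = -\,\langle [A_\psi, B_\psi]_-\rangle_\psi, \qquad \langle [A_\psi, B_\psi]_+\rangle_\psi^* = +\,\langle [A_\psi, B_\psi]_+\rangle_\psi. \tag{IV. 38} \]<br />
Furthermore,<br />
\[ [A_\psi, B_\psi]_- = [A, B]_-, \tag{IV. 39} \]<br />
and we obtain the inequality<br />
\[ \langle A_\psi^2\rangle_\psi\, \langle B_\psi^2\rangle_\psi \geqslant \tfrac14 \bigl| \langle [A, B]_-\rangle_\psi \bigr|^2 + \tfrac14 \langle [A_\psi, B_\psi]_+\rangle_\psi^2. \tag{IV. 40} \]<br />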
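A quick numerical check of (IV. 40) can be made with random self-adjoint matrices. The sketch below is only an illustration with hypothetical helper names; it works in a three-dimensional Hilbert space, because in two dimensions the vectors $A_\psi|\psi\rangle$ and $B_\psi|\psi\rangle$ are both orthogonal to $|\psi\rangle$, hence parallel, so that (IV. 35) would always hold with equality.

```python
import random

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def expect(X, psi):
    """Expectation value <psi|X|psi> for a normalized state psi."""
    n = len(psi)
    return sum(psi[i].conjugate() * X[i][j] * psi[j] for i in range(n) for j in range(n))

def random_hermitian(n, rng):
    M = [[0j] * n for _ in range(n)]
    for i in range(n):
        M[i][i] = complex(rng.uniform(-1, 1), 0.0)
        for j in range(i + 1, n):
            z = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
            M[i][j], M[j][i] = z, z.conjugate()
    return M

def random_state(n, rng):
    v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    norm = sum(abs(c) ** 2 for c in v) ** 0.5
    return [c / norm for c in v]

def centred(X, psi):
    """X_psi = X - <X>_psi * identity, as in (IV. 33)."""
    mean = expect(X, psi).real
    n = len(psi)
    return [[X[i][j] - (mean if i == j else 0.0) for j in range(n)] for i in range(n)]

def uncertainty_sides(A, B, psi):
    """Return (left-hand side, Schroedinger bound (IV. 40), Robertson bound (IV. 41))."""
    A0, B0 = centred(A, psi), centred(B, psi)
    AB, BA = matmul(A0, B0), matmul(B0, A0)
    lhs = expect(matmul(A0, A0), psi).real * expect(matmul(B0, B0), psi).real
    commutator = expect(AB, psi) - expect(BA, psi)           # purely imaginary, see (IV. 38)
    anticommutator = (expect(AB, psi) + expect(BA, psi)).real
    robertson = 0.25 * abs(commutator) ** 2
    schroedinger = robertson + 0.25 * anticommutator ** 2
    return lhs, schroedinger, robertson

if __name__ == "__main__":
    rng = random.Random(1)
    A, B, psi = random_hermitian(3, rng), random_hermitian(3, rng), random_state(3, rng)
    print(uncertainty_sides(A, B, psi))
```

For every sampled triple the three numbers should be ordered as left-hand side ⩾ Schrödinger bound ⩾ Robertson bound.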

In view of the inequalities (IV. 26) and (IV. 40), we make a few remarks.<br />

(i) Leaving out the last term on the right-hand side of inequality (IV. 40) gives the better known<br />
but weaker inequality, derived by H.P. Robertson (1929),<br />
\[ \langle A_\psi^2\rangle_\psi\, \langle B_\psi^2\rangle_\psi \geqslant \tfrac14 \bigl| \langle [A, B]_-\rangle_\psi \bigr|^2. \tag{IV. 41} \]<br />
(ii) Notice that $\langle A_\psi^2\rangle_\psi$ is equal to the square of the standard deviation of the quantity A in the<br />
state $|\psi\rangle$,<br />
\[ \langle A_\psi^2\rangle_\psi = \langle (A - \langle A\rangle_\psi)^2 \rangle_\psi = (\Delta_\psi A)^2. \tag{IV. 42} \]<br />
(iii) For the special case A = Q and B = P, the Robertson inequality (IV. 41) reduces to the<br />
Kennard inequality (IV. 26), and the expressions (IV. 29) and (IV. 31) correspond to $\langle Q_\psi^2\rangle_\psi$ in<br />
the q-language and $\langle P_\psi^2\rangle_\psi$ in the p-language.<br />

(iv) Notice that in deriving these uncertainty relations the interpretation of the uncertainties plays<br />

no role.



(v) An objection to the Robertson inequality (IV. 41) and the Schrödinger inequality (IV. 40) is that<br />
the right-hand side depends on the state; therefore, it is no absolute lower limit for all states.<br />
If $|\psi\rangle$ is an eigenstate of A, the right-hand side of the Robertson inequality (IV. 41) is 0 and<br />
does not provide any restriction on $\Delta B$. Therefore, even if A and B are never both<br />
sharp in the same state, i.e., they do not have simultaneous eigenstates, this does not follow<br />
from the inequality (IV. 41).<br />
Only if the right-hand side of inequality (IV. 41) is unequal to zero for all states does the Robertson<br />
inequality represent the uncertainty principle. This is the case if the commutator is a multiple<br />
of unity, as in the case of P and Q, where $[P, Q] = -i\hbar\,\mathbb{1}$, see p. 78, (IV. 1). It can, however, be<br />
proved that this canonical commutation relation can only apply to unbounded operators<br />
having no eigenstates in the, inevitably infinite-dimensional, Hilbert space in which they act.<br />

(vi) Already in 1929 E.U. Condon pointed out the following facts (Jammer 1974, p. 71). In certain<br />
states, non-commuting operators can both be sharp. Take, for example, the ground state of the<br />
H-atom, or any stationary state with total angular momentum l = 0. This is also an eigenstate<br />
of $L_x$, $L_y$ and $L_z$ with eigenvalue 0. Therefore, $\Delta L_x\, \Delta L_y = 0$, and likewise for $L_x$ and $L_z$,<br />
and for $L_y$ and $L_z$, although these operators do not mutually commute. Therefore, the fact that<br />
operators do not commute does not guarantee an uncertainty relation. Furthermore, sometimes<br />
both standard deviations are nonzero although the expectation value of the commutator vanishes. Take again a stationary state of the H-atom,<br />
with l = 1 and m = 0. In that state $\langle [L_x, L_y] \rangle = 0$, whereas $\Delta L_x \neq 0$ and $\Delta L_y \neq 0$.<br />
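Condon's second example can be made concrete with the standard 3 × 3 matrices for l = 1. The sketch below assumes units with ħ = 1 and the basis ordering m = +1, 0, −1; the helper names are hypothetical. (The l = 0 example is trivial in this representation, since all three operators are then the 1 × 1 zero matrix.)

```python
import math

S = 1 / math.sqrt(2)
# Angular momentum matrices for l = 1 (hbar = 1), basis ordered as m = +1, 0, -1.
LX = [[0, S, 0], [S, 0, S], [0, S, 0]]
LY = [[0, -1j * S, 0], [1j * S, 0, -1j * S], [0, 1j * S, 0]]
LZ = [[1, 0, 0], [0, 0, 0], [0, 0, -1]]
PSI_M0 = [0, 1, 0]   # the state with l = 1, m = 0

def apply(M, v):
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

def inner(u, v):
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def spread(M, psi):
    """Standard deviation of the self-adjoint matrix M in the normalized state psi."""
    m_psi = apply(M, psi)
    mean = inner(psi, m_psi).real
    return math.sqrt(max(inner(m_psi, m_psi).real - mean ** 2, 0.0))

def commutator_expectation(X, Y, psi):
    """<psi|[X, Y]|psi> for self-adjoint X and Y."""
    x_psi, y_psi = apply(X, psi), apply(Y, psi)
    return inner(x_psi, y_psi) - inner(y_psi, x_psi)

if __name__ == "__main__":
    print("<[Lx, Ly]> =", commutator_expectation(LX, LY, PSI_M0))
    print("Delta Lx =", spread(LX, PSI_M0), " Delta Ly =", spread(LY, PSI_M0))
```

In this state the right-hand side of the Robertson inequality (IV. 41) vanishes while neither standard deviation does, so the inequality is satisfied but carries no information.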

In conclusion, there are fundamental objections against accepting the Schrödinger inequality, and<br />

by implication against the weaker inequalities which follow from it, to be the mathematical expression<br />

of Heisenberg’s uncertainty principle.<br />

And this is not everything yet.<br />

IV. 5. 3<br />

SINGLE SLIT EXPERIMENT<br />

Relations (IV. 26) and (IV. 41) are considered to be the mathematical expression of the uncertainty<br />
principle in most textbooks on quantum mechanics. In addition to the previous criticism, we<br />
will show that this view is, remarkably enough, also inconsistent with the experiments used as illustrations<br />

of this principle (Uffink and Hilgevoord 1985, 1988 and Hilgevoord and Uffink 1988, 1990).<br />

Consider the deflection of light, or of electrons, by a single slit in an absorbing screen, an example<br />

Heisenberg also gives. Take for the wave function representing the particles passing through the<br />

screen with the slit a simple square wave function, see figure IV. 6,<br />
\[ \psi_{ss}(q) = \begin{cases} \dfrac{1}{\sqrt{2a}} & \text{if } |q| \leqslant a \\[1ex] 0 & \text{elsewhere} \end{cases}, \tag{IV. 43} \]<br />
where $2a \in \mathbb{R}^+$ is the width of the slit, and q the Cartesian coordinate parallel to the screen and<br />
perpendicular to the slit.



[Figure IV. 6 graphic: plot of $|\psi_{ss}(q)|^2$, with the width $2a$ marked]<br />
Figure IV. 6: The probability distribution in position for a slit of width $2a$

The Fourier transform of ψ ss is<br />

\[ \tilde\psi_{ss}(p) = \sqrt{\frac{a}{\pi\hbar}}\; \frac{\sin(ap/\hbar)}{ap/\hbar}. \tag{IV. 44} \]<br />
The squared modulus of this wave function, $|\tilde\psi_{ss}(p)|^2$, has the same form as the diffraction pattern for the slit<br />

which is formed on a photographic plate placed far away, see figure IV. 7.<br />

[Figure IV. 7 graphic: plot of $|\tilde\psi_{ss}(p)|^2$, with the marking 2π/a on the central peak]<br />
Figure IV. 7: The diffraction pattern for a small slit of width $2a$

For the standard deviation of position and momentum in the state ψ ss we find<br />

\[ (\Delta_{\psi_{ss}} Q)^2 = \int_{\mathbb{R}} q^2\, |\psi_{ss}(q)|^2\, dq = \frac{1}{2a} \int_{-a}^{+a} q^2\, dq = \tfrac13 a^2 \tag{IV. 45} \]<br />
and<br />
\[ (\Delta_{\psi_{ss}} P)^2 = \int_{\mathbb{R}} p^2\, |\tilde\psi_{ss}(p)|^2\, dp = \frac{\hbar}{\pi a} \int_{\mathbb{R}} \sin^2(ap/\hbar)\, dp = \infty, \tag{IV. 46} \]<br />
yielding<br />
\[ \Delta_{\psi_{ss}} Q\; \Delta_{\psi_{ss}} P = \tfrac13\sqrt3\, a \cdot \infty. \tag{IV. 47} \]<br />

This indeed satisfies the Kennard inequality (IV. 26), but in a rather uninteresting manner.<br />
Although $\Delta_{\psi_{ss}} P = \infty$, the function $|\tilde\psi_{ss}|^2$ has in fact a very pronounced central peak, of a width<br />
of the order $a^{-1}$, in which 95% of the total probability is located. It is the inverse proportionality<br />
of the width of this central peak to the width of the slit, which, according to Heisenberg, illustrates<br />
the uncertainty principle; it is impossible to make the probability densities $|\psi_{ss}(q)|^2$ and $|\tilde\psi_{ss}(p)|^2$<br />
arbitrarily narrow at the same time.<br />

But this conclusion cannot be inferred from the Kennard inequality (IV. 26). If a goes to infinity,<br />
$|\tilde\psi_{ss}(p)|^2$ goes to the delta function $\delta(p)$. The standard deviation $\Delta_{\psi_{ss}} P$, however, remains<br />
divergent. In other words, 95% of a probability distribution can be concentrated on an arbitrarily<br />
small interval, whereas the standard deviation of the distribution remains arbitrarily large.² If nothing<br />
is given concerning the distributions $|\psi_{ss}(q)|^2$ and $|\tilde\psi_{ss}(p)|^2$ but the Kennard inequality (IV. 26),<br />
these distributions could both be very narrow, and, consequently, Heisenberg’s conclusion cannot be<br />
derived from the Kennard inequality, in contrast to what is usually claimed.<br />
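Both statements, a sharply peaked momentum distribution and a divergent standard deviation, can be seen numerically from (IV. 44) and (IV. 46). The sketch below is an illustration with hypothetical function names, using a plain midpoint rule and, by default, units with ħ = a = 1. It integrates $|\tilde\psi_{ss}(p)|^2$ over $|p| \leqslant k\pi\hbar/a$ and evaluates the second moment with a finite cutoff.

```python
import math

def momentum_density(p, a=1.0, hbar=1.0):
    """|psi~_ss(p)|^2 from (IV. 44)."""
    x = a * p / hbar
    s = 1.0 if x == 0 else math.sin(x) / x
    return a / (math.pi * hbar) * s * s

def integrate(f, lo, hi, n=100000):
    """Midpoint rule on [lo, hi]."""
    h = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * h) for i in range(n)) * h

def central_fraction(k, a=1.0, hbar=1.0):
    """Probability located in |p| <= k * pi * hbar / a (k = 1: between the first zeros)."""
    edge = k * math.pi * hbar / a
    return integrate(lambda p: momentum_density(p, a, hbar), -edge, edge)

def truncated_second_moment(cutoff, a=1.0, hbar=1.0):
    """Integral of p^2 |psi~_ss(p)|^2 over |p| <= cutoff, cf. (IV. 46)."""
    return integrate(lambda p: p * p * momentum_density(p, a, hbar), -cutoff, cutoff)

if __name__ == "__main__":
    print("fraction between first zeros :", central_fraction(1))
    print("fraction between second zeros:", central_fraction(2))
    for cutoff in (100.0, 200.0, 400.0):
        print("cutoff", cutoff, "-> second moment", truncated_second_moment(cutoff))
```

In this sketch roughly 90% of the probability lies between the first zeros and roughly 95% between the second zeros, independently of a, while the truncated second moment keeps growing in proportion to the cutoff.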

Nevertheless, Heisenberg’s conclusion is correct for the given example of the single slit. This<br />

raises the question whether his statement is valid in general. What we are in fact interested in is a measure<br />
for the width of a probability distribution that represents the width of the bulk of the distribution, without extra weight on the tails.<br />
The most natural definition of such a measure is the length of the smallest interval that can contain a fraction $\alpha \in [0, 1]$ of<br />
the total probability, where, roughly, $\alpha = 0.95$ is taken. If $\rho$ is a probability density, the<br />
definition is<br />
\[ W_\alpha(\rho) := \min \Bigl\{\, b - a \;\Bigm|\; [a, b] \subset \mathbb{R},\ \int_a^b \rho(x)\, dx = \alpha \Bigr\}. \tag{IV. 48} \]<br />
For position and momentum in quantum mechanics we define<br />
\[ W_\alpha(Q, \psi) := \min \Bigl\{\, b - a \;\Bigm|\; [a, b] \subset \mathbb{R},\ \int_a^b |\psi(q)|^2\, dq = \alpha \Bigr\}, \tag{IV. 49} \]<br />
\[ W_\alpha(P, \psi) := \min \Bigl\{\, b - a \;\Bigm|\; [a, b] \subset \mathbb{R},\ \int_a^b |\tilde\psi(p)|^2\, dp = \alpha \Bigr\}. \tag{IV. 50} \]<br />
The product of these measures also satisfies an uncertainty relation, as was shown for the first time by<br />
H.J. Landau and H.O. Pollak (1961), nota bene in a journal for industrial engineers of the American<br />
Bell Telephone Company,<br />
\[ W_\alpha(P, \psi)\; W_\alpha(Q, \psi) \geqslant c_\alpha\, \hbar, \tag{IV. 51} \]<br />
where $\alpha \in (\tfrac12, 1]$, and $c_\alpha > 0$ is a constant which only depends on $\alpha$, not on $\psi$.<br />

2 Responsible for this phenomenon is the mathematical fact that the standard deviation assigns a quadratically increasing<br />

weight to the tails of a distribution. In a Gaussian distribution, e.g. the Gaussian wave packet (IV. 32), these tails go to zero<br />

rapidly enough because an exponential power goes to zero more rapidly than any polynomial goes to infinity, but for many<br />

wave functions occurring in physics the standard deviation diverges.



From this inequality it follows that the probability densities of position and momentum cannot<br />
simultaneously be made arbitrarily narrow, in the sense that a fraction $\alpha$ is concentrated on an arbitrarily<br />
small interval. Thus, 34 years after the birth of the uncertainty principle, what everyone had thought<br />
to follow from the standard uncertainty relations was finally proven.<br />
For the square wave function $\psi_{ss}$ (IV. 43) and its Fourier transform (IV. 44) we find<br />
\[ W_\alpha(Q, \psi_{ss}) \simeq a \quad\text{and}\quad W_\alpha(P, \psi_{ss}) \simeq \frac{\hbar}{a}, \tag{IV. 52} \]<br />
so that the product is of the order of magnitude of $\hbar$.<br />
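The widths in (IV. 52) can be estimated directly from the definition of the smallest interval containing a fraction α of the probability. The sketch below is a grid-based approximation with hypothetical function names, in units with ħ = 1; the grid limits and α = 0.9 are choices made here. A sliding window finds the shortest run of grid cells whose total probability reaches α.

```python
import math

def position_density(q, a=1.0):
    """|psi_ss(q)|^2 from (IV. 43)."""
    return 1.0 / (2 * a) if abs(q) <= a else 0.0

def momentum_density(p, a=1.0, hbar=1.0):
    """|psi~_ss(p)|^2 from (IV. 44)."""
    x = a * p / hbar
    s = 1.0 if x == 0 else math.sin(x) / x
    return a / (math.pi * hbar) * s * s

def minimal_width(density, lo, hi, alpha, n=20000):
    """Length of the shortest grid interval in [lo, hi] holding at least alpha of the probability."""
    h = (hi - lo) / n
    mass = [density(lo + (i + 0.5) * h) * h for i in range(n)]
    best, left, total = None, 0, 0.0
    for right in range(n):
        total += mass[right]
        # shrink from the left as long as the window still holds at least alpha
        while total - mass[left] >= alpha:
            total -= mass[left]
            left += 1
        if total >= alpha:
            width = (right - left + 1) * h
            if best is None or width < best:
                best = width
    return best

if __name__ == "__main__":
    a, alpha = 1.0, 0.9
    w_q = minimal_width(lambda q: position_density(q, a), -2 * a, 2 * a, alpha)
    w_p = minimal_width(lambda p: momentum_density(p, a), -60.0 / a, 60.0 / a, alpha)
    print("W_alpha(Q) =", w_q, " W_alpha(P) =", w_p, " product =", w_q * w_p)
```

For the square wave function the position width is simply 2aα, and the momentum width scales as ħ/a, so the product does not depend on a.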

IV. 5. 4<br />

TIME AND ENERGY<br />

In the same article in which Heisenberg (1927) introduces the uncertainty relation for position<br />

and momentum, he also discusses the uncertainty relation between time and energy, starting from the<br />

‘well-known’ equation $Et - tE = ih$. This equation has caused many problems.<br />

If t is taken to be the universal time parameter, the spectrum of the operator t must be the real axis.<br />

But then the commutation relation can only be satisfied by an energy operator of which the spectrum<br />

is the real axis also. On the other hand, we know that the energy spectrum of quantum mechanical<br />

systems is generally bounded from below and can even be totally or partially discrete. Hence, the<br />

conclusion was soon drawn that there is no time operator in quantum mechanics (Von Neumann 1932,<br />

Pauli 1933). In the light of the existence of a position operator and with the theory of relativity in<br />

mind it was felt that in quantum mechanics something strange was going on with ‘time’. This is<br />

expressed in almost all textbooks and articles concerning this subject. Nevertheless, it rests on<br />
a conceptual confusion which went unnoticed for a remarkably long time.<br />

As it happens, the comparison between q and t is faulty if t is understood to be a universal time<br />

parameter. After all, q is a dynamic variable of a specific physical system, for example of a particle,<br />
and therefore there are many q’s in a many-particle system. There is, however, only one time<br />
parameter. This does not belong to a certain physical system but must be put on a par with the<br />
universal position coordinates x, y, z, with which it is linked in the theory of relativity. The time<br />
coordinate t is no more an operator in quantum mechanics than these position coordinates are. Only the<br />

dynamic variables of physical systems can be operators, and the problem outlined above is therefore<br />

a pseudo - problem.<br />

Nevertheless, one can wonder if dynamic variables exist which are just as ‘timelike’, literally<br />

speaking, as q is ‘positionlike’. The answer is affirmative. Such variables exist in systems we call<br />

‘clocks’, think, for example, of the position or the orientation of the hand of a clock. But also very<br />

simple, microscopic systems can have such variables. In quantum mechanics these dynamic time<br />

variables become operators. They occur in specific systems and therefore they are not universal.<br />

And, similar to other dynamic variables, generally the spectrum of such time operators in quantum<br />

mechanics is not the entire real axis (see further J. Hilgevoord 2002).



IV. 5. 5<br />

DOUBLE SLIT EXPERIMENT<br />

Even more interesting is the famous interference experiment with the double slit. The wave function<br />

corresponding to particles passing through the screen with the slits is, in analogy with (IV. 43),<br />

\[ \psi_{ds}(q) = \begin{cases} \dfrac{1}{2\sqrt{a}} & \text{if } q \in [-A - a, -A + a] \cup [A - a, A + a] \\[1ex] 0 & \text{elsewhere} \end{cases}, \tag{IV. 53} \]<br />
where 2a is the width of each slit, 2A is the distance between the slits, and $A \gg a$, see figure IV. 8.<br />

[Figure IV. 8 graphic: plot of $|\psi_{ds}(q)|^2$, with the distance $2A$ and the slit width $2a$ marked]<br />
Figure IV. 8: The probability distribution in position for a double slit, $2a$ is the width of each slit and<br />
$2A$ the distance between the slits<br />

The Fourier transform of this double square wave function ψ ds is<br />

\[ \tilde\psi_{ds}(p) = \sqrt{\frac{2a}{\pi\hbar}}\; \cos\Bigl( \frac{Ap}{\hbar} \Bigr)\, \frac{\sin(ap/\hbar)}{ap/\hbar}. \tag{IV. 54} \]<br />

The function | ˜ψ ds | 2 again has the same form as the interference pattern for the slits on a photographic<br />

plate placed far away, as can be seen in figure IV. 9.<br />
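The closed form (IV. 54) can be compared with a direct numerical evaluation of the Fourier integral (IV. 27). The sketch below is an illustration with hypothetical function names, in units with ħ = 1 by default. It assumes the amplitude $1/(2\sqrt{a})$ inside the slits, which is the value that normalizes $\psi_{ds}$ over the total slit width 4a and reproduces the prefactor of (IV. 54).

```python
import cmath
import math

def psi_ds(q, a, A):
    """Double square wave function: constant inside the two slits, zero elsewhere."""
    return 1 / (2 * math.sqrt(a)) if abs(abs(q) - A) <= a else 0.0

def fourier_numeric(p, a, A, hbar=1.0, n=4000):
    """Fourier transform (IV. 27) of psi_ds by the midpoint rule, slit by slit."""
    h = 2 * a / n
    total = 0j
    for centre in (-A, A):
        for i in range(n):
            q = centre - a + (i + 0.5) * h
            total += cmath.exp(-1j * p * q / hbar) * psi_ds(q, a, A) * h
    return total / math.sqrt(2 * math.pi * hbar)

def fourier_closed(p, a, A, hbar=1.0):
    """Closed form (IV. 54)."""
    x = a * p / hbar
    sinc = 1.0 if x == 0 else math.sin(x) / x
    return math.sqrt(2 * a / (math.pi * hbar)) * math.cos(A * p / hbar) * sinc

if __name__ == "__main__":
    a, A = 0.5, 3.0
    for p in (0.0, 0.3, 1.7, 5.0):
        print(p, fourier_numeric(p, a, A), fourier_closed(p, a, A))
```

The cosine factor, with its period set by A, produces the closely spaced interference lines, and the factor sin(ap/ħ)/(ap/ħ) supplies the broad envelope.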

[Figure IV. 9 graphic: plot of $|\tilde\psi_{ds}(p)|^2$, with the markings 2π/a and 2π/A]<br />
Figure IV. 9: The interference pattern for the double slit


IV. 5. THE UNCERTAINTY RELATIONS 105<br />

Now there are, however, two parameters playing a role. The distance of the slits A is a measure for<br />

the total width of |ψ ds (q)| 2 , the ‘enveloping’ cosine factor in (IV. 54), while the width of the slits a is a<br />

measure for the ‘fine structure’ of this probability density. For $|\tilde\psi_{ds}(p)|^2$ the roles have reversed, $A^{-1}$<br />

is a measure for the width of the interference lines, while $a^{-1}$ is a measure for the total width of the<br />

interference pattern. This shows the well - known fact that the width of the interference lines and the<br />

distance between the slits are inversely proportional. In a moment we will see that Bohr’s discussion<br />

of the double slit experiment exactly rests on this fact.<br />

◃ Remark<br />

Consider the measures<br />

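\[ \Delta_{\psi_{ds}} Q \simeq A \quad\text{and}\quad \Delta_{\psi_{ds}} P = \infty, \tag{IV. 55} \]<br />

\[ W_\alpha(Q, \psi_{ds}) \simeq A \quad\text{and}\quad W_\alpha(P, \psi_{ds}) \simeq \frac{\hbar}{a}. \tag{IV. 56} \]<br />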

None of these measures gives the fine structure. Therefore, Bohr’s Copenhagen reasoning, treated<br />

in the next subsection, cannot be based on the Kennard inequality (IV. 26) nor on the inequality of<br />

Landau and Pollak (IV. 51). ▹<br />

EXERCISE 30. Verify the calculations (IV. 55) and (IV. 56).<br />

IV. 5. 6 A NEW UNCERTAINTY MEASURE<br />

Bohr’s reasoning concerning the double slit experiment goes as follows. A way to determine<br />

through which slit the particle has gone is measuring the recoil in the q - direction that the screen<br />

experiences at the passage of this particle. To this end the screen must be able to move in the q - direction.<br />

Instead of a fixed screen we take therefore a screen that is suspended from a spring, as can be<br />

seen in figure IV. 10. The incoming momentum p is perpendicular to the screen.<br />

We assume conservation of kinetic energy, i.e. a heavy screen, which means that only the direction<br />

of the momentum changes. Consequently, a particle arriving at position q of the photographic<br />

plate, gives a recoil to the screen of, assuming r ≫ A and therefore sin θ ≈ tan θ,<br />

\[ \left( \frac{q \pm A}{r} \right) p, \tag{IV. 57} \]<br />

depending on which slit it has gone through. To be able to measure the difference in recoil, it must<br />

hold for the inaccuracy δP with which the momentum of screen was known in advance, that<br />

\[ \delta P < \frac{2Ap}{r}. \tag{IV. 58} \]<br />


Figure IV. 10: Moving screen. The slits 1 and 2 lie a distance 2A apart, the photographic plate is at distance r, and a particle arriving at position q on the plate satisfies $q = r \tan\theta_1 + A = r \tan\theta_2 - A$<br />

Because of the inequality<br />

\[ \delta P \, \delta Q \gtrsim h, \tag{IV. 59} \]<br />

the inaccuracy δQ with which the position Q of the screen was known then satisfies<br />

\[ \delta Q > \frac{h\, r}{2Ap}. \tag{IV. 60} \]<br />

But the width of the interference lines on the photographic plate is<br />

\[ \frac{\lambda r}{2A} = \frac{h\, r}{2Ap}, \tag{IV. 61} \]<br />

where $\lambda = h/p$ is the de Broglie wavelength of the electron. Bohr therefore concludes that the uncertainty<br />

in the position of the screen will result in the erasure of the interference pattern.<br />

◃ Remarks<br />

First, we see that Bohr applies the uncertainty principle to the screen which means that he treats this<br />

macroscopic body quantum mechanically. Second, he uses the uncertainty principle in a qualitative<br />

manner, in particular, he does not give a definition of the uncertainties δP and δQ. Third, the relevant<br />

uncertainty in Q is of the order of magnitude of the width $A^{-1}$ of the interference lines. Bohr<br />

therefore has no use of the Kennard inequality (IV. 26) or the inequality of Landau and Pollak (IV. 51),<br />

which do not contain this width. Finally, Bohr does not show how erasure of the interference pattern<br />

exactly takes place, obviously, he considers it to be intuitively evident. ▹<br />

From the previous it should be clear that something is still lacking in the mathematical formulation<br />

of the uncertainty principle. One would hope that there may exist some direct relation between the


total width of a distribution in the p - language (q - language), and the fine structure of this distribution<br />

in the q - language (p - language) as exhibited by the wave function for the double slit, assuming that<br />

this relation has general validity. Indeed, such a relation has been found (Uffink and Hilgevoord 1985),<br />

\[ w_\alpha(Q, \psi)\, W_\alpha(P, \psi) \geq C_\alpha \quad\text{and}\quad w_\alpha(P, \psi)\, W_\alpha(Q, \psi) \geq C_\alpha, \tag{IV. 62} \]<br />

where w α ( · , ψ) ∈ R + is a measure for the width of the fine structure of ψ, W α ( · , ψ) ∈ R + is<br />

the measure for the total width of ψ as introduced earlier, and C α > 0 is a constant depending<br />

on α ∈ (0, 1], but not on the state ψ.<br />

Illustratively, if W is taken as a measure of the size of the objective of a microscope and w as a<br />

measure of the fine structure of the image, the inequalities express the fact that the resolving power<br />

must decrease if the aperture is reduced. Likewise, the direction of incoming radiation can be better<br />

determined by using a long array of radio telescopes than by using a short one, etc. These inequalities<br />

thus express, among other things, the well-known fact in optics that the resolving power of an<br />

apparatus improves as the apparatus is larger.<br />

The inequalities (IV. 62) seem to solve the problem for Bohr. A closer consideration however<br />

tells us that W α (P, ψ) is not the suitable measure to express whether the difference in recoil can or<br />

cannot be observed. More precisely, $W_\alpha(P, \psi) > 2Ap/r$ does not guarantee that this difference cannot be<br />

observed. $W_\alpha(P, \psi)$ can be large in this experiment, which makes the inequality (IV. 62) ineffective.<br />

Actually, it is questionable whether Bohr’s argument can in fact be based on an uncertainty relation.<br />

Nevertheless, his conclusion is correct! The fact is that a direct calculation of the double slit<br />

experiment by D. Hauschildt, unpublished, shows that the intensity of the interference, in case the<br />

screen is movable, is proportional to the factor<br />

\[ \left| \langle \chi |\, e^{\, i \frac{2Ap}{r\hbar} Q_{sc}} \,| \chi \rangle \right| . \tag{IV. 63} \]<br />

Here |χ⟩ is the state of the screen and Q sc is the position operator of the screen. The state<br />

\[ |\chi\rangle' := e^{\, i \frac{2Ap}{r\hbar} Q_{sc}}\, |\chi\rangle \tag{IV. 64} \]<br />

is the state of which the momentum spectrum is shifted by $2Ap/r$ with respect to the momentum spectrum<br />

of the state |χ⟩,<br />

\[ \langle p \,|\, \chi' \rangle = \left\langle p - \tfrac{2Ap}{r} \,\middle|\, \chi \right\rangle . \tag{IV. 65} \]<br />

The factor (IV. 63) is, therefore, exactly the quantum mechanical expression describing to what extent<br />

the state of the screen after the recoil can be distinguished from the state of the screen before the<br />

recoil.<br />

If the momentum spectrum of |χ⟩ is broad with respect to $2Ap/r$, the overlap (IV. 63) will be large,<br />

namely almost 1. In that case |χ⟩ and |χ⟩′ are difficult to distinguish and interference is large. If the<br />

momentum spectrum of |χ⟩ only contains peaks which are narrow with respect to $2Ap/r$, then (IV. 63)<br />

is small. The states |χ⟩ and |χ⟩ ′ are well distinguishable then and interference is small. The essence<br />

of Bohr’s reasoning is therefore correct; to the extent in which the screen can serve as a measuring apparatus<br />

to determine the slit a particle goes through, interference disappears. Whether this reasoning<br />

can be based on an uncertainty relation, is unknown to this very day.
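The dependence of the overlap (IV. 63) on the momentum spread of the screen can be made concrete with a simple model. The sketch below assumes, purely for illustration, that the screen state |χ⟩ is a minimum-uncertainty Gaussian with momentum standard deviation `sigma_p`; this choice, the units ħ = 1 and the function names are assumptions and are not taken from Hauschildt's calculation. For such a state the overlap with a momentum kick $2Ap/r$ equals $\exp\!\left(-(2Ap/r)^2 / 8\sigma_p^2\right)$, which the code also checks by direct integration.

```python
import math

HBAR = 1.0  # assumption: natural units, hbar = 1


def overlap_closed(kick, sigma_p):
    """|<chi| exp(i*kick*Q/hbar) |chi>| for a Gaussian state with momentum spread sigma_p."""
    return math.exp(-kick ** 2 / (8.0 * sigma_p ** 2))


def overlap_numeric(kick, sigma_p, n=20000):
    """Same overlap by midpoint integration over the position density |chi(q)|^2."""
    sigma_q = HBAR / (2.0 * sigma_p)  # minimum-uncertainty Gaussian (assumption)
    lo, hi = -10.0 * sigma_q, 10.0 * sigma_q
    dq = (hi - lo) / n
    total = 0.0
    for k in range(n):
        q = lo + (k + 0.5) * dq
        density = math.exp(-q * q / (2.0 * sigma_q ** 2)) / (math.sqrt(2.0 * math.pi) * sigma_q)
        # The imaginary part vanishes by symmetry, so only the cosine contributes.
        total += density * math.cos(kick * q / HBAR) * dq
    return abs(total)


if __name__ == "__main__":
    kick = 1.0  # plays the role of 2Ap/r
    for sigma_p in (10.0, 0.5, 0.1):
        print(sigma_p, overlap_closed(kick, sigma_p), overlap_numeric(kick, sigma_p))
```

A broad momentum spectrum (`sigma_p` much larger than the kick) gives an overlap close to 1, so the interference survives; a narrow one gives an overlap close to 0, so the recoil is detectable and the interference disappears.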


IV. 5. 7 INTERPRETATION<br />

The statistical interpretation of the uncertainty W α (A, ψ) in (IV. 62) is that it is a measure for<br />

the predictability of an outcome of measurement given a probability distribution, it is nothing but<br />

the usual statistical interpretation of the standard deviation. How we must physically understand<br />

this uncertainty depends directly on how we must physically understand quantum mechanical probabilities.<br />

We will discuss this elaborately further on.<br />

The number w α (A, ψ) is a measure for the distinguishability between the state ψ (probability<br />

distribution) and some other state (other probability distribution) when measuring quantity A corresponding<br />

to operator A. This is also nothing but the usual statistical interpretation of this measure.


V<br />

HIDDEN VARIABLES<br />

While we have thus shown that the wave function does not provide a complete description<br />

of the physical reality, we left open the question of whether or not such a description<br />

exists. We believe, however, that such a theory is possible.<br />

— Einstein, Podolsky and Rosen<br />

You may have already suspected that I still believe in the hidden variables hypothesis.<br />

[. . . ] Anyway, for me, the hidden variable hypothesis is still the best way to ease my<br />

conscience about quantum mechanics.<br />

— Gerard ’t Hooft<br />

In this chapter we get acquainted with so-called ‘hidden variable theories’ and the motivation<br />

to consider such theories. We examine if it is possible to shove such a ‘hidden variable theory’<br />

under quantum mechanics, the way classical mechanics can be shoved under classical statistical<br />

mechanics. We also treat the notorious impossibility theorems of Von Neumann and of Kochen<br />

and Specker.<br />

V. 1 HIDDEN REALITY<br />

Quantum mechanics is, roughly speaking, a theory about outcomes of measurements; about which<br />

values can be found upon measurement and about the probability of finding a specific value in such<br />

a measurement. Moreover, according to the Copenhagen perspective, this description is complete:<br />

there is nothing more to say about a physical system. As a consequence, quantum mechanics is<br />

exclusively concerned with the observable behaviour of measuring apparatuses.<br />

In the eyes of many authors, this is bizarre. In the entire history of physics we see that the aim<br />

of a theory has been to tell us something about how reality is organized, how to explain what we<br />

observe around us. Measuring is the eminent scientific manner to examine whether a given theory or<br />

hypothesis meets this aim, or to gather data to help us select theories. Measurement is not an aim, but<br />

a tool. The subject of physical theories, physical reality, does not occur in the quantum mechanical<br />

tale, in contrast to nearly all theories in classical physics.<br />

From this point of view, we could hope that quantum mechanics is some sort of cloak, which must<br />

be sustained by an underlying theory concerning physical reality. Because that underlying theory is<br />

hidden under the quantum mechanical cloak, we will speak of a hidden variable theory.<br />

So, let us examine the matter not from the viewpoint of quantum mechanics, but from ‘physical<br />

reality’, taking as a working hypothesis that something like a ‘physical reality’ exists. The behavior of


radioactive atomic nuclei, as discussed in the Introduction, p. 7, suggests that individual nuclei differ<br />

from each other, they show various life spans and emit α - particles with distinct momentum. The<br />

natural idea is that this difference in behavior has a cause, which can be found in mutually differing<br />

properties of the physical states of the individual nuclei. Quantum mechanics does not give us these<br />

differences, but perhaps a description of state exists, exceeding that what quantum mechanics tells us.<br />

We would like such an additional description to show us how the phenomena observed at an<br />

individual nucleus follow decisively from the state of that nucleus. Such a description requires extra<br />

variables in comparison with the quantum mechanical description. It is conceivable that not all of<br />

these variables are accessible to our present, and possibly future, possibilities of observation. They<br />

are ‘hidden’ from us, but they must exist to explain the observed differences. If they exist, then<br />

quantum mechanical states correspond to probability distributions over the states described by these<br />

variables. These probability distributions would only express our ignorance concerning the exact<br />

physical states. In this respect, the situation would be entirely analogous to that in classical statistical<br />

mechanics. EPR believed that it must in principle be possible to construct such a theory.<br />

Such an attempt, interpreting quantum mechanics as a statistical theory about an underlying physical<br />

reality, is what is called a hidden variable theory, HVT for short, the support under the quantum<br />

mechanical cloak. Assuming that quantum mechanics is empirically adequate, we will examine if it<br />

is possible in principle to found this description on a HVT.<br />

An important distinction between several types of HVT’s concerns the question whether the hidden<br />

variables describing the physical state of the system can depend on which quantity of the system is<br />

measured. Theories in which this has been permitted are called contextual, they will be discussed in<br />

section V. 4. For the moment, we will first concentrate on the simpler case where this is not permitted,<br />

the non - contextual theories, to be discussed in section V. 2.<br />

Another important division has to do with determinism. Although it is the objective of a HVT to<br />

supplement or complete the quantum mechanical description of a physical system, this does not imply<br />

that with this supplement the precise future behavior of this system can be entirely predicted, it is<br />

conceivable that the HVT too merely determines probabilities of possible events. In that case we speak<br />

of an indeterministic, or stochastic, HVT. In this chapter we will discuss only deterministic HVT’s,<br />

but we will come back to stochastic HVT’s in chapter VII.<br />

V. 2 NON - CONTEXTUAL HIDDEN VARIABLES<br />

Let us try to reconstruct quantum mechanics in analogy with classical statistical mechanics. We<br />

assume a space Λ analogous to the phase space Γ known from statistical physics, which we have<br />

already met in section III. 2. An arbitrary ‘point’ in that space Λ is indicated with λ. We do not in<br />

advance impose any restriction to the mathematical form of λ. The variable λ can represent anything,<br />

for example a single real variable, an infinite - dimensional vector field, complex functionals, etc. The<br />

possibilities are endless, the only restriction will be that a probability measure can be defined on Λ. It<br />

is possible to also incorporate the quantum mechanical state as a component in the specification of λ.<br />

Speaking about a ‘classical’ statistical model here does not mean that the HVT must look like<br />

classical mechanics, let alone that λ specifies the position and momentum of the particles, although<br />

we do not exclude that as a possibility.


In the HVT, a pure physical state corresponds to a single ‘point’ λ ∈ Λ. We assume that the<br />

system is always in one of these states λ ∈ Λ, even though we do not know in which one. A general,<br />

mixed state is a probability distribution over Λ. For any given λ every physical quantity A has an<br />

exact value, denoted by A[λ], which is revealed upon measurement of A, and therefore a physical<br />

quantity A can be represented as a real function on the space A : Λ → R.<br />

Furthermore, every quantity represented by quantum mechanics has to have a counterpart in the<br />

HVT. If such a quantity corresponds to the function A : Λ → R, the values A[λ] can take are<br />

the eigenvalues of the self - adjoint operator A : H → H which, according to quantum mechanics,<br />

corresponds to quantity A.<br />

It is also required that every quantum mechanical state can be represented in the HVT; for every<br />

state operator W there must be a corresponding probability distribution ρ W over Λ. It is, however,<br />

not necessary that pure quantum states correspond to pure hidden variable states, the idea being that<br />

the HVT allows for a more detailed, complete description of the system. Neither is it necessary that<br />

every probability distribution on Λ corresponds to a state operator, the HVT could easily be a theory<br />

richer than quantum mechanics.<br />

The requirement that the HVT has to reproduce the empirical statements of quantum mechanics<br />

is now expressed in the requirement that the expectation values of quantity A belonging to a physical<br />

system in a physical state, corresponding in the HVT to ρ W , and in quantum mechanics to W , coincide,<br />

\[ \langle A \rangle_{\rho_W} := \int_\Lambda A[\lambda]\, \rho_W(\lambda)\, d\lambda = \mathrm{Tr}\, A W, \tag{V. 1} \]<br />

where ρ W : Λ → [0, ∞) is a probability density,<br />

\[ \int_\Lambda \rho_W(\lambda)\, d\lambda = 1. \tag{V. 2} \]<br />

For a pure state |ψ⟩, (V. 1) reduces to<br />

\[ \int_\Lambda A[\lambda]\, \rho_\psi(\lambda)\, d\lambda = \langle \psi \,|\, A \,|\, \psi \rangle. \tag{V. 3} \]<br />

In the discrete case the integrals are replaced by summations.<br />

Summary<br />

A non-contextual HVT is any theory meeting the following requirements.<br />

(i) Every physical state of a physical system corresponds to a probability distribution ρ over Λ.<br />

This is the state postulate.<br />

(ii) Every physical quantity A corresponds to a function A : Λ → R, λ ↦→ A[λ]. This is the<br />

observables postulate.


(iii) The range of A : Λ → R coincides with the spectrum of the self - adjoint operator A which,<br />

according to quantum mechanics, corresponds to quantity A.<br />

The expectation value of A when the physical system is in the state ρ W which, according<br />

to quantum mechanics, corresponds to the state operator W , equals the quantum mechanical<br />

expression for the expectation value<br />

\[ \langle A \rangle_{\rho_W} := \int_\Lambda A[\lambda]\, \rho_W(\lambda)\, d\lambda = \mathrm{Tr}\, A W. \]<br />

We will call this last requirement (iii) the reproduction criterion.<br />

Since all probabilities in quantum mechanics can be written as Tr PW , with P ∈ P(H), it follows<br />

that all probability distributions in quantum mechanics coincide with the corresponding probability<br />

distributions in the HVT.<br />

We can now ask whether it is possible to construct a HVT satisfying the above requirements. The<br />

answer is that it is indeed possible, even in a quite trivial way, by choosing Λ large enough. We<br />

illustrate this by means of a simple example.<br />

Suppose there are only three quantities A, B, C, with possible values {a 1 }, {b 1 , b 2 }, {c 1 , c 2 } and<br />

represented by functions A, B, C : Λ → R. The possible value combinations are<br />

(a 1 , b 1 , c 1 ), (a 1 , b 1 , c 2 ), (a 1 , b 2 , c 1 ), (a 1 , b 2 , c 2 ). (V. 4)<br />

We now construct a space Λ by identifying every value combination with a point of Λ. If we denote<br />

these points by λ 1 , λ 2 , λ 3 and λ 4 , then<br />

A[λ 1 ] = a 1 , B[λ 3 ] = b 2 , C [λ 4 ] = c 2 , etc. (V. 5)<br />

When there are more quantities, we extend Λ correspondingly.<br />

We have to introduce a probability measure<br />

\[ \mu : \mathcal{F}(\Lambda) \to [0, 1] \quad\text{with}\quad \sum_j \mu(\lambda_j) = 1 \tag{V. 6} \]<br />

such that (V. 1) is satisfied. In our case Λ is discrete and consists of four points only, as a result of<br />

which the integral (V. 1) becomes a sum. For example, to quantity B it must apply that<br />

\[ \mathrm{Tr}\, B W = \sum_{j=1}^{4} B[\lambda_j]\, \mu_W(\lambda_j) = b_1 \bigl( \mu_W(\lambda_1) + \mu_W(\lambda_2) \bigr) + b_2 \bigl( \mu_W(\lambda_3) + \mu_W(\lambda_4) \bigr). \tag{V. 7} \]<br />

This is satisfied by<br />

\[ \mu_W(a_i, b_j, c_k) = \mathrm{Tr}\, P_{a_i} W \;\, \mathrm{Tr}\, P_{b_j} W \;\, \mathrm{Tr}\, P_{c_k} W, \tag{V. 8} \]<br />


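where $P_{a_i}$ is the projector on the subspace corresponding to the eigenvalue $a_i$ of A, etc. Indeed,<br />

according to quantum mechanics<br />

\[ B = b_1 P_{b_1} + b_2 P_{b_2}, \tag{V. 9} \]<br />

and therefore<br />

\[ \mathrm{Tr}\, B W = b_1\, \mathrm{Tr}\, P_{b_1} W + b_2\, \mathrm{Tr}\, P_{b_2} W, \tag{V. 10} \]<br />

while, with<br />

\[ P_{a_1} = \mathbb{1}, \quad P_{b_1} + P_{b_2} = \mathbb{1}, \quad P_{c_1} + P_{c_2} = \mathbb{1}, \tag{V. 11} \]<br />

we have<br />

\[
\begin{aligned}
\mu_W(\lambda_1) + \mu_W(\lambda_2) &= \mu_W(a_1, b_1, c_1) + \mu_W(a_1, b_1, c_2) \\
&= \mathrm{Tr}\, P_{a_1} W \;\, \mathrm{Tr}\, P_{b_1} W \, \bigl( \mathrm{Tr}\, P_{c_1} W + \mathrm{Tr}\, P_{c_2} W \bigr) \\
&= \mathrm{Tr}\, P_{b_1} W.
\end{aligned}
\tag{V. 12}
\]<br />

Likewise we find<br />

\[ \mu_W(\lambda_3) + \mu_W(\lambda_4) = \mathrm{Tr}\, P_{b_2} W. \tag{V. 13} \]<br />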

Therefore, (V. 7) has been satisfied, and the same applies to the expectation values of A and C.<br />

If we have, in general, the quantities A, B, C, . . . , F , with values a i , b j , c k , . . . , f l , where<br />

i = 1, . . . , n A , j = 1, . . . , n B , etc., the measure<br />

\[ \mu_W(a_i, b_j, c_k, \ldots, f_l) = \mathrm{Tr}\, P_{a_i} W \;\, \mathrm{Tr}\, P_{b_j} W \;\, \mathrm{Tr}\, P_{c_k} W \cdots \mathrm{Tr}\, P_{f_l} W, \tag{V. 14} \]<br />

satisfies requirement (V. 3) for all quantities. For example, the probability of finding for quantity A<br />

the value a i is<br />

\[ \mathrm{Prob}_{\mu_W}(A : a_i) = \sum_{j,\, k,\, \ldots,\, l} \mu_W(a_i, b_j, c_k, \ldots, f_l) = \mathrm{Tr}\, P_{a_i} W, \tag{V. 15} \]<br />

because all others sum up to 1. Here we have the required quantum mechanical result. Kochen<br />

and Specker (1967) showed how to formulate this idea in the case of an infinite number of physical<br />

quantities.<br />
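The product construction (V. 8) can be tried out on a small concrete case. The sketch below is an illustration only: it assumes a two-dimensional Hilbert space, takes for A the trivial quantity with projector 𝟙, for B and C the spin components σ_z and σ_x, and a pure state fixed by an arbitrary angle `t`; none of these choices, nor the helper names, come from the text. It builds the four hidden states of (V. 4), the product measure, and checks that the HVT averages agree with Tr A W, Tr B W and Tr C W.

```python
import itertools
import math


def projector(v):
    """Projector |v><v| onto a normalized real vector v."""
    return [[vi * vj for vj in v] for vi in v]


def trace_product(X, Y):
    """Tr(XY) for square matrices given as nested lists."""
    n = len(X)
    return sum(X[i][j] * Y[j][i] for i in range(n) for j in range(n))


t = 0.4  # assumption: arbitrary angle fixing the pure state
W = projector([math.cos(t), math.sin(t)])  # state operator W = |psi><psi|

r2 = 1.0 / math.sqrt(2.0)
quantities = {
    "A": [(1.0, [[1.0, 0.0], [0.0, 1.0]])],  # single value a1, projector = identity
    "B": [(+1.0, projector([1.0, 0.0])), (-1.0, projector([0.0, 1.0]))],  # sigma_z
    "C": [(+1.0, projector([r2, r2])), (-1.0, projector([r2, -r2]))],  # sigma_x
}
names = list(quantities)

# Each hidden state lambda is one combination of (value, projector) pairs, as in (V. 4).
hidden_states = list(itertools.product(*(quantities[name] for name in names)))


def mu(state):
    """Product measure (V. 8) of one hidden state."""
    return math.prod(trace_product(P, W) for _, P in state)


def hvt_expectation(name):
    """Average of the value function over the hidden states, the sum in (V. 7)."""
    idx = names.index(name)
    return sum(state[idx][0] * mu(state) for state in hidden_states)


def qm_expectation(name):
    """Quantum mechanical expectation value Tr(A W) via the spectral decomposition."""
    return sum(value * trace_product(P, W) for value, P in quantities[name])


if __name__ == "__main__":
    print("total measure:", sum(mu(state) for state in hidden_states))
    for name in names:
        print(name, hvt_expectation(name), qm_expectation(name))
```

Because the measure is a product, the joint probability of any pair of values factorizes, which is the statistical independence criticized in the paragraph that follows.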

This solution of the completeness problem is, however, not very interesting physically. It can be<br />

seen from the factorizable probabilities in (V. 8) that all quantities are treated here as being statistically<br />

independent which is not in agreement with physical practice. Some quantities are functions of<br />

other quantities, e.g., kinetic energy is a function of momentum, $E_{kin} = p^2/2m$, while other quantities<br />

link with two or more other quantities, such as kinetic, potential and total energy, E = E kin + E pot .<br />

In the just outlined HVT we have ignored such links.


To illustrate this we assume that in our example C = A+B so that c 1 = a 1 +b 1 and c 2 = a 1 +b 2 .<br />

Now the possible value combinations in the HVT are<br />

(a 1 , b 1 , a 1 + b 1 ), (a 1 , b 1 , a 1 + b 2 ), (a 1 , b 2 , a 1 + b 1 ), (a 1 , b 2 , a 1 + b 2 ), (V. 16)<br />

and we see that (A + B) [λ] is not equal to A [λ] + B [λ] for all λ. Nevertheless, the HVT succeeded<br />

in reproducing, by construction, all quantum mechanical expectation values, in other words,<br />

the HVT reproduces the relation<br />

⟨ψ | A + B | ψ⟩ = ⟨ψ | A | ψ⟩ + ⟨ψ | B | ψ⟩, (V. 17)<br />

without requiring<br />

(A + B)[λ] = A[λ] + B[λ]. (V. 18)<br />

If we would require (V. 18), Λ would only consist of the points (a 1 , b 1 , a 1 + b 1 ) and (a 1 , b 2 , a 1 + b 2 )<br />

which is, of course, a strong restriction.<br />

In the very first proof of the impossibility of a HVT, that is, of the insolubility of the completeness<br />

problem, given by Von Neumann (1932), the requirement (V. 18) was indeed imposed on the HVT.<br />

Von Neumann required (V. 18) for every hidden variable state, in particular also for pure hidden<br />

variable states, which means that (V. 18) must apply to all λ ∈ Λ. We don’t need to discuss Von<br />

Neumann’s elaborate proof of this claim in detail, since J.S. Bell (1966) has shown this impossibility<br />

by means of a very simple example.<br />

Since the values of A[λ] etc. have to be the eigenvalues of the corresponding operators, it can be<br />

seen immediately that this requirement cannot be satisfied in general. Consider for example the Pauli<br />

matrices<br />

\[ \sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \quad\text{and}\quad \sigma_x + \sigma_y = \begin{pmatrix} 0 & 1 - i \\ 1 + i & 0 \end{pmatrix}. \tag{V. 19} \]<br />

The eigenvalues of $\sigma_x$ and $\sigma_y$ are ±1, but the eigenvalues of $\sigma_x + \sigma_y$ are $\pm\sqrt{2}$, and therefore, (V. 18)<br />

cannot be satisfied.<br />
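Bell's counterexample is easy to verify by hand, and the following sketch does the same arithmetic in code. It uses the standard closed form for the eigenvalues of a Hermitian 2×2 matrix; the function and variable names are illustrative assumptions.

```python
import itertools
import math


def eigenvalues_hermitian_2x2(m):
    """Eigenvalues of a Hermitian matrix [[a, b], [conj(b), d]], in ascending order."""
    a, d = m[0][0].real, m[1][1].real
    b = m[0][1]
    mean = (a + d) / 2.0
    radius = math.sqrt(((a - d) / 2.0) ** 2 + abs(b) ** 2)
    return (mean - radius, mean + radius)


sigma_x = [[0j, 1 + 0j], [1 + 0j, 0j]]
sigma_y = [[0j, -1j], [1j, 0j]]
sigma_sum = [[sigma_x[i][j] + sigma_y[i][j] for j in range(2)] for i in range(2)]

ev_x = eigenvalues_hermitian_2x2(sigma_x)      # (-1.0, 1.0)
ev_y = eigenvalues_hermitian_2x2(sigma_y)      # (-1.0, 1.0)
ev_sum = eigenvalues_hermitian_2x2(sigma_sum)  # approximately (-1.414, 1.414)

# Every value that A[lambda] + B[lambda] could take if (V. 18) held.
candidate_sums = {x + y for x, y in itertools.product(ev_x, ev_y)}

# (V. 18) would need some candidate sum to coincide with an eigenvalue of the sum operator.
additive_possible = any(
    math.isclose(s, e, abs_tol=1e-12) for s in candidate_sums for e in ev_sum
)

if __name__ == "__main__":
    print(ev_x, ev_y, ev_sum, sorted(candidate_sums), additive_possible)
```

The candidate sums are −2, 0 and 2, none of which equals $\pm\sqrt{2}$, so `additive_possible` is `False`.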

Bell argued that the requirement (V. 18) is physically unreasonable. For instance, measuring<br />

σ x ,σ y and σ x +σ y requires three different measurement apparatuses, for example three Stern - Gerlach<br />

magnets in three different orientations. There is absolutely no reason to assume that an algebraical<br />

link would exist between the individual outcomes of these measurements. The fact that in quantum<br />

mechanics the relation (V. 17) exists for pure states, even in case A and B do not commute, must be<br />

considered as a particular property of quantum mechanics.<br />

Since the requirement (V. 18) is unreasonably strong, one can wonder whether there are other,<br />

reasonable, requirements which can be imposed to a HVT in order to find acceptable solutions of the<br />

completeness problem. This brings us to the next section.


V. 3 KOCHEN AND SPECKER’S THEOREM<br />

As we already proved in section II. 4, p. 28, in quantum mechanics the next theorem holds: if the<br />

operators A, B, C, . . . commute, there is a maximal operator O of which they are a function,<br />

A = f (O), B = g(O), etc. (V. 20)<br />

A measuring procedure for A, B, C, . . . would be to measure O and apply the function relation to the<br />

result in order to find the values for A, B, C, . . . Kochen and Specker (1967, p. 64) call the quantities<br />

corresponding to A, B, C, . . . commeasurable.<br />

Now it seems reasonable to require, as Von Neumann did, that the HVT also has this structure, i.e.,<br />

for B, C : Λ → R, if B = f (C), it follows that B[λ] = f ( C [λ] ) , or<br />

f (C)[λ] = f ( C [λ] ) . (V. 21)<br />

This function rule, (V. 21), yields the so - called sum rule for commuting operators,<br />

[A, B] = 0 =⇒ (A + B)[λ] = A[λ] + B[λ], (V. 22)<br />

since, with O again the maximal operator of which A and B are a function, A = f (O), B = g(O),<br />

implying<br />

(A + B) = h(O) with h = f + g, (V. 23)<br />

from (V. 21) it then follows in this HVT that<br />

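\[
\begin{aligned}
(A + B)[\lambda] = h(O)[\lambda] = h\bigl( O[\lambda] \bigr) &= f\bigl( O[\lambda] \bigr) + g\bigl( O[\lambda] \bigr) \\
&= f(O)[\lambda] + g(O)[\lambda] = A[\lambda] + B[\lambda].
\end{aligned}
\tag{V. 24}
\]<br />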

EXERCISE 31. Prove, again using (V. 21), the product rule for commuting operators,<br />

[A, B] = 0 =⇒ (A B)[λ] = A[λ] B[λ]. (V. 25)<br />

Now we will see how the requirement, (V. 21), which at first sight is eminently reasonable, nevertheless<br />

renders a HVT of quantum mechanics impossible.<br />

THEOREM :<br />

A HVT satisfying the requirements (i) - (iii), p. 111, and the function rule (V. 21), does<br />

not exist if dim H > 2.


Proof<br />

Consider a complete collection of mutually orthogonal projectors P 1 , . . . ,P N on a N - dimensional<br />

Hilbert space. Such projectors mutually commute; [P i , P j ] = 0. An arbitrary sum of such projectors<br />

over some subset ∆ ⊂ {1, . . . , N} is again a projector,<br />

\[ \sum_{i \in \Delta} P_i = P_\Delta \in \mathcal{P}(\mathcal{H}). \tag{V. 26} \]<br />

Therefore, according to the sum rule (V. 22) it has to hold that<br />

\[ \sum_{i \in \Delta} P_i[\lambda] = P_\Delta[\lambda]. \tag{V. 27} \]<br />

But the values P i [λ] are the eigenvalues of the operators P i , therefore they are 0 or 1, likewise<br />

for P ∆ [λ], these values also follow from (V. 21). In particular, taking ∆ = {1, . . . , N}, we find<br />

\[ \sum_{i=1}^{N} P_i[\lambda] = \mathbb{1}[\lambda] = 1. \]<br />

But then the value assignment P i [λ] to the projectors satisfies the requirements for a probability<br />

measure on P (H), i.e.<br />

µ λ (P i ) := P i [λ] ∈ {0, 1} (V. 28)<br />

is a normalized, additive mapping on the subspaces of H. According to Gleason’s theorem, p. 47,<br />

this probability measure can always be written as<br />

µ λ (P i ) = Tr P i W λ , (V. 29)<br />

for a certain state operator W λ , provided that dim H > 2. There is, however, a contradiction<br />

between (V. 29) and (V. 28). The measure (V. 29) is continuous; a small change of the direction<br />

of P i induces a small change of µ(P i ). The measure (V. 28) is however necessarily discontinuous<br />

because µ(P i ) can only have the values 0 and 1.<br />

The conclusion has to be that a value assignment to quantities satisfying (V. 21), and therefore<br />

(V. 27), is impossible. As a consequence, a HVT of this type is not possible. □<br />

In this proof we used Gleason’s theorem, which is difficult to prove, and his own proof is not very<br />

transparent. There have also been given direct proofs for the impossibility of this value assignment.<br />

Bell (1966) and Kochen and Specker (1967) were the first to prove this in general, i.e., for dim H > 2<br />

and for all states; see also Belinfante (1973). We will not discuss these proofs in detail but restrict<br />

ourselves to a number of observations. Before we do so, we formulate Kochen and Specker’s theorem.<br />

KOCHEN AND SPECKER’S THEOREM :<br />

It is not possible to assign values to all physical quantities of an arbitrary physical system,<br />

with a Hilbert space of dim > 2, in accordance with function rule (V. 21).


Sketch of the direct proof<br />

We can formulate the problem as follows. Consider as a particular case of (V. 26) a resolution of<br />

identity into 1 - dimensional projectors,<br />

P 1 + P 2 + · · · + P n = 11. (V. 30)<br />

According to (V. 21), thence (V. 22), the following must hold<br />

P 1 [λ] + P 2 [λ] + · · · + P n [λ] = 11[λ] = 1 (V. 31)<br />

for every resolution of identity. Consider the 1-dimensional projectors on H as lines in all possible<br />

directions through the origin of H. Now assign to all lines the value 0 or 1, such that the sum<br />

of the values of each complete set of orthogonal lines is 1. Alternatively, consider the points of<br />

intersection of these lines with the surface of the unit sphere in H. To each point of the sphere the<br />

value 0 or 1 is assigned, antipodal points are assigned the same value, and the sum of the values<br />

of the points of intersection of an orthogonal basis with the surface of the sphere is 1.<br />

If this problem is soluble in a complex H, it is also soluble in a real H with the same dimension.<br />

To see this, choose a basis in H and generate, by application of real orthogonal transformations,<br />

a structure which is isomorphic to a real H. Therefore, we can restrict ourselves to proving the<br />

impossibility of the requested value assignment in a real H.<br />

Furthermore, the impossibility in H N implies the impossibility in H N+1 . This can be shown<br />

by considering the N - dimensional subspace which is orthogonal to a line having value 0. Each<br />

orthogonal (N + 1) - tuple of which this line is a part then turns into an N - tuple with a correct<br />

value assignment. In other words, if it is possible in an (N +1) - dimensional H, it is also possible<br />

in an N - dimensional H and, therefore, we only have to consider a real H with a dimension as<br />

low as possible.<br />

Notice that the problem for a 2 - dimensional Hilbert space H 2 does have a solution, see for<br />

example the diagram V. 1.<br />

Figure V. 1: A solution for dim H = 2; the directions in the plane are labelled with the values 0 and 1<br />
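That the two-dimensional problem is soluble can be checked with one explicit assignment. The sketch below is a hypothetical example and not necessarily the assignment drawn in figure V. 1: a line through the origin at angle φ (in degrees) receives the value 1 if φ modulo 180° lies in [0°, 90°) and the value 0 otherwise. The function name is an assumption.

```python
def value(direction_deg):
    """Assign 0 or 1 to the line through the origin at the given angle in degrees."""
    return 1 if (direction_deg % 180) < 90 else 0


if __name__ == "__main__":
    for deg in (0, 30, 89, 90, 135, 200):
        # A direction and its orthogonal partner always receive the values {0, 1}.
        print(deg, value(deg), value(deg + 90))
```

Every orthogonal pair of lines gets the values 0 and 1 in some order, and antipodal points get the same value, so both conditions of the colouring problem are met. In three dimensions no such quadrant trick is available, because each line belongs to infinitely many different orthogonal triads.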

All proofs therefore aim at the case of a real, 3 - dimensional Hilbert space H 3 . Now it immediately<br />

seems plausible that the requested value assignment in H 3 is not possible: to each point of<br />

the unit sphere in R 3 with value 1 belong infinitely many points having value 0, namely, the equator<br />

of which that point is a pole. On the other hand, of each orthogonal triad of points only two points<br />

have the value 0. But this is, of course, not a proof.<br />

Bell (1966, pp. 450, 451) showed that points with different values cannot be arbitrarily close.<br />

This is an independent proof of the continuity of the measure, and therefore contrary to the necessary<br />

discontinuity of (V. 28).



Kochen and Specker (1967, p. 69) explicitly constructed a set of 117 spin quantities for which no<br />

consistent value assignment exists. This construction is depicted on the cover of Redhead (1987)<br />

and can be seen in figure V. 2 a. It shows that every value assignment in accordance with function<br />

rule (V. 21) leads to contradictions.<br />

Kochen and Conway only needed 31 quantities in the so - called Peres cube of 33 points (Peres 1993).<br />

This construction is depicted in figure V. 2 b. □<br />

Figure V. 2: a) Kochen-Specker diagram (Redhead 1987); b) Conway-Kochen diagram (Tkadlec 2000)<br />



Figure V. 3: M.C. Escher, Waterfall. Consider the 3 interpenetrating cubes on the top of the<br />

left pillar. Each cube has 4 lines from the mutual center to its vertices, 6 lines to the centers of<br />

its edges, and 3 lines to the centers of its faces. Three of the lines are shared by all three cubes,<br />

giving 3 · (4 + 6 + 3) − 6 = 33 lines. These are Peres’ vectors. (Text: Meyer 2003)<br />

It is interesting to see what the measure (V. 29), according to Von Neumann the probability measure<br />

of quantum mechanics, looks like in this case. For a pure state $W = |\psi\rangle\langle\psi|$, with $P_i = |\chi\rangle\langle\chi|$,<br />
the measure (V. 29) is<br />
$$\mu(P_i) = \mathrm{Tr}\, P_i W = \langle\psi | P_i | \psi\rangle = |\langle\chi | \psi\rangle|^2,$$ (V. 32)<br />
so that in a real space we have<br />
$$\mu(P_i) = |\langle\chi | \psi\rangle|^2 = \cos^2\theta,$$ (V. 33)<br />



with θ the angle between |ψ⟩ and |χ⟩, see figure V. 4.<br />

Figure V. 4: $\mu(P_i) = \cos^2\theta$<br />

In the appendix of these lecture notes, p. 183 ff., we will prove that, if we assign to each point<br />
of the upper half of a unit sphere a non-negative real number such that 1 is assigned to the ‘north<br />
pole’, 0 is assigned to the ‘equator’, and the sum of the values of each orthogonal triad in this half<br />
sphere is 1, there is only one possible value assignment, namely the quantum mechanical one,<br />
i.e., the one in accordance with $\cos^2\theta$.<br />
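That the quantum mechanical measure satisfies these conditions can be checked numerically. The sketch below takes ψ at the ‘north pole’, builds an orthonormal triad by Gram-Schmidt from random vectors, and adds up the values cos²θ. The helper names and the fixed random seed are assumptions made for this illustration. Since cos²θ is the same for antipodal points, it does not matter whether a triad vector points into the upper or the lower half sphere.<br />

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def random_orthonormal_triad(rng):
    """Three mutually orthogonal unit vectors in R^3 (Gram-Schmidt plus cross product)."""
    e1 = normalize([rng.gauss(0, 1) for _ in range(3)])
    w = [rng.gauss(0, 1) for _ in range(3)]
    overlap = dot(w, e1)
    e2 = normalize([wi - overlap * ei for wi, ei in zip(w, e1)])
    e3 = [e1[1] * e2[2] - e1[2] * e2[1],
          e1[2] * e2[0] - e1[0] * e2[2],
          e1[0] * e2[1] - e1[1] * e2[0]]
    return e1, e2, e3

def measure(chi, psi):
    """mu(P_chi) = |<chi|psi>|^2 = cos^2(theta) for real unit vectors."""
    return dot(chi, psi) ** 2

rng = random.Random(0)        # fixed seed, so the run is reproducible
psi = [0.0, 0.0, 1.0]         # the 'north pole'
triad = random_orthonormal_triad(rng)
total = sum(measure(chi, psi) for chi in triad)  # close to 1, up to rounding
```

The sum equals 1 for every orthonormal triad because it is the squared length of ψ expressed in that basis. The uniqueness claim of the appendix is not tested by this sketch.<br />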

◃ Remarks<br />

First, illustrations of Kochen and Specker’s theorem are easy to find for Hilbert spaces of dimension<br />

larger than 3, for example 8, in which case a handful of quantities suffices, see Mermin (1993). We<br />

will come back to that in section VII. 6. Second, when restricted to rational angles between spin<br />

vectors, no contradiction with quantum mechanics can be obtained, as D.A. Meyer (1999) proved. ▹<br />

V. 3. 1 SUMMARY<br />

According to Kochen and Specker’s theorem, a HVT satisfying the state postulate and the observables<br />
postulate, p. 111 (i) and (ii), together with the function rule (V. 21), contradicts the state<br />
postulate and the observables postulate of quantum mechanics if dim H > 2, although for Hilbert<br />
spaces with dim H ≤ 2 such a HVT is possible. This conclusion shows how stringent the vector space structure<br />
of quantum mechanics is; in particular, the fact that there are many different decompositions of<br />
unity forms a heavy barrier for a HVT.<br />

V. 4 CONTEXTUAL HIDDEN VARIABLES<br />

Essential for Kochen and Specker’s proof is the fact that a 1 - dimensional projector can be part<br />

of several decompositions of unity. This is possible as long as the projectors are not maximal, i.e.,<br />

if dim H > 2. The existence of degenerate projectors, apart from unity, is essential for the proof of<br />
Kochen and Specker, and for this reason the proof does not hold in a 2-dimensional H, where all projectors<br />
except 𝟙 are maximal. By means of degenerate projectors, non-commuting operators also become<br />

connected to each other. By the requirement (V. 21) this is transferred to the quantities of the HVT, so



that via a detour we still impose a requirement for non - commeasurable quantities on the HVT. We<br />

will consider this in detail now.<br />

Suppose that operator A commutes with the maximal operators $C_1$ and $C_2$, while $[C_1, C_2] \neq 0$.<br />
Then we have<br />
$$A = f(C_1) \quad\text{and}\quad A = g(C_2),$$ (V. 34)<br />
which implies<br />
$$f(C_1) = g(C_2),$$ (V. 35)<br />

and we see that A is degenerate. Function rule (V. 21) leads to the same relation between the quantities<br />
of the HVT,<br />
$$A[\lambda] = f\big(C_1[\lambda]\big) \quad\text{and}\quad A[\lambda] = g\big(C_2[\lambda]\big),$$ (V. 36)<br />
yielding<br />
$$f\big(C_1[\lambda]\big) = g\big(C_2[\lambda]\big).$$ (V. 37)<br />

Again, this is a relation between the value assignments to quantities which do not commute in quantum<br />
mechanics, but the relation is not one-to-one: the functions f and g are not bijective.<br />

One may argue that such a requirement is unreasonable, because such quantities are not<br />
commeasurable. In other words, the structure of quantum mechanics, and particularly the proposition<br />

that an operator can be a function of two non - commuting maximal operators, leads to relations<br />

between quantities which cannot be measured in one single experiment.<br />

The following is what occurs at the different decompositions of unity. Consider two bases, {|α j ⟩}<br />

and {|β j ⟩}, in a Hilbert space H of dimension N > 2 and suppose that |α 1 ⟩ = |β 1 ⟩, while all other<br />

basis vectors are different. Then we have<br />

$$\sum_{j=1}^{N} P_{|\alpha_j\rangle} = \mathbb{1} = \sum_{j=1}^{N} P_{|\beta_j\rangle} \quad\text{and}\quad P_{|\alpha_1\rangle} = P_{|\beta_1\rangle}.$$ (V. 38)<br />

Define, as follows, two maximal operators with all coefficients $c_j$ and $d_j$ distinct,<br />
$$C := \sum_{j=1}^{N} c_j\, P_{|\alpha_j\rangle} \quad\text{and}\quad D := \sum_{j=1}^{N} d_j\, P_{|\beta_j\rangle},$$ (V. 39)<br />

then it follows that<br />
$$P_{|\alpha_1\rangle} = f(C) = g(D),$$ (V. 40)<br />
where f is any function with $f(c_1) = 1$ and $f(c_j) = 0$ for $j \neq 1$, and likewise g for the $d_j$.<br />
This leads to a connection between the non-commuting operators C and D, and using (V. 21)<br />
this leads to a connection between the corresponding representations $C[\lambda]$ and $D[\lambda]$ in the HVT. It is<br />
this type of relation which the HVT cannot satisfy.<br />
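A concrete instance of (V. 38) to (V. 40) in a real 3-dimensional space can be sketched as follows. The two bases, the eigenvalues (1, 2, 3) and (5, 7, 11), and all helper names are choices made for this illustration and do not come from the text. The functions f and g are realized as Lagrange polynomials that equal 1 at the first eigenvalue and 0 at the others.<br />

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)] for r in range(n)]

def projector(v):
    """Projector |v><v| onto a real unit vector v."""
    return [[x * y for y in v] for x in v]

def combine(coefficients, matrices):
    """Linear combination sum_k coefficients[k] * matrices[k]."""
    n = len(matrices[0])
    return [[sum(c * m[r][col] for c, m in zip(coefficients, matrices))
             for col in range(n)] for r in range(n)]

def indicator(op, eigenvalues, k):
    """Polynomial in op that is 1 at eigenvalues[k] and 0 at the other eigenvalues."""
    n = len(op)
    identity = [[1.0 if r == col else 0.0 for col in range(n)] for r in range(n)]
    result = identity
    for j, e in enumerate(eigenvalues):
        if j != k:
            scale = 1.0 / (eigenvalues[k] - e)
            result = matmul(result, combine([scale, -e * scale], [op, identity]))
    return result

def max_abs(m):
    return max(abs(x) for row in m for x in row)

def difference(a, b):
    return combine([1, -1], [a, b])

inv_sqrt2 = 1 / math.sqrt(2)
alpha = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
beta = [[1, 0, 0], [0, inv_sqrt2, inv_sqrt2], [0, inv_sqrt2, -inv_sqrt2]]  # beta_1 = alpha_1

c = [1, 2, 3]     # distinct, so C is maximal
d = [5, 7, 11]    # distinct, so D is maximal
C = combine(c, [projector(v) for v in alpha])
D = combine(d, [projector(v) for v in beta])

P_alpha1 = projector(alpha[0])
f_of_C = indicator(C, c, 0)   # f(C)
g_of_D = indicator(D, d, 0)   # g(D)
commutator = difference(matmul(C, D), matmul(D, C))
```

In this example `f_of_C` and `g_of_D` both agree with `P_alpha1` up to rounding, while `commutator` has nonzero entries. The same projector is therefore a function of two maximal operators that do not commute.<br />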



◃ Remark<br />

Notice that the occurrence of non-maximal operators $P_{|\alpha_i\rangle}$ is indeed essential: if $P_{|\alpha_i\rangle}$ were<br />
maximal, C and D would commute, as we saw in section II. 4 on p. 30. M.J. Maczynski (1971) has<br />
proved that if we exclusively consider maximal quantities, and therefore apply (V. 21) to<br />
maximal quantities only, Kochen and Specker’s theorem is no longer valid, and in that case a HVT is<br />
possible. ▹<br />

An obvious expedient is to strictly constrain requirement (V. 21) to quantities which are measurable<br />

within one context. In our example the projector $P_{|\alpha_1\rangle}$ is commeasurable with both C and D,<br />
while C and D are not mutually commeasurable. Therefore, we have to distinguish between a value<br />
assignment $P_{|\alpha_1\rangle}[\lambda]$ within the context of a measurement of C, and one within the context of a measurement<br />
of D. We can think, for example, of a measurement of C and application of the function relation<br />
$P_{|\alpha_1\rangle} = f(C)$, or of a measurement of D and application of the function relation $P_{|\alpha_1\rangle} = g(D)$.<br />

More generally, suppose<br />

$$A = f(C) = g(D) \quad\text{where}\quad [C, D] \neq 0.$$ (V. 41)<br />
Then we distinguish the hidden variable quantities $A_C[\lambda]$ and $A_D[\lambda]$, where the index indicates the<br />
context of measurement. If C and D do not commute there is, according to a contextual HVT, no<br />
reason to assume that for all $\lambda \in \Lambda$ it holds that<br />
$$A_C[\lambda] = A_D[\lambda],$$ (V. 42)<br />

as is the case in every HVT we have considered so far.<br />

Kochen and Specker do assume (V. 42), however, and find a contradiction with quantum mechanics.<br />

The remedy is therefore to ‘split up’ all degenerate quantities by adding the context in which<br />
they are measured, as was first proposed by B.C. van Fraassen (1973). For the sake of convenience<br />
we here assume that a measurement of a degenerate quantity always proceeds by means of the measurement<br />
of a maximal quantity, which does not have to be split up. By definition we then have<br />
$$A_C[\lambda] = f\big(C[\lambda]\big) \quad\text{and}\quad A_D[\lambda] = g\big(D[\lambda]\big).$$ (V. 43)<br />

This yields a weaker form of (V. 21). Suppose A = f(C), B = g(C) and A = h(B) = h(g(C));<br />
then using (V. 43) we have<br />
$$A_C[\lambda] = h\big(B_C[\lambda]\big).$$ (V. 44)<br />
This consideration leads to a new postulate for a HVT; a HVT which satisfies this<br />
postulate we call contextual.<br />

CONTEXTUAL OBSERVABLES POSTULATE:<br />

If A is a physical quantity which can be taken as a function of at least two other physical<br />
quantities, for example A = f(C) and A = g(D), then, in the HVT, to A corresponds<br />
a function $A_C : \Lambda \to \mathbb{R}$ iff quantity C is measured, and a function $A_D : \Lambda \to \mathbb{R}$ iff<br />
quantity D is measured. If A, f(C) and g(D) are the corresponding quantum mechanical<br />
operators, the following applies,<br />
$$\forall\, \lambda \in \Lambda : \; A_C[\lambda] = A_D[\lambda] \iff [C, D] = 0.$$ (V. 45)<br />



Although splitting up quantities is a natural consequence of the idea of commeasurability, it means<br />

giving up a one - to - one relation between the quantities of quantum mechanics and those of the HVT in<br />

a very drastic manner; since the operator P |α1 ⟩ is part of infinitely many decompositions of unity, there<br />

are infinitely many contexts in which P |α1 ⟩ can be measured.<br />

The idea that the context of the measurement must be taken into consideration can already be<br />
found in Bell (1966). In this article, which was actually written earlier than his famous article with the<br />
Bell inequality, Bell makes some observations concerning the requirements which could be imposed<br />
on a contextual HVT. The hidden variables should have a spatial meaning and enable us to interpolate a space-time<br />
picture, preferably causally, between the preparation and the measurement of states.<br />

He then considers Bohm’s theory of the quantum potential, see chapter VI, and shows that this<br />
theory is not local. He wonders whether every HVT of quantum mechanics must have this non-local character<br />
(Bell 1966, p. 452):<br />

However, it must be stressed that, to the present writer’s knowledge, there is no proof that<br />

any hidden variable account of quantum mechanics must have this extraordinary character.<br />

It would therefore be interesting, perhaps, to pursue some further “impossibility<br />

proofs,” replacing the arbitrary axioms objected to above by some condition of locality,<br />

or of separability of distant systems.<br />

Meanwhile, still before the delayed publication of his article, Bell (1964) himself had found such a<br />

proof.<br />

Now we will show how the idea of locality can be expressed in a contextual HVT with<br />
‘split’ quantities. Consider a composite system with Hilbert space $H = H_I \otimes H_{II}$ and an operator of<br />
the form $A \otimes \mathbb{1}$, where A is maximal in $H_I$. Then the operator $A \otimes \mathbb{1}$ is not maximal in H, and<br />
$$A \otimes \mathbb{1} = f(X),$$ (V. 46)<br />
where X is some maximal operator on H. In particular, consider an X of the form<br />
$$X = X_I \otimes X_{II}.$$ (V. 47)<br />
Suppose there is no interaction, or no longer any, between the systems I and II. Then we can raise the<br />
question whether $X_{II}$ must be taken to belong to the context of $A \otimes \mathbb{1}$.<br />

Consider a second maximal operator<br />
$$Y = X_I \otimes Y_{II}$$ (V. 48)<br />
which only differs from X in the last factor. We then have<br />
$$A \otimes \mathbb{1} = f(X) = g(Y).$$ (V. 49)<br />
A requirement of locality is now that<br />
$$(A \otimes \mathbb{1})_{X_I \otimes X_{II}}[\lambda] = (A \otimes \mathbb{1})_{X_I \otimes Y_{II}}[\lambda],$$ (V. 50)<br />
in other words, a change in what is measured on system II does not result in a splitting of<br />
quantities of system I. A contextual HVT satisfying (V. 50) is called local.<br />



The key question is whether a local contextual HVT is compatible with quantum mechanics. As an<br />
example we consider Bohm’s version of the thought experiment of EPR (Cooke and Hilgevoord 1979):<br />
two spin-1/2 particles in a singlet state. Measurements of the spin of each of the particles<br />
correspond to operators of the form $\sigma_i \otimes \tau_j$, where $\sigma_i$ is the operator of the component of the spin of<br />
the first particle in the direction i and $\tau_j$ is, likewise, the operator for the second particle. In contrast<br />
to the previously considered operators of the form $X_I \otimes X_{II}$, the operators $\sigma_i \otimes \tau_j$ are not maximal.<br />
Let us consider three directions, $i, j \in \{1, 2, 3\}$, which means there are nine such measurements.<br />

The result of a measurement of spin is either up or down, and consequently every measurement has<br />

four possible outcomes. If we introduce a quantity in the HVT for each of the nine quantities, we can,<br />

as we saw, reproduce the quantum mechanical predictions. Between the operators the relation<br />

$$\sigma_i \otimes \tau_j = (\sigma_i \otimes \mathbb{1})\,(\mathbb{1} \otimes \tau_j), \quad\text{with } i, j \in \{1, 2, 3\}$$ (V. 51)<br />
holds. Now we also have to introduce quantities in the HVT for the six operators $\sigma_i \otimes \mathbb{1}$ and $\mathbb{1} \otimes \tau_j$.<br />
In an autonomous HVT the quantities must also satisfy (V. 51), because the factors on the right-hand<br />
side of (V. 51) commute. This means that there are only six independent quantities in the<br />
HVT, and it can be shown that with these the experimental predictions of quantum mechanics cannot<br />
be reproduced, see Wigner’s derivation in VII. 3.<br />
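The operator relation (V. 51) and the commutation of its two factors can be verified directly. The sketch below assumes that the three directions are the coordinate axes, so that $\sigma_i$ and $\tau_j$ are represented by the Pauli matrices, and it also evaluates the singlet expectation values of the nine product operators. The matrix helpers and function names are introduced only for this illustration.<br />

```python
import math

def kron(a, b):
    """Kronecker (tensor) product of two square matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def matmul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)] for r in range(n)]

IDENTITY = [[1, 0], [0, 1]]
PAULI = {
    1: [[0, 1], [1, 0]],
    2: [[0, -1j], [1j, 0]],
    3: [[1, 0], [0, -1]],
}

def product_rule_holds(i, j):
    """Check (V. 51) and that the two factors on its right-hand side commute."""
    first = kron(PAULI[i], IDENTITY)    # sigma_i (x) 1
    second = kron(IDENTITY, PAULI[j])   # 1 (x) tau_j
    return (kron(PAULI[i], PAULI[j]) == matmul(first, second)
            and matmul(first, second) == matmul(second, first))

def singlet_expectation(i, j):
    """<singlet| sigma_i (x) tau_j |singlet> for the state (|up,down> - |down,up>)/sqrt(2)."""
    s = 1 / math.sqrt(2)
    state = [0, s, -s, 0]               # real amplitudes, so no conjugation is needed
    op = kron(PAULI[i], PAULI[j])
    value = sum(state[r] * op[r][c] * state[c] for r in range(4) for c in range(4))
    return value.real
```

For the coordinate axes the expectation value is −1 when i = j (perfect anticorrelation) and 0 otherwise. These are among the quantum mechanical predictions that a HVT has to reproduce.<br />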

In a contextual HVT, however, we consider the quantities $\sigma_i \otimes \mathbb{1}$ and $\mathbb{1} \otimes \tau_j$ to be dependent on<br />
the context of the operators of which they are functions. Let $\chi(\tau_j)$ be a function which assigns the<br />
value 1 to the outcome of every spin measurement $\tau_j$,<br />
$$\chi(\tau_j) = \mathbb{1}, \quad\text{with } j \in \{1, 2, 3\}.$$ (V. 52)<br />
We then have<br />
$$(\sigma_i \otimes \mathbb{1})_{\sigma_i \otimes \tau_j}[\lambda] = \big(\sigma_i \otimes \chi(\tau_j)\big)[\lambda].$$ (V. 53)<br />

This quantity represents the spin of particle 1 within the context of a measurement of $\sigma_i \otimes \tau_j$,<br />
which is a measurement of both spins followed by multiplication of the results. Since $j \in \{1, 2, 3\}$,<br />
this gives a 3-fold splitting of the quantity $\sigma_i \otimes \mathbb{1}$. The product rule now only applies to quantities<br />
in the same context, and its validity is trivial in this case. There are again enough independent quantities in<br />
the HVT to be able to reproduce quantum mechanics. The splitting has worked.<br />

But at the same time we see the price we have to pay: the splitting does not satisfy the weak<br />
requirement of locality (V. 50), because for $j \neq j'$ we make a distinction between the quantities<br />
$$(\sigma_i \otimes \mathbb{1})_{\sigma_i \otimes \tau_j}[\lambda] \quad\text{and}\quad (\sigma_i \otimes \mathbb{1})_{\sigma_i \otimes \tau_{j'}}[\lambda].$$ (V. 54)<br />

This means that properties (quantities having values) of the one particle can no longer be specified<br />
independently of those of the other particle, even if there is no interaction between these particles and<br />

they are located in different galaxies. Redhead (1987, p. 135) speaks of an ontological contextuality.<br />

The conclusion is that a contextual HVT has to be non - local to be compatible with quantum<br />

mechanics.<br />

◃ Remark<br />

Notice that we did not speak of a measurement of the quantity $\sigma_i \otimes \mathbb{1}$. We have invariably seen



this as being derived from the measurement of an operator of which it is a function. In this way the<br />
maximal operators eventually acquire a special status: they are not split up and they are the<br />
only operators which can be measured directly. This can be assumed theoretically, but the relation<br />
with experimental practice in the laboratory, where almost exclusively degenerate quantities are<br />
measured, is less clear. ▹


VI<br />

BOHMIAN MECHANICS<br />

My suggestion is that at each stage the proper order of operation of the mind requires<br />

an overall grasp of what is generally known, not only in formal, logical, mathematical<br />

terms, but also intuitively, in images, feelings, poetic usage of language, etc.<br />

— David Bohm<br />

But why then had Born not told me of this “pilot wave?” If only to point out what was<br />

wrong with it? [. . . ] Why is the pilot wave picture ignored in text books? Should it not be<br />

taught, not as the only way, but as an antidote to the prevailing complacency? To show<br />

that vagueness, subjectivity, and indeterminism, are not forced on us by experimental<br />

facts, but by deliberate theoretical choice?<br />

— John Bell<br />

We briefly describe Bohm’s hidden variables theory, which we will call Bohmian mechanics.<br />
Bohmian mechanics seems to have the same empirical strength as quantum mechanics, but succeeds<br />
in providing a picture in space and time of what exactly takes place in micro-physical reality.<br />

VI. 1<br />

INTRODUCTION<br />

The debate between Bohr and Einstein concerning the interpretation of quantum mechanics<br />
reached its peak in the 1935 EPR article. Although both authors frequently returned to the problems,<br />
neither of them afterwards introduced new elements in his point of view. For most<br />
physicists in the nineteen thirties and later it was not difficult to declare a winner of the debate:<br />
Bohr’s view was accepted nearly unanimously. The question whether a physical reality hides behind<br />
quantum mechanics, consisting of objects which have properties and of which we can form a<br />
picture in space and time, was put aside. It was also thought that Von Neumann’s proof, as discussed<br />
in V. 2, p. 114, made a hidden variables reconstruction of quantum mechanics untenable.<br />

It is the merit of Bohm to have made a breach in the Copenhagen interpretation for the first time,<br />
by doing exactly what was impossible or meaningless according to the Copenhageners. In 1952<br />
he published two articles in which he presented a HVT of quantum mechanics. In the second article<br />
he describes the breach as follows (Bohm 1952 part II, p. 188):<br />

The usual interpretation of the quantum theory implies that we must renounce the possibility<br />

of describing an individual system in terms of a single precisely defined conceptual<br />

model. We have, however, proposed an alternative interpretation which does not imply



such a renunciation, but which instead leads us to regard a quantum - mechanical system<br />

as a synthesis of a precisely definable particle and a precisely definable ψ - field which<br />

exerts a force on this particle.<br />

Bohm’s theory is strongly related to ideas which Louis de Broglie had already put forward at the<br />
Solvay Conference in 1927. However, criticism from the Copenhageners at the conference, especially<br />
that expressed by Pauli, made de Broglie abandon his theory, which was indeed not developed quite completely and<br />
consistently. Bohm devised, independently of de Broglie, a fully elaborated version,<br />
which brought de Broglie back to his original ideas.<br />

We will study Bohm’s theory because it is an example of a concrete HVT, in contrast to the abstract<br />

characterization of such theories which we discussed in the previous chapter. We will see that Bohm’s<br />

theory shows remarkable aspects which differ thoroughly from classical physics.<br />

VI. 2<br />

THE QUANTUM POTENTIAL<br />

Bohm’s theory, which we will call Bohmian mechanics, starts from wave mechanics, i.e. quantum<br />
mechanics with $L^2(\mathbb{R}^n)$ as its Hilbert space, but without the projection postulate.¹ This means that<br />
Bohm assumes that there is a wave function $\psi(\vec q, t)$ which always satisfies the Schrödinger equation.<br />
First we consider the 1-particle case; if there are more particles, ψ has more arguments.<br />

The idea is to interpret this wave function as a statistical description of a particle which always has<br />

a certain position and momentum. We will see that this particle must then be subjected to dynamics<br />

which differs from classical dynamics, by assuming that the forces acting on the particle are not<br />

exclusively the forces known from classical physics.<br />

The basic assumption is the Schrödinger equation for a particle with mass m in a time-independent<br />
potential $V(\vec q)$,<br />
$$i\hbar\,\frac{\partial \psi(\vec q, t)}{\partial t} = -\frac{\hbar^2}{2m}\,\nabla^2 \psi(\vec q, t) + V(\vec q)\,\psi(\vec q, t),$$ (VI. 1)<br />

but we will interpret the wave function differently from its usual interpretation in quantum mechanics.<br />
To this end, we rewrite ψ, with the help of two real functions $R, S : \mathbb{R}^4 \to \mathbb{R}$, as<br />
$$\psi(\vec q, t) = R(\vec q, t)\, e^{i S(\vec q, t)/\hbar}.$$ (VI. 2)<br />
It is always possible to find such functions R and S. Requiring $R(\vec q, t) \geq 0$, R and S are, at given ψ,<br />
uniquely defined (S up to an integer multiple of $2\pi\hbar$), except where ψ = 0. Substitution of (VI. 2) in (VI. 1), and separating the real and<br />
imaginary parts of the resulting equation, leads to two equations,<br />

$$\frac{\partial R(\vec q, t)}{\partial t} = -\frac{1}{2m}\Big(R(\vec q, t)\,\nabla^2 S(\vec q, t) + 2\,\nabla R(\vec q, t)\cdot\nabla S(\vec q, t)\Big),$$ (VI. 3)<br />
$$\frac{\partial S(\vec q, t)}{\partial t} = -\frac{\big(\nabla S(\vec q, t)\big)^2}{2m} - V(\vec q) + \frac{\hbar^2}{2m}\,\frac{\nabla^2 R(\vec q, t)}{R(\vec q, t)}.$$ (VI. 4)<br />
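The intermediate steps of this substitution can be sketched as follows. The sketch assumes the polar form $\psi = R\,e^{iS/\hbar}$ with real R and S, as in (VI. 2), and suppresses the arguments $(\vec q, t)$ for brevity.<br />

```latex
% Derivatives of the polar form psi = R exp(iS/hbar):
\begin{align*}
\frac{\partial\psi}{\partial t}
  &= \Big(\frac{\partial R}{\partial t} + \frac{i}{\hbar}\,R\,\frac{\partial S}{\partial t}\Big)\,e^{iS/\hbar}, \\
\nabla^2\psi
  &= \Big(\nabla^2 R + \frac{2i}{\hbar}\,\nabla R\cdot\nabla S
     + \frac{i}{\hbar}\,R\,\nabla^2 S - \frac{1}{\hbar^2}\,R\,(\nabla S)^2\Big)\,e^{iS/\hbar}.
\end{align*}
% Inserting both into the Schroedinger equation (VI. 1) and dividing by exp(iS/hbar):
\begin{align*}
i\hbar\,\frac{\partial R}{\partial t} - R\,\frac{\partial S}{\partial t}
  &= -\frac{\hbar^2}{2m}\,\nabla^2 R - \frac{i\hbar}{m}\,\nabla R\cdot\nabla S
     - \frac{i\hbar}{2m}\,R\,\nabla^2 S + \frac{1}{2m}\,R\,(\nabla S)^2 + V R.
\end{align*}
% Imaginary part, divided by hbar, gives (VI. 3):
\begin{align*}
\frac{\partial R}{\partial t} &= -\frac{1}{2m}\big(R\,\nabla^2 S + 2\,\nabla R\cdot\nabla S\big).
\end{align*}
% Real part, divided by -R (where R is nonzero), gives (VI. 4):
\begin{align*}
\frac{\partial S}{\partial t} &= -\frac{(\nabla S)^2}{2m} - V + \frac{\hbar^2}{2m}\,\frac{\nabla^2 R}{R}.
\end{align*}
```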

¹ In the literature, the name ‘Bohmian mechanics’ usually refers to a ‘streamlined’ version of Bohm’s original theory, without a<br />
quantum potential.



First we consider equation (VI. 3). Using the abbreviation $\rho = R^2$ this equation becomes<br />
$$\frac{\partial \rho(\vec q, t)}{\partial t} + \nabla\cdot\Big(\rho(\vec q, t)\,\frac{\nabla S(\vec q, t)}{m}\Big) = 0,$$ (VI. 5)<br />

where $\rho = R^2$ is equal to $|\psi|^2$, the quantum mechanical probability density for finding a particle<br />
at a certain position. This leads us to interpret $\rho(\vec q, t)$ as the probability density for finding<br />
the particle at time t at position $\vec q \in \mathbb{R}^3$. If we now interpret $\nabla S(\vec q, t)$ as the momentum of the<br />
particle, $\nabla S = \vec p = m\vec v$, (VI. 5) acquires a clear meaning: it is the continuity equation for a probability<br />
density ρ, which expresses that the total probability, given by the integral of $\rho(\vec q, t)$ over $\mathbb{R}^3$, is<br />
constant in time.<br />
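The step from (VI. 3) to (VI. 5) can be made explicit as in the following short sketch. The conservation statement at the end assumes, in addition, that $\rho\,\nabla S$ vanishes sufficiently fast at infinity; this assumption is not stated in the text.<br />

```latex
% Multiply (VI. 3) by 2R and use rho = R^2:
\begin{align*}
\frac{\partial\rho}{\partial t}
  = 2R\,\frac{\partial R}{\partial t}
 &= -\frac{1}{m}\big(R^2\,\nabla^2 S + 2R\,\nabla R\cdot\nabla S\big) \\
 &= -\frac{1}{m}\,\nabla\cdot\big(R^2\,\nabla S\big)
  = -\nabla\cdot\Big(\rho\,\frac{\nabla S}{m}\Big).
\end{align*}
% Integrating over all space and applying Gauss' theorem
% (assuming rho * grad S vanishes sufficiently fast at infinity):
\begin{align*}
\frac{d}{dt}\int_{\mathbb{R}^3}\rho\;d^3q
  = -\int_{\mathbb{R}^3}\nabla\cdot\Big(\rho\,\frac{\nabla S}{m}\Big)\,d^3q = 0.
\end{align*}
```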

Now consider equation (VI. 4). The last term in this equation is the only term of both (VI. 3)<br />
and (VI. 4) in which Planck’s constant appears explicitly. For this term we define the so-called<br />
quantum potential,<br />
$$U(\vec q, t) := -\frac{\hbar^2}{2m}\,\frac{\nabla^2 R(\vec q, t)}{R(\vec q, t)}.$$ (VI. 6)<br />

If the quantum potential U were equal to 0, equation (VI. 4) would read<br />
$$\frac{\partial S(\vec q, t)}{\partial t} = -\frac{\big(\nabla S(\vec q, t)\big)^2}{2m} - V(\vec q),$$ (VI. 7)<br />

which is exactly the classical Hamilton - Jacobi equation for one particle. In (VI. 7), S is called the<br />

action, and ∇S is, as mentioned above, the momentum of the particle. In other words, if U = 0,<br />

we can interpret equations (VI. 3) and (VI. 4), and therefore also the equivalent Schrödinger equation<br />

(VI. 1), as the statistical description of a particle moving in a potential V in accordance with the<br />

laws of classical mechanics. We will discuss (VI. 7) more elaborately in section VI. 5, thereby also<br />

motivating the interpretation of ∇S.<br />

If the quantum potential U is not equal to 0, the interpretation just discussed can<br />
still be given if we assume that, next to the classical potential V, the quantum potential U is added<br />
as a correction to the equation of motion. The momentum is still given by $\vec p = \nabla S$, and (VI. 5)<br />
remains a continuity equation. However, (VI. 7) is replaced by (VI. 4), the Hamilton-Jacobi<br />
equation for a particle in the potential field V + U. We see that we have now adopted, besides the<br />
well-known $-\nabla V$, an extra force which acts on the particle,<br />
$$\vec F(\vec q, t) = \frac{d\vec p(\vec q, t)}{dt} = -\nabla\big(V(\vec q) + U(\vec q, t)\big).$$ (VI. 8)<br />

If the limit $\hbar \to 0$ is taken in the Schrödinger equation (VI. 1), the result is nonsense, but if $\hbar \to 0$ is<br />
taken in the definition (VI. 6) of the quantum potential U, we have $U(\vec q, t) = 0$, and (VI. 8) reduces<br />
to Newton’s law of motion.<br />

We will now discuss a simple example to illustrate the difference between Bohmian mechanics<br />

and quantum mechanics.



EXAMPLE<br />

A particle sits in a 1-dimensional ‘box’ of length L, having walls which are formed by infinitely<br />
high potential barriers. Quantum mechanics gives as stationary solutions<br />
$$\psi_n(q, t) = \psi_n(q)\, e^{-i E_n t/\hbar},$$ (VI. 9)<br />
with<br />
$$\psi_n(q) = \sqrt{\frac{2}{L}}\,\sin\Big(\frac{n\pi q}{L}\Big), \quad q \in [0, L],$$ (VI. 10)<br />
and energy values<br />
$$E_n = \frac{\hbar^2}{2m}\Big(\frac{n\pi}{L}\Big)^2.$$ (VI. 11)<br />

Therefore, in Bohmian mechanics for a stationary state we have<br />
$$R_n(q, t) = \psi_n(q) \quad\text{and}\quad S_n(q, t) = -E_n t.$$ (VI. 12)<br />
Now it is surprising that in this example it holds that<br />
$$p = \frac{\partial S_n}{\partial q} = \frac{\partial(-E_n t)}{\partial q} = 0,$$ (VI. 13)<br />

i.e., according to Bohmian mechanics the particle is motionless. This also applies to other cases of<br />
stationary states, for example to the ground state of the hydrogen atom. It is in direct contradiction<br />
with the statements of quantum mechanics. After all, in the case of the box quantum mechanics<br />
assigns, if the particle is in the state $\psi_n$, a large probability to finding the momentum p having values<br />
around $\pm n\pi\hbar/L$, in which case the particle moves with speed $|p|/m > 0$, although the quantum mechanical<br />
expectation value of p is zero for the particle in the box.<br />
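The quantum mechanical momentum distribution for the box can be made concrete with a small numerical sketch. It assumes natural units (ħ = 1, L = 1) and evaluates the Fourier transform of $\psi_n$ with a simple midpoint rule. The function names, the grid sizes and the search range are arbitrary choices made for this illustration.<br />

```python
import cmath
import math

HBAR = 1.0    # natural units (assumption of this sketch)
LENGTH = 1.0  # box length L

def psi(q, n):
    """Stationary state (VI. 10) inside the box."""
    return math.sqrt(2 / LENGTH) * math.sin(n * math.pi * q / LENGTH)

def momentum_density(p, n, steps=400):
    """|phi_n(p)|^2, with phi_n(p) = (2 pi hbar)^(-1/2) * integral_0^L psi_n(q) exp(-i p q / hbar) dq."""
    dq = LENGTH / steps
    total = 0j
    for k in range(steps):
        q = (k + 0.5) * dq            # midpoint rule
        total += psi(q, n) * cmath.exp(-1j * p * q / HBAR)
    phi = total * dq / math.sqrt(2 * math.pi * HBAR)
    return abs(phi) ** 2

def most_probable_momentum(n, p_max=40.0, grid=401):
    """Non-negative momentum with the largest density on a coarse grid."""
    candidates = [p_max * k / (grid - 1) for k in range(grid)]
    return max(candidates, key=lambda p: momentum_density(p, n))

n = 5
peak = most_probable_momentum(n)
classical_guess = n * math.pi * HBAR / LENGTH
```

For n = 5 the maximum found on this grid lies close to, but not exactly at, nπħ/L, and the density is symmetric under p → −p, so the expectation value of p vanishes. The Bohmian momentum (VI. 13) is zero for the same state.<br />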

This example shows that the statements of quantum mechanics and Bohmian mechanics do not<br />

coincide for all quantities. They only correspond concerning probability distributions for position<br />

measurements. Bohmian mechanics is, therefore, not a HVT in the sense of chapter V, where it was<br />

assumed that the statements of such a theory are similar to the statements of quantum mechanics for<br />

all quantities. Von Neumann’s impossibility proof is therefore not applicable to Bohmian mechanics.<br />

The explanation of the discrepancy between Bohmian mechanics and quantum mechanics lies, of<br />

course, in the use of the quantum potential. According to Bohm, the energy of the particle in the box<br />

has been entirely stored in the form of potential energy as a result of the quantum potential, hence,<br />

the particle has no kinetic energy.<br />
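This statement can be checked for the box with a few lines of code: for $R_n = \psi_n$ the quantum potential (VI. 6) is constant inside the box and equal to $E_n$. The sketch below assumes natural units (ħ = m = L = 1) and approximates the second derivative by a central finite difference. The names and the step size are choices made for this illustration.<br />

```python
import math

HBAR = 1.0    # natural units (assumption of this sketch)
MASS = 1.0
LENGTH = 1.0

def amplitude(q, n):
    """R_n(q) = psi_n(q) of (VI. 10) and (VI. 12)."""
    return math.sqrt(2 / LENGTH) * math.sin(n * math.pi * q / LENGTH)

def energy(n):
    """E_n of (VI. 11)."""
    return HBAR ** 2 / (2 * MASS) * (n * math.pi / LENGTH) ** 2

def quantum_potential(q, n, h=1e-4):
    """U = -(hbar^2 / 2m) R'' / R of (VI. 6), with a central difference for R''."""
    r = amplitude(q, n)
    second_derivative = (amplitude(q + h, n) - 2 * r + amplitude(q - h, n)) / h ** 2
    return -HBAR ** 2 / (2 * MASS) * second_derivative / r

# Away from the nodes of psi_n, U agrees with E_n up to the discretization error.
u_ground = quantum_potential(0.3, 1)
e_ground = energy(1)
```

At the nodes of $\psi_n$ the ratio in (VI. 6) is not defined, so the evaluation points must avoid them.<br />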

This changes however as soon as we open the box by removing one or both barriers. The quantum<br />

potential energy is again released, and the particle will start to move. The wave packet ψ(⃗q, t) then<br />

spreads out in space, in exactly the same way as prescribed by the Schrödinger equation, and there<br />

is no difference anymore between the statements of both theories concerning the movement of the<br />

particle.



The discrepancy between Bohmian mechanics and quantum mechanics has no perceptible consequences<br />

if we argue that all measurements are ultimately made by means of observation of position.<br />

Every physical quantity is eventually determined by a ‘pointer’ with a certain position, and a momentum<br />

measurement must eventually be registered by means of the displacement of some object.<br />

◃ Remark<br />

Notice that Bohm’s point of view deviates from that of Bohr, which says that position and momentum<br />

measurements exclude each other in principle but are both necessary to be able to give an exhaustive<br />

description of the system. ▹<br />

Figure VI. 1: The quantum potential for the two slit system as viewed from the screen, under assumption<br />

of a Gaussian distribution at the slits (Bohm 1989 )<br />

Finally we consider a special case. Suppose that $A, B \subset \mathbb{R}^3$ are disjoint areas in space,<br />
i.e. $A \cap B = \emptyset$, that $\psi_A$ and $\psi_B$ are wave functions which are 0 outside these areas, and that the wave<br />
function has the following form,<br />
$$\psi(\vec q) = a\,\psi_A(\vec q) + b\,\psi_B(\vec q),$$ (VI. 14)<br />
with $a, b \in \mathbb{R}$. Since $\psi_A$ and $\psi_B$ have no overlap, for all $\vec q \in \mathbb{R}^3$ it holds that<br />
$$\psi_A(\vec q)\,\psi_B(\vec q) = 0.$$ (VI. 15)<br />
Therefore, the probability density belonging to (VI. 14) is<br />
$$\rho(\vec q) = |a\,\psi_A(\vec q)|^2 + |b\,\psi_B(\vec q)|^2,$$ (VI. 16)<br />
without a cross-term, and we see that the ensemble of particles described by the density $|\psi(\vec q)|^2$<br />
behaves like a mixture.<br />



With<br />
$$S(\vec q) = \begin{cases} S_A(\vec q) & \text{for } \vec q \in A, \\ S_B(\vec q) & \text{for } \vec q \in B, \\ 0 & \text{elsewhere,} \end{cases}$$ (VI. 17)<br />
and $\psi_A(\vec q) = R_A(\vec q)\,e^{i S_A(\vec q)/\hbar}$, etc., (VI. 14) reads<br />
$$\psi(\vec q) = \big(a\,R_A(\vec q) + b\,R_B(\vec q)\big)\,e^{i S(\vec q)/\hbar},$$ (VI. 18)<br />

which means that also the quantum potential, as depicted in figure VI. 1, can now be taken as a sum<br />

of terms belonging to separate areas. The particles in area A do not perceive the wave function in<br />

area B at all.<br />

Figure VI. 2: A simulation of the double slit experiment in Bohmian mechanics. Each particle follows<br />

a certain path between the slits and the photographic plate. All particles coming from the upper slit<br />

arrive at the upper half of the photographic plate, likewise for the lower slit and lower half of the<br />

plate. The twists in the paths are caused by the quantum potential U. (Vigier et al. 1987 )<br />

VI. 3<br />

COMPOSITE SYSTEMS<br />

The technique used to rewrite the Schrödinger equation into equations describing particles with<br />

definite position and momentum in a non - classical potential field, can easily be generalized. For



example, for a system of two particles, represented by the wave function ψ (⃗q 1 , ⃗q 2 , t), we interpret<br />

|ψ(⃗q 1 , ⃗q 2 , t)| 2 as the probability density that, simultaneously, particle 1 is located at position ⃗q 1<br />

and particle 2 at position ⃗q 2 .<br />

We write<br />
$$\psi(\vec q_1, \vec q_2, t) = R(\vec q_1, \vec q_2, t)\, e^{i S(\vec q_1, \vec q_2, t)/\hbar},$$ (VI. 19)<br />
and the quantum potential is now given by<br />
$$U(\vec q_1, \vec q_2, t) = -\frac{\hbar^2}{R(\vec q_1, \vec q_2, t)}\left(\frac{\nabla_1^2 R(\vec q_1, \vec q_2, t)}{2 m_1} + \frac{\nabla_2^2 R(\vec q_1, \vec q_2, t)}{2 m_2}\right),$$ (VI. 20)<br />

where $\nabla_i := \partial/\partial\vec q_i$ is the gradient with respect to the coordinates of particle i. In this expression the coordinates<br />
of both particles occur. Therefore, the force on particle 1, $\vec F_1 = -\nabla_1(V + U)$, also depends,<br />
by means of the quantum potential, on the position of particle 2, and vice versa. This can be compared<br />
to the situation in Newton’s gravitation theory, where such a dependence appears in the classical<br />
potential V; there is an instantaneous interaction (Latin: actio in distans) between particles: a choice<br />
of another initial position of one particle immediately influences the dynamics of the other.<br />

Notice, however, that in Bohmian mechanics this influence does not have to decrease with the<br />
distance between the particles. Even if $R(\vec q_1, \vec q_2, t)$ goes to 0 for $\|\vec q_1 - \vec q_2\| \to \infty$, the quantum<br />
potential $U(\vec q_1, \vec q_2)$ does not need to do so: it depends on the second derivative of R divided by R, which means that<br />
it depends on the strength of the oscillation of R, not on the amplitude.<br />

Also notice that the mutual dependence between the particles does not only appear by means of<br />

the quantum potential. The momentum of particle 1, given by ∇ 1 S(⃗q 1 , ⃗q 2 , t), cannot be chosen independently<br />

of the position of particle 2, and vice versa. This does not even happen in a classical theory<br />

with an actio in distans, and it gives Bohmian mechanics a deeply ‘holistic’ character.<br />

Only when the total wave function is a product does this mutual dependence disappear, because then<br />

ψ(⃗q 1 , ⃗q 2 , t) = ψ 1 (⃗q 1 , t) ψ 2 (⃗q 2 , t), (VI. 21)<br />

yielding<br />

R(⃗q 1 , ⃗q 2 , t) = R 1 (⃗q 1 , t) R 2 (⃗q 2 , t),<br />

S(⃗q 1 , ⃗q 2 , t) = S 1 (⃗q 1 , t) + S 2 (⃗q 2 , t) (VI. 22)<br />

and, consequently, (VI. 20) becomes<br />

U (⃗q 1 , ⃗q 2 , t) = U 1 (⃗q 1 , t) + U 2 (⃗q 2 , t). (VI. 23)<br />

Each particle only feels its own potential field, and its momentum does not depend on the position<br />

of the other particle. If now the classical potential V is also a sum of 1 - particle potentials, this<br />

factorizability is preserved in time.<br />
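The additivity of the quantum potential for product states can be checked numerically. The Python sketch below is a hedged illustration, not part of the original argument: it assumes one spatial dimension per particle, ħ = 1, illustrative masses, and two hypothetical amplitude functions, one a product and one not.<br />

```python
import math

HBAR = 1.0          # assumption: natural units
M1, M2 = 1.0, 2.0   # assumption: illustrative masses
H = 1e-3            # finite-difference step


def d2(f, x):
    """Central-difference approximation of the second derivative of f at x."""
    return (f(x + H) - 2.0 * f(x) + f(x - H)) / H**2


def U_pair(R, x1, x2):
    """Quantum potential of (VI. 20), with one dimension per particle."""
    lap1 = d2(lambda s: R(s, x2), x1)
    lap2 = d2(lambda s: R(x1, s), x2)
    return -(HBAR**2 / R(x1, x2)) * (lap1 / (2.0 * M1) + lap2 / (2.0 * M2))


def R_product(x1, x2):
    """Amplitude of a product state, R1(x1) * R2(x2)."""
    return math.exp(-x1**2 / 2.0) / (1.0 + x2**2)


def R_entangled(x1, x2):
    """Amplitude that does not factorize into R1(x1) * R2(x2)."""
    return math.exp(-(x1 - x2)**2 / 2.0 - (x1 + x2)**2 / 8.0)


def non_additivity(R, xa=1.0, xb=0.3, ya=2.0, yb=-1.0):
    """Is zero whenever U(x1, x2) = U1(x1) + U2(x2)."""
    return (U_pair(R, xa, ya) - U_pair(R, xa, yb)
            - U_pair(R, xb, ya) + U_pair(R, xb, yb))


print(non_additivity(R_product))    # approximately 0
print(non_additivity(R_entangled))  # clearly non-zero
```

For the product amplitude the mixed difference vanishes up to rounding, so the force on particle 1 does not depend on the position of particle 2. For the non-factorizing amplitude it does not vanish.<br />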

We know, however, that the wave function ψ (⃗q 1 , ⃗q 2 , t) does in general not have to be a product<br />

state, and even if it is a product state at some moment, it will generally not remain to be one. We<br />

must therefore conclude that the quantum potential U represents a non - local connection between the<br />

particles.



◃ Remark<br />

For Bell, this observation was a reason to examine if quantum mechanical HVT’s can, in fact, be local<br />

at all. We will come back to this in chapter VII. ▹<br />

An intermediate form occurs if A, B, C, D ⊂ R 3 are certain areas in space, such that A ∩ C = ∅<br />

or B ∩ D = ∅, ψ A , ψ C , ϕ B , ϕ D are wave functions which are 0 outside these areas, and the wave<br />

function is, analogously to (VI. 14), of the form<br />

ψ(⃗q 1 , ⃗q 2 ) = a ψ A (⃗q 1 ) ϕ B (⃗q 2 ) + b ψ C (⃗q 1 ) ϕ D (⃗q 2 ), (VI. 24)<br />

with a, b ∈ R. Since the pair ψ A and ψ C , or the pair ϕ B and ϕ D , or both, have no overlap, for<br />

all ⃗q 1 , ⃗q 2 ∈ R 3 we have<br />

ψ A (⃗q 1 ) ψ C (⃗q 1 ) = 0 or ϕ B (⃗q 2 ) ϕ D (⃗q 2 ) = 0. (VI. 25)<br />

Therefore, the probability density belonging to (VI. 24) is<br />

ρ(⃗q 1 , ⃗q 2 ) = R 2 (⃗q 1 , ⃗q 2 ) = |a ψ A (⃗q 1 ) ϕ B (⃗q 2 )| 2 + |b ψ C (⃗q 1 ) ϕ D (⃗q 2 )| 2 , (VI. 26)<br />

without a cross - term, and we see that the ensemble, again analogously to (VI. 14), behaves like a<br />

mixture. In this case we call the wave function ψ(⃗q 1 , ⃗q 2 ) effectively factorizable.<br />

With<br />

$$S_{\text{tot}}(\vec q_1, \vec q_2) = \begin{cases} S_A(\vec q_1) + S_B(\vec q_2) & \text{for } \vec q_1 \in A,\ \vec q_2 \in B \\ S_C(\vec q_1) + S_D(\vec q_2) & \text{for } \vec q_1 \in C,\ \vec q_2 \in D \\ 0 & \text{elsewhere,} \end{cases} \qquad \text{(VI. 27)}$$<br />

and ψ A (⃗q 1 ) = R A (⃗q 1 )e i S A(⃗q 1 ) , etc., because of (VI. 25) it holds that<br />

$$\begin{aligned} \psi(\vec q_1, \vec q_2) &= a\, R_A(\vec q_1)\, R_B(\vec q_2)\, e^{i (S_A(\vec q_1) + S_B(\vec q_2))} + b\, R_C(\vec q_1)\, R_D(\vec q_2)\, e^{i (S_C(\vec q_1) + S_D(\vec q_2))} \\ &= \bigl( a\, R_A(\vec q_1)\, R_B(\vec q_2) + b\, R_C(\vec q_1)\, R_D(\vec q_2) \bigr)\, e^{i S_{\text{tot}}(\vec q_1, \vec q_2)}. \end{aligned} \qquad \text{(VI. 28)}$$<br />

Therefore, also in case of composite systems, the quantum potential can be taken as a sum of terms<br />

belonging to the separate particles, and the momentum of a particle does not depend on the other<br />

particle.<br />

Consequently, we can interpret the system as being composed of a pair of particles of which one<br />

particle is in area A and the other in B, or, likewise, in area C and D. The pair of particles is not<br />

influenced by the wave functions or the quantum potential in the other area. For this reason, these<br />

pilot waves are also called empty waves. They have no dynamic influence on the particles, but they<br />

do contain energy. If, at some time, the wave functions will have overlap again, they will of course<br />

also regain influence.



VI. 4<br />

REMARKS AND PROBLEMS<br />

In Bohmian mechanics the wave function plays a double role. On the one hand, we see<br />

that ρ(⃗q, t 0 ) = R 2 = |ψ (⃗q, t 0 )| 2 is equal to the probability density to find a particle at time t 0 at<br />

a certain position, and we use this to characterize the ensemble at t 0 . On the other hand, ψ determines<br />

the value of R, and thereby, by means of formula (VI. 6) or (VI. 20), also the quantum potential which<br />

has the same status as the classical potential V . This means that ψ is also connected with the dynamic<br />

evolution of particles.<br />

This is strange if seen from a classical perspective. In classical statistical mechanics it is always<br />

possible to specify the form of the probability density at t 0 independently of the dynamics. Inversely,<br />

the force acting on a particle in a classical theory does not depend on the probabilities that the particle<br />

would be at another position than it actually is. But we saw that in Bohmian mechanics the force does<br />

depend on the probabilities. In Bohm’s interpretation we must therefore assume that if at an initial<br />

time t 0 the quantum mechanical probability density is |ψ (⃗q, t 0 )| 2 , the particles subsequently move<br />

under the influence of forces which are also determined by ψ(⃗q, t 0 ).<br />

Nonetheless, it can be proved that if this pre - established harmony is valid at one moment in<br />

time, it remains valid at all other times. In later work, Bohm speculated that this harmony between<br />

the quantum potential and the probability density could possibly be understood as a requirement for<br />

equilibrium of an underlying ‘sub - quantum aether’. This idea suggests that if this<br />

equilibrium can be disturbed, it is restored only after some time, so that deviations<br />

from the quantum mechanical predictions could appear in very rapid measurements. Until now such<br />

deviations have not been found.<br />

Bohmian mechanics gives, on the basis of the thesis that, eventually, all measurements are position<br />

measurements, the same empirically verifiable predictions as standard quantum mechanics does.<br />

Moreover, it provides a picture in which particles have position and momentum and it can be visualized<br />

how the particles move through space, even if there is no measurement. Also, Bohmian<br />

mechanics is deterministic; the evolution is determined by classical mechanics, extended with the<br />

quantum potential. Although these properties seem to be large advantages, Bohm’s proposal evoked<br />

no enthusiasm in the nineteen fifties.<br />

Of course, from the side of the Copenhageners little support was to be expected. The proposal was<br />

dismissed as ‘metaphysical speculation’, a return to the lost paradise of classical physics. Bohm parried<br />

this argument by calling the Copenhageners’ ‘completeness’ claim untestable and metaphysical.<br />

But Einstein also found the idea ‘too cheap’ because it leaned too much on the quantum mechanical<br />

formalism in combination with the classical idea of particles. Einstein himself thought that a<br />

completely new theory with a totally different perspective was necessary, such as his unified field<br />

theory. Probably, Einstein also had objections because of the far - reaching non - locality of Bohmian<br />

mechanics.<br />

Others stumbled at the fact that Bohmian mechanics only relies on a rewriting of the Schrödinger<br />

equation, and contains nothing new. Bohm had foreseen this criticism and tried to argue that his theory<br />

presents new ideas for experiments and that on distance and energy scales which are within range of<br />

Heisenberg’s indeterminacy principle, Bohmian mechanics will prove to be necessary. But above all,<br />

Bohm wanted to show the possibility of a HVT and to challenge the necessity of the Copenhagen<br />

interpretation.



Bohmian mechanics has not led to new verifiable statements, although ‘tunneling times’ are debated,<br />

about which quantum mechanics does not say anything, but Bohmian mechanics does. Furthermore,<br />

by the fresh look supplied by Bohmian mechanics, new extensions of the theory are suggested,<br />

such as the suggestion of an underlying sub - quantum aether, as a result of the unexpected double<br />

role of the wave function.<br />

In the nineteen nineties, a growing group of physicists considered Bohmian mechanics to be a<br />

serious alternative to the Copenhagen interpretation; see for example Holland (1993) and Cushing<br />

(1994), who suggests a sociological explanation for the fact that the physicists’ community did not<br />

replace quantum mechanics by the, according to Cushing, superior Bohmian mechanics.<br />

VI. 5<br />

THE HAMILTON - JACOBI EQUATION<br />

In classical mechanics we assume that for a system of n point particles, with canonical positions<br />

⃗q = (q 1 , . . . , q n ) ∈ R 3n and speeds ˙⃗q = ( ˙q 1 , . . . , ˙q n ) ∈ R 3n , a Lagrangian L(⃗q, ˙⃗q, t) can be<br />

found, the Lagrangian L = T − V being the difference between kinetic and potential energy. Define<br />

the following functional, called the action<br />

$$S_\gamma(\vec q, t; \vec q_0, t_0) := \int_\gamma L(\vec q, \dot{\vec q}, t)\, dt, \qquad \text{(VI. 29)}$$<br />

where the integral, for n particles in 3 dimensions, is taken over a continuous path γ in configuration<br />

space R 3n between an initial configuration ⃗q 0 at time t 0 and the configuration ⃗q at time t. In case the<br />

Lagrangian does not explicitly depend on t, we can also write S γ (⃗q, ⃗q 0 , t − t 0 ).<br />

The equations of motion are found by application of Hamilton’s principle of least action; for the<br />

path γ 0 which is actually followed, the action reaches an extremum in comparison to all possible<br />

continuous paths. This requirement,<br />

δS γ = 0, (VI. 30)<br />

provides the 3n equations of motion of Euler and Lagrange,<br />

$$\frac{d}{dt} \frac{\partial L}{\partial \dot q_j} - \frac{\partial L}{\partial q_j} = 0, \qquad j = 1, \ldots, 3n. \qquad \text{(VI. 31)}$$<br />

The Hamiltonian, H = T + V , is defined as the Legendre transform of the Lagrangian,<br />

$$H(\vec q, \vec p, t) := \sum_{j=1}^{3n} p_j\, \dot q_j - L(\vec q, \dot{\vec q}, t), \qquad \text{(VI. 32)}$$<br />

where<br />

$$p_j := \frac{\partial L}{\partial \dot q_j} \qquad \text{(VI. 33)}$$<br />

is the canonical momentum.



Substitution of (VI. 32) in (VI. 29) yields<br />

$$S_\gamma = \int_\gamma \Bigl( \sum_{j=1}^{3n} p_j\, \dot q_j - H(\vec q, \vec p, t) \Bigr)\, dt = \sum_{j=1}^{3n} \int_\gamma p_j\, dq_j - \int_\gamma H(\vec q, \vec p, t)\, dt, \qquad \text{(VI. 34)}$$<br />

and variation of S γ in this form yields the 6n Hamiltonian equations of motion,<br />

$$\dot q_j = \frac{\partial H}{\partial p_j}, \qquad \dot p_j = -\frac{\partial H}{\partial q_j}. \qquad \text{(VI. 35)}$$<br />

Now consider the action S γ along a real path γ 0 , i.e., a path satisfying the equations of motion,<br />

and form its differential,<br />

$$dS(\vec q, \vec q_0, t - t_0) = \sum_{j=1}^{3n} \bigl( p_j\, dq_j - p_{0j}\, dq_{0j} \bigr) - H(\vec q, \vec p, t)\, dt. \qquad \text{(VI. 36)}$$<br />

Comparison with<br />

$$dS(\vec q, \vec q_0, t - t_0) = \sum_{j=1}^{3n} \Bigl( \frac{\partial S}{\partial q_j}\, dq_j + \frac{\partial S}{\partial q_{0j}}\, dq_{0j} \Bigr) + \frac{\partial S}{\partial t}\, dt \qquad \text{(VI. 37)}$$<br />

and using requirement (VI. 30) shows that<br />

$$H(\vec q, \vec p, t) = -\frac{\partial S}{\partial t}, \qquad p_j = \frac{\partial S}{\partial q_j}, \qquad p_{0j} = -\frac{\partial S}{\partial q_{0j}}, \qquad \text{(VI. 38)}$$<br />

and therefore<br />

$$\frac{\partial S}{\partial t} + H\Bigl( \vec q, \frac{\partial S}{\partial \vec q}, t \Bigr) = 0. \qquad \text{(VI. 39)}$$<br />

This is (VI. 7), the Hamilton - Jacobi equation, as discussed on p. 129. The technique to solve the<br />

mechanical equations of motion by means of this equation is especially due to Jacobi. Without discussing<br />

this technique in detail, we mention the following.<br />

For definite q 0 and t 0 it is possible to consider the action S as a function on configuration space. It<br />

can be shown that the paths satisfying the equations of motion are always perpendicular to the hyperplanes<br />

of constant S, hence the frequently quoted analogy with optics; paths are comparable to rays<br />

of light, and planes of constant S to wave fronts. If, for one moment in time, the values S are given<br />

over the complete configuration space, the Hamilton - Jacobi equation determines how they evolve in<br />

the course of time. The problem to find the paths of the particles is thus reduced to constructing the<br />

curves which are normal to the planes of constant S.<br />
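As a concrete check of (VI. 38) and (VI. 39), consider a free particle in one dimension. The Python sketch below is an illustration under assumptions not made in the text: the Hamiltonian H = p²/2m, the well-known free-particle action S = m(q − q 0 )²/(2(t − t 0 )), illustrative parameter values, and hypothetical function names.<br />

```python
M = 2.0            # assumption: illustrative mass
Q0, T0 = 0.5, 0.0  # assumption: illustrative initial point


def action(q, t):
    """Action of a free particle along the straight path from (Q0, T0) to (q, t)."""
    return M * (q - Q0)**2 / (2.0 * (t - T0))


def partial(f, args, index, h=1e-5):
    """Central-difference partial derivative of f with respect to args[index]."""
    up = list(args)
    up[index] += h
    down = list(args)
    down[index] -= h
    return (f(*up) - f(*down)) / (2.0 * h)


def hamilton_jacobi_residual(q, t):
    """dS/dt + H(q, dS/dq) for H = p^2 / 2m; vanishes if (VI. 39) holds."""
    p = partial(action, (q, t), 0)   # p = dS/dq, as in (VI. 38)
    return partial(action, (q, t), 1) + p**2 / (2.0 * M)


q, t = 3.0, 4.0
print(hamilton_jacobi_residual(q, t))                        # close to 0
print(partial(action, (q, t), 0), M * (q - Q0) / (t - T0))   # momentum, two ways
```

The momentum obtained from ∂S/∂q agrees with mass times the constant velocity (q − q 0 )/(t − t 0 ), as expected for motion along a straight line.<br />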

◃ Remark<br />

Schrödinger originally based his derivation of wave mechanics on the idea that wave mechanics is to<br />

classical mechanics as wave optics is to ray optics, and with the just mentioned wave fronts and the<br />

Hamilton - Jacobi equation he came to his wave mechanics. ▹


VII<br />

BELL’S INEQUALITIES<br />

There is hardly a paper - nor was there any during the past two and a half decades -<br />

which deals with the foundations of quantum mechanics and does not refer to the work<br />

of John Stewart Bell.<br />

— Max Jammer<br />

Bell’s theorem is the most profound discovery of science.<br />

— Henry Stapp<br />

[. . . ] Bell is generally credited with having brought down a purely philosophical issue<br />

from the lofty realms of abstract speculation to the tangible reach of empirical investigation<br />

and of having thereby established what has been called ‘experimental metaphysics’.<br />

— Max Jammer<br />

The ‘Bell inequalities’ is a generic term for inequalities in terms of measurable physical quantities<br />

which are satisfied by hidden variables theories, but are violated by quantum mechanics. We will<br />

derive several Bell inequalities, belonging to different types of hidden variables theories. This<br />

also includes indeterministic, stochastic HVT’s, which fell outside the scope of chapter V.<br />

VII. 1<br />

LOCAL DETERMINISTIC HIDDEN VARIABLES<br />

VII. 1. 1<br />

DERIVATION <strong>OF</strong> THE FIRST BELL INEQUALITY<br />

Returning to the hidden variables theories, HVT’s, we focus our attention on a specific experiment.<br />

In the article ‘On the Einstein Podolsky Rosen paradox’ (1964), J.S. Bell examines the EPR experiment,<br />

discussed in section I. 2, in a version which was given by Bohm and Aharonov (Bohm 1957),<br />

also called the EPRB experiment. Bohm and Aharonov proposed an experiment in which two spin<br />

1/2 particles are prepared in the singlet state and, next, move apart in opposite directions. After they<br />

are separated, the spin of each of the particles is measured in an arbitrary direction, where the spin of<br />

particle 1 is measured in direction ⃗a and the remote particle 2 in direction ⃗ b, as in figure III. 3, p. 73.<br />

In this experiment, one can follow the same argument as EPR. Using the notation of section III. 6,<br />

if measurement of ⃗σ 1 · ⃗a yields the value +1 then, for the singlet state, measurement of ⃗σ 2 · ⃗a must<br />

yield the value −1 and vice versa.<br />

Since the result of a measurement of a spin component of the one particle can be predicted with<br />

certainty by measuring the same component of the other particle, whereas the particles are far away



from each other and do not interact, it follows, according to EPR, that the result of a measurement<br />

of any spin component is determined in advance, i.e., that it is an element of physical reality. This<br />

suggests that there should be a more complete description of the state of the particles, including<br />

hidden variables.<br />

Specify this description of the pair of particles with variables λ ∈ Λ as we did in chapter V. We<br />

write the quantities corresponding to (⃗σ 1·⃗a)⊗(⃗σ 2·⃗b) as the pair (A, B), having values a,b = ±1. In a<br />

contextual HVT, these values are dependent on the hidden variable λ and the total measuring context,<br />

which can be specified here by means of the measurement directions ⃗a and ⃗ b, leading to<br />

A = A(⃗a, ⃗ b, λ) and B = B(⃗a, ⃗ b, λ). (VII. 1)<br />

Now the essential assumption is the requirement of locality that the quantity A does not depend<br />

on the reading ⃗ b of a remote spin meter, and vice versa for B and ⃗a. These quantities therefore only<br />

depend upon the local context,<br />

A(⃗a, ⃗ b, λ) = A(⃗a, λ), a = ±1,<br />

B(⃗a, ⃗ b, λ) = B( ⃗ b, λ), b = ±1. (VII. 2)<br />

Figure VII. 1: Thought experiment of Einstein, Podolsky and Rosen on the singlet. (Schematic: a source with distribution ρ(λ) between spin meter A, with settings ⃗a, ⃗a ′ and outcomes a, a ′ = ±1 given by A(⃗a, λ) and A(⃗a ′ , λ), and spin meter B, with settings ⃗ b, ⃗ b ′ and outcomes b, b ′ = ±1 given by B( ⃗ b, λ) and B( ⃗ b ′ , λ).)<br />

The source emitting the particle pairs probably does not prepare the pairs in the same state λ each<br />

time. We assume that the source can be characterized by a probability density ρ,<br />

$$\int_\Lambda \rho(\lambda)\, d\lambda = 1, \qquad \text{(VII. 3)}$$<br />

where we also assume that this probability density does not depend on the measuring directions ⃗a<br />

and ⃗ b, which, after all, can be established long after the particles have left the source. The expectation<br />

value of the product of A and B in this HVT is therefore<br />

$$E(\vec a, \vec b) = \int_\Lambda A(\vec a, \lambda)\, B(\vec b, \lambda)\, \rho(\lambda)\, d\lambda. \qquad \text{(VII. 4)}$$<br />



Quantum mechanics gives as the expectation value, with the particle pair in the singlet state, see<br />

equation (III. 171), p. 73,<br />

$$E_{QM}(\vec a, \vec b) = \langle\, \vec\sigma_1 \cdot \vec a \otimes \vec\sigma_2 \cdot \vec b \,\rangle = -\vec a \cdot \vec b = -\cos\theta_{\vec a, \vec b}. \qquad \text{(VII. 5)}$$<br />

But the expressions (VII. 4) and (VII. 5) cannot coincide for all directions ⃗a and ⃗ b. According<br />

to (VII. 2), the expectation value E(⃗a, ⃗ b) of the product of A and B cannot be less than −1. Therefore,<br />

to reach −1 at ⃗a = ⃗ b, also requiring equality between (VII. 4) and (VII. 5), it must hold for all unit<br />

vectors ⃗n that<br />

A(⃗n, λ) = − B(⃗n, λ), (VII. 6)<br />

which leads to<br />

$$E(\vec a, \vec b) = -\int_\Lambda A(\vec a, \lambda)\, A(\vec b, \lambda)\, \rho(\lambda)\, d\lambda. \qquad \text{(VII. 7)}$$<br />

Now it follows, because of ( A(⃗n, λ) ) 2 = 1, that<br />

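$$\begin{aligned} E(\vec a, \vec b) - E(\vec a, \vec b\,') &= -\int_\Lambda \bigl( A(\vec a, \lambda)\, A(\vec b, \lambda) - A(\vec a, \lambda)\, A(\vec b\,', \lambda) \bigr)\, \rho(\lambda)\, d\lambda \\ &= \int_\Lambda A(\vec a, \lambda)\, A(\vec b, \lambda)\, \bigl( A(\vec b, \lambda)\, A(\vec b\,', \lambda) - 1 \bigr)\, \rho(\lambda)\, d\lambda, \end{aligned} \qquad \text{(VII. 8)}$$<br />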

where ⃗ b ′ is another setting of the remote spin meter, and A( ⃗ b ′ , λ) also has values ±1. Taking the<br />

absolute value on both sides, keeping in mind that |A(⃗a, λ)A( ⃗ b, λ)| = 1, it follows that<br />

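$$|E(\vec a, \vec b) - E(\vec a, \vec b\,')| \leq \int_\Lambda \bigl( 1 - A(\vec b, \lambda)\, A(\vec b\,', \lambda) \bigr)\, \rho(\lambda)\, d\lambda, \qquad \text{(VII. 9)}$$<br />

or,<br />

$$|E(\vec a, \vec b) - E(\vec a, \vec b\,')| \leq 1 + E(\vec b, \vec b\,'). \qquad \text{(VII. 10)}$$<br />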

This is the original Bell inequality.<br />
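Because A takes only the values ±1, the step from (VII. 8) to (VII. 10) can be verified by brute force. The Python sketch below is an illustration with hypothetical names: it enumerates all sign assignments for (A(⃗a, λ), A( ⃗ b, λ), A( ⃗ b ′ , λ)) and then tests (VII. 10) for randomly weighted mixtures of them, which play the role of ρ(λ) on a finite Λ.<br />

```python
import random
from itertools import product

# Possible values of (A(a), A(b), A(b')) for a single hidden state lambda.
ASSIGNMENTS = list(product((-1, 1), repeat=3))


def pointwise_ok():
    """Check |A(a)A(b) - A(a)A(b')| <= 1 - A(b)A(b') for every assignment."""
    return all(abs(x * y - x * z) <= 1 - y * z for x, y, z in ASSIGNMENTS)


def bell_gap(weights):
    """Return 1 + E(b,b') - |E(a,b) - E(a,b')| for a mixture over ASSIGNMENTS."""
    total = sum(weights)

    def E(i, j):
        # (VII. 7): E = - sum over lambda of A_i A_j rho(lambda)
        return -sum(w * s[i] * s[j] for w, s in zip(weights, ASSIGNMENTS)) / total

    return 1 + E(1, 2) - abs(E(0, 1) - E(0, 2))


rng = random.Random(0)
gaps = [bell_gap([rng.random() for _ in ASSIGNMENTS]) for _ in range(1000)]
print(pointwise_ok(), min(gaps) >= -1e-12)
```

A non-negative gap for every mixture is exactly the statement of (VII. 10) for this finite model.<br />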

VII. 1. 2<br />

THE BELL INEQUALITY <strong>OF</strong> CLAUSER, HORNE, SHIMONY AND HOLT<br />

Next, we will derive a second inequality. In (VII. 8), we replace ⃗a by ⃗a ′ and the − sign by<br />

the + sign,<br />

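$$\begin{aligned} E(\vec a\,', \vec b) + E(\vec a\,', \vec b\,') &= -\int_\Lambda \bigl( A(\vec a\,', \lambda)\, A(\vec b, \lambda) + A(\vec a\,', \lambda)\, A(\vec b\,', \lambda) \bigr)\, \rho(\lambda)\, d\lambda \\ &= -\int_\Lambda A(\vec a\,', \lambda)\, A(\vec b, \lambda)\, \bigl( 1 + A(\vec b, \lambda)\, A(\vec b\,', \lambda) \bigr)\, \rho(\lambda)\, d\lambda. \end{aligned} \qquad \text{(VII. 11)}$$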



Now, in the same way as we derived (VII. 10), we obtain<br />

$$|E(\vec a\,', \vec b) + E(\vec a\,', \vec b\,')| \leq 1 - E(\vec b, \vec b\,'). \qquad \text{(VII. 12)}$$<br />

Combination of (VII. 10) and (VII. 12) leads to<br />

$$|E(\vec a, \vec b) - E(\vec a, \vec b\,')| + |E(\vec a\,', \vec b) + E(\vec a\,', \vec b\,')| \leq 2. \qquad \text{(VII. 13)}$$<br />

This version of the Bell inequality was first derived, although under weaker assumptions than those<br />

used here, by Clauser, Horne, Shimony and Holt (Clauser 1969), for which reason it is also called the<br />

CHSH inequality. We will return to these assumptions in section VII. 2.<br />
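The bound 2 in (VII. 13) can be made plausible by enumerating all deterministic outcome assignments for a single hidden state. The Python sketch below is an illustration with hypothetical names. As an assumption it treats the outcomes at the two meters as independent signs, so it does not use the anti - correlation condition (VII. 6).<br />

```python
from itertools import product


def chsh(A_a, A_ap, B_b, B_bp):
    """Pointwise CHSH combination |A(a)B(b) - A(a)B(b')| + |A(a')B(b) + A(a')B(b')|."""
    return abs(A_a * B_b - A_a * B_bp) + abs(A_ap * B_b + A_ap * B_bp)


# All 16 assignments of +1/-1 to (A(a), A(a'), B(b), B(b')).
values = {chsh(*signs) for signs in product((-1, 1), repeat=4)}
print(values)
```

Every assignment gives the value 2, so averaging over any distribution ρ(λ) and applying the triangle inequality cannot produce more than 2.<br />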

VII. 1. 3<br />

VIOLATION <strong>OF</strong> THE BELL INEQUALITIES BY <strong>QUANTUM</strong> <strong>MECHANICS</strong><br />

We will now prove the following theorem.<br />

BELL’S FIRST THEOREM:<br />

A local deterministic HVT is empirically contradictory to quantum mechanics.<br />

Proof<br />

With the expression empirically contradictory we mean that the two theories make contradictory<br />

statements in terms of measurable physical quantities. We will show that, quantum mechanically,<br />

there are spin quantities which violate the Bell inequalities.<br />

Consider the configuration below, where all vectors lie in the same plane.<br />

Figure VII. 2: A configuration in which the spin quantities violate the Bell inequality. (The directions ⃗a, ⃗a ′ = ⃗ b and ⃗ b ′ lie in one plane, with an angle ϕ between ⃗a and ⃗ b and an angle ϕ between ⃗ b and ⃗ b ′ .)<br />

Using (VII. 5) for this configuration and substituting the quantum mechanical expectation values into (VII. 13), we obtain the requirement<br />

$$F(\phi) := |-\cos\phi + \cos 2\phi| + |-\cos\phi - 1| \leq 2. \qquad \text{(VII. 14)}$$<br />

This function is plotted in figure VII. 3.


Figure VII. 3: The Bell inequality violated for every acute angle ϕ. (Plot of F (ϕ) against ϕ from 0 to π, with the value 2 marked on the vertical axis.)<br />

We see that (VII. 14) is violated for every ϕ ∈ (0, ½π). The maximum violation is F (60°) = 5/2,<br />

as can be seen in the figure.<br />
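The behaviour of F (ϕ) can also be checked numerically. The Python sketch below evaluates the left-hand side of (VII. 14) on a grid of acute angles; the grid size and the names are illustrative choices.<br />

```python
import math


def F(phi):
    """Left-hand side of (VII. 14) for the configuration of figure VII. 2."""
    return abs(-math.cos(phi) + math.cos(2.0 * phi)) + abs(-math.cos(phi) - 1.0)


# Grid on the open interval (0, pi/2).
grid = [k * (math.pi / 2.0) / 1000 for k in range(1, 1000)]
best = max(grid, key=F)

print(all(F(phi) > 2.0 for phi in grid))   # violated at every grid point
print(math.degrees(best), F(best))         # maximum near 60 degrees, F close to 2.5
```

On (0, ½π) one has F (ϕ) = 2 + 2 cos ϕ (1 − cos ϕ), which makes both the violation and the maximum at cos ϕ = ½ explicit.<br />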

Even larger violations are possible in other configurations. The largest<br />

violation is obtained in the configuration of figure VII. 4 (with all vectors in a single plane),<br />

leading to<br />

$$\begin{aligned} E_{QM}(\vec a, \vec b) &= -\cos 45^\circ = -\tfrac12\sqrt2, \\ E_{QM}(\vec a, \vec b\,') &= -\cos 135^\circ = \tfrac12\sqrt2, \\ E_{QM}(\vec a\,', \vec b) &= -\cos 135^\circ = \tfrac12\sqrt2, \\ E_{QM}(\vec a\,', \vec b\,') &= -\cos 135^\circ = \tfrac12\sqrt2, \end{aligned}$$<br />

$$|E_{QM}(\vec a, \vec b) - E_{QM}(\vec a, \vec b\,')| + |E_{QM}(\vec a\,', \vec b) + E_{QM}(\vec a\,', \vec b\,')| = 2\sqrt2. \qquad \text{(VII. 15)}$$<br />

This is a violation of 41%. □<br />
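The value 2√2 can be reproduced directly from the singlet state rather than from the formula −cos θ. The Python sketch below is an illustration under assumptions: all directions lie in the x-z plane, the singlet is written in the basis |↑↑⟩, |↑↓⟩, |↓↑⟩, |↓↓⟩, and the angles 0°, 45°, 135° and 270° for ⃗a, ⃗ b, ⃗ b ′ and ⃗a ′ are one labelling that realizes the angles used above. All names are hypothetical.<br />

```python
import math


def spin(theta):
    """sigma . n for a unit vector n in the x-z plane at angle theta from the z-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [s, -c]]


def kron(P, Q):
    """Kronecker product of two 2x2 matrices as a 4x4 nested list."""
    return [[P[i // 2][j // 2] * Q[i % 2][j % 2] for j in range(4)] for i in range(4)]


# Singlet state (|up,down> - |down,up>) / sqrt(2).
SINGLET = [0.0, 1.0 / math.sqrt(2.0), -1.0 / math.sqrt(2.0), 0.0]


def E_qm(theta_a, theta_b):
    """Singlet expectation value of (sigma_1 . a) tensor (sigma_2 . b)."""
    M = kron(spin(theta_a), spin(theta_b))
    return sum(SINGLET[i] * M[i][j] * SINGLET[j] for i in range(4) for j in range(4))


# Assumed labelling of the directions, in degrees.
a, b, bp, ap = (math.radians(d) for d in (0.0, 45.0, 135.0, 270.0))
chsh = abs(E_qm(a, b) - E_qm(a, bp)) + abs(E_qm(ap, b) + E_qm(ap, bp))

print(E_qm(a, b), -math.cos(b - a))   # agrees with (VII. 5)
print(chsh, 2.0 * math.sqrt(2.0))     # agrees with (VII. 15)
```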

Figure VII. 4: The configuration giving the largest violation of the Bell inequality (all vectors in the same plane). (Directions ⃗a, ⃗ b, ⃗a ′ and ⃗ b ′ , with an angle of 45° marked.)



VII. 1. 4<br />

THE BELL INEQUALITY IN A NON-CONTEXTUAL, LOCAL DETERMINISTIC HVT<br />

To show that the Bell inequality, derived for a local deterministic contextual HVT, also holds for<br />

a local deterministic autonomous HVT, we consider a local deterministic autonomous model for the<br />

singlet.<br />

Assume that both particles are characterized by a ‘classical’ spin vector, ⃗ J and − ⃗ J, about a<br />

common axis. This is the hidden variable. In this HVT, we further assume that the outcome of a<br />

measurement of spin in the direction ⃗n is determined by the sign of the component of the spin vector<br />

in the direction ⃗n. Now let the particles fly away from each other. If the spin of the first particle in the<br />

direction ⃗a is measured we find the outcome<br />

$$\frac{\vec J \cdot \vec a}{|\vec J \cdot \vec a|} \in \{-1, 1\}, \qquad \text{(VII. 16)}$$<br />

for the spin of the second particle in direction ⃗ b we find<br />

$$\frac{-\vec J \cdot \vec b}{|\vec J \cdot \vec b|} \in \{-1, 1\}. \qquad \text{(VII. 17)}$$<br />

The result of the measurement of the first particle is independent of the direction ⃗ b and vice versa,<br />

therefore, the model is local.<br />

Now consider an ensemble of such two particle systems where ⃗ J is distributed isotropically. If a n<br />

is the sign of ⃗ J · ⃗a in the n th pair, and likewise, b n the sign of − ⃗ J · ⃗b, then if ⃗ J pierces through the<br />

shaded area of the unit sphere on the right side in figure VII. 5, a n b n = +1. Otherwise, a n b n = −1.<br />

Figure VII. 5: Unit spheres for a n , b n and a n b n . In the shaded areas of the larger sphere a n b n is positive, in the unshaded areas a n b n is negative.<br />

The surface of the shaded area is $4\theta_{\vec a, \vec b}$, that of the remaining part is $4(\pi - \theta_{\vec a, \vec b})$. For an isotropic<br />

distribution, averaging over the surface of the unit sphere, we therefore find<br />

$$\langle a_n b_n \rangle = \frac{1}{4\pi} \bigl( 4\,\theta_{\vec a, \vec b} - 4\,(\pi - \theta_{\vec a, \vec b}) \bigr) = -1 + \frac{2}{\pi}\, \theta_{\vec a, \vec b}, \qquad \text{(VII. 18)}$$<br />

which is an increasing line through (0, −1) having slope 2/π. This runs from perfect anti - correlation<br />

for θ = 0 to perfect correlation for θ = π.<br />
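The linear correlation (VII. 18) can be reproduced by simulating the model. The Python sketch below is a Monte Carlo illustration with hypothetical names: the sample size, the seed and the way of drawing isotropic vectors (normalized Gaussian triples) are implementation choices, not part of the text.<br />

```python
import math
import random


def random_unit_vector(rng):
    """Isotropically distributed unit vector (normalized Gaussian triple)."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 0.0:
            return [c / norm for c in v]


def sign(x):
    """Outcome +1 or -1 of a spin measurement."""
    return 1 if x >= 0 else -1


def correlation(theta, pairs=50_000, seed=1):
    """Monte Carlo estimate of <a_n b_n> for settings separated by angle theta."""
    rng = random.Random(seed)
    a = (0.0, 0.0, 1.0)
    b = (math.sin(theta), 0.0, math.cos(theta))
    total = 0
    for _ in range(pairs):
        J = random_unit_vector(rng)
        a_n = sign(sum(j * c for j, c in zip(J, a)))    # particle 1 carries +J
        b_n = sign(-sum(j * c for j, c in zip(J, b)))   # particle 2 carries -J
        total += a_n * b_n
    return total / pairs


for theta in (0.0, math.pi / 4, math.pi / 2, math.pi):
    # Columns: angle, simulated value, (VII. 18), quantum value (VII. 5).
    print(theta, correlation(theta), -1.0 + 2.0 * theta / math.pi, -math.cos(theta))
```

Within statistical error the simulated values follow the straight line of (VII. 18). They coincide with the quantum mechanical values only at θ = 0, ½π and π.<br />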

Figure VII. 6: Comparison of the quantum mechanical expectation values, $-\cos\theta_{\vec a, \vec b}$, and those for the local deterministic HVT, ⟨a n b n ⟩, plotted against θ from 0 to π on a vertical scale from −1 to 1.<br />

In this HVT, equation (VII. 18) must satisfy the Bell inequality (VII. 13) for E (⃗a, ⃗ b) = ⟨a n b n ⟩.<br />

Choosing the angles as in the example on p. 142, figure VII. 2, if (VII. 18) is substituted in (VII. 13)<br />

it yields exactly 2 for any θ ≤ π, where the quantum mechanical expectation values violated the<br />

inequality for every θ ∈ (0, ½π).<br />

In the configuration giving the largest violation of the inequality (VII. 13), see figure VII. 4, we<br />

have<br />

$$\theta_{\vec a, \vec b} = \tfrac14 \pi \quad \text{and} \quad \theta_{\vec a, \vec b\,'} = \theta_{\vec a\,', \vec b} = \theta_{\vec a\,', \vec b\,'} = \tfrac34 \pi, \qquad \text{(VII. 19)}$$<br />

and therefore, (VII. 18) substituted in (VII. 13) yields<br />

$$\bigl| \bigl( -1 + \tfrac12 \bigr) - \bigl( -1 + \tfrac32 \bigr) \bigr| + \bigl| \bigl( -1 + \tfrac32 \bigr) + \bigl( -1 + \tfrac32 \bigr) \bigr| = 1 + 1 = 2, \qquad \text{(VII. 20)}$$<br />

where quantum mechanically, on p. 143 we found 2 √ 2.<br />

We see that where quantum mechanics violated the inequality (VII. 13), this local deterministic<br />

autonomous HVT satisfies it, thereby confirming Bell’s first theorem.<br />
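The arithmetic of (VII. 20) and its quantum mechanical counterpart can be put side by side. The Python sketch below only evaluates (VII. 18) and (VII. 5) at the angles of (VII. 19); the function names are hypothetical.<br />

```python
import math


def E_hvt(theta):
    """Correlation (VII. 18) of the classical spin-vector model."""
    return -1.0 + 2.0 * theta / math.pi


def E_qm(theta):
    """Singlet correlation (VII. 5)."""
    return -math.cos(theta)


def chsh(E, t_ab, t_abp, t_apb, t_apbp):
    """Left-hand side of (VII. 13) for the four angles between the settings."""
    return abs(E(t_ab) - E(t_abp)) + abs(E(t_apb) + E(t_apbp))


# Angles of (VII. 19): theta(a,b), theta(a,b'), theta(a',b), theta(a',b').
angles = (math.pi / 4, 3 * math.pi / 4, 3 * math.pi / 4, 3 * math.pi / 4)
print(chsh(E_hvt, *angles), chsh(E_qm, *angles))
```

The first number is 2, the second is 2√2 up to rounding.<br />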

VII. 2<br />

LOCAL DETERMINISTIC CONTEXTUAL HIDDEN VARIABLES<br />

We have seen that a considerable difference exists between the empirically verifiable statements<br />

of quantum mechanics and those of a local deterministic, autonomous HVT for a singlet state and



suitably chosen spin directions. This enables an experimental test of these statements, and therefore<br />

of the correctness of the philosophical bases of both theories. A. Shimony (1989) spoke, concerning<br />

the experimental testing of the Bell inequalities, of ‘experimental metaphysics’.<br />

However, the question of experimental testing puts the derivation of the Bell inequalities in another<br />

perspective. We no longer want to compare a HVT with quantum mechanics, but with experimental<br />

results. In this respect (VII. 6), implying perfect anti - correlation when ⃗a = ⃗ b, is overly<br />

idealized. In a real experiment the particle detectors are not perfectly efficient, in the sense that not<br />

all particles are registered. Imagine a detector which, even if A(⃗a, λ) = 1, sometimes gives 0, i.e. not<br />

measured, or even −1, i.e. wrongly measured. Moreover, in a contextual HVT the outcomes could also<br />

depend on the measuring context, i.e. on (possibly hidden) variables of the detectors. But also in<br />

this generalized situation it is possible to derive the inequality (VII. 13) from a locality assumption.<br />

We will show this by proving the next theorem.<br />

BELL’S SECOND THEOREM:<br />

A local deterministic contextual HVT is empirically inconsistent with quantum mechanics.<br />

Proof<br />

Assume that the quantities A and B are functions of three arguments,<br />

A = A(⃗a, λ, µ), B = B( ⃗ b, λ, ν) where A, B ∈ {− 1, 1}. (VII. 21)<br />

Here the local deterministic character of the HVT is expressed; the outcome of the measurement<br />

at the measuring apparatus measuring ⃗a · ⃗σ is determined by λ ∈ Λ, describing the source, by the<br />

local hidden variables of that measuring device, expressed symbolically by µ ∈ Λ a , and by the<br />

position ⃗a of the meter pointer. Therefore, the requirement of locality is that A does not depend<br />

on ⃗ b and ν, and B does not depend on ⃗a and µ. We also assume that the hidden variables of the<br />

apparatuses are independent of each other and of λ,<br />

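$$\rho(\lambda, \mu, \nu) = \rho(\lambda)\, \rho_1(\mu)\, \rho_2(\nu). \qquad \text{(VII. 22)}$$<br />

Defining<br />

$$\langle A(\vec a, \lambda) \rangle := \int_{\Lambda_a} A(\vec a, \lambda, \mu)\, \rho_1(\mu)\, d\mu \qquad \text{(VII. 23)}$$<br />

and<br />

$$\langle B(\vec b, \lambda) \rangle := \int_{\Lambda_b} B(\vec b, \lambda, \nu)\, \rho_2(\nu)\, d\nu, \qquad \text{(VII. 24)}$$<br />

we have, instead of assumption (VII. 2), the much weaker requirements<br />

$$|\langle A(\vec a, \lambda) \rangle| \leq 1 \quad \text{and} \quad |\langle B(\vec b, \lambda) \rangle| \leq 1, \qquad \text{(VII. 25)}$$<br />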

and we will show now that from this it is again possible to derive the Bell inequality (VII. 13).


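The expectation value in this HVT is<br />

$$\begin{aligned} E(\vec a, \vec b) &= \int_\Lambda d\lambda \int_{\Lambda_a} d\mu\, A(\vec a, \lambda, \mu) \int_{\Lambda_b} d\nu\, B(\vec b, \lambda, \nu)\, \rho(\lambda, \mu, \nu) \\ &= \int_\Lambda \langle A(\vec a, \lambda) \rangle\, \langle B(\vec b, \lambda) \rangle\, \rho(\lambda)\, d\lambda, \end{aligned} \qquad \text{(VII. 26)}$$<br />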

which is an ‘averaged’ version of (VII. 4). With (VII. 25) we see that<br />

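$$\begin{aligned} |E(\vec a, \vec b) - E(\vec a, \vec b\,')| &\leq \int_\Lambda \bigl| \langle A(\vec a, \lambda) \rangle \bigl( \langle B(\vec b, \lambda) \rangle - \langle B(\vec b\,', \lambda) \rangle \bigr) \bigr|\, \rho(\lambda)\, d\lambda \\ &\leq \int_\Lambda \bigl| \langle B(\vec b, \lambda) \rangle - \langle B(\vec b\,', \lambda) \rangle \bigr|\, \rho(\lambda)\, d\lambda. \end{aligned} \qquad \text{(VII. 27)}$$<br />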

Likewise we have<br />

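$$|E(\vec a\,', \vec b) + E(\vec a\,', \vec b\,')| \leq \int_\Lambda \bigl| \langle B(\vec b, \lambda) \rangle + \langle B(\vec b\,', \lambda) \rangle \bigr|\, \rho(\lambda)\, d\lambda, \qquad \text{(VII. 28)}$$<br />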

and therefore<br />

$$|E(\vec a, \vec b) - E(\vec a, \vec b\,')| + |E(\vec a\,', \vec b) + E(\vec a\,', \vec b\,')| \leq 2, \qquad \text{(VII. 29)}$$<br />

since |x + y| + |x − y| ≤ 2 if |x| ≤ 1 and |y| ≤ 1. We see that (VII. 29) is, indeed, the Bell<br />

inequality (VII. 13).<br />

For ⃗a ′ = ⃗ b ′ and the assumption of perfect anti - correlation E ( ⃗ b ′ , ⃗ b ′ ) = −1, from inequality<br />

(VII. 13) follows the original Bell inequality (VII. 10). But, as we showed, (VII. 13) remains<br />

valid under the weaker conditions (VII. 25). □<br />
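That only the bounds (VII. 25) are needed can be illustrated with randomly generated local models. The Python sketch below is an illustration with hypothetical names: it assumes a finite set of hidden states λ, draws arbitrary averaged responses ⟨A⟩ and ⟨B⟩ in [−1, 1] and an arbitrary ρ(λ), and evaluates the left-hand side of (VII. 29) from (VII. 26).<br />

```python
import random


def chsh_from_averages(rng, n_lambda=5):
    """CHSH value for random local averages in [-1, 1] and a random rho(lambda)."""
    rho = [rng.random() for _ in range(n_lambda)]
    norm = sum(rho)
    rho = [r / norm for r in rho]
    # Averaged responses <A(a, .)>, <A(a', .)>, <B(b, .)>, <B(b', .)> per hidden state.
    avg = {name: [rng.uniform(-1.0, 1.0) for _ in range(n_lambda)]
           for name in ("a", "a'", "b", "b'")}

    def E(x, y):
        # Discrete version of (VII. 26).
        return sum(r * p * q for r, p, q in zip(rho, avg[x], avg[y]))

    return abs(E("a", "b") - E("a", "b'")) + abs(E("a'", "b") + E("a'", "b'"))


rng = random.Random(2)
worst = max(chsh_from_averages(rng) for _ in range(10_000))
print(worst)   # never exceeds 2
```

Outcomes that are sometimes missed or wrongly registered only shrink the averages ⟨A⟩ and ⟨B⟩, so they cannot raise this value above 2.<br />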

◃ Remark<br />

It is not necessary to assume mutual independence for µ and λ or for ν and λ as in (VII. 22), the<br />

result (VII. 25) also follows when we make the weaker assumption that the conditional probability<br />

distributions of the apparatuses factorize the conjoint probability distribution ρ,<br />

ρ(λ, µ, ν) = ρ(λ) ρ 1 (µ | λ) ρ 2 (ν | λ). ▹ (VII. 30)<br />

VII. 3<br />

WIGNER’S DERIVATION<br />

E.P. Wigner (1970) was the first to give an elegant derivation of a Bell inequality in terms<br />

of probabilities. We again consider the EPRB experiment from section VII. 1. Using three directions,<br />

⃗n 1 , ⃗n 2 , ⃗n 3 ∈ R 3 , define<br />

σ i := ⃗n i · ⃗σ and τ i := ⃗n i · ⃗τ with i ∈ {1, 2, 3}. (VII. 31)



Here ⃗σ and ⃗τ are the spin operators of particle 1 and particle 2, respectively. We assume the quantities<br />

of particle 1 to be independent of those of particle 2 and therefore<br />

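$$(\sigma_i \otimes \mathbb{1})_{\sigma_i \otimes \tau_j}[\lambda] = (\sigma_i \otimes \mathbb{1})_{\sigma_i \otimes \tau_{j'}}[\lambda], \qquad \text{(VII. 32)}$$<br />

$$(\mathbb{1} \otimes \tau_j)_{\sigma_i \otimes \tau_j}[\lambda] = (\mathbb{1} \otimes \tau_j)_{\sigma_{i'} \otimes \tau_j}[\lambda] \qquad \text{(VII. 33)}$$<br />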

for i ′ ≠ i and j ′ ≠ j. This is the requirement of locality. Without this requirement we would have<br />

nine quantities in the HVT, namely the pairs (σ i ,τ j ), that is, as many quantities as measuring contexts.<br />

Now we have only six: σ 1 , σ 2 , σ 3 , τ 1 , τ 2 , τ 3 .<br />

The outcome of measurement of every spin quantity is ±1 in units of ½ħ. A HVT must assign a<br />

probability to every combination of outcomes,<br />

$$0 \leq p(\sigma_1, \sigma_2, \sigma_3, \tau_1, \tau_2, \tau_3) \leq 1, \qquad \text{(VII. 34)}$$<br />

with the usual marginal distributions, for instance<br />

$$p(\sigma_1, \tau_1) = \sum_{\sigma_2 = \pm 1} \sum_{\sigma_3 = \pm 1} \sum_{\tau_2 = \pm 1} \sum_{\tau_3 = \pm 1} p(\sigma_1, \sigma_2, \sigma_3, \tau_1, \tau_2, \tau_3), \qquad \text{(VII. 35)}$$<br />

and so on.<br />

◃ Remark<br />

Quantum mechanics does not have such joint probability distributions because these six quantities<br />

do not all commute pairwise. The spin quantities are not jointly measurable, but in<br />

the HVT their values are all fixed. ▹<br />

Denoting the angles between $\vec n_1, \vec n_2, \vec n_3$ by $\theta_{12}, \theta_{23}, \theta_{31}$, we have in the singlet state, see chapter<br />

III, (III. 176) and (III. 177),<br />

$$\mathrm{Prob}(\sigma_i = 1 \wedge \tau_j = 1) = \tfrac{1}{2} \sin^2 \tfrac{1}{2}\theta_{ij},$$ (VII. 36)<br />

$$\mathrm{Prob}(\sigma_i = 1 \wedge \tau_j = -1) = \tfrac{1}{2} \cos^2 \tfrac{1}{2}\theta_{ij}.$$ (VII. 37)<br />

These are the quantum mechanical probabilities and we will see that the HVT, satisfying requirement<br />

(VII. 34), cannot reproduce them. From (VII. 36) and (VII. 37) follows the requirement<br />

$$p(\sigma_1, \sigma_2, \sigma_3, \tau_1, \tau_2, \tau_3) = 0 \quad\text{unless}\quad \sigma_1 = -\tau_1,\ \sigma_2 = -\tau_2,\ \sigma_3 = -\tau_3,$$ (VII. 38)<br />

because the hidden variables cannot assume values giving a positive spin of both particles in the same<br />

direction.<br />

The probability for $\sigma_1$ and $\tau_3$ to be both +1 is, using (VII. 36),<br />

$$\sum_{\sigma_2,\sigma_3} \sum_{\tau_1,\tau_2} p(+, \sigma_2, \sigma_3, \tau_1, \tau_2, +) = \tfrac{1}{2} \sin^2 \tfrac{1}{2}\theta_{13}$$ (VII. 39)<br />

$$= p(+, +, -, -, -, +) + p(+, -, -, -, +, +).$$<br />
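The reduction of the four-fold sum in (VII. 39) to two terms can be checked by brute-force enumeration. The following Python sketch is an illustration only, not part of Wigner's derivation; the function name `surviving_terms` and the tuple ordering (σ1, σ2, σ3, τ1, τ2, τ3) are assumptions made for this example.

```python
from itertools import product

def surviving_terms(fixed):
    """List sign assignments (s1, s2, s3, t1, t2, t3) allowed by (VII. 38).

    `fixed` maps a tuple position (0..5) to a required sign, e.g.
    {0: +1, 5: +1} for sigma_1 = +1 and tau_3 = +1.
    """
    terms = []
    for signs in product((+1, -1), repeat=6):
        s, t = signs[:3], signs[3:]
        # (VII. 38): the probability vanishes unless sigma_i = -tau_i for all i.
        if any(si != -ti for si, ti in zip(s, t)):
            continue
        if all(signs[pos] == val for pos, val in fixed.items()):
            terms.append(signs)
    return terms

# Terms contributing to the probability that sigma_1 and tau_3 are both +1.
terms_39 = surviving_terms({0: +1, 5: +1})
for term in terms_39:
    print("".join("+" if v > 0 else "-" for v in term))
```

Only two of the 64 sign patterns survive, which is why the sum collapses to two probabilities. Calling the same function with other fixed positions gives the terms of the analogous sums for other pairs of directions.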


Likewise we calculate the following probabilities<br />

$$\sum_{\sigma_1,\sigma_3} \sum_{\tau_1,\tau_2} p(\sigma_1, +, \sigma_3, \tau_1, \tau_2, +) = \tfrac{1}{2} \sin^2 \tfrac{1}{2}\theta_{23}$$ (VII. 40)<br />

$$= p(+, +, -, -, -, +) + p(-, +, -, +, -, +)$$<br />

and<br />

$$\sum_{\sigma_2,\sigma_3} \sum_{\tau_1,\tau_3} p(+, \sigma_2, \sigma_3, \tau_1, +, \tau_3) = \tfrac{1}{2} \sin^2 \tfrac{1}{2}\theta_{12}$$ (VII. 41)<br />

$$= p(+, -, +, -, +, -) + p(+, -, -, -, +, +).$$<br />

From (VII. 40) and (VII. 41) it follows that<br />

$$p(+, +, -, -, -, +) \le \tfrac{1}{2} \sin^2 \tfrac{1}{2}\theta_{23}$$ and (VII. 42)<br />

$$p(+, -, -, -, +, +) \le \tfrac{1}{2} \sin^2 \tfrac{1}{2}\theta_{12},$$ (VII. 43)<br />

respectively. Consequently, we have for (VII. 39), the probability for $\sigma_1$ and $\tau_3$ to be both +1,<br />

$$\tfrac{1}{2} \sin^2 \tfrac{1}{2}\theta_{23} + \tfrac{1}{2} \sin^2 \tfrac{1}{2}\theta_{12} \ge \tfrac{1}{2} \sin^2 \tfrac{1}{2}\theta_{13},$$ (VII. 44)<br />

which, using $\sin^2 \tfrac{1}{2}\theta = \tfrac{1}{2}(1 - \cos\theta)$, is equivalent to<br />

$$(1 - \cos\theta_{23}) + (1 - \cos\theta_{12}) \ge (1 - \cos\theta_{13}).$$ (VII. 45)<br />

This is, in essence, the same as inequality (VII. 10); rewriting (VII. 45), realizing that $1 - \cos\theta \ge 0$,<br />

and comparing $E(\vec a, \vec b)$ to $-\cos\theta_{12}$ etc. yields<br />

$$1 - \cos\theta_{23} \ge |-\cos\theta_{12} + \cos\theta_{13}|.$$ (VII. 46)<br />

[Figure VII. 7: Violation of the Bell inequality again. The figure shows the directions $\vec n_1$, $\vec n_2$, $\vec n_3$ separated by angles $\varphi$.]<br />

With $\theta_{23} = \theta_{12} = \tfrac{1}{2}\theta_{13} = \varphi$ as in diagram VII. 7, (VII. 45) becomes<br />

$$1 - 2\cos\varphi + \cos 2\varphi \ge 0,$$ (VII. 47)<br />

and using $\cos 2\varphi = 2\cos^2\varphi - 1$ we see that<br />

$$\cos\varphi\,(1 - \cos\varphi) \le 0.$$ (VII. 48)<br />

Since $1 - \cos\varphi \ge 0$ for every $\varphi$, this inequality is violated for every acute angle.<br />
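The violation for acute angles can also be confirmed numerically. The sketch below is an added illustration; the function names `wigner_gap` and `violated` are hypothetical. It evaluates the left-hand side minus the right-hand side of (VII. 44) with the quantum mechanical singlet probabilities, for the coplanar configuration in which the outer directions enclose twice the angle between neighbours.

```python
import math

def wigner_gap(theta12, theta23, theta13):
    """Left side minus right side of (VII. 44); a negative value is a violation."""
    def half_sin2(theta):
        return 0.5 * math.sin(0.5 * theta) ** 2
    return half_sin2(theta23) + half_sin2(theta12) - half_sin2(theta13)

def violated(phi):
    """True if the configuration theta12 = theta23 = phi, theta13 = 2*phi violates (VII. 44)."""
    return wigner_gap(phi, phi, 2 * phi) < 0

# Scan the acute angles in steps of one degree.
acute = [math.radians(deg) for deg in range(1, 90)]
all_violated = all(violated(phi) for phi in acute)
print(all_violated)
```

For this configuration the gap equals cos ϕ (cos ϕ − 1)/2, so it is negative exactly when cos ϕ > 0; obtuse angles such as 120° give no violation.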



EXERCISE 32. What type of HVT is excluded by Wigner’s reasoning?<br />

◃ Remark<br />

Wigner (1970) makes the observation that the HVT would have been possible if the terms in (VII. 44)<br />

had been $\sin \tfrac{1}{2}\theta$ instead of $\sin^2 \tfrac{1}{2}\theta$. Apparently, our world depends on such ‘minimal’ mathematical<br />

differences. ▹<br />

VII. 4<br />

THE DERIVATION OF EBERHARD AND STAPP<br />

In the previous derivations of the Bell inequalities hidden variables were assumed, which represent<br />

properties of the pair of particles and determine the outcomes of measurements of all physical<br />

quantities. As a consequence, in this HVT a joint probability is defined for the values of non -<br />

commuting quantities also, as we saw in Wigner’s derivation. This follows from the fact that at<br />

given λ both $A(\vec a, \lambda)$ and $A(\vec a', \lambda)$ are fixed, for example<br />

$$p\big(A(\vec a) = 1 \wedge A(\vec a') = 1\big) = \int_{\Delta} \rho(\lambda)\, d\lambda,$$ (VII. 49)<br />

where $\Delta \subset \Lambda$ is the area in which both $A(\vec a, \lambda) = 1$ and $A(\vec a', \lambda) = 1$. Since quantum mechanics<br />

does not acknowledge such ‘simultaneous probabilities’ for non - commuting quantities, the quantities<br />

not being simultaneously measurable, it could be suspected that this property of the HVT is the main<br />

reason for the deviation from quantum mechanics, instead of locality or determinism.<br />

In the next derivation of the Bell inequality, given by P. Eberhard and H. Stapp (1977), the existence<br />

of hidden variables is not assumed. They claim that the Bell inequality follows from an assumption<br />

of locality only. However, what will be shown to be necessary in this derivation, is the<br />

assumption that we can speak reasonably about the outcomes of measurements which have not actually<br />

been carried out.<br />

THE EBERHARD - STAPP THEOREM:<br />

Quantum mechanics is a non - local theory.<br />

Proof<br />

Consider again the EPRB experiment. Let $\vec a$ and $\vec a'$ be two settings of the spin meter at A, and $\vec b$<br />

and $\vec b'$ likewise at B. We can carry out four experiments:<br />

I : $\vec a, \vec b$ &nbsp; II : $\vec a, \vec b'$ &nbsp; III : $\vec a', \vec b$ &nbsp; IV : $\vec a', \vec b'$. (VII. 50)<br />

Define, for the $n$-th pair of particles, $a_n(\mathrm{I})$ as the outcome of the spin measurement on the particle<br />

traveling to A when the meter at A points in the direction $\vec a$ and the spin of the other<br />

particle, which travels to B, is measured in the direction $\vec b$; this gives $a_n(\mathrm{I}) = \pm 1$ for<br />

experiment I, and likewise for $a_n(\mathrm{II})$, $a'_n(\mathrm{III})$, $a'_n(\mathrm{IV})$, $b_n(\mathrm{I})$, $b'_n(\mathrm{II})$, $b_n(\mathrm{III})$ and $b'_n(\mathrm{IV})$.<br />

These values represent outcomes of actual or possible measurements, not actual<br />

properties of the particles which also exist if they are not measured.<br />



The assumption of locality is that an outcome of measurement of spin of particle 1, in direction ⃗a,<br />

does not depend on which spin direction, ⃗ b or ⃗ b ′ , is measured of the other, remote particle 2. This<br />

is the supposition of locality from the Eberhard - Stapp theorem, leading to what we will call the<br />

matching condition,<br />

$$a_n(\mathrm{I}) = a_n(\mathrm{II}), \quad a'_n(\mathrm{III}) = a'_n(\mathrm{IV}), \quad b_n(\mathrm{I}) = b_n(\mathrm{III}), \quad b'_n(\mathrm{II}) = b'_n(\mathrm{IV}),$$ (VII. 51)<br />

for all $N$ particle pairs in the singlet state $|\Psi_0\rangle$.<br />

Now we can define the following mathematical expression<br />

$$\gamma_n := a_n(\mathrm{I})\, b_n(\mathrm{I}) + a_n(\mathrm{II})\, b'_n(\mathrm{II}) + a'_n(\mathrm{III})\, b_n(\mathrm{III}) - a'_n(\mathrm{IV})\, b'_n(\mathrm{IV}),$$ (VII. 52)<br />

where the first term corresponds to experiment I, the second to experiment II, etc. Because of the<br />

value assignment ±1, γ is an even integer, and the fourth term being the product of the first three<br />

terms, subtraction of the fourth term means that γ has only two values, as we will see. Moreover,<br />

subtraction allows for an inequality similar to Bell’s inequality (VII. 13).<br />

In (VII. 52) we can omit writing out the labels referring to the numbers of the experiments because<br />

of the matching condition (VII. 51); $a_n := a_n(\mathrm{I}) = a_n(\mathrm{II})$, etc. Rewriting (VII. 52),<br />

$$\gamma_n = a_n (b_n + b'_n) + a'_n (b_n - b'_n),$$ (VII. 53)<br />

because of the value assignment ±1 we immediately see that either the first or the second term<br />

equals 0, yielding for all $n$<br />

$$\gamma_n = \pm 2.$$ (VII. 54)<br />
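Since each of the four values is ±1, the claim (VII. 54) can be verified by listing all sixteen assignments. The following Python sketch is an added illustration; the function name `gamma` and its argument names are chosen for this example.

```python
from itertools import product

def gamma(a, a_prime, b, b_prime):
    """The expression (VII. 53) for one particle pair, with all values +1 or -1."""
    return a * (b + b_prime) + a_prime * (b - b_prime)

# Evaluate gamma for every one of the 2**4 possible value assignments.
values = {gamma(*assignment) for assignment in product((+1, -1), repeat=4)}
print(sorted(values))
```

The set contains only −2 and +2: when the two B-values are equal the second term vanishes, and when they differ the first term vanishes.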

Averaging over $N$ recurrences of the experiment we have<br />

$$\Big| \frac{1}{N} \sum_{n=1}^{N} \gamma_n \Big| = \frac{1}{N} \Big| \sum_{n=1}^{N} a_n b_n + \sum_{n=1}^{N} a_n b'_n + \sum_{n=1}^{N} a'_n b_n - \sum_{n=1}^{N} a'_n b'_n \Big| \le 2.$$ (VII. 55)<br />

Defining the correlation coefficients<br />

$$c_N(\vec a, \vec b) := \frac{1}{N} \sum_{n=1}^{N} a_n b_n \quad\text{etc.},$$ (VII. 56)<br />

we conclude<br />

$$|c_N(\vec a, \vec b) + c_N(\vec a, \vec b') + c_N(\vec a', \vec b) - c_N(\vec a', \vec b')| \le 2.$$ (VII. 57)<br />

This is indeed a Bell inequality again, equivalent to inequality (VII. 13) in the limit N → ∞.<br />

The expectation value of c(⃗a, ⃗ b) = ⟨a n b n ⟩ in quantum mechanics is given by (VII. 5) and the<br />

contradiction with (VII. 57) follows as in section VII. 2. □
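To see the size of the contradiction, the combination in (VII. 57) can be evaluated with the quantum mechanical singlet correlation. The sketch below is an added illustration under two assumptions: the correlation is taken to be E = −cos θ, with θ the angle between the settings (the form used earlier when E is compared with −cos θ), and all four settings are taken to lie in one plane so that each is described by a single angle. The names `E_singlet` and `chsh` are hypothetical.

```python
import math

def E_singlet(alpha, beta):
    """Singlet correlation for coplanar settings at angles alpha, beta (radians)."""
    return -math.cos(alpha - beta)

def chsh(a, a_prime, b, b_prime, E=E_singlet):
    """Left-hand side of (VII. 57) with c_N replaced by the expectation value E."""
    return abs(E(a, b) + E(a, b_prime) + E(a_prime, b) - E(a_prime, b_prime))

# One choice of settings, assumed here for illustration.
value = chsh(a=0.0, a_prime=math.pi / 2, b=math.pi / 4, b_prime=-math.pi / 4)
print(value)
```

For these settings the result is 2√2, roughly 2.83, which exceeds the bound 2.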



◃ Remark<br />

The derivation of (VII. 57) directly comes from expression (VII. 52) and as a result, the existence of<br />

hidden variables does not have to be presumed, only locality was required. Sensationally, we seem to<br />

have proved that quantum mechanics is empirically inconsistent with the requirement of locality. ▹<br />

The experimental violation of the Bell inequalities thus leads us to the conclusion that physical<br />

reality is not local. What we, however, have presupposed in the matching condition (VII. 51) is that<br />

we can simultaneously assign values to $a_n$ and $a'_n$, although they cannot be simultaneously measured<br />

because the spin measuring device cannot be at the same time in both positions $\vec a$ and $\vec a' \neq \vec a$. In fact,<br />

of the set of four terms in (VII. 52), at most one is experimentally realizable. Still, we<br />

spoke of outcomes of measurements that have not actually been carried out. Of course, the derivation<br />

of the Bell inequality (VII. 57) from the matching condition (VII. 51) is mathematically flawless. The<br />

question is whether the matching condition (VII. 51) follows from the requirement of locality. We<br />

will now explore this question further.<br />

VII. 4. 1<br />

COUNTERFACTUAL CONDITIONAL STATEMENTS AND INDETERMINISM<br />

Let a n be the outcome of experiment I. With their matching condition, Eberhard and Stapp claim<br />

that this value of a n would be unaltered if we had carried out experiment II instead of experiment I<br />

because these experiments only differ in the settings of the B - meter, which is far away. Therefore,<br />

a n is the outcome which the spin meter A would have given for the n th pair of particles for both<br />

experiment I and experiment II. Redhead (1987, p. 92) formulates this requirement as follows<br />

PRINCIPLE OF LOCAL COUNTERFACTUAL DEFINITENESS (PLCD):<br />

The result of an experiment which could be performed on a microscopic system has a<br />

definite value which does not depend on the setting of a remote piece of apparatus.<br />

This means that if this setting had been different, the outcome of the experiment would<br />

not have been different. Using the same mathematics as before it follows that<br />

PLCD → Bell inequality. (VII. 58)<br />

Since PLCD is an assumption of locality concerning outcomes of measurements, (VII. 58) seems<br />

to be independent of the existence of hidden variables. But appearances are deceptive. In fact, PLCD<br />

is only reasonable in a deterministic context, and not in the case of indeterminism.<br />

Consider the following example given by Redhead (ibid.). Suppose that, at t 1 , just before the<br />

clock strikes twelve, I raise my hand. Now I ask the question if the clock would also have struck if<br />

I had not raised my hand at t 1 . Intuitively, the right answer is ‘Yes’, in agreement with PLCD. Now<br />

replace the clock by a radioactive atom which decays at t 2 . Suppose I raised my hand at t 1 < t 2 ,<br />

would the atom also have decayed if I had not done this? Now the answer is far from clear. If the<br />

decay is purely indeterministic, a recurrence of the experiment, even if it is just a thought experiment,<br />

does not have to have the same outcome. The supposition that the atom would not have decayed if I<br />

had not raised my hand, is not contradictory to locality.<br />

The assumptions that outcomes of measurements retain the same values even if they are<br />

not measured, or that measurements which are not carried out have certain outcomes in advance, are



only reasonable in a deterministic context. But in a deterministic context these assumptions do not<br />

differ from each other, and an outcome of measurement is decisively linked to the value the quantity<br />

had just beforehand, and therefore to a hidden variable.<br />

The conclusion is that the assumption of Eberhard and Stapp, PLCD, is no more general than the<br />

assumption that the value a n is a property of the particles which is determined in advance, and which<br />

is independent of the settings of the meter at B. This means that the derivation is no more general<br />

than the derivation for a local deterministic HVT.<br />

VII. 5<br />

STOCHASTIC HIDDEN VARIABLES<br />

In this section we will no longer require determinism in the HVT; the λ only determine the probability<br />

that a quantity has a certain value, which is revealed by the measuring apparatus in the way a<br />

balance reveals our weight. A stochastic HVT is linked more closely to quantum mechanics, enabling<br />

a more well - defined comparison between the assumptions leading to the Bell inequalities on the one<br />

hand, and quantum mechanics on the other.<br />

In our stochastic HVT we assume the existence of a probability distribution at given directions<br />

$\vec a, \vec b \in \mathbb{R}^3$ of the spin meters in the EPRB experiment<br />

$$p_{\vec a, \vec b}(a, b, \lambda),$$ (VII. 59)<br />

which is the probability to find for the quantities $A = \vec\sigma_1 \cdot \vec a$ and $B = \vec\sigma_2 \cdot \vec b$ the values $a$ and $b$,<br />

respectively, where it holds that $a, b = \pm 1$. Again, $\lambda \in \Lambda$ is the hidden variable describing the source.<br />

Such a probability distribution can always be written in terms of conditional probabilities,<br />

$$p_{\vec a, \vec b}(a, b, \lambda) = p_{\vec a, \vec b}(a \mid b \wedge \lambda)\, p_{\vec a, \vec b}(b \mid \lambda)\, \rho_{\vec a, \vec b}(\lambda).$$ (VII. 60)<br />

To be able to derive the Bell inequalities we make the following three suppositions.



1. Outcome independence<br />

The probability to find a value $a$ for $\vec a \cdot \vec\sigma$ is ‘completely’ determined by the settings of the spin<br />

meters and by λ; in particular, it is not necessary to specify the outcome $b$ as well, and likewise for finding a<br />

value $b$,<br />

$$p_{\vec a, \vec b}(a \mid b \wedge \lambda) = p_{\vec a, \vec b}(a \mid \lambda) \quad\text{and}\quad p_{\vec a, \vec b}(b \mid a \wedge \lambda) = p_{\vec a, \vec b}(b \mid \lambda).$$ (VII. 61)<br />

2. Parameter independence<br />

The probability to find the outcome of measurement a or b is independent of the settings of the<br />

remote spin meter,<br />

$$p_{\vec a, \vec b}(a \mid \lambda) = p_{\vec a}(a \mid \lambda) \quad\text{and}\quad p_{\vec a, \vec b}(b \mid \lambda) = p_{\vec b}(b \mid \lambda).$$ (VII. 62)<br />

3. Source independence<br />

The distribution of λ in the source does not depend on the settings of the spin meters,<br />

$$\rho_{\vec a, \vec b}(\lambda) = \rho(\lambda).$$ (VII. 63)<br />

In principle we can adjust the spin meters ‘at the last moment’, long after the particles have left<br />

the source. It is reasonable to assume that the source is not influenced by what happens to the<br />

measuring devices in the future.<br />

Now we will prove the next theorem.<br />

BELL’S THIRD THEOREM:<br />

A stochastic HVT which is in agreement with outcome, parameter and source independence<br />

is empirically inconsistent with quantum mechanics.<br />

Proof<br />

As a consequence of the aforementioned properties, in every local stochastic HVT, (VII. 60) becomes<br />

$$p_{\vec a, \vec b}(a, b, \lambda) = p_{\vec a}(a \mid \lambda)\, p_{\vec b}(b \mid \lambda)\, \rho(\lambda),$$ (VII. 64)<br />

or<br />

$$p_{\vec a, \vec b}(a, b \mid \lambda) = p_{\vec a}(a \mid \lambda)\, p_{\vec b}(b \mid \lambda),$$ (VII. 65)<br />

which means that the quantities A and B are statistically independent of each other for given λ.<br />

This statement is often called factorizability or conditional independence.



Using (VII. 64), another Bell inequality can be derived for $E(\vec a, \vec b)$ by means of the relation<br />

$$E(\vec a, \vec b) = \int_{\Lambda} \big( p_{\vec a, \vec b}(1, 1, \lambda) - p_{\vec a, \vec b}(1, -1, \lambda) - p_{\vec a, \vec b}(-1, 1, \lambda) + p_{\vec a, \vec b}(-1, -1, \lambda) \big)\, d\lambda$$ (VII. 66)<br />

$$= \int_{\Lambda} \big( p_{\vec a}(1 \mid \lambda) - p_{\vec a}(-1 \mid \lambda) \big) \big( p_{\vec b}(1 \mid \lambda) - p_{\vec b}(-1 \mid \lambda) \big)\, \rho(\lambda)\, d\lambda.$$<br />

Defining<br />

$$f(\vec a, \lambda) := p_{\vec a}(1 \mid \lambda) - p_{\vec a}(-1 \mid \lambda)$$ (VII. 67)<br />

and<br />

$$g(\vec b, \lambda) := p_{\vec b}(1 \mid \lambda) - p_{\vec b}(-1 \mid \lambda),$$ (VII. 68)<br />

we see that<br />

$$|f(\vec a, \lambda)| \le 1 \quad\text{and}\quad |g(\vec b, \lambda)| \le 1,$$ (VII. 69)<br />

which brings us back to (VII. 25) and the subsequent equations so that again we obtain the Bell<br />

inequality (VII. 13). Violation of this Bell inequality means that (VII. 64) cannot hold and<br />

therefore no HVT that reproduces quantum mechanics can satisfy outcome independence (VII. 61), parameter independence<br />

(VII. 62) and source independence (VII. 63) together. □<br />
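The bound can be illustrated numerically for factorizable models. The Python sketch below is an added illustration, not the proof: it draws random models on a finite set Λ, with arbitrary weights ρ(λ) and arbitrary values of f and g in [−1, 1] for two settings on each side, and evaluates the combination of four correlations in the form used in (VII. 57), which the text states is equivalent to (VII. 13). The function name `chsh_factorizable`, the size of Λ and the sampling scheme are assumptions made for this example.

```python
import random

def chsh_factorizable(rng, n_lambda=20):
    """CHSH combination for one randomly drawn factorizable model on a finite Lambda."""
    # Random normalised weights rho(lambda_k); the small offset avoids a zero norm.
    weights = [rng.random() + 1e-9 for _ in range(n_lambda)]
    norm = sum(weights)
    rho = [w / norm for w in weights]
    # f and g as in (VII. 67), (VII. 68): any values in [-1, 1] are admissible.
    f = {s: [rng.uniform(-1, 1) for _ in range(n_lambda)] for s in ("a", "a'")}
    g = {s: [rng.uniform(-1, 1) for _ in range(n_lambda)] for s in ("b", "b'")}

    def E(sa, sb):
        # Discrete version of the second line of (VII. 66).
        return sum(r * x * y for r, x, y in zip(rho, f[sa], g[sb]))

    return abs(E("a", "b") + E("a", "b'") + E("a'", "b") - E("a'", "b'"))

rng = random.Random(0)  # fixed seed so that the run is reproducible
largest = max(chsh_factorizable(rng) for _ in range(2000))
print(largest <= 2.0)
```

No sampled model exceeds 2, in line with the pointwise estimate |f(g + g′) + f′(g − g′)| ≤ |g + g′| + |g − g′| ≤ 2 for values in [−1, 1].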

VII. 5. 1<br />

OUTCOME, PARAMETER AND SOURCE INDEPENDENCE<br />

The importance of the distinction between outcome and parameter independence was first brought<br />

to attention by J. Jarrett (1984).<br />

1. Outcome independence, (VII. 61), means that the probability of outcome b, for given λ, does<br />

not depend on the outcome a. This is motivated by the idea that λ gives a complete description of<br />

the state of the pair of particles; the variable λ contains an exhaustive specification of all factors<br />

which are relevant for the outcomes of measurement. Therefore, if λ is already known, the extra information that<br />

outcome $a$ has occurred cannot yield new information about $b$.<br />

The purpose of the requirement can be illustrated by giving the next example, in which it is not<br />

satisfied. Suppose that two people, without looking, each draw a little ball out of a box containing two<br />

little balls, one black and one white. Hereafter they separate, one travels to New York, the other to<br />

Tokyo. Now consider a ‘stochastic hidden variable’ with probability $\tfrac{1}{2}$ for the little balls to be black<br />

or white. On arrival at Tokyo the traveler opens his hand and sees that his little ball is black, which<br />

instantaneously enables him to predict the color of the little ball in New York, it has to be white. Here<br />

the outcome of measurement of the one little ball does provide relevant information on the outcome<br />

of a measurement of the other little ball.



The idea behind the requirement of outcome independence is that such a situation could only<br />

occur because the HVT was incomplete; in a complete specification of the state of the pair of particles<br />

which existed at the beginning of the trip also the color of the little balls should have been included,<br />

even though the travelers did not know the color of their little ball. Then it automatically follows,<br />

at given λ, that the little ball in New York is white and the observation in Tokyo provides no new<br />

information.<br />

2. Parameter independence, (VII. 62), means that the probability distribution of the outcomes<br />

at A is independent of external changes at B, e.g. a change in the orientation of the spin meter. The argumentation leading<br />

to the assumption of parameter independence is generally associated with the possibility of signaling.<br />

Suppose that, for example, adjustments $\vec b$ and $\vec b'$ existed such that<br />

$$p_{\vec a, \vec b}(a \mid \lambda) \neq p_{\vec a, \vec b'}(a \mid \lambda),$$ (VII. 70)<br />

then, in principle, it is possible to instantaneously exchange signals between experimenters located<br />

at A and B. Since the experimenter located at B can choose if he points his spin meter in the<br />

direction ⃗ b or ⃗ b ′ , an experimenter located at A is able, if the source emits particle pairs in a pure<br />

hidden - variables state λ, to register the relative frequency of outcomes of A and thereby retrieve<br />

which adjustment has been chosen by the experimenter at B. Violation of parameter independence<br />

therefore means that the HVT enables the instantaneous exchange of signals over arbitrarily large<br />

distances.<br />

3. Source independence, (VII. 63), means that the probability distribution over the hidden variable<br />

describing the particle pair cannot depend on the measuring directions chosen by the experimenters.<br />

The argumentation leading to the assumption of source independence is often described<br />

in terms of the ‘free will’ of the experimenters. The experimenters are considered to be completely<br />

‘free’ in their decision how to point their spin meters, and even to make their choice just at the last<br />

moment, when the particles have long left the source. Therefore, the probability distribution ρ(λ),<br />

which characterizes the source of the particle pairs, cannot depend on that.<br />

Of course, here too it applies that violation of the requirement is logically conceivable. It is<br />

possible that this freedom does not exist, and that at emitting the particles, the directions in which<br />

the experimenters will measure have already been determined. It is also conceivable that by some<br />

other cause a correlation exists between λ and the directions ⃗a and ⃗ b, influencing both. The first case,<br />

in which all relevant factors of the EPR experiment are determined in advance and the experimenters<br />

have no free will, is called super - determinism. Therefore, in a super - deterministic HVT the Bell<br />

inequalities can be violated also.<br />
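The ball example given under outcome independence can be made concrete with a small calculation. The Python sketch below is an added illustration; the state labels and function names are hypothetical. It compares an incomplete description, in which only the probability ½ for each colour is known, with a complete description in which λ records which traveler holds the black ball.

```python
from fractions import Fraction

COLOURS = ("black", "white")
# Complete hidden states: lambda records who holds the black ball.
LAMBDAS = {"tokyo_black": Fraction(1, 2), "tokyo_white": Fraction(1, 2)}

def outcomes(lam):
    """Colours (tokyo, new_york) fixed by the complete state lam."""
    return ("black", "white") if lam == "tokyo_black" else ("white", "black")

def joint(tokyo, new_york, lam=None):
    """Joint probability; lam=None averages over the hidden state."""
    states = LAMBDAS if lam is None else {lam: Fraction(1)}
    return sum(w for s, w in states.items() if outcomes(s) == (tokyo, new_york))

def marginal_ny(new_york, lam=None):
    return sum(joint(c, new_york, lam) for c in COLOURS)

def cond_ny_given_tokyo(new_york, tokyo, lam=None):
    p_tokyo = sum(joint(tokyo, c, lam) for c in COLOURS)
    return joint(tokyo, new_york, lam) / p_tokyo

# Incomplete description: the Tokyo outcome changes the New York prediction.
print(marginal_ny("white"), cond_ny_given_tokyo("white", "black"))
# Complete description: at given lambda the Tokyo outcome adds nothing.
print(marginal_ny("white", "tokyo_black"),
      cond_ny_given_tokyo("white", "black", "tokyo_black"))
```

Without λ the probability for a white ball in New York jumps from 1/2 to 1 once the Tokyo ball is seen to be black; with the complete state specified, both the marginal and the conditional probability are already 1, so outcome independence holds.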

VII. 5. 2<br />

QUANTUM MECHANICS AS A STOCHASTIC HVT<br />

Exclusively giving probability statements concerning outcomes of measurements, a stochastic<br />

HVT conceptually differs less from quantum mechanics than other HVT’s. In fact we can, without<br />

objection, take quantum mechanics itself as an example of a stochastic HVT by identifying λ with the<br />

quantum mechanical state and Λ with the relevant Hilbert space. Since quantum mechanics does not<br />

satisfy the Bell inequalities, it is interesting to examine which of the aforementioned requirements is<br />

violated inevitably by quantum mechanics.



3. Source independence. We already discussed the possibility of violation of the Bell inequalities<br />

by a super-deterministic theory without source independence. Whether we can somehow establish that we have free will<br />

is a philosophical question; violation of source independence is therefore a possibility, but not an inevitability,<br />

which leaves outcome and parameter independence to be examined.<br />

2. Parameter independence. Describing the pairs of particles in the singlet state $|\Psi_0\rangle$, (III. 165),<br />

by a pure hidden-variables state, the probability distribution is a delta-distribution,<br />

$$\rho_{\Psi_0}(\lambda) = \delta_{\lambda_0}(\lambda) := \delta(\lambda - \lambda_0),$$ (VII. 71)<br />

which leads to<br />

$$\int_{\Lambda} p_{\vec a, \vec b, \lambda_0}(a, b, \lambda)\, \rho_{\Psi_0}(\lambda)\, d\lambda = p_{\vec a, \vec b, \lambda_0}(a, b).$$ (VII. 72)<br />

The probabilities for the outcomes of measurement are given by (III. 176),<br />

$$p_{\vec a, \vec b, \lambda_0}(a = 1 \wedge b = 1) = \tfrac{1}{2} \sin^2 \tfrac{1}{2}\theta_{\vec a, \vec b},$$<br />

$$p_{\vec a, \vec b, \lambda_0}(a = 1 \wedge b = -1) = \tfrac{1}{2} \cos^2 \tfrac{1}{2}\theta_{\vec a, \vec b}.$$ (VII. 73)<br />

EXERCISE 33. Also calculate the other two joint probabilities, that is, for $a = -1 \wedge b = -1$<br />

and $a = -1 \wedge b = 1$.<br />

The marginal probabilities are, using (VII. 73),<br />

$$p_{\vec a, \vec b}(a = 1 \mid \lambda_0) = p_{\vec a, \vec b, \lambda_0}(a = 1 \wedge b = 1) + p_{\vec a, \vec b, \lambda_0}(a = 1 \wedge b = -1) = \tfrac{1}{2},$$<br />

$$p_{\vec a, \vec b}(b = 1 \mid \lambda_0) = p_{\vec a, \vec b, \lambda_0}(a = 1 \wedge b = 1) + p_{\vec a, \vec b, \lambda_0}(a = -1 \wedge b = 1) = \tfrac{1}{2},$$ (VII. 74)<br />

which means that, both being equal to $\tfrac{1}{2}$, they do not depend on the settings of a remote measuring<br />

device. Consequently, even the quantum mechanical correlations in the singlet cannot be used for<br />

signaling; there is no actio in distans. This leads to the following theorem.<br />

NO - SIGNALING THEOREM:<br />

Quantum mechanics satisfies parameter independence, i.e., if subsystems of a composite<br />

physical system no longer interact, the probability of finding certain outcomes of measurement<br />

for an arbitrary quantity of subsystem 1 is independent of which quantity of<br />

subsystem 2 is measured, and vice versa.<br />
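For the singlet state the content of the theorem can be checked numerically. The sketch below is an added illustration; it assumes the closed form ¼(1 − ab cos θ) for the four joint probabilities, with θ the angle between the two settings. This form reproduces the two probabilities of (VII. 73) for a = 1, but its use for the remaining two outcomes is an assumption of this example. The function names are hypothetical.

```python
import math

def joint(a, b, theta):
    """Assumed singlet probability for outcomes a, b = +1 or -1 at relative angle theta."""
    return 0.25 * (1 - a * b * math.cos(theta))

def marginal_a(a, theta):
    """Probability of outcome a at A, summed over the outcomes at B."""
    return sum(joint(a, b, theta) for b in (+1, -1))

# Vary the remote setting (through theta) and record the local statistics at A.
angles = [0.0, 0.3, 1.0, math.pi / 2, 2.5, math.pi]
marginals = [marginal_a(+1, theta) for theta in angles]
print(marginals)
```

Every entry equals ½ up to rounding: the local statistics at A carry no trace of the setting chosen at B.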

EXERCISE 34. Prove that the EPRB experiment is an example of the no - signaling theorem.<br />

Optional: prove, in general, the no - signaling theorem using state operators. Whoever cannot<br />

solve this problem, is advised to consult Ghirardi, Rimini and Weber (1980).



1. Outcome independence. In quantum mechanics it is indeed the requirement of outcome independence<br />

that is not satisfied. The conditional probabilities, i.e., the probabilities for the spin of<br />

particle 1 to be found in the direction $\vec a$, given that the spin of particle 2 was found in the direction $\vec b$<br />

and vice versa, which were defined in (III. 172), p. 74, are clearly not independent,<br />

$$p_{\vec a, \vec b}(a = 1 \mid \lambda_0 \wedge b = 1) = \sin^2 \tfrac{1}{2}\theta_{\vec a, \vec b},$$ (VII. 75)<br />

$$p_{\vec a, \vec b}(a = -1 \mid \lambda_0 \wedge b = 1) = \cos^2 \tfrac{1}{2}\theta_{\vec a, \vec b}.$$ (VII. 76)<br />

According to quantum mechanics, physical systems are inseparable. However, this interdependence<br />

of outcomes cannot be used to exchange signals since we do not control the outcomes of spin measurements<br />

and therefore we are unable to actively influence the probability distribution over the outcomes<br />

of measurements from a distance. Shimony (1984, p. 227) called it passion at a distance. The experimenter<br />

at B can, on the basis of his observation, indeed make a better prediction concerning an outcome<br />

at A than is possible on the basis of knowledge of the singlet state alone, but he cannot warn the<br />

observer at A; he can only watch passively.<br />
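The failure of outcome independence can be made explicit by comparing a conditional probability with the corresponding marginal. The sketch below is an added illustration; it assumes the closed form ¼(1 − ab cos θ) for the singlet joint probabilities, which agrees with (VII. 73) for a = 1, and the function names are hypothetical.

```python
import math

def joint(a, b, theta):
    """Assumed singlet probability for outcomes a, b = +1 or -1 at relative angle theta."""
    return 0.25 * (1 - a * b * math.cos(theta))

def conditional_a_given_b(a, b, theta):
    """Probability of outcome a at A, given that outcome b was found at B."""
    p_b = joint(+1, b, theta) + joint(-1, b, theta)
    return joint(a, b, theta) / p_b

theta = math.radians(60)
p_cond = conditional_a_given_b(+1, +1, theta)          # sin^2(30 degrees), as in (VII. 75)
p_marg = joint(+1, +1, theta) + joint(+1, -1, theta)   # marginal probability of a = +1
print(p_cond, p_marg)
```

At 60° the conditional probability is about 0.25 while the marginal is 0.5: knowing the outcome at B improves the prediction for A, although the marginal itself does not depend on the setting at B.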

The singlet $|\Psi_0\rangle \in \mathbb{C}^4$ does violate the Bell inequalities for suitably chosen spin quantities.<br />

The singlet is not factorizable, i.e., it cannot be written as a direct product of two states in $\mathbb{C}^2$; it is<br />

entangled. We can raise the question whether types of quantum mechanical states exist which do not violate<br />

a Bell inequality for any choice of four spin quantities.<br />

Capasso, Fortunato and Selleri (1973) proved that the CHSH inequality, (VII. 13), is upheld for<br />

every choice of four spin quantities by all factorizable states and by all mixtures thereof. Violations<br />

are therefore only possible for entangled states. Vice versa, Home and Selleri (1991, pp. 22 - 26)<br />

proved that for every entangled pure state, that is, a state which cannot be written as a direct product,<br />

it is always possible to choose spin quantities in such a way that the CHSH inequality is violated.<br />

These results can be summarized in the statement that, for pure states, entanglement and violation of Bell inequalities<br />

are equivalent. It confirms Schrödinger’s insight from 1935 (Schrödinger 1935a) that the<br />

existence of entangled states marks the cardinal difference between classical and quantum mechanics.<br />

VII. 6<br />

AN ALGEBRAIC PROOF WITHOUT INEQUALITIES<br />

The contradiction between a local deterministic or a local stochastic HVT, both either autonomous<br />

or contextual, on the one hand, and quantum mechanics on the other hand, is statistical in nature, because<br />

it concerns inequalities in terms of expectation values or probabilities, like all Bell’s theorems.<br />

But Kochen and Specker’s theorem, which we discussed in V. 3, does not contain any inequalities. In<br />

this case it is customary to speak of an algebraic proof.<br />

This raises the question whether an algebraic proof of Bell’s theorems is also possible, that is,<br />

without appealing to the measurement postulate. The answer is affirmative. Using a spin state of a<br />

composite system of four particles, D.M. Greenberger, M.A. Horne and A. Zeilinger (1989) showed<br />

that it is mathematically impossible to locally and separably assign values to all spin quantities. Here<br />

we will show a simplified version given by N.D. Mermin (1993), where, in using |GHZ⟩, we refer to<br />

the aforementioned authors.



Consider a composite system of three spin-1/2 fermions with pure states in the direct product<br />

Hilbert space $\mathbb{C}^2 \otimes \mathbb{C}^2 \otimes \mathbb{C}^2 = \mathbb{C}^8$. We look at 10 physical quantities which correspond to the<br />

spin operators represented in the Mermin pentagon, figure VII. 8. In this diagram $\sigma_y^1$ is shorthand<br />

for $\sigma_y(1) \otimes \mathbb{1}(2) \otimes \mathbb{1}(3)$, and $\sigma_y^1 \sigma_y^2 \sigma_x^3$ is likewise for $\sigma_y(1) \otimes \sigma_y(2) \otimes \sigma_x(3)$, etc. On every<br />

straight line through the Mermin pentagon we find four commuting operators. These operators are<br />

products of commuting operators with eigenvalues ±1 and therefore have eigenvalues ±1 also.<br />

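[Figure VII. 8: The Mermin pentagon, showing the ten operators $\sigma_x^1$, $\sigma_y^1$, $\sigma_x^2$, $\sigma_y^2$, $\sigma_x^3$, $\sigma_y^3$, $\sigma_x^1 \sigma_x^2 \sigma_x^3$, $\sigma_y^1 \sigma_y^2 \sigma_x^3$, $\sigma_y^1 \sigma_x^2 \sigma_y^3$ and $\sigma_x^1 \sigma_y^2 \sigma_y^3$.]<br />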

Using the properties of the Pauli matrices (III. 122), p. 66, it can be shown that<br />

$$\big( \sigma_x(1) \otimes \sigma_y(2) \otimes \sigma_y(3) \big) \big( \sigma_y(1) \otimes \sigma_x(2) \otimes \sigma_y(3) \big) \big( \sigma_y(1) \otimes \sigma_y(2) \otimes \sigma_x(3) \big) = -\, \sigma_x(1) \otimes \sigma_x(2) \otimes \sigma_x(3),$$ (VII. 77)<br />

where we note that the four operators acting in $\mathbb{C}^8$ commute. Consequently, they have a simultaneous<br />

eigenstate in $\mathbb{C}^8$, having eigenvalue +1 for the three operators on the left-hand side of the equation,<br />

and eigenvalue −1 for the operator on the right-hand side. The entangled state in $\mathbb{C}^8$,<br />

$$|\mathrm{GHZ}\rangle := \tfrac{1}{2}\sqrt{2}\, \big( |z\uparrow\rangle \otimes |z\uparrow\rangle \otimes |z\uparrow\rangle - |z\downarrow\rangle \otimes |z\downarrow\rangle \otimes |z\downarrow\rangle \big),$$ (VII. 78)<br />

is such a state.<br />
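The operator identity (VII. 77) and the eigenvalue claims for |GHZ⟩ can be verified with explicit 8 × 8 matrices. The sketch below is an added illustration in plain Python; the helper names (`kron`, `matmul`, `apply`, `close`, `triple`) are hypothetical, and the standard Pauli matrices in the basis (z↑, z↓) are assumed, with basis index 0 for |z↑⟩⊗|z↑⟩⊗|z↑⟩ and index 7 for |z↓⟩⊗|z↓⟩⊗|z↓⟩.

```python
import math

def kron(A, B):
    """Kronecker product of two matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in A for row_b in B]

def matmul(A, B):
    """Matrix product of nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def apply(A, v):
    """Matrix acting on a vector."""
    return [sum(x * c for x, c in zip(row, v)) for row in A]

def close(u, v, tol=1e-12):
    """Entrywise comparison for numbers, vectors or matrices."""
    if isinstance(u, list):
        return len(u) == len(v) and all(close(x, y, tol) for x, y in zip(u, v))
    return abs(u - v) < tol

SX = [[0, 1], [1, 0]]       # sigma_x
SY = [[0, -1j], [1j, 0]]    # sigma_y

def triple(A, B, C):
    """A(1) ⊗ B(2) ⊗ C(3) as an 8 x 8 matrix."""
    return kron(kron(A, B), C)

XYY, YXY, YYX, XXX = (triple(SX, SY, SY), triple(SY, SX, SY),
                      triple(SY, SY, SX), triple(SX, SX, SX))

# (VII. 77): the product of the three mixed operators equals minus XXX.
lhs = matmul(matmul(XYY, YXY), YYX)
identity_77 = close(lhs, [[-x for x in row] for row in XXX])

# (VII. 78) in components, and its eigenvalues under the four operators.
s = 1 / math.sqrt(2)
ghz = [s, 0, 0, 0, 0, 0, 0, -s]
plus_one = all(close(apply(op, ghz), ghz) for op in (XYY, YXY, YYX))
minus_one = close(apply(XXX, ghz), [-c for c in ghz])
print(identity_77, plus_one, minus_one)
```

All three checks succeed, so the state is a simultaneous eigenstate with the eigenvalues stated in the text.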

We assume that the three particles are already far away from each other and are moving still further<br />

apart, and the composite system is, as far as spin is concerned, in the state |GHZ⟩. A measurement of<br />

two particles, of which we assume that it does not influence the third particle in any way, determines<br />

the value of the third particle because, according to quantum mechanics, the product of the outcomes<br />

of measurement is determined.



According to a HVT, at the moment a measurement is made the values of the spin quantities<br />

are revealed. If we call these values $w_x(1)$ for the spin in $x$-direction of particle 1, etc., then, because<br />

$|\mathrm{GHZ}\rangle$, (VII. 78), is a simultaneous eigenstate of the four quantities in $\mathbb{C}^8$, (VII. 77), it must<br />

hold that<br />

$$w_x(1)\, w_y(2)\, w_y(3) = w_y(1)\, w_x(2)\, w_y(3) = w_y(1)\, w_y(2)\, w_x(3) = +1,$$ (VII. 79)<br />

and<br />

$$w_x(1)\, w_x(2)\, w_x(3) = -1.$$ (VII. 80)<br />

The product of these four factors is<br />

$$\big( w_x(1)\, w_y(2)\, w_y(3) \big) \big( w_y(1)\, w_x(2)\, w_y(3) \big) \big( w_y(1)\, w_y(2)\, w_x(3) \big) \big( w_x(1)\, w_x(2)\, w_x(3) \big) = (+1)(+1)(+1)(-1) = -1.$$ (VII. 81)<br />

But if we consider the product as a product of the 12 values of these spin quantities we find<br />

$$w_x(1)\, w_y(2)\, w_y(3)\, w_y(1)\, w_x(2)\, w_y(3)\, w_y(1)\, w_y(2)\, w_x(3)\, w_x(1)\, w_x(2)\, w_x(3) = w_x^2(1)\, w_y^2(1)\, w_x^2(2)\, w_y^2(2)\, w_x^2(3)\, w_y^2(3) = 1^6 = +1.$$ (VII. 82)<br />

This leads to +1 = −1, which is, of course, an algebraical absurdity. And indeed, this is an algebraic<br />

proof since it contains no probabilities or inequalities.<br />
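The impossibility of such a value assignment can also be confirmed by exhaustive search over the 2⁶ candidate assignments. The Python sketch below is an added illustration; the function name `satisfied_constraints` and the ordering of the six values are assumptions made for this example.

```python
from itertools import product

def satisfied_constraints(w):
    """Count how many of the four product constraints a value assignment meets.

    w = (wx1, wy1, wx2, wy2, wx3, wy3), each entry +1 or -1.
    """
    wx1, wy1, wx2, wy2, wx3, wy3 = w
    checks = [
        wx1 * wy2 * wy3 == +1,   # (VII. 79), first product
        wy1 * wx2 * wy3 == +1,   # (VII. 79), second product
        wy1 * wy2 * wx3 == +1,   # (VII. 79), third product
        wx1 * wx2 * wx3 == -1,   # (VII. 80)
    ]
    return sum(checks)

counts = [satisfied_constraints(w) for w in product((+1, -1), repeat=6)]
best = max(counts)         # largest number of constraints met at once
n_full = counts.count(4)   # assignments meeting all four constraints
print(best, n_full)
```

No assignment meets all four constraints; at most three can hold simultaneously, which is the enumerated form of the contradiction +1 = −1.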

EXERCISE 35. What kind of HVT is excluded by the foregoing reasoning? Which postulates of<br />

quantum mechanics are necessary to obtain the contradiction?<br />

VII. 7<br />

MISCELLANEA<br />

The literature concerning the Bell inequalities has grown enormously since the<br />

1970s, although its growth has slowed in recent years. To conclude this<br />

chapter we will briefly discuss some of the main topics.<br />

VII. 7. 1<br />

LOCALITY AND RELATIVITY<br />

Although in these lecture notes we have restricted ourselves to non-relativistic quantum mechanics,<br />

and the speed of light did not play a role in our considerations, it is, of course, especially the<br />

special theory of relativity which provides the inspiration to study the (im)possibility of signaling.<br />



Therefore, it is interesting to consider the EPRB experiment schematically in a Minkowski diagram,<br />

figure VII. 9.<br />

Figure VII. 9: Minkowski diagram of the EPRB experiment (axes x and ct), where λ is in the past light cones of both A and B<br />

A natural requirement of locality for a relativistic stochastic HVT is that the probability of an<br />

outcome A depends exclusively on the variables which specify the state in the past light cone of<br />

the measuring event at A, and likewise for B. Bell has called this local causality. We have seen that<br />

quantum mechanics is not a locally causal theory. Indeed, the probability of an outcome at A cannot be<br />

influenced by the choice of the direction of measurement $\vec{b}$ at B, but with the outcome at B, which<br />

can be registered there, an observer at B can make a prediction concerning the particle at A<br />

which an observer at A cannot make, even if he has complete knowledge of the state in the past light<br />

cone of A.<br />


VII. 7. 2 LOCALITY VERSUS CONDITIONAL INDEPENDENCE<br />

A problem that is brought up in some publications, e.g. Fine (1982), De Muynck (1986, 1996),<br />

is to what extent locality is necessary to derive the Bell inequalities. These authors argue that<br />

‘requirements of locality’ express only a special form of statistical independence. The distance<br />

between the measuring apparatuses is in no way manifest in the requirement. Although<br />

‘locality’ is a term which seems to presuppose a space-time, space-time is conspicuous by<br />

its absence in the relevant locality assumptions; they are all probability statements without reference to<br />

space or time.<br />

Indeed, strictly speaking one cannot say that these assumptions express a requirement of locality.<br />

It could be possible to expect an analogous independence for a hypothetical pair of particles, for<br />

example a photon and a gluon, which absolutely cannot interact with each other, but are located very<br />

close to each other. The essence is that in a local theory the large distance between the particles can<br />

be taken to be a sufficient, but not necessary condition for the absence of interactions.<br />

The requirement of outcome independence in the HVT is not a representation of the requirement<br />

of locality; it has only been motivated by it. The conclusion that is sometimes drawn from this, that<br />

apparently locality itself is irrelevant for the Bell inequality, is, however, incorrect. An actual violation<br />

of the Bell inequality means that every stochastic HVT satisfying factorizability as formulated in<br />

section VII. 5 is excluded, and therefore the local versions are excluded as well.



VII. 7. 3 DETERMINISM<br />

Another widespread view is that the derivation of the Bell inequalities always relies on a supposition<br />

of determinism in the HVT, so that giving up determinism would be a possible way out of<br />

the Bell inequalities.<br />

Bell himself has emphasized the inadequacy of this view. Determinism, which is the possibility<br />

to make predictions concerning a remote object with certainty before making measurements, indeed<br />

plays an important role in the original version. But this is a consequence of the perfect correlation<br />

in the quantum mechanical expression (VII. 5), i.e., this determinism follows from the singlet state<br />

itself, and is not a specific supposition of the HVT, see for instance Suppes and Zanotti (1976), and<br />

Dieks (1983).<br />

We saw that in a stochastic, or indeterministic, HVT the Bell inequalities are also derivable, so<br />

that giving up determinism does not help. If anything, the opposite is true: super-determinism,<br />

the supposition that the choice of the direction of measurement by the experimenter is also<br />

determined in advance, offers a way out of the Bell inequalities.


VIII THE MEASUREMENT PROBLEM<br />

[. . . ] if one has to stick to these darn quantum jumps then I regret that I ever have taken<br />

part in the whole thing.<br />

— Erwin Schrödinger<br />

In this final chapter we will elaborate on the most important interpretation problem, the measurement<br />

problem, which has been the subject of an ever-continuing series of publications. We will give<br />

an introduction to Von Neumann’s quantum mechanical measurement theory and formulate the<br />

measurement problem; we will then go through a number of attempts to solve it, and finally we will<br />

discuss some criticism of the theory.<br />

VIII. 1 INTRODUCTION<br />

The term ‘measurement’ plays a very special role in quantum mechanics, and we suggest a short<br />

rereading of the first paragraphs of chapter V. It is remarkable that the term arises in the Von Neumann<br />

postulates as described in chapter III, p. 41, ff. It appears both in the measurement postulate, which specifies the<br />

possible outcomes of measurement and gives a physical meaning, in terms of outcomes of measurement, to the probability measure<br />

determined by the state vector, or the state operator, and<br />

in the projection postulate, which establishes the evolution in time of the state at measurement.<br />

That special role also becomes apparent in the debates concerning the interpretation of the theory,<br />

where it is frequently remarked that measurement ‘creates’ the value for a quantity, or that it causes a<br />

sudden state change, as expressed by Dirac (1958, p. 36),<br />

In this way we see that a measurement always causes the system to jump into an eigenstate<br />

of the dynamical variable that is being measured, the eigenvalue this eigenstate<br />

belongs to being equal to the result of the measurement.<br />

From the perspective of classical physics, this is extremely unusual. In Newton’s theory of gravitation,<br />

or the electrodynamics of Faraday and Maxwell, measurements are sometimes mentioned, as<br />

suppliers of experimental facts, but never as specific types of operation on physical systems, needing<br />

a separate treatment in the theory.<br />

The point here is not only that measurements in classical physics, as is frequently stated, always<br />

bring about a negligible or compensable disturbance of the system and therefore can remain outside<br />

consideration. Much more important is that in classical physics there is no distinction in principle



between processes which serve as measurements and processes that do not. Every physical process<br />

or every mutual influence of physical systems can, under suitable circumstances, be considered as<br />

a measurement. Since it is the physical theory that indicates which physical processes in nature are<br />

possible, the theory itself also provides the criterion for the kinds of measurements which are possible.<br />

According to Von Neumann’s postulates, in quantum mechanics this is exactly the other way<br />

around. First we must, according to the aforementioned postulates, have a criterion to know when a<br />

process is a measurement, before we can indicate what the theory has to say concerning the process,<br />

before we can apply the postulates. That the term measurement in this way gets a more fundamental<br />

status than the physical theory, is also expressed by the words of Pauli as quoted in chapter I, p. 9,<br />

that a measurement creating values is “outside the laws of nature”.<br />

Intuition tells us that measurements are just an ‘ordinary kind’ of physical interaction, and this<br />

intuition cannot easily be swept aside, as the following illustration shows. Consider a photon which<br />

has gone through a slit and is on its way to a photographic plate. If we presume the interaction with<br />

this photographic plate to be a measurement, the wave function of the photon must, according to the<br />

projection postulate, collapse on arrival at the plate. But we also know that the photographic plate has<br />

a microscopic structure. It contains silver atoms in an emulsion which can be excited by the photon<br />

and start a chemical process in such a way that we can see something when the plate is developed.<br />

Would it not be plausible that quantum mechanics could describe such a process using a Schrödinger<br />

equation?<br />

In every way this event looks like a physical interaction which falls completely within the<br />

well-known laws of nature, rather than outside them. And if this is denied, how shall we decide at all when<br />

a microscopic interaction between a photon and an atom can and when it cannot be labeled as a<br />

measurement? If one asks an experimental physicist how her measurement setup works, one will be<br />

given an answer in which physical interactions, generally of electromagnetic nature, are of the utmost<br />

importance. It seems absurd to maintain that events take place in the laboratory that are “outside the laws<br />

of nature”.<br />

The clash between the conception that measurements do not differ from other physical interactions<br />

on the one hand, and the fact that measurements in quantum mechanics have acquired a special status<br />

because they are not classified as physical interactions on the other hand, is called the quantum<br />

mechanical measurement problem in the broad sense.<br />

VIII. 2 MEASUREMENT ACCORDING TO CLASSICAL PHYSICS<br />

Although usually no special attention is given to measurements in classical physics, it is no problem<br />

to give a general, schematic description of how a measurement is treated classically.<br />

A measurement brings about a correlation between a quantity A of a physical system S which<br />

is, within the context of a measurement, frequently called an object system, and a quantity R, where<br />

the R comes from reading, which is characteristic for the measuring apparatus M, the apparatus<br />

being a physical system also. In classical physics we assume that A has a certain value $a \in \mathbb{R}$,<br />

where $a$ is an element from a set of possible values, for instance $\{a_1, \ldots, a_n\} \subset \mathbb{R}$, and that after the<br />

measurement process R has a value $r_j = m(a_j)$, where $m$ is a bijection from the possible values of A<br />

before the measurement to the possible values of R after the measurement.



Take, for example, S to be yourself and M to be a balance, A is your weight and R is the reading<br />

of the pointer of the balance. Now you have an unknown weight value, a, which is revealed by<br />

the balance indicating r = m(a) = 63 kg. The role of a measurement is pragmatic; the value of<br />

a physical quantity of the object system which is not directly or not easily observable, for example<br />

mass, is correlated to a quantity that is directly observable, in this case the position of a pointer. For a<br />

correlation to occur between A and R there must be an interaction between S and M. This interaction<br />

can, potentially, influence the value of A in such a way that the value before the measurement can<br />

change to another value after measurement. Measurement is a process looking towards the past and<br />

its aim is to reveal the value of A before the interaction with M.<br />

If it is possible to predict, from the value $a$ and the interaction between S and M, the value $a'$<br />

which A has after measurement, then the measurement also looks at the future and acts like an apparatus<br />

which prepares a state of S in which A has the value $a'$. Think, for example, of an ammeter<br />

in an electric circuit with an energy source of $V$ volt; if the current through a resistor $R$ is $I = V/R$<br />

without the ammeter, then, after the ammeter has been connected in series with the resistor, the current<br />

equals $I' = V/(R + R_s)$, where $R_s$ is the internal resistance of the ammeter. In case $a'$ equals $a$, the<br />

measurement is called non-disturbing or ideal. The measurement process thus has two aspects: what<br />

happens to the measuring apparatus M, and what happens to the physical system S, i.e. measurement<br />

and state preparation.<br />
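The ammeter example can be made concrete with a few lines of Python. This is only an illustrative sketch: the function names and the numerical values of the source voltage and the resistances are hypothetical choices made here. It compares the undisturbed current $I = V/R$ with the current $I' = V/(R + R_s)$ that flows once the ammeter is in the circuit.

```python
def current_without_ammeter(V, R):
    """Current I = V / R through the resistor alone."""
    return V / R

def current_with_ammeter(V, R, R_s):
    """Current I' = V / (R + R_s) with an ammeter of internal resistance R_s in series."""
    return V / (R + R_s)

V, R = 10.0, 100.0                     # hypothetical source voltage (volt) and resistance (ohm)
I = current_without_ammeter(V, R)
for R_s in (10.0, 1.0, 0.01):          # hypothetical internal resistances (ohm)
    I_prime = current_with_ammeter(V, R, R_s)
    relative_disturbance = (I - I_prime) / I   # algebraically equal to R_s / (R + R_s)
    print(R_s, I_prime, relative_disturbance)
```

The relative disturbance shrinks as $R_s$ becomes small compared with $R$, which is the classical sense in which a measurement can approach the ideal case $a' = a$.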

In classical physics the measurement interaction can be taken to be arbitrarily small, in which<br />

case the value of A is not disturbed. Therefore, the transition in such an ideal measurement process<br />

is<br />

$$(a_j, r_0) \longrightarrow (a_j, r_j) = \bigl( a_j, m(a_j) \bigr). \tag{VIII. 1}$$<br />

Notice that the characteristics of the measurement are left out of the consideration. The method<br />

of measuring does not have anything to do with the phenomenon one wants to get information about.<br />

The motion of the planets in the gravitational field of the sun is studied by looking at them, i.e., by<br />

using the fact that the planets reflect sunlight. The optical instruments that are used have nothing to<br />

do with the gravitational motion under examination.<br />

Also notice that in this consideration the question how to measure A is only transformed into the<br />

question how to find the value of R. If we also would have to measure the value of R, this could<br />

lead to an infinite chain of measuring apparatuses. This is avoided by assuming that the quantity R is<br />

directly observable, hence the term pointer reading for R, where we have to take the term ‘pointer’<br />

very generally, for instance, screens showing results of measurements or results printed on paper are<br />

included in the term.<br />

We appeal in our description to a distinction between two different types of quantities; the directly<br />

observable quantities, that is, observable to the naked eye, versus the not directly observable or unobservable<br />

quantities. But this is not a distinction which corresponds to a fundamental distinction between<br />

these quantities; in classical physics all quantities are treated as properties of objects. The fact that we<br />

stop at a directly observable quantity R is a decision based on purely contingent factors, particularly<br />

human physiology and the physics of the human senses.



VIII. 3 MEASUREMENT ACCORDING TO QUANTUM MECHANICS<br />

The following schematic representation of the measurement process in quantum mechanics is<br />

given by Von Neumann (1932).<br />

Suppose that A is a physical quantity of the object system S, represented quantum mechanically<br />

by the maximal operator $A$ on Hilbert space $\mathcal{H}_S$, having a discrete spectrum $a_1, \ldots, a_N$. Now<br />

let S interact with a measuring apparatus M, where M is described quantum mechanically also.<br />

For M to be able to function as a measuring apparatus, it has to have a<br />

pointer quantity R, represented by the operator $R$ on Hilbert space $\mathcal{H}_M$, having orthonormal eigenstates<br />

$|r_0\rangle, \ldots, |r_N\rangle$. These eigenstates have to be orthonormal since they correspond to pointer<br />

readings which can be distinguished by the human eye. Let $|r_0\rangle$ be the eigenstate in which the pointer<br />

shows no deflection. The Hilbert space of the composite system SM is $\mathcal{H} = \mathcal{H}_S \otimes \mathcal{H}_M$ with<br />

$\dim \mathcal{H}_M = \dim \mathcal{H}_S + 1$, since the eigenbasis of $R$ includes $|r_0\rangle$, whereas that of $A$ contains no corresponding $|a_0\rangle$.<br />

Prior to the measurement, the measuring apparatus M is in the eigenstate $|r_0\rangle$. We want this state<br />

to change, as a result of the measurement interaction, into the eigenstate $|r_j\rangle$ which is indicative of the<br />

value $a_j$ of A; thus, let S initially be in the eigenstate $|a_j\rangle$ of $A$. Moreover, we want the measurement<br />

to be ideal, so that the state $|a_j\rangle$ of S does not change.<br />

Von Neumann showed that this transition can indeed be brought about by a unitary transformation,<br />

which means we have to find for the composite system SM a unitary evolution operator U, inducing<br />

the transition<br />

$$U \bigl( |a_j\rangle \otimes |r_0\rangle \bigr) = |a_j\rangle \otimes |r_j\rangle, \tag{VIII. 2}$$<br />

where U describes the measurement interaction lasting some unspecified time interval.<br />

EXERCISE 36. Show that the operator<br />

$$U = \sum_{l=1}^{N} \sum_{m=0}^{N} |a_l\rangle \otimes |r_{[l+m]}\rangle \, \langle a_l| \otimes \langle r_m| \tag{VIII. 3}$$<br />

(a) is unitary, and (b) induces the desired transition (VIII. 2). Here, $[l + m]$ means $l + m$ modulo<br />

$N + 1$, i.e.: $[N + 1] = 0$, $[N + 2] = 1$, etc.<br />
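Before attempting the proof, it can be helpful to check the two claims numerically for a small case. The NumPy sketch below is an illustration under stated assumptions, not a solution to the exercise: the helper names `ket` and `measurement_unitary`, the choice $N = 3$, and the representation of $|a_l\rangle \otimes |r_m\rangle$ as the Kronecker product of standard basis vectors are all choices made here.

```python
import numpy as np

def ket(dim, index):
    """Standard basis vector of length dim with a 1 at position index."""
    v = np.zeros(dim)
    v[index] = 1.0
    return v

def measurement_unitary(N):
    """Matrix of U in (VIII. 3) on C^N (x) C^(N+1), with |a_l>, l = 1..N, and |r_m>, m = 0..N."""
    d_M = N + 1
    U = np.zeros((N * d_M, N * d_M))
    for l in range(1, N + 1):
        for m in range(d_M):
            out_state = np.kron(ket(N, l - 1), ket(d_M, (l + m) % d_M))  # |a_l> (x) |r_[l+m]>
            in_state = np.kron(ket(N, l - 1), ket(d_M, m))               # |a_l> (x) |r_m>
            U += np.outer(out_state, in_state)
    return U

N = 3
U = measurement_unitary(N)

# (a) U is unitary (here real, so the adjoint is the transpose).
print(np.allclose(U.T @ U, np.eye(N * (N + 1))))

# (b) U maps |a_j> (x) |r_0> to |a_j> (x) |r_j> for every j.
print(all(np.allclose(U @ np.kron(ket(N, j - 1), ket(N + 1, 0)),
                      np.kron(ket(N, j - 1), ket(N + 1, j)))
          for j in range(1, N + 1)))
```

In this representation $U$ is a permutation matrix: for fixed $l$ the map $m \mapsto [l + m]$ merely reshuffles the pointer states, which is a useful hint for the general proof.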

The formula (VIII. 2) strongly resembles the transition (VIII. 1). Apparently, everything we desired<br />

concerning the ideal measurement process in quantum mechanics, including the requirement<br />

that the value of A must not be disturbed, can be achieved using a unitary operator. At first sight,<br />

there does not seem to be any problem with a completely quantum mechanical treatment of the measurement<br />

interaction, taken as an ordinary physical process obeying Schrödinger’s equation. As in the<br />

classical case, the method of measuring is not discussed. We also did not appeal to the measurement<br />

or the projection postulate.



However, at a second look, the transition (VIII. 2) turns out to have peculiar consequences. The<br />

formula (VIII. 2) assumed that the object system S was, before the measurement, in an eigenstate of<br />

A. But what if S is in an arbitrary state $|\psi\rangle \in \mathcal{H}_S$?<br />

We can decompose this arbitrary state $|\psi\rangle$ into the orthonormal eigenstates $|a_j\rangle$ of $A$ with coefficients<br />

$c_j = \langle a_j | \psi\rangle$. Therefore, using $|\psi\rangle = \sum_j c_j |a_j\rangle$ and the linearity of the evolution operator it<br />

follows that<br />

$$U \bigl( |\psi\rangle \otimes |r_0\rangle \bigr) = U \sum_{j=1}^{N_S} c_j\, |a_j\rangle \otimes |r_0\rangle = \sum_{j=1}^{N_S} c_j\, U \bigl( |a_j\rangle \otimes |r_0\rangle \bigr) = \sum_{j=1}^{N_S} c_j\, |a_j\rangle \otimes |r_j\rangle =: |\Phi\rangle. \tag{VIII. 4}$$<br />

We see that the state $|\Phi\rangle$ of the composite system of object S and measuring apparatus M after the<br />

measurement is no longer a product state; rather, it is entangled. This implies that we cannot describe<br />

S, nor M, with a pure state; the partial traces over M and S yield mixed states for S and M respectively, see section III. 4.<br />
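The statement about the partial traces can be illustrated numerically. The following NumPy sketch is an assumption-laden illustration, not part of Von Neumann's scheme: the function names, the example coefficients $c = (0.6, 0.8)$, and the use of the purity $\mathrm{Tr}\, W^2$ as the criterion for mixedness are choices made here. It builds $|\Phi\rangle$ of (VIII. 4) and computes both reduced state operators.

```python
import numpy as np

def entangled_final_state(c):
    """|Phi> = sum_j c_j |a_j> (x) |r_j> as a vector in C^N (x) C^(N+1)."""
    N = len(c)
    phi = np.zeros(N * (N + 1), dtype=complex)
    for j in range(1, N + 1):
        phi[(j - 1) * (N + 1) + j] = c[j - 1]   # component along |a_j> (x) |r_j>
    return phi

def reduced_states(phi, N):
    """Partial traces of |Phi><Phi|: over M (state of S) and over S (state of M)."""
    amplitudes = phi.reshape(N, N + 1)          # amplitudes[s, m] belongs to |a_(s+1)> (x) |r_m>
    W_S = amplitudes @ amplitudes.conj().T      # trace over M
    W_M = amplitudes.T @ amplitudes.conj()      # trace over S
    return W_S, W_M

c = np.array([0.6, 0.8])                        # hypothetical normalized coefficients
W_S, W_M = reduced_states(entangled_final_state(c), len(c))
print(np.round(W_S.real, 12))                   # diagonal, with entries |c_j|^2
print(np.trace(W_S @ W_S).real)                 # purity below 1, so the state of S is mixed
print(np.trace(W_M @ W_M).real)                 # the same purity for M
```

Both reduced states are diagonal with weights $|c_j|^2$; a purity equal to 1 is recovered only when a single coefficient $c_j$ is non-zero, i.e. when S started in an eigenstate of $A$.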

This aspect has no classical analogue. We will come back to this, but first we consider the question<br />

whether this quantum mechanical description of the measurement process is compatible with the<br />

measurement postulate. Or, more precisely, whether application of the measurement postulate to M<br />

leads to the same result as its direct application to S. And we ask whether the desired correlation<br />

between the values of A and R is achieved. We will show now that this is indeed the case.<br />

The quantity R of the measuring apparatus M is represented on the Hilbert space $\mathcal{H}_S \otimes \mathcal{H}_M$ of<br />

the composite system SM as $\mathbb{1} \otimes R$. The probability to find for this quantity the value $r_k$ is, according<br />

to the measurement postulate,<br />

$$\mathrm{Prob}_{|\Phi\rangle}(R : r_k) = \langle\Phi| \bigl( \mathbb{1} \otimes |r_k\rangle \langle r_k| \bigr) |\Phi\rangle. \tag{VIII. 5}$$<br />

With (VIII. 4) this yields<br />

$$\mathrm{Prob}_{|\Phi\rangle}(R : r_k) = |c_k|^2, \tag{VIII. 6}$$<br />

where we have used the orthonormality of the $|r_k\rangle \in \mathcal{H}_M$. This is the same result as yielded by<br />

direct application of the measurement postulate to the arbitrary $|\psi\rangle$ from (VIII. 4). Apparently, the<br />

probability to find an outcome $r_k$ when measuring R of M is always equal to the probability to find<br />

the outcome $a_k$ of A on S. The former measurement can therefore be regarded as a substitute for the<br />

latter.<br />

The validity of (VIII. 6) itself does not show that a correlation between the value of A and R has<br />

been established. To show that such a correlation exists, we have to know the probability of a certain<br />

pair of outcomes $(a_i, r_k)$ for $A \otimes R$, in the state $|\Phi\rangle$ of (VIII. 4). The joint probability to find this pair<br />

of outcomes is<br />

$$\mathrm{Prob}_{|\Phi\rangle}(A : a_i \wedge R : r_k) = \langle\Phi| \bigl( |a_i\rangle \langle a_i| \otimes |r_k\rangle \langle r_k| \bigr) |\Phi\rangle = \bigl| \bigl( \langle a_i| \otimes \langle r_k| \bigr) |\Phi\rangle \bigr|^2 = |c_i|^2 \delta_{ik}. \tag{VIII. 7}$$



The conditional probability to find for A the value $a_i$, given that for R the value $r_k$ has been found,<br />

is therefore<br />

$$\mathrm{Prob}_{|\Phi\rangle}(A : a_i \mid R : r_k) = \frac{\mathrm{Prob}(A : a_i \wedge R : r_k)}{\mathrm{Prob}(R : r_k)} = \frac{|c_i|^2 \delta_{ik}}{|c_k|^2} = \delta_{ik}. \tag{VIII. 8}$$<br />

In other words, in the state $|\Phi\rangle$ a strict correlation exists between the quantities A and R, represented<br />

quantum mechanically by the operators $A$ and $R$.<br />
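The chain (VIII. 5) to (VIII. 8) can be traced numerically with explicit projectors. The NumPy sketch below is illustrative only: the helper names and the example coefficients are hypothetical, all $c_k$ are assumed non-zero so that the conditional probabilities are defined, and the probabilities are evaluated directly from the expectation value of $|a_i\rangle\langle a_i| \otimes |r_k\rangle\langle r_k|$ in the state $|\Phi\rangle$.

```python
import numpy as np

def ket(dim, index):
    """Standard basis vector of length dim with a 1 at position index."""
    v = np.zeros(dim)
    v[index] = 1.0
    return v

def final_state(c):
    """|Phi> = sum_j c_j |a_j> (x) |r_j>, with |a_j> in C^N and |r_j> in C^(N+1)."""
    N = len(c)
    phi = np.zeros(N * (N + 1), dtype=complex)
    for j in range(1, N + 1):
        phi += c[j - 1] * np.kron(ket(N, j - 1), ket(N + 1, j))
    return phi

def joint_probability(phi, N, i, k):
    """<Phi| (|a_i><a_i| (x) |r_k><r_k|) |Phi>, the left-hand side of (VIII. 7)."""
    P_a = np.outer(ket(N, i - 1), ket(N, i - 1))
    P_r = np.outer(ket(N + 1, k), ket(N + 1, k))
    return np.vdot(phi, np.kron(P_a, P_r) @ phi).real

c = np.array([1, 1j, -1]) / np.sqrt(3)          # hypothetical coefficients, all non-zero
N = len(c)
phi = final_state(c)
joint = np.array([[joint_probability(phi, N, i, k) for k in range(1, N + 1)]
                  for i in range(1, N + 1)])    # rows: a_i, columns: r_k
marginal_R = joint.sum(axis=0)                  # Prob(R : r_k), cf. (VIII. 6)
conditional = joint / marginal_R                # Prob(A : a_i | R : r_k), cf. (VIII. 8)
print(np.round(conditional, 12))                # the identity matrix, i.e. delta_ik
```

The phases of the $c_j$ drop out of every probability computed here, which is why the joint distribution alone cannot distinguish the superposition $|\Phi\rangle$ from a mixture with the same weights.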

The schematic representation of the ideal measurement process is, as we have seen, consistent<br />

with the measurement postulate in the sense that a measurement on M can be a substitute for a measurement<br />

on S. Notice that, to answer this question, we did appeal to the measurement postulate.<br />

This is unavoidable, since the final state after the measurement process, (VIII. 4), is an entangled<br />

quantum state. We can only specify its empirical consequences by appealing to the meaning quantum<br />

mechanics attributes to such quantum states, and in Von Neumann’s postulates that meaning is established<br />

by means of the measurement postulate. Unfortunately, this postulate forces us to consider a<br />

measurement again, namely, a measurement on the measuring apparatus M itself, by reading off the<br />

position of the pointer. Now we have to ask if this second measurement can also be represented as a<br />

normal interaction.<br />

Suppose that we introduce a second measuring apparatus $M'$ which we use to read off the result<br />

of M using a new pointer quantity $R'$, represented by the operator $R'$ in $\mathcal{H}_{M'}$. As an example, we<br />

can think of a quantum mechanical description of our eye. Schematically, we then have the process<br />

$$|r_j\rangle \otimes |r'_0\rangle \longrightarrow |r_j\rangle \otimes |r'_j\rangle, \tag{VIII. 9}$$<br />

where the $|r'_j\rangle$ are the eigenstates of $R'$ of $M'$. Let $U'$ be the unitary operator describing the measurement<br />

by $M'$ on M, lasting again some unspecified amount of time. Now we have, for the composite<br />

system $SMM'$ in the Hilbert space $\mathcal{H} = \mathcal{H}_S \otimes \mathcal{H}_M \otimes \mathcal{H}_{M'}$,<br />

$$|a_j\rangle \otimes |r_0\rangle \otimes |r'_0\rangle \;\xrightarrow{\;U\;}\; |a_j\rangle \otimes |r_j\rangle \otimes |r'_0\rangle \;\xrightarrow{\;U'\;}\; |a_j\rangle \otimes |r_j\rangle \otimes |r'_j\rangle, \tag{VIII. 10}$$<br />

and therefore, if we start from a general initial state $|\psi\rangle \otimes |r_0\rangle \otimes |r'_0\rangle$, the final state will be<br />

$$|\Phi'\rangle = U' U \bigl( |\psi\rangle \otimes |r_0\rangle \otimes |r'_0\rangle \bigr) = \sum_{j=1}^{N_S} c_j\, |a_j\rangle \otimes |r_j\rangle \otimes |r'_j\rangle. \tag{VIII. 11}$$<br />

Again, one can argue that all this is consistent with the measurement postulate. That is, upon measurement<br />

of $R'$, the probability of finding the value $r'_k$ is equal to $|c_k|^2$, etc.<br />
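The two-step chain can likewise be sketched numerically. The code below is a hypothetical illustration: it assumes that both interactions are cyclic shifts of the kind appearing in (VIII. 3), that $M'$ has the same dimension as M, and that $U$ and $U'$ act as the identity on the system not involved; none of these details is fixed by the text. It checks that the final state has the form (VIII. 11) and that the outcomes $r'_k$ occur with probability $|c_k|^2$.

```python
import numpy as np

def shift_unitary(d_control, d_target, offset=0):
    """Permutation matrix sending |l> (x) |m> to |l> (x) |(l + offset + m) mod d_target>."""
    dim = d_control * d_target
    U = np.zeros((dim, dim))
    for l in range(d_control):
        for m in range(d_target):
            U[l * d_target + (l + offset + m) % d_target, l * d_target + m] = 1.0
    return U

def chain_final_state(c):
    """U'U(|psi> (x) |r_0> (x) |r'_0>) as an array indexed [S, M, M']."""
    N = len(c)
    d = N + 1                                               # dimension of H_M and of H_M'
    U = np.kron(shift_unitary(N, d, offset=1), np.eye(d))   # S-M interaction, M' untouched
    U_prime = np.kron(np.eye(N), shift_unitary(d, d))       # M-M' interaction, S untouched
    rest = np.zeros(d)
    rest[0] = 1.0                                           # |r_0> and |r'_0>
    initial = np.kron(np.kron(c, rest), rest)
    return (U_prime @ U @ initial).reshape(N, d, d)

c = np.array([0.6, 0.8j])                                   # hypothetical coefficients
final = chain_final_state(c)
prob_R_prime = (np.abs(final) ** 2).sum(axis=(0, 1))        # distribution of the outcomes r'_k
print(np.round(prob_R_prime, 12))                           # 0 for r'_0, then |c_1|^2 and |c_2|^2
```

Only the components along $|a_j\rangle \otimes |r_j\rangle \otimes |r'_j\rangle$ are non-zero, so adding $M'$ has extended the superposition to a third system rather than removed it.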

We can extend this type of reasoning ad nauseam, by incorporating more and more systems in<br />

the chain of measurement apparatuses, even including a photon scattered by the pointer and entering<br />

the eye of the observer, his retina, the nerve fibres of his brain, etc. All this is consistent with the<br />

measurement postulate, and you can, if you want to, be satisfied with this.<br />

However, the argument does not show that we can take measurements to be on an entirely equal<br />

footing with other physical interactions. No matter how far we extend the chain of apparatuses, the<br />

final state will always be a superposition of the form (VIII. 4) or (VIII. 11), the meaning of which<br />

can only be specified by saying what we will find at yet another measurement. The transition to the



conclusion that a certain state has been actually found, sometimes called the ‘Heisenberg cut’ (e.g.<br />

Primas 1993), cannot be made within the formalism. Rudolf Haag has expressed this situation as<br />

follows (Haag 1990, p. 246),<br />

Indeed the problem faced in the development in quantum theory has [. . . ] been [. . . ] the<br />

inability of devising any coherent realistic picture conforming with the observed phenomena.<br />

We can shift the place where we want to make the Heisenberg cut at will, by incorporating more<br />

and more systems in the quantum mechanical description. But the transition itself, exchanging the<br />

quantum mechanical description for a description in terms of observed facts, must come from outside<br />

quantum mechanics.<br />

One can of course, in analogy to the classical measurement scheme, simply postulate that this<br />

quantum mechanical description of the measurement process ends as soon as we can couple the system<br />

S, perhaps by means of many intermediate steps, to some measuring apparatus M whose pointer<br />

quantity R is directly observable. But here we are dealing with the fundamental issue in the theory<br />

and therefore we cannot be satisfied with a pragmatic point of view. Also, we would be faced<br />

with the question which quantities deserve to have the special status of being “directly observable”.<br />

Furthermore, there is the problem that the final states of (VIII. 4) or (VIII. 11) are entangled superpositions<br />

of states with different pointer positions. As we mentioned above, this has no classical analogue.<br />

Without further analysis it is hard to imagine what a direct observation on such states would look like.<br />

An example in which these issues emerge sharply is Schrödinger’s famous cat paradox (1935b),<br />

which we discussed already in the introduction. Schrödinger imagined that a living cat is locked up in<br />

a hermetically closed box, together with a radioactive substance of which perhaps one atom decays<br />

in the course of one hour. The box is provided with a Geiger counter which can register the decay of<br />

the atom and which, upon decay, activates a device that releases a deadly gas.<br />

Assume that initially the quantum mechanical state of this total system is a product state with a<br />

very large number of factors. In the course of the hour we have agreed to wait, the state of the radioactive atom<br />

evolves into a superposition of the atom before and after decay. The total<br />

state then takes on the same form as the state $|\Phi'\rangle$ in (VIII. 11), i.e., the state evolves into something<br />

like<br />

$$c_1(t)\, |A : 1\rangle \otimes |\nu : 0\rangle \otimes \cdots \otimes |\mathrm{cat} : \smile\rangle + c_2(t)\, |A : 0\rangle \otimes |\nu : 1\rangle \otimes \cdots \otimes |\mathrm{cat} : \dagger\rangle, \tag{VIII. 12}$$<br />

where $t$ is the time during which we wait and the system evolves, $|A : 1\rangle$ and $|A : 0\rangle$ are the states of the radioactive<br />

atom before and after decay, $|\nu : 1\rangle$ and $|\nu : 0\rangle$ are the states of the electromagnetic field with<br />

and without a photon, etc.<br />

The composite system is therefore in a gigantic superposition of states in which the cat is living<br />

and in which it is dead. If we want to hold on to the orthodox interpretation of quantum mechanics<br />

to the bitter end, we have to say that in this state the cat is neither living nor dead, and that only upon<br />

measurement, which here perhaps amounts to lifting the lid of the box after one hour, there is a certain probability,<br />

namely $|c_2|^2$ versus $|c_1|^2$, to find the cat dead or alive. It is the observer, the opener of the box, who<br />

determines the fate of the cat.



Figure VIII. 1: Schrödinger’s cat paradox (DeWitt 1970)<br />

VIII. 4 THE MEASUREMENT PROBLEM IN THE NARROW SENSE<br />

In the previous section we have seen how the measurement process, (VIII. 4), brings the composite<br />

system into a superposition of macroscopically different states, e.g. pointer positions. The development of<br />

such superpositions is a consequence of the linearity of the evolution operator. An example is given in<br />

the discussion between Einstein and Pauli, described in the introduction, p. 9, concerning the center of<br />

mass of a macroscopic body. The strangeness of a superposition comes from our tacit presupposition<br />

that the macroscopic pointer positions not only act as possible outcomes of a measurement, but can<br />

also be taken as properties of the pointer. We think that pointers of a measuring apparatus indicate<br />

something, even if we are not in the act of reading them off.<br />

Assuming, for the sake of convenience, that observing something is sufficient to decide that there<br />

is an element of physical reality which is responsible for the observation, we expect that if the quantum<br />

state presents a complete description of the system, i.e., if every element of physical reality has a<br />

counterpart in quantum mechanics, then those macroscopic properties should be represented by it.<br />

That is, however, not the case in the state (VIII. 4).<br />

The idea in the background is the following postulate, often called the ‘eigenstate-eigenvalue<br />

link’. It was explicitly supported both by Dirac (1958, p. 46) and Von Neumann (1955, p. 253).<br />

EIGENSTATE-EIGENVALUE LINK, PURE CASE:<br />

A physical system S has the property that quantity A has a definite value iff its state is an<br />

eigenstate of the operator A which, according to the observables postulate, corresponds<br />

to A.<br />

It is also conceivable that a system possesses a definite but unknown value for a quantity. If we use<br />

the ‘ignorance interpretation of mixtures’, as discussed in chapter III, p. 52, we obtain the variation<br />

EIGENSTATE-EIGENVALUE LINK, MIXED CASES:<br />

A physical system S has the property that quantity A has a definite but unknown value



iff its state is in a mixture of eigenstates of the operator A which, according to the observables<br />

postulate, corresponds to A.<br />

These postulates speak about the existence of properties, about physical quantities having values,<br />

independent of a measurement or a measuring context.<br />

EXERCISE 37. Discuss the link between the property postulates and the sufficient condition of<br />

reality EPR(EPR) of Einstein, Podolsky and Rosen, section I. 2, p. 12, ff.<br />

From this point of view it would be good to have a quantum mechanical description of the measurement<br />

process in which, in any case, the measuring apparatus has a certain property after completion<br />

of the measurement. This means that, instead of the superposition (VIII. 4), we require, as a final<br />

state, the mixture<br />

$$W' = \sum_{j=1}^{N} |c_j|^2\, |a_j\rangle \otimes |r_j\rangle \, \langle a_j| \otimes \langle r_j|. \tag{VIII. 13}$$<br />

Some authors, e.g. Landau and Lifshitz (1958, pp. 21-24), go still further and require as a final<br />

state an eigenstate $|r_k\rangle$ of the pointer quantity R, corresponding to the pointer position found after<br />

measurement. According to them the measuring interaction finishes with an indeterministic jump,<br />

with probability $|c_j|^2$, to one of the states $|a_j\rangle \otimes |r_j\rangle$.<br />

Summarizing, we have the following options for the description of the measurement process. For<br />

the initial state there is no controversy,<br />

$$|\psi\rangle \otimes |r_0\rangle = \sum_{j=1}^{N_S} c_j \, |a_j\rangle \otimes |r_0\rangle . \tag{VIII. 14}$$<br />

For the final state there are three possibilities,<br />

1. $$\sum_{j=1}^{N_S} c_j \, |a_j\rangle \otimes |r_j\rangle , \tag{VIII. 15}$$<br />

2. $$W' = \sum_j |c_j|^2 \, |a_j\rangle \otimes |r_j\rangle \, \langle a_j| \otimes \langle r_j| , \tag{VIII. 16}$$<br />

3. $$|a_j\rangle \otimes |r_j\rangle \quad \text{with probability } |c_j|^2 . \tag{VIII. 17}$$<br />
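
The difference between the first two possibilities can be made concrete in matrix form. The following is a minimal sketch in plain Python, not part of the original notes: it assumes two terms only, picks arbitrary illustrative amplitudes, and writes both states in the two-dimensional subspace spanned by $|a_1\rangle \otimes |r_1\rangle$ and $|a_2\rangle \otimes |r_2\rangle$. The helper names `outer` and `purity` are hypothetical.<br />

```python
import math

# Illustrative amplitudes (an assumption; any normalized pair would do).
c = [math.sqrt(0.3), math.sqrt(0.7)]

def outer(u, v):
    """Matrix |u><v| for vectors given as lists of numbers."""
    return [[x * complex(y).conjugate() for y in v] for x in u]

# Possibility 1: the pure state sum_j c_j |a_j>|r_j>. In the basis
# {|a_1>|r_1>, |a_2>|r_2>} its components are simply (c_1, c_2).
W_pure = outer(c, c)

# Possibility 2: the mixture sum_j |c_j|^2 |a_j r_j><a_j r_j|.
W_mixed = [[abs(c[0]) ** 2, 0.0],
           [0.0, abs(c[1]) ** 2]]

def purity(W):
    """Tr W^2: equal to 1 for a pure state, smaller for a proper mixture."""
    n = len(W)
    return sum((W[i][k] * W[k][i]).real for i in range(n) for k in range(n))

print("pure state:", W_pure)
print("mixture:   ", W_mixed)
print("Tr W^2:    ", purity(W_pure), purity(W_mixed))
```

Both matrices have the same diagonal, $|c_1|^2$ and $|c_2|^2$; they differ only in the off-diagonal elements $c_1 c_2^*$ and $c_2 c_1^*$, which are present in the superposition and absent in the mixture.<br />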

According to the foregoing line of reasoning we require that, at the end of a measuring interaction,<br />

the pointer of the measuring apparatus, which is of course macroscopic, designates something.<br />

The state (VIII. 15) does not satisfy this requirement, on the contrary, the quantum mechanical superposition<br />

$|\psi\rangle$ of eigenstates $|a_j\rangle$ of the quantity that is measured and which prevented us from ascribing,


172 CHAPTER VIII. THE MEASUREMENT PROBLEM<br />

prior to the measurement, a definite value of A to the object system S, proves to be contagious;<br />

after the interaction also the pointer quantity of the measuring apparatus has no definite value anymore,<br />

and if the composite system SM is coupled to another measuring apparatus $M'$, this also<br />

becomes infected with ‘property loss’. This is why (VIII. 16) and (VIII. 17) are preferred as final<br />

states over (VIII. 15).<br />

The problem of giving a treatment of the measurement process which produces one of these two<br />

final states, and which therefore ‘creates’ the definite values by means of the measuring interaction,<br />

is the measurement problem in the narrow sense. Notice that (VIII. 16) and (VIII. 17) cannot be<br />

obtained from the initial state by means of a unitary transformation. Therefore, we have to adjust or<br />

extend the first five Von Neumann postulates. We will discuss some proposals for a solution.<br />
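
One way to make this plausible, given here as a supplementary sketch and not as part of the postulates, is to compare the purity $\operatorname{Tr} W^2$ before and after the interaction. For any unitary $U$,<br />

$$\operatorname{Tr}\big[(U W U^\dagger)^2\big] = \operatorname{Tr}\big[U W^2 U^\dagger\big] = \operatorname{Tr} W^2 .$$

The initial state (VIII. 14) is pure, so $\operatorname{Tr} W^2 = 1$, whereas for the mixture (VIII. 16) with at least two non-zero coefficients $\operatorname{Tr} (W')^2 = \sum_j |c_j|^4 < 1$. Hence no unitary $U$ maps the one onto the other. The outcome (VIII. 17) is excluded for a different reason: a unitary transformation is deterministic, while (VIII. 17) assigns different final states to one and the same initial state.<br />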

VIII. 4. 1<br />

THE PROJECTION POSTULATE AND CONSCIOUSNESS<br />

By adding the projection postulate to the first five postulates, p. 41, Von Neumann gave the standard<br />

solution to the measurement problem in the narrow sense. He distinguished two ways in which<br />

a state can change in time,<br />

Process 1. The discontinuous, non-unitary, indeterministic projection occurring at a<br />

measurement; the projection postulate.<br />

Process 2. The continuous, unitary, deterministic evolution which is consistent with the<br />

Schrödinger equation or its generalization to mixed states, as long as no measurement is<br />

made on the system; the Schrödinger postulate.<br />

At measurement the state undergoes a transition into the eigenstate belonging to the outcome of<br />

measurement. Therefore, this brings about the final state (VIII. 17) and gives, in accordance with the<br />

eigenstate-eigenvalue link, p. 170, definite properties to both the object system and the pointer of the<br />

measuring apparatus.<br />

Although the measurement problem in the narrow sense is solved with these two types of evolution,<br />

the measurement problem in the broad sense, p. 164, comes into prominence more than ever.<br />

We would now like to have an explanation for the particular nature of a measurement, or at least a<br />

criterion with which it can be distinguished from other processes.<br />

Such a criterion is provided by Von Neumann and, for instance, Wigner, W. Heitler (1970, p. 42),<br />

and F. London and E. Bauer (1939), in terms of the consciousness of an observer. London and Bauer<br />

reason as follows.<br />

Consider an object system S, a measuring apparatus M and a conscious observer B. The state of<br />

the composite system after measurement is, according to (VIII. 11),<br />

$$|\Phi\rangle = \sum_j c_j \, |a_j\rangle \otimes |r_j\rangle \otimes |b_j\rangle . \tag{VIII. 18}$$<br />

According to London and Bauer, this is the description of the state for us. But for the conscious<br />

observer B it is not the same, because B has the characteristic capacity of introspection. By introspection<br />

he knows in which eigenstate he is, he perceives one certain pointer position. This breaks the



quantum mechanical chain. If he knows that he is in the state $|b_k\rangle$ and sees the meter indicating something<br />

which corresponds to the pointer state $|r_k\rangle$, then from that moment on the state has immediately<br />

become $|a_k\rangle \otimes |r_k\rangle \otimes |b_k\rangle$. Conscious introspection of the observer therefore causes the collapse<br />

of the wave packet. This strange situation is expressed in the thought experiment called ‘Wigner’s<br />

friend’, in which the measuring device is replaced by a friend who communicates the outcome of<br />

measurement to Wigner.<br />

The aforementioned authors emphasize the role of consciousness in the interpretation of quantum<br />

mechanics. It need hardly be emphasized that for the majority of physicists something like this is<br />

unacceptable. They are of the opinion that a measurement is finished as soon as the result is registered<br />

somewhere in the equipment. It is not necessary that it subsequently comes to the attention of a conscious<br />

being. But of course, then the question remains again which criterion can be given for a permanent<br />

registration.<br />

VIII. 4. 2<br />

BOHMIAN MECHANICS<br />

An important advantage of the theory of chapter VI is its avoidance of the projection postulate.<br />

This has consequences for the treatment of measurements. ‘Measuring’ is not a primitive concept in<br />

Bohmian mechanics, measurements are treated on an equal footing with all other physical interactions.<br />

The measuring apparatus is treated in the same manner as the measured object system, namely<br />

with the Bohmian equations, which are derived from the Schrödinger equations. As a consequence,<br />

the interaction between an object system and a measuring apparatus can be given according to the<br />

measurement scheme (VIII. 4).<br />

If, for the sake of simplicity, we limit ourselves to two terms, the interaction is of the<br />

form (VI. 24), p. 134, where $\varphi_B$ and $\varphi_D$ are the eigen-wave functions of the pointer quantity, corresponding<br />

to the various pointer positions. It is plausible to assume that $\varphi_B$ and $\varphi_D$ have no overlap.<br />

Consequently, the wave function of the object system and the measuring apparatus is effectively factorizable<br />

and we can regard the superposition as a mixture. There is no measurement problem in<br />

Bohmian mechanics.<br />

◃ Remark<br />

The requirement that $\varphi_B$ and $\varphi_D$ in (VI. 24) have no overlap is stronger than what is required in<br />

Von Neumann’s model. There it suffices that the wave functions are orthogonal, i.e., $\langle \varphi_B | \varphi_D \rangle = 0$<br />

instead of $\varphi_B(\vec q)\,\varphi_D(\vec q) = 0$ for all $\vec q \in \mathbb{R}^3$. ▹<br />
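
As an illustration of the difference, consider in one dimension the following pair of functions. The Gaussian form is an assumption chosen for convenience and is not taken from the model of chapter VI.<br />

$$\varphi_B(q) \propto e^{-q^2/2}, \qquad \varphi_D(q) \propto q \, e^{-q^2/2} .$$

These functions are orthogonal, because the integrand of $\langle \varphi_B | \varphi_D \rangle = \int \varphi_B^*(q) \, \varphi_D(q) \, dq$ is odd in $q$, yet $\varphi_B(q) \, \varphi_D(q) \neq 0$ for every $q \neq 0$. Orthogonality therefore does not imply absence of overlap, whereas absence of overlap does imply orthogonality.<br />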

VIII. 4. 3<br />

SPONTANEOUS COLLAPSE<br />

The next option has been developed by G.C. Ghirardi, A. Rimini, and T. Weber (1986); a related<br />

proposal comes from F.A. Bopp (1947). In this view the evolution from the Schrödinger postulate has<br />

to be replaced by an indeterministic evolution. A stochastic term is added, making the Schrödinger<br />

equation non-linear. As a consequence, every physical system from time to time spontaneously<br />

makes a small jump, so that the wave function collapses to, almost, a position eigenstate.<br />

The new constant of nature characterizing the relevant time scale is such that the probability of a<br />

spontaneous collapse of the wave function for a single elementary particle is extremely small, in the



order of once every $10^{10}$ years, leaving the continuous Schrödinger equation an excellent approximation<br />

for such a physical system.<br />

In this theory it can be shown that in case of composite systems a collapse of the state of a partial<br />

system brings about a collapse of the state of the entire composite system. This has as a consequence<br />

that the average frequency of these spontaneous jumps per unit of time increases with the number of<br />

degrees of freedom, and for a macroscopic system with approximately $10^{25}$ particles the average time<br />

between two jumps, and therefore two collapses, will only be $10^{-5}$ milliseconds. Hence, in good<br />

approximation, macroscopic systems always have a definite position where microscopic systems do<br />

not.<br />
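
The orders of magnitude quoted above can be checked with a few lines of arithmetic. The sketch below is only a consistency check of the numbers in the text; it assumes, as stated there, that the jump rate of a composite system is the single-particle rate multiplied by the number of particles. The variable names are illustrative.<br />

```python
# Order-of-magnitude check of the collapse times quoted in the text.
SECONDS_PER_YEAR = 365.25 * 24 * 3600   # about 3.16e7 s

t_single_years = 1e10   # mean time between jumps for one particle (from the text)
n_particles = 1e25      # size of a macroscopic system (from the text)

# Assumption: the rates of the constituents add up, so the mean waiting
# time of the composite system is the single-particle time divided by N.
t_single_s = t_single_years * SECONDS_PER_YEAR
t_macro_s = t_single_s / n_particles
t_macro_ms = t_macro_s * 1e3

print(f"single particle: {t_single_s:.2e} s between jumps")
print(f"macroscopic:     {t_macro_ms:.2e} ms between jumps")
```

Under these assumptions the macroscopic waiting time comes out at a few times $10^{-5}$ milliseconds, in line with the figure given above.<br />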

The difference between this approach and that of Von Neumann is that in the first place there<br />

is no fundamental difference between measurements and other interactions, consciousness plays no<br />

role. Moreover, by adapting the evolution equation, this theory leads to predictions which differ from<br />

quantum mechanics, making it verifiable. By means of experiments it is possible to obtain upper<br />

and lower limits for the collapse frequency. Ghirardi, Rimini and Weber are of the opinion that the<br />

experimental data we have at present are still compatible with a finite interval for their new constant<br />

of nature.<br />

VIII. 4. 4<br />

MANY WORLDS<br />

Another option is the many-worlds interpretation of H. Everett (1957), J.A. Wheeler (1957) and,<br />

especially, B.S. DeWitt (1970, 1971). In this view it is posed that the quantum mechanics of the<br />

first five postulates gives a universally valid description of reality. Therefore, in principle the wave<br />

function of the universe can be written down. There is no part of the world, including the context of<br />

measurement, which is described classically. Moreover, there is no projection postulate. The wave<br />

function develops according to a unitary evolution, which means that it remains a pure state for all<br />

time.<br />

Everett models a measurement process by assuming that a certain system has a complete set<br />

of orthonormal eigenstates, which are interpreted to signify that certain outcomes of measurement<br />

have occurred and are permanently registered in a memory. They are analogous to the previously<br />

mentioned pointer positions $|r_j\rangle$. The state $|\Psi\rangle$ of the composite system of object system S and<br />

measuring apparatus M remains in the superposition form (VIII. 15) for all time. To every state $|\varphi_i\rangle$<br />

of the object system corresponds a relative state of the measuring apparatus,<br />

$$|\psi\rangle^{\mathrm{rel}}_{\Psi, \varphi_i} := N_i \sum_j c_{ij} \, |r_j\rangle \quad \text{with} \quad c_{ij} = \big( \langle \varphi_i| \otimes \langle r_j| \big) |\Psi\rangle , \tag{VIII. 19}$$<br />

where $N_i$ is a normalization constant and $\{|\varphi_i\rangle\}$ and $\{|r_j\rangle\}$ are arbitrary orthonormal bases of the<br />

Hilbert spaces $\mathcal{H}_S$ and $\mathcal{H}_M$ of the object system and measuring apparatus, respectively. It can easily<br />

be shown that this definition is independent of the choice of this basis, so that the relative state is<br />

uniquely defined by $|\Psi\rangle$ and $|\varphi_i\rangle$.<br />

In case of an ideal measurement we have<br />

$$|\psi\rangle^{\mathrm{rel}}_{\Psi, \varphi_i} = |r_i\rangle . \tag{VIII. 20}$$



This relative state yields the usual conditional probability distribution for the possible outcomes of<br />

measurement of a quantity in case the object system is found in the state $|\varphi_i\rangle$. This is substantiated by<br />

Everett by showing that, if we set the right conditions for the state $|\varphi_i\rangle$, all predictions for quantities<br />

which only refer to the object system S can be determined using the relative state. Therefore, we can<br />

act as if a projection to that state has taken place. In reality, however, the superposition (VIII. 15)<br />

remains.<br />
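
The definition (VIII. 19) is straightforward to evaluate for a small example. The following sketch is an illustration only: the function name `relative_state`, the two-dimensional spaces and the amplitudes are assumptions made for the example. The composite state is stored as a table of amplitudes `Psi[i][j]` with respect to a product basis.<br />

```python
import math

def relative_state(Psi, phi):
    """Relative state of M with respect to the state phi of S.

    Psi[i][j] is the amplitude of |s_i> (x) |r_j> in the composite state,
    phi[i] are the components of the object state in the same basis {|s_i>}.
    Returns the normalized vector with components sum_i conj(phi_i) Psi_ij,
    i.e. the coefficients c_ij of (VIII. 19) times the normalization N_i.
    """
    dim_M = len(Psi[0])
    comps = [sum(complex(phi[i]).conjugate() * Psi[i][j] for i in range(len(phi)))
             for j in range(dim_M)]
    norm = math.sqrt(sum(abs(x) ** 2 for x in comps))
    if norm == 0:
        raise ValueError("phi has zero weight in Psi; relative state undefined")
    return [x / norm for x in comps]

# Ideal measurement with two outcomes: Psi = c_1 |a_1 r_1> + c_2 |a_2 r_2>.
c1, c2 = math.sqrt(0.3), math.sqrt(0.7)
Psi = [[c1, 0.0],
       [0.0, c2]]

# Relative to the eigenstate |a_1> the apparatus is in |r_1>, as in (VIII. 20).
rel_a1 = relative_state(Psi, [1.0, 0.0])

# Relative to (|a_1> + |a_2>)/sqrt(2) it is in c_1 |r_1> + c_2 |r_2>.
s = 1 / math.sqrt(2)
rel_plus = relative_state(Psi, [s, s])

print(rel_a1)
print(rel_plus)
```

The second call shows that a relative state exists for every object state with non-zero weight in the composite state, not only for the eigenstates of the measured quantity.<br />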

Now the question is, of course, how this superposition must be interpreted. Especially DeWitt<br />

has propagated a radical view; all terms in this superposition represent real, existing worlds. The<br />

transition during the measurement process is a division of the world in uncountably many copies,<br />

where a different result is registered in each of them. All these worlds exist and develop further next<br />

to one another, without being able to have mutual contact. The problem how to choose one really<br />

realized term from the superposition, as we do using the projection postulate, is avoided because all<br />

terms are realized.<br />

Postulating the existence of such a multiplicity of worlds, with which, moreover, we absolutely<br />

cannot make contact, is acceptable only for a small number of people. But probably worse is the<br />

idea that any decay process in a star in a remote part of the universe can split up our local world into<br />

millions of copies of itself.<br />

Moreover, a difficult point in this theory is how the ‘splitting’ must be understood exactly. It<br />

seems that DeWitt intends a special kind of physical process which emerges at registration. This<br />

would look like adopting a second type of process besides the Schrödinger evolution, in contrast to<br />

the objective of the interpretation; the measurement problem in the broad sense would not be solved.<br />

There is also the problem which process we have to suggest for the reversed evolution; a ‘melting’ of<br />

worlds? In Everett’s original work the idea of a physical splitting of the universe does not occur. He<br />

only regards this as a ‘bookkeeping’ transition to a relative state.<br />

Finally there is the supposition that to a set of states $|r_j\rangle$ of the measuring apparatus the interpretation<br />

can be given that herewith an outcome of measurement is permanently registered. This<br />

supposition cannot without problems be brought into conformity with quantum mechanics because it<br />

still concerns superpositions.<br />

VIII. 4. 5<br />

SUPERSELECTION RULES<br />

Again another option is to introduce superselection rules. Certain superpositions of microscopic<br />

states do not seem to occur in nature, for example, superpositions of states with unequal charge, e.g.<br />

electric, baryonic, or superpositions of states with integer and half integer spin. Therefore, it could be<br />

assumed that superpositions of macroscopically different states do not occur also, and the dynamics<br />

of quantum mechanics must then be adapted to account for this.<br />

In such a setup of quantum mechanics, i.e., one in which the superposition principle is not valid<br />

in general, it is possible to have $W'$, (VIII. 16), as the final state of the measurement process, see<br />

Beltrametti and Cassinelli (1981, p. 57). More precisely, in the presence of superselection rules the<br />

mixture (VIII. 16) and the pure state (VIII. 15) become equivalent; the superselection rules provide<br />

the same expectation values for all physical quantities allowed by the superselection operators.<br />
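
The equivalence can be illustrated numerically. The sketch below is a toy model and not a construction from the literature cited above: it assumes a three-dimensional space split into two superselection sectors, and it treats a quantity as allowed when its matrix has no elements connecting the sectors. All names and numbers are illustrative.<br />

```python
def expectation(psi, O):
    """<psi| O |psi> for a vector psi and a square matrix O (nested lists)."""
    n = len(psi)
    return sum(complex(psi[i]).conjugate() * O[i][j] * psi[j]
               for i in range(n) for j in range(n))

# Two superselection sectors: basis indices {0, 1} and {2}.
sectors = [[0, 1], [2]]

# A normalized superposition across the two sectors (illustrative values).
psi = [0.6, 0.0, 0.8]

# A quantity that respects the rule: block diagonal.
O_allowed = [[1.0, 2.0, 0.0],
             [2.0, -1.0, 0.0],
             [0.0, 0.0, 5.0]]

# An operator connecting the sectors; the rule excludes it as a quantity.
O_forbidden = [[0.0, 0.0, 1.0],
               [0.0, 0.0, 0.0],
               [1.0, 0.0, 0.0]]

def mixture_expectation(psi, O):
    """Tr(W O) for W = sum_s P_s |psi><psi| P_s, with P_s the sector projectors."""
    total = 0
    for idx in sectors:
        part = [psi[i] if i in idx else 0.0 for i in range(len(psi))]
        total += expectation(part, O)
    return total

print(expectation(psi, O_allowed), mixture_expectation(psi, O_allowed))
print(expectation(psi, O_forbidden), mixture_expectation(psi, O_forbidden))
```

For the block-diagonal quantity the pure state and the mixture give the same number; only the excluded operator would tell them apart.<br />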

An example of this approach is the suggestion of R. Penrose (1996) that in a future unified theory<br />

for quantum gravitation a superselection rule would apply to the space-time metric. Because



the gravitational field is taken into account in the metric, also the positions of massive bodies such<br />

as pointers of measuring apparatuses are superselected since the field depends on the positions of<br />

massive macroscopic bodies.<br />

VIII. 4. 6<br />

IRREVERSIBILITY OF MEASUREMENT<br />

The next option is to appeal to the special characteristic properties of measuring apparatuses,<br />

and to the theory of irreversible processes, as is done in the work of A. Daneri, A. Loinger and<br />

G.M. Prosperi (1962). According to these authors it is characteristic for measuring apparatuses that<br />

they are in a metastable state. An interaction with a microscopic system then causes, by means of a<br />

chain reaction, an irreversible response of the measuring apparatus.<br />

The description of such an irreversible process within quantum mechanics is not straightforward,<br />

because the unitary evolution is always reversible. It is necessary to make special assumptions concerning<br />

the structure of the macroscopic measuring apparatus and its observable quantities; all matrices<br />

corresponding to these quantities have to be almost diagonal in the energy representation. Then<br />

it can be shown that, as regards the empirical statements for these observable quantities, the final<br />

state (VIII. 15) can be replaced by that of (VIII. 16).<br />

The elegance of this approach is that the details and construction of the measuring apparatus are<br />

discussed. The presence of a metastable state indeed seems to be an essential aspect, as in, for example,<br />

the Geiger counter, or the bubble chamber using superheated liquids. But the introduction of irreversible<br />

processes asks for a modification of the unitary evolution and therefore of the Schrödinger<br />

postulate. Just as in the quantum theory of Ghirardi, Rimini and Weber, this is a fundamental modification<br />

of quantum mechanics.<br />

VIII. 4. 7<br />

MODAL INTERPRETATION<br />

This option to solve the measurement problem is provided by the so-called modal interpretation,<br />

introduced by B.C. van Fraassen (1979) and developed by S. Kochen (1985), D. Dieks (1989) and<br />

R. Healey (1989). Overviews are given by Vermaas (1999), and Dieks and Vermaas (1998).<br />

In the modal interpretation the projection postulate is removed together with a part of the property<br />

postulate, while the measurement postulate is replaced by a postulate saying that every vector of the<br />

form<br />

$$|\psi\rangle = \sum_j c_j \, |a_j\rangle \otimes |r_j\rangle \tag{VIII. 21}$$<br />

describes the situation in which system 1 has, as a property, the value $a_j$ for the quantity A corresponding<br />

to the operator which is determined by the basis $\{|a_j(t)\rangle\}$ and in which, similarly, system 2<br />

has the value $r_j$. Each of these states has a probability $|c_j|^2$ to be realized. This is not different from<br />

the usual ‘ignorance interpretation’ of probabilities. Finally, the Schrödinger postulate is declared to<br />

be valid universally, it is, therefore, also effective during the measurement process.<br />

An important theorem by E. Schmidt, the so-called (biorthogonal) decomposition theorem, says<br />

that for every composite system the expansion of a state $|\psi\rangle$ in the form (VIII. 21) is unique as long



as $|c_j| \neq |c_k|$ for $j \neq k$. Therefore it is possible for every state $|\psi\rangle$ for which this holds to exactly<br />

indicate the potential corresponding properties. A generalization to mixed states can be achieved by<br />

taking the spectral decomposition of W of the composite system as the preferred decomposition, the<br />

Schmidt decomposition (VIII. 21) is then found for the special case of pure states.<br />
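
The condition $|c_j| \neq |c_k|$ is essential. As a minimal illustration, with the states $|a_\pm\rangle$ and $|r_\pm\rangle$ introduced here only for this example, take two terms with equal coefficients and define $|a_\pm\rangle = (|a_1\rangle \pm |a_2\rangle)/\sqrt{2}$ and $|r_\pm\rangle = (|r_1\rangle \pm |r_2\rangle)/\sqrt{2}$. Expanding the products gives<br />

$$\tfrac{1}{\sqrt{2}} \big( |a_1\rangle \otimes |r_1\rangle + |a_2\rangle \otimes |r_2\rangle \big) = \tfrac{1}{\sqrt{2}} \big( |a_+\rangle \otimes |r_+\rangle + |a_-\rangle \otimes |r_-\rangle \big) ,$$

so the same vector has the form (VIII. 21) with respect to two different pairs of orthonormal bases, and the decomposition does not single out one set of properties.<br />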

The idea that the meaning of the state vector can be exclusively formulated in terms of measurements<br />

is rejected, the state vector describes factual properties. The description by the wave function<br />

is, however, incomplete, |ψ⟩ determines the possibilities and the probabilities of the possibilities, but<br />

the real physical situation is not determined. Quantum mechanics is fundamentally indeterministic<br />

because sometimes one possibility, at other times another one occurs.<br />

Moreover, in this interpretation the ‘only if’ part of the property postulate is rejected, if a system<br />

is in an eigenstate it has indeed the corresponding eigenvalue, but not ‘only if’; a system which is<br />

in a superposition of eigenstates, (VIII. 21), nevertheless has one of the properties. In the first case<br />

a composite physical system <em>necessarily</em> has the property, in the second case <em>contingently</em>. In logic<br />

the italicized words are called ‘modalities’, hence the name modal interpretation. The projection<br />

postulate is now superfluous.<br />

If, however, the singlet state, being a state of a composite system also, is considered in the modal<br />

interpretation, this interpretation tells us less than quantum mechanics with the property postulate<br />

does.<br />

◃ Remarks<br />

In this interpretation, the metastability or possibly permanent nature of the quantities of system 2 plays<br />

no role in attributing properties. Another point in this interpretation is that, besides the Schrödinger<br />

dynamics for the state, there seems to be a need for a dynamics describing how properties change in<br />

time. Several attempts have been made to that end. ▹<br />

EXERCISE 38. What does quantum mechanics with the property postulate say about the EPRB<br />

experiment, p. 139, that the modal interpretation does not say, and why? Does it help to couple a<br />

measuring apparatus to the composite system of the two spin particles?<br />

VIII. 4. 8<br />

DECOHERENCE<br />

Finally we will discuss the option which is possibly supported by the majority of physicists, see<br />

H.J. Groenewold (1946), K. Gottfried (1989), N.G. van Kampen (1988), W.H. Zurek (1981 and 1982).<br />

Bell (1990) named this option the For All Practical Purposes solution, briefly FAPP. The idea is to<br />

show that the difference between the pure state (VIII. 15) and the mixed state (VIII. 16) is hardly<br />

perceptible in practice.<br />

A measuring apparatus is a macroscopic system which is in continuous interaction with its surroundings.<br />

A more realistic representation of the measurement process will therefore be of the



form (VIII. 11), but with a very large number of terms and factors, e.g.<br />

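$$|\psi\rangle \otimes |r_0\rangle \otimes |s_0\rangle \otimes \cdots \otimes |t_0\rangle \;\longrightarrow\; \sum_j c_j \, |a_j\rangle \otimes |r_j\rangle \otimes |s_j\rangle \otimes \cdots \otimes |t_j\rangle . \tag{VIII. 22}$$<br />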

In practice, the coherence between the various terms of the superposition will rapidly be lost because<br />

this coherence can only be revealed if the expectation values of the quantities contain cross<br />

terms. To see this, consider a quantity which is a product of quantities of the various partial systems<br />

S, M, $M'$, ..., $M''$, for instance, of the form $\tilde A \otimes \tilde R \otimes \tilde S \otimes \cdots \otimes \tilde T$, or a summation thereof,<br />

which contains non-zero off-diagonal matrix elements. This means that we assume that<br />

$$\big( \langle a_{i'}| \otimes \langle r_{j'}| \otimes \cdots \otimes \langle t_{k'}| \big) \, \tilde A \otimes \tilde R \otimes \tilde S \otimes \cdots \otimes \tilde T \, \big( |a_i\rangle \otimes |r_j\rangle \otimes \cdots \otimes |t_k\rangle \big) = \langle a_{i'} | \tilde A | a_i \rangle \, \langle r_{j'} | \tilde R | r_j \rangle \cdots \langle t_{k'} | \tilde T | t_k \rangle \tag{VIII. 23}$$<br />

does not exclusively contain diagonal terms. However, in practice such quantities cannot be measured;<br />

as soon as we do not measure one of the partial systems the coherence is already broken.<br />

For example, because of the orthogonality of the states $|s_j\rangle$, the expectation value of the quantity<br />

$\tilde Q \otimes \tilde R \otimes \mathbb{1} \otimes \cdots \otimes \tilde T$ in the state (VIII. 22) is equal to that in the mixed state<br />

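$$W'' = \sum_j |c_j|^2 \, \big( |a_j\rangle \otimes |r_j\rangle \otimes |s_j\rangle \otimes \cdots \otimes |t_j\rangle \big) \big( \langle a_j| \otimes \langle r_j| \otimes \langle s_j| \otimes \cdots \otimes \langle t_j| \big) . \tag{VIII. 24}$$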


VIII. 5. INCOMPATIBLE QUANTITIES 179<br />

The step from the pure state (VIII. 22) to the mixture (VIII. 24) is therefore justified by limiting<br />

ourselves to practically realizable states.<br />
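
The argument can be checked on the smallest possible case. The following sketch is an illustration under stated assumptions: three two-dimensional systems S, M and M′, two terms in the superposition, and two hypothetical quantities `Q` and `R` with non-zero off-diagonal elements. The helper functions are written out in plain Python and are not taken from any library.<br />

```python
import math

def kron_vec(u, v):
    """Components of |u> (x) |v>."""
    return [x * y for x in u for y in v]

def kron_mat(A, B):
    """Matrix of A (x) B in the product basis used by kron_vec."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def expectation(psi, O):
    """Real part of <psi| O |psi> (O is Hermitian here)."""
    n = len(psi)
    return sum(complex(psi[i]).conjugate() * O[i][j] * psi[j]
               for i in range(n) for j in range(n)).real

e = [[1.0, 0.0], [0.0, 1.0]]           # basis vectors, reused for S, M and M'
c = [math.sqrt(0.3), math.sqrt(0.7)]   # illustrative amplitudes
ONE = [[1.0, 0.0], [0.0, 1.0]]
Q = [[1.0, 1.0], [1.0, -1.0]]          # hypothetical quantity with off-diagonal elements
R = [[2.0, 1.0], [1.0, 0.0]]           # hypothetical quantity with off-diagonal elements

# Branch vectors |a_j>|r_j>|s_j> and the superposition of the form (VIII. 22).
branches = [kron_vec(kron_vec(e[j], e[j]), e[j]) for j in range(2)]
Psi = [c[0] * x + c[1] * y for x, y in zip(branches[0], branches[1])]

O_env_ignored = kron_mat(kron_mat(Q, R), ONE)   # Q (x) R (x) 1: M' is not measured
O_env_measured = kron_mat(kron_mat(Q, R), Q)    # off-diagonal on every factor

def mixture_expectation(O):
    """Tr(W'' O) with W'' of the form (VIII. 24)."""
    return sum(abs(c[j]) ** 2 * expectation(branches[j], O) for j in range(2))

print(expectation(Psi, O_env_ignored), mixture_expectation(O_env_ignored))
print(expectation(Psi, O_env_measured), mixture_expectation(O_env_measured))
```

With the unit operator on M′ the pure state and the mixture give the same expectation value. The cross terms reappear only for a quantity that has off-diagonal elements on every partial system, which is the kind of quantity the text argues cannot be measured in practice.<br />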

At first sight, this reasoning is in every way reasonable. Of course, the reasoning only refers to<br />

a particular class of quantities; a physical quantity for a composite system is certainly not always a<br />

direct product or a summation thereof. But it can be maintained that quantities which are not direct<br />

products are even harder to measure in practice. It is, however, beyond doubt that experimentally<br />

distinguishing the pure state (VIII. 22) from the mixed state (VIII. 24) using macroscopic quantities<br />

will be extremely difficult.<br />

Bell considers this FAPP solution a pitfall; he speaks of the FAPP-trap. He emphasizes that the<br />

measurement problem is not a practical but a fundamental problem. The core of the problem is whether,<br />

after the measurement process, certain properties are present in the measuring apparatus. The FAPP<br />

reasoning shows that, generally, in practice the system behaves as if it had those properties, but it<br />

leaves untouched the fact that ‘in reality’ the system does not have those properties, and that, if our<br />

experimental possibilities were more ample, this would also be experimentally demonstrable.<br />

EXERCISE 39. Show that, using the physical quantity corresponding to the operator $|\Psi\rangle \langle\Psi|$, in<br />

which $|\Psi\rangle$ is the right-hand side of (VIII. 22), experimental distinction can be made between the<br />

pure state (VIII. 22) and the mixed state (VIII. 24).<br />

VIII. 5<br />

INCOMPATIBLE QUANTITIES<br />

So far we considered measuring a single physical quantity or two compatible, or commensurable,<br />

physical quantities of the object system, where compatible quantities are quantities corresponding<br />

to commuting operators. The simple measurement theory (VIII. 2), however, enables us to discuss<br />

also the measurement of incompatible quantities.<br />

Let A and B be two arbitrary, incompatible quantities of the object system S corresponding<br />

to the maximal operators A and B. Measuring apparatus $M_1$ measures A and apparatus $M_2$ measures<br />

B. The pointer observables of the apparatuses are R and T, corresponding to the operators<br />

R and T; the eigenstates are $|a_j\rangle$, $|b_j\rangle$, $|r_j\rangle$, $|t_j\rangle$, respectively. The initial state is $|\psi\rangle \otimes |r_0\rangle \otimes |t_0\rangle$<br />

in $\mathcal{H} = \mathcal{H}_S \otimes \mathcal{H}_1 \otimes \mathcal{H}_2$, and with $\dim \mathcal{H}_S = N_S$,<br />

$$|\psi\rangle = \sum_{j=1}^{N_S} \langle a_j | \psi\rangle \, |a_j\rangle = \sum_{k=1}^{N_S} \langle b_k | \psi\rangle \, |b_k\rangle . \tag{VIII. 25}$$



Now we first measure A and next B. The measurement scheme (VIII. 4) gives<br />

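$$|\psi\rangle \otimes |r_0\rangle \otimes |t_0\rangle \;\xrightarrow{\;A\;}\; \sum_{j=1}^{N_S} \langle a_j | \psi\rangle \, |a_j\rangle \otimes |r_j\rangle \otimes |t_0\rangle \;\xrightarrow{\;B\;}\; \sum_{j=1}^{N_S} \sum_{k=1}^{N_S} \langle a_j | \psi\rangle \, \langle b_k | a_j\rangle \, |b_k\rangle \otimes |r_j\rangle \otimes |t_k\rangle . \tag{VIII. 26}$$<br />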

If we first measure B and then A, we have<br />

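$$|\psi\rangle \otimes |r_0\rangle \otimes |t_0\rangle \;\xrightarrow{\;B\;}\; \sum_{k=1}^{N_S} \langle b_k | \psi\rangle \, |b_k\rangle \otimes |r_0\rangle \otimes |t_k\rangle \;\xrightarrow{\;A\;}\; \sum_{k=1}^{N_S} \sum_{j=1}^{N_S} \langle b_k | \psi\rangle \, \langle b_k | a_j\rangle^{*} \, |a_j\rangle \otimes |r_j\rangle \otimes |t_k\rangle . \tag{VIII. 27}$$<br />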

We see that the final states (VIII. 26) and (VIII. 27) differ from each other. For the probability to get<br />

for A the outcome $a_j$ and for B the outcome $b_k$ we find<br />

$$\mathrm{Prob}_{A,B}(R : r_j \wedge T : t_k) = |\langle a_j | \psi\rangle|^2 \, |\langle b_k | a_j\rangle|^2 \tag{VIII. 28}$$<br />

and<br />

$$\mathrm{Prob}_{B,A}(T : t_k \wedge R : r_j) = |\langle b_k | \psi\rangle|^2 \, |\langle a_j | b_k\rangle|^2 . \tag{VIII. 29}$$<br />

The good thing is that the measurement theory enables us to make a statement about measurements<br />

of the incompatible quantities A and B which are done after each other, on the basis of the,<br />

possibly simultaneous, measurements of the compatible quantities R and T .<br />

EXERCISE 40. Why are R and T compatible?<br />

We see that the order in which A and B are measured is important. Here the result of the ‘measurement<br />

disturbance’ develops within the framework of the unitary time evolution of the state.<br />

For the conditional probability to find $b_k$ if we have found $a_j$, and vice versa, we find,<br />

with (VIII. 28) and (VIII. 29), $|\langle b_k | a_j\rangle|^2$ and $|\langle a_j | b_k\rangle|^2$, respectively, and we see that they are equal.<br />
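
A two-dimensional example makes both statements visible at once. The sketch below assumes, purely for illustration, that A and B are the spin components along z and x of a spin-1/2 particle and that the initial state has real components; the function names are hypothetical.<br />

```python
import math

def braket(u, v):
    """Inner product <u|v> of two vectors given as lists."""
    return sum(complex(x).conjugate() * y for x, y in zip(u, v))

s = 1 / math.sqrt(2)
a = [[1.0, 0.0], [0.0, 1.0]]      # eigenvectors of A (assumed: spin along z)
b = [[s, s], [s, -s]]             # eigenvectors of B (assumed: spin along x)
psi = [math.sqrt(0.9), math.sqrt(0.1)]   # illustrative initial state

def prob_A_then_B(j, k):
    """(VIII. 28): first A with outcome a_j, then B with outcome b_k."""
    return abs(braket(a[j], psi)) ** 2 * abs(braket(b[k], a[j])) ** 2

def prob_B_then_A(j, k):
    """(VIII. 29): first B with outcome b_k, then A with outcome a_j."""
    return abs(braket(b[k], psi)) ** 2 * abs(braket(a[j], b[k])) ** 2

for j in range(2):
    for k in range(2):
        print(j, k, round(prob_A_then_B(j, k), 3), round(prob_B_then_A(j, k), 3))
```

The joint probabilities depend on the order of the measurements, while dividing each by the probability of the first outcome gives $|\langle b_k | a_j\rangle|^2 = 1/2$ in both orders.<br />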

This can be generalized easily. If we successively measure the discrete quantities $A, A', A'', \ldots$,<br />

having eigenvalues $a_i, a'_j, a''_k, \ldots$, the probability to find, given that measurement of A yielded the<br />

outcome $a_i$, for $A'$ the outcome $a'_j$ and for $A''$ the outcome $a''_k$, etc. is equal to<br />

$$\begin{aligned} \mathrm{Prob}(\cdots A'' : a''_k \wedge A' : a'_j \mid A : a_i) &= \cdots |\langle a''_k | a'_j\rangle|^2 \, |\langle a'_j | a_i\rangle|^2 = \langle a_i | a'_j\rangle \langle a'_j | a''_k\rangle \cdots \langle a''_k | a'_j\rangle \langle a'_j | a_i\rangle \\ &= \langle a_i | P'_j P''_k \cdots P''_k P'_j | a_i\rangle = \operatorname{Tr} P_i P'_j P''_k \cdots P''_k P'_j . \end{aligned} \tag{VIII. 30}$$<br />

This result does not apply to degenerate eigenvalues.


VIII. 6. COMMENTS ON THE THEORY <strong>OF</strong> MEASUREMENT 181<br />

We can consider (VIII. 30) to be the most general statement of quantum mechanics for maximal,<br />

discrete quantities; a probability statement concerning the occurrence of correlations between the<br />

outcomes of consecutive measurements. Empirically speaking, all of physics is about such statements,<br />

including classical physics. But classical physics permits us to associate with it a picture of physical<br />

systems as scraps and pieces of matter with properties, moving through space, while in quantum<br />

mechanics such a picture is not available.<br />

◃ Remark<br />

If we measure on S, as in (VIII. 2), the same quantity for a number of times, we will always find the<br />

same outcome. Then the projections in (VIII. 30) are orthogonal and<br />

$$\operatorname{Tr} P_i P_j P_k \cdots P_k P_j = \delta_{ij} \, \delta_{jk} \cdots . \tag{VIII. 31}$$ ▹<br />

VIII. 6<br />

COMMENTS ON THE THEORY OF MEASUREMENT<br />

Although the measurement scheme (VIII. 2) seems evident, it is not entirely so. To show this, we<br />

start with deriving a desired consequence from it; only physical quantities which correspond to normal<br />

operators are measurable. The pointer states of the measuring apparatus have to be macroscopically<br />

distinguishable, which means that the eigenstates $|r_j\rangle$ of operator R are orthonormal, $\langle r_j | r_k\rangle = \delta_{jk}$,<br />

since R corresponds to the observable pointer position R of the measuring apparatus. Because the<br />

measurement interaction is unitary, it holds that<br />

$$\big( \langle a_i| \otimes \langle r_0| \big) \big( |a_j\rangle \otimes |r_0\rangle \big) = \big( \langle a_i| \otimes \langle r_i| \big) \big( |a_j\rangle \otimes |r_j\rangle \big) , \tag{VIII. 32}$$<br />

or<br />

$$\langle a_i | a_j\rangle \, \langle r_0 | r_0\rangle = \langle a_i | a_j\rangle \, \langle r_i | r_j\rangle = \delta_{ij} , \tag{VIII. 33}$$<br />

and therefore, $\langle a_i | a_j\rangle = 0$ if $i \neq j$, where the $|a_j\rangle$ are again the eigenvectors of the maximal operator<br />

A, introduced on p. 166. The $|a_j\rangle$ are thus orthonormal and can therefore be a basis. According<br />

to the spectral theorem of p. 26, every basis generates, by means of the projectors projecting on the<br />

elements of the basis, a normal operator. The physical quantity A indeed corresponds to the normal<br />

operator A and, representing a physical quantity, the eigenvalues of A are real. Consequently, A is,<br />

on finite-dimensional Hilbert spaces, self-adjoint.<br />

Now we will discuss some points of criticism. The measurement scheme (VIII. 2) is strongly<br />

idealized. It does not say anything about the physical nature of measurements, which are nearly<br />

always of electromagnetic nature. In case of a concrete description, the evolution operator $U(t)$ will<br />

have to represent something, i.e., a Hamiltonian H is needed which generates this evolution by means<br />

of $U(t) = e^{-iHt}$. In general, A and U will not commute, in which case the $|a_j\rangle$ do not transform<br />

into themselves, unless the duration of the measurement is ‘sufficiently short’. But the question what,<br />

in this connection, is sufficiently short cannot be answered without discussing the characteristics of U<br />

and H.<br />
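
What ‘sufficiently short’ means can be seen in a toy model. The sketch below assumes a hypothetical Hamiltonian $H = \omega \sigma_x$ with $\hbar = 1$ and takes A to be $\sigma_z$, so that A and U do not commute; neither choice comes from the text. It computes the probability that the eigenstate $|a_1\rangle$ is still found in $|a_1\rangle$ after a time t.<br />

```python
import math

def U(t, omega=1.0):
    """U(t) = exp(-i H t) for the hypothetical H = omega * sigma_x (hbar = 1).

    Because sigma_x squared is the identity,
    exp(-i omega t sigma_x) = cos(omega t) 1 - i sin(omega t) sigma_x.
    """
    co, si = math.cos(omega * t), math.sin(omega * t)
    return [[complex(co, 0.0), complex(0.0, -si)],
            [complex(0.0, -si), complex(co, 0.0)]]

def survival(t):
    """Probability that the eigenstate |a_1> = (1, 0) of A = sigma_z
    is still found in |a_1> after time t."""
    return abs(U(t)[0][0]) ** 2

for t in (0.001, 0.01, 0.1, 1.0):
    print(t, survival(t))
```

In this model the eigenstate is preserved up to corrections of order $(\omega t)^2$, so the duration has to be short compared with $1/\omega$, a scale fixed entirely by H.<br />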

Likewise, complying with the conservation laws evokes problems as is shown in a theorem by<br />

Wigner (1952) and Araki and Yanase (1960).



THE WIGNER-ARAKI-YANASE THEOREM:<br />

The evolution U(τ), which brings about the measurement transition (VIII. 4) when measuring<br />

physical quantity A, is possible iff A commutes with all additive conserved quantities<br />

of the composite system of object system and measuring apparatus. In other words,<br />

conserved physical quantities which are not additive, additive physical quantities which<br />

are not conserved, and physical quantities which are neither conserved nor additive, cannot<br />

be measured exactly.<br />

Proof<br />

Here we will only prove the ‘only if’ part of the theorem. Let B be an additive conserved quantity of<br />

the composite system SM, i.e., B is, by definition, of the form<br />

$B = B_1 \otimes \mathbb{1} + \mathbb{1} \otimes B_2$, (VIII. 34)<br />

which is conserved. This means that B commutes with the Hamiltonian H of the composite<br />

system,<br />

[B, H] = 0. (VIII. 35)<br />

Then B also commutes with every function of H and therefore with $U(\tau) = e^{-iH\tau}$,<br />

$[B, U(\tau)] = 0 \implies B = U^{\dagger}(\tau)\, B\, U(\tau)$. (VIII. 36)<br />
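As a hedged numerical illustration of (VIII. 35) and (VIII. 36), the sketch below uses a hypothetical three-level system that is not taken from the text: H is diagonal with one degenerate pair of levels, B mixes only the degenerate levels, and ħ is set to 1. All names (`energies`, `B`, `tau`) are assumptions made for the example.

```python
import cmath

# Hypothetical 3-level example: H is diagonal with a degenerate pair of levels,
# so a B that mixes only the degenerate levels commutes with H.
energies = [1.0, 1.0, 2.0]          # eigenvalues of H (illustrative)
B = [[0, 1, 0],
     [1, 0, 0],
     [0, 0, 5]]                     # illustrative conserved quantity
tau = 0.37                          # illustrative duration, hbar = 1

def diag(values):
    n = len(values)
    return [[values[i] if i == j else 0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

H = diag(energies)
U = diag([cmath.exp(-1j * e * tau) for e in energies])   # U(tau) = exp(-i H tau)

commutator_BH = max_abs_diff(matmul(B, H), matmul(H, B))              # [B, H]
commutator_BU = max_abs_diff(matmul(B, U), matmul(U, B))              # [B, U]
heisenberg_gap = max_abs_diff(matmul(matmul(dagger(U), B), U), B)     # U^dagger B U - B
```

All three quantities vanish up to rounding, which is the content of (VIII. 36) for this particular choice of H and B.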

Consider the matrix element<br />

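$B_{jk} := \langle a_j| \otimes \langle r_0|\, B\, |a_k\rangle \otimes |r_0\rangle$. (VIII. 37)<br />

On the one hand, because of the additivity of B, (VIII. 34), we have<br />

$B_{jk} = \langle a_j| \otimes \langle r_0|\, (B_1 \otimes \mathbb{1} + \mathbb{1} \otimes B_2)\, |a_k\rangle \otimes |r_0\rangle$<br />

$= \langle a_j| B_1 |a_k\rangle + \delta_{jk} \langle r_0| B_2 |r_0\rangle$, (VIII. 38)<br />

while on the other hand, using (VIII. 36) and the measurement transition (VIII. 4), we see that<br />

$B_{jk} = \langle a_j| \otimes \langle r_0|\, U^{\dagger}(\tau)\, B\, U(\tau)\, |a_k\rangle \otimes |r_0\rangle = \langle a_j| \otimes \langle r_j|\, B\, |a_k\rangle \otimes |r_k\rangle$<br />

$= \delta_{jk} \langle a_j| B_1 |a_k\rangle + \delta_{jk} \langle r_j| B_2 |r_k\rangle$. (VIII. 39)<br />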

Comparison of these two results shows that<br />

$\langle a_j| B_1 |a_k\rangle = 0$ for $j \neq k$, (VIII. 40)<br />

which means that in the basis $\{|a_j\rangle\}$ of $\mathcal{H}_S$, $B_1$ is in diagonal form and therefore A commutes<br />

with $B_1$. □<br />
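The additivity step (VIII. 38) can be checked numerically. The following is a minimal sketch on a two-level object system and a two-level apparatus; the matrices `B1`, `B2` and the state `r0` are illustrative assumptions, not taken from the text.

```python
# Sketch: check (VIII. 38) for an additive B = B1 (x) 1 + 1 (x) B2 on a 2 x 2 system.
# All matrices and vectors are illustrative choices.

def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

def kron_vec(u, v):
    """Product vector |u> (x) |v>."""
    return [x * y for x in u for y in v]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def expect(bra, m, ket):
    """<bra| m |ket> for real vectors."""
    return sum(bra[i] * m[i][j] * ket[j]
               for i in range(len(bra)) for j in range(len(ket)))

I2 = [[1, 0], [0, 1]]
B1 = [[1, 2], [2, -1]]       # object-system part, deliberately not diagonal
B2 = [[3, 1], [1, 0]]        # apparatus part
B = add(kron(B1, I2), kron(I2, B2))

a = [[1, 0], [0, 1]]         # eigenvectors |a_0>, |a_1> of A
r0 = [0.6, 0.8]              # normalized initial apparatus state |r_0>

def lhs(j, k):
    """Matrix element B_jk of (VIII. 37)."""
    return expect(kron_vec(a[j], r0), B, kron_vec(a[k], r0))

def rhs(j, k):
    """Right-hand side of (VIII. 38)."""
    delta = 1 if j == k else 0
    return expect(a[j], B1, a[k]) + delta * expect(r0, B2, r0)
```

With this non-diagonal `B1` the off-diagonal element `lhs(0, 1)` is nonzero, so by (VIII. 39) such a B could not be conserved under an evolution that realizes the measurement transition.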

This theorem shows that the scheme (VIII. 4) can, strictly speaking, apply only to measurement of<br />

quantities which commute with all additive conserved quantities. However, the measurement scheme<br />

remains approximately valid if the value of the conserved quantity is large, which will easily be the<br />

case for macroscopic apparatuses. We therefore see that, whereas the U (t) in (VIII. 2) exists, a more<br />

concrete interpretation can come across problems. The shortcomings of the conventional formalism<br />

of quantum mechanics with regard to giving a faithful description of the measurement process have<br />

led to interesting extensions of the formalism; see, for instance, Busch, Lahti and Mittelstaedt (1991).


A<br />

GLEASON’S THEOREM<br />

Proofs really aren’t there to convince you that something is true - they’re there to show<br />

you why it is true.<br />

— Andrew Gleason<br />

Of course mathematics works in physics! It is designed to discuss exactly the situation<br />

that physics confronts; namely, that there seems to be some order out there - let’s find<br />

out what it is.<br />

— Andrew Gleason<br />

In section III. 2 we mentioned that Von Neumann suggested for a quantum mechanical probability<br />

measure the trace formula Tr P W , with P a projector. Gleason’s theorem shows that<br />

this probability measure in fact characterizes all probability measures on P (H), the set of all<br />

projectors on H. Since Gleason’s original proof is very difficult, in this appendix we will give a<br />

simplified version by proving the theorem for pure states only.<br />

A. 1 INTRODUCTION<br />

Let H be a real or complex Hilbert space with dim H > 2, and P (H) the set of all projectors<br />

on H. Let µ be a mapping µ : P (H) → [0, 1]. This µ is called a measure on H if it is additive,<br />

satisfying<br />

$P_i \perp P_j \implies \mu(P_i + P_j) = \mu(P_i) + \mu(P_j) \quad \forall\, P_i, P_j \in \mathcal{P}(\mathcal{H})$ (A. 1)<br />

$\mu(0) = 0$ and $\mu(\mathbb{1}) = 1$. (A. 2)<br />

Combining (A. 1) with the last requirement of (A. 2) implies that the values which µ assigns to the projectors of any<br />

orthogonal decomposition of unity add up to 1.<br />

In section III. 2, p. 46, we saw that pure states are represented by the extreme elements of a convex<br />

set, and by proving the theorem on p. 49 we showed that the extreme elements of the convex set S(H)<br />

of state operators on H are the 1 - dimensional projectors in P (H). Consequently, the measure µ is<br />

called extreme if there exists a 1 - dimensional projector P such that<br />

µ(P ) = 1. (A. 3)<br />

This is also expressed by saying that µ is concentrated on P . We can now formulate Gleason’s<br />

theorem for pure states.



GLEASON’S THEOREM FOR PURE STATES:<br />

Under the condition that dim H > 2, a 1 - dimensional projector P 0 ∈ P (H) exists on<br />

which the measure µ : P (H) → [0, 1] is concentrated, such that<br />

µ(P ) = Tr P 0 P (A. 4)<br />

for all P ∈ P (H).<br />

The original proof by A.M. Gleason uses sophisticated mathematical methods and is rather<br />

opaque. Several authors have attempted to give a simpler proof, in particular<br />

C. Piron (1976), J. Dorling (unpublished) and R. Cooke, M. Keane and B. Moran (1985); the<br />

commentaries on the ‘elementary proof’ of Cooke, Keane and Moran by R.I.G. Hughes (1989)<br />

are clarifying.<br />

The following proof is a mixture of all this work. It consists of four steps, which, for that matter, do<br />

not coincide with the sections.<br />

A. 2 CONVERSION TO A 3 - DIMENSIONAL REAL PROBLEM<br />

Before taking the first step, we discuss a number of simple observations. First, the probability<br />

measure of Gleason’s theorem has to be continuous in P . In section III. 1, p. 48, we showed that<br />

discontinuous probability measures exist for dim H = 2. Therefore, the requirement dim H > 2<br />

is assumed throughout this appendix without further mention. Second, since the trace of a projector P<br />

is equal to the dimension of the subspace onto which it projects, the trace of a 1 - dimensional projector<br />

is 1, which yields for µ being concentrated on P 0<br />

$\mu(\mathbb{1}) = \mathrm{Tr}\, P_0 \mathbb{1} = 1$, (A. 5)<br />

in accordance with (A. 2) and (A. 4). Third, every measure is entirely determined by its values<br />

on the 1 - dimensional projectors: since every higher - dimensional projector P is the sum<br />

of orthogonal 1 - dimensional projectors P i , we can, with (A. 1), determine µ (P ) from the values<br />

of µ(P i ). Fourth, for every Hilbert space, (A. 4) at the same time defines an extreme measure µ on H<br />

which is concentrated on P 0 , and as of now we will indicate this measure by µ 0 ,<br />

µ 0 (P ) := Tr P 0 P. (A. 6)<br />

Using the idempotence of P 0 we have<br />

$\mu_0(P_0) = \mathrm{Tr}\, P_0^2 = \mathrm{Tr}\, P_0 = 1$, (A. 7)<br />

from which we see that (A. 4) holds for µ = µ 0 and P = P 0 . Since this measure, being concentrated<br />

on P 0 , assigns the value 0 to all projectors orthogonal to P 0 , it can also easily be verified that this<br />

measure satisfies the requirements (A. 1) and (A. 2).<br />
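The verification that µ 0 satisfies (A. 1) and (A. 2) can be sketched numerically in a real 3 - dimensional space. The example below assumes P 0 projects onto the third coordinate axis and uses an arbitrarily rotated orthonormal triad; both choices are illustrative.

```python
import math

def outer(u):
    """Projector |u><u| for a real unit vector u."""
    return [[x * y for y in u] for x in u]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def mu0(P0, P):
    """Special measure (A. 6): Tr P0 P."""
    return trace(matmul(P0, P))

P0 = outer([0.0, 0.0, 1.0])
c, s = math.cos(0.7), math.sin(0.7)        # illustrative rotation angle
triad = [outer(v) for v in ([c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c])]

values = [mu0(P0, P) for P in triad]                 # sin^2, 0, cos^2
unity = add(add(triad[0], triad[1]), triad[2])       # orthogonal decomposition of unity
```

Here `values` sums to 1, and the value on a sum of two orthogonal projectors equals the sum of the individual values, as (A. 1) requires.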

The foregoing observations lead to the conclusion that to prove Gleason’s theorem for pure states<br />

we have to prove that µ = µ 0 for all P ∈ P (H). Now we will take the first step.



A. 2. 1 STEP 1<br />

THEOREM 1:<br />

If Gleason’s theorem for pure states is true for any 3 - dimensional real Hilbert space, it<br />

is also true for any complex Hilbert space with dim H > 2.<br />

We will prove theorem 1 using a proof by contradiction.<br />

Proof<br />

Let H be a complex Hilbert space with dim H > 2 for which Gleason’s theorem is not true. Since<br />

all higher - dimensional projectors can be decomposed into 1 - dimensional projectors, it suffices to<br />

prove this theorem for 1 - dimensional projectors.<br />

Assume a measure µ on H exists, which is concentrated on P 0 ∈ P (H) such that µ(P 0 ) = 1,<br />

but differs from the measure µ 0 defined by (A. 6) in the sense that there is some 1 - dimensional<br />

projector P 1 for which the theorem does not hold,<br />

µ 0 (P 1 ) := Tr P 0 P 1 ≠ µ(P 1 ). (A. 8)<br />

First we will show that, if these measures differ on a higher - dimensional Hilbert space, they also<br />

differ on a 3 - dimensional Hilbert space.<br />

Using the projectors P 0 and P 1 , we can construct a set of three orthogonal 1 - dimensional projectors<br />

P 0 , ˜P 1 , P 2 in the following way. With P 0 = |e 0 ⟩ ⟨e 0 | and P 1 = |e 1 ⟩ ⟨e 1 |, construct a<br />

unit vector |ẽ 1 ⟩ in the plane spanned by |e 0 ⟩ and |e 1 ⟩ which is perpendicular to |e 0 ⟩, i.e.<br />

$|\tilde{e}_1\rangle \propto (\mathbb{1} - P_0)\, |e_1\rangle$, (A. 9)<br />

as can be seen in figure A. 1. Then the projector ˜P 1 := |ẽ 1 ⟩ ⟨ẽ 1 | is perpendicular to P 0 .<sup>1</sup><br />

Figure A. 1: Construction of a 3 - dimensional subspace E (vectors e 0 , e 1 , ẽ 1 and e 2 )<br />

Let P 2 be a 1 - dimensional projector which is perpendicular to both P 0 and ˜P 1 ; it is always possible<br />

to choose such a projector because dim H > 2. With P 2 = |e 2 ⟩ ⟨e 2 |, the three orthonormal<br />

vectors |e 0 ⟩,|ẽ 1 ⟩ and |e 2 ⟩ together span a 3 - dimensional Hilbert space, which is a subspace of H.<br />

We will call this space E, and, by construction, P 0 , P 1 , ˜P 1 , P 2 ∈ P (E).<br />

<sup>1</sup> To be exact,<br />

$\tilde{P}_1 = (1 - \mathrm{Tr}\, P_0 P_1)^{-1}\, (P_1 + P_0 P_1 P_0 - P_1 P_0 - P_0 P_1)$.
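The explicit formula of footnote 1 can be compared with the direct construction (A. 9). The sketch below does this for one illustrative pair of real unit vectors; the angle `theta` and the vectors `e0`, `e1` are assumptions of the example.

```python
import math

def outer(u):
    return [[x * y for y in u] for x in u]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def lincomb(terms):
    """Sum of coeff * matrix over a list of (coeff, matrix) pairs."""
    n = len(terms[0][1])
    return [[sum(c * m[i][j] for c, m in terms) for j in range(n)]
            for i in range(n)]

theta = 0.9                                   # illustrative angle between e0 and e1
e0 = [0.0, 0.0, 1.0]
e1 = [math.sin(theta), 0.0, math.cos(theta)]
P0, P1 = outer(e0), outer(e1)

# Direct construction (A. 9): normalize (1 - P0)|e1>.
overlap = sum(x * y for x, y in zip(e0, e1))
resid = [y - overlap * x for x, y in zip(e0, e1)]
norm = math.sqrt(sum(x * x for x in resid))
P1_tilde_direct = outer([x / norm for x in resid])

# Footnote formula.
scale = 1.0 / (1.0 - trace(matmul(P0, P1)))
P1_tilde_formula = lincomb([
    (scale, P1),
    (scale, matmul(matmul(P0, P1), P0)),
    (-scale, matmul(P1, P0)),
    (-scale, matmul(P0, P1)),
])

max_gap = max(abs(x - y)
              for ra, rb in zip(P1_tilde_direct, P1_tilde_formula)
              for x, y in zip(ra, rb))
```

The agreement reflects the identity (1 − P 0 ) P 1 (1 − P 0 ) = P 1 + P 0 P 1 P 0 − P 1 P 0 − P 0 P 1 , with the prefactor supplying the normalization.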



Now we have the following statements,<br />

(a) P (E) ⊂ P (H),<br />

(b) the restriction of µ 0 to P (E) is a measure on P (E),<br />

(c) the restriction of µ to P (E) is a measure on P (E),<br />

(d) the measures µ 0 and µ differ on P (E).<br />

Statement (a) follows immediately from E ⊂ H. The statements (b) and (c) follow from the<br />

fact that both µ 0 and µ, being concentrated on P 0 , assign the value 1 to E, thereby assigning<br />

the value 0 to all subspaces of H perpendicular to E. Statement (d) follows from our assumption<br />

(A. 8).<br />

Next, we have to show that the Hilbert space E can be real. A Hilbert space is real if scalar<br />

multiplication and linear combinations of vectors are only carried out with real coefficients and<br />

the inner products are real. Choosing the vectors |e 0 ⟩, |ẽ 1 ⟩ and |e 2 ⟩, we have the freedom to<br />

absorb an arbitrary phase factor, which means that we can also take them real. Furthermore, we<br />

can exploit that freedom to bring about that the vector |e 1 ⟩, lying in the plane spanned by |e 0 ⟩<br />

and |ẽ 1 ⟩, becomes a linear combination with real coefficients, i.e.,<br />

|e 1 ⟩ = a |e 0 ⟩ + b |ẽ 1 ⟩ with a, b ∈ R. (A. 10)<br />

All inner products of the four vectors |e 0 ⟩, |ẽ 1 ⟩, |e 2 ⟩ and |e 1 ⟩ now have a real value. The required<br />

real Hilbert space is obtained by taking all linear combinations of |e 0 ⟩, |ẽ 1 ⟩ and |e 2 ⟩ with real<br />

coefficients. Because both |e 0 ⟩ and |e 1 ⟩ are elements of this Hilbert space, (a) through (d) remain<br />

valid.<br />

We see that, if Gleason’s theorem for pure states is not true for a complex Hilbert space with<br />

dim > 2, it is also not true for a real 3 - dimensional Hilbert space. Now assume that the theorem<br />

has been proven for a real Hilbert space with dim = 3. Supposing at the same time that it is<br />

not true for a complex Hilbert space with dim > 2, so that it would, as we showed, also not be true for a<br />

real H with dim = 3, yields a contradiction. Therefore, theorem 1 is true. □<br />
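The phase argument leading to (A. 10) can be made concrete. The sketch below starts from two arbitrary complex unit vectors (illustrative values, not from the text), absorbs a phase into |e 0 ⟩, and checks that |e 1 ⟩ then has real coefficients a and b with respect to |e 0 ⟩ and |ẽ 1 ⟩.

```python
import math

def inner(u, v):
    """<u|v>, antilinear in the first argument."""
    return sum(x.conjugate() * y for x, y in zip(u, v))

def normalize(u):
    n = math.sqrt(inner(u, u).real)
    return [x / n for x in u]

# Illustrative complex unit vectors spanning a plane in C^3.
e0 = normalize([1 + 1j, 0.5j, 2.0 + 0j])
e1 = normalize([0.3 + 0j, 1 - 2j, 1j])

overlap = inner(e0, e1)                    # complex in general
phase = overlap / abs(overlap)
e0_real = [phase * x for x in e0]          # same projector P0, different phase

a = inner(e0_real, e1)                     # equals |overlap|, so it is real
resid = [y - a * x for x, y in zip(e0_real, e1)]   # (1 - P0)|e1>
b = math.sqrt(inner(resid, resid).real)    # real and positive by construction
e1_tilde = [x / b for x in resid]

rebuilt = [a * x + b * y for x, y in zip(e0_real, e1_tilde)]
rebuild_gap = max(abs(x - y) for x, y in zip(rebuilt, e1))
```

The phase of |ẽ 1 ⟩ is fixed here by taking b positive, which is one of several equivalent conventions.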

A. 3 FORMULATION OF THE PROBLEM ON THE SURFACE OF A SPHERE<br />

While by proving theorem 1 we showed that, if Gleason’s theorem for pure states is true for a<br />

real, 3 - dimensional Hilbert space, it is also true for a complex Hilbert space with dim > 2, we did<br />

not prove that µ = µ 0 . In this section we will take the next steps towards proving that indeed µ = µ 0<br />

for all P ∈ P (H) in a real, 3 - dimensional Hilbert space.<br />

Conversion of an arbitrary complex Hilbert space to a 3 - dimensional real Hilbert space is convenient<br />

because this space is isomorphic with the usual 3 - dimensional Euclidean space R 3 . Here,<br />

the 1 - dimensional projectors correspond to lines through the origin, and we can identify them with<br />

points on the surface of a unit sphere, or actually, with half of the unit sphere because |e⟩ and −|e⟩ represent<br />

the same state. Those points will be designated by means of their spherical coordinates (θ, ϕ),<br />

or as points, or directions, on the surface of the unit sphere p, q, r, s, t, . . . , ∈ S 2 , where S 2 is the<br />

standard notation for this surface, and the index 2 refers to the fact that it is 2 - dimensional.



Letting lines through the origin represent 1 - dimensional projectors, the mapping µ is represented<br />

by a function µ which is a function of the points on S 2 or of the spherical coordinates of those points,<br />

having the following characteristics.<br />

The point p 0 , corresponding to the projector P 0 on which the measure µ is concentrated, so that<br />

$\mu(p_0) = 1$, (A. 11)<br />

is called the north pole by convention; the other 1 - dimensional projectors are represented by points on<br />

the northern hemisphere. The set of all 1 - dimensional projectors perpendicular to a given direction r<br />

is called a great circle with axis r; see figure A. 2. The great circle representing the<br />

projectors perpendicular to P 0 is called the equator. For any point s on the<br />

equator, according to (A. 1), (A. 11), and the requirement $0 \leq \mu \leq 1$, it holds that<br />

$\mu(s) = 0$. (A. 12)<br />

The requirement (A. 1) for µ to be a measure is, if µ is taken to be a function of points on the<br />

surface of the unit sphere, that for arbitrary, mutually perpendicular axes (r, s, t) in the northern<br />

hemisphere it holds that<br />

µ(r) + µ(s) + µ(t) = 1, (A. 13)<br />

while for µ taken as a function of the spherical coordinates (θ, ϕ) of the points of intersection of the<br />

arbitrary axes (r, s, t) with the surface of the unit sphere we have<br />

µ(θ r , ϕ r ) + µ(θ s , ϕ s ) + µ(θ t , ϕ t ) = 1 (A. 14)<br />

where for any ϕ it holds that<br />

$\mu(0, \varphi) = 1$ and $\mu(\tfrac{1}{2}\pi, \varphi) = 0$, (A. 15)<br />

assigning the required values to the north pole and the equator.<br />

Since we are working in a real, 3 - dimensional Hilbert space, we can assign values to the special<br />

measure (A. 6) in accordance with Von Neumann’s value assignment (V. 33), p. 119. Using (III. 45),<br />

with P 0 = |e 0 ⟩ ⟨e 0 | and P s = |ψ⟩ ⟨ψ|, we have<br />

$\mu_0(P_s) = \mathrm{Tr}\, P_0 P_s = |\langle \psi | e_0 \rangle|^2$. (A. 16)<br />

With θ s the angle between s and the north pole, the special measure can be written as<br />

$\mu_0(s) = \cos^2 \theta_s$. (A. 17)<br />

We will come back to this value assignment in section A. 4. In the next two steps we will prove<br />

that any measure µ (s) satisfying the requirements (A. 11) to (A. 13), or (A. 14) and (A. 15), is a<br />

nonincreasing function in θ s , and does not depend on ϕ.



A. 3. 1 STEP 2<br />

THEOREM 2:<br />

If the function µ (s) or, equivalently, µ (θ s , ϕ s ), satisfies the requirements (A. 11)<br />

to (A. 15), then µ(s) is a nonincreasing function in θ s .<br />

We will prove this theorem using two lemmas.<br />

A. 3. 1. 1 LEMMA 1<br />

A LITTLE LEMMA:<br />

Let {s ∈ S 2 | s ⊥ r} be the great circle with axis r ≠ p 0 . Furthermore, let s 0 represent<br />

the most northern point of this circle. Then for all points s of this great circle it holds<br />

that<br />

$\mu(s_0) \geq \mu(s)$, (A. 18)<br />

i.e., if we let s travel along a great circle, µ(s) will have its maximum value in the most<br />

northern point s 0 .<br />

Proof<br />

Choose a set of three orthogonal directions r, s, t, with s ∈ S 2 an arbitrary point on the great<br />

circle around axis r. From (A. 13) we have<br />

µ(r) + µ(s) + µ(t) = 1. (A. 19)<br />

Now carry out a rotation of the orthogonal pair s and t around the axis r until s arrives at the most<br />

northern point s 0 of the great circle. Under this rotation t arrives at a point t ′ at the equator as<br />

can be seen in figure A. 2.<br />

Figure A. 2: Rotation of s to s 0 and t to t ′ along a great circle around axis r



Since r, s 0 and t ′ are still mutually orthogonal, we have<br />

µ(r) + µ(s 0 ) + µ(t ′ ) = 1, (A. 20)<br />

and combination with (A. 19) gives<br />

µ(s) + µ(t) = µ(s 0 ) + µ(t ′ ). (A. 21)<br />

But t ′ is on the equator, where, according to (A. 12), µ(t ′ ) = 0, and with $0 \leq \mu \leq 1$ we see that<br />

$\mu(s) = \mu(s_0) - \mu(t) \leq \mu(s_0)$. (A. 22)<br />

Therefore, on the great circle µ(s) has its largest value in the most northern point. □<br />
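Lemma 1 can be illustrated with the special measure µ 0 (s) = cos² θ s , which is known to satisfy (A. 11) to (A. 13). The sketch below picks an arbitrary axis r (an assumption of the example), constructs the most northern point s 0 of the great circle around r, and rotates the pair (s, t) about r as in the proof.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def normalize(u):
    n = math.sqrt(dot(u, u))
    return tuple(x / n for x in u)

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def mu0(s):
    """Special measure cos^2(theta_s), with the north pole p0 = (0, 0, 1)."""
    return s[2] ** 2

p0 = (0.0, 0.0, 1.0)
r = normalize((1.0, 2.0, 2.0))             # illustrative axis, r != p0

# Most northern point s0 of the great circle perpendicular to r:
# project p0 onto the plane perpendicular to r and normalize.
s0 = normalize(tuple(p - dot(p0, r) * x for p, x in zip(p0, r)))
t_prime = cross(r, s0)                     # completes the triad; lies on the equator

def rotated_pair(angle):
    """Rotate (s0, t') about r by the given angle; returns the orthogonal pair (s, t)."""
    c, sn = math.cos(angle), math.sin(angle)
    s = tuple(c * a + sn * b for a, b in zip(s0, t_prime))
    t = tuple(-sn * a + c * b for a, b in zip(s0, t_prime))
    return s, t

pairs = [rotated_pair(2 * math.pi * k / 360) for k in range(360)]
```

For every sampled pair, `mu0(s) + mu0(t)` equals `mu0(s0)`, which is (A. 21) with µ(t ′ ) = 0, and `mu0(s)` never exceeds `mu0(s0)`.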

A. 3. 1. 2 LEMMA 2<br />

PIRON’S GEOMETRIC LEMMA :<br />

If s and t are points on the northern hemisphere, and s lies more northwards than t,<br />

a curve from s to t can be found which consists entirely of segments of great circles, each segment<br />

starting at the most northern point of its great circle.<br />

The following proof of Piron’s geometric lemma using projective geometry has been given by<br />

Cooke, Keane and Moran (1985).<br />

Proof<br />

The surface of the northern hemisphere of the unit sphere can be projected bijectively from the<br />

origin onto the horizontal plane P tangent to the north pole, as can be seen in figure A. 3. Therefore,<br />

we can also formulate our problem in this plane.<br />

Figure A. 3: Projection of points on a great circle onto a plane P through the north pole



All great circles, except the equator, are projected onto this plane as straight lines. The most<br />

northern point of such a great circle is projected onto the point of its corresponding line that is<br />

closest to the north pole. The line connecting the image of the north pole, Im(p 0 ), and the image<br />

of s 0 , Im(s 0 ), therefore intersects this line at a right angle.<br />
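These projection properties can be checked numerically. The sketch below projects sampled points of one illustrative great circle (axis r = (1, 2, 2)/3, an assumption of the example) from the origin onto the plane z = 1, and tests that the images are collinear and that Im(s 0 ) is the image closest to Im(p 0 ) = (0, 0).

```python
import math

def project(s):
    """Central projection from the origin onto the tangent plane z = 1."""
    x, y, z = s
    return (x / z, y / z)

# Illustrative great circle with axis r = (1, 2, 2)/3.
sqrt45 = math.sqrt(45.0)
s0 = (-2 / sqrt45, -4 / sqrt45, 5 / sqrt45)     # most northern point of the circle
u = (6 / sqrt45, -3 / sqrt45, 0.0)              # point of the circle on the equator

# Sample the part of the great circle lying in the northern hemisphere.
angles = [math.radians(d) for d in range(-80, 81, 10)]
points = [tuple(math.cos(t) * a + math.sin(t) * b for a, b in zip(s0, u))
          for t in angles]
images = [project(p) for p in points]
im_s0 = project(s0)

def off_line(img):
    """Zero if img lies on the line through Im(s0) with direction (u_x, u_y)."""
    return (img[0] - im_s0[0]) * u[1] - (img[1] - im_s0[1]) * u[0]

# Inner product of Im(s0) with the line direction: zero means a right angle.
right_angle_gap = im_s0[0] * u[0] + im_s0[1] * u[1]
```

The equator itself has no image under this projection, which is why the sampled angles stay strictly between −90° and 90°.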

The projection plane therefore contains circles around the projected north pole corresponding to<br />

circles of constant northern latitude, where θ is constant, lines through the projected north pole<br />

corresponding to meridians which are lines of constant ϕ, and projected great circles, where one<br />

of those great circles is depicted in figure A. 4 by the thick grey line, while the projection of its<br />

most northern point is connected with the projected north pole by the thin grey line.<br />

Figure A. 4: Projection of meridians (ϕ = c), circles with constant latitude (θ = c), and a great circle<br />

A continuous path from s to t, with s more northern than t, therefore θ s < θ t , along a series of<br />

segments of great circles while always starting at their most northern point, is represented in this<br />

way by a spiral consisting of straight line segments as shown in figure A. 5.<br />

Figure A. 5: Spiral representing a projected path from s to t along subsequent great circles, each time starting at their most northern point<br />

By increasing the number of segments between s and t, we can let this spiral approach a circle<br />

with the north pole as its center. This means that on the northern hemisphere we can travel<br />

every desired distance in longitude by changing over to other great circles, while by changing<br />

over frequently enough we can make the decrease in northern latitude arbitrarily small, leaving θ<br />

constant or nearly constant.<br />

It is also possible to travel from a point t to a more southern point v having the same longitude,<br />

ϕ t = ϕ v . Of course, this can be done by traveling along a nearly circular path as described<br />


above while taking ϕ from 0 to 2π and changing over just often enough to descend the required<br />

distance, but we will show it can also be done taking a path along two great circles only, again<br />

starting in their most northern points.<br />

As we saw, on the plane P paths of constant latitude are represented by circles around the north<br />

pole p 0 . Taking t as the starting point, choose it to be the most northern point of a great circle and<br />

travel along a segment, projected onto P as a straight line, to arrive at u, with θ u > θ t . From u,<br />

also choosing it to be the most northern point of a great circle, travel along a segment in opposite<br />

rotational direction, to arrive at v, the projection of which can be seen in figure A. 6.<br />

Figure A. 6: Path from t to v, having the same longitude, ϕ(t) = ϕ(v)<br />

By traveling far enough along the great circle through t, u can always be chosen such that v can<br />

be reached from t in two steps. This means that we can always combine paths with constant latitude<br />

and constant longitude to create a path between two points s and t, where s is more northern<br />

than t, consisting entirely of segments of great circles, each starting at its most northern point,<br />

thereby proving Piron’s lemma. □<br />


A. 3. 1. 3 RESULT OF LEMMA 1 AND 2<br />

By proving the first lemma, we showed that µ(s 0 ), with s 0 the most northern point of the great<br />

circle through s, is always larger than, or equal to, µ(s), consequently, µ can only remain constant or<br />

decrease along a great circle if traveling along the circle starts from its most northern point.<br />

According to lemma 2, when traveling from s to t, where s is more northern than t, it is always possible<br />

to follow a path along subsequent great circles, each time starting at their most northern points.<br />

Combining the two lemmas, we can find a sequence of<br />

points s, s ′ , s ′′ , . . . , t, with<br />

$\mu(s) \geq \mu(s') \geq \dots \geq \mu(t)$ for $\theta_s < \theta_{s'} < \dots < \theta_t$, (A. 23)<br />

and therefore<br />

$\mu(s) \geq \mu(t)$ for $\theta_s < \theta_t$, (A. 24)<br />

which proves theorem 2. □



A. 3. 2 STEP 3<br />

THEOREM 3:<br />

The function µ is constant at constant latitude and hence does not depend on ϕ,<br />

θ s = θ t ⇒ µ(θ s , ϕ s ) = µ(θ t , ϕ t ). (A. 25)<br />

Proof, first part<br />

Again, we will use a proof by contradiction.<br />

Suppose a latitude exists, i.e., there is a horizontal circle B on the surface of the unit sphere,<br />

B(θ 0 ) = {s ∈ S 2 | θ s = θ 0 }, (A. 26)<br />

for which µ is not constant. Here we assume that B(θ 0 ) is not the north pole or the equator, where<br />

theorem 3 is obvious. Now let<br />

$M(\theta_0) := \sup\{\mu(s) \in [0, 1] \mid s \in B(\theta_0)\}$ (A. 27)<br />

and<br />

$m(\theta_0) := \inf\{\mu(s) \in [0, 1] \mid s \in B(\theta_0)\}$, (A. 28)<br />

where M (θ 0 ) is the least upper bound, or supremum, and m(θ 0 ) is the greatest lower bound, or<br />

infimum, of all values of µ over B(θ 0 ). If µ is not constant on B(θ 0 ), then, for a certain ε > 0,<br />

$M(\theta_0) - m(\theta_0) = \varepsilon$. (A. 29)<br />

Now let C be an arbitrary continuous curve which intersects each circle of constant latitude at<br />

most once, i.e., the latitude along C is strictly increasing or strictly decreasing.<br />

Figure A. 7: A strictly increasing or decreasing curve C intersecting B(θ 0 ) in p<br />

Let p be the point where the curve C intersects the latitude (A. 26),<br />

p = C ∩ B(θ 0 ), (A. 30)<br />

as can be seen in figure A. 7.



For every point s 1 on this curve north of p we have θ s1 < θ 0 , which means that according<br />

to (A. 24) it holds that $\mu(s_1) \geq \mu(s)$ for every point s ∈ B(θ 0 ). Consequently, it also holds that<br />

$\mu(\theta_{s_1}) \geq M(\theta_0)$. (A. 31)<br />

Likewise, for all points s 2 of C south of B(θ 0 ) we see that<br />

$\mu(\theta_{s_2}) \leq m(\theta_0)$. (A. 32)<br />

This reasoning holds no matter how close to B(θ 0 ) the points s 1 and s 2 are chosen.<br />

Because of (A. 29) we conclude that the value of µ, when traveling from north to south along<br />

the curve C, makes a discontinuous jump of at least<br />

M (θ 0 ) − m(θ 0 ) = ε (A. 33)<br />

to a lower value when passing B (θ 0 ). This conclusion applies to every continuous, strictly increasing<br />

or decreasing curve intersecting B (θ 0 ), which means we can also choose the curve C to be a<br />

meridian,<br />

C = {s ∈ S 2 | ϕ s = ϕ 0 }, (A. 34)<br />

which is a great circle through the north pole having its axis t at the equator, see figure A. 8.<br />

Figure A. 8: Great circle C, coordinate system (p, q, t), and rotating pair (s, s ⊥ )<br />

Let q ∈ C be orthogonal to the point of intersection p of C and B (θ 0 ), such that t, p and q<br />

are mutually orthogonal. Choose an orthogonal pair (s, s ⊥ ) ∈ C to be a rigid coordinate system.<br />

Rotating this system around axis t, we move s from north to south through point p, whereby,<br />

according to (A. 33), the value of µ jumps discontinuously by at least ε while crossing the<br />

latitude of B(θ 0 ). The pair s and s ⊥ forming a rigid system, we know that<br />

µ(s) + µ(s ⊥ ) + µ(t) = 1, (A. 35)<br />

where µ(t) = 0 because the axis t is on the equator.



Therefore, if s moves southwards, passing through p, and simultaneously s ⊥ moves northwards,<br />

passing through q, the value of µ(s ⊥ ) also has to jump discontinuously. If µ(s) jumps by −ε,<br />

then µ(s ⊥ ) jumps by ε.<br />

Now choose another great circle C ′ with axis t ′ , which intersects B(θ 0 ) in p under a slightly tilted<br />

angle, as can be seen in figure A. 9.<br />

Figure A. 9: Great circle C and tilted great circles C ′ and C ′′<br />

For this great circle we can repeat the same argument, and conclude that for s ′ ∈ C ′ , while<br />

passing the latitude of B(θ 0 ), µ(s ′ ) makes a jump of at least ε, and an equally valued but opposite<br />

jump is made by µ (s ′⊥ ) in a point q ′ ∈ C ′ which is again perpendicular to p. Notice that,<br />

because C ′ is tilted with respect to C, θ(q) ≠ θ(q ′ ).<br />

This argument can be repeated endlessly, with great circles C ′′ , C ′′′ , . . . , C n , intersecting B(θ 0 )<br />

in p, always under different angles. We therefore find a series of points q, q ′ , q ′′ , . . . , q n where,<br />

in passing through one of them while traveling along one of the great circles through p, the value<br />

of µ jumps discontinuously. □<br />

Here we briefly pause from the proof of theorem 3 to prove a simple lemma.<br />

ACCESSORY LEMMA:<br />

Let C 1 and C 2 be two continuous curves on S 2 , intersecting in q, where q is not the<br />

most northern point of either curve. For some s ∈ C 1 , with s more northern than q,<br />

suppose that, traveling south, µ(s) makes a discontinuous jump of −ε < 0 in the point<br />

of intersection q.<br />

This means that it holds for all s ∈ C 1 and some constant a,<br />

$\theta_s < \theta_q \implies \mu(s) \geq a$, (A. 36)<br />

and<br />

$\theta_s > \theta_q \implies \mu(s) \leq a - \varepsilon$, (A. 37)<br />

and consequently, for all t ∈ C 2 , µ(t) also makes a discontinuous jump in q of at least ε.



Proof<br />

For every pair of points (s 1 , s 2 ) ∈ C 1 , where θ s1 < θ q and θ s2 > θ q , we can always find a<br />

pair (t 1 , t 2 ) ∈ C 2 , such that θ s1 < θ t1 < θ q and θ s2 > θ t2 > θ q , see figure A. 10.<br />

Figure A. 10: Two continuous curves on S 2 , intersecting in q<br />

Using (A. 24), (A. 36) and (A. 37), we have for t ∈ C 2<br />

$\theta_s < \theta_t < \theta_q \implies \mu(s) \geq \mu(t) \geq a$, (A. 38)<br />

and<br />

$\theta_s > \theta_t > \theta_q \implies \mu(s) \leq \mu(t) \leq a - \varepsilon$. (A. 39)<br />

This holds no matter how close to q the points s and t are chosen, which proves the lemma. □<br />

Proof, second part<br />

Now we continue the proof of theorem 3. In the first part of the proof we showed for the pair<br />

(s, s ⊥ ) that if µ jumps by ε in p, it also jumps by ε in q. The same rigidity holding for any<br />

pair (s i , s i⊥ ) ∈ C i , we concluded that µ jumps in every point q, q ′ , q ′′ , . . . , q n by at least ε.<br />

With the accessory lemma, we proved that, if µ makes a jump of at least ε at some point on one<br />

curve C, it does so on any curve C i through that point.<br />

Since we chose the directions q, q ′ , q ′′ , . . . , q n perpendicular to p, see figure A. 9, they all lie<br />

on C p , a great circle with axis p. Starting in its most northern point q, upon descending along this<br />

great circle C p towards the equator, µ(s) remains constant or decreases, as we showed by proving<br />

theorem 2.<br />

But according to the first part of this proof and the accessory lemma, upon descending along<br />

this great circle C p towards the equator, in each of the points q, q ′ , q ′′ , . . . , q n , µ jumps by<br />

at least −ε while passing their various latitudes. Since we can choose n arbitrarily large, we can<br />

choose $n > 1/\varepsilon$, making the total jump $n\varepsilon > 1$. This leads to µ acquiring values<br />

smaller than 0, which contradicts the requirement that $0 \leq \mu \leq 1$. We have to conclude<br />

that ε = 0, which yields M (θ 0 ) = m(θ 0 ).<br />

We proved that if on the surface of the unit sphere a horizontal circle B exists for which µ is not<br />

constant, then µ ∉ [0, 1]; hence µ is constant on constant latitude and does not depend on ϕ,<br />

which proves theorem 3. □



A. 4 AN ANALYTIC LEMMA<br />

We have to take one more step to prove that µ = µ 0 , but first we prove a lemma using results<br />

from previous sections.<br />

LEMMA:<br />

The special measure µ 0 can be written as<br />

$\mu_0(\chi_s) = \chi_s$, with $\chi_s := \cos^2 \theta_s$. (A. 40)<br />

Proof<br />

The special measure (A. 6) can, as we saw in (A. 17), be written as<br />

$\mu_0(P) = \mathrm{Tr}\, P_0 P = |\langle \psi | e_0 \rangle|^2 = \cos^2 \theta$. (A. 41)<br />

As we proved that µ is a nonincreasing function of θ and does not depend on ϕ, we can take µ<br />

to be a function of a function of θ. To prepare the connection with the analytic lemma<br />

of step 4, which will follow shortly, we choose this function to be the continuous, nonincreasing<br />

function $\chi : [0, \tfrac{1}{2}\pi] \to [0, 1]$, $\chi(\theta_s) := \cos^2 \theta_s$, abbreviated $\chi_s$, where $\theta_s$ is the angle between the direction s<br />

and the north pole. In the next step we will show that this measure satisfies the requirements for µ<br />

to be a measure.<br />

For the special measure µ 0 (θ s ), with s representing an arbitrary P , (A. 41) now reads<br />

$\mu_0(\chi(\theta_s)) = \cos^2 \theta_s$, (A. 42)<br />

which can be written as<br />

µ 0 (χ s ) = χ s . □ (A. 43)<br />

What is left for us to do is to see whether a measure exists, not equal to µ 0 , for which this does<br />

not hold for some P ∈ P (H), as was our assumption (A. 8) in A. 2. 1. This will be the final step,<br />

where, by proving the next theorem, we will see that such a measure does not exist.<br />

A. 4. 1 STEP 4<br />

THEOREM 4:<br />

The only form of µ satisfying (A. 24),<br />

$\theta_s < \theta_t \implies \mu(s) \geq \mu(t)$, (A. 44)<br />

is<br />

$\mu(\chi_s) = \chi_s$. (A. 45)



To prove this theorem, we will use an analytic lemma given by Cooke, Keane and Moran (1985).<br />

But before we will do so, we make some observations.<br />

First, for any triple of mutually perpendicular directions (r, s, t) and any direction q it holds that<br />

$\cos^2 \theta_r + \cos^2 \theta_s + \cos^2 \theta_t = 1$, (A. 46)<br />

where $\theta_r$ is the angle between the direction q and the axis r, so that $\cos \theta_r$ is the component of the unit vector q along r, and likewise for s<br />

and t. We can easily see that (A. 46) holds if we express the direction q in the usual spherical<br />

coordinates (θ, ϕ) with respect to the axes r, s and t,<br />

$\cos \theta_r = \cos \varphi \sin \theta$, $\cos \theta_s = \sin \varphi \sin \theta$, and $\cos \theta_t = \cos \theta$, (A. 47)<br />

whose squares add up to 1.<br />
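That (A. 46) does not depend on the choice of axes can also be checked numerically. Below is a small Python sketch, assuming real three-dimensional vectors; the function names (`random_orthonormal_triple`, `chi_values`, and so on) are hypothetical helpers introduced only for this illustration.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def random_unit_vector(rng):
    return normalize([rng.gauss(0.0, 1.0) for _ in range(3)])


def random_orthonormal_triple(rng):
    r = random_unit_vector(rng)
    w = random_unit_vector(rng)
    # Gram-Schmidt: remove the component of w along r, then normalize.
    c = dot(w, r)
    s = normalize([wi - c * ri for wi, ri in zip(w, r)])
    t = cross(r, s)  # unit vector perpendicular to both r and s
    return r, s, t


def chi_values(q, triple):
    # cos^2 of the angle between the unit vector q and each axis.
    return [dot(q, axis) ** 2 for axis in triple]


rng = random.Random(0)
triple = random_orthonormal_triple(rng)
q = random_unit_vector(rng)
total = sum(chi_values(q, triple))
print(abs(total - 1.0) < 1e-12)
```

The three squared cosines sum to 1 up to floating-point rounding for any orthonormal triple, not only for the coordinate axes used in (A. 47).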

With $\chi(\theta_r) = \cos^2 \theta_r$ etc., we can write (A. 46) as<br />

$\chi_r + \chi_s + \chi_t = 1$. (A. 48)<br />

Second, for µ as a function of $\chi(\theta_s)$, µ : [0, 1] → [0, 1], it holds that although µ is nonincreasing<br />

in θ, it is nondecreasing in $\chi_s$. The requirements for µ to be a measure, (A. 14) and (A. 15), can now<br />

be rewritten as<br />

$\mu(\chi_r) + \mu(\chi_s) + \mu(\chi_t) = 1$ whenever $\chi_r + \chi_s + \chi_t = 1$, (A. 49)<br />

$\mu(0) = 0$ and $\mu(1) = 1$. (A. 50)<br />

With these properties, µ satisfies the hypotheses on the function f in the analytic lemma which now follows.<br />

ANALYTIC LEMMA:<br />

If f : [0, 1] → [0, 1] is a function such that<br />

(1) f (0) = 0,<br />

(2) f is nondecreasing, i.e., if a < b then f (a) ≤ f (b),<br />

(3) if a, b, c ∈ [0, 1] and a + b + c = 1, then f (a) + f (b) + f (c) = 1,<br />

then f is the identity function: f (a) = a for all a ∈ [0, 1].<br />

Proof<br />

Choosing c = 0 in (3) we have b = 1 − a, which with (1) yields<br />

$f(a) = 1 - f(1 - a)$ (A. 51)<br />

for all values a ∈ [0, 1]. Next, choosing c = 1 − (a + b) in (3) and applying (A. 51) gives<br />

$f(a) + f(b) = 1 - f(1 - (a + b)) = 1 - (1 - f(a + b)) = f(a + b)$ (A. 52)<br />

for all a, b, a + b ∈ [0, 1].



Iteration of (A. 52) yields, for $n \in \mathbb{N}^+$,<br />

$n f(a) = f(na)$ for $na \leq 1$. (A. 53)<br />

Taking $a = \tfrac{1}{n}$ we see that<br />

$f\left(\tfrac{1}{n}\right) = \frac{f(1)}{n} = \frac{1}{n}$, (A. 54)<br />

and iterating again, we have<br />

$f\left(\tfrac{m}{n}\right) = \frac{m}{n}$ for $m, n \in \mathbb{N}$, $m < n$, (A. 55)<br />

or, indeed,<br />

$f(a) = a \quad \forall\, a \in \mathbb{Q} \cap [0, 1]$. (A. 56)<br />
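The iteration behind (A. 53) to (A. 55) can be spelled out as a short induction. The following LaTeX sketch assumes the `amsmath` package and uses an auxiliary index k that does not appear in the text above.

```latex
% Induction step for (A. 53): assume k f(a) = f(ka) and (k+1) a \le 1.
\begin{align*}
f\bigl((k+1)a\bigr) &= f(ka + a) = f(ka) + f(a) && \text{by (A. 52)} \\
                    &= k f(a) + f(a) = (k+1) f(a). \\
\intertext{Setting $a = 1$ in (A. 51) and using hypothesis (1) gives
$f(1) = 1 - f(0) = 1$, so with $a = \tfrac{1}{n}$}
n f\bigl(\tfrac{1}{n}\bigr) &= f(1) = 1, \\
f\bigl(\tfrac{m}{n}\bigr) &= m f\bigl(\tfrac{1}{n}\bigr) = \tfrac{m}{n}
  \qquad (m \le n).
\end{align*}
```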

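From (2) and (A. 54) we see that<br />

$\lim_{a \downarrow 0} f(a) = \inf_{a > 0} f(a) = 0$, (A. 57)<br />

and, using again (A. 52),<br />

$\lim_{a \downarrow 0} f(a + b) = f(b) \quad \forall\, 0 \leq b < 1$. (A. 58)<br />

The limit from the left follows in the same way from $f(b - a) = f(b) - f(a)$. Therefore, f is continuous, and<br />

$f(a) = a \quad \forall\, a \in [0, 1]$. □ (A. 59)<br />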
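The role of hypothesis (3) can be illustrated with exact rational arithmetic. The Python sketch below tests the three hypotheses of the lemma on a finite grid only, so passing is a necessary condition and proves nothing by itself. The names `satisfies_hypotheses_on_grid`, `identity` and `smoothstep` are hypothetical and chosen for this example.

```python
from fractions import Fraction
from itertools import product


def satisfies_hypotheses_on_grid(f, n=12):
    """Check hypotheses (1)-(3) of the lemma on the grid {0, 1/n, ..., 1}."""
    grid = [Fraction(k, n) for k in range(n + 1)]
    # (1) f(0) = 0
    if f(grid[0]) != 0:
        return False
    # (2) f is nondecreasing on consecutive grid points
    if any(f(a) > f(b) for a, b in zip(grid, grid[1:])):
        return False
    # (3) f(a) + f(b) + f(c) = 1 whenever a + b + c = 1
    for a, b in product(grid, repeat=2):
        c = 1 - a - b
        if c >= 0 and f(a) + f(b) + f(c) != 1:
            return False
    return True


def identity(a):
    return a


def smoothstep(a):
    # Satisfies (1), (2) and f(a) + f(1 - a) = 1, but not the full (3).
    return 3 * a**2 - 2 * a**3


print(satisfies_hypotheses_on_grid(identity))
print(satisfies_hypotheses_on_grid(smoothstep))
```

The identity passes, while the polynomial 3a² − 2a³ is rejected: it obeys (A. 51), which is the c = 0 case of (3), but already fails at a = b = c = 1/3, where the three values sum to 7/9. This suggests that it is the full three-term condition, and not only (A. 51), that forces f to be the identity.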

A. 5 SUMMARY<br />

In this appendix we proved Gleason’s theorem for pure states, represented by extreme measures µ.<br />

In section A. 2 we proved that if Gleason’s theorem for pure states is true for any 3 - dimensional<br />

real Hilbert space, it is also true for any complex Hilbert space with dim H > 2. In A. 3. 1 we showed<br />

that µ is a nonincreasing function in θ, and in A. 3. 2 we proved that µ does not depend on ϕ.<br />

Finally, by proving the analytic lemma we showed that there can only be one form for the measure<br />

µ which satisfies these requirements for all P ∈ P (H) and that is the quantum mechanical one,<br />

i.e., in accordance with $\cos^2 \theta$.


WORKS CONSULTED<br />

Most subjects in these lecture notes are also found in Redhead (1987), Krips (1987), Hughes (1989),<br />

D’Espagnat (1989) and Bub (1997).<br />

Dickson (1998) is an accessible monograph.<br />

Jammer (1974) is a survey of the research in foundations of quantum mechanics in historical perspective,<br />

from the beginnings of quantum mechanics until 1974. Although it therefore omits later developments, Jammer remains indispensable<br />

for every student seriously studying foundations of quantum mechanics.<br />

Bell (1987) contains his articles on quantum mechanics.<br />

Von Neumann’s Grundlagen (1932) is a masterpiece, which is still fully worth studying.<br />

Prugovečki (2006) is a modernized and more systematic version, but it evades subjects of interpretation<br />

and is mainly a mathematical reference book.<br />

Busch, Lahti and Mittelstaedt (1996) is a monograph on quantum mechanical measurement theory.<br />

Hooker (1975) is a collection of important articles of algebraic and logical signature.<br />

Wheeler and Zurek (1983) is an extensive collection of photocopies of important articles (EPR, Bohr,<br />

Bohm, Everett, etc.).<br />

Fine (1986) is the unequalled monograph on Einstein and quantum mechanics.<br />

Contributions to the research of foundations of quantum mechanics from Utrecht University are the<br />

work of Hilgevoord and Uffink and vice versa, of Dieks and Vermaas about the modal interpretation<br />

of quantum mechanics and Uffink’s thesis (1990) about uncertainty relations.


BIBLIOGRAPHY<br />

Albers, D.J., Alexanderson, G.L., Reid, C. (1990) More Mathematical People : Contemporary Conversations<br />

Boston: Harcourt Brace Jovanovich<br />

Araki, H., Yanase, M.M. (1960) ‘Measurement of Quantum Mechanical Operators’<br />

Physical Review 120 (2) pp. 622-626<br />

Aspect, A., Dalibard, J., Roger, G. (1982) ‘Experimental Test of Bell’s Inequalities Using Time -<br />

Varying Analyzers’<br />

Physical Review Letters 49 (25) pp. 1804-1807<br />

Belinfante, F.J. (1973) A Survey of Hidden - Variables Theories<br />

Oxford: Pergamon Press<br />

Bell, J.S. (1964) ‘On the Einstein Podolsky Rosen Paradox’<br />

Physics 1 (3) pp. 195-200, repr. in Wheeler and Zurek (1983)<br />

Bell, J.S. (1966) ‘On the Problem of Hidden Variables in Quantum Mechanics’<br />

Reviews of modern physics 38 pp. 447-452<br />

Bell, J.S. (1971) ‘Introduction to the hidden - variables question’<br />

In d’Espagnat (1971), repr. in Bell (1987)<br />

Bell, J.S. (1975) ‘The Theory of Local Beables’<br />

Presented at the sixth GIFT Seminar, Jaca, 2 - 7 June 1975, repr. in Bell (1987)<br />

Bell, J.S. (1982) ‘On the impossible pilot wave’<br />

Foundations of Physics 12 (10) pp. 989-999<br />

Bell, J.S. (1987) Speakable and Unspeakable in Quantum Mechanics<br />

Cambridge: Cambridge University Press<br />

Bell, J.S. (1990) ‘Against measurement’<br />

Physics World (August) pp. 33-40<br />

Beltrametti, E.G., Cassinelli, G. (1981) The Logic of Quantum Mechanics<br />

Reading: Addison - Wesley Publishing Company<br />

Birkhoff, G., Von Neumann, J. (1936) ‘The Logic of Quantum Mechanics’<br />

The Annals of Mathematics, Second Series 37 (4) pp. 823-843<br />

Bohm, D.J. (1952) ‘A Suggested Interpretation of the Quantum Theory in Terms of “Hidden” Variables.<br />

I, II’<br />

Physical Review 85 (2) pp. 166-179, pp. 180-193



Bohm, D.J., Aharonov, Y. (1957) ‘Discussion of Experimental Proof for the Paradox of Einstein,<br />

Rosen, and Podolsky’<br />

Physical Review 108 (4) pp. 1070-1076<br />

Bohm, D.J. (1981) Wholeness and the implicate order<br />

London: Routledge & Kegan Paul<br />

Bohm, D.J., Peat, F.D. (1989) Science, order, and creativity<br />

London: Routledge<br />

Bohr, N.H.D. (1928) ‘The Quantum Postulate and the Recent Development of Atomic Theory’<br />

Nature 121 (3050) pp. 580-590<br />

Bohr, N.H.D. (1931) ‘Maxwell and Modern Theoretical Physics’<br />

Nature 128 (3234) pp. 691-692<br />

Bohr, N.H.D. (1934) Atomic Theory and the Description of Nature<br />

New York: The Macmillan Company<br />

Bohr, N.H.D. (1935a) ‘Quantum Mechanics and Physical Reality’<br />

Nature 136 p. 65<br />

Bohr, N.H.D. (1935b) ‘Can Quantum - Mechanical Description of Physical Reality Be Considered<br />

Complete?’<br />

Physical Review 48 (8) pp. 696-702<br />

Bohr, N.H.D. (1939) ‘The causality problem in atomic physics’<br />

In Bohr, N.H.D. (1939) New Theories in Physics<br />

Paris: International Institute of Intellectual Co - operation<br />

Bohr, N.H.D. (1947) ‘Newton’s Principles and Modern Atomic Mechanics’<br />

In The Royal Society of London (1947) Newton Tercentenary Celebrations. 15 - 19 July 1946<br />

Cambridge: Cambridge University Press<br />

Bohr, N.H.D. (1949) ‘Discussion with Einstein on epistemological problems in atomic physics’<br />

In Schilpp (1949), repr. in Wheeler and Zurek (1983)<br />

Bopp, F.A. (1947) ‘Quantenmechanische Statistik und Korrelationsrechnung’<br />

Zeitschrift für Naturforschung A 2 pp. 202-216<br />

Born, M., Jordan, P. (1925) ‘Zur Quantenmechanik’<br />

Zeitschrift für Physik 34 (1) pp. 858-888<br />

Eng. tr. (abridged): ‘On Quantum mechanics’<br />

In Van der Waerden (1967)<br />

Bródy, F., Vámos, T. (eds) (1995) The Neumann Compendium<br />

Singapore: World Scientific Publishing Company



Broglie, L.V.P.R. de (1928) ‘La nouvelle dynamique des quanta’<br />

La Commission Administrative de l’Institut International de Physique Solvay (1928) Électrons et<br />

Photons: Rapports et Discussions du Cinquième Conseil de Physique tenu à Bruxelles du 24<br />

au 29 Octobre 1927 sous les Auspices de l’Institut International de Physique Solvay<br />

Paris: Gauthier - Villars<br />

Eng. tr.: ‘The new dynamics of quanta’<br />

In Bacciagaluppi, G., Valentini, A. (2009) Quantum Theory at the Crossroads : Reconsidering<br />

the 1927 Solvay Conference<br />

Cambridge: Cambridge University Press<br />

Bub, J., Clifton, R.K. (1996) ‘A Uniqueness Theorem for ‘No Collapse’ Interpretations of Quantum<br />

Mechanics’<br />

Studies in the History and Philosophy of Modern Physics B 27 (2) pp. 181-219<br />

Bub, J., Clifton, R.K., Goldstein, S. (2000) ‘Revised Proof of the Uniqueness Theorem for ‘No Collapse’<br />

Interpretations of Quantum Mechanics’<br />

Studies in the History and Philosophy of Modern Physics B 31 pp. 95-98<br />

Bub, J. (1997) Interpreting the Quantum World<br />

Cambridge: Cambridge University Press<br />

Busch, P.S., Grabowski, M.P., and Lahti, P.J. (1995) Operational Quantum physics<br />

Berlin: Springer - Verlag<br />

Busch, P., Lahti, P.J., Mittelstaedt, P. (1991) The Quantum Theory of Measurement<br />

Berlin: Springer - Verlag<br />

Capasso, V., Fortunato, D., Selleri, F. (1973) ‘Sensitive Observables of Quantum Mechanics’<br />

International Journal of Theoretical Physics 7 (5) pp. 319-326<br />

Clauser, J.F., Horne, M.A., Shimony, A., Holt, R.A. (1969) ‘Proposed Experiment to test Local Hidden<br />

- Variable Theories’<br />

Physical Review Letters 23 (15) pp. 880-884<br />

Clifton, R.K., Butterfield, J.N., Redhead, M.L.G. (1990) ‘Nonlocal Influences and Possible Worlds –<br />

A Stapp in the Wrong Direction’<br />

British Journal for the Philosophy of Science 41 (1) pp. 5-58<br />

Condon, E.U. (1929) ‘Remarks on uncertainty principles’<br />

Science 69 pp. 573-574<br />

Cooke, R.M., Hilgevoord, J. (1979) ‘Correspondence, Equivalence and Completeness’<br />

Epistemological Letters (March) pp. 42-54<br />

Cooke, R.M., Keane, M.S., Moran, W. (1985) ‘An elementary proof of Gleason’s theorem’<br />

Mathematical Proceedings of the Cambridge Philosophical Society 98 pp. 117-128<br />

Cushing, J.T. (1994) Quantum Mechanics : Historical Contingency and the Copenhagen Hegemony<br />

Chicago: The University of Chicago Press



Daneri, A., Loinger, A., Prosperi, G.M. (1962) ‘Quantum Theory of Measurement and Ergodicity<br />

Conditions’<br />

Nuclear Physics 33 (1962) pp. 297-319<br />

De Muynck, W.M. (1986) ‘The Bell Inequalities and their Irrelevance to the Problem of Locality in<br />

Quantum Mechanics’<br />

Physics Letters A 114 (2) pp. 65-67<br />

De Muynck, W.M. (1996) ‘Can We Escape from Bell’s Conclusion that Quantum Mechanics Describes<br />

a Non - Local Reality?’<br />

Studies in History and Philosophy of Science Part B: Studies in History and Philosophy of<br />

Modern Physics 27 (3) pp. 315-330<br />

DeWitt, B.S. (1970) ‘Quantum mechanics and reality’<br />

Physics Today 23 (9) pp. 30-40<br />

DeWitt, B.S. (1971) ‘The Many - Universes Interpretation of Quantum Mechanics’<br />

In d’Espagnat (1971), repr. in DeWitt and Graham (1973)<br />

DeWitt, B.S., Graham, R.N. (eds) (1973) The Many - Worlds Interpretation of Quantum Mechanics<br />

Princeton: Princeton University Press<br />

Dickson, W.M. (1998) Quantum Chance and Non - locality : Probability and Non - locality in the<br />

Interpretations of Quantum Mechanics<br />

Cambridge: Cambridge University Press<br />

Dieks, D.G.B.J. (1983) ‘Stochastic Locality and Conservation Laws’<br />

Lettere al Nuovo Cimento 38 (13) pp. 443-447<br />

Dieks, D.G.B.J. (1989) ‘Resolution of the Measurement Problem through Decoherence of the Quantum<br />

State’<br />

Physics Letters A 142 (8,9) pp. 439-446<br />

Dieks, D.G.B.J. and Vermaas, P.E. (eds) (1998) The Modal Interpretation of Quantum Mechanics<br />

Dordrecht: Kluwer Academic Publishers<br />

Dirac, P.A.M. (1925) ‘The Fundamental Equations of Quantum Mechanics’<br />

Proceedings of the Royal Society of London. Series A, Containing Papers of a Mathematical<br />

and Physical Character 109 (752) pp. 642-653<br />

Dirac, P.A.M. (1958) The Principles of Quantum Mechanics<br />

Oxford: at the Clarendon Press<br />

Dirac, P.A.M. (1963) ‘The Evolution of the Physicist’s Picture of Nature’<br />

Scientific American 208 (5) pp. 45-53<br />

Eberhard, P.H. (1977) ‘Bell’s Theorem without Hidden Variables’<br />

Il Nuovo Cimento B 38 (1) pp. 75-80



Einstein, A. (1921) ‘Geometrie und Erfahrung’<br />

Sitzungsberichte der Preussischen Akademie der Wissenschaften pp. 123-130<br />

Eng. tr.: Bargmann, S. (transl) ‘Geometry and Experience’<br />

In Janssen, M., Schulmann, R., Illy, J., Lehner, C., Buchwald, D. (eds) (2002) The Collected<br />

Papers of Albert Einstein, Volume 7 : The Berlin Years: Writings, 1918 - 1921<br />

Princeton: Princeton University Press<br />

Einstein, A. (1934) Mein Weltbild<br />

Amsterdam: Querido Verlag<br />

Eng. tr.: Bargmann, S. (transl), Seelig, C. (ed) (1954) Ideas and opinions<br />

New York: Bonanza Books<br />

Einstein, A., Podolsky, B., Rosen, N. (1935) ‘Can Quantum - Mechanical Description of Physical<br />

Reality Be Considered Complete?’<br />

Physical Review 47 (10) pp. 777-780<br />

Einstein, A., Born, M., Born, H. (1971) The Born - Einstein letters : correspondence between Albert<br />

Einstein and Max and Hedwig Born from 1916 to 1955 with commentaries by Max Born<br />

London: The Macmillan Press<br />

Espagnat, B. d’ (ed) (1971) Foundations of Quantum Mechanics : Proceedings of the International<br />

School of Physics “Enrico Fermi”, held at Varenna, 29th June-11th July, 1970, Course IL<br />

New York: Academic Press<br />

Espagnat, B. d’ (1989) Conceptual Foundations Of Quantum Mechanics<br />

New York: Perseus Books<br />

Everett, H. III (1957) ‘The Theory of the Universal Wave Function’<br />

In DeWitt and Graham (1973)<br />

Everett, H. III (1957) “‘Relative State” Formulation of Quantum Mechanics’<br />

Reviews of Modern Physics 29 (3) pp. 454-462<br />

Fine, A.I. (1982) ‘Hidden Variables, Joint Probability, and the Bell Inequalities’<br />

Physical Review Letters 48 (5) pp. 291-295<br />

Fine, A. (1986) The shaky game : Einstein, realism and the quantum theory<br />

Chicago: University of Chicago Press<br />

Folse, H.J. (1985) The philosophy of Niels Bohr : the framework of complementarity<br />

Amsterdam: North - Holland Physics Publishing<br />

Fraassen, B.C. van (1973) ‘Semantic Analysis of Quantum Logic’<br />

In Hooker, C.A. (ed) (1973) Contemporary Research in the Foundations and Philosophy of<br />

Quantum Theory<br />

Dordrecht: D. Reidel Publishing Company<br />

Fraassen, B.C. van (1979) ‘Hidden Variables and the Modal Interpretation of Quantum Theory’<br />

Synthese 42 (1) pp. 155-165



Frank, P.G. (1949) Modern Science and Its Philosophy<br />

Cambridge: Harvard University Press<br />

Freedman, S.J., Clauser, J.F. (1972) ‘Experimental Test of Local Hidden - Variable Theories’<br />

Physical Review Letters 28 (14) pp. 938-941<br />

Ghirardi, G.C., Rimini, A., Weber, T. (1980) ‘A General Argument against Superluminal Transmission<br />

through the Quantum Mechanical Measurement Process’<br />

Lettere al Nuovo Cimento 27 (10) pp. 293-298<br />

Ghirardi, G.C., Rimini, A., Weber, T. (1986) ‘Unified dynamics for microscopic and macroscopic<br />

systems’<br />

Physical Review D 34 (2) pp. 470-491<br />

Gleason, A.M. (1957) ‘Measures on the Closed Subspaces of a Hilbert space’<br />

Journal of Mathematics and Mechanics 6 pp. 885-893<br />

Gottfried, K. (1989) ‘Does Quantum Mechanics describe the Collapse of the Wavefunction?’<br />

Unpublished contribution to the 1989 Conference, International School of History of Science,<br />

Erice, Italy, 5 - 14 August<br />

Greenberger, D.M., Horne, M.A., Zeilinger, A. (1989) Going Beyond Bell’s Theorem<br />

In Kafatos, M.C. (ed) (1989) Bell’s Theorem, Quantum Theory and Conceptions of the Universe<br />

Dordrecht: Kluwer Academic Publishers<br />

http://arxiv.org/abs/0712.0921<br />

Groenewold, H.J. (1946) ‘On the Principles of Elementary Quantum Mechanics’<br />

Physica 12 (7) pp. 405-460<br />

Haag, R. (1990) ‘Fundamental Irreversibility and the Concept of Events’<br />

Communications in Mathematical Physics 132 pp. 245-251<br />

Healey, R.A. (1989) The philosophy of quantum mechanics : An interactive interpretation<br />

Cambridge: Cambridge University Press<br />

Heisenberg, W. (1925) ‘Über quantentheoretische Umdeutung kinematischer und mechanischer<br />

Beziehungen’<br />

Zeitschrift für Physik 33 (1) pp. 879-893<br />

Eng. tr.: ‘Quantum - theoretical re - interpretation of kinematic and mechanical relations’<br />

In Van der Waerden, B.L. (1967)<br />

Heisenberg, W.K. (1927) ‘Über den anschaulichen Inhalt der quantentheoretischen Kinematik und<br />

Mechanik’<br />

Zeitschrift für Physik 43 (3/4) pp. 172-198<br />

Eng. tr.: ‘The physical content of quantum kinematics and mechanics’<br />

In Wheeler and Zurek (1983)



Heisenberg, W.K. (1930) Die Physikalischen Prinzipien der Quantentheorie<br />

Leipzig: Verlag von S. Hirzel<br />

Eng. tr.: Eckart, C., Hoyt, F.C. (transl) (1930) The Physical Principles of Quantum Theory<br />

New York: Dover Publications<br />

Heisenberg, W. (1963) Niels Bohr Library and Archives<br />

Interview with Werner Heisenberg by T. S. Kuhn at the Max Planck Institute, Munich, Germany,<br />

February 25. Transcript Session VIII<br />

http://www.aip.org/history/ohilist/4661_8.html<br />

Heitler, W.H. (1970) Der Mensch und die naturwissenschaftliche Erkenntnis<br />

Braunschweig: Friedrich Vieweg & Sohn Verlagsgesellschaft<br />

Hey, T., Walters, P. (2003) The New Quantum Universe<br />

Cambridge: Cambridge University Press<br />

Hilgevoord, J., Uffink, J.B.M. (1988) ‘The mathematical expression of the uncertainty principle’<br />

In Merwe, A. van der, Selleri, F., Tarozzi, G. (eds) (1988) Microphysical Reality and Quantum<br />

Formalism. Volume I<br />

Dordrecht: Kluwer Academic Publishers<br />

Hilgevoord, J., Uffink, J.B.M. (1990) ‘A new view on the uncertainty principle’<br />

In Miller, A.I. (ed) (1990) Sixty - Two years of Uncertainty : Historical, Philosophical and<br />

Physical Inquiries into the Foundations of Quantum Mechanics<br />

New York: Plenum Press<br />

Hilgevoord, J. (2002) ‘Time in quantum mechanics’<br />

American Journal of Physics 70 (3) pp. 301-306<br />

Holevo, A.S. (1982) Probabilistic and Statistical Aspects of Quantum Theory<br />

Amsterdam: North - Holland Publishing Company<br />

Holland, P.R. (1993) The Quantum Theory of Motion : An Account of the de Broglie - Bohm Causal<br />

Interpretation of Quantum Mechanics<br />

Cambridge: Cambridge University Press<br />

Home, D., Selleri, F. (1991) ‘Bell’s Theorem and the EPR Paradox’<br />

La Rivista del Nuovo Cimento 14 (9) pp. 1-95<br />

’t Hooft, G. (1997) In search of the ultimate building blocks<br />

Cambridge: Cambridge University Press<br />

Hooker, C.A. (ed) (1975) The Logico - Algebraic Approach to Quantum Mechanics. Volume I: the<br />

Historical Evolution<br />

Dordrecht: D. Reidel Publishing Company<br />

Hughes, R.I.G. (1989) The Structure and Interpretation of Quantum Mechanics<br />

Cambridge: Harvard University Press



Isham, C.J. (1995) Lectures on Quantum Theory : Mathematical and Structural Foundations<br />

River Edge: Imperial College Press<br />

Jacques, V., Wu, E., Grosshans, F., Treussart, F., Grangier, P., Aspect, A., Roch, J-F. (2007) ‘Experimental<br />

realization of Wheeler’s delayed - choice gedanken experiment’<br />

Science 315 (5814) pp. 966-968<br />

Jammer, M. (1974) The Philosophy of Quantum Mechanics : The Interpretations of Quantum Mechanics<br />

in Historical Perspective<br />

New York: John Wiley & Sons<br />

Jammer, M. (1990) ‘John Stewart Bell and His Work - On the Occasion of His Sixtieth Birthday’<br />

Foundations of Physics 20 (10) pp. 1139-1145<br />

Jammer, M. (1992) ‘John Stewart Bell and the Debate on the Significance of His Contributions to<br />

the Foundations of Quantum Mechanics’<br />

In Merwe, A. van der, Selleri, F., Tarozzi, G. (eds) (1992) International Conference on Bell’s<br />

Theorem and the Foundations of Modern Physics<br />

Singapore: World Scientific Publishing<br />

Jarrett, J.P. (1984) ‘On the Physical Significance of the Locality Conditions in the Bell Arguments’<br />

Noûs 18 (4) pp. 569-589<br />

Jauch, J.M. (1968) Foundations of Quantum Mechanics<br />

Reading: Addison - Wesley Educational Publishers<br />

Kalckar, J. (ed) (1996) Niels Bohr - Collected Works : Volume 7 - Foundations of Quantum Physics II<br />

(1933 - 1958)<br />

Amsterdam: Elsevier Science<br />

Kampen, N.G. van (1988) ‘Ten Theorems about Quantum Mechanical Measurements’<br />

Physica A 153 pp. 97-113<br />

Kennard, E.H. (1927) ‘Zur Quantenmechanik einfacher Bewegungstypen’<br />

Zeitschrift für Physik 44 (4/5) pp. 326-352<br />

Kochen, S., Specker, E.P. (1967) ‘The Problem of Hidden Variables in Quantum Mechanics’<br />

Journal of Mathematics and Mechanics 17 (1) pp. 59-87<br />

Kochen, S. (1985) ‘A New Interpretation of Quantum Mechanics’<br />

In Lahti, P.J., Mittelstaedt, P. (eds) (1985) Symposium on the foundations of modern physics<br />

1985 : 50 years of the Einstein - Podolsky - Rosen Gedankenexperiment<br />

Singapore: World Scientific Publishing Company<br />

Krips, H. (1987) The Metaphysics of Quantum Theory<br />

Oxford: Clarendon Press<br />

Landau, L.D., Lifshitz, E.M. (1958) Quantum Mechanics : Non - Relativistic theory<br />

London: Pergamon Press



Landau, H.J., Pollak, H.O. (1961) ‘Prolate Spheroidal Wave Functions, Fourier Analysis and Uncertainty<br />

- II’<br />

The Bell System Technical Journal 40 pp. 65-84<br />

London, F., Bauer, E. (1939) La Théorie de l’Observation en Mécanique Quantique<br />

Paris: Hermann<br />

Eng. tr.: ‘The Theory of Observation in Quantum Mechanics’<br />

In Wheeler and Zurek (1983)<br />

Lüders, G. (1951) ‘Über die Zustandsänderung durch den Meßprozeß’<br />

Annalen der Physik 443 (5 - 8) pp. 322-328<br />

Eng. tr.: Kirkpatrick, K.A. (transl) (2006) ‘Concerning the state - change due to the measurement<br />

process’<br />

Annalen der Physik 15 (9) pp. 663-670<br />

Maczynski, M.J. (1971) ‘Boolean Properties of Observables in Axiomatic Quantum Mechanics’<br />

Reports on Mathematical Physics 2 (2) pp. 135-150<br />

Mermin, N.D. (1993) ‘Hidden variables and the two theorems of John Bell’<br />

Reviews of Modern Physics 65 (3) pp. 803-815<br />

Meyer, D.A. (1999) ‘Finite precision measurement nullifies the Kochen - Specker theorem’<br />

Physical Review Letters 83 pp. 3751-3754<br />

Meyer, D.A. (2003) ‘Coloring, quantum mechanics, and Euclid’<br />

Pdf file: math.ucsd.edu/~dmeyer/research/talks/cqmE.pdf<br />

Miller, A.I. (1990) (ed) Sixty - two Years of Uncertainty : Historical, Philosophical and Physical<br />

Inquiries into the Foundations of Quantum Mechanics<br />

New York: Plenum Press<br />

Miller, W.A., Wheeler, J.A. (1984) ‘Delayed - Choice Experiments and Bohr’s Elementary Quantum<br />

Phenomenon’<br />

In Nakajima, S., Murayama, Y., Tonomura, A. (eds) (1996) Foundations of Quantum Mechanics<br />

in the Light of New Technology<br />

Singapore: World Scientific Publishing<br />

Muller, F.A. (1997a) ‘The Equivalence Myth of Quantum Mechanics–Part I’<br />

Studies in History and Philosophy of Modern Physics 28 (1) pp. 35-61<br />

(1997b) ‘Part II’<br />

ibid. 28 (2) pp. 219-247<br />

(1999) ‘(Addendum)’<br />

ibid. 30 (4) pp. 543-545<br />

Neumann, J. Von (1932) Mathematische Grundlagen der Quantenmechanik<br />

Berlin: Verlag von Julius Springer<br />

Eng. tr.: Beyer, R.T. (transl) (1955) The Mathematical Foundations of Quantum Mechanics<br />

Princeton: Princeton University Press



Pauli, W.E. (1933) Die allgemeinen Prinzipien der Wellenmechanik<br />

Berlin: Verlag von Julius Springer<br />

Eng. tr.: (1950) The General principles of wave mechanics<br />

Urbana - Champaign: University of Illinois Press<br />

Penrose, R. (1996) ‘On Gravity’s Role in Quantum State Reduction’<br />

General Relativity and Gravitation 28 (5) pp. 581-600<br />

Peres, A. (1993) Quantum Theory: Concepts and Methods<br />

Dordrecht: Kluwer Academic Publishers<br />

Petersen, A. (1963) ‘The Philosophy of Niels Bohr’<br />

Bulletin of the Atomic Scientists 19 (7) pp. 8-14<br />

Petersen, A. (1968) Quantum Physics and the Philosophical Tradition<br />

Cambridge: M.I.T. Press<br />

Piron, C. (1976) Foundations of Quantum Physics<br />

Reading: W.A. Benjamin<br />

Prugovečki, E. (2006) Quantum Mechanics in Hilbert Space<br />

Mineola: Dover Publications<br />

Przibram, K. (ed) (1963) Briefe zur Wellenmechanik : Schrödinger, Planck, Einstein, Lorentz<br />

Wien: Springer - Verlag<br />

Eng. tr.: Przibram, K. (ed) (1963) Letters on wave mechanics : Schrödinger, Planck, Einstein,<br />

Lorentz<br />

New York: Philosophical Library<br />

Rauch, H., Werner, S.A. (2000) Neutron Interferometry : Lessons in Experimental Quantum Mechanics<br />

Oxford: Oxford University Press<br />

Redhead, M.L.G. (1987) Incompleteness, Nonlocality and Realism : A Prolegomenon to the Philosophy<br />

of Quantum Mechanics<br />

Oxford: Clarendon Press<br />

Robertson, H.P. (1929) ‘The Uncertainty Principle’<br />

Physical Review 34 p. 163<br />

Scheibe, E., Sykes, J.B., (transl) (1973) The Logical Analysis of Quantum Mechanics<br />

Oxford: Pergamon Press<br />

Schiff, L.I. (1949) Quantum Mechanics<br />

New York: McGraw - Hill<br />

Schilpp, P.A. (ed) (1949) Albert Einstein : Philosopher - Scientist<br />

Evanston: The Library of Living Philosophers



Schmidt, E. (1907) ‘Zur Theorie der linearen und nichtlinearen Integralgleichungen. I. Teil’<br />

Mathematische Annalen 63 pp. 433-476<br />

(1907) ‘Zweite Abhandlung’<br />

ibid. 64 pp. 161-174<br />

(1907) ‘III. Teil’<br />

ibid. 65 pp. 370-399<br />

Schrödinger, E.R.J.A. (1926) ‘An Undulatory Theory of the Mechanics of Atoms and Molecules’<br />

The Physical Review 28 (6) pp. 1049-1070<br />

Schrödinger, E.R.J.A. (1930) ‘Zum Heisenbergschen Unschärfeprinzip’<br />

Sitzungsberichte der Preußischen Akademie der Wissenschaften. Physikalisch - mathematische<br />

Klasse pp. 296-303<br />

Schrödinger, E.R.J.A. (1935a) ‘Discussion of Probability Relations between Separated Systems’<br />

Mathematical Proceedings of the Cambridge Philosophical Society 31 (4) pp. 555-563<br />

Schrödinger, E.R.J.A. (1935b) ‘Die gegenwärtige Situation in der Quantenmechanik’<br />

Naturwissenschaften 23 (48) pp. 807-812, (49) pp. 823-828, (50) pp. 844-849<br />

Eng. tr.: Trimmer, J.D. (transl) (1980) ‘The Present Situation in Quantum Mechanics: A Translation<br />

of Schrödinger’s “Cat Paradox”’<br />

Proceedings of the American Philosophical Society 124 (5) pp. 323-338<br />

Repr. in Wheeler and Zurek (1983)<br />

Shimony, A. (1984) ‘Controllable and Uncontrollable Non - Locality’<br />

In Kamefuchi, S., et al. (eds) Proceedings of the International Symposium : Foundations of<br />

Quantum Mechanics in the Light of New Technology<br />

Tokyo: Physical Society of Japan<br />

Shimony, A. (1989) ‘Search for a Worldview Which Can Accommodate Our Knowledge of Microphysics’<br />

In Cushing, J.T., McMullin, E. (eds) Philosophical Consequences of Quantum Theory : Reflections<br />

on Bell’s Theorem<br />

Notre Dame: University of Notre Dame Press<br />

Shimony, A. (1995) ‘Degree of entanglement’<br />

In Greenberger, D.M., Zeilinger, A. (eds) Fundamental Problems in Quantum Theory : In<br />

Honor of Professor John A. Wheeler<br />

New York: New York Academy of Sciences<br />

Stapp, H.P. (1975) ‘Bell’s Theorem and World Process’<br />

Il Nuovo Cimento B 29 (2) pp. 270-276<br />

Stapp, H.P. (1977) ‘Are Superluminal Connections Necessary?’<br />

Il Nuovo Cimento B 40 (1) pp. 191-205<br />

Stone, M.H. (1932) ‘On One - Parameter Unitary Groups in Hilbert Space’<br />

The Annals of Mathematics, Second Series 33 (3) pp. 643-648



Suppes, P., Zanotti, M. (1976) ‘On the Determinism of Hidden Variable Theories with Strict Correlation<br />

and Conditional Statistical Independence of Observables’<br />

In Suppes, P. (ed) Logic and Probability in Quantum Mechanics<br />

Dordrecht: D. Reidel Publishing Company<br />

Svetlichny, G., Redhead, M.L.G., Brown, H.R., Butterfield, J. (1988) ‘Do the Bell Inequalities Require<br />

the Existence of Joint Probability Distributions?’<br />

Philosophy of Science 55 (3) pp. 387-401<br />

Tkadlec, J. (2000) ‘Diagrams of Kochen - Specker Type Constructions’<br />

International Journal of Theoretical Physics 39 (3) pp. 921-926<br />

Uffink, J.B.M., Hilgevoord, J. (1985) ‘Uncertainty Principle and Uncertainty Relations’<br />

Foundations of Physics 15 (9) pp. 925-944<br />

Uffink, J.B.M., Hilgevoord, J. (1988) ‘Interference and Distinguishability in Quantum Mechanics’<br />

Physica B 151 pp. 309-313<br />

Uffink, J.B.M. (1990) Measures of Uncertainty and the Uncertainty Principle<br />

Utrecht: Rijksuniversiteit te Utrecht, Dissertation<br />

Vermaas, P.E., Dieks, D.G.B.J. (1995) ‘The Modal Interpretation of Quantum Mechanics and its<br />

Generalization to Density Operators’<br />

Foundations of Physics 25 (1) pp. 145-158<br />

Vermaas, P.E. (1999) A Philosopher’s Understanding of Quantum Mechanics : Possibilities and Impossibilities<br />

of a Modal Interpretation<br />

Cambridge: Cambridge University Press<br />

Vigier, J.-P., Dewdney, C., Holland, P.R., Kyprianidis, A. (1987) ‘Causal particle trajectories and the<br />

interpretation of quantum mechanics’<br />

In Hiley, B.J., Peat, F.D. (eds) (1987) Quantum implications : essays in honour of David Bohm<br />

London: Routledge & Kegan Paul<br />

Waerden, B.L. Van der (ed) (1967) Sources of Quantum mechanics<br />

Amsterdam: North - Holland Publishing Company<br />

Weihs, G., Jennewein, T., Simon, C., Weinfurter, H., Zeilinger, A. (1998) ‘Violation of Bell’s Inequality<br />

under Strict Einstein Locality Conditions’<br />

Physical Review Letters 81 (23) pp. 5039-5043<br />

Wheatley, M.J. (2001) Leadership and the New Science : Discovering Order in a Chaotic World<br />

San Francisco: Berrett - Koehler Publishers<br />

Wheeler, J.A. (1957) ‘Assessment of Everett’s “Relative State” Formulation of Quantum Theory’<br />

Reviews of Modern Physics 29 (3) pp. 463-465<br />

Wheeler, J.A., Zurek, W.H. (eds) (1983) Quantum Theory and Measurement<br />

Princeton: Princeton University Press



Wick, G.C., Wightman, A.S., Wigner, E.P. (1952) ‘The Intrinsic Parity of Elementary Particles’<br />

Physical Review 88 (1) pp. 101-105<br />

Wigner, E.P. (1952) ‘Die Messung quantenmechanischer Operatoren’<br />

Zeitschrift für Physik 133 pp. 101-108<br />

Wigner, E.P. ‘Remarks on the mind - body question’<br />

In Good, I.J. (1962) The scientist speculates : an anthology of partly - baked ideas<br />

London: Heinemann<br />

Repr. in Wheeler and Zurek (1983)<br />

Wigner, E.P. (1963) ‘The problem of measurement’<br />

American Journal of Physics 31 (6) pp. 6-15<br />

Wigner, E.P. (1970) ‘On Hidden Variables and Quantum Mechanical Probabilities’<br />

American Journal of Physics 38 (8) pp. 1005-1009<br />

Wigner, E.P. (1983) ‘Interpretation of Quantum Mechanics’<br />

In Wheeler and Zurek (1983)<br />

Zukav, G. (1984) The Dancing Wu Li Masters : An Overview of the New Physics<br />

New York: Bantam Books<br />

Zurek, W.H. (1981) ‘Pointer basis of quantum apparatus: Into what mixture does the wave packet<br />

collapse?’<br />

Physical Review D 24 (6) pp. 1516-1525<br />

Zurek, W.H. (1982) ‘Environment - induced superselection rules’<br />

Physical Review D 26 (8) pp. 1862-1880
