
Spectral Theory in Hilbert Space

Lectures fall 2008

Christer Bennewitz


Copyright © 1993–2008 by Christer Bennewitz


Preface

The aim of these notes is to present a reasonably complete exposition of Hilbert space theory, up to and including the spectral theorem for the case of a (possibly unbounded) selfadjoint operator. As an application, eigenfunction expansions for regular and singular boundary value problems of ordinary differential equations are discussed. We first do this for the simplest Sturm-Liouville equation, and then, using very similar methods of proof, for a fairly general type of first order systems, which include so-called Hamiltonian systems.

Prerequisites are modest, but a good understanding of Lebesgue integration is assumed, including the concept of absolute continuity. Some previous exposure to linear algebra and basic functional analysis (uniform boundedness principle, closed graph theorem and maybe weak* compactness of the unit ball in a (separable) Banach space) is expected from the reader, but in the two places where we could have used weak* compactness, a direct proof has been given. The standard proofs of the Banach-Steinhaus and closed graph theorems are given in Appendix A. A brief exposition of the Riemann-Stieltjes integral, sufficient for our needs, is given in Appendix B. A few elementary facts about ordinary linear differential equations are used. These are proved in Appendix C. In addition, some facts from elementary analytic function theory are used. Apart from this the lectures are essentially self-contained.

Egevang, August 2008

Christer Bennewitz



Contents

Preface
Chapter 0. Introduction
Chapter 1. Linear spaces
  Exercises for Chapter 1
Chapter 2. Spaces with scalar product
  Exercises for Chapter 2
Chapter 3. Hilbert space
  Exercises for Chapter 3
Chapter 4. Operators
  Exercises for Chapter 4
Chapter 5. Resolvents
  Exercises for Chapter 5
Chapter 6. Nevanlinna functions
Chapter 7. The spectral theorem
  Exercises for Chapter 7
Chapter 8. Compactness
  Exercises for Chapter 8
Chapter 9. Extension theory
  1. Symmetric operators
  2. Symmetric relations
  Exercises for Chapter 9
Chapter 10. Boundary conditions
  Exercises for Chapter 10
Chapter 11. Sturm-Liouville equations
  Exercises for Chapter 11
Chapter 12. Inverse spectral theory
  1. Asymptotics of the m-function
  2. Uniqueness theorems
Chapter 13. First order systems
  Exercises for Chapter 13
Chapter 14. Eigenfunction expansions
  Exercises for Chapter 14
Chapter 15. Singular problems
  Exercises for Chapter 15
Appendix A. Functional analysis
Appendix B. Stieltjes integrals
  Exercises for Appendix B
Appendix C. Linear first order systems
Bibliography


CHAPTER 0

Introduction

Hilbert space is the most immediate generalization to the infinite dimensional case of finite dimensional Euclidean spaces (i.e., essentially $\mathbb{R}^n$ for real, and $\mathbb{C}^n$ for complex vector spaces). Probably its most important uses, and certainly its historical roots, are in spectral theory. Spectral theory for differential equations originates with the method of separation of variables, used to solve many of the equations of mathematical physics. This leads directly to the problem of expanding an ‘arbitrary’ function in terms of eigenfunctions of the reduced equation, which is the central problem of spectral theory. A simple example is that of a vibrating string. The string is supposed to be stretched over an interval $I \subset \mathbb{R}$, be fixed at the endpoints $a$, $b$ and vibrate transversally (i.e., in a direction perpendicular to the interval $I$) in a plane containing $I$. The string can then be described by a real-valued function $u(x, t)$ giving the location at time $t$ of the point of the string which moves on the normal to $I$ through the point $x \in I$. In appropriate units the function $u$ will then (for sufficiently small vibrations, i.e., we are dealing with a linearization of a more accurate model) satisfy the following equation:

\[
\begin{cases}
\dfrac{\partial^2 u}{\partial x^2} = \dfrac{\partial^2 u}{\partial t^2} & \text{(wave equation)} \\[1ex]
u(a, t) = u(b, t) = 0 \text{ for } t > 0 & \text{(boundary conditions)} \\[1ex]
u(x, 0) \text{ and } u_t(x, 0) \text{ given} & \text{(initial conditions).}
\end{cases}
\tag{0.1}
\]

The idea in separating variables is first to disregard the initial conditions and try to find solutions to the differential equation that satisfy the boundary condition and are standing waves, i.e., of the special form $u(x, t) = f(x)g(t)$. The linearity of the equation implies that sums of solutions are also solutions (the superposition principle), so if we can find enough standing waves there is the possibility that any solution might be a superposition of standing waves. By substituting $f(x)g(t)$ for $u$ in (0.1) it follows that $f''(x)/f(x) = g''(t)/g(t)$. Since the left hand side does not depend on $t$, and the right hand side not on $x$, both sides are in fact equal to a constant $-\lambda$. Since the general solution of the equation $g''(t) + \lambda g(t) = 0$ is a linear combination of $\sin(\sqrt{\lambda}\,t)$ and $\cos(\sqrt{\lambda}\,t)$, it follows that

\[
\begin{cases}
-f'' = \lambda f \text{ in } I \\
f(a) = f(b) = 0,
\end{cases}
\tag{0.2}
\]

and that $g(t) = A \sin(\sqrt{\lambda}\,t) + B \cos(\sqrt{\lambda}\,t)$ for some constants $A$ and $B$. As is easily seen, (0.2) has non-trivial solutions only when $\lambda$ is an element of the sequence $\{\lambda_j\}_1^\infty$, where $\lambda_j = \bigl(\frac{\pi}{b-a} j\bigr)^2$. The numbers $\lambda_1, \lambda_2, \dots$ are the eigenvalues of (0.2), and the corresponding solutions (non-trivial multiples of $\sin\bigl(j \frac{\pi}{b-a}(x - a)\bigr)$), are the eigenfunctions of (0.2). The set of eigenvalues is called the spectrum of (0.2). In general, a superposition of standing waves is therefore of the form $u(x, t) = \sum \bigl(A_j \sin(\sqrt{\lambda_j}\,t) + B_j \cos(\sqrt{\lambda_j}\,t)\bigr) \sin\bigl(\sqrt{\lambda_j}\,(x - a)\bigr)$. If we assume that we may differentiate the sum term by term, the initial conditions of (0.1) therefore require that

\[
\sum B_j \sin\Bigl(\frac{\pi}{b-a}\, j (x - a)\Bigr)
\quad\text{and}\quad
\sum A_j \frac{\pi}{b-a}\, j \sin\Bigl(\frac{\pi}{b-a}\, j (x - a)\Bigr)
\]

are given functions. The question of whether (0.1) has a solution which is a superposition of standing waves for arbitrary initial conditions, is then clearly seen to amount to the question whether an ‘arbitrary’ function may be written as a series $\sum u_j$, where each term is an eigenfunction of (0.2), i.e., a solution for $\lambda$ equal to one of the eigenvalues. We shall eventually show this to be possible for much more general differential equations than (0.1).
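The expansion problem just described can be tried out numerically. The following is a minimal sketch, not part of the theory developed in these notes: it assumes the interval $[a, b] = [0, 1]$, a hypothetical initial shape $u(x, 0) = x(1 - x)$, and a string released at rest (so all $A_j = 0$). The coefficients $B_j$ are computed with the standard normalization $2/(b-a)$ that comes from the orthogonality of the sines, a fact not derived in the text above. All function names are chosen for illustration only.

```python
import math

def eigenvalue(j, a, b):
    """lambda_j = (pi*j/(b-a))**2, the j-th eigenvalue of (0.2)."""
    return (math.pi * j / (b - a)) ** 2

def eigenfunction(j, a, b, x):
    """sin(j*pi*(x-a)/(b-a)), an eigenfunction of (0.2) belonging to lambda_j."""
    return math.sin(j * math.pi * (x - a) / (b - a))

def sine_coefficient(f, j, a, b, n=2000):
    """B_j = 2/(b-a) * integral of f(x)*sin(j*pi*(x-a)/(b-a)) over [a, b],
    approximated by the midpoint rule with n subintervals."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h
        total += f(x) * eigenfunction(j, a, b, x)
    return 2.0 / (b - a) * total * h

def partial_sum(coeffs, a, b, x):
    """Sum of B_j * sin(j*pi*(x-a)/(b-a)) for j = 1, ..., len(coeffs)."""
    return sum(B * eigenfunction(j, a, b, x)
               for j, B in enumerate(coeffs, start=1))

def standing_wave_sum(coeffs, a, b, x, t):
    """Superposition of standing waves with A_j = 0 (assumed: zero initial velocity)."""
    return sum(B * math.cos(math.sqrt(eigenvalue(j, a, b)) * t)
               * eigenfunction(j, a, b, x)
               for j, B in enumerate(coeffs, start=1))

a, b = 0.0, 1.0

def initial_shape(x):
    # Hypothetical initial position u(x, 0); it vanishes at both endpoints.
    return x * (1.0 - x)

coeffs = [sine_coefficient(initial_shape, j, a, b) for j in range(1, 26)]

# Compare the 25-term partial sum with the initial shape at a few points.
for x in (0.1, 0.3, 0.5):
    print(x, initial_shape(x), partial_sum(coeffs, a, b, x))
```

For this particular smooth initial shape the partial sums approach the function quickly. This only illustrates the question, and says nothing about ‘arbitrary’ functions.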

The technique above was used systematically by Fourier in his Théorie analytique de la Chaleur (1822) to solve problems of heat conduction, which in the simplest cases (like our example) lead to what are now called Fourier series expansions. Fourier was never able to give a satisfactory proof of the completeness of the eigenfunctions, i.e., the fact that essentially arbitrary functions can be expanded in Fourier series. This problem was solved by Dirichlet somewhat later, and at about the same time (1830) Sturm and Liouville independently but simultaneously showed weaker completeness results for more general ordinary differential equations of the form $-(pu')' + qu = \lambda u$, with boundary conditions of the form $Au + Bpu' = 0$, to be satisfied at the endpoints of the given interval. Here $p$ and $q$ are given, sufficiently regular functions, and $A$, $B$ given real constants, not both 0 and possibly different at the two interval endpoints. The Fourier cases correspond to $p \equiv 1$, $q \equiv 0$ and $A$ or $B$ equal to 0.

For the Fourier equation, the distance between successive eigenvalues decreases as the length of the base interval increases, and as the base interval approaches the whole real line, the eigenvalues accumulate everywhere on the positive real line. The Fourier series is then replaced by a continuous superposition, i.e., an integral, and we get the classical Fourier transform. Thus a continuous spectrum appears, and this is typical of problems where the basic domain is unbounded, or the coefficients of the equation have sufficiently bad singularities at the boundary.
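The shrinking of the gaps can be seen directly from the formula $\lambda_j = (\pi j/(b-a))^2$ for the eigenvalues of (0.2). The sketch below tabulates the largest gap between successive eigenvalues in a fixed window $[0, 100]$ for a few interval lengths. The window and the lengths are arbitrary choices made for illustration.

```python
import math

def eigenvalue(j, length):
    """j-th eigenvalue (pi*j/length)**2 of -f'' = lambda*f with f = 0 at both endpoints."""
    return (math.pi * j / length) ** 2

def max_gap_below(bound, length):
    """Largest gap between successive eigenvalues that both lie in [0, bound].
    Returns None if fewer than two eigenvalues lie in the window."""
    gaps = []
    j = 1
    while eigenvalue(j + 1, length) <= bound:
        gaps.append(eigenvalue(j + 1, length) - eigenvalue(j, length))
        j += 1
    return max(gaps) if gaps else None

# As the interval grows, the eigenvalues in the window get closer together.
for length in (1.0, 10.0, 100.0, 1000.0):
    print(length, max_gap_below(100.0, length))
```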

In 1910 Hermann Weyl [12] gave the first rigorous treatment, in the case of an equation of Sturm-Liouville type, of cases where continuous spectra can occur. Weyl’s treatment was based on the then recently proved spectral theorem by Hilbert. Hilbert’s theorem was a generalization of the usual diagonalization of a quadratic form, to the case of infinitely many variables. Hilbert applied it to certain integral operators, but it is not directly applicable to differential operators, since these are ‘unbounded’ in a sense we will discuss in Chapter 4. With the creation of quantum mechanics in the late 1920s, these matters became of basic importance to physics, and mathematicians, who had not advanced much beyond the results of Weyl, took the matter up again. The outcome was the general spectral theorem, generally attributed to John von Neumann (1928), although essentially the same theorem had been proved by Torsten Carleman in 1923, in a less abstract setting. Von Neumann’s theorem is an abstract result, and detailed applications to differential operators of reasonable generality had to wait until the early 1950s. In the meantime many independent results about expansions in eigenfunctions had been given, particularly for ordinary differential equations.

In these lectures we will prove von Neumann’s theorem. We will then apply this theorem to differential equations, including those that give rise to the classical Fourier series and Fourier transform. Once one has a result about expansion in eigenfunctions a host of other questions appear, some of which we will discuss in these notes. Sample questions are:

• How do eigenvalues and eigenfunctions depend on the domain $I$ and on the form of the equation (its order, coefficients etc.)? A partial answer is given if one can calculate the asymptotic distribution of the eigenvalues, i.e., approximate the growth of $\lambda_j$ as a function of $j$. For simple ordinary differential operators this can be done by fairly elementary means. The first such result for a partial differential equation was given by Weyl in 1912, and his method was later improved and extended by Courant.

• How well does the expansion converge when expanding different classes of functions? Again, for ordinary differential operators some questions of this type can be handled by elementary methods, but in general the answer lies in the explicit asymptotic behavior of the so-called spectral projectors. The first such asymptotic result was given by Carleman in 1934, and his method has been the basis for most later results.

• Can the equation be reconstructed if the spectrum is known? If not, what else must one know? If different equations can have the same spectrum, how many different equations? What do they have in common? Questions like these are part of what is called inverse spectral theory. Really satisfactory answers have only been obtained for the equation $-u'' + qu = \lambda u$, notably by Gelfand and Levitan in the early 1950s. Pioneering work was done by Göran Borg in the 1940s.

• Another aspect of the first point is the following: Given a ‘base’ equation (corresponding to a ‘free particle’ in quantum mechanics) and another equation, which outside some bounded region is close to the base equation (an ‘obstacle’ has been introduced), how can one relate the eigenfunctions for the two equations? The main questions of so-called scattering theory are of this type.

• Related to the previous point is the problem of inverse scattering. Here one is given scattering data, i.e., the answer to the question in the previous point, and the question is whether the equation is determined by scattering data, whether there is a method for reconstructing the equation from the scattering data, and similar questions. Many questions of this kind are of great importance to applications.


CHAPTER 1

Linear spaces

This chapter is intended to be a quick review of the basic facts about linear spaces. In the definition below the set $\mathbb{K}$ can be any field, although usually only the fields $\mathbb{R}$ of real numbers and $\mathbb{C}$ of complex numbers are of interest.

Definition 1.1. A linear space or vector space over $\mathbb{K}$ is a set $L$ provided with an addition $+$, which to every pair of elements $u, v \in L$ associates an element $u + v \in L$, and a multiplication, which to every $\lambda \in \mathbb{K}$ and $u \in L$ associates an element $\lambda u \in L$. The following rules for calculation hold:

(1) $(u + v) + w = u + (v + w)$ for all $u$, $v$ and $w$ in $L$. (associativity)

(2) There is an element $0 \in L$ such that $u + 0 = 0 + u = u$ for every $u \in L$. (existence of neutral element)

(3) For every $u \in L$ there exists $v \in L$ such that $u + v = v + u = 0$. One denotes $v$ by $-u$. (existence of additive inverse)

(4) $u + v = v + u$ for all $u, v \in L$. (commutativity)

(5) $\lambda(u + v) = \lambda u + \lambda v$ for all $\lambda \in \mathbb{K}$ and all $u, v \in L$.

(6) $(\lambda + \mu)u = \lambda u + \mu u$ for all $\lambda, \mu \in \mathbb{K}$ and all $u \in L$.

(7) $\lambda(\mu u) = (\lambda\mu)u$ for all $\lambda, \mu \in \mathbb{K}$ and all $u \in L$.

(8) $1u = u$ for all $u \in L$.

If $\mathbb{K} = \mathbb{R}$ we have a real linear space, if $\mathbb{K} = \mathbb{C}$ a complex linear space. Axioms 1–3 above say that $L$ is a group under addition, axiom 4 that the group is abelian (or commutative). Axioms 5 and 6 are distributive laws and axiom 7 an associative law related to the multiplication by scalars, whereas axiom 8 gives a kind of normalization for the multiplication by scalars.

Note that by restricting oneself to multiplying only by real numbers, any complex space may also be viewed as a real linear space. Conversely, every real linear space can be ‘extended’ to a complex linear space (Exercise 1.1). We will therefore only consider complex linear spaces in the sequel.

Let $M$ be an arbitrary set and let $\mathbb{C}^M$ be the set of complex-valued functions defined on $M$. Then $\mathbb{C}^M$, provided with the obvious definitions of the linear operations, is a complex linear space (Exercise 1.2). In the case when $M = \{1, 2, \dots, n\}$ one writes $\mathbb{C}^n$ instead of $\mathbb{C}^{\{1,2,\dots,n\}}$. An element $u \in \mathbb{C}^n$ is of course given by the values $u(1), u(2), \dots, u(n)$ of $u$ so one may also regard $\mathbb{C}^n$ as the set of ordered $n$-tuples of complex numbers. The corresponding real space is the usual $\mathbb{R}^n$.
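To make the ‘obvious definitions’ concrete, here is a small sketch in which a function on a finite set $M$ is stored as a Python dictionary and the linear operations are defined pointwise. The set $M$, the sample functions and the helper names `add`, `scale` and `zero` are all chosen for illustration. Checking the axioms on a few samples is of course not a proof.

```python
def add(u, v):
    """Pointwise sum: (u + v)(x) = u(x) + v(x) for every x in M."""
    assert u.keys() == v.keys()
    return {x: u[x] + v[x] for x in u}

def scale(lam, u):
    """Pointwise scalar multiple: (lam u)(x) = lam * u(x)."""
    return {x: lam * u[x] for x in u}

def zero(M):
    """The neutral element: the function that is 0 everywhere on M."""
    return {x: 0 for x in M}

# A hypothetical three-point set M and two complex-valued functions on it.
M = {"a", "b", "c"}
u = {"a": 1 + 2j, "b": 0j, "c": -1j}
v = {"a": 2, "b": 3 - 1j, "c": 4j}
lam, mu = 2 - 1j, 1j

# Spot checks of some axioms of Definition 1.1 on these samples.
print(add(u, v) == add(v, u))                                        # axiom (4)
print(scale(lam, add(u, v)) == add(scale(lam, u), scale(lam, v)))    # axiom (5)
print(scale(lam + mu, u) == add(scale(lam, u), scale(mu, u)))        # axiom (6)
```

When $M = \{1, 2, \dots, n\}$ the dictionary carries the same information as the ordered $n$-tuple of its values, which is the identification of $\mathbb{C}^M$ with $\mathbb{C}^n$ made above.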

If $L$ is a linear space and $V$ a subset of $L$ which is itself a linear space, using the linear operations inherited from $L$, one says that $V$ is a linear subspace of $L$.

Proposition 1.2. A non-empty subset $V$ of $L$ is a linear subspace of $L$ if and only if $u + v \in V$ and $\lambda u \in V$ for all $u, v \in V$ and $\lambda \in \mathbb{C}$.

The proof is left as an exercise (Exercise 1.3). If $u_1, u_2, \dots, u_k$ are elements of a linear space $L$ we denote by $[u_1, u_2, \dots, u_k]$ the linear hull of $u_1, u_2, \dots, u_k$, i.e., the set of all linear combinations $\lambda_1 u_1 + \dots + \lambda_k u_k$, where $\lambda_1, \dots, \lambda_k \in \mathbb{C}$. It is not hard to see that linear hulls are always subspaces (Exercise 1.5). One says that $u_1, \dots, u_k$ generate $L$ if $L = [u_1, \dots, u_k]$, and any linear space which is the linear hull of a finite number of its elements is called finitely generated or finite-dimensional. A linear space which is not finitely generated is called infinite-dimensional. It is clear that if, for example, $u_1$ is a linear combination of $u_2, \dots, u_k$, then $[u_1, \dots, u_k] = [u_2, \dots, u_k]$. If none of $u_1, \dots, u_k$ is a linear combination of the others one says that $u_1, \dots, u_k$ are linearly independent. It is clear that any finitely generated space has a set of linearly independent generators; one simply starts with a set of generators and goes through them one by one, at each step discarding any generator which is a linear combination of those coming before it. A set of linearly independent generators for $L$ is called a basis for $L$. A given finite-dimensional space $L$ can of course be generated by many different bases. However, a fundamental fact is that all such bases of $L$ have the same number of elements, called the dimension of $L$. This follows immediately from the following theorem.

Theorem 1.3. Suppose $u_1, \dots, u_k$ generate $L$, and that $v_1, \dots, v_j$ are linearly independent elements of $L$. Then $j \le k$.

Proof. Since $u_1, \dots, u_k$ generate $L$ we have $v_1 = \sum_{s=1}^{k} x_{1s} u_s$, for some coefficients $x_{11}, \dots, x_{1k}$ which are not all 0 since $v_1 \ne 0$. By renumbering $u_1, \dots, u_k$ we may assume $x_{11} \ne 0$. Then $u_1 = \frac{1}{x_{11}} v_1 - \sum_{s=2}^{k} \frac{x_{1s}}{x_{11}} u_s$, and therefore $v_1, u_2, \dots, u_k$ generate $L$. In particular, $v_2 = x_{21} v_1 + \sum_{s=2}^{k} x_{2s} u_s$ for some coefficients $x_{21}, \dots, x_{2k}$. We can not have $x_{22} = \dots = x_{2k} = 0$ since $v_1$, $v_2$ are linearly independent. By renumbering $u_2, \dots, u_k$, if necessary, we may assume $x_{22} \ne 0$. It follows as before that $v_1, v_2, u_3, \dots, u_k$ generate $L$. We can continue in this way until we run out of either $v$'s (if $j \le k$) or $u$'s (if $j > k$). But if $j > k$ we would get that $v_1, \dots, v_k$ generate $L$, in particular that $v_j$ is a linear combination of $v_1, \dots, v_k$ which contradicts the linear independence of the $v$'s. Hence $j \le k$. □
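The procedure described before Theorem 1.3 for extracting a basis from a set of generators (go through the generators one by one and discard each one that is a linear combination of those coming before it) can be sketched as follows. To keep the arithmetic exact the sketch works with vectors of rational numbers rather than complex numbers; that restriction, the helper names `rank` and `extract_basis`, and the sample generators are assumptions made for illustration.

```python
from fractions import Fraction

def rank(vectors):
    """Number of linearly independent vectors in the list (Gaussian elimination)."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def extract_basis(generators):
    """Keep a generator only if it is not a linear combination of those kept so far.
    The kept vectors span the same space as all generators seen so far."""
    basis = []
    for g in generators:
        if rank(basis + [g]) > len(basis):
            basis.append(g)
    return basis

generators = [(1, 0, 1), (2, 0, 2), (0, 1, 1), (1, 1, 2), (0, 0, 1)]
basis = extract_basis(generators)
print(basis)
```

In this example the second generator is twice the first and the fourth is the sum of the first and third, so both are discarded and three vectors remain. In agreement with Theorem 1.3, the number of basis vectors cannot exceed the number of generators.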

For a finite-dimensional space the existence and uniqueness of coordinates for any vector with respect to an arbitrary basis now follows easily (Exercise 1.6). More importantly for us, it is also clear that $L$ is infinite dimensional if and only if every linearly independent subset of $L$ can be extended to a linearly independent subset of $L$ with arbitrarily many elements. This usually makes it quite easy to see that a given space is infinite dimensional (Exercise 1.7).

If $V$ and $W$ are both linear subspaces of some larger linear space $L$, then the linear span $[V, W]$ of $V$ and $W$ is the set

\[
[V, W] = \{u \mid u = v + w \text{ where } v \in V \text{ and } w \in W\}.
\]

This is obviously a linear subspace of $L$. If in addition $V \cap W = \{0\}$, then for any $u \in [V, W]$ there are unique elements $v \in V$ and $w \in W$ such that $u = v + w$. In this case $[V, W]$ is called the direct sum of $V$ and $W$ and is denoted by $V \dotplus W$. The proof of these facts is left as an exercise (Exercise 1.9).
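As an illustration of a direct sum, consider functions on a finite set $M$ that is symmetric about 0, with $V$ the even and $W$ the odd functions. This example is not taken from the text; the set $M$, the sample function and the helper names are assumptions. Only the zero function is both even and odd, so $V \cap W = \{0\}$, and every $u$ splits as $u = v + w$ with $v(x) = (u(x) + u(-x))/2$ and $w(x) = (u(x) - u(-x))/2$.

```python
def even_part(u):
    """v(x) = (u(x) + u(-x)) / 2, the component of u in V (even functions)."""
    return {x: (u[x] + u[-x]) / 2 for x in u}

def odd_part(u):
    """w(x) = (u(x) - u(-x)) / 2, the component of u in W (odd functions)."""
    return {x: (u[x] - u[-x]) / 2 for x in u}

# Hypothetical symmetric set M and a complex-valued function on it.
M = [-2, -1, 0, 1, 2]
u = {x: complex(x ** 3 + x ** 2 + 1, x) for x in M}

v, w = even_part(u), odd_part(u)
for x in M:
    print(x, u[x], v[x], w[x])
```

The decomposition is unique for the reason given above: if also $u = v' + w'$ with $v'$ even and $w'$ odd, then $v - v' = w' - w$ lies in $V \cap W$ and is therefore 0.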

If $V$ is a linear subspace of $L$ we can create a new linear space $L/V$, the quotient space of $L$ by $V$, in the following way. We say that two elements $u$ and $v$ of $L$ are equivalent if $u - v \in V$. It is immediately seen that any $u$ is equivalent to itself, that $u$ is equivalent to $v$ if $v$ is equivalent to $u$, and that if $u$ is equivalent to $v$, and $v$ to $w$, then $u$ is equivalent to $w$. It then easily follows that we may split $L$ into equivalence classes such that every vector is equivalent to all vectors in the same equivalence class, but not to any other vectors. The equivalence class containing $u$ is denoted by $u + V$, and then $u + V = v + V$ precisely if $u - v \in V$. We now define $L/V$ as the set of equivalence classes, where addition is defined by $(u + V) + (v + V) = (u + v) + V$ and multiplication by scalar as $\lambda(u + V) = \lambda u + V$. It is easily seen that these operations are well defined and that $L/V$ becomes a linear space with neutral element $0 + V$ (Exercise 1.10). One defines $\operatorname{codim} V = \dim L/V$. We end the chapter by a fundamental fact about quotient spaces.

Theorem 1.4. $\dim V + \operatorname{codim} V = \dim L$.

We leave the proof for Exercise 1.11.
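A small concrete case may help in reading the definition of $L/V$. In the sketch below $L$ is the space of triples of rational numbers and $V$ is the line spanned by $(1, 1, 1)$; both choices, and the idea of labelling each class $u + V$ by the representative whose last coordinate is 0, are assumptions made for illustration.

```python
from fractions import Fraction

def canonical(u):
    """Representative of the class u + V with last coordinate 0,
    where V = {t*(1, 1, 1)}: subtract u[2]*(1, 1, 1) from u."""
    t = Fraction(u[2])
    return tuple(Fraction(x) - t for x in u)

def same_class(u, v):
    """u + V = v + V precisely if u - v lies in V."""
    return canonical(u) == canonical(v)

def add_classes(u, v):
    """(u + V) + (v + V) = (u + v) + V, computed from representatives."""
    return canonical(tuple(x + y for x, y in zip(u, v)))

def scale_class(lam, u):
    """lam * (u + V) = (lam * u) + V, computed from a representative."""
    return canonical(tuple(lam * x for x in u))

u1, u2 = (1, 2, 3), (3, 4, 5)      # u1 - u2 = -2*(1, 1, 1), so the same class
v1, v2 = (0, 1, 0), (-2, -1, -2)   # v1 - v2 =  2*(1, 1, 1), so the same class

print(same_class(u1, u2), same_class(v1, v2))
# Well defined: the sum class does not depend on the representatives chosen.
print(add_classes(u1, v1) == add_classes(u2, v2))
```

Every class has a representative of the form $(x, y, 0)$, so here $\dim L/V = 2$, $\dim V = 1$ and $\dim L = 3$, consistent with Theorem 1.4.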



Exercises for Chapter 1

Exercise 1.1. Let $L$ be a real linear space, and let $V$ be the set of ordered pairs $(u, v)$ of elements of $L$ with addition defined componentwise. Show that $V$ becomes a complex linear space if one defines $(x + iy)(u, v) = (xu - yv, xv + yu)$ for real $x$, $y$. Also show that $L$ can be ‘identified’ with the subset of elements of $V$ of the form $(u, 0)$, in the sense that there is a one-to-one correspondence between the two sets preserving the linear operations (for real scalars).

Exercise 1.2. Let $M$ be an arbitrary set and let $\mathbb{C}^M$ be the set of complex-valued functions defined on $M$. Show that $\mathbb{C}^M$, provided with the obvious definitions of the linear operations, is a complex linear space.

Exercise 1.3. Prove Proposition 1.2.

Exercise 1.4. Let $M$ be a non-empty subset of $\mathbb{R}^n$. Which of the following choices of $L$ make it into a linear subspace of $\mathbb{C}^M$?

(1) $L = \{u \in \mathbb{C}^M \mid |u(x)| < 1 \text{ for all } x \in M\}$.

(2) $L = C(M) = \{u \in \mathbb{C}^M \mid u \text{ is continuous in } M\}$.

(3) $L = \{u \in C(M) \mid u \text{ is bounded on } M\}$.

(4) $L = L(M) = \{u \in \mathbb{C}^M \mid u \text{ is Lebesgue integrable over } M\}$.

Exercise 1.5. Let $L$ be a linear space and $u_j \in L$, $j = 1, \dots, k$. Show that $[u_1, u_2, \dots, u_k]$ is a linear subspace of $L$.

Exercise 1.6. Show that if $e_1, \dots, e_n$ is a basis for $L$, then for each $u \in L$ there are uniquely determined complex numbers $x_1, \dots, x_n$, called coordinates for $u$, such that $u = x_1 e_1 + \dots + x_n e_n$.

Exercise 1.7. Verify that $L$ is infinite dimensional if and only if every linearly independent subset of $L$ can be extended to a linearly independent subset of $L$ with arbitrarily many elements. Then show that $u_1, \dots, u_k$ are linearly independent if and only if $\lambda_1 u_1 + \dots + \lambda_k u_k = 0$ only for $\lambda_1 = \dots = \lambda_k = 0$. Also show that $\mathbb{C}^M$ is finite-dimensional if and only if the set $M$ has finitely many elements.

Exercise 1.8. Let $M$ be an open subset of $\mathbb{R}^n$. Verify that $L$ is infinite-dimensional for each of the choices of $L$ in Exercise 1.4 which make $L$ into a linear space.

Exercise 1.9. Prove all statements in the penultimate paragraph of the chapter.

Exercise 1.10. Prove that if $L$ is a linear space and $V$ a subspace, then $L/V$ is a well defined linear space.

Exercise 1.11. Prove Theorem 1.4.


CHAPTER 2

Spaces with scalar product

If one wants to do analysis in a linear space, some structure in addition to the linearity is needed. This is because one needs some way to define limits and continuity, and this requires an appropriate definition of what a neighborhood of a point is. Thus one must introduce a topology in the space. We will not deal with the general notion of topological vector space here, but only the following particularly convenient way to introduce a topology in a linear space. This also covers most cases of importance to analysis. A metric space is a set $M$ provided with a metric, which is a function $d : M \times M \to \mathbb{R}$ such that for any $x, y, z \in M$ the following holds.

(1) $d(x, y) \ge 0$, with $d(x, y) = 0$ if and only if $x = y$. (positive definite)

(2) $d(x, y) = d(y, x)$. (symmetric)

(3) $d(x, y) \le d(x, z) + d(z, y)$. (triangle inequality)

A neighborhood of $x \in M$ is then a subset $O$ of $M$ such that for some $\varepsilon > 0$ the set $O$ contains all $y \in M$ for which $d(x, y) < \varepsilon$. An open set is a set which is a neighborhood of all its points, and a closed set is one with an open complement. One says that a sequence $x_1, x_2, \dots$ of elements in $M$ converges to $x \in M$ if $d(x_j, x) \to 0$ as $j \to \infty$.

The most convenient, but not the only important, way of introducing a metric in a linear space $L$ is via a norm (Exercise 2.1). A norm on $L$ is a function $\|\cdot\| : L \to \mathbb{R}$ such that for any $u, v \in L$ and $\lambda \in \mathbb{C}$

(1) $\|u\| \ge 0$, with $\|u\| = 0$ if and only if $u = 0$. (positive definite)

(2) $\|\lambda u\| = |\lambda| \|u\|$. (positive homogeneous)

(3) $\|u + v\| \le \|u\| + \|v\|$. (triangle inequality)
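As a numerical illustration of how a norm gives rise to a metric, the sketch below uses the maximum norm on $\mathbb{C}^3$ and sets $d(u, v) = \|u - v\|$. The choice of this particular norm, the sample vectors and the function names are assumptions made for illustration. The printed checks only test the axioms on a few samples and are not a substitute for the proof asked for in the exercise.

```python
def max_norm(u):
    """Maximum norm on C^n: the largest modulus among the components of u."""
    return max(abs(x) for x in u)

def dist(u, v):
    """Distance induced by the norm: d(u, v) = ||u - v||."""
    return max_norm([x - y for x, y in zip(u, v)])

# Hypothetical sample vectors in C^3 and a sample scalar.
u = [1 + 1j, -2, 0.5j]
v = [3, 1j, -1]
w = [0, 0, 2 - 1j]
lam = 2 - 3j

print(dist(u, v) == dist(v, u))                      # symmetry
print(dist(u, v) <= dist(u, w) + dist(w, v))         # triangle inequality
print(abs(max_norm([lam * x for x in u]) - abs(lam) * max_norm(u)) < 1e-12)
```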

The usual norm in the real space $\mathbb{R}^3$ is of course obtained from the dot product $(x_1, x_2, x_3) \cdot (y_1, y_2, y_3) = x_1 y_1 + x_2 y_2 + x_3 y_3$ by setting $\|x\| = \sqrt{x \cdot x}$. For an infinite-dimensional linear space $L$, it is sometimes possible to define a norm similarly by setting $\|u\| = \sqrt{\langle u, u\rangle}$, where $\langle\cdot, \cdot\rangle$ is a scalar product on $L$. A scalar product is a function $L \times L \to \mathbb{C}$ such that for all $u$, $v$ and $w$ in $L$ and all $\lambda, \mu \in \mathbb{C}$ the following holds.

(1) $\langle \lambda u + \mu v, w\rangle = \lambda\langle u, w\rangle + \mu\langle v, w\rangle$. (linearity in first argument)

(2) $\langle u, v\rangle = \overline{\langle v, u\rangle}$. (Hermitian symmetry)

(3) $\langle u, u\rangle \ge 0$ with equality only if $u = 0$. (positive definite)

If <strong>in</strong>stead of (3) holds only<br />

(3’) 〈u, u〉 ≥ 0, (positive semi-def<strong>in</strong>ite)<br />



10 2. SPACES WITH SCALAR PRODUCT<br />

one speaks about a semi-scalar product. Note that (2) implies that $\langle u, u\rangle$ is real so that (3) makes sense. Also note that by combining (1) and (2) we have $\langle w, \lambda u + \mu v\rangle = \overline{\lambda}\langle w, u\rangle + \overline{\mu}\langle w, v\rangle$. One says that the scalar product is anti-linear in its second argument (Warning: In the so called Dirac formalism in quantum mechanics the scalar product is instead anti-linear in the first argument, linear in the second). Together with (1) this makes the scalar product into a sesqui-linear ($= 1\tfrac{1}{2}$-linear) form. In words: A scalar product is a Hermitian, sesqui-linear and positive definite form. We now assume that we have a scalar product on $L$ and define $\|u\| = \sqrt{\langle u, u\rangle}$ for any $u \in L$. To show that this definition makes $\|\cdot\|$ into a norm we need the following basic theorem.

Theorem 2.1. (Cauchy-Schwarz) If $\langle\cdot,\cdot\rangle$ is a semi-scalar product on $L$, then $|\langle u, v\rangle|^2 \le \langle u, u\rangle\langle v, v\rangle$ for all $u, v \in L$.

Proof. For arbitrary complex $\lambda$ we have $0 \le \langle \lambda u + v, \lambda u + v\rangle = |\lambda|^2\langle u, u\rangle + \lambda\langle u, v\rangle + \overline{\lambda}\langle v, u\rangle + \langle v, v\rangle$. For $\lambda = -r\langle v, u\rangle$ with real $r$ we obtain $0 \le r^2|\langle u, v\rangle|^2\langle u, u\rangle - 2r|\langle u, v\rangle|^2 + \langle v, v\rangle$. If $\langle u, u\rangle = 0$ but $\langle u, v\rangle \ne 0$ this expression becomes negative for $r > \tfrac{1}{2}\langle v, v\rangle|\langle u, v\rangle|^{-2}$, which is a contradiction. Hence $\langle u, u\rangle = 0$ implies that $\langle u, v\rangle = 0$, so that the theorem is true in the case when $\langle u, u\rangle = 0$. If $\langle u, u\rangle \ne 0$ we set $r = \langle u, u\rangle^{-1}$ and obtain, after multiplication by $\langle u, u\rangle$, that $0 \le -|\langle u, v\rangle|^2 + \langle u, u\rangle\langle v, v\rangle$, which proves the theorem. □
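The inequality can be checked numerically in the finite-dimensional space $\mathbb{C}^4$ with its standard scalar product. The following is only an illustrative sketch: the helper names `inner` and `random_vector`, the fixed seed and the tolerances are assumptions made for this example and do not come from the notes.

```python
import random

def inner(u, v):
    # Standard scalar product on C^n: linear in the first argument,
    # anti-linear in the second, as in the convention used here.
    return sum(a * b.conjugate() for a, b in zip(u, v))

def random_vector(n, rng):
    # Hypothetical helper: a vector with random complex entries.
    return [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]

rng = random.Random(0)  # fixed seed so that the run is reproducible
u = random_vector(4, rng)
v = random_vector(4, rng)

# |<u, v>|^2 <= <u, u><v, v>, up to a small rounding tolerance.
lhs = abs(inner(u, v)) ** 2
rhs = inner(u, u).real * inner(v, v).real
cauchy_schwarz_holds = lhs <= rhs + 1e-12

# If w is a scalar multiple of u, the two sides agree up to rounding.
w = [(2 - 3j) * a for a in u]
equality_gap = inner(u, u).real * inner(w, w).real - abs(inner(u, w)) ** 2
```

The second computation hints at the case of equality, which is the subject of Exercise 2.4.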

In the case of a scalar product, defining $\|u\| = \sqrt{\langle u, u\rangle}$, we may write the Cauchy-Schwarz inequality as $|\langle u, v\rangle| \le \|u\|\|v\|$. In this case it is also easy to see when there is equality in the Cauchy-Schwarz inequality. To see that $\|\cdot\|$ is a norm on $L$ the only non-trivial point is to verify that the triangle inequality holds; but this follows from the Cauchy-Schwarz inequality (Exercise 2.4).

Recall that in a finite-dimensional space with scalar product it is particularly convenient to use an orthonormal basis since this makes it very easy to calculate the coordinates of any vector. In fact, if $x_1, \dots, x_n$ are the coordinates of $u$ in the orthonormal basis $e_1, \dots, e_n$, then $x_j = \langle u, e_j\rangle$ (recall that $e_1, \dots, e_n$ is called orthonormal if all basis elements have norm 1 and $\langle e_j, e_k\rangle = 0$ for $j \ne k$). Given an arbitrary basis it is easy to construct an orthonormal basis by use of the Gram-Schmidt method (see the proof of Lemma 2.2).

In an infinite-dimensional space one cannot find a (finite) basis. The best one can hope for are infinitely many vectors $e_1, e_2, \dots$ such that each finite subset is linearly independent, and any vector is the limit in norm of a sequence of finite linear combinations of $e_1, e_2, \dots$. Again, it will turn out to be very convenient if $e_1, e_2, \dots$ is an orthonormal sequence, i.e., $\|e_j\| = 1$ for $j = 1, 2, \dots$ and $\langle e_j, e_k\rangle = 0$ for $j \ne k$. The following lemma is easily proved by use of the Gram-Schmidt procedure.



Lemma 2.2. Any infinite-dimensional linear space $L$ with scalar product contains an orthonormal sequence.

Proof. According to Chapter 1 we can find a linearly independent sequence in $L$, i.e., a sequence $u_1, u_2, \dots$ such that $u_1, \dots, u_k$ are linearly independent for any $k$. Put $e_1 = u_1/\|u_1\|$ and $v_2 = u_2 - \langle u_2, e_1\rangle e_1$. Next put $e_2 = v_2/\|v_2\|$. If we have already found $e_1, \dots, e_k$, put $v_{k+1} = u_{k+1} - \sum_{j=1}^{k}\langle u_{k+1}, e_j\rangle e_j$ and $e_{k+1} = v_{k+1}/\|v_{k+1}\|$. I claim that this procedure will lead to a well defined orthonormal sequence $e_1, e_2, \dots$. This is left for the reader to verify (Exercise 2.6). □
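The Gram-Schmidt recursion of the proof can be sketched in code for finitely many vectors in $\mathbb{C}^3$. This is a minimal illustration under stated assumptions: the names `inner`, `norm` and `gram_schmidt` and the sample vectors are made up for the example, and the input is assumed to be linearly independent so that no division by zero occurs.

```python
import math

def inner(u, v):
    # Scalar product on C^n, linear in the first argument.
    return sum(a * b.conjugate() for a, b in zip(u, v))

def norm(u):
    return math.sqrt(inner(u, u).real)

def gram_schmidt(vectors):
    # Follows the recursion in the proof of Lemma 2.2:
    # v_{k+1} = u_{k+1} - sum_j <u_{k+1}, e_j> e_j,  e_{k+1} = v_{k+1}/||v_{k+1}||.
    basis = []
    for u in vectors:
        v = list(u)
        for e in basis:
            c = inner(u, e)
            v = [a - c * b for a, b in zip(v, e)]
        length = norm(v)  # non-zero when the input is linearly independent
        basis.append([a / length for a in v])
    return basis

# Three linearly independent sample vectors in C^3 (chosen for illustration).
us = [[1 + 0j, 1j, 0j], [1 + 0j, 0j, 1 + 0j], [0j, 1 + 0j, 1 + 0j]]
es = gram_schmidt(us)

# Gram matrix of the result; it should be close to the identity matrix.
gram = [[inner(a, b) for b in es] for a in es]
```

Checking that `gram` is the identity matrix up to rounding is the numerical counterpart of Exercise 2.6 for this particular input.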

Supposing we have an orthonormal sequence $e_1, e_2, \dots$ in $L$, a natural question is: How well can one approximate (in the norm of $L$) an arbitrary vector $u \in L$ by finite linear combinations of $e_1, e_2, \dots$? Here is the answer:

Lemma 2.3. Suppose $e_1, e_2, \dots$ is an orthonormal sequence in $L$ and put, for any $u \in L$, $\hat{u}_j = \langle u, e_j\rangle$. Then we have

$$\Bigl\|u - \sum_{j=1}^{k}\lambda_j e_j\Bigr\|^2 = \|u\|^2 - \sum_{j=1}^{k}|\hat{u}_j|^2 + \sum_{j=1}^{k}|\lambda_j - \hat{u}_j|^2 \tag{2.1}$$

for any complex numbers $\lambda_1, \dots, \lambda_k$.

The proof is by calculation (Exercise 2.7). The interpretation of Lemma 2.3 is very interesting. The identity (2.1) says that if we want to choose a linear combination $\sum_{j=1}^{k}\lambda_j e_j$ of $e_1, \dots, e_k$ which approximates $u$ well in norm, the best choice of coefficients is to take $\lambda_j = \hat{u}_j$, $j = 1, \dots, k$. Furthermore, with this choice, the error is given exactly by $\|u - \sum_{j=1}^{k}\hat{u}_j e_j\|^2 = \|u\|^2 - \sum_{j=1}^{k}|\hat{u}_j|^2$. One calls the coefficients $\hat{u}_1, \hat{u}_2, \dots$ the (generalized) Fourier coefficients of $u$ with respect to the orthonormal sequence $e_1, e_2, \dots$. The following theorem is an immediate consequence of Lemma 2.3 (Exercise 2.8).

Theorem 2.4 (Bessel's inequality). For any $u$ the series $\sum_{j=1}^{\infty}|\hat{u}_j|^2$ converges and one has

$$\sum_{j=1}^{\infty}|\hat{u}_j|^2 \le \|u\|^2.$$
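For a concrete feeling for the identity (2.1) and for Bessel's inequality, one can test them on two orthonormal vectors in $\mathbb{C}^3$. The sketch below is only a numerical illustration; the vectors, the coefficients `lam` and the helper names (`inner`, `norm_sq`, `combo`) are assumptions chosen for the example.

```python
import math

def inner(u, v):
    # Scalar product on C^n, linear in the first argument.
    return sum(a * b.conjugate() for a, b in zip(u, v))

def norm_sq(u):
    return inner(u, u).real

def combo(coeffs, vectors, n):
    # Linear combination sum_j coeffs[j] * vectors[j] in C^n.
    out = [0j] * n
    for c, e in zip(coeffs, vectors):
        out = [a + c * b for a, b in zip(out, e)]
    return out

s = 1 / math.sqrt(2)
e1 = [s + 0j, s * 1j, 0j]   # e1 and e2 are orthonormal in C^3
e2 = [s + 0j, -s * 1j, 0j]
u = [1 + 2j, -1j, 3 + 0j]
lam = [0.5 - 1j, 2 + 0.25j]  # arbitrary coefficients

u_hat = [inner(u, e) for e in (e1, e2)]  # Fourier coefficients of u

# Left and right hand sides of (2.1).
approx = combo(lam, [e1, e2], 3)
lhs = norm_sq([a - b for a, b in zip(u, approx)])
rhs = (norm_sq(u)
       - sum(abs(c) ** 2 for c in u_hat)
       + sum(abs(l - c) ** 2 for l, c in zip(lam, u_hat)))

# Error of the best approximation, obtained with lam_j = u_hat_j.
best = combo(u_hat, [e1, e2], 3)
best_error = norm_sq([a - b for a, b in zip(u, best)])
bessel_sum = sum(abs(c) ** 2 for c in u_hat)
```

Here `best_error` is the part of $\|u\|^2$ not captured by $e_1$ and $e_2$, which is why `bessel_sum` cannot exceed `norm_sq(u)`.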

Another immediate consequence of Lemma 2.3 is the next theorem (cf. Exercise 2.9).

Theorem 2.5 (Parseval's formula). The series $\sum_{j=1}^{\infty}\hat{u}_j e_j$ converges (in norm) to $u$ if and only if $\sum_{j=1}^{\infty}|\hat{u}_j|^2 = \|u\|^2$.

There is also a slightly more general form of Parseval's formula.

Corollary 2.6. Suppose $\sum_{j=1}^{\infty}|\hat{u}_j|^2 = \|u\|^2$ for some $u \in L$. Then $\sum_{j=1}^{\infty}\hat{u}_j\overline{\hat{v}_j} = \langle u, v\rangle$ for any $v \in L$.



Proof. Consider the following form on $L$.

$$[u, v] = \langle u, v\rangle - \sum_{j=1}^{\infty}\hat{u}_j\overline{\hat{v}_j}.$$

Since $|\hat{u}_j\hat{v}_j| \le \tfrac{1}{2}(|\hat{u}_j|^2 + |\hat{v}_j|^2)$ by the arithmetic-geometric mean inequality, Bessel's inequality shows that the series is absolutely convergent. It follows that $[\cdot,\cdot]$ is a Hermitian, sesqui-linear form on $L$. Because of Bessel's inequality it is also positive (but not positive definite). Thus $[\cdot,\cdot]$ is a semi-scalar product on $L$. Applying the Cauchy-Schwarz inequality we obtain $|[u, v]|^2 \le [u, u][v, v]$. By assumption $[u, u] = \|u\|^2 - \sum_{j=1}^{\infty}|\hat{u}_j|^2 = 0$, so that $[u, v] = 0$ and the corollary follows. □

It is now obvious that the closest analogy to an orthonormal basis in an infinite-dimensional space with scalar product is an orthonormal sequence with the additional property of the following definition.

Definition 2.7. An orthonormal sequence in $L$ is called complete if the Parseval identity $\|u\|^2 = \sum_{j=1}^{\infty}|\hat{u}_j|^2$ holds for every $u \in L$.

It is by no means clear that we can always find complete orthonormal sequences in a given space. This requires the space to be separable.

Definition 2.8. A metric space $M$ is called separable if it has a dense, countable subset. This means a sequence $u_1, u_2, \dots$ of elements of $M$, such that for any $u \in M$, and any $\varepsilon > 0$, there is an element $u_j$ of the sequence for which $d(u, u_j) < \varepsilon$.

The vast majority of spaces used in analysis are separable (Exercise 2.10), but there are exceptions (Exercise 2.12).

Theorem 2.9. An infinite-dimensional linear space with scalar product is separable if and only if it contains a complete orthonormal sequence.

The proof is left as an exercise (Exercise 2.11). Suppose $e_1, e_2, \dots$ is a complete orthonormal sequence in $L$. We then know that any $u \in L$ may be written as $u = \sum_{j=1}^{\infty}\hat{u}_j e_j$, where the series converges in norm. Furthermore the numerical series $\sum_{j=1}^{\infty}|\hat{u}_j|^2$ converges to $\|u\|^2$. The following question now arises: Given a sequence $\lambda_1, \lambda_2, \dots$ of complex numbers for which $\sum_{j=1}^{\infty}|\lambda_j|^2$ converges, does there exist an element $u \in L$ for which $\lambda_1, \lambda_2, \dots$ are the Fourier coefficients? Equivalently, does $\sum_{j=1}^{\infty}\lambda_j e_j$ converge to an element $u \in L$ in norm? As it turns out, this is not always the case. The property required of $L$ is that it is complete. Warning: This is a totally different property from the completeness of orthonormal sequences we discussed earlier! To explain what it is, we need a few definitions.

Definition 2.10. A Cauchy sequence in a metric space $M$ is a sequence $u_1, u_2, \dots$ of elements of $M$ such that $d(u_j, u_k) \to 0$ as $j, k \to \infty$. More exactly: To every $\varepsilon > 0$ there exists a number $\omega$ such that $d(u_j, u_k) < \varepsilon$ if $j > \omega$ and $k > \omega$.

It is clear by use of the triangle inequality that any convergent sequence is a Cauchy sequence. Far more interesting is the fact that this implication may sometimes be reversed.

Definition 2.11. A metric space $M$ is called complete if every Cauchy sequence converges to an element in $M$.

A normed linear space which is complete is called a Banach space. If the norm derives from a scalar product, $\sum_{j=1}^{\infty}|\lambda_j|^2$ converges and $e_1, e_2, \dots$ is an orthonormal sequence we put $u_k = \sum_{j=1}^{k}\lambda_j e_j$. If $k < n$ we then have (the second equality is a special case of Lemma 2.3)

$$\|u_n - u_k\|^2 = \Bigl\|\sum_{j=k+1}^{n}\lambda_j e_j\Bigr\|^2 = \sum_{j=k+1}^{n}|\lambda_j|^2 = \sum_{j=1}^{n}|\lambda_j|^2 - \sum_{j=1}^{k}|\lambda_j|^2.$$

Since $\sum_{j=1}^{\infty}|\lambda_j|^2$ converges the right hand side tends to 0 as $k, n \to \infty$. Hence $u_1, u_2, \dots$ is a Cauchy sequence in $L$. It therefore follows that if $L$ is complete, then $\sum_{j=1}^{\infty}\lambda_j e_j$ actually converges in norm to an element of $L$. On the other hand, if $L$ is not complete and $e_1, e_2, \dots$ is a complete orthonormal sequence, then $\lambda_1, \lambda_2, \dots$ may be chosen so that the series $\sum_{j=1}^{\infty}\lambda_j e_j$ does not converge in $L$ although $\sum_{j=1}^{\infty}|\lambda_j|^2$ is convergent (Exercise 2.14).
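The Cauchy property of the partial sums can be seen numerically for the square summable choice $\lambda_j = 1/j$. In the sketch below a vector is represented by its first `N` coordinates with respect to $e_1, e_2, \dots$; this truncation, the value of `N` and the helper names are assumptions made only for illustration.

```python
def inner(u, v):
    # Scalar product in coordinates with respect to an orthonormal sequence.
    return sum(a * b.conjugate() for a, b in zip(u, v))

def norm_sq(u):
    return inner(u, u).real

N = 2000                                 # truncation level (illustrative)
lam = [1 / j for j in range(1, N + 1)]   # lam_j = 1/j is square summable

def partial_sum(k):
    # Coordinates of u_k = sum_{j <= k} lam_j e_j.
    return [complex(lam[j]) if j < k else 0j for j in range(N)]

def dist_sq(n, k):
    # ||u_n - u_k||^2 computed from the coordinates.
    un, uk = partial_sum(n), partial_sum(k)
    return norm_sq([p - q for p, q in zip(un, uk)])

tail = dist_sq(2000, 100)
direct = sum(l ** 2 for l in lam[100:2000])  # sum of |lam_j|^2, j = 101..2000
```

The two numbers `tail` and `direct` agree up to rounding, and both are smaller than $1/100$ because $\sum_{j>k} j^{-2} < 1/k$.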

Exercises for Chapter 2<br />

Exercise 2.1. Show that if $\|\cdot\|$ is a norm on $L$, then $d(u, v) = \|u - v\|$ is a metric on $L$.

Exercise 2.2. Show that $d(x, y) = \arctan|x - y|$ is a metric on $\mathbb{R}$ which can be extended to a metric on the set of extended reals $\overline{\mathbb{R}} = \mathbb{R} \cup \{-\infty\} \cup \{\infty\}$.

Exercise 2.3. Consider the linear space $C^1[0, 1]$, consisting of complex-valued, differentiable functions with continuous derivative, defined in $[0, 1]$. Show that the following are all norms on $C^1[0, 1]$.

• $\|u\|_\infty = \sup_{0 \le x \le 1}|u(x)|$,

• $\|u\|_1 = \int_0^1 |u|$,

• $\|u\|_{1,\infty} = \|u'\|_\infty + \|u\|_\infty$.

Invent some more norms in the same spirit!

Exercise 2.4. Find all cases of equality in the Cauchy-Schwarz inequality for a scalar product! Then show that $\|\cdot\|$, defined by $\|u\| = \sqrt{\langle u, u\rangle}$, where $\langle\cdot,\cdot\rangle$ is a scalar product, is a norm.



Exercise 2.5. Show that $\langle u, v\rangle = \int_0^1 u(x)\overline{v(x)}\,dx$ is a scalar product on the space $C[0, 1]$ of continuous, complex-valued functions defined on $[0, 1]$.

Exercise 2.6. F<strong>in</strong>ish the proof of Lemma 2.2.<br />

Exercise 2.7. Prove Lemma 2.3.<br />

Exercise 2.8. Prove Bessel’s <strong>in</strong>equality!<br />

Exercise 2.9. Prove Parseval’s formula!<br />

Exercise 2.10. It is well known that the set of step functions which are identically 0 outside a compact subinterval of an interval $I$ is dense in $L^2(I)$. Use this to show that $L^2(I)$ is separable.

Exercise 2.11. Prove Theorem 2.9.

Hint: Use Gram-Schmidt!

Exercise 2.12. Let $L$ be the set of complex-valued functions $u$ of the form $u(x) = \sum_{j=1}^{k}\lambda_j e^{i\alpha_j x}$ where $\alpha_1, \dots, \alpha_k$ are (a finite number of) different real numbers and $\lambda_1, \dots, \lambda_k$ are complex numbers. Show that $L$ is a linear subspace of $C(\mathbb{R})$ (the functions continuous on the real line) on which $\langle u, v\rangle = \lim_{T \to \infty}\frac{1}{2T}\int_{-T}^{T} u\overline{v}$ serves as a scalar product. Then show that the norm of $e^{i\alpha x}$ is 1 for any $\alpha \in \mathbb{R}$ and that $e^{i\alpha x}$ is orthogonal to $e^{i\beta x}$ as soon as $\alpha \ne \beta$. Conclude that $L$ is not separable.

Exercise 2.13. Show that as metric spaces the set $\mathbb{Q}$ of rational numbers is not complete but the set $\mathbb{R}$ of reals is.

Exercise 2.14. Suppose $L$ is a space with scalar product which is not complete, and that $e_1, e_2, \dots$ is a complete orthonormal sequence in $L$. Show that there exists a sequence $\lambda_1, \lambda_2, \dots$ of complex numbers, such that $\sum|\lambda_j|^2 < \infty$ but $\sum\lambda_j e_j$ does not converge to any element of $L$.


CHAPTER 3

Hilbert space

A Hilbert space is a linear space $H$ (we will as always assume that the scalars are complex numbers) provided with a scalar product such that the space is also complete, i.e., any Cauchy sequence (with respect to the norm induced by the scalar product) converges to an element of $H$. We denote the scalar product of $u$ and $v \in H$ by $\langle u, v\rangle$ and the norm of $u$ by $\|u\| = \sqrt{\langle u, u\rangle}$. It is usually required, and we will follow this convention, that the space be separable as well, i.e., there is a countable, dense subset. Recall that this means that any element can be arbitrarily well approximated in norm by elements of this dense subset. In the present case this means that $H$ has a complete orthonormal sequence, and conversely, if the space has a complete orthonormal sequence it is separable (Theorem 2.9). As is usual we will also assume that $H$ is infinite-dimensional.

Example 3.1. The space $\ell^2$ consists of all infinite sequences $u = (u_1, u_2, \dots)$ of complex numbers for which $\sum|u_j|^2 < \infty$, i.e., which are square summable. The scalar product of $u$ with $v = (v_1, v_2, \dots)$ is defined as $\langle u, v\rangle = \sum u_j\overline{v_j}$. This series is absolutely convergent since $|u_j v_j| \le (|u_j|^2 + |v_j|^2)/2$ and $u$, $v$ are square summable. Show that $\ell^2$ is a Hilbert space (Exercise 3.1)!

The space Hilbert himself dealt with was $\ell^2$. Actually, any Hilbert space is isometrically isomorphic to $\ell^2$, i.e., there is a bijective (one-to-one and onto) linear map $H \ni u \mapsto \hat{u} \in \ell^2$ such that $\langle u, v\rangle = \langle \hat{u}, \hat{v}\rangle$ for any $u$ and $v$ in $H$ (Exercise 3.2). This is the reason any complete, separable and infinite-dimensional space with scalar product is called a Hilbert space. However, there are infinitely many isomorphisms that will serve, and none of them is 'natural', i.e., in general to be preferred to any other, so the fact that all Hilbert spaces are isomorphic is not particularly useful in practice.

Example 3.2. The most important example of a Hilbert space is $L^2(\Omega, \mu)$ where $\Omega$ is some domain in $\mathbb{R}^n$ and $\mu$ is a (Radon) measure defined there; often $\mu$ is simply Lebesgue measure. The space consists of (equivalence classes of) complex-valued functions on $\Omega$, measurable with respect to $\mu$ and with integrable square over $\Omega$ with respect to $\mu$. That this space is separable and complete is proved in courses on the theory of integration.



16 3. HILBERT SPACE<br />

Given a normed space one may of course ask whether there is a scalar product on the space which gives rise to the given norm in the usual way. Here is a simple criterion.

Lemma 3.3. (parallelogram identity) If $u$ and $v$ are elements of $H$, then

$$\|u + v\|^2 + \|u - v\|^2 = 2\|u\|^2 + 2\|v\|^2.$$

Proof. A simple calculation gives $\|u \pm v\|^2 = \langle u \pm v, u \pm v\rangle = \|u\|^2 \pm (\langle u, v\rangle + \langle v, u\rangle) + \|v\|^2$. Adding this for the two signs the parallelogram identity follows. □
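The identity also gives a quick way to see that some norms cannot come from a scalar product. The sketch below compares the Euclidean norm on $\mathbb{C}^2$ with the maximum norm $\max_j|u_j|$; the function names and the two test vectors are assumptions made for this example only.

```python
def inner(u, v):
    # Standard scalar product on C^n.
    return sum(a * b.conjugate() for a, b in zip(u, v))

def euclid_sq(u):
    # Square of the norm induced by the scalar product.
    return inner(u, u).real

def sup_sq(u):
    # Square of the maximum norm, which is a norm but is not induced
    # by any scalar product.
    return max(abs(a) for a in u) ** 2

def parallelogram_defect(norm_sq, u, v):
    # ||u+v||^2 + ||u-v||^2 - 2||u||^2 - 2||v||^2; zero when the identity holds.
    plus = [a + b for a, b in zip(u, v)]
    minus = [a - b for a, b in zip(u, v)]
    return norm_sq(plus) + norm_sq(minus) - 2 * norm_sq(u) - 2 * norm_sq(v)

u = [1 + 0j, 0j]
v = [0j, 1 + 0j]

defect_euclid = parallelogram_defect(euclid_sq, u, v)  # 2 + 2 - 2 - 2
defect_sup = parallelogram_defect(sup_sq, u, v)        # 1 + 1 - 2 - 2
```

A non-zero defect for a single pair of vectors is enough to rule out a scalar product behind the norm, since the identity is a necessary condition.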

The name parallelogram identity comes from the fact that the lemma can be interpreted geometrically, as saying that the sum of the squares of the lengths of the sides in a parallelogram equals the sum of the squares of the lengths of the diagonals. This is a theorem that can be found in Euclid's Elements. Given a normed space, Lemma 3.3 shows that a necessary condition for the norm to be associated with a scalar product is that the parallelogram identity holds for all vectors in the space. It was proved by von Neumann in 1929 that this is also sufficient (Exercise 3.3). We shall soon have another use for the parallelogram identity.

In practice it is quite common that one has a space with scalar product which is not complete (such a space is often called a pre-Hilbert space). In order to use Hilbert space theory, one must then embed the space in a larger space which is complete. The process is called completion and is fully analogous to the extension of the rational numbers to the reals, which is also done to make the Cauchy convergence principle valid. In very brief outline the process is as follows. Starting with a (not complete) normed linear space $L$ let $L_c$ be the set of all Cauchy sequences in $L$. The set $L_c$ is made into a linear space in the obvious way. We may embed $L$ in $L_c$ by identifying $u \in L$ with the sequence $(u, u, u, \dots)$. In $L_c$ we may introduce a semi-norm $\|\cdot\|$ (i.e., a norm except that there may be non-zero elements $u$ in the space for which $\|u\| = 0$) by setting $\|(u_1, u_2, \dots)\| = \lim\|u_j\|$. Now let $N_c$ be the subspace of $L_c$ consisting of all elements with semi-norm 0, and put $H = L_c/N_c$, i.e., elements in $L_c$ are identified whenever the distance between them is 0. One may now prove that $\|\cdot\|$ induces a norm on $H$ under which $H$ is complete, and that through the identification above we may consider the original space $L$ as a dense subset of $H$. If the original norm came from a scalar product, then so will the norm of $H$. We leave it to the reader to verify the details, using the hints provided (Exercise 3.4).

The process above is satisfactory in that it shows that any normed space may be 'completed' (in fact, the same process works in any metric space). Equivalence classes of Cauchy sequences are of course rather abstract objects, but in concrete cases one can often identify the elements of the completion of a given space with more concrete objects. So, for example, one may view $L^2(\Omega, \mu)$ as the completion, in the appropriate norm, of the linear space $C_0(\Omega)$ of functions which are continuous in $\Omega$ and 0 outside a compact subset of $\Omega$.

In the sequel $H$ is always assumed to be a Hilbert space. There are two properties which make Hilbert spaces far more convenient to deal with than more general spaces. The first is that any closed, linear subspace has a topological complement which can be chosen in a canonical way (Theorem 3.7). The second is that a Hilbert space can be identified with its topological dual (Theorem 3.8). Both these properties are actually true even if the space is not assumed separable (and of course if the space is finite-dimensional), as our proofs will show. To prove them we start with the following definition.

Definition 3.4. A set $M$ is called convex if it contains all line segments connecting two elements of the set, i.e., if $u$ and $v \in M$, then $tu + (1 - t)v \in M$ for all $t \in [0, 1]$.

A subset of a metric space is of course called closed if all limits of convergent sequences contained in the subset are themselves in the subset. It is easily seen that this is equivalent to the complement of the subset being open, in the sense that it is a neighborhood of all its points (check this!).

Lemma 3.5. Any closed, convex subset $K$ of $H$ has a unique element of smallest norm.

Proof. Put $d = \inf\{\|u\| \mid u \in K\}$. Let $u_1, u_2, \dots$ be a minimizing sequence, i.e., $u_j \in K$ and $\|u_j\| \to d$. By the parallelogram identity we then have

$$\|u_j - u_k\|^2 = 2\|u_j\|^2 + 2\|u_k\|^2 - 4\|(u_j + u_k)/2\|^2.$$

On the right hand side the two first terms both tend to $2d^2$ as $j, k \to \infty$. By convexity $(u_j + u_k)/2 \in K$, so $4\|(u_j + u_k)/2\|^2 \ge 4d^2$ and the right hand side is at most $2\|u_j\|^2 + 2\|u_k\|^2 - 4d^2 \to 0$. Therefore $u_1, u_2, \dots$ is a Cauchy sequence, and has a limit $u$ which obviously has norm $d$ and is in $K$, since $K$ is closed. If $u$ and $v$ are both minimizing elements, replacing $u_j$ by $u$ and $u_k$ by $v$ in the calculation above immediately shows that $u = v$, so the minimizing element is unique. □

Lemma 3.6. Suppose $M$ is a proper (i.e., $M \ne H$) closed, linear subspace of $H$. Then there is a non-trivial normal to $M$, i.e., an element $u \ne 0$ in $H$ such that $\langle u, v\rangle = 0$ for all $v \in M$.

Proof. Let $w \notin M$ and put $K = w + M$. Then $K$ is obviously closed and convex so by Lemma 3.5 it has an element $u$ of smallest norm, which is non-zero since $0 \notin K$. Let $v \ne 0$ be in $M$ so that $u + av \in K$ for any scalar $a$. Hence $\|u\|^2 \le \|u + av\|^2 = \|u\|^2 + 2\operatorname{Re}(a\langle v, u\rangle) + |a|^2\|v\|^2$. Setting $a = -\langle u, v\rangle/\|v\|^2$ we obtain $-(|\langle u, v\rangle|/\|v\|)^2 \ge 0$ so that $\langle u, v\rangle = 0$. □
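In finite dimensions the construction in this proof can be carried out explicitly. The sketch below takes $M$ to be the line spanned by a single vector `m` in $\mathbb{C}^3$ and uses the closed formula $u = w - \frac{\langle w, m\rangle}{\langle m, m\rangle}m$ for the element of smallest norm in $w + M$; this formula replaces the minimizing sequence of Lemma 3.5 and, like the vectors and helper names, is an assumption made for the example.

```python
import math

def inner(u, v):
    # Scalar product on C^n, linear in the first argument.
    return sum(a * b.conjugate() for a, b in zip(u, v))

def norm(u):
    return math.sqrt(inner(u, u).real)

def add_scaled(u, a, v):
    # Returns u + a*v.
    return [x + a * y for x, y in zip(u, v)]

m = [1 + 0j, 1j, 0j]          # spans the one-dimensional subspace M
w = [1 + 0j, 2 + 0j, 3 + 0j]  # not in M, since its third coordinate is non-zero

# Element of smallest norm in K = w + M (closed formula for a line).
a_min = -inner(w, m) / inner(m, m)
u = add_scaled(w, a_min, m)

# u should be orthogonal to M, and no other element u + a*m of K
# should have smaller norm.
normal_check = abs(inner(u, m))
samples = [0.3, -1 + 0j, 2j, 0.5 - 0.7j]
is_minimal = all(norm(u) <= norm(add_scaled(u, a, m)) + 1e-12 for a in samples)
```

The vector `u` plays the role of the non-trivial normal to $M$ in Lemma 3.6.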



Two subspaces $M$ and $N$ are said to be orthogonal if every element in $M$ is orthogonal to every element in $N$. Then clearly $M \cap N = \{0\}$ so the direct sum of $M$ and $N$ is defined. In the case at hand this is called the orthogonal sum of $M$ and $N$ and denoted by $M \oplus N$. Thus $M \oplus N$ is the set of all sums $u + v$ with $u \in M$ and $v \in N$. If $M$ and $N$ are closed, orthogonal subspaces of $H$, then their orthogonal sum is also a closed subspace of $H$ (Exercise 3.5). If $A$ is an arbitrary subset of $H$ we define

$$A^\perp = \{u \in H \mid \langle u, v\rangle = 0 \text{ for all } v \in A\}.$$

This is called the orthogonal complement of $A$. It is easy to see that $A^\perp$ is a closed linear subspace of $H$, that $A \subset B$ implies $B^\perp \subset A^\perp$ and that $A \subset (A^\perp)^\perp$ (Exercise 3.6).

When $M$ is a linear subspace of $H$ an alternative way of writing $M^\perp$ is $H \ominus M$. This makes sense because of the following theorem of central importance.

Theorem 3.7. Suppose $M$ is a closed linear subspace of $H$. Then $M \oplus M^\perp = H$.

Proof. $M \oplus M^\perp$ is a closed linear subspace of $H$ so if it is not all of $H$, then it has a non-trivial normal $u$ by Lemma 3.6. But if $u$ is orthogonal to both $M$ and $M^\perp$, then $u \in M^\perp \cap (M^\perp)^\perp$, so $\langle u, u\rangle = 0$, which shows that $u$ cannot be $\ne 0$. The theorem follows. □

A nearly obvious consequence of Theorem 3.7 is that $M^{\perp\perp} = M$ for any closed linear subspace $M$ of $H$ (Exercise 3.7).

A linear form $\ell$ on $H$ is a complex-valued linear function on $H$. Naturally $\ell$ is said to be continuous if $\ell(u_j) \to \ell(u)$ whenever $u_j \to u$. The set of continuous linear forms on a Banach space $B$ (or a more general topological vector space) is made into a linear space in an obvious way. This space is called the dual of $B$, and is denoted by $B'$. A continuous linear form on a Banach space $B$ has to be bounded in the sense that there is a constant $C$ such that $|\ell(u)| \le C\|u\|$ for any $u \in B$. For suppose not. Then there exists a sequence of elements $u_1, u_2, \dots$ of $B$ for which $|\ell(u_j)|/\|u_j\| \to \infty$. Setting $v_j = u_j/\ell(u_j)$ we then have $v_j \to 0$ but $|\ell(v_j)| = 1 \not\to 0$, so $\ell$ cannot be continuous. Conversely, if $\ell$ is bounded by $C$ then $|\ell(u_j) - \ell(u)| = |\ell(u_j - u)| \le C\|u_j - u\| \to 0$ if $u_j \to u$, so a bounded linear form is continuous. The smallest possible bound of a linear form $\ell$ is called the norm of $\ell$, denoted $\|\ell\|$.

It is easy to see that provided with this norm $B'$ is complete, so the dual of a Banach space is a Banach space (Exercise 3.8). A familiar example is given by the space $L^p(\Omega, \mu)$ for $1 \le p < \infty$, where $\Omega$ is a domain in $\mathbb{R}^n$ and $\mu$ a Radon measure defined in $\Omega$. The dual of this space is $L^q(\Omega, \mu)$, where $q$ is the conjugate exponent to $p$, in the sense that $\frac{1}{p} + \frac{1}{q} = 1$. A simple example of a bounded linear form on a Hilbert space $H$ is $\ell(u) = \langle u, v\rangle$, where $v$ is some fixed element of $H$. By the Cauchy-Schwarz inequality $|\ell(u)| \le \|v\|\|u\|$ so $\|\ell\| \le \|v\|$. But $\ell(v) = \|v\|^2$ so actually $\|\ell\| = \|v\|$. The following theorem, which has far-reaching consequences for many applications to analysis, says that this is the only kind of bounded linear form there is on a Hilbert space. In other words, the theorem allows us to identify the dual of a Hilbert space with the space itself.

Theorem 3.8 (Riesz' representation theorem). For any bounded linear form $\ell$ on $H$ there is a unique element $v \in H$ such that $\ell(u) = \langle u, v\rangle$ for all $u \in H$. The norm of $\ell$ is then $\|\ell\| = \|v\|$.

Proof. The uniqueness of $v$ is clear, since the difference of two possible choices of $v$ must be orthogonal to all of $H$ (for example to itself). If $\ell(u) = 0$ for all $u$ then we may take $v = 0$. Otherwise we set $M = \{u \in H \mid \ell(u) = 0\}$ which is obviously linear because $\ell$ is, and closed since $\ell$ is continuous. Since $M$ is not all of $H$ it has a normal $w \ne 0$ by Lemma 3.6, and we may assume $\|w\| = 1$. Note that $\ell(w) \ne 0$, since otherwise $w$ would be in $M$ and hence orthogonal to itself. If now $u$ is arbitrary in $H$ we put $u_1 = u - (\ell(u)/\ell(w))w$ so that $\ell(u_1) = \ell(u) - \ell(u) = 0$, i.e., $u_1 \in M$ so $\langle u_1, w\rangle = 0$. Hence $\langle u, w\rangle = (\ell(u)/\ell(w))\langle w, w\rangle = \ell(u)/\ell(w)$ so $\ell(u) = \langle u, v\rangle$ where $v = \overline{\ell(w)}\,w$. We have already proved that $\|\ell\| = \|v\|$. □
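In $\mathbb{C}^n$ the representing element can be written down directly, which makes the steps of the proof easy to follow numerically. In the sketch below the linear form is $\ell(u) = \sum_j a_j u_j$ with made-up coefficients `a`; the helper names and sample vectors are likewise assumptions for illustration only.

```python
import math

def inner(u, v):
    # Scalar product on C^n, linear in the first argument.
    return sum(x * y.conjugate() for x, y in zip(u, v))

def norm(u):
    return math.sqrt(inner(u, u).real)

a = [2 - 1j, 0.5j, -3 + 0j]  # coefficients of the linear form (illustrative)

def ell(u):
    # The bounded linear form ell(u) = sum_j a_j u_j on C^3.
    return sum(c * x for c, x in zip(a, u))

# Representing element: ell(u) = <u, v> with v_j equal to the conjugate of a_j.
v = [c.conjugate() for c in a]

# The construction in the proof: a unit normal w to the kernel M of ell,
# followed by v = conj(ell(w)) * w.
w = [x / norm(v) for x in v]
v_from_proof = [ell(w).conjugate() * x for x in w]

u = [1 + 1j, -2 + 0j, 0.5 - 4j]  # an arbitrary test vector
k = [a[1], -a[0], 0j]            # lies in M: ell(k) = a0*a1 - a1*a0 = 0
```

Both routes give the same vector `v`, and `w` is orthogonal to the kernel element `k`, as the proof requires.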

So far we have tacitly assumed that convergence in a Hilbert space means convergence in norm, i.e., $u_j \to u$ means $\|u_j - u\| \to 0$. This is called strong convergence; one writes $\text{s-lim}\, u_j = u$ or $u_j \to u$. There is also another notion of convergence which is very important. By definition $u_j$ tends to $u$ weakly, in symbols $\text{w-lim}\, u_j = u$ or $u_j \rightharpoonup u$, if $\langle u_j, v\rangle \to \langle u, v\rangle$ for every $v \in H$. It is obvious that strong convergence implies weak convergence to the same limit (the scalar product is continuous in its arguments by Cauchy-Schwarz), but the converse is not true (Exercise 3.9). We have the following important theorem.

Theorem 3.9. Every bounded sequence in $H$ has a weakly convergent subsequence. Conversely, every weakly convergent sequence is bounded.

Proof. The first claim is a consequence of the weak ∗ compactness<br />

of the unit ball of the dual of a Banach space. S<strong>in</strong>ce we do not<br />

want to assume knowledge of this, we will give a direct proof. To this<br />

end, suppose $v_1, v_2, \dots$ is the given sequence, bounded by $C$, and let<br />

$e_1, e_2, \dots$ be a complete orthonormal sequence in $H$. The numerical<br />

sequence $\{\langle v_j, e_1\rangle\}_{j=1}^\infty$ is then bounded and so, by the Bolzano-Weierstrass theorem,<br />

has a convergent subsequence, corresponding to a subsequence $\{v_{1j}\}_{j=1}^\infty$ of the $v_j$.<br />

The numerical sequence $\{\langle v_{1j}, e_2\rangle\}_{j=1}^\infty$<br />

is again bounded, so it has a convergent subsequence, corresponding<br />

to a subsequence $\{v_{2j}\}_{j=1}^\infty$ of $\{v_{1j}\}_{j=1}^\infty$. Proceeding in this manner we<br />

get a sequence of sequences $\{v_{kj}\}_{j=1}^\infty$, $k = 1, 2, \dots$, each element of<br />

which is a subsequence of those preceding it, and with the property that<br />

$\hat v_n = \lim_{j\to\infty}\langle v_{nj}, e_n\rangle$ exists. I claim that $\{v_{jj}\}_{j=1}^\infty$ converges weakly to<br />

$v = \sum_{n=1}^\infty \hat v_n e_n$. Note that $\{\langle v_{jj}, e_n\rangle\}_{j=1}^\infty$ converges to $\hat v_n$ since it is a subsequence<br />

of $\{\langle v_{nj}, e_n\rangle\}_{j=1}^\infty$ from $j = n$ on. Furthermore $\sum_{n=1}^N |\hat v_n|^2 \le C^2$<br />

for all $N$ since it is the limit as $j \to \infty$ of $\sum_{n=1}^N |\langle v_{Nj}, e_n\rangle|^2$, which<br />

by Bessel's inequality is bounded by $\|v_{Nj}\|^2 \le C^2$. It follows that<br />

$\sum_{n=1}^\infty |\hat v_n|^2 \le C^2$ so that $v$ is actually an element of $H$.<br />

To show the weak convergence, let $u = \sum_{n=1}^\infty \hat u_n e_n$ be arbitrary in<br />

$H$. Suppose $\varepsilon > 0$ given arbitrarily. Writing $u = u' + u''$ where<br />

$u' = \sum_{n=1}^N \hat u_n e_n$ we may now choose $N$ so large that $\|u''\| < \varepsilon$, so that<br />

$|\langle v_{jj}, u''\rangle| < C\varepsilon$. Furthermore $|\langle v, u''\rangle| < C\varepsilon$ and $\langle v_{jj}, u'\rangle \to \langle v, u'\rangle$,<br />

so $\limsup_{j\to\infty}|\langle v_{jj}, u\rangle - \langle v, u\rangle| \le 2C\varepsilon$. Since $\varepsilon > 0$ is arbitrary the weak<br />

convergence follows.<br />

The converse is an immediate consequence of the Banach-Ste<strong>in</strong>haus<br />

pr<strong>in</strong>ciple of uniform boundedness.<br />

Theorem 3.10 (Banach-Ste<strong>in</strong>haus). Let ℓ1, ℓ2, . . . be a sequence of<br />

bounded l<strong>in</strong>ear forms on a Banach space B which is po<strong>in</strong>twise bounded,<br />

i.e., such that for each u ∈ B the sequence ℓ1(u), ℓ2(u), . . . is bounded.<br />

Then ℓ1, ℓ2, . . . is uniformly bounded, i.e., there is a constant C such<br />

that $|\ell_j(u)| \le C\|u\|$ for every $u \in B$ and $j = 1, 2, \dots$.<br />

Assum<strong>in</strong>g Theorem 3.10 (for a proof, see Appendix A), we can<br />

complete the proof of Theorem 3.9, s<strong>in</strong>ce a weakly convergent sequence<br />

v1, v2, . . . can be identified with a sequence of l<strong>in</strong>ear forms ℓ1, ℓ2, . . .<br />

by sett<strong>in</strong>g ℓj(u) = 〈u, vj〉. S<strong>in</strong>ce a convergent sequence of numbers<br />

is bounded it follows that we have a po<strong>in</strong>twise bounded sequence of<br />

l<strong>in</strong>ear functionals. By Theorem 3.10 there is a constant C such that<br />

$|\langle u, v_j\rangle| \le C\|u\|$ for every $u \in H$ and $j = 1, 2, \dots$. In particular,<br />

setting $u = v_j$ gives $\|v_j\| \le C$ for every $j$.



Exercises for Chapter 3<br />

Exercise 3.1. Prove the completeness of $\ell^2$!<br />

Hint: Given a Cauchy sequence show first that each coordinate converges.<br />

Exercise 3.2. Prove that any Hilbert space is isometrically isomorphic<br />

to $\ell^2$, i.e., there is a bijective (one-to-one and onto) linear<br />

map $H \ni u \mapsto \hat u \in \ell^2$ such that $\langle u, v\rangle = \langle \hat u, \hat v\rangle$ for any $u$ and $v$ in $H$.<br />

Exercise 3.3. Suppose $L$ is a linear space with norm $\|\cdot\|$ which<br />

satisfies the parallelogram identity for all $u, v \in L$. Show that<br />

$\langle u, v\rangle = \frac{1}{4}\sum_{k=0}^{3} i^k \|u + i^k v\|^2$ is a scalar product on $L$.<br />

Hint: Show first that $\langle u, u\rangle = \|u\|^2$, that $\langle v, u\rangle = \overline{\langle u, v\rangle}$ and that<br />

$\langle iu, v\rangle = i\langle u, v\rangle$. Then show that $\langle u + v, w\rangle - \langle u, w\rangle - \langle v, w\rangle = 0$ and<br />

from that $\langle \lambda u, v\rangle = \lambda\langle u, v\rangle$ for any rational number $\lambda$. Finally use<br />

continuity.<br />

Exercise 3.4. Show that the semi-norm on the space $L_c$ defined<br />

in the text is well-defined, i.e., that the limit $\lim_{j\to\infty}\|u_j\|$ exists for any<br />

element $(u_1, u_2, \dots) \in L_c$. Then verify that $H = L_c/N_c$ can be given a<br />

norm under which it is complete, that L may be viewed as isometrically<br />

and densely embedded <strong>in</strong> H, and that H is a Euclidean space (a space<br />

with scalar product) if L is.<br />

Exercise 3.5. Show that if M and N are closed, orthogonal subspaces<br />

of H, then also M ⊕ N is closed.<br />

Exercise 3.6. Show that if A ⊂ H, then A ⊥ is a closed linear<br />

subspace of H, that A ⊂ B implies B ⊥ ⊂ A ⊥ and that A ⊂ (A ⊥ ) ⊥ .<br />

Exercise 3.7. Verify that M ⊥⊥ = M for any closed l<strong>in</strong>ear subspace<br />

M of H, and also that for an arbitrary set A ⊂ H the smallest closed<br />

l<strong>in</strong>ear subspace conta<strong>in</strong><strong>in</strong>g A is A ⊥⊥ .<br />

Exercise 3.8. Show that a bounded l<strong>in</strong>ear form on a Banach space<br />

B has a least bound, which is a norm on B ′ , and that B ′ is complete<br />

under this norm.<br />

Exercise 3.9. Show that an orthonormal sequence does not converge<br />

strongly to anyth<strong>in</strong>g but tends weakly to 0. Conclude that if <strong>in</strong> a<br />

Euclidean space every weakly convergent sequence is convergent, then<br />

the space is f<strong>in</strong>ite-dimensional.<br />

H<strong>in</strong>t: Show that the distance between two arbitrary elements <strong>in</strong> the<br />

sequence is √ 2 and use Bessel’s <strong>in</strong>equality to show weak convergence<br />

to 0.


CHAPTER 4<br />

Operators<br />

A bounded l<strong>in</strong>ear operator from a Banach space B1 to another Banach<br />

space B2 is a l<strong>in</strong>ear mapp<strong>in</strong>g T : B1 → B2 such that for some<br />

constant $C$ we have $\|Tu\|_2 \le C\|u\|_1$ for every $u \in B_1$. The smallest<br />

such constant $C$ is called the norm of the operator $T$ and denoted by<br />

$\|T\|$. As in the discussion of linear forms in the last chapter it follows<br />

that the boundedness of $T$ is equivalent to continuity, in the sense that<br />

$\|Tu_j - Tu\|_2 \to 0$ if $\|u_j - u\|_1 \to 0$ (Exercise 4.1). If $B_1 = B_2 = B$ one<br />

says that $T$ is an operator on $B$. The operator norm defined above has<br />

the following properties (here $T : B_1 \to B_2$ and $S$ are bounded linear<br />

operators, and $B_1$, $B_2$ and $B_3$ Banach spaces).<br />

(1) $\|T\| \ge 0$, equality only if $T = 0$,<br />

(2) $\|\lambda T\| = |\lambda|\,\|T\|$ for any $\lambda \in \mathbb{C}$,<br />

(3) $\|S + T\| \le \|S\| + \|T\|$ if $S : B_1 \to B_2$,<br />

(4) $\|ST\| \le \|S\|\,\|T\|$ if $S : B_2 \to B_3$.<br />

We leave the proof to the reader (Exercise 4.1). Thus we have made the<br />

set of bounded operators from B1 to B2 <strong>in</strong>to a normed space B(B1, B2).<br />

In fact, B(B1, B2) is a Banach space (Exercise 4.2). We write B(B)<br />

for the bounded operators on B. Because of the property (4) B(B) is<br />

called a Banach algebra.<br />

Now let H1 and H2 be <strong>Hilbert</strong> spaces. Then every bounded operator<br />

$T : H_1 \to H_2$ has an adjoint¹ $T^* : H_2 \to H_1$ defined as follows. Consider<br />

a fixed element $v \in H_2$ and the linear form $H_1 \ni u \mapsto \langle Tu, v\rangle_2$,<br />

which is obviously bounded by $\|T\|\,\|v\|_2$. By Riesz' representation<br />

theorem there is therefore a unique element $v^* \in H_1$ such that<br />

$\langle Tu, v\rangle_2 = \langle u, v^*\rangle_1$. By the uniqueness, and since $\langle Tu, v\rangle_2$ depends<br />

anti-linearly on $v$, it follows that $T^* : v \mapsto v^*$ is a linear operator from<br />

$H_2$ to $H_1$. It is also bounded, since $\|v^*\|_1^2 = \langle Tv^*, v\rangle_2 \le \|T\|\,\|v^*\|_1\|v\|_2$,<br />

so that $\|T^*\| \le \|T\|$. The adjoint has the following properties.<br />

Proposition 4.1. The adjo<strong>in</strong>t operation B(H1, H2) ∋ T ↦→ T ∗ ∈<br />

B(H2, H1) has the properties:<br />

(1) $(T_1 + T_2)^* = T_1^* + T_2^*$,<br />

(2) $(\lambda T)^* = \overline{\lambda}\,T^*$ for any complex number $\lambda$,<br />

(3) $(T_2T_1)^* = T_1^*T_2^*$ if $T_2 : H_2 \to H_3$,<br />

(4) $T^{**} = T$,<br />

(5) $\|T^*\| = \|T\|$,<br />

(6) $\|T^*T\| = \|T\|^2$.<br />

¹ Also operators between general Banach spaces, or even more general topological<br />

vector spaces, have adjoints, but they will not concern us here.<br />

Proof. The first four properties are very easy to show and are<br />

left as exercises for the reader. To prove (5), note that we already<br />

have shown that $\|T^*\| \le \|T\|$, and combining this with (4) gives the<br />

opposite inequality. Use of (5) shows that $\|T^*T\| \le \|T^*\|\,\|T\| = \|T\|^2$,<br />

and the opposite inequality follows from<br />

$\|Tu\|_2^2 = \langle T^*Tu, u\rangle_1 \le \|T^*Tu\|_1\|u\|_1 \le \|T^*T\|\,\|u\|_1^2$, so (6) follows. The reader is asked to fill<br />

in the details missing in the proof (Exercise 4.3).<br />
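In finite dimensions the adjoint is simply the conjugate transpose of the matrix. The following check on $\mathbb{C}^2$, with a matrix chosen purely for illustration, shows the defining identity $\langle Tu, v\rangle = \langle u, T^*v\rangle$ at work.<br />

```latex
\begin{align*}
T &= \begin{pmatrix} 1 & i\\ 0 & 2\end{pmatrix}, \qquad
T^* = \begin{pmatrix} 1 & 0\\ -i & 2\end{pmatrix},\\
\langle Tu, v\rangle &= (u_1 + i u_2)\,\overline{v_1} + 2u_2\,\overline{v_2},\\
\langle u, T^*v\rangle &= u_1\overline{v_1} + u_2\,\overline{(-i v_1 + 2 v_2)}
 = u_1\overline{v_1} + i u_2\overline{v_1} + 2u_2\overline{v_2}.
\end{align*}
```

The two expressions agree term by term, and taking the conjugate transpose twice returns $T$, in line with property (4).<br />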

If H1 = H2 = H3 = H, then the properties (1)–(4) above are the<br />

properties required for the star operation to be called an <strong>in</strong>volution<br />

on the algebra B(H), and a Banach algebra with an <strong>in</strong>volution, also<br />

satisfy<strong>in</strong>g (5) and (6), is called a B ∗ algebra. There are no less than<br />

three different useful notions of convergence for operators <strong>in</strong> B(H1, H2).<br />

We say that Tj tends to T<br />

• uniformly if $\|T_j - T\| \to 0$, denoted by $T_j \Rightarrow T$,<br />

• strongly if $\|T_ju - Tu\|_2 \to 0$ for every $u \in H_1$, denoted $T_j \to T$,<br />

and<br />

• weakly if 〈Tju, v〉2 → 〈T u, v〉2 for all u ∈ H1 and v ∈ H2,<br />

denoted Tj ⇀ T .<br />

It is clear that uniform convergence implies strong convergence and<br />

strong convergence implies weak convergence, and it is also easy to see<br />

that neither of these implications can be reversed.<br />
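One way to see that neither implication can be reversed is to look at powers of the shift operator on $\ell^2$. The sketch below uses the names $S$ and $e_n$ (the $n$-th unit vector) as notation introduced only for this illustration.<br />

```latex
\begin{align*}
S(x_1, x_2, \dots) &= (0, x_1, x_2, \dots), \qquad
S^*(x_1, x_2, \dots) = (x_2, x_3, \dots),\\
\|(S^*)^n x\|^2 &= \sum_{k>n} |x_k|^2 \to 0
 \quad\text{but}\quad \|(S^*)^n\| = 1
 \quad (\text{take } x = e_{n+1}),\\
|\langle S^n x, y\rangle| &= |\langle x, (S^*)^n y\rangle|
 \le \|x\|\,\|(S^*)^n y\| \to 0
 \quad\text{but}\quad \|S^n x\| = \|x\| .
\end{align*}
```

Thus $(S^*)^n$ tends to 0 strongly but not uniformly, while $S^n$ tends to 0 weakly but not strongly.<br />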

Of particular <strong>in</strong>terest are so called projection operators. A projection<br />

P on H is an operator <strong>in</strong> B(H) for which P 2 = P . If P is<br />

a projection then so is I − P , where I is the identity on H, s<strong>in</strong>ce<br />

(I − P )(I − P ) = I − P − P + P 2 = I − P . Sett<strong>in</strong>g M = P H and<br />

N = (I − P )H it follows that M is the null-space of I − P s<strong>in</strong>ce M<br />

clearly consists of those elements u ∈ H for which Pu = u. Similarly<br />

N is the null-space of P . S<strong>in</strong>ce P and I − P are bounded (i.e., cont<strong>in</strong>uous)<br />

it therefore follows that M and N are closed. It also follows<br />

that M ∩ N = {0} and the direct sum M ∔ N of M and N is H (this<br />

means that any element of H can be written uniquely as u + v with<br />

u ∈ M and v ∈ N). Conversely, if M and N are l<strong>in</strong>ear subspaces of H,<br />

M ∩ N = {0} and M ∔ N = H, then we may define a linear map P satisfying<br />

P² = P by setting Pw = u if w = u + v with u ∈ M and v ∈ N.<br />

As we have seen P cannot be bounded unless M and N are closed.<br />

There is a converse to this: If M and N are closed, then P is bounded.<br />

This follows immediately from the closed graph theorem (Exercise 4.4).<br />

In the case when the projection P , and thus also I − P , is bounded,<br />

the direct sum M ∔ N is called topological. If M and N happen to be<br />

orthogonal subspaces P is called an orthogonal projection. Obviously<br />

N = M ⊥ then, s<strong>in</strong>ce the direct sum of M and N is all of H. We have<br />

the follow<strong>in</strong>g characterization of orthogonal projections.



Proposition 4.2. A projection P is orthogonal if and only if it<br />

satisfies P ∗ = P .<br />

Proof. If P ∗ = P and u ∈ M, v ∈ N, then 〈u, v〉 = 〈P u, v〉 =<br />

〈u, P ∗ v〉 = 〈u, P v〉 = 〈u, 0〉 = 0 so M and N are orthogonal. Conversely,<br />

suppose M and N orthogonal. For arbitrary u, v ∈ H we<br />

then have 〈P u, v〉 = 〈P u, P v〉 + 〈P u, (I − P )v〉 = 〈P u, P v〉 so that<br />

also 〈u, P v〉 = 〈P u, P v〉. Hence 〈P u, v〉 = 〈u, P v〉 holds generally, i.e.,<br />

P ∗ = P . <br />
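A projection that is not orthogonal is easy to exhibit in $\mathbb{C}^2$. The matrix below is an illustrative choice; it shows a projection with $P^* \ne P$, non-orthogonal range and null-space, and norm larger than 1.<br />

```latex
\begin{align*}
P &= \begin{pmatrix} 1 & 1\\ 0 & 0\end{pmatrix}, \qquad
P^2 = P, \qquad
P^* = \begin{pmatrix} 1 & 0\\ 1 & 0\end{pmatrix} \ne P,\\
M &= PH = \operatorname{span}\{(1,0)\}, \qquad
N = (I-P)H = \operatorname{span}\{(1,-1)\}, \qquad
\langle (1,0), (1,-1)\rangle = 1 \ne 0,\\
PP^* &= \begin{pmatrix} 2 & 0\\ 0 & 0\end{pmatrix}
 \quad\Longrightarrow\quad \|P\| = \sqrt{\|PP^*\|} = \sqrt2 > 1 .
\end{align*}
```

The last step uses property (6) of Proposition 4.1 applied to $P^*$.<br />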

An operator T for which T ∗ = T is called selfadjo<strong>in</strong>t. Hence an<br />

orthogonal projection is the same as a selfadjo<strong>in</strong>t projection. We will<br />

have much more to say about selfadjo<strong>in</strong>t operators <strong>in</strong> a more general<br />

context later. Another class of operators of great <strong>in</strong>terest are the unitary<br />

operators. This is an operator U : H1 → H2 for which U ∗ = U −1 .<br />

S<strong>in</strong>ce 〈Uu, Uv〉2 = 〈U ∗ Uu, v〉1 = 〈u, v〉1 the operator U preserves the<br />

scalar product; such an operator is called isometric. If U is isometric<br />

we have 〈u, v〉1 = 〈Uu, Uv〉2 = 〈U ∗ Uu, v〉1, so that U ∗ is a left<br />

<strong>in</strong>verse of U for any isometric operator. If dim H1 = dim H2 < ∞,<br />

then a left <strong>in</strong>verse of a l<strong>in</strong>ear operator is also a right <strong>in</strong>verse, so <strong>in</strong><br />

this case isometric and unitary (orthogonal <strong>in</strong> the case of a real space)<br />

are the same thing. If dim H1 ≠ dim H2 or both spaces are infinite-dimensional,<br />

however, this is not the case. For example, <strong>in</strong> the space<br />

ℓ 2 we may def<strong>in</strong>e U(x1, x2, . . . ) = (0, x1, x2, . . . ), which is obviously<br />

isometric (this is a so called shift operator), but the vector (1, 0, 0, . . . )<br />

is not the image of anyth<strong>in</strong>g, so the operator is not unitary. Its adjo<strong>in</strong>t<br />

is U ∗ (x1, x2, . . . ) = (x2, x3, . . . ), which is only a partial isometry,<br />

namely an isometry on the vectors orthogonal to (1, 0, 0, . . . ). See also<br />

Exercise 4.8.<br />
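The two compositions of the shift operator with its adjoint can be written out directly. In the sketch below $e_1 = (1, 0, 0, \dots)$ is notation introduced for the illustration.<br />

```latex
\begin{align*}
U^*Ux &= U^*(0, x_1, x_2, \dots) = (x_1, x_2, \dots) = x,\\
UU^*x &= U(x_2, x_3, \dots) = (0, x_2, x_3, \dots) = x - \langle x, e_1\rangle e_1 .
\end{align*}
```

So $U^*U$ is the identity, whereas $UU^*$ is the orthogonal projection onto the vectors orthogonal to $e_1$; this is exactly the failure of $U^*$ to be a right inverse.<br />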

It is never possible to <strong>in</strong>terpret a differential operator as a bounded<br />

operator on some <strong>Hilbert</strong> space of functions. We therefore need to<br />

discuss unbounded operators as well. Similarly, we will need to discuss<br />

operators that are not def<strong>in</strong>ed on all of H. Thus we now consider a<br />

l<strong>in</strong>ear operator T : D(T ) → H2, where the doma<strong>in</strong> D(T ) of T is some<br />

l<strong>in</strong>ear subset of H1. T is not supposed bounded. Another such operator<br />

S is said to be an extension of T if D(T ) ⊂ D(S) and Su = T u for<br />

every u ∈ D(T ). We then write T ⊂ S. We must discuss the concept<br />

of adjo<strong>in</strong>t. The form u ↦→ 〈T u, v〉2 is, for fixed v ∈ H2, only def<strong>in</strong>ed for<br />

u ∈ D(T ), and though l<strong>in</strong>ear not necessarily bounded, so there may not<br />

be any v ∗ ∈ H1 such that 〈T u, v〉2 = 〈u, v ∗ 〉1 for all u ∈ D(T ). Even<br />

if there is, it may not be uniquely determ<strong>in</strong>ed, s<strong>in</strong>ce if w ∈ D(T ) ⊥ we<br />

could replace v ∗ by v ∗ + w with no change <strong>in</strong> 〈u, v ∗ 〉. We therefore<br />

make the basic assumption that D(T ) ⊥ = {0}, i.e., D(T ) is dense <strong>in</strong><br />

H1. T is then said to be densely defined². In this case v* ∈ H1 is<br />

clearly uniquely determined by v ∈ H2, if it exists. It is also obvious<br />

that v* depends linearly on v, so we define D(T*) to be those v ∈ H2<br />

for which we can find a v* ∈ H1, and set T*v = v*. There is no reason<br />

to expect the adjoint T* to be densely defined. In fact, we may have<br />

D(T*) = {0}, so T* may not itself have an adjoint. To understand this<br />

rather confusing situation it turns out to be useful to consider graphs<br />

of operators.<br />

² We will discuss the case of an operator which is not densely defined in Chapter 9.<br />

The graph of T is the set GT = {(u, T u) | u ∈ D(T )}. This set is<br />

clearly l<strong>in</strong>ear and may be considered a l<strong>in</strong>ear subset of the orthogonal<br />

direct sum H1 ⊕ H2, consist<strong>in</strong>g of all pairs (u1, u2) with u1 ∈ H1 and<br />

u2 ∈ H2 with the natural l<strong>in</strong>ear operations and provided with the scalar<br />

product 〈(u1, u2), (v1, v2)〉 = 〈u1, v1〉1 + 〈u2, v2〉2. This makes H1 ⊕ H2<br />

<strong>in</strong>to a <strong>Hilbert</strong> space (Exercise 4.6).<br />

We now def<strong>in</strong>e the boundary operator U : H1 ⊕ H2 → H2 ⊕ H1 by<br />

U(u1, u2) = (−iu2, iu1) (the term<strong>in</strong>ology is expla<strong>in</strong>ed <strong>in</strong> Chapter 9). It<br />

is clear that U is isometric and surjective (onto H2 ⊕ H1). It follows<br />

that U is unitary. If H1 = H2 = H it is clear that U is selfadjoint and<br />

involutory (i.e., U² is the identity). Now put<br />

(4.1) (GT ) ∗ := U((H1 ⊕ H2) ⊖ GT ) = (H2 ⊕ H1) ⊖ UGT .<br />

The second equality is left to the reader to verify who should also<br />

verify that (GT ) ∗ is a graph of an operator (i.e., the second component<br />

of each element <strong>in</strong> (GT ) ∗ is uniquely determ<strong>in</strong>ed by the first) if and<br />

only if T is densely def<strong>in</strong>ed. If T is densely def<strong>in</strong>ed we now def<strong>in</strong>e T ∗<br />

to be the operator whose graph is (GT ) ∗ . This means that T ∗ is the<br />

operator whose graph consists of all pairs (v, v ∗ ) ∈ H2 ⊕ H1 such that<br />

〈T u, v〉2 = 〈u, v ∗ 〉1 for all u ∈ D(T ), i.e., our orig<strong>in</strong>al def<strong>in</strong>ition. An<br />

immediate consequence of (4.1) is that T ⊂ S implies S ∗ ⊂ T ∗ .<br />
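The identification of $(G_T)^*$ with the original definition of the adjoint comes down to a one-line computation in $H_2 \oplus H_1$, sketched here with the convention that the scalar product is anti-linear in its second argument.<br />

```latex
\begin{align*}
U(u, Tu) &= (-iTu,\; iu), \qquad u \in D(T),\\
\bigl\langle (v, v^*),\, (-iTu,\, iu)\bigr\rangle
 &= \langle v, -iTu\rangle_2 + \langle v^*, iu\rangle_1
 = i\langle v, Tu\rangle_2 - i\langle v^*, u\rangle_1,\\
(v, v^*) \perp U G_T
 &\iff \langle v, Tu\rangle_2 = \langle v^*, u\rangle_1
 \iff \langle Tu, v\rangle_2 = \langle u, v^*\rangle_1
 \quad\text{for all } u \in D(T).
\end{align*}
```

The last equivalence is obtained by taking complex conjugates of both sides.<br />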

We say that an operator is closed if its graph is closed as a subspace<br />

of H1 ⊕ H2. This is an important property; <strong>in</strong> many ways the<br />

property of be<strong>in</strong>g closed is almost as good as be<strong>in</strong>g bounded. An everywhere<br />

def<strong>in</strong>ed operator is actually closed if and only if it is bounded<br />

(Exercise 4.7). It is clear that all adjo<strong>in</strong>ts, hav<strong>in</strong>g graphs that are orthogonal<br />

complements, are closed. Not all operators are closeable, i.e.,<br />

have closed extensions; this requires that the closure $\overline{G_T}$ of $G_T$<br />

is a graph. But it is clear from (4.1) that the closure of the graph is<br />

$((G_T)^*)^*$. So, we have proved the following proposition.<br />

Proposition 4.3. Suppose T is a densely def<strong>in</strong>ed operator <strong>in</strong> a<br />

<strong>Hilbert</strong> space H. Then T is closeable if and only if the adjo<strong>in</strong>t T ∗ is<br />

densely defined. The smallest closed extension (the closure) $\overline{T}$ of $T$ is<br />

then $T^{**}$.<br />

The proof is left to Exercise 4.9. Note that if T is closed, its doma<strong>in</strong><br />

D(T) becomes a Hilbert space if provided with the scalar product<br />

〈u, v〉T = 〈u, v〉1 + 〈T u, T v〉2.



In the rest of this chapter we assume that H1 = H2 = H. A densely<br />

def<strong>in</strong>ed operator T is then said to be symmetric if T ⊂ T ∗ . In other<br />

words, if 〈T u, v〉 = 〈u, T v〉 for all u, v ∈ D(T ). Thus 〈T u, u〉 is always<br />

real for a symmetric operator. It therefore makes sense to say that<br />

a symmetric operator is positive if 〈T u, u〉 ≥ 0 for all u ∈ D(T ). A<br />

densely def<strong>in</strong>ed symmetric operator is always closeable s<strong>in</strong>ce T ∗ is automatically<br />

densely def<strong>in</strong>ed, be<strong>in</strong>g an extension of T . If actually T = T ∗<br />

the operator is said to be selfadjo<strong>in</strong>t. This is an important property<br />

because these are the operators for which we will prove the spectral<br />

theorem. In practice it is usually quite easy to see if an operator is<br />

symmetric, but much more difficult to decide whether a symmetric operator<br />

is selfadjo<strong>in</strong>t. When one wants to <strong>in</strong>terpret a differential operator<br />

as a <strong>Hilbert</strong> space operator one has to choose a doma<strong>in</strong> of def<strong>in</strong>ition;<br />

<strong>in</strong> many cases it is clear how one may choose a dense doma<strong>in</strong> so that<br />

the operator becomes symmetric. With luck this operator may have<br />

a selfadjo<strong>in</strong>t closure 3 , <strong>in</strong> which case the operator is said to be essentially<br />

selfadjo<strong>in</strong>t. Otherwise, given a symmetric T , one will look for<br />

selfadjo<strong>in</strong>t extensions of T . If S is a symmetric extension of T , we get<br />

T ⊂ S ⊂ S ∗ ⊂ T ∗ so that any selfadjo<strong>in</strong>t extension of T is a restriction<br />

of the adjo<strong>in</strong>t T ∗ . There is now obviously a need for a theory of<br />

symmetric extensions of a symmetric operator. We will postpone the<br />

discussion of this until Chapter 9. Right now we will <strong>in</strong>stead study<br />

some very simple, but typical, examples.<br />

Example 4.4. Consider the differential operator $\frac{d}{dx}$ on some open<br />

interval $I$. We want to interpret it as a densely defined operator in<br />

the Hilbert space $L^2(I)$ and so must choose a suitable domain. A convenient<br />

choice, which would work for any differential operator with<br />

smooth coefficients, is the set $C_0^\infty(I)$ of infinitely differentiable functions<br />

on $I$ with compact support, i.e., each function is 0 outside some<br />

compact subset of $I$. It is well known that $C_0^\infty(I)$ is dense in $L^2(I)$.<br />

Let us denote the corresponding operator $T_0$; it is usually called the<br />

minimal operator for $\frac{d}{dx}$. Sometimes it is the closure of this operator<br />

which is called the minimal operator, but this will make no difference<br />

to the calculations in the sequel. We now need to calculate the adjoint<br />

of the minimal operator.<br />

Let $v \in D(T_0^*)$. This means that there is an element $v^* \in L^2(I)$<br />

such that $\int_I \varphi'\,\overline{v} = \int_I \varphi\,\overline{v^*}$ for all $\varphi \in C_0^\infty(I)$ and that $T_0^*v = v^*$.<br />

Integrating by parts we have $\int_I \varphi\,\overline{v^*} = -\int_I \varphi'\,\overline{\textstyle\int v^*}$ since the boundary<br />

terms vanish. Here $\int v^*$ denotes any integral function of $v^*$. Thus we<br />

have $\int_I \varphi'\,\overline{\bigl(v + \textstyle\int v^*\bigr)} = 0$ for all $\varphi \in C_0^\infty(I)$. We need the following<br />

lemma.<br />

³ This is the same as $T^*$ being selfadjoint. Show this!<br />

Lemma 4.5 (du Bois Reymond). Suppose $u$ is locally square integrable<br />

on $\mathbb{R}$, i.e., $u \in L^2(I)$ for every bounded real interval $I$. Also<br />

suppose that $\int_{-\infty}^{\infty} u\varphi' = 0$ for every $\varphi \in C_0^\infty(\mathbb{R})$. Then $u$ is (almost everywhere)<br />

equal to a constant.<br />

Assuming the truth of the lemma for the moment it follows that,<br />

choosing the appropriate representative in the equivalence class of $v$,<br />

$v + \int v^*$ is constant. Hence $v$ is locally absolutely continuous with<br />

derivative $-v^*$. It follows that $D(T_0^*)$ consists of functions in $L^2(I)$<br />

which are locally absolutely continuous in $I$ with derivative in $L^2(I)$,<br />

and that $T_0^*v = -v'$. Conversely, all such functions are in $D(T_0^*)$,<br />

as follows immediately by partial integration in $\int_I \varphi'\,\overline{v}$, which gives $\int_I \varphi\,\overline{v^*}$ with $v^* = -v'$. The<br />

operator $T_0^*$ is therefore also a differential operator, generated by $-\frac{d}{dx}$.<br />

The differential operator $-\frac{d}{dx}$ is called the formal adjoint of $\frac{d}{dx}$ and<br />

the operator $T_0^*$ is called the maximal operator belonging to $-\frac{d}{dx}$. In<br />

the same way any l<strong>in</strong>ear differential operator (with sufficiently smooth<br />

coefficients) has a formal adjo<strong>in</strong>t, obta<strong>in</strong>ed by <strong>in</strong>tegration by parts.<br />

For ord<strong>in</strong>ary differential operators with smooth coefficients one can<br />

always calculate adjo<strong>in</strong>ts <strong>in</strong> essentially the way we just did; for partial<br />

differential operators matters are more subtle and one needs to use the<br />

language of distribution theory.<br />
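As a second instance of computing a formal adjoint by integration by parts, consider the Sturm-Liouville expression $-\frac{d^2}{dx^2} + q$. This is a sketch under the assumption that $q$ is a continuous, possibly complex-valued function on $I$; the symbols $q$ and $\psi$ are introduced here for illustration only.<br />

```latex
\begin{align*}
\int_I (-\varphi'' + q\varphi)\,\overline{\psi}
 &= \int_I \varphi'\,\overline{\psi'} + \int_I q\,\varphi\,\overline{\psi}
 = \int_I \varphi\,\overline{(-\psi'')} + \int_I \varphi\,\overline{\overline{q}\,\psi}\\
 &= \int_I \varphi\,\overline{\bigl(-\psi'' + \overline{q}\,\psi\bigr)},
 \qquad \varphi, \psi \in C_0^\infty(I).
\end{align*}
```

The boundary terms vanish because of the compact supports. The formal adjoint is therefore $-\frac{d^2}{dx^2} + \overline{q}$, which coincides with the original expression precisely when $q$ is real-valued.<br />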

Proof of Lemma 4.5. Let $\psi \in C_0^\infty(\mathbb{R})$ and assume that $\int_{-\infty}^{\infty}\psi = 1$.<br />

Given $\varphi \in C_0^\infty(\mathbb{R})$ we put $\varphi_0(x) = \psi(x)\int_{-\infty}^{\infty}\varphi$ and $\Phi(x) = \int_{-\infty}^{x}(\varphi - \varphi_0)$.<br />

It is clear that $\Phi$ is infinitely differentiable. It also has compact support<br />

(why?), so $\int_{-\infty}^{\infty} u\Phi' = 0$ by assumption. But<br />

$\int_{-\infty}^{\infty} u\Phi' = \int_{-\infty}^{\infty} u\varphi - \int_{-\infty}^{\infty} u\psi \int_{-\infty}^{\infty}\varphi$, so that<br />

$\int_{-\infty}^{\infty}(u - K)\varphi = 0$ where $K = \int_{-\infty}^{\infty} u\psi$ does not<br />

depend on $\varphi$. Since $C_0^\infty(\mathbb{R})$ is dense in $L^2(\mathbb{R})$ this proves that $u = K$<br />

a.e., so that $u$ is constant.<br />

For the m<strong>in</strong>imal operator of a differential operator to be symmetric<br />

it is clear that the differential operator has to be formally symmetric,<br />

i.e., the formal adjo<strong>in</strong>t has to co<strong>in</strong>cide with the orig<strong>in</strong>al operator.<br />

In Example 4.4 $D(T_0) \subset D(T_0^*)$ but there is a minus sign preventing<br />

$T_0$ from being symmetric. However, it is clear that had we started<br />

with the differential operator $-i\frac{d}{dx}$ instead, then the minimal operator<br />

would have been symmetric, but the domains of the minimal and<br />

maximal operators unchanged. One may then ask for possible selfadjo<strong>in</strong>t<br />

extensions of the m<strong>in</strong>imal operator, or equivalently for selfadjo<strong>in</strong>t<br />

restrictions of the maximal operator.<br />

Example 4.6. Let $T_1$ be the maximal operator of $-i\frac{d}{dx}$ on the<br />

interval $I$. Let $u, v \in D(T_1)$ and $a, b \in I$. Then<br />

$\int_a^b T_1u\,\overline{v} - \int_a^b u\,\overline{T_1v} = -i\int_a^b (u'\overline{v} + u\overline{v'}) = iu(a)\overline{v(a)} - iu(b)\overline{v(b)}$.<br />

Since $u$, $v$, $T_1u$ and $T_1v$ are all<br />

in $L^2(I)$ the limit of $u\overline{v}$ exists in both endpoints of $I$. Consider the case<br />

$I = \mathbb{R}$. Since $|u(x)|^2$ has limits as $x \to \pm\infty$ and is integrable, the limits<br />

must both be 0. Hence $\langle T_1u, v\rangle - \langle u, T_1v\rangle = 0$ for any $u, v \in D(T_1)$, so<br />

the maximal operator is symmetric and therefore selfadjoint (how does<br />

this follow?). It also follows that the maximal operator is the closure of<br />

the minimal operator so the minimal operator is essentially selfadjoint.<br />

Example 4.7. Consider the same operator as in Example 4.6 but<br />

for the interval $(0, \infty)$. If $u \in D(T_1)$ we obtain $\langle T_1u, u\rangle - \langle u, T_1u\rangle = i|u(0)|^2$.<br />

To have a symmetric restriction of $T_1$ we must therefore require<br />

$u(0) = 0$, and with this restriction on the domain of $T_1$ we obtain a<br />

maximal symmetric operator $T$. If now $u \in D(T)$ and $v \in D(T_1)$ we<br />

obtain $\langle Tu, v\rangle - \langle u, T_1v\rangle = iu(0)\overline{v(0)} = 0$ so that $T^* = T_1$. $T$ is therefore<br />

not selfadjoint, so no matter how we choose the domain the differential<br />

operator $-i\frac{d}{dx}$, though formally symmetric, will not be selfadjoint in<br />

$L^2(0, \infty)$. One says that $-i\frac{d}{dx}$ has no selfadjoint realization in $L^2(0, \infty)$.<br />

Example 4.8. We f<strong>in</strong>ally consider the operator of Example 4.6 for<br />

the <strong>in</strong>terval (−π, π). We now have<br />

(4.2) $\langle T_1u, v\rangle - \langle u, T_1v\rangle = -i\bigl(u(\pi)\overline{v(\pi)} - u(-\pi)\overline{v(-\pi)}\bigr)$.<br />

In particular, for $u = v$ it follows that for $u$ to be in the domain of a<br />

symmetric restriction of $T_1$ we must require $|u(\pi)| = |u(-\pi)|$ so that $u$<br />

satisfies the boundary condition $u(\pi) = e^{i\theta}u(-\pi)$ for some real $\theta$. From<br />

(4.2) then follows that if v is <strong>in</strong> the doma<strong>in</strong> of the adjo<strong>in</strong>t, then v will<br />

have to satisfy the same boundary condition. On the other hand, if we<br />

impose this condition, then the result<strong>in</strong>g operator will be selfadjo<strong>in</strong>t<br />

(because its adjo<strong>in</strong>t will be symmetric). It follows that restrict<strong>in</strong>g the<br />

doma<strong>in</strong> of T1 by such a boundary condition is exactly what is required<br />

to obta<strong>in</strong> a selfadjo<strong>in</strong>t restriction. Each θ <strong>in</strong> [0, 2π) gives a different<br />

selfadjo<strong>in</strong>t realization, but there are no others.<br />
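To see how the realizations differ, one can compute the eigenvalues of the operator $-i\frac{d}{dx}$ with boundary condition $u(\pi) = e^{i\theta}u(-\pi)$. The following is a sketch; the labels $\lambda_k$ and $c$ are introduced for the illustration.<br />

```latex
\begin{align*}
-iu' = \lambda u &\implies u(x) = c\,e^{i\lambda x},\\
u(\pi) = e^{i\theta}u(-\pi) &\iff e^{2\pi i\lambda} = e^{i\theta}
 \iff \lambda = \lambda_k = k + \frac{\theta}{2\pi}, \quad k \in \mathbb{Z},\\
\int_{-\pi}^{\pi} e^{i\lambda_k x}\,\overline{e^{i\lambda_m x}}\,dx
 &= \int_{-\pi}^{\pi} e^{i(k-m)x}\,dx = 0 \quad (k \ne m).
\end{align*}
```

All eigenvalues are real, as they must be for a selfadjoint operator, and the eigenfunctions are mutually orthogonal. For $\theta = 0$ they are the exponentials $e^{ikx}$ of classical Fourier series, and different values of $\theta$ in $[0, 2\pi)$ give different sets of eigenvalues.<br />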

The examples show that there may be a unique selfadjo<strong>in</strong>t realization<br />

of our formally symmetric differential operator, none at all, or<br />

<strong>in</strong>f<strong>in</strong>itely many depend<strong>in</strong>g on circumstances. It can be a very difficult<br />

problem to decide which of these possibilities occur <strong>in</strong> a given case.<br />

In particular, much effort has been devoted to decide whether a given<br />

differential operator on a given doma<strong>in</strong> has a unique selfadjo<strong>in</strong>t realization.



Exercises for Chapter 4<br />

Exercise 4.1. Prove that boundedness is equivalent to cont<strong>in</strong>uity<br />

for a l<strong>in</strong>ear operator between normed spaces. Then prove the properties<br />

of the operator norm listed at the beg<strong>in</strong>n<strong>in</strong>g of the chapter.<br />

Exercise 4.2. Suppose B1 and B2 are Banach spaces. Show that<br />

so is B(B1, B2).<br />

Exercise 4.3. Fill <strong>in</strong> the details of the proof of Proposition 4.1.<br />

Exercise 4.4. Show that if M and N are closed subspaces of H<br />

with M ∩ N = {0} and M ∔ N = H, then the corresponding projections<br />

onto M and N are bounded operators.<br />

H<strong>in</strong>t: The closed graph theorem!<br />

Exercise 4.5. Show that a non-trivial (i.e., the range is not {0})<br />

projection is orthogonal if and only if its operator norm is 1.<br />

Exercise 4.6. Suppose H1 and H2 are <strong>Hilbert</strong> spaces. Show that<br />

the orthogonal direct sum H1 ⊕ H2 is also a <strong>Hilbert</strong> space.<br />

Exercise 4.7. Show that a bounded, everywhere def<strong>in</strong>ed operator<br />

is automatically closed. Conversely, that an everywhere def<strong>in</strong>ed, closed<br />

operator is bounded.<br />

H<strong>in</strong>t: The closed graph theorem!<br />

Exercise 4.8. Show that if U is unitary, then all eigen-values λ of<br />

U have absolute value |λ| = 1. Also show that if e1 and e2 are eigenvectors<br />

correspond<strong>in</strong>g to eigen-values λ1 and λ2 respectively, then e1<br />

and e2 are orthogonal if λ1 = λ2.<br />

Exercise 4.9. Show that if T is densely def<strong>in</strong>ed and closeable, then<br />

the closure of T is T ∗∗ .


CHAPTER 5<br />

Resolvents<br />

We now consider a closed, densely defined operator $T$ in the Hilbert<br />
space $H$. We define the solvability and deficiency spaces of $T$ at $\lambda$ by<br />
\[ S_\lambda = \{u \in H \mid (T - \lambda)v = u \text{ for some } v \in D(T)\}, \]<br />
\[ D_\lambda = \{u \in D(T^*) \mid T^*u = \lambda u\}. \]<br />
The following basic lemma is valid.<br />

Lemma 5.1. Suppose $T$ is closed and densely defined. Then<br />
(1) $D_{\bar\lambda} = H \ominus S_\lambda$.<br />
(2) If $T$ is symmetric and $\operatorname{Im}\lambda \ne 0$, then $S_\lambda$ is closed and $H = S_\lambda \oplus D_{\bar\lambda}$.<br />
(3) If $T$ is selfadjoint and $\operatorname{Im}\lambda \ne 0$, then $(T - \lambda)v = u$ is uniquely<br />
solvable for any $u \in H$ (i.e., $S_\lambda = H$), $T$ has no non-real<br />
eigen-values (i.e., $D_\lambda = \{0\}$), and $\|v\| \le \frac{1}{|\operatorname{Im}\lambda|}\|u\|$.<br />

Proof. Any element of the graph of $T$ is of the form $(v, \lambda v + u)$,<br />
where $u \in S_\lambda$. To see this, simply put $u = Tv - \lambda v$ for any $v \in D(T)$.<br />
Now $\langle Tv, w\rangle - \langle v, \bar\lambda w\rangle = \langle u + \lambda v, w\rangle - \langle v, \bar\lambda w\rangle = \langle u, w\rangle$, so it follows<br />
that $(w, \bar\lambda w) \in G_{T^*}$, i.e., $w \in D_{\bar\lambda}$, if and only if $w$ is orthogonal to $S_\lambda$.<br />
This proves (1).<br />

If $T$ is symmetric and $(v, \lambda v + u) \in G_T$, then $\langle \lambda v + u, v\rangle = \langle v, \lambda v + u\rangle$,<br />
i.e., $\operatorname{Im}\lambda\,\|v\|^2 = \operatorname{Im}\langle v, u\rangle$, whose absolute value is $\le \|v\|\|u\|$ by Cauchy-Schwarz' inequality.<br />
If $\operatorname{Im}\lambda \ne 0$ we obtain $\|v\| \le \frac{1}{|\operatorname{Im}\lambda|}\|u\|$, so that $v$ is uniquely<br />
determined by $u$; in particular $T$ has no non-real eigen-values. Furthermore,<br />
suppose that $u_1, u_2, \dots$ is a sequence in $S_\lambda$ converging to $u$, and<br />
that $(v_j, \lambda v_j + u_j) \in G_T$. Then $v_1, v_2, \dots$ is also a Cauchy sequence,<br />
since $\|v_j - v_k\| \le \frac{1}{|\operatorname{Im}\lambda|}\|u_j - u_k\|$. Thus $v_j$ tends to some limit $v$, and<br />
since $T$ is closed we have $(v, \lambda v + u) \in G_T$. Hence $u \in S_\lambda$, so that $S_\lambda$<br />
is closed and (2) follows.<br />

Finally, if $T$ is self-adjoint, then $T^* = T$ is symmetric so it has no<br />
non-real eigen-values. If $\operatorname{Im}\lambda \ne 0$ it follows that $D_\lambda = \{0\}$ so that (3)<br />
follows and the proof is complete. □<br />

In the rest of this chapter we assume that $T$ is a selfadjoint operator.<br />
We define the resolvent set of $T$ as<br />
\[ \rho(T) = \{\lambda \in \mathbb{C} \mid T - \lambda \text{ has a bounded, everywhere defined inverse}\}, \]<br />
and the spectrum $\sigma(T)$ of $T$ as the complement of $\rho(T)$. By Lemma 5.1 (3)<br />
the spectrum is a subset of the real line. For every $\lambda \in \rho(T)$ we now<br />
define the resolvent of $T$ at $\lambda$ as the operator $R_\lambda = (T - \lambda)^{-1}$. The<br />
resolvent has the following properties.<br />

Theorem 5.2. The resolvent of a selfadjoint operator $T$ has the<br />
properties:<br />
(1) $\|R_\lambda\| \le 1/|\operatorname{Im}\lambda|$ if $\operatorname{Im}\lambda \ne 0$.<br />
(2) $(R_\lambda)^* = R_{\bar\lambda}$ for $\lambda \in \rho(T)$.<br />
(3) $R_\lambda - R_\mu = (\lambda - \mu)R_\lambda R_\mu$ for $\lambda$ and $\mu \in \rho(T)$.<br />
The last statement is called the (first) resolvent relation.<br />

Proof. The first claim is simply a re-statement of Lemma 5.1 (3).<br />
Note that all elements of $G_T$ are of the form $(R_\lambda u, \lambda R_\lambda u + u)$. Now<br />
$w^* = (R_\lambda)^* w$ precisely if $\langle R_\lambda u, w\rangle = \langle u, w^*\rangle$ for all $u \in H$. Adding<br />
$\lambda\langle R_\lambda u, w^*\rangle$ to both sides we obtain $\langle R_\lambda u, \bar\lambda w^* + w\rangle = \langle \lambda R_\lambda u + u, w^*\rangle$,<br />
so that $(w^*, \bar\lambda w^* + w) \in G_T$, i.e., $w^* = R_{\bar\lambda} w$. This proves (2). Finally,<br />
suppose $(w, \mu w + u) \in G_T$. Since $G_T$ is linear it follows that $(v, \lambda v + u) \in G_T$<br />
if and only if $(v, \lambda v + u) - (w, \mu w + u) = (v - w, \lambda(v - w) + (\lambda - \mu)w) \in G_T$.<br />
But this means exactly that (3) holds. □<br />
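The three properties of Theorem 5.2 can be checked numerically in a finite-dimensional setting. The following Python sketch is only an illustration under assumptions that are not in the notes: $H = \mathbb{C}^2$, an arbitrarily chosen Hermitian matrix `T`, arbitrary non-real points `lam` and `mu`, and hypothetical helper names.<br />

```python
# Illustrative assumption: T is a 2x2 Hermitian matrix, hence selfadjoint on C^2,
# so every non-real number belongs to the resolvent set.

def resolvent(T, lam):
    """Return R_lambda = (T - lambda)^(-1) for a 2x2 matrix T."""
    a, b = T[0][0] - lam, T[0][1]
    c, d = T[1][0], T[1][1] - lam
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def apply(A, u):
    return [A[0][0] * u[0] + A[0][1] * u[1], A[1][0] * u[0] + A[1][1] * u[1]]

def norm(u):
    return (abs(u[0]) ** 2 + abs(u[1]) ** 2) ** 0.5

def adjoint(A):
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]

def max_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(2) for j in range(2))

T = [[2 + 0j, 1 - 1j], [1 + 1j, -1 + 0j]]   # equals its own adjoint
lam, mu = 0.5 + 0.25j, -1.0 - 2.0j
R_lam, R_mu = resolvent(T, lam), resolvent(T, mu)

# (1) ||R_lambda u|| <= ||u|| / |Im lambda|, tested on a few sample vectors
samples = [[1 + 0j, 0j], [0j, 1 + 0j], [1 + 0j, 1j], [2 - 1j, 3 + 0j]]
bound_ok = all(norm(apply(R_lam, u)) <= norm(u) / abs(lam.imag) + 1e-12
               for u in samples)

# (2) (R_lambda)^* = R_{conjugate(lambda)}
adjoint_err = max_diff(adjoint(R_lam), resolvent(T, lam.conjugate()))

# (3) resolvent relation: R_lambda - R_mu = (lambda - mu) R_lambda R_mu
prod = matmul(R_lam, R_mu)
lhs = [[R_lam[i][j] - R_mu[i][j] for j in range(2)] for i in range(2)]
rhs = [[(lam - mu) * prod[i][j] for j in range(2)] for i in range(2)]
relation_err = max_diff(lhs, rhs)
```

The sample vectors only probe the norm bound; they do not compute the operator norm itself.<br />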

Theorem 5.3. The resolvent set $\rho(T)$ is open, and the function<br />
$\lambda \mapsto R_\lambda$ is analytic in the uniform operator topology as a $B(H)$-valued<br />
function. This means (by definition) that $R_\lambda$ can be expanded in a<br />
power series with respect to $\lambda$ around any point in $\rho(T)$ and that the<br />
series converges in operator norm in a neighborhood of the point. In<br />
fact, if $\mu \in \rho(T)$, then $\lambda \in \rho(T)$ for $|\lambda - \mu| < 1/\|R_\mu\|$ and<br />
\[ R_\lambda = \sum_{k=0}^{\infty} (\lambda - \mu)^k R_\mu^{k+1} \quad \text{for } |\lambda - \mu| < 1/\|R_\mu\|. \]<br />
Finally, the function $\rho(T) \ni \lambda \mapsto \langle R_\lambda u, v\rangle$ is analytic for all $u, v \in H$,<br />
and for $u = v$ it maps the upper and lower half-planes into themselves.<br />

Proof. The series is norm convergent if $|\lambda - \mu| < 1/\|R_\mu\|$ since<br />
$\|(\lambda - \mu)^k R_\mu^{k+1}\| \le \|R_\mu\|(|\lambda - \mu|\|R_\mu\|)^k$, which is a term in a convergent<br />
geometric series. Writing $T - \lambda = T - \mu - (\lambda - \mu)$ and applying this<br />
to the series from the left and right, one immediately sees that the<br />
series represents the inverse of $T - \lambda$. We have verified the formula<br />
for $R_\lambda$ and it also follows that $\rho(T)$ is open. Now by Theorem 5.2<br />
we have<br />
\[ 2i \operatorname{Im}\langle R_\lambda u, u\rangle = \langle R_\lambda u, u\rangle - \langle u, R_\lambda u\rangle = \langle (R_\lambda - R_{\bar\lambda})u, u\rangle = 2i \operatorname{Im}\lambda\,\langle R_{\bar\lambda} R_\lambda u, u\rangle = 2i \operatorname{Im}\lambda\,\|R_\lambda u\|^2. \]<br />
It follows that $\operatorname{Im}\langle R_\lambda u, u\rangle$ has<br />
the same sign as $\operatorname{Im}\lambda$. The analyticity of $\langle R_\lambda u, v\rangle$ follows since we<br />
have a power series expansion of it around any point in $\rho(T)$, by the<br />
series for $R_\lambda$. Alternatively, from Theorem 5.2 (3) it easily follows that<br />
$\frac{d}{d\lambda}\langle R_\lambda u, v\rangle = \langle R_\lambda^2 u, v\rangle$ (Exercise 5.1). □<br />
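As a hedged numerical illustration of Theorem 5.3, the sketch below sums the power series for $R_\lambda$ around a point $\mu$ and compares it with the directly computed resolvent; it also tests the identity $\operatorname{Im}\langle R_\lambda u, u\rangle = \operatorname{Im}\lambda\,\|R_\lambda u\|^2$ from the proof. The matrix `T`, the points `mu` and `lam`, the vector `u` and the truncation at 40 terms are all illustrative assumptions. Since $\|R_\mu\| \le 1/|\operatorname{Im}\mu|$, the series is guaranteed to converge when $|\lambda - \mu| < |\operatorname{Im}\mu|$.<br />

```python
# Illustrative assumption: T is a 2x2 Hermitian matrix acting on C^2.

def resolvent(T, lam):
    """Return R_lambda = (T - lambda)^(-1) for a 2x2 matrix T."""
    a, b = T[0][0] - lam, T[0][1]
    c, d = T[1][0], T[1][1] - lam
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def max_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(2) for j in range(2))

T = [[2 + 0j, 1 - 1j], [1 + 1j, -1 + 0j]]
mu = 1j
lam = 0.3 + 0.8j          # |lam - mu| is about 0.36, below |Im mu| = 1
R_mu = resolvent(T, mu)

# Partial sum of  sum_k (lam - mu)^k R_mu^(k+1)
partial = [[0j, 0j], [0j, 0j]]
power = R_mu              # holds R_mu^(k+1)
coeff = 1 + 0j            # holds (lam - mu)^k
for k in range(40):
    partial = [[partial[i][j] + coeff * power[i][j] for j in range(2)]
               for i in range(2)]
    power = matmul(power, R_mu)
    coeff *= lam - mu
series_err = max_diff(partial, resolvent(T, lam))

# Im <R_lam u, u> = Im(lam) * ||R_lam u||^2, with <x, y> linear in x
u = [1 + 0j, 2 - 1j]
R_lam = resolvent(T, lam)
Ru = [R_lam[0][0] * u[0] + R_lam[0][1] * u[1],
      R_lam[1][0] * u[0] + R_lam[1][1] * u[1]]
inner = Ru[0] * u[0].conjugate() + Ru[1] * u[1].conjugate()
norm_sq = abs(Ru[0]) ** 2 + abs(Ru[1]) ** 2
sign_identity_err = abs(inner.imag - lam.imag * norm_sq)
```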



Analytic functions that map the upper and lower halfplanes into<br />
themselves have particularly nice properties. Our proof of the general<br />
spectral theorem will be based on the fact that $\langle R_\lambda u, u\rangle$ is such a<br />
function, so we will make a detailed study of them in the next chapter.<br />

That $\rho(T)$ is open means of course that the spectrum is always<br />
a closed subset of $\mathbb{R}$. It is customary to divide the spectrum into (at<br />
least) two disjoint subsets, the point spectrum $\sigma_p(T)$ and the continuous<br />
spectrum $\sigma_c(T)$, defined as follows.<br />
\[ \sigma_p(T) = \{\lambda \in \mathbb{C} \mid T - \lambda \text{ is not one-to-one}\}, \]<br />
\[ \sigma_c(T) = \sigma(T) \setminus \sigma_p(T). \]<br />
This means that the point spectrum consists of the eigen-values of $T$,<br />
and the continuous spectrum of those $\lambda$ for which $S_\lambda$ is dense in $H$<br />
but not closed. This follows since $(T - \lambda)^{-1}$ is automatically bounded<br />
if $S_\lambda = H$, by the closed graph theorem (Exercise 5.2). For nonselfadjoint<br />
operators there is a further possibility; one may have $S_\lambda$<br />
non-dense even if $\lambda$ is not an eigenvalue. Such values of $\lambda$ constitute<br />
the residual spectrum which by Lemma 5.1 is empty for selfadjoint<br />
operators.<br />

An eigenvalue for a selfadjoint operator is said to have finite multiplicity<br />
if the eigenspace is finite-dimensional. Removing from the<br />
spectrum all isolated points which are eigenvalues of finite multiplicity<br />
leaves one with the essential spectrum. The name comes from the<br />
fact that the essential spectrum is quite stable under perturbations<br />
(changes) of the operator $T$, but we will not discuss such matters here.<br />

Exercises for Chapter 5<br />

Exercise 5.1. Suppose that $R_\lambda$ is the resolvent of a self-adjoint<br />
operator $T$ in a Hilbert space $H$. Show directly from Theorem 5.2 (3) that<br />
if $u, v \in H$, then $\lambda \mapsto \langle R_\lambda u, v\rangle$ is analytic (has a complex derivative)<br />
for $\lambda \in \rho(T)$, and find an expression for the derivative. Also show that<br />
if $u \in H$, then $\lambda \mapsto \langle R_\lambda u, u\rangle$ is increasing in every point of $\rho(T) \cap \mathbb{R}$.<br />

Exercise 5.2. Show that if $T$ is a closed operator with $S_\lambda = H$<br />
and $\lambda \notin \sigma_p(T)$, then $\lambda \in \rho(T)$.<br />
Hint: The closed graph theorem!<br />

Exercise 5.3. Show that if $T$ is a self-adjoint operator, then $U =<br />
(T + i)(T - i)^{-1} = I + 2iR_i$ is unitary. Conversely, if $U$ is unitary and 1<br />
is not an eigen-value, then $T = i(U + I)(U - I)^{-1}$ is selfadjoint. What<br />
can one do if 1 is an eigen-value? This transform, reminiscent of a<br />
Möbius transform, is called the Cayley transform and was the basis for<br />
von Neumann’s proof of the spectral theorem for unbounded operators.<br />


CHAPTER 6<br />

Nevanl<strong>in</strong>na functions<br />

Our proof of the spectral theorem is based on the following representation<br />
theorem.<br />

Theorem 6.1. Suppose $F$ is analytic in $\mathbb{C} \setminus \mathbb{R}$, $F(\bar\lambda) = \overline{F(\lambda)}$,<br />
and $F$ maps each of the upper and lower half-planes into themselves.<br />
Then there exists a unique, left-continuous, increasing function $\rho$ with<br />
$\rho(0) = 0$ and $\int_{-\infty}^{\infty} \frac{d\rho(t)}{1+t^2} < \infty$, and unique real constants $\alpha$ and $\beta \ge 0$,<br />
such that<br />
\[ F(\lambda) = \alpha + \beta\lambda + \int_{-\infty}^{\infty} \Big(\frac{1}{t - \lambda} - \frac{t}{1 + t^2}\Big)\, d\rho(t), \tag{6.1} \]<br />
where the integral is absolutely convergent.<br />

For the meaning of such an integral, see Appendix B. Functions<br />
$F$ with the properties in the theorem are usually called Nevanlinna,<br />
Herglotz or Pick functions. I am not sure who first proved the theorem,<br />
but results of this type play an important role in the classical book<br />
Eindeutige analytische Funktionen by Rolf Nevanlinna (1930). We will<br />
tackle the proof through a sequence of lemmas.<br />

Lemma 6.2 (H. A. Schwarz). Let $G$ be analytic in the unit disk,<br />
and put $u(R, \theta) = \operatorname{Re} G(Re^{i\theta})$. For $|z| < R < 1$ we then have:<br />
\[ G(z) = i \operatorname{Im} G(0) + \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{Re^{i\theta} + z}{Re^{i\theta} - z}\, u(R, \theta)\, d\theta. \tag{6.2} \]<br />

Proof. According to Poisson’s integral formula (see e.g. Chapter 6<br />
of Ahlfors: Complex Analysis (McGraw-Hill 1966)), we have<br />
\[ \operatorname{Re} G(z) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{R^2 - |z|^2}{|Re^{i\theta} - z|^2}\, u(R, \theta)\, d\theta. \]<br />
The integral here is easily seen to be the real part of the integral in<br />
(6.2). The latter is obviously analytic in $z$ for $|z| < R < 1$, so the two<br />
sides of (6.2) can only differ by an imaginary constant. However, for<br />
$z = 0$ the integral is real, so (6.2) follows. □<br />
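Formula (6.2) lends itself to a quick numerical check. The Python sketch below is an illustration only: the test function $G(z) = e^z + 2i$, the radius `R = 0.9`, the point `z` and the number of quadrature nodes are arbitrary choices, and the integral is approximated by the trapezoidal rule on a uniform grid in $\theta$.<br />

```python
import cmath
import math

def G(z):
    """Hypothetical test function, analytic in the whole plane; Im G(0) = 2."""
    return cmath.exp(z) + 2j

def schwarz_formula(z, R, n=400):
    """Approximate the right-hand side of (6.2) with n equally spaced nodes."""
    total = 0j
    for m in range(n):
        theta = -math.pi + 2 * math.pi * m / n
        w = R * cmath.exp(1j * theta)
        total += (w + z) / (w - z) * G(w).real   # kernel times u(R, theta)
    # (1 / 2 pi) * (2 pi / n) * sum  =  sum / n
    return 1j * G(0).imag + total / n

z, R = 0.3 + 0.2j, 0.9
schwarz_err = abs(schwarz_formula(z, R) - G(z))
```

Because the integrand is smooth and periodic in $\theta$, the trapezoidal rule converges very quickly as long as $|z|$ stays well below $R$.<br />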

The formula (6.2) is not applicable for $R = 1$, since we do not<br />
know whether $\operatorname{Re} G$ has reasonable boundary values on the unit circle.<br />
However, if one assumes that $\operatorname{Re} G \ge 0$ the boundary values exist at<br />
least in the sense of measure, and one has the following theorem.<br />

Theorem 6.3 (Riesz-Herglotz). Let $G$ be analytic in the unit disk<br />
with positive real part. Then there exists an increasing function $\sigma$ on<br />
$[-\pi, \pi]$ such that<br />
\[ G(z) = i \operatorname{Im} G(0) + \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{e^{i\theta} + z}{e^{i\theta} - z}\, d\sigma(\theta). \]<br />

With a suitable normalization the function $\sigma$ will also be unique,<br />
but we will not use this. To prove Theorem 6.3 we need some kind<br />
of compactness result, so that we can obtain the theorem as a limiting<br />
case of Lemma 6.2. What is needed is weak$^*$ compactness in the<br />
dual of the continuous functions on a compact interval, provided with<br />
the maximum norm. This is the classical Helly theorem. Since we assume<br />
minimal knowledge of functional analysis we will give the classical<br />
proof.<br />

Lemma 6.4 (Helly).<br />
(1) Suppose $\{\rho_j\}_1^\infty$ is a uniformly bounded sequence of increasing<br />
functions on an interval $I$ (uniformly bounded meaning that all the functions<br />
are bounded by a fixed constant). Then there is a subsequence<br />
converging pointwise to an increasing function.<br />
(2) Suppose $\{\rho_j\}_1^\infty$ is a uniformly bounded sequence of increasing<br />
functions on a compact interval $I$, converging pointwise to $\rho$.<br />
Then<br />
\[ \int_I f\, d\rho_j \to \int_I f\, d\rho \quad \text{as } j \to \infty, \tag{6.3} \]<br />
for any function $f$ continuous on $I$.<br />

Proof. Let $r_1, r_2, \dots$ be a dense sequence in $I$, for example an enumeration<br />
of the rational numbers in $I$. By Bolzano-Weierstrass’ theorem<br />
we may choose a subsequence $\{\rho_{1j}\}_1^\infty$ of $\{\rho_j\}_1^\infty$ so that $\rho_{1j}(r_1)$ converges.<br />
Similarly, we may choose a subsequence $\{\rho_{2j}\}_1^\infty$ of $\{\rho_{1j}\}_1^\infty$ such<br />
that $\rho_{2j}(r_2)$ converges; as a subsequence of $\rho_{1j}(r_1)$ the sequence $\rho_{2j}(r_1)$<br />
still converges. Continuing in this fashion, we obtain a sequence of sequences<br />
$\{\rho_{kj}\}_{j=1}^\infty$, $k = 1, 2, \dots$ such that each sequence is a subsequence<br />
of those coming before it, and such that $\rho(r_n) = \lim_{j\to\infty} \rho_{kj}(r_n)$ exists<br />
for $n \le k$. Thus $\rho_{jj}(r_n) \to \rho(r_n)$ as $j \to \infty$ for every $n$, since $\rho_{jj}(r_n)$ is<br />
a subsequence of $\rho_{nj}(r_n)$ from $j = n$ on. Clearly $\rho$ is increasing, so if<br />
$x \in I$ but $x \ne r_n$ for all $n$, we may choose an increasing subsequence $r_{j_k}$,<br />
$k = 1, 2, \dots$, converging to $x$, and define $\rho(x) = \lim_{k\to\infty} \rho(r_{j_k})$.<br />

Suppose $x$ is a point of continuity of $\rho$. If $r_k < x < r_n$ we get<br />
$\rho_{jj}(r_k) - \rho(r_n) \le \rho_{jj}(x) - \rho(x) \le \rho_{jj}(r_n) - \rho(r_k)$. Given $\varepsilon > 0$ we may<br />
choose $k$ and $n$ such that $\rho(r_n) - \rho(r_k) < \varepsilon$. We then obtain<br />
\[ -\varepsilon \le \liminf_{j\to\infty}\, (\rho_{jj}(x) - \rho(x)) \le \limsup_{j\to\infty}\, (\rho_{jj}(x) - \rho(x)) \le \varepsilon. \]<br />
Hence $\{\rho_{jj}\}_1^\infty$ converges pointwise to $\rho$, except possibly in points of<br />
discontinuity of $\rho$. But there are at most countably many such discontinuities,<br />
$\rho$ being increasing. Hence repeating the trick of extracting<br />
subsequences, and then using the ‘diagonal’ sequence, we get a subsequence<br />
of the original sequence which converges everywhere in $I$. We<br />
now obtain (1).<br />

If $f$ is the characteristic function of a compact interval whose endpoints<br />
are points of continuity for $\rho$ and all $\rho_j$ it is obvious that (6.3)<br />
holds. It follows that (6.3) holds if $f$ is a stepfunction with all discontinuities<br />
at points where $\rho$ and all $\rho_j$ are continuous. If $f$ is continuous<br />
and $\varepsilon > 0$ we may, by uniform continuity, choose such a stepfunction<br />
$g$ so that $\sup_I |f - g| < \varepsilon$. If $C$ is a common bound for all $\rho_j$ we then<br />
obtain $|\int_I (f - g)\, d\rho| < 2C\varepsilon$ and similarly with $\rho$ replaced by $\rho_j$. It<br />
follows that $\limsup_{j\to\infty} |\int_I f\, d\rho_j - \int_I f\, d\rho| \le 4C\varepsilon$ and since $\varepsilon$ is arbitrary<br />
positive (2) follows. □<br />

Proof of Theorem 6.3. According to Lemma 6.2 we have, for<br />
$|z| < 1$,<br />
\[ G(Rz) = i \operatorname{Im} G(0) + \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{e^{i\theta} + z}{e^{i\theta} - z}\, d\sigma_R(\theta), \]<br />
where $\sigma_R(\theta) = \int_{-\pi}^{\theta} \operatorname{Re} G(Re^{i\varphi})\, d\varphi$. Hence $\sigma_R$ is increasing, $\ge 0$ and<br />
bounded from above by $\sigma_R(\pi)$. Now $\operatorname{Re} G$ is a harmonic function so it<br />
has the mean value property, which means that $\sigma_R(\pi) = 2\pi \operatorname{Re} G(0)$.<br />
This is independent of $R$, so by Helly’s theorem we may choose a sequence<br />
$R_j \uparrow 1$ such that $\sigma_{R_j}$ converges to an increasing function $\sigma$. Use<br />
of the second part of Helly’s theorem completes the proof. □<br />

To prove the uniqueness of the function $\rho$ of Theorem 6.1 we need<br />
the following simple, but important, lemma.<br />

Lemma 6.5 (Stieltjes’ inversion formula). Let $\rho$ be complex-valued<br />
of locally bounded variation, and such that $\int_{-\infty}^{\infty} \frac{d\rho(t)}{t^2+1}$ is absolutely convergent.<br />
Suppose $F(\lambda)$ is given by (6.1). Then if $y < x$ are points of<br />
continuity of $\rho$ we have<br />
\[ \rho(x) - \rho(y) = \lim_{\varepsilon \downarrow 0} \frac{1}{2\pi i} \int_y^x \big(F(s + i\varepsilon) - F(s - i\varepsilon)\big)\, ds = \lim_{\varepsilon \downarrow 0} \frac{1}{\pi} \int_y^x \int_{-\infty}^{\infty} \frac{\varepsilon\, d\rho(t)}{(t - s)^2 + \varepsilon^2}\, ds. \]<br />



Proof. By absolute convergence we may change the order of integration<br />
in the last integral. The inner integral is then easily calculated<br />
to be<br />
\[ \frac{1}{\pi}\big(\arctan((x - t)/\varepsilon) - \arctan((y - t)/\varepsilon)\big). \]<br />
This is bounded by 1, and also by a constant multiple of $1/t^2$ if $\varepsilon$ is<br />
bounded (verify this!). Furthermore it converges pointwise to 0 outside<br />
$[y, x]$, and to 1 in $(y, x)$ (and to $\frac{1}{2}$ for $t = x$ and $t = y$). The lemma<br />
follows by dominated convergence. □<br />
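The inversion formula can be seen at work on a measure consisting of finitely many point masses. In the following Python sketch (an illustration under assumptions not taken from the notes) $\rho$ jumps by `MASSES[k]` at `POINTS[k]`, and $F(\lambda) = \sum_k m_k/(t_k - \lambda)$, which is (6.1) with $\beta = 0$ and $\alpha = \sum_k m_k t_k/(1 + t_k^2)$. The closed form uses the arctan expression from the proof; a midpoint-rule quadrature of $F(s + i\varepsilon) - F(s - i\varepsilon)$ serves as a cross-check.<br />

```python
import math

# Hypothetical discrete measure: rho jumps by MASSES[k] at POINTS[k].
POINTS = [-1.0, 0.5, 2.0]
MASSES = [0.3, 0.5, 0.2]

def F(lam):
    """F(lambda) = sum_k m_k / (t_k - lambda) for the measure above."""
    return sum(m / (t - lam) for t, m in zip(POINTS, MASSES))

def inversion_closed_form(y, x, eps):
    """Sum over the masses of m_k * (1/pi) * (arctan((x-t_k)/eps) - arctan((y-t_k)/eps))."""
    return sum(m * (math.atan((x - t) / eps) - math.atan((y - t) / eps)) / math.pi
               for t, m in zip(POINTS, MASSES))

def inversion_quadrature(y, x, eps, n=20000):
    """Midpoint rule for (1 / (2 pi i)) * integral_y^x (F(s + i eps) - F(s - i eps)) ds."""
    h = (x - y) / n
    total = 0j
    for j in range(n):
        s = y + (j + 0.5) * h
        total += F(complex(s, eps)) - F(complex(s, -eps))
    return (total * h / (2j * math.pi)).real

y, x = -0.25, 1.25                       # only the mass 0.5 at t = 0.5 lies in (y, x)
coarse = inversion_closed_form(y, x, 0.05)
coarse_check = inversion_quadrature(y, x, 0.05)
fine = inversion_closed_form(y, x, 1e-6)  # should approach rho(x) - rho(y) = 0.5
```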

Proof of Theorem 6.1. The uniqueness of $\rho$ follows immediately<br />
on applying the Stieltjes inversion formula to the imaginary part<br />
of (6.1) for $\lambda = s + i\varepsilon$.<br />

We obtain (6.1) from the Riesz-Herglotz theorem by a change of<br />
variable. The mapping $z = \frac{1+i\lambda}{1-i\lambda}$ maps the upper half plane bijectively<br />
to the unit disk, so $G(z) = -iF(\lambda)$ is defined for $z$ in the unit disk<br />
and has positive real part. Applying Theorem 6.3 we obtain, after<br />
simplification,<br />
\[ F(\lambda) = \operatorname{Re} F(i) + \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{1 + \lambda \tan(\theta/2)}{\tan(\theta/2) - \lambda}\, d\sigma(\theta). \]<br />
Setting $t = \tan(\theta/2)$ maps the open interval $(-\pi, \pi)$ onto the real axis.<br />
For $\theta = \pm\pi$ the integrand equals $\lambda$, so any mass of $\sigma$ at $\pm\pi$ gives rise<br />
to a term $\beta\lambda$ with $\beta \ge 0$. After the change of variable we get<br />
\[ F(\lambda) = \alpha + \beta\lambda + \int_{-\infty}^{\infty} \frac{1 + t\lambda}{t - \lambda}\, d\tau(t), \]<br />
where we have set $\alpha = \operatorname{Re} F(i)$ and $\tau(t) = \sigma(\theta)/(2\pi)$. Since<br />
\[ \frac{1 + t\lambda}{t - \lambda} = \Big(\frac{1}{t - \lambda} - \frac{t}{1 + t^2}\Big)(1 + t^2) \]<br />
we now obtain (6.1) by setting $\rho(t) = \int_0^t (1 + s^2)\, d\tau(s)$.<br />

It remains to show the uniqueness of $\alpha$ and $\beta$. However, setting<br />
$\lambda = i$, it is clear that $\alpha = \operatorname{Re} F(i)$, and since we already know that $\rho$<br />
is unique, so is $\beta$. □<br />

Actually one can calculate $\beta$ directly from $F$ since by dominated<br />
convergence $\operatorname{Im} F(i\nu)/\nu \to \beta$ as $\nu \to \infty$. It is usual to refer to $\beta$ as the<br />
‘mass at infinity’, an expression explained by our proof. Note, however,<br />
that it is the mass of $\tau$ at infinity and not that of $\rho$!<br />
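The recovery of $\alpha$ and $\beta$ from $F$ can be illustrated on a small example. The Python sketch below assumes, purely for illustration, a measure $\rho$ made of three point masses together with arbitrarily chosen constants `ALPHA` and `BETA`; it evaluates the right-hand side of (6.1), reads off $\alpha = \operatorname{Re} F(i)$, and watches $\operatorname{Im} F(i\nu)/\nu$ approach $\beta$.<br />

```python
# Hypothetical data: rho jumps by MASSES[k] at POINTS[k]; ALPHA and BETA are arbitrary.
POINTS = [-1.0, 0.5, 2.0]
MASSES = [0.3, 0.5, 0.2]
ALPHA, BETA = 0.7, 1.5

def F(lam):
    """Right-hand side of (6.1) for the discrete measure above."""
    integral = sum(m * (1 / (t - lam) - t / (1 + t * t))
                   for t, m in zip(POINTS, MASSES))
    return ALPHA + BETA * lam + integral

# alpha = Re F(i): the two terms of the kernel have equal real parts at lambda = i
alpha_recovered = F(1j).real

# Im F(i nu) / nu = beta + sum_k m_k / (t_k^2 + nu^2), which tends to beta
beta_estimates = [F(1j * nu).imag / nu for nu in (1e1, 1e3, 1e6)]

# F maps the upper half-plane into itself (spot check on a few points)
upper_ok = all(F(complex(s, e)).imag > 0
               for s in (-3.0, 0.0, 0.5, 4.0) for e in (1e-3, 1.0, 50.0))
```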


CHAPTER 7<br />

The spectral theorem<br />

Theorem 7.1. (Spectral theorem) Suppose $T$ is selfadjoint. Then<br />
there exists a unique, increasing and left-continuous family $\{E_t\}_{t\in\mathbb{R}}$ of<br />
orthogonal projections with the following properties:<br />
• $E_t$ commutes with $T$, in the sense that $TE_t$ is the closure of<br />
$E_tT$.<br />
• $E_t \to 0$ as $t \to -\infty$ and $E_t \to I$ (= identity on $H$) as $t \to \infty$<br />
(strong convergence).<br />
• $T = \int_{-\infty}^{\infty} t\, dE_t$ in the following sense: $u \in D(T)$ if and only if<br />
$\int_{-\infty}^{\infty} t^2\, d\langle E_tu, u\rangle < \infty$, $\langle Tu, v\rangle = \int_{-\infty}^{\infty} t\, d\langle E_tu, v\rangle$ and $\|Tu\|^2 =<br />
\int_{-\infty}^{\infty} t^2\, d\langle E_tu, u\rangle$.<br />

The family $\{E_t\}_{t\in\mathbb{R}}$ of projections is called the resolution of the identity<br />
for $T$. The formula $T = \int_{-\infty}^{\infty} t\, dE_t$ can be made sense of directly by<br />
introducing Stieltjes integrals with respect to operator-valued increasing<br />
functions. This is a simple generalization of the scalar-valued case.<br />
Although we then, formally, get a slightly stronger statement, it does<br />
not appear to be any more useful than the statement above. We will<br />
therefore omit this.<br />
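In finite dimensions the spectral theorem reduces to diagonalization, which gives a concrete picture of the family $\{E_t\}$. The Python sketch below is an illustration under assumptions of its own: the symmetric matrix `T` on $\mathbb{R}^2$ with eigenvalues 1 and 3, the hand-computed eigenprojections in `SPECTRAL_DATA`, and the helper names are all hypothetical. It takes $E_t$ to be the sum of the eigenprojections with eigenvalue strictly below $t$ (so that $E_t$ is left-continuous) and checks $E_tE_s = E_{\min(s,t)}$, $T = \sum_k t_k P_k$, $\|Tu\|^2 = \int t^2\, d\langle E_tu, u\rangle$ and $\langle R_\lambda u, u\rangle = \int \frac{d\langle E_tu, u\rangle}{t - \lambda}$.<br />

```python
# Hypothetical example: T = [[2, 1], [1, 2]] has eigenvalues 1 and 3.
T = [[2.0, 1.0], [1.0, 2.0]]
SPECTRAL_DATA = [                       # (eigenvalue, orthogonal eigenprojection)
    (1.0, [[0.5, -0.5], [-0.5, 0.5]]),
    (3.0, [[0.5, 0.5], [0.5, 0.5]]),
]

def add(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def scale(c, A):
    return [[c * A[i][j] for j in range(2)] for i in range(2)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def apply(A, u):
    return [A[0][0] * u[0] + A[0][1] * u[1], A[1][0] * u[0] + A[1][1] * u[1]]

def inner(x, y):
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def max_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(2) for j in range(2))

def E(t):
    """E_t: sum of the eigenprojections whose eigenvalue is strictly below t."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    for tk, P in SPECTRAL_DATA:
        if tk < t:
            total = add(total, P)
    return total

grid = [0.0, 1.0, 2.0, 3.0, 4.0]
product_err = max(max_diff(matmul(E(t), E(s)), E(min(s, t)))
                  for s in grid for t in grid)

recon = [[0.0, 0.0], [0.0, 0.0]]
for tk, P in SPECTRAL_DATA:
    recon = add(recon, scale(tk, P))    # T = sum_k t_k P_k
recon_err = max_diff(recon, T)

u = [1.0, 2.0]
# Jumps of t -> <E_t u, u> at the eigenvalues
weights = [(tk, inner(apply(P, u), u).real) for tk, P in SPECTRAL_DATA]
Tu = apply(T, u)
norm_sq_Tu = inner(Tu, Tu).real
second_moment = sum(tk ** 2 * w for tk, w in weights)

lam = 0.5 + 2.0j
a, b, c, d = T[0][0] - lam, T[0][1], T[1][0], T[1][1] - lam
det = a * d - b * c
R = [[d / det, -b / det], [-c / det, a / det]]   # R_lambda = (T - lambda)^(-1)
resolvent_err = abs(inner(apply(R, u), u) - sum(w / (tk - lam) for tk, w in weights))
```

With this convention $E_1 = 0$ and $E_t = I$ for $t > 3$, so the two eigenvalues are exactly the points where $E_t$ jumps.<br />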

For the proof we need two lemmas, the first of which actually contains<br />
the main step of the proof.<br />

Lemma 7.2. For $f, g \in H$ there is a unique left-continuous function<br />
$\sigma_{f,g}$ of bounded variation, with $\sigma_{f,g}(-\infty) = 0$, and the following<br />
properties:<br />
• $\sigma_{f,g}$ is Hermitian in $f, g$ (i.e., $\sigma_{f,g} = \overline{\sigma_{g,f}}$ and is linear in $f$),<br />
and $\sigma_{f,f}$ is increasing.<br />
• $\int d\sigma_{f,g}$ is a bounded sesquilinear form on $H$. In fact, we even<br />
have $\int_{-\infty}^{\infty} |d\sigma_{f,g}| \le \|f\|\|g\|$.<br />
• $\langle R_\lambda f, g\rangle = \int_{-\infty}^{\infty} \frac{d\sigma_{f,g}(t)}{t-\lambda}$.<br />

Proof. The uniqueness of $\sigma_{f,g}$ follows from the Stieltjes inversion<br />
formula, applied to $F(\lambda) = \langle R_\lambda f, g\rangle$. Since $\langle R_\lambda f, g\rangle$ is sesqui-linear in<br />
$f, g$ and $R_\lambda^* = R_{\bar\lambda}$, it then follows that $\sigma_{f,g}$ is Hermitian if it exists.<br />
However, by Theorem 5.3 the function $\lambda \mapsto \langle R_\lambda f, f\rangle$ is a Nevanlinna<br />
function of $\lambda$ for any $f$, so we have<br />
\[ \langle R_\lambda f, f\rangle = \alpha + \beta\lambda + \int_{-\infty}^{\infty} \Big(\frac{1}{t - \lambda} - \frac{t}{1 + t^2}\Big)\, d\sigma_{f,f}(t), \tag{7.1} \]<br />
where $\sigma_{f,f}$ is increasing and $\alpha$, $\beta$ may depend on $f$. Since $\|R_\lambda\| \le \frac{1}{|\operatorname{Im}\lambda|}$,<br />
we find that $\|f\|^2$ is an upper bound for $\nu\langle R_{i\nu}f, f\rangle$ for $\nu \in \mathbb{R}$, the<br />
imaginary part of which is $\beta\nu^2 + \int_{-\infty}^{\infty} \frac{\nu^2\, d\sigma_{f,f}(t)}{t^2+\nu^2}$. Hence $\beta = 0$, and by<br />
Fatou’s lemma we get, as $\nu \to \infty$, that $\int_{-\infty}^{\infty} d\sigma_{f,f} \le \|f\|^2$. A more<br />
elementary argument is the following: For $\nu, \varepsilon > 0$ we have<br />
\[ \frac{1}{1 + \varepsilon^2} \int_{-\nu\varepsilon}^{\nu\varepsilon} d\sigma_{f,f} \le \int_{-\infty}^{\infty} \frac{\nu^2}{t^2 + \nu^2}\, d\sigma_{f,f}(t) \le \|f\|^2, \]<br />
since $\frac{1}{1+\varepsilon^2} \le \frac{\nu^2}{\nu^2+t^2}$ for $|t| \le \nu\varepsilon$, so letting $\nu \to \infty$, and then $\varepsilon \to 0$, we<br />
obtain the same bound. We may now assume $\sigma_{f,f}$ to be normalized so<br />
as to be left-continuous and $\sigma_{f,f}(-\infty) = 0$. Clearly $\int_{-\infty}^{\infty} \frac{t}{1+t^2}\, d\sigma_{f,f}(t)$<br />
is absolutely convergent, so this part of the integral in (7.1) may be<br />
incorporated in the constant $\alpha$. So, with absolute convergence, we have<br />
$\langle R_\lambda f, f\rangle = \alpha' + \int_{-\infty}^{\infty} \frac{d\sigma_{f,f}(t)}{t-\lambda}$. However, for $\lambda \to \infty$ along the imaginary<br />
axis, both the left hand side and the integral $\to 0$ (Exercise 7.1), so we<br />
must have $\alpha' = 0$. The proof is now finished in the case $f = g$.<br />

By the polarization identity (Exercise 7.2)<br />
\[ \langle R_\lambda f, g\rangle = \frac{1}{4} \sum_{k=0}^{3} i^k \langle R_\lambda(f + i^k g), f + i^k g\rangle, \]<br />
so we obtain $\langle R_\lambda f, g\rangle = \int_{-\infty}^{\infty} \frac{d\sigma_{f,g}(t)}{t-\lambda}$ by setting<br />
\[ \sigma_{f,g} = \frac{1}{4} \sum_{k=0}^{3} i^k \sigma_{f+i^k g,\, f+i^k g}. \]<br />
The function $\sigma_{f,g}$ has the correct normalization, so only the bound on<br />
the total variation remains to be proved. But if $\Delta$ is an interval, then<br />
$\int_\Delta d\sigma_{f,g}$ is a semi-scalar product on $H$, so Cauchy-Schwarz’ inequality<br />
$\big|\int_\Delta d\sigma_{f,g}\big|^2 \le \int_\Delta d\sigma_{f,f} \int_\Delta d\sigma_{g,g}$ is valid. For $\Delta = \mathbb{R}$ this shows that<br />
$\int_{\mathbb{R}} d\sigma_{f,g}$ is bounded by $\|f\|\|g\|$. If $\{\Delta_j\}_1^\infty$ is a partition of $\mathbb{R}$ into disjoint<br />
intervals we obtain<br />
\[ \sum_j \Big|\int_{\Delta_j} d\sigma_{f,g}\Big| \le \sum_j \Big(\int_{\Delta_j} d\sigma_{f,f}\Big)^{1/2} \Big(\int_{\Delta_j} d\sigma_{g,g}\Big)^{1/2} \le \Big(\sum_j \int_{\Delta_j} d\sigma_{f,f}\Big)^{1/2} \Big(\sum_j \int_{\Delta_j} d\sigma_{g,g}\Big)^{1/2} \le \|f\|\|g\|, \]<br />
where the second inequality is Cauchy-Schwarz’ inequality in $\ell^2$. The<br />
proof is complete. □<br />

Lemma 7.3. $\int_{-\infty}^{\infty} d\sigma_{f,g} = \langle f, g\rangle$ for any $f, g \in H$.<br />

Proof. Assume first that $f \in D(T)$ so that $f = R_\lambda(v - \lambda f)$, where<br />
$v = Tf$. Thus $\langle f, g\rangle = -\lambda\langle R_\lambda f, g\rangle + \langle R_\lambda v, g\rangle$. Since $-i\nu \int_{-\infty}^{\infty} \frac{d\sigma_{f,g}(t)}{t-i\nu} \to<br />
\int_{-\infty}^{\infty} d\sigma_{f,g}$ as $\nu \to \infty$ by bounded convergence (Exercise 7.1), the lemma<br />
is true for $f \in D(T)$, which is dense in $H$. But $\int_{-\infty}^{\infty} d\sigma_{f,g}$ is a bounded<br />
Hermitian form on $H$ since $|\int_{-\infty}^{\infty} d\sigma_{f,g}| \le \int_{-\infty}^{\infty} |d\sigma_{f,g}| \le \|f\|\|g\|$ by<br />
Lemma 7.2, so the general case follows by continuity. □<br />

Proof of the spectral theorem. We first show the uniqueness<br />
of the resolution of identity. So, assume a resolution of the identity<br />
with all the properties claimed exists. Then $E_tE_s = E_{\min(s,t)}$, so if<br />
$w \in D(T)$ and $s$ is fixed we obtain<br />
\[ \int_{-\infty}^{s} d\langle E_tTw, v\rangle = \langle E_sTw, v\rangle = \langle TE_sw, v\rangle = \int_{-\infty}^{\infty} t\, d\langle E_tE_sw, v\rangle = \int_{-\infty}^{s} t\, d\langle E_tw, v\rangle. \]<br />
Thus $d\langle E_tTw, v\rangle = t\, d\langle E_tw, v\rangle$ as measures. Now suppose $w = R_\lambda u$.<br />
We then get<br />
\[ \frac{d\langle E_tu, v\rangle}{t - \lambda} = \frac{d\langle E_t(T - \lambda)R_\lambda u, v\rangle}{t - \lambda} = d\langle E_tR_\lambda u, v\rangle. \]<br />
It follows that $\langle R_\lambda u, v\rangle = \int_{-\infty}^{\infty} \frac{d\langle E_tu, v\rangle}{t-\lambda}$. The uniqueness of the spectral<br />
projectors therefore follows from the Stieltjes inversion formula.<br />

The linear form $f \mapsto \sigma_{f,g}(t)$ is bounded for each $g \in H$ (by $\|g\|$,<br />
according to Lemma 7.2). By Riesz’ representation theorem it is therefore<br />
of the form $\langle f, g_t\rangle$, where $\|g_t\| \le \|g\|$. It is obvious that $g_t$ depends<br />
linearly on $g$ so $g_t = E_tg$ where $E_t$ is a linear operator with norm $\le 1$,<br />
which is selfadjoint since $\sigma_{f,g}$ is Hermitian. Furthermore $E_tf \rightharpoonup 0$ as<br />
$t \to -\infty$ by the normalization of $\sigma_{f,g}$ and $E_tf \rightharpoonup f$ as $t \to \infty$ (weak<br />
convergence) by Lemma 7.3.<br />

Suppose we knew that $E_t$ is a projection. Since $E_t$ is selfadjoint it is<br />
then an orthogonal projection. It follows that $\|E_tf\|^2 = \langle E_tf, f\rangle \to 0$<br />
as $t \to -\infty$ and similarly $\|f - E_tf\|^2 = \langle f - E_tf, f\rangle \to 0$ as $t \to \infty$.<br />
Hence we only need to show that $E_t$ is a projection increasing with $t$,<br />
and the statements about $T$.<br />


The resolvent relation $R_\lambda - R_\mu = (\lambda - \mu)R_\lambda R_\mu$ may be expressed<br />
as<br />
\[ \int_{-\infty}^{\infty} \frac{1}{t - \lambda}\, \frac{d\langle E_tf, g\rangle}{t - \mu} = \int_{-\infty}^{\infty} \frac{d\langle E_tR_\mu f, g\rangle}{t - \lambda} \]<br />
(check this!), so the uniqueness of the Stieltjes transform shows that<br />
$\langle E_tR_\mu f, g\rangle = \int_{-\infty}^{t} \frac{d\langle E_sf, g\rangle}{s-\mu}$. But<br />
\[ \langle E_tR_\mu f, g\rangle = \langle R_\mu f, E_tg\rangle = \int_{-\infty}^{\infty} \frac{d\langle E_sf, E_tg\rangle}{s - \mu}. \]<br />
So, again by uniqueness, $\langle E_sf, E_tg\rangle = \langle E_uf, g\rangle$ where $u = \min(s, t)$,<br />
i.e., $E_tE_s = E_{\min(s,t)}$. For $s = t$ this shows that $E_t$ is a projection, and<br />
if $t > s$ we get $0 \le (E_t - E_s)^*(E_t - E_s) = (E_t - E_s)^2 = E_t - E_s$ so that<br />
$\{E_t\}_{t\in\mathbb{R}}$ is an increasing family of orthogonal projections.<br />

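Now suppose $f \in D(T)$ and $v = Tf$. For any non-real $\lambda$ we then<br />
have $f = R_\lambda(v - \lambda f)$ or $R_\lambda v = f + \lambda R_\lambda f$. Since $1 + \lambda/(t - \lambda) = t/(t - \lambda)$<br />
we therefore obtain<br />
\[ \int_{-\infty}^{\infty} \frac{d\sigma_{v,g}(t)}{t - \lambda} = \int_{-\infty}^{\infty} \frac{t\, d\sigma_{f,g}(t)}{t - \lambda} \]<br />
so that $\sigma_{v,g}(t) = \int_{-\infty}^{t} s\, d\sigma_{f,g}(s)$. In particular, $\langle Tf, g\rangle = \int_{-\infty}^{\infty} t\, d\langle E_tf, g\rangle$.<br />
We also get $\sigma_{v,v}(t) = \int_{-\infty}^{t} s\, d\sigma_{f,v}(s) = \int_{-\infty}^{t} s^2\, d\sigma_{f,f}(s)$, so that $\|Tf\|^2 =<br />
\int_{-\infty}^{\infty} s^2\, d\langle E_sf, f\rangle$.<br />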

Next we prove that any u ∈ H for which ∞<br />

−∞s2 d〈Esu, u〉 < ∞ is<br />

<strong>in</strong> D(T ). To see this, note that<br />

<br />

<br />

<br />

<br />

<br />

|d〈Esu, v〉| ≤ d〈Esu, u〉 d〈Esv, v〉<br />

∆<br />

∆<br />

if ∆ is a f<strong>in</strong>ite union of <strong>in</strong>tervals. This follows just as <strong>in</strong> the proof of<br />

Lemma 7.2. Now let ∆k = {s | 2 k−1 < |s| ≤ 2 k }, k ∈ Z. Then<br />

<br />

<br />

<br />

<br />

<br />

<br />

<br />

s d〈Esu, v〉 <br />

≤ 2k |d〈Esu, v〉|<br />

<br />

<br />

∆k<br />

∆k<br />

≤ 2 k<br />

<br />

<br />

<br />

<br />

<br />

<br />

<br />

d〈Esu, u〉 d〈Esv, v〉 ≤ 2<br />

∆k<br />

∆k<br />

∆k<br />

∆<br />

s2 <br />

d〈Esu, u〉<br />

∆k<br />

d〈Esv, v〉.<br />

If now ∞<br />

−∞s2 d〈Esu, u〉 < ∞ we obta<strong>in</strong> from this by add<strong>in</strong>g over all k<br />

<br />

<br />

and us<strong>in</strong>g Cauchy-Schwarz’ <strong>in</strong>equality for sums that ∞<br />

−∞s d〈Esu,<br />

<br />

<br />

v〉 ≤<br />

∞<br />

2 −∞s2d〈Esu, u〉v so that the anti-l<strong>in</strong>ear form v ↦→ ∞<br />

−∞s d〈Esu, v〉


7. THE SPECTRAL THEOREM 43<br />

is bounded on $H$. It is therefore, by Riesz' representation theorem, a scalar product $\langle u_1, v\rangle$. It is obvious that $u_1$ depends linearly on $u$, i.e., there is a linear operator $S$ so that $u_1 = Su$. It is clear that $S$ is symmetric and an extension of $T$, so we have $T \subset S \subset S^* \subset T^* = T$. Hence $S = T$ so the claims about $D(T)$ are verified.
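The summation over $k$ in the argument above can be spelled out as follows. This is a sketch; the abbreviations $a_k$ and $b_k$ are introduced here only for illustration and do not appear in the notes.

```latex
\[
a_k = \Bigl(\int_{\Delta_k} s^2\,d\langle E_su,u\rangle\Bigr)^{1/2},\qquad
b_k = \Bigl(\int_{\Delta_k} d\langle E_sv,v\rangle\Bigr)^{1/2},\qquad k\in\mathbb{Z}.
\]
\begin{align*}
\Bigl|\int_{-\infty}^{\infty} s\,d\langle E_su,v\rangle\Bigr|
  &\le \sum_{k\in\mathbb{Z}}\Bigl|\int_{\Delta_k} s\,d\langle E_su,v\rangle\Bigr|
   \le 2\sum_{k\in\mathbb{Z}} a_k b_k \\
  &\le 2\Bigl(\sum_{k\in\mathbb{Z}} a_k^2\Bigr)^{1/2}\Bigl(\sum_{k\in\mathbb{Z}} b_k^2\Bigr)^{1/2}
   \le 2\Bigl(\int_{-\infty}^{\infty} s^2\,d\langle E_su,u\rangle\Bigr)^{1/2}\|v\|.
\end{align*}
```

The last step uses that the sets $\Delta_k$ are disjoint with union $\mathbb{R}\setminus\{0\}$, so that $\sum a_k^2 = \int_{-\infty}^{\infty} s^2\,d\langle E_su,u\rangle$ (the integrand vanishes at $s = 0$) and $\sum b_k^2 \le \int_{-\infty}^{\infty} d\langle E_sv,v\rangle = \|v\|^2$.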

Finally, we must prove that $TE_t$ is the closure of $E_tT$. From what we just proved it follows that if $u \in D(T)$ then $E_tu \in D(T)$. For $v \in H$ we then have $\langle TE_tu, v\rangle = \int_{-\infty}^{\infty} s\,d\langle E_sE_tu, v\rangle = \int_{-\infty}^{\infty} s\,d\langle E_su, E_tv\rangle = \langle Tu, E_tv\rangle = \langle E_tTu, v\rangle$ so $TE_t$ is an extension of $E_tT$. Since $E_t$ is bounded and $T$ closed it follows that $TE_t$ is closed (Exercise 7.3). Now suppose $E_tu \in D(T)$. We must find $u_j \in D(T)$ such that $u_j \to u$ and $E_tTu_j \to TE_tu$. Since $D(T)$ is dense in $H$ we can find $v_j \in D(T)$ so that $v_j \to u$. Now set $u_j = v_j - E_tv_j + E_tu$. Clearly $u_j \in D(T)$, $u_j \to u$ and $E_tTu_j = TE_tu_j = TE_tu$ and the proof is complete. $\square$

The operator $E_t$ is called the spectral projector for the interval $(-\infty, t)$. The spectral projector for the interval $(a, b)$ is $E_{(a,b)} = E_b - E_{a+}$ where $E_{a+}$ is the right hand limit at $a$ of $E_t$. Similarly $E_{[a,b]} = E_{b+} - E_a$, etc. For a general Borel set $M \subset \mathbb{R}$ the spectral projector is defined to be $E_M = \int_M dE_t$. Show that this is actually an orthogonal projection for any Borel set $M$!

Obviously the various parts of the spectrum (point spectrum etc.) are determined by the behavior of the spectral projectors. We end this chapter with a theorem which makes this connection explicit.

Theorem 7.4.

(1) $\lambda \in \sigma_p(T)$ if and only if $E_t$ jumps at $t = \lambda$, i.e., $E_{\{\lambda\}} = E_{[\lambda,\lambda]} \ne 0$.

(2) $\lambda \in \rho(T) \cap \mathbb{R}$ if and only if $E_t$ is constant in a neighborhood of $t = \lambda$.

It follows that the continuous spectrum consists of those points of increase of $E_t$ which are not jumps.$^1$

$^1$ A point of increase for $E_t$ is a point $\lambda$ such that $E_\Delta \ne 0$ for every open $\Delta \ni \lambda$.

Proof. If $E_t$ jumps at $\lambda$ we can find a unit vector $e$ in the range of $E_{\{\lambda\}}$, i.e., such that $E_{\{\lambda\}}e = e$. It follows immediately from the spectral theorem that $e \in D(T)$ and $(T - \lambda)e = 0$. Conversely, suppose that $e$ is a unit vector with $Te = \lambda e$. Then
\[
0 = \|(T-\lambda)e\|^2 = \int_{-\infty}^{\infty} (t-\lambda)^2\,d\langle E_te, e\rangle,
\]
so that the support of the non-zero, non-negative measure $d\langle E_te, e\rangle$ is contained in $\{\lambda\}$. Hence $E_t$ jumps at $\lambda$, and the proof of (1) is complete.

Now assume $E_t$ is constant in $(\lambda - \varepsilon, \lambda + \varepsilon)$. Then $\lambda$ is not an eigenvalue of $T$ so $S_\lambda$ is dense in $H$. Thus the inverse of $T - \lambda$ exists as a closed, densely defined operator. We need only show that this inverse is bounded, to see that its domain is all of $H$ so that $\lambda \in \rho(T)$. But $\|(T-\lambda)u\|^2 = \int_{-\infty}^{\infty} (t-\lambda)^2\,d\langle E_tu, u\rangle \ge \varepsilon^2 \int_{-\infty}^{\infty} d\langle E_tu, u\rangle = \varepsilon^2\|u\|^2$ so the inverse of $T - \lambda$ is bounded by $1/\varepsilon$. Conversely, assume that $E_t$ is not constant near $\lambda$. Then there are arbitrarily short intervals $\Delta$ containing $\lambda$ such that $E_\Delta \ne 0$, i.e., there are non-zero vectors $u$ such that $E_\Delta u = u$. But then $\|(T-\lambda)u\| \le |\Delta|\,\|u\|$, where $|\Delta|$ is the length of $\Delta$. Hence we can find a sequence of unit vectors $u_j$, $j = 1, 2, \dots$ for which $(T-\lambda)u_j \to 0$ (a 'singular sequence'). Consequently either $T - \lambda$ is not injective, or else the inverse is unbounded so $\lambda \notin \rho(T)$. $\square$
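The estimate $\|(T-\lambda)u\| \le |\Delta|\,\|u\|$ used for the singular sequence can be sketched as follows. The sketch assumes that $u = E_\Delta u$ for a bounded interval $\Delta \ni \lambda$, so that the measure $d\langle E_tu,u\rangle$ vanishes outside $\Delta$ (and in particular $u \in D(T)$), and that $|t-\lambda| \le |\Delta|$ for all $t \in \Delta$.

```latex
\begin{align*}
\|(T-\lambda)u\|^2
  &= \int_{-\infty}^{\infty}(t-\lambda)^2\,d\langle E_tu,u\rangle
   = \int_{\Delta}(t-\lambda)^2\,d\langle E_tu,u\rangle \\
  &\le |\Delta|^2\int_{\Delta} d\langle E_tu,u\rangle
   = |\Delta|^2\,\|u\|^2 .
\end{align*}
```

Choosing intervals $\Delta_j \ni \lambda$ with $|\Delta_j| \to 0$ and unit vectors $u_j$ with $E_{\Delta_j}u_j = u_j$ then gives $\|(T-\lambda)u_j\| \le |\Delta_j| \to 0$.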

Exercises for Chapter 7<br />

Exercise 7.1. Suppose $\sigma$ is increasing and $\int_{-\infty}^{\infty} d\sigma < \infty$. Show that $-\lambda \int_{-\infty}^{\infty} \frac{d\sigma(t)}{t-\lambda} \to \int_{-\infty}^{\infty} d\sigma$ as $\lambda \to \infty$ along any non-real ray originating in the origin. In particular, $\int_{-\infty}^{\infty} \frac{d\sigma(t)}{t-\lambda} \to 0$.

Exercise 7.2. Suppose $B(\cdot, \cdot)$ is a sesqui-linear form on a complex linear space. Show the polarization identity
\[
B(u, v) = \frac{1}{4} \sum_{k=0}^{3} i^k B(u + i^k v, u + i^k v).
\]

Exercise 7.3. Show that if $T$ is a closed operator on $H$ and $S$ is bounded and everywhere defined, then $TS$, but not necessarily $ST$, is closed.

Exercise 7.4. Show that if $T$ is selfadjoint and $f$ is a continuous function defined on $\sigma(T)$, then $f(T) = \int_{-\infty}^{\infty} f(t)\,dE_t$ defines a densely defined operator, which is bounded if $f$ is and selfadjoint if $f$ is real-valued. Also show that $(f(T))^* = \overline{f}(T)$, that $(f(T))^*$ has the same domain as $f(T)$ and commutes with it in a reasonable sense, and that $(fg)(T) = f(T)g(T)$. This is the functional calculus for a selfadjoint operator, and also makes sense for arbitrary Borel functions. The integral is made sense of in the same way as in the statement of the spectral theorem.

Exercise 7.5. Let $T$ be selfadjoint and put $H(t) = e^{-itT}$, $t \in \mathbb{R}$, the exponential being defined as in the previous exercise. Show that $H(t + s) = H(t)H(s)$ for real $t$ and $s$ (a group of operators), that $H(t)$ is unitary and that if $u_0 \in D(T)$, then $u(t) = H(t)u_0$ solves the Schrödinger equation $Tu = iu'_t$ with initial data $u(0) = u_0$.

Similarly, if $T \ge 0$ and $t \ge 0$, show that $K(t) = e^{-tT}$ is selfadjoint and bounded, that $K(t + s) = K(t)K(s)$ for $s \ge 0$ and $t \ge 0$ (a semigroup of operators) and that if $u_0 \in H$ then $u(t) = K(t)u_0$ solves the heat equation $Tu = -u'_t$ for $t > 0$ with initial data $u(0) = u_0$.
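One possible route to the differential equation for $H(t)u_0$ in Exercise 7.5 is sketched below. It is not taken from the notes; it assumes the norm formula $\|f(T)u\|^2 = \int_{-\infty}^{\infty} |f(s)|^2\,d\langle E_su,u\rangle$ from the functional calculus of Exercise 7.4, applied to $f_h(s) = e^{-its}\bigl((e^{-ihs}-1)/h + is\bigr)$.

```latex
\begin{align*}
\Bigl\|\frac{H(t+h)u_0-H(t)u_0}{h}+iTH(t)u_0\Bigr\|^2
  &= \int_{-\infty}^{\infty}\Bigl|e^{-its}\Bigl(\frac{e^{-ihs}-1}{h}+is\Bigr)\Bigr|^2
     d\langle E_su_0,u_0\rangle \\
  &= \int_{-\infty}^{\infty}\Bigl|\frac{e^{-ihs}-1}{h}+is\Bigr|^2
     d\langle E_su_0,u_0\rangle
   \longrightarrow 0 \qquad (h\to 0).
\end{align*}
```

The integrand tends to $0$ pointwise and is bounded by $4s^2$, because $|(e^{-ihs}-1)/h| \le |s|$; since $u_0 \in D(T)$ means $\int s^2\,d\langle E_su_0,u_0\rangle < \infty$, dominated convergence applies. This gives $u'(t) = -iTu(t)$, i.e., $Tu = iu'_t$.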


CHAPTER 8<br />

Compactness<br />

If a selfadjoint operator $T$ has a complete orthonormal sequence of eigen-vectors $e_1, e_2, \dots$, then for any $f \in H$ we have $f = \sum \hat{f}_j e_j$ where $\hat{f}_j = \langle f, e_j\rangle$ are the generalized Fourier coefficients; we have a generalized Fourier series. However, $\sigma_p(T)$ can still be very complicated; it may for example be dense in $\mathbb{R}$ (so that $\sigma(T) = \mathbb{R}$), and each eigenvalue can have infinite multiplicity. We have a considerably simpler situation, more similar to the case of the classical Fourier series, if the resolvent is compact.

Definition 8.1.

• A subset of a Hilbert space is called precompact (or relatively compact) if every sequence of points in the set has a strongly convergent subsequence.

• An operator $A : H_1 \to H_2$ is called compact if it maps bounded sets into precompact ones.

Note that in an infinite dimensional space it is not enough for a set to be bounded (or even closed and bounded) for it to be precompact. For example, the closed unit sphere is closed and bounded, and it contains an orthonormal sequence. But no orthonormal sequence has a strongly convergent subsequence!
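The reason no orthonormal sequence has a strongly convergent subsequence is a one-line computation, sketched here for an arbitrary orthonormal sequence $e_1, e_2, \dots$:

```latex
\[
\|e_j-e_k\|^2 = \|e_j\|^2 - 2\operatorname{Re}\langle e_j,e_k\rangle + \|e_k\|^2 = 2
\qquad (j\ne k).
\]
```

Any two distinct elements are thus at distance $\sqrt{2}$, so no subsequence can be a Cauchy sequence.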

The second point means that if $\{u_j\}_1^\infty$ is a bounded sequence in $H_1$, then $\{Au_j\}_1^\infty$ has a subsequence which converges strongly in $H_2$.

Theorem 8.2.

(1) The operator $A$ is compact if and only if every weakly convergent sequence is mapped onto a strongly convergent sequence. Equivalently, if $u_j \rightharpoonup 0$ implies that $Au_j \to 0$.

(2) If $A : H_1 \to H_2$ is compact and $B : H_3 \to H_1$ bounded, then $AB$ is compact.

(3) If $A : H_1 \to H_2$ is compact and $B : H_2 \to H_3$ bounded, then $BA$ is compact.

(4) If $A : H_1 \to H_2$ is compact, then so is $A^* : H_2 \to H_1$.

Proof. If $u_j \rightharpoonup u$ then $u_j - u \rightharpoonup 0$, and if $A(u_j - u) \to 0$ then $Au_j \to Au$. Thus the last statement of (1) is obvious. By Theorem 3.9 every bounded sequence has a weakly convergent subsequence, so if $A$ maps weakly convergent sequences into strongly convergent ones, then $A$ is compact. Conversely, suppose $u_j \rightharpoonup u$ and $A$ is compact. Since weakly convergent sequences are bounded (Theorem 3.9), any subsequence of $\{Au_j\}_1^\infty$ has a convergent subsequence. Suppose $Au_{j_k} \to v$. Then for any $w \in H$ we have $\langle v, w\rangle = \lim\langle Au_{j_k}, w\rangle = \lim\langle u_{j_k}, A^*w\rangle = \langle u, A^*w\rangle = \langle Au, w\rangle$, so that $v = Au$. Hence the only point of accumulation of $\{Au_j\}_1^\infty$ is $Au$, so $Au_j \to Au$.$^1$ This completes the proof of (1). We leave the rest of the proof as an exercise for the reader (Exercise 8.1). $\square$

Theorem 8.3. Suppose $T$ is selfadjoint and its resolvent $R_\mu$ is compact for some $\mu$. Then $R_\lambda$ is compact for all $\lambda \in \rho(T)$, and $T$ has discrete spectrum, i.e., $\sigma(T)$ consists of isolated eigenvalues with finite multiplicity.

Proof. By the resolvent relation $R_\lambda = (I + (\lambda - \mu)R_\lambda)R_\mu$ where $I$ is the identity so the first factor to the right is bounded. Hence $R_\lambda$ is compact by Theorem 8.2(3).

Now let $\Delta$ be a bounded interval. If $u \in E_\Delta H$ then $\|R_\lambda u\|^2 = \int_\Delta \frac{d\langle E_tu, u\rangle}{|t-\lambda|^2} \ge K\|u\|^2$ where $K = \inf_{t\in\Delta} |t-\lambda|^{-2} > 0$ (verify this calculation!). We have $R_\lambda u_j \to 0$ if $u_j \rightharpoonup 0$, so the inequality shows that any weakly convergent sequence in $E_\Delta H$ is strongly convergent (the identity operator on $E_\Delta H$ is compact). This implies that $E_\Delta H$ has finite dimension (for example since an orthonormal sequence converges weakly to $0$ but is not strongly convergent). In particular eigenspaces are finite-dimensional. It also follows that any bounded interval can only contain a finite number of points of increase for $E_t$, because projections belonging to disjoint intervals have orthogonal ranges (Exercise 8.2). This completes the proof. $\square$
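The calculation the reader is asked to verify in the proof above can be sketched as follows. The sketch assumes the norm formula $\|R_\lambda u\|^2 = \int_{-\infty}^{\infty} |t-\lambda|^{-2}\,d\langle E_tu,u\rangle$ from the spectral theorem, and that for $u \in E_\Delta H$ the measure $d\langle E_tu,u\rangle$ vanishes outside $\Delta$.

```latex
\begin{align*}
\|R_\lambda u\|^2
  &= \int_{\Delta}\frac{d\langle E_tu,u\rangle}{|t-\lambda|^2}
   \ge \Bigl(\inf_{t\in\Delta}|t-\lambda|^{-2}\Bigr)\int_{\Delta} d\langle E_tu,u\rangle
   = K\|u\|^2, \\
K &= \Bigl(\sup_{t\in\Delta}|t-\lambda|\Bigr)^{-2} > 0
   \quad\text{since $\Delta$ is bounded.}
\end{align*}
```

If $u_j \rightharpoonup 0$ in $E_\Delta H$, compactness of $R_\lambda$ gives $R_\lambda u_j \to 0$, and then $K\|u_j\|^2 \le \|R_\lambda u_j\|^2 \to 0$ forces $u_j \to 0$ strongly.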

Resolvents for different selfadjoint extensions of a symmetric operator are closely related. In particular, we have the following theorem.

Theorem 8.4. Suppose a densely defined symmetric operator $T_0$ has a selfadjoint extension with compact resolvent and that $\dim D_\lambda < \infty$ for some $\lambda \in \mathbb{C} \setminus \mathbb{R}$. Then every selfadjoint extension of $T_0$ has compact resolvent.

Proof. Let $\operatorname{Im}\lambda \ne 0$ and $R_\lambda$, $\tilde{R}_\lambda$ be resolvents of selfadjoint extensions of $T_0$. Then $A = R_\lambda - \tilde{R}_\lambda$ has its range in $D_\lambda$, since $R_\lambda u$ and $\tilde{R}_\lambda u$ both solve the equation $T_0^* v = \lambda v + u$. It follows that $A$ is a compact operator, since if $\{u_j\}_1^\infty$ is a bounded sequence in $H$, then $\{Au_j\}_1^\infty$ is a bounded sequence in a finite-dimensional space. By the Bolzano-Weierstrass theorem there is therefore a convergent subsequence. If $\tilde{R}_\lambda$ is compact it therefore follows that $R_\lambda = \tilde{R}_\lambda + A$ is compact. $\square$

$^1$ If not, there would be a neighborhood $O$ of $Au$ and a subsequence of $\{Au_j\}_{j=1}^\infty$ that were outside $O$. But we could then find a convergent subsequence which does not converge to $Au$.



A natural question is now: How do I, in a concrete case, recognize that an operator is compact? One class of compact operators which are sometimes easy to recognize are the Hilbert-Schmidt operators.

Definition 8.5. $A : H \to H$ is called a Hilbert-Schmidt operator if for some complete orthonormal sequence $e_1, e_2, \dots$ we have $\sum \|Ae_j\|^2 < \infty$. The number $|||A||| = \sqrt{\sum \|Ae_j\|^2}$ is called the Hilbert-Schmidt norm of $A$.

Lemma 8.6. $|||A|||$ is independent of the particular complete orthonormal sequence used in the definition, it is a norm, $|||A||| = |||A^*|||$, and any Hilbert-Schmidt operator is compact. The set of Hilbert-Schmidt operators on $H$ is a Hilbert space in the Hilbert-Schmidt norm.

Proof. It is clear that $|||\cdot|||$ is a norm. Now suppose $\{e_j\}_1^\infty$ and $\{f_j\}_1^\infty$ are arbitrary complete orthonormal sequences. Using Parseval's formula twice it follows that $\sum_j \|Ae_j\|^2 = \sum_{j,k} |\langle Ae_j, f_k\rangle|^2 = \sum_{j,k} |\langle e_j, A^*f_k\rangle|^2 = \sum_k \|A^*f_k\|^2$. Thus the Hilbert-Schmidt norm has the claimed properties. To see that $A$ is compact, suppose $u_j \rightharpoonup 0$ and let $\varepsilon > 0$. Choose $N$ so large that $\sum_{j=N}^{\infty} \|A^*e_j\|^2 < \varepsilon$ and let $C$ be a bound for the sequence $\{u_j\}_1^\infty$. By Parseval's formula we then have $\|Au_k\|^2 = \sum |\langle Au_k, e_j\rangle|^2 = \sum |\langle u_k, A^*e_j\rangle|^2$. We obtain
\[
\|Au_k\|^2 \le \sum_{j=1}^{N} |\langle u_k, A^*e_j\rangle|^2 + C^2\varepsilon \to C^2\varepsilon
\]
as $k \to \infty$ since $|\langle u_k, A^*e_j\rangle| \le C\|A^*e_j\|$. It follows that $Au_k \to 0$ so that $A$ is compact. We leave the proof of the last statement as an exercise for the reader (Exercise 8.4). $\square$

It is usual to consider a differential operator defined in some domain $\Omega \subset \mathbb{R}^n$ as an operator in the space $L^2(\Omega, w)$ where $w > 0$ is measurable and the scalar product in the space is given by $\langle u, v\rangle = \int_\Omega u(x)\overline{v(x)}w(x)\,dx$. In all reasonable cases the resolvent of such an operator can be realized as an integral operator, i.e., an operator of the form
\[
Au(x) = \int_\Omega g(x, y)u(y)w(y)\,dy \quad \text{for } x \in \Omega. \tag{8.1}
\]
The function $g$, defined in $\Omega \times \Omega$, is called the integral kernel of the operator $A$. The integral kernel of the resolvent of a differential operator is usually called Green's function for the operator.

Theorem 8.7. Assume $g(x, y)$ is measurable as a function of both its variables and that $y \mapsto g(x, y)$ is in $L^2(\Omega, w)$ for a.a. $x \in \Omega$. Then the operator $A$ of (8.1) is a Hilbert-Schmidt operator in $L^2(\Omega, w)$ if and only if $g \in L^2(\Omega, w) \otimes L^2(\Omega, w)$, i.e., if and only if
\[
\iint_{\Omega\times\Omega} |g(x, y)|^2 w(x)w(y)\,dx\,dy < \infty.
\]

Proof. Let $\{e_j\}_1^\infty$ be a complete orthonormal sequence in the space $L^2(\Omega, w)$. For fixed $x \in \Omega$ we may view $Ae_j(x)$ as the $j$th Fourier coefficient of $g(x, \cdot)$ so Parseval's formula gives $\sum |Ae_j(x)|^2 = \int_\Omega |g(x, y)|^2 w(y)\,dy$ for a.a. $x \in \Omega$. By monotone convergence the product of this function by $w$ is in $L^1(\Omega)$ if and only if the Hilbert-Schmidt norm of $A$ is finite. The theorem now follows by an application of Tonelli's theorem (i.e., a positive, measurable function is integrable over $\Omega \times \Omega$ if and only if the iterated integral is finite). $\square$

Example 8.8. Consider the operator $T$ in $L^2(-\pi, \pi)$ with domain $D(T)$, consisting of those absolutely continuous functions $u$ with derivative in $L^2(-\pi, \pi)$ for which $u(\pi) = u(-\pi)$, and given by $Tu = -i\frac{du}{dx}$ (cf. Example 4.8). This operator is self-adjoint and its resolvent is given by $R_\lambda u(x) = \int_{-\pi}^{\pi} g(x, y, \lambda)u(y)\,dy$ where Green's function $g(x, y, \lambda)$ is given by
\[
g(x, y, \lambda) =
\begin{cases}
-\dfrac{e^{-i\lambda\pi}}{2\sin\lambda\pi}\,e^{i\lambda(x-y)}, & y < x, \\[1ex]
-\dfrac{e^{i\lambda\pi}}{2\sin\lambda\pi}\,e^{i\lambda(x-y)}, & y > x.
\end{cases}
\]
The reader should verify this! Since $\iint |g(x, y, \lambda)|^2\,dx\,dy < \infty$ for non-integer $\lambda$ the resolvent is a Hilbert-Schmidt operator, so it is compact.
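One way to carry out the verification of this Green's function is sketched below. It solves $-iu' - \lambda u = f$ with $u(\pi) = u(-\pi)$ by an integrating factor; the constant $c$ and the abbreviation $F$ are introduced here only for illustration.

```latex
\begin{align*}
u(x) &= e^{i\lambda x}\Bigl(c + i\int_{-\pi}^{x} e^{-i\lambda y}f(y)\,dy\Bigr),
  \qquad F = \int_{-\pi}^{\pi} e^{-i\lambda y}f(y)\,dy, \\
u(\pi) = u(-\pi)
  &\iff c\,(e^{-i\lambda\pi}-e^{i\lambda\pi}) = i e^{i\lambda\pi}F
   \iff c = -\frac{e^{i\lambda\pi}}{2\sin\lambda\pi}\,F, \\
u(x) &= \int_{-\pi}^{\pi}\Bigl(-\frac{e^{i\lambda\pi}}{2\sin\lambda\pi}\Bigr)e^{i\lambda(x-y)}f(y)\,dy
      + i\int_{-\pi}^{x} e^{i\lambda(x-y)}f(y)\,dy, \\
-\frac{e^{i\lambda\pi}}{2\sin\lambda\pi} + i
  &= \frac{-e^{i\lambda\pi} + (e^{i\lambda\pi}-e^{-i\lambda\pi})}{2\sin\lambda\pi}
   = -\frac{e^{-i\lambda\pi}}{2\sin\lambda\pi}.
\end{align*}
```

Reading off the kernel for $y > x$ (first integral only) and for $y < x$ (both integrals) gives the two cases of $g(x, y, \lambda)$. The construction requires $\sin\lambda\pi \ne 0$, i.e., non-integer $\lambda$, and the kernel is then bounded on the bounded square $(-\pi,\pi)^2$, hence square integrable.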

Now consider the operator of Example 4.6. Green's function is now only defined for non-real $\lambda$ and given by
\[
g(x, y, \lambda) =
\begin{cases}
i\,\dfrac{\operatorname{Im}\lambda}{|\operatorname{Im}\lambda|}\,e^{i\lambda(x-y)} & \text{if } (x - y)\operatorname{Im}\lambda > 0, \\
0 & \text{otherwise.}
\end{cases} \tag{8.2}
\]
The reader should verify this as well! In this case there is no value of $\lambda$ for which $g(\cdot, \cdot, \lambda) \in L^2(\mathbb{R}^2)$ so the resolvent is not a Hilbert-Schmidt operator.
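That the kernel (8.2) is never square integrable can be checked directly. The sketch below treats the case $\operatorname{Im}\lambda > 0$ and uses $|e^{i\lambda(x-y)}| = e^{-(\operatorname{Im}\lambda)(x-y)}$; the case $\operatorname{Im}\lambda < 0$ is the same with $|\operatorname{Im}\lambda|$ and the region $x < y$.

```latex
\begin{align*}
\iint_{\mathbb{R}^2}|g(x,y,\lambda)|^2\,dx\,dy
  &= \int_{-\infty}^{\infty}\Bigl(\int_{y}^{\infty} e^{-2(\operatorname{Im}\lambda)(x-y)}\,dx\Bigr)dy
   = \int_{-\infty}^{\infty}\frac{dy}{2\operatorname{Im}\lambda}
   = \infty .
\end{align*}
```

The inner integral does not depend on $y$, which reflects the translation invariance of the kernel and is what makes the double integral diverge.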



Exercises for Chapter 8<br />

Exercise 8.1. Prove Theorem 8.2(2)–(4).

Exercise 8.2. Show that if $\Delta_1$ and $\Delta_2$ are disjoint intervals and $\{E_t\}_{t\in\mathbb{R}}$ a resolution of the identity, then the ranges of $E_{\Delta_1}$ and $E_{\Delta_2}$ are orthogonal. Generalize to the case when $\Delta_1$ and $\Delta_2$ are arbitrary Borel sets in $\mathbb{R}$.

Exercise 8.3. Show the converse of Theorem 8.3, i.e., if the spectrum consists of isolated eigen-values of finite multiplicity, then the resolvent is compact.

Hint: Let $\lambda_1, \lambda_2, \dots$ be the eigenvalues ordered by increasing absolute value and repeated according to multiplicity and let the corresponding normalized eigen-vectors be $e_1, e_2, \dots$. Show that $\|R_\lambda u\|^2 = \sum \frac{|\langle u, e_j\rangle|^2}{|\lambda - \lambda_j|^2}$ and use this to see that $R_\lambda u_k \to 0$ if $u_k \rightharpoonup 0$.

Exercise 8.4. Prove the last statement of Lemma 8.6.

Exercise 8.5. Verify all claims made in Example 8.8.

Exercise 8.6. Let $T$ be a selfadjoint operator. Show that if the resolvent $R_\lambda$ of $T$ is a Hilbert-Schmidt operator and $\lambda_j$, $j = 1, 2, \dots$ are the non-zero eigenvalues of $T$, then $\sum_{j=1}^{\infty} \lambda_j^{-2} < \infty$.


CHAPTER 9<br />

Extension theory<br />

We will here complete the discussion on selfadjoint extensions of a symmetric operator begun in Chapter 4. This material is originally due to von Neumann although our proofs are different, and we will also discuss an extension of von Neumann's theory needed in Chapter 13.

1. Symmetric operators<br />

We shall find criteria for the existence of selfadjoint extensions of a densely defined symmetric operator, which according to the discussion just before Example 4.4 must be a restriction of the adjoint operator. We shall deal extensively with the graphs of various operators and it will be convenient to use the same notation for the graph of an operator $T$ as for $T$ itself. Note that if $T$ is a closed operator on the Hilbert space $H$, then its graph is a closed subspace of $H \oplus H$, so in this case $T$ is itself a Hilbert space.

Recall that with the present notation we have
\[
T^* = \mathcal{U}((H \oplus H) \ominus T) = (H \oplus H) \ominus \mathcal{U}T
\]
according to (4.1), where $\mathcal{U} : H \oplus H \ni (u, v) \mapsto (-iv, iu)$ is the boundary operator introduced in Chapter 4. Also recall that $\mathcal{U}$ is selfadjoint, unitary and involutory on $H \oplus H$.

So, assume we have a densely defined symmetric operator $T$. We want to investigate what selfadjoint extensions, if any, $T$ has. Since $T \subset T^*$ the adjoint is densely defined and thus the closure $\overline{T} = T^{**}$ exists (Proposition 4.3) and is also symmetric. We may therefore as well assume that $T$ is closed to begin with. Recall that if $S$ is a symmetric extension of $T$, then it is a restriction of $T^*$ since we then have $T \subset S \subset S^* \subset T^*$. Now put
\[
\mathbf{D}_{\pm i} = \{U \in T^* \mid \mathcal{U}U = \pm U\},
\]
where $\mathcal{U}$ is the boundary operator and $U$ denotes an element of $T^*$. Note that $U \in T^*$ means exactly that $U = (u, T^*u)$ for some $u \in D(T^*)$. It is immediately seen that $\mathbf{D}_i$ and $\mathbf{D}_{-i}$ consist of the elements of $T^*$ of the form $(u, iu)$ and $(u, -iu)$ respectively, so that $u$ satisfies the equation $T^*u = iu$ respectively $T^*u = -iu$. We may therefore identify these spaces with the deficiency spaces $D_{\pm i}$ introduced in Chapter 5. Also $\mathbf{D}_{\pm i}$ are therefore called deficiency spaces.

Theorem 9.1 (von Neumann). If $T$ is a closed and symmetric operator, then $T^* = T \oplus \mathbf{D}_i \oplus \mathbf{D}_{-i}$.


Proof. The facts that $\mathbf{D}_i$ and $\mathbf{D}_{-i}$ are eigenspaces of the unitary operator $\mathcal{U}$ for different eigenvalues and $\langle T, \mathcal{U}T^*\rangle = 0$ imply that $T$, $\mathbf{D}_i$ and $\mathbf{D}_{-i}$ are orthogonal subspaces of $T^*$ (cf. Exercise 4.8). It remains to show that $\mathbf{D}_i \oplus \mathbf{D}_{-i}$ contains $T^* \ominus T$. However, $U \in T^* \ominus T$ implies $U \in H^2 \ominus T$ and thus $\mathcal{U}U \in T^*$. Denoting the identity on $H^2$ by $I$ and using $\mathcal{U}^2 = I$ one obtains $U_+ = \frac{1}{2}(I + \mathcal{U})U \in \mathbf{D}_i$ and $U_- = \frac{1}{2}(I - \mathcal{U})U \in \mathbf{D}_{-i}$. Clearly $U = U_+ + U_-$ so this proves the theorem. $\square$
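The two facts about the boundary operator used in this proof can be checked in a few lines. This is a sketch using only $\mathcal{U}(u, v) = (-iv, iu)$ and $\mathcal{U}^2 = I$, with $\mathcal{U}$ denoting the boundary operator and $U$ an element of $T^*$.

```latex
\begin{align*}
\mathcal{U}(u, iu) &= (-i\cdot iu,\ iu) = (u, iu), \\
\mathcal{U}(u,-iu) &= (-i\cdot(-iu),\ iu) = (-u, iu) = -(u,-iu), \\
\mathcal{U}U_\pm &= \tfrac12(\mathcal{U}\pm\mathcal{U}^2)U
                 = \tfrac12(\mathcal{U}\pm I)U
                 = \pm\tfrac12(I\pm\mathcal{U})U = \pm U_\pm .
\end{align*}
```

So elements of the form $(u, iu)$ and $(u, -iu)$ are eigenvectors of the boundary operator for the eigenvalues $+1$ and $-1$, and $U_\pm = \frac{1}{2}(I \pm \mathcal{U})U$ are the components of $U$ in these two eigenspaces.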

We define the deficiency indices of $T$ to be
\[
n_+ = \dim \mathbf{D}_i = \dim D_i \quad\text{and}\quad n_- = \dim \mathbf{D}_{-i} = \dim D_{-i}
\]
so these are natural numbers or $\infty$. We may now characterize the symmetric extensions of $T$.

Theorem 9.2. If $S$ is a closed, symmetric extension of the closed symmetric operator $T$, then $S = T \oplus D$ where $D$ is a subspace of $\mathbf{D}_i \oplus \mathbf{D}_{-i}$ such that
\[
D = \{u + Ju \mid u \in D(J) \subset \mathbf{D}_i\}
\]
for some linear isometry $J$ of a closed subspace $D(J)$ of $\mathbf{D}_i$ onto part of $\mathbf{D}_{-i}$. Conversely, every such space $D$ gives rise to a closed symmetric extension $S = T \oplus D$ of $T$.

The proof is obvious after noting that if $u_+, v_+ \in \mathbf{D}_i$ and $u_-, v_- \in \mathbf{D}_{-i}$, then $\langle u_+, v_+\rangle = \langle u_-, v_-\rangle$ precisely if $\langle u_+ + u_-, \mathcal{U}(v_+ + v_-)\rangle = 0$.
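The observation behind this proof can be expanded as follows. The sketch uses only that the boundary operator (written $\mathcal{U}$ here) acts as $+1$ on the deficiency space for $i$ and as $-1$ on the one for $-i$, and that these two spaces are orthogonal.

```latex
\begin{align*}
\langle u_+ + u_-,\ \mathcal{U}(v_+ + v_-)\rangle
  &= \langle u_+ + u_-,\ v_+ - v_-\rangle \\
  &= \langle u_+,v_+\rangle - \langle u_+,v_-\rangle
     + \langle u_-,v_+\rangle - \langle u_-,v_-\rangle \\
  &= \langle u_+,v_+\rangle - \langle u_-,v_-\rangle .
\end{align*}
```

With $u_- = Ju_+$ and $v_- = Jv_+$ the right-hand side vanishes for all $u_+, v_+ \in D(J)$ exactly when $J$ preserves scalar products, which is how the isometry $J$ enters the description of the symmetric extensions.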

Some immediate consequences of Theorem 9.2 are as follows.

Corollary 9.3. The closed symmetric operator $T$ is maximal symmetric precisely if one of $n_+$ and $n_-$ equals zero and selfadjoint precisely if $n_+ = n_- = 0$.

Corollary 9.4. If $S$ is the symmetric extension of the closed symmetric operator $T$ given as in Theorem 9.2 by the isometry $J$ with domain $D(J) \subset \mathbf{D}_i$ and range $R_J \subset \mathbf{D}_{-i}$, then the deficiency spaces for $S$ are $\mathbf{D}_i(S) = \mathbf{D}_i \ominus D(J)$ and $\mathbf{D}_{-i}(S) = \mathbf{D}_{-i} \ominus R_J$ respectively.

Proof. If $D \subset \mathbf{D}_i \oplus \mathbf{D}_{-i}$ and $S = T \oplus D$ is symmetric, then $u \in \mathbf{D}_i(S) \subset \mathbf{D}_i$ precisely if $\langle T \oplus D, \mathcal{U}u\rangle = 0$. But $\langle T, \mathcal{U}u\rangle = 0$ and if $u_+ + u_- \in D$ with $u_+ \in \mathbf{D}_i$, $u_- \in \mathbf{D}_{-i}$ then $\langle u_+ + u_-, u\rangle = \langle u_+, u\rangle$ which shows that $\mathbf{D}_i(S) = \mathbf{D}_i \ominus D(J)$. Similarly the statement about $\mathbf{D}_{-i}(S)$ follows. $\square$

Corollary 9.5. Every symmetric operator has a maximal symmetric extension. If one of $n_+$ and $n_-$ is finite, then all or none of the maximal symmetric extensions are selfadjoint depending on whether $n_+ = n_-$ or not. If $n_+ = n_- = \infty$, however, some maximal symmetric extensions are selfadjoint and some are not.



We will now generalize Theorem 9.1. To do this, we use the notation of Lemma 5.1. Define
\[
\mathbf{D}_\lambda = \{(u, \lambda u) \in T^*\} = \{(u, \lambda u) \mid u \in D_\lambda\},
\qquad
\mathbf{E}_\lambda = \{(u, \lambda u + v) \in T^* \mid v \in D_{\bar\lambda}\}.
\]
It is clear that $\mathbf{E}_\lambda$ for non-real $\lambda$ is the direct sum of $\mathbf{D}_\lambda$ and $\mathbf{D}_{\bar\lambda}$ since if $a = \frac{i}{2\operatorname{Im}\lambda}$ we have $(u, \lambda u + v) = a(v, \bar\lambda v) + (u - av, \lambda(u - av))$. This direct sum is topological (i.e., the projections from $\mathbf{E}_\lambda$ onto $\mathbf{D}_\lambda$ and $\mathbf{D}_{\bar\lambda}$ are bounded) since all three spaces are obviously closed. Thus the assertion follows from the closed graph theorem. Carry out the argument as an exercise! We can now prove the following theorem.

Theorem 9.6. For any non-real $\lambda$ we have $T^* = T \dotplus \mathbf{E}_\lambda$ as a topological direct sum.

Proof. Since all involved spaces are closed it is enough to show the formula algebraically (the reason is as above). Let $(u, v) \in T^*$. By Lemma 5.1.1 $H = S_\lambda \oplus D_{\bar\lambda}$ so we may write $v - \lambda u = w_0 + w_{\bar\lambda}$ with $w_{\bar\lambda} \in D_{\bar\lambda}$ and $w_0 \in S_\lambda$. We can find $u_0 \in H$ such that $(u_0, \lambda u_0 + w_0) \in T$ so $(u, v) = (u_0, \lambda u_0 + w_0) + (u - u_0, \lambda(u - u_0) + w_{\bar\lambda})$. The last term is obviously in $\mathbf{E}_\lambda$.

If $(u, v) \in T \cap \mathbf{E}_\lambda$ we have $v - \lambda u \in S_\lambda \cap D_{\bar\lambda} = \{0\}$ so that $\lambda$ is an eigenvalue of $T$ if $u \ne 0$. Since a symmetric operator has no non-real eigenvalues, it follows that $u = 0$, so the sum is direct. $\square$
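The algebraic decomposition of an element $(u, \lambda u + v)$ with $T^*u = \lambda u + v$ and $T^*v = \bar\lambda v$, which was stated before Theorem 9.6, can be checked directly. The following is a sketch, writing $\bar\lambda$ for the complex conjugate of $\lambda$ and $a = i/(2\operatorname{Im}\lambda)$.

```latex
\[
a(\bar\lambda-\lambda) = \frac{i}{2\operatorname{Im}\lambda}\,(-2i\operatorname{Im}\lambda) = 1 .
\]
\begin{align*}
a(v,\bar\lambda v) + (u-av,\ \lambda(u-av))
  &= \bigl(u,\ \lambda u + a(\bar\lambda-\lambda)v\bigr) = (u,\ \lambda u+v), \\
T^*(u-av) &= \lambda u + v - a\bar\lambda v
   = \lambda(u-av) + \bigl(1 - a(\bar\lambda-\lambda)\bigr)v = \lambda(u-av).
\end{align*}
```

The first line shows that the two terms add up to the given element, the second that $u - av$ satisfies $T^*(u - av) = \lambda(u - av)$, so the second term has the form $(w, \lambda w)$ with $T^*w = \lambda w$. The sum is direct for non-real $\lambda$, since $(w, \lambda w) = (w, \bar\lambda w)$ forces $(\lambda - \bar\lambda)w = 0$ and hence $w = 0$.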

Corollary 9.7. If $\operatorname{Im}\lambda > 0$ then $\dim D_\lambda = n_+$, $\dim D_{\bar\lambda} = n_-$.

Proof. Suppose $U = (u, T^*u)$ and $V = (v, T^*v)$ are in $T^*$. The boundary form
\[
\langle U, \mathcal{U}V\rangle = i(\langle u, T^*v\rangle - \langle T^*u, v\rangle)
\]
is a bounded Hermitian form on $T^*$. It is immediately verified that it is positive definite on $\mathbf{D}_\lambda$, negative definite on $\mathbf{D}_{\bar\lambda}$, non-negative on $T \dotplus \mathbf{D}_\lambda$ and non-positive on $T \dotplus \mathbf{D}_{\bar\lambda}$.

Let $\mu$ be a complex number with $\operatorname{Im}\mu > 0$. We get a linear map of $\mathbf{D}_\mu$ into $\mathbf{D}_\lambda$ in the following way. Given $u \in \mathbf{D}_\mu$ we may write $u = u_0 + u_\lambda + u_{\bar\lambda}$ uniquely with $u_0 \in T$, $u_\lambda \in \mathbf{D}_\lambda$ and $u_{\bar\lambda} \in \mathbf{D}_{\bar\lambda}$ according to Theorem 9.6. Let the image of $u$ in $\mathbf{D}_\lambda$ be $u_\lambda$. Then $u_\lambda$ can not be $0$ unless $u$ is, since the boundary form is positive definite on $\mathbf{D}_\mu$ but non-positive on $T \dotplus \mathbf{D}_{\bar\lambda}$. It follows that $\dim \mathbf{D}_\mu \le \dim \mathbf{D}_\lambda$. By symmetry the dimensions of $\mathbf{D}_\lambda$ and $\mathbf{D}_\mu$ are then equal, i.e., $\dim \mathbf{D}_\lambda = n_+$. Similarly one shows that $\dim \mathbf{D}_{\bar\lambda} = n_-$. $\square$
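The sign properties of the boundary form used in this proof can be verified as follows. This is a sketch; it uses that the scalar product is conjugate linear in its second argument, that the boundary operator (written $\mathcal{U}$ here) is selfadjoint, and that $T^*$ is the orthogonal complement of $\mathcal{U}T$.

```latex
\begin{align*}
U = (u,\lambda u):\quad
\langle U,\mathcal{U}U\rangle
  &= i\bigl(\langle u,\lambda u\rangle - \langle\lambda u,u\rangle\bigr)
   = i(\bar\lambda-\lambda)\|u\|^2
   = 2\operatorname{Im}\lambda\,\|u\|^2, \\
U_0\in T,\ W\in T^*:\quad
\langle U_0,\mathcal{U}W\rangle &= \langle\mathcal{U}U_0,W\rangle = 0, \\
\langle U_0+W,\ \mathcal{U}(U_0+W)\rangle &= \langle W,\mathcal{U}W\rangle .
\end{align*}
```

For $\operatorname{Im}\lambda > 0$ the form is therefore positive definite on elements $(u, \lambda u)$ and negative definite on elements $(u, \bar\lambda u)$, and adding an element of $T$ does not change its value, which gives the non-negativity and non-positivity statements on the two direct sums.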

2. Symmetric relations<br />

This section is a simplified version of Section 1 of [2]. Most of it can also be found in [1]. The theory of symmetric and selfadjoint relations is an easy extension of the corresponding theory for operators, but will be essential for Chapters 13 and 14.



We call a (closed) linear subspace $T$ of $H^2 = H \oplus H$ a (closed) linear relation on $H$. This is a generalization of the concept of (the graph of) a linear operator which will turn out to be useful in the following chapters. We still denote by $\mathcal{U}$ the boundary operator on $H^2$ and define the adjoint of the linear relation $T$ on $H$ by
\[
T^* = H^2 \ominus \mathcal{U}T = \mathcal{U}(H^2 \ominus T).
\]
Clearly $T^*$ is a closed linear relation on $H$. Note that by not insisting that $T$ and $T^*$ are graphs we can, for example, now deal with adjoints of non-densely defined operators. Naturally $T$ is called symmetric if $T \subset T^*$ and selfadjoint if $T = T^*$.

Proposition 9.8. Let $T \subset S$ be linear relations on $H$. Then $S^* \subset T^*$. The closure of $T$ is $\overline{T} = T^{**}$ and $(\overline{T})^* = T^*$.

The reader should prove this proposition as an exercise. It is very easy to obtain a spectral theorem for selfadjoint relations as a corollary to the spectral theorem of Chapter 7. Given a relation $T$ we call the set $D(T) = \{u \in H \mid (u, v) \in T \text{ for some } v \in H\}$ the domain of $T$. Now let $H_T$ be the closure of $D(T)$ in $H$ and put $H_\infty = \{u \in H \mid (0, u) \in T^*\}$. One may view $H_\infty$ as the 'eigen-space' of $T^*$ corresponding to the 'eigen-value' $\infty$.

Proposition 9.9. $H = H_T \oplus H_\infty$.

Proof. We have $\langle (u, v), \mathcal{U}(0, w)\rangle = i\langle u, w\rangle$ so that $(0, w) \in T^*$ precisely when $w \in H \ominus D(T)$. The proposition follows. $\square$

Now assume $T$ is selfadjoint and put $T_\infty = \{0\} \times H_\infty$ and $\tilde{T} = T \cap H_T^2$. Then it is clear that $T = \tilde{T} \oplus T_\infty$ so we have split $T$ into its many-valued part $T_\infty$ and $\tilde{T}$ which is called the operator part of $T$ because of the following theorem.

Theorem 9.10 (Spectral theorem for selfadjoint relations). If $T$ is selfadjoint, then $\tilde{T}$ is the graph of a densely defined selfadjoint operator in $H_T$ with domain $D(T)$.

Proof. $\tilde{T}$ is the graph of a densely defined operator on $H_T$ since $(0, w) \in \tilde{T}$ implies $w \in H_\infty \cap H_T = \{0\}$. $\tilde{T}$ is selfadjoint since $\tilde{T} = T \ominus T_\infty$ so its adjoint (in $H_T$) is $\tilde{T}^* = H_T^2 \cap (T^* \oplus \mathcal{U}T_\infty) = H_T^2 \cap T = \tilde{T}$ (check this calculation carefully!). $\square$

It is now clear that we get a resolution of the identity for $T$ by adjoining the orthogonal projector onto $H_\infty$ to the resolution of the identity for $\tilde{T}$.

Assume we have a symmetric relation $T$. We want to investigate what selfadjoint extensions, if any, $T$ has. Since the closure $\overline{T}$ of $T$ is also symmetric we may as well assume that $T$ is closed to begin with. Just as is the case for operators, if $S$ is a symmetric extension of $T$, then it is a restriction of $T^*$ since we then have $T \subset S \subset S^* \subset T^*$. Now put
\[
\mathbf{D}_{\pm i} = \{u \in T^* \mid \mathcal{U}u = \pm u\}.
\]
It is immediately seen that $\mathbf{D}_i$ and $\mathbf{D}_{-i}$ consist of the elements of $T^*$ of the form $(u, iu)$ and $(u, -iu)$ respectively. We call them the deficiency spaces of $T$. The following generalizes von Neumann's formula.

Theorem 9.11. For any closed and symmetric relation $T$ we have $T^* = T \oplus \mathbf{D}_i \oplus \mathbf{D}_{-i}$.

The proof is the same as for Theorem 9.1 and is left to Exercise 9.6. As before we define the deficiency indices of $T$ to be
\[
n_+ = \dim \mathbf{D}_i = \dim D_i \quad\text{and}\quad n_- = \dim \mathbf{D}_{-i} = \dim D_{-i}
\]
so these are again natural numbers or $\infty$. The next theorem is completely analogous to Theorem 9.2 with essentially the same proof, so we leave this as Exercise 9.7.

Theorem 9.12. If $S$ is a closed, symmetric extension of the closed symmetric relation $T$, then $S = T \oplus D$ where $D$ is a subspace of $\mathbf{D}_i \oplus \mathbf{D}_{-i}$ such that
\[
D = \{u + Ju \mid u \in D(J) \subset \mathbf{D}_i\}
\]
for some linear isometry $J$ of a closed subspace $D(J)$ of $\mathbf{D}_i$ onto part of $\mathbf{D}_{-i}$. Conversely, every such space $D$ gives rise to a closed symmetric extension $S = T \oplus D$ of $T$.

The following consequences of Theorem 9.12 are completely analogous to Corollaries 9.3–9.5, and their proofs are left as Exercise 9.8.<br />

Corollary 9.13. The closed symmetric relation $T$ is maximal symmetric precisely if one of $n_+$ and $n_-$ equals zero and selfadjoint precisely if $n_+ = n_- = 0$.<br />

Corollary 9.14. If $S$ is the symmetric extension of the closed symmetric relation $T$ given as in Theorem 9.12 by the isometry $J$ with domain $D(J) \subset \mathbf{D}_i$ and range $R_J \subset \mathbf{D}_{-i}$, then the deficiency spaces for $S$ are $\mathbf{D}_i(S) = \mathbf{D}_i \ominus D(J)$ and $\mathbf{D}_{-i}(S) = \mathbf{D}_{-i} \ominus R_J$ respectively.<br />

Corollary 9.15. Every symmetric relation has a maximal symmetric extension. If one of $n_+$ and $n_-$ is finite, then all or none of the maximal symmetric extensions are selfadjoint depending on whether $n_+ = n_-$ or not. If $n_+ = n_- = \infty$, however, some maximal symmetric extensions are selfadjoint and some are not.<br />

We will now prove a theorem generalizing Theorem 9.6. To do this, first note that Lemma 5.1 remains valid for relations, with the obvious definitions of $S_\lambda$ and $D_\lambda$ and identical proofs. We now define<br />

$$\mathbf{D}_\lambda = \{(u, \lambda u) \in T^*\} = \{(u, \lambda u) \mid u \in D_\lambda\},$$<br />

$$E_\lambda = \{(u, \lambda u + v) \in T^* \mid v \in D_{\overline{\lambda}}\}.$$<br />

As before it is clear that $E_\lambda$ for non-real $\lambda$ is the direct sum of $\mathbf{D}_\lambda$ and $\mathbf{D}_{\overline{\lambda}}$, since if $a = \frac{i}{2 \operatorname{Im} \lambda}$ we have $(u, \lambda u + v) = a(v, \overline{\lambda} v) + (u - av, \lambda(u - av))$, and that this direct sum is topological (i.e., the projections from $E_\lambda$ onto $\mathbf{D}_\lambda$ and $\mathbf{D}_{\overline{\lambda}}$ are bounded).<br />

Theorem 9.16. For any non-real $\lambda$ we have $T^* = T \dotplus E_\lambda$ as a topological direct sum.<br />

Corollary 9.17. If $\operatorname{Im} \lambda > 0$ then $\dim D_\lambda = n_+$, $\dim D_{\overline{\lambda}} = n_-$.<br />

The proofs of Theorem 9.16 and Corollary 9.17 are the same as those of Theorems 9.6 and 9.7 respectively, and are left as exercises.<br />



Exercises for Chapter 9<br />

Exercise 9.1. Fill in all missing details in the proofs of Theorem 9.2 and Corollaries 9.3–9.5.<br />

Exercise 9.2. Show that if $n_+ = n_- < \infty$, then if one selfadjoint extension of a symmetric operator has compact resolvent, then every other selfadjoint extension also has compact resolvent.<br />

Hint: The difference of the resolvents for two selfadjoint extensions of a symmetric operator has range contained in $D_\lambda$.<br />

Exercise 9.3. Suppose $T$ is a closed and symmetric operator on $H$, that $\lambda \in \mathbb{R}$ and that $S_\lambda$ is closed. Show that if $\lambda$ is not an eigenvalue of $T$, then $T^*$ is the topological direct sum of $T$ and $E_\lambda$ and that $n_+ = n_-$.<br />

You may also show that if $S_\lambda$ is closed but $\lambda$ is an eigenvalue of $T$, then one still has $n_+ = n_-$.<br />

Exercise 9.4. Suppose $T$ is a symmetric and positive operator, i.e., $\langle Tu, u\rangle \ge 0$ for every $u \in D(T)$. Use the previous exercise to show that $T$ has a selfadjoint extension (this is a theorem by von Neumann).<br />

Exercise 9.5. Suppose $T$ is a symmetric and positive operator. By the previous exercise $T$ has at least one selfadjoint extension. Prove that there exists a positive selfadjoint extension (the so called Friedrichs extension). This is a theorem by Friedrichs.<br />

Hint: First define $[u, v] = \langle Tu, v\rangle + \langle u, v\rangle$ for $u, v \in D(T)$, show that this is a scalar product, and let $H_1$ be the completion of $D(T)$ in the corresponding norm. Next show that $H_1$ may be identified with a subset of $H$ and that for any $u \in H$ the map $H_1 \ni v \mapsto \langle v, u\rangle$ is a bounded linear form on $H_1$. Conclude that $\langle u, v\rangle = [u, Gv]$ for $u \in H_1$ and $v \in H$, where $G$ is an operator on $H$ with range in $H_1$. Finally show that $G^{-1} - I$, where $I$ is the identity, is a positive selfadjoint extension of $T$.<br />

Exercise 9.6. Prove Theorem 9.11.<br />

Exercise 9.7. Prove Theorem 9.12.<br />

Exercise 9.8. Prove Corollaries 9.13–9.15.<br />

Exercise 9.9. Prove Theorem 9.16.<br />

Exercise 9.10. Prove Corollary 9.17.<br />


CHAPTER 10<br />

Boundary conditions<br />

A simple example of a formally symmetric differential equation is given by the general Sturm-Liouville equation<br />

$$-(pu')' + qu = wf. \tag{10.1}$$<br />

Here the coefficients $p$, $q$ and $w$ are given real-valued functions in a given interval $I$. Standard existence and uniqueness theorems for the initial value problem are valid if $1/p$, $q$ and $w$ are all in $L_{\mathrm{loc}}(I)$. There are (at least) two Hermitian forms naturally associated with this equation, namely $\int_I (pu'\overline{v'} + qu\overline{v})$ and $\int_I u\overline{v}w$. Under appropriate positivity conditions either of these forms is a suitable choice of scalar product for a Hilbert space in which to study (10.1). The corresponding problems are then called left definite and right definite respectively. We will not discuss left definite problems in these lectures.<br />

If $p$ is not differentiable it is most convenient to interpret (10.1) as a first order system<br />

$$\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} U' + \begin{pmatrix} q & 0 \\ 0 & -\frac{1}{p} \end{pmatrix} U = \begin{pmatrix} w & 0 \\ 0 & 0 \end{pmatrix} V.$$<br />

This equation becomes equivalent to (10.1) on setting $U = \begin{pmatrix} u \\ -pu' \end{pmatrix}$ and letting the first component of $V$ be $f$. It is a special case of a fairly general first order system<br />

$$Ju' + Qu = Wv \tag{10.2}$$<br />

where $J$ is a constant $n \times n$ matrix which is invertible and skew-Hermitian (i.e., $J^* = -J$) and the coefficients $Q$ and $W$ are $n \times n$ matrix-valued functions which are locally integrable on $I$. In addition $Q$ is assumed Hermitian and $W$ positive semi-definite, and $u$, $v$ are $n \times 1$ matrix-valued functions. We shall study such systems in Chapters 13–15.<br />
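As a check on the reduction of (10.1) to a first order system, the following short computation (our own verification, using only the definitions of $U$ and $V$ given above) confirms that the first row of the system reproduces (10.1), while the second row is an identity. The symbol $g$ for the second component of $V$ is introduced only for this check.<br />

```latex
\begin{aligned}
\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}
\begin{pmatrix} u' \\ -(pu')' \end{pmatrix}
+ \begin{pmatrix} q & 0 \\ 0 & -\frac{1}{p} \end{pmatrix}
\begin{pmatrix} u \\ -pu' \end{pmatrix}
&= \begin{pmatrix} -(pu')' \\ -u' \end{pmatrix}
+ \begin{pmatrix} qu \\ u' \end{pmatrix}
= \begin{pmatrix} -(pu')' + qu \\ 0 \end{pmatrix}, \\
\begin{pmatrix} w & 0 \\ 0 & 0 \end{pmatrix}
\begin{pmatrix} f \\ g \end{pmatrix}
&= \begin{pmatrix} wf \\ 0 \end{pmatrix}.
\end{aligned}
```

Note that the second component $g$ of $V$ plays no role, since the second row of the matrix multiplying $V$ vanishes.<br />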

Here we shall just deal with the case of the simple inhomogeneous scalar Sturm-Liouville equation<br />

$$-u'' + qu = \lambda u + f \tag{10.3}$$<br />

and the corresponding homogeneous eigenvalue problem $-u'' + qu = \lambda u$. The latter is often called the one-dimensional Schrödinger equation. In later chapters we shall then see that with minor additional technical complications we may deal with the first order system (10.2) in much the same way. This will of course include the more general Sturm-Liouville equation (10.1).<br />

We shall study (10.3) in the Hilbert space $L^2(I)$ where $I$ is an interval and the function $q$ is real-valued and locally integrable in $I$, i.e., integrable on every compact subinterval of $I$. In $L^2(I)$ the scalar product is $\langle u, v\rangle = \int_I u\overline{v}$.<br />

Before we begin we need to quote a few standard facts about Sturm-Liouville equations. Basic for what follows is the following existence and uniqueness theorem.<br />

Theorem 10.1. Suppose $q$ is locally integrable in an interval $I$ and that $c \in I$. Then, for any locally integrable function $f$ and arbitrary complex constants $A$, $B$ and $\lambda$ the initial value problem<br />

$$\begin{cases} -u'' + qu = \lambda u + f \text{ in } I, \\ u(c) = A, \quad u'(c) = B \end{cases}$$<br />

has a unique, continuously differentiable solution $u$ with locally absolutely continuous derivative defined in $I$. If $A$, $B$ are independent of $\lambda$ the solution $u(x, \lambda)$ and its $x$-derivative will be entire functions of $\lambda$, locally uniformly in $x$.<br />

We shall use this only if $f$ is actually locally square integrable. The theorem has the following immediate consequence.<br />

Corollary 10.2. Let $q$, $\lambda$ and $I$ be as in Theorem 10.1. Then the set of solutions to $-u'' + qu = \lambda u$ in $I$ is a 2-dimensional linear space.<br />

If one rewrites $-u'' + qu = \lambda u + f$ as a first order system according to the prescription before (10.2), then Theorem 10.1 and Corollary 10.2 become special cases of the theorems for first order systems given in Appendix C.<br />

In order to get a spectral theory for (10.3) we need to define a minimal operator, show that it is densely defined and symmetric, calculate its adjoint and find the selfadjoint restrictions of the adjoint.<br />

We define $T_c$ to be the operator $u \mapsto -u'' + qu$ with domain consisting of those continuously differentiable functions $u$ which have compact support, i.e., they are zero outside some compact subinterval of the interior of $I$, and which are such that $u'$ is locally absolutely continuous with $-u'' + qu \in L^2(I)$.<br />

We will show that $T_c$ is densely defined and symmetric and then calculate its adjoint, but first need some preparation. If $u$, $v$ are differentiable functions we define $[u, v] = u(x)v'(x) - u'(x)v(x)$. This is called the Wronskian of $u$ and $v$. It is clear that $[u, v] = -[v, u]$, in particular $[u, u] = 0$. The following elementary fact is very important.<br />

Proposition 10.3. If $u$ and $v$ are linearly independent solutions of $-v'' + qv = \lambda v$ on $I$, then the Wronskian $[u, v]$ is a non-zero constant on $I$.<br />

Proof. Differentiating we obtain $[u, v]' = uv'' - u''v = u(q - \lambda)v - (q - \lambda)uv = 0$ so that the Wronskian is constant. If the constant is zero, then given any point $c \in I$ the vectors $(u(c), u'(c))$ and $(v(c), v'(c))$ are proportional. Since the initial value problem has a unique solution this implies that $u$ and $v$ are proportional. □<br />
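As a small numerical illustration of Proposition 10.3, the following sketch evaluates the Wronskian at a few points. It assumes the special case $q = 0$ and $\lambda = k^2 > 0$ (our choice, not taken from the text), for which $\cos kx$ and $\sin kx$ are linearly independent solutions; the function name `wronskian` is ours.<br />

```python
import math

def wronskian(u, du, v, dv, x):
    """Return [u, v](x) = u(x) v'(x) - u'(x) v(x)."""
    return u(x) * dv(x) - du(x) * v(x)

# Assumed example: q = 0 and lambda = k**2, so that -v'' = lambda v
# has the linearly independent solutions cos(kx) and sin(kx).
k = 3.0
u = lambda x: math.cos(k * x)
du = lambda x: -k * math.sin(k * x)
v = lambda x: math.sin(k * x)
dv = lambda x: k * math.cos(k * x)

# The Wronskian should take the same non-zero value at every point.
values = [wronskian(u, du, v, dv, x) for x in (0.0, 0.4, 1.7, 5.0)]
print(values)
```

In this example the Wronskian equals $k(\cos^2 kx + \sin^2 kx) = k$ at every point, in agreement with the proposition.<br />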

Now let $v_1$ and $v_2$ be solutions of $-v'' + qv = \lambda v$ in $I$ such that $[v_1, v_2] = 1$ in $I$. There are certainly such solutions. We may for example pick a point $c \in I$ and specify initial values $v_1(c) = 1$, $v_1'(c) = 0$ respectively $v_2(c) = 0$, $v_2'(c) = 1$. By Proposition 10.3 the Wronskian is constant equal to its value at $c$, which is 1. The following lemma states a version of the classical method known as variation of constants for solving the inhomogeneous equation in terms of the solutions of the homogeneous equation.<br />

Lemma 10.4. Let $v_1$, $v_2$ be solutions of $-v'' + qv = \lambda v$ with $[v_1, v_2] = 1$, let $c \in I$ and suppose $f$ is locally integrable in $I$. Then the solution $u$ of $-u'' + qu = \lambda u + f$ with initial data $u(c) = u'(c) = 0$ is given by<br />

$$u(x) = v_1(x) \int_c^x v_2(y) f(y)\, dy - v_2(x) \int_c^x v_1(y) f(y)\, dy. \tag{10.4}$$<br />

Proof. With $u$ given by (10.4) clearly $u(c) = 0$. Differentiating we obtain<br />

$$u'(x) = v_1'(x) \int_c^x v_2 f - v_2'(x) \int_c^x v_1 f,$$<br />

since the two other terms obtained cancel. Thus $u'(c) = 0$. Differentiating again we obtain<br />

$$u''(x) = v_1''(x) \int_c^x v_2 f - v_2''(x) \int_c^x v_1 f - [v_1, v_2] f(x) = (q(x) - \lambda) u(x) - f(x),$$<br />

which was to be proved. □<br />
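The formula (10.4) is easy to try out numerically. The sketch below is only an illustration under assumptions of our own: $q = 0$, $\lambda = 1$, $f = 1$ and $c = 0$, with $v_1 = \cos$ and $v_2 = \sin$, so that $[v_1, v_2] = \cos^2 + \sin^2 = 1$. The helper names `integrate` and `particular_solution` are hypothetical, and the integrals are approximated by Simpson's rule.<br />

```python
import math

def integrate(g, a, b, n=2000):
    """Composite Simpson rule for the integral of g over [a, b] (n even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * g(a + j * h)
    return total * h / 3

def particular_solution(v1, v2, f, c):
    """Return the function u defined by formula (10.4)."""
    def u(x):
        int_v2f = integrate(lambda y: v2(y) * f(y), c, x)
        int_v1f = integrate(lambda y: v1(y) * f(y), c, x)
        return v1(x) * int_v2f - v2(x) * int_v1f
    return u

# Assumed example: q = 0, lambda = 1, f = 1, c = 0, v1 = cos, v2 = sin.
u = particular_solution(math.cos, math.sin, lambda y: 1.0, 0.0)
print(u(2.0), math.cos(2.0) - 1.0)
```

In this example (10.4) can be evaluated by hand: $u(x) = \cos x\,(1 - \cos x) - \sin^2 x = \cos x - 1$, which indeed satisfies $-u'' = u + 1$ and $u(0) = u'(0) = 0$.<br />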

Corollary 10.5. If $f \in L^1(I)$ with compact support in $I$ then $-u'' + qu = f$ has a solution $u$ with compact support in $I$ if and only if $\int_I vf = 0$ for all solutions $v$ of the homogeneous equation $-v'' + qv = 0$.<br />

Proof. If we choose $c$ to the left of the support of $f$, then by Lemma 10.4 the function $u$ given by (10.4) is the only solution of $-u'' + qu = f$ which vanishes to the left of $c$. Since $v_1$, $v_2$ are linearly independent the equation has a solution of compact support if and only if $f$ is orthogonal to both $v_1$ and $v_2$, which are a basis for the solutions of the homogeneous equation. The corollary follows. □<br />

Lemma 10.6. The operator $T_c$ is densely defined and symmetric. Furthermore, if $u \in D(T_c^*)$ and $f = T_c^* u$, then $u$ is differentiable with locally absolutely continuous derivative and satisfies $-u'' + qu = f$. Conversely, if $u, f \in L^2(I)$ and this equation is satisfied, then $u \in D(T_c^*)$ and $T_c^* u = f$.<br />

Proof. Let $u_1$ be a solution of $-u_1'' + qu_1 = f$. Assume $u_0$ is in the domain of $T_c$ and put $f_0 = T_c u_0$. Integrating by parts twice we get<br />

$$\int_I u_0 \overline{f} = \int_I u_0\, \overline{(-u_1'' + qu_1)} = \int_I (-u_0'' + qu_0)\, \overline{u_1} = \int_I f_0 \overline{u_1}. \tag{10.5}$$<br />

So, if $f$ is orthogonal to the domain of $T_c$, then $u_1$ is orthogonal to all compactly supported elements $f_0 \in L^2(I)$ for which there is a solution $u_0$ of $-u_0'' + qu_0 = f_0$ with compact support. By Corollary 10.5 it follows that $u_1$ solves $-v'' + qv = 0$ so that $f = 0$. Thus $T_c$ is densely defined.<br />

The calculation (10.5) also proves the converse part of the lemma. Furthermore, if $u$ is in the domain of $T_c^*$ with $T_c^* u = f$ we obtain $0 = \langle u_0, f\rangle - \langle f_0, u\rangle = \langle f_0, u_1 - u\rangle$. Just as before it follows that $u_1 - u$ solves the equation $-v'' + qv = 0$. It follows that $u$ solves the equation $-u'' + qu = f$ so that $T_c \subset T_c^*$. The proof is complete. □<br />

Being symmetric and densely defined $T_c$ is closable, and we define the minimal operator $T_0$ as the closure of $T_c$ and denote the domain of $T_0$ (the minimal domain) by $D_0$. Similarly, the maximal operator $T_1$ is $T_1 := T_c^*$ with domain $D_1 \supset D_0$. Thus the maximal domain $D_1$ consists of all differentiable functions $u \in L^2(I)$ such that $u'$ is locally absolutely continuous and $T_1 u = -u'' + qu \in L^2(I)$.<br />

We can now apply the theory of Chapter 9. The deficiency indices of $T_0$ are accordingly the number of solutions of $-u'' + qu = iu$ and $-u'' + qu = -iu$ respectively which are linearly independent and in $L^2(I)$. Since there are only 2 linearly independent solutions for each of these equations the deficiency indices can be no larger than 2. For the equation (10.3) the deficiency indices are always equal, since if $u$ solves $-u'' + qu = \lambda u$, then $\overline{u}$ solves the equation with $\lambda$ replaced by $\overline{\lambda}$, and linear independence is preserved when conjugating functions. Thus, for our equation there are only three possibilities: The deficiency indices may both be 2, both may be 1, or both may be 0. We shall see later that all three cases can occur, depending on the choice of $q$ and $I$.<br />

We will now take a closer look at how selfadjoint realizations are determined as restrictions of the maximal operator. Suppose $u_1$ and $u_2 \in D_1$. Then the boundary form (cf. Chapter 9) is<br />

$$\langle (u_1, T_1 u_1), U(u_2, T_1 u_2)\rangle = i \int_I \bigl(u_1 \overline{T_1 u_2} - T_1 u_1\, \overline{u_2}\bigr) = i \int_I \bigl(-u_1 \overline{u_2''} + u_1'' \overline{u_2}\bigr) = -i \int_I [u_1, \overline{u_2}]' = -i \lim_{K \to I} \bigl[u_1, \overline{u_2}\bigr]_K, \tag{10.6}$$<br />

the limit being taken over compact subintervals $K$ of $I$. We must restrict $T_1$ so that this vanishes. In some sense this means that the restriction of $T_1$ to a selfadjoint operator $T$ is obtained by boundary conditions since the limit clearly only depends on the values of $u_1$ and $u_2$ in arbitrarily small neighborhoods of the endpoints of $I$. This is of course the motivation for the terms boundary operator and boundary form.<br />

The simplest case is when an endpoint is an element of $I$. This means that the endpoint is a finite number, and that $q$ is integrable near the endpoint. Such an endpoint is called regular; otherwise the endpoint is singular. If both endpoints are regular, we say that we are dealing with a regular problem. We have a singular problem if at least one of the endpoints is infinite, or if $q \notin L^1(I)$.<br />

Consider now a regular problem. It is clear that the deficiency indices are both 2 in the regular case, since all solutions of $-u'' + qu = iu$ are continuous on the compact interval $I$ and thus in $L^2(I)$. We shall investigate which boundary conditions yield selfadjoint restrictions of $T_1$. The boundary form depends only on the boundary values $(u(a), u'(a), u(b), u'(b))$, and the possible boundary values constitute a linear subspace of $\mathbb{C}^4$. On the other hand, the boundary form is positive definite on $\mathbf{D}_i$ and negative definite on $\mathbf{D}_{-i}$, both of which are 2-dimensional spaces. The boundary values for the deficiency spaces therefore span two two-dimensional spaces which do not overlap. It follows that as $u$ ranges through $D_1$ the boundary values range through all of $\mathbb{C}^4$.<br />

The boundary conditions need to restrict the 4-dimensional space $\mathbf{D}_i \oplus \mathbf{D}_{-i}$ to the 2-dimensional space $D$ of Theorem 9.2, so two independent linear conditions are needed. This means that there are $2 \times 2$ matrices $A$ and $B$ such that the boundary conditions are given by $AU(a) + BU(b) = 0$, where $U = \begin{pmatrix} u \\ -u' \end{pmatrix}$. Linear independence of the conditions means that the $2 \times 4$ matrix $(A, B)$ must have linearly independent rows. Consider first the case when $A$ is invertible. Then the condition is of the form $U(a) = SU(b)$, where $S = -A^{-1}B$. If $J = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ then $(U_2)^* J U_1 = [u_1, \overline{u_2}]$, so by (10.6) the boundary form is $-i\{(U_2(b))^* J U_1(b) - (U_2(a))^* J U_1(a)\}$, and symmetry requires this to vanish. Inserting $U(a) = SU(b)$ the condition becomes $(U_2(b))^* (S^* J S - J) U_1(b) = 0$ where $U_1(b)$ and $U_2(b)$ are arbitrary $2 \times 1$ matrices. Thus it follows that the condition $U(a) = SU(b)$ gives a selfadjoint restriction of $T_1$ precisely if $S$ satisfies $S^* J S = J$. Such a matrix $S$ is called symplectic.<br />

Important special cases are when $S$ is plus or minus the unit matrix. These cases are called periodic and antiperiodic boundary conditions respectively. Another valid choice is $S = J$. Since $\det J = 1 \ne 0$ it is clear that any symplectic matrix $S$ satisfies $|\det S| = 1$ (see also Exercise 10.1). In particular, it is invertible. It is clear that the inverse of a symplectic matrix is also symplectic (show this!), so it follows that assuming the matrix $B$ to be invertible again leads to boundary conditions of the form $U(a) = SU(b)$ with a symplectic $S$.<br />
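The condition $S^* J S = J$ is easy to test for a concrete matrix. The following is a minimal sketch using plain nested lists for $2 \times 2$ matrices; the helper names `matmul`, `adjoint` and `is_symplectic` are our own, and the sample matrices other than $\pm I$ and $J$ are chosen by us purely for illustration.<br />

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adjoint(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

J = [[0, 1], [-1, 0]]

def is_symplectic(s, tol=1e-12):
    """Check whether S* J S = J holds up to the tolerance tol."""
    m = matmul(adjoint(s), matmul(J, s))
    return all(abs(m[i][j] - J[i][j]) <= tol
               for i in range(2) for j in range(2))

identity = [[1, 0], [0, 1]]        # periodic boundary conditions
minus_identity = [[-1, 0], [0, -1]]  # antiperiodic boundary conditions
shear = [[1, 2], [0, 1]]           # real matrix with determinant 1
phase = [[1j, 0], [0, 1j]]         # determinant -1, of absolute value 1
stretch = [[2, 0], [0, 1]]         # determinant 2, not symplectic

for name, s in [("I", identity), ("-I", minus_identity), ("J", J),
                ("shear", shear), ("phase", phase), ("stretch", stretch)]:
    print(name, is_symplectic(s))
```

The matrix `phase` shows that a symplectic matrix in the present sense need not have determinant 1; only $|\det S| = 1$ is forced.<br />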

It remains to consider the case when neither $A$ nor $B$ is invertible. Neither $A$ nor $B$ can then be zero, since then the other matrix must be invertible. Thus $A$ and $B$ both have linearly dependent rows, one of which has to be non-zero. We may assume the first row in $A$ to be non-zero, and then adding an appropriate multiple of the first row to the second in $(A, B)$ we may assume the second row of $A$ to be zero. The second row of $B$ will then be non-zero since the rows of $(A, B)$ are linearly independent, and then adding an appropriate multiple of the second row to the first we may cancel the first row of $B$. At this point the first row gives a condition on $U(a)$ and the second a condition on $U(b)$. Such boundary conditions are called separated. We end the discussion of the regular case by determining what separated boundary conditions give rise to selfadjoint restrictions of $T_1$.<br />

Separated boundary conditions require $u_1 \overline{u_2'} - u_1' \overline{u_2}$ to vanish in each endpoint. One possibility is of course to require $u_1$ (and $u_2$) to vanish there. Such a boundary condition is called a Dirichlet condition. If there is an element $u_1$ in the domain of the selfadjoint realization for which $u_1(a) \ne 0$ we obtain for $u_2 = u_1$ that $0 = u_1'(a)\overline{u_1(a)} - u_1(a)\overline{u_1'(a)} = 2i \operatorname{Im}\bigl(u_1'(a)\overline{u_1(a)}\bigr)$ so that $u_1'(a)\overline{u_1(a)}$ is real. Equivalently $u_1'(a)/u_1(a)$ is real, say $= -h \in \mathbb{R}$. If $u$ is any other element of the domain the condition for symmetry becomes $0 = u'(a)\overline{u_1(a)} - u(a)\overline{u_1'(a)} = (u'(a) + hu(a))\overline{u_1(a)}$ so that we must have $u'(a) + hu(a) = 0$. On the other hand, imposing this condition on all elements of the maximal domain clearly makes the boundary form at $a$ vanish. In particular, if $h = 0$ we have a Neumann boundary condition. We may of course find $\alpha \in (0, \pi)$ such that $h = \cot \alpha$, and multiplying through by $\sin \alpha$ the boundary condition becomes<br />

$$u(a) \cos \alpha + u'(a) \sin \alpha = 0, \tag{10.7}$$<br />

and then $\alpha = 0$ gives a Dirichlet condition. For $\alpha = \pi/2$ we obtain a Neumann condition, and any separated, selfadjoint boundary condition at $a$ is given by (10.7) for some $\alpha \in [0, \pi)$.<br />
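The passage from the real parameter $h$ to the angle $\alpha$ in (10.7) can be made concrete as follows. This is a small sketch with function names of our own choosing; it uses `math.atan2(1, h)`, which returns the angle in $(0, \pi)$ of the point with coordinates $(h, 1)$ and hence has cotangent $h$.<br />

```python
import math

def alpha_from_h(h):
    """Return the angle alpha in (0, pi) with cot(alpha) = h."""
    return math.atan2(1.0, h)

def boundary_residual(u_a, du_a, alpha):
    """Left hand side of (10.7) for given values u(a) and u'(a)."""
    return u_a * math.cos(alpha) + du_a * math.sin(alpha)

# Assumed sample data: h = 2, so the condition u'(a) + h u(a) = 0
# is satisfied by u(a) = 1, u'(a) = -2.
alpha = alpha_from_h(2.0)
print(alpha, boundary_residual(1.0, -2.0, alpha))

# h = 0 corresponds to the Neumann condition, alpha = pi/2.
print(alpha_from_h(0.0))
```

The Dirichlet condition $\alpha = 0$ is not reached by any finite $h$, which is why it has to be added separately to obtain the full range $\alpha \in [0, \pi)$.<br />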

To summarize: Separated, symmetric boundary conditions for a Sturm-Liouville equation are of the form (10.7) at $a$ with a similar condition at $b$ (possibly for a different value of $\alpha$, of course). Important special cases are $\alpha = 0$, a Dirichlet condition, and $\alpha = \pi/2$, a Neumann condition. Every other selfadjoint realization is given by coupled boundary conditions $U(a) = SU(b)$ for a symplectic matrix $S$. Important special cases are periodic and antiperiodic boundary conditions.<br />



Let us now consider the singular case. We will then first consider the case when one endpoint is regular and the other singular. So, assume that $I = [a, b)$ with $a$ regular and $b$ possibly singular.<br />

Lemma 10.7. There are elements of $D_1$ which vanish in a neighborhood of $b$ and have arbitrarily prescribed initial values $u(a)$ and $u'(a)$.<br />

Proof. Let $c \in (a, b)$ and $f \in L^2(a, b)$ vanish in $(c, b)$. Now solve $-u'' + qu = f$ with initial data $u(c) = u'(c) = 0$ so that $u$ vanishes in $(c, b)$. It follows that $u \in D_1$, and we need to show that $u(a)$ and $u'(a)$ can be chosen arbitrarily by selection of $f$. Note that if $-v'' + qv = 0$ integrating by parts twice shows that<br />

$$\langle f, v\rangle = \int_a^c (-u'' + qu)\overline{v} = \bigl[-u'\overline{v} + u\overline{v'}\bigr]_a^c = u'(a)\overline{v(a)} - u(a)\overline{v'(a)}.$$<br />

If $v_1$ and $v_2$ are solutions of $-v'' + qv = 0$ satisfying $v_1(a) = 1$, $v_1'(a) = 0$ and $v_2(a) = 0$, $v_2'(a) = -1$ respectively, we obtain $u(a) = \langle f, v_2\rangle$ and $u'(a) = \langle f, v_1\rangle$. Since $v_1$, $v_2$ are linearly independent we can choose $f$ to give arbitrary values to these, for example by choosing $f$ as an appropriate linear combination of $v_1$ and $v_2$ in $[a, c]$. □<br />

The fact that $T_1$ and $T_0$ are closed means that their domains are Hilbert spaces with norm-square $\|u\|_1^2 = \|u\|^2 + \|T_1 u\|^2$. We shall always view $D_1$ and $D_0$ as spaces in this way. We also note that if $u \in D_1$, then $u$ is continuously differentiable. If $K$ is a compact interval we define $C^1(K)$ to be the linear space of continuously differentiable functions provided with the norm $\|u\|_K = \sup_K |u| + \sup_K |u'|$. Convergence for a sequence $\{u_j\}_1^\infty$ in this space therefore means uniform convergence on $K$ of $u_j$ and $u_j'$ as $j \to \infty$. This space is easily seen to be complete, and thus a Banach space. As we noted above, if $K$ is a compact subinterval of $I$, then the restriction to $K$ of any element of $D_1$ is in $C^1(K)$. We will need the following fact.<br />

Lemma 10.8. For every compact subinterval $K \subset I$ there exists a constant $C_K$ such that $\|u\|_K \le C_K \|u\|_1$ for every $u \in D_1$. In particular, the linear forms $D_1 \ni u \mapsto u(x)$ and $D_1 \ni u \mapsto u'(x)$ are locally uniformly bounded in $x$.<br />

Proof. The restriction map $D_1 \ni u \mapsto u \in C^1(K)$ is linear and we will show that this map is closed. By the closed graph theorem (see Appendix A) it then follows that this map is bounded, which is the statement of the lemma.<br />

To show that the map is closed we must show that if $u_j \to u$ in $D_1$ and the restrictions to $K$ of $u_j$ converge to $\tilde u$ in $C^1(K)$, then the restriction to $K$ of $u$ equals $\tilde u$. But this is clear, since if $u_j$ converges in $L^2(I)$ to $u$, then their restrictions to $K$ converge in $L^2(K)$ to the restriction of $u$ to $K$. At the same time the restrictions to $K$ converge uniformly to $\tilde u$, so that $\int_K |u_j - u|^2$ converges both to 0 and to $\int_K |\tilde u - u|^2$. It follows that $u = \tilde u$ a.e. in $K$. □<br />

A bounded Hermitian form on a Hilbert space $H$ is a map $H \times H \ni (u, v) \mapsto B(u, v) \in \mathbb{C}$ such that $|B(u, v)| \le C\|u\|\|v\|$ for some constant $C$. It is clear that the boundedness of a Hermitian form is equivalent to it being continuous as a function of its arguments. The boundary form $i(\langle u, T_1 v\rangle - \langle T_1 u, v\rangle)$ is a bounded Hermitian form on $D_1$, i.e., it is a Hermitian form in $u$, $v$ and is bounded by $\|u\|_1 \|v\|_1$, and by Lemma 10.8 the boundary form at $a$, i.e., $u'(a)\overline{v(a)} - u(a)\overline{v'(a)}$, is also a bounded Hermitian form (bounded by $2C_K \|u\|_1 \|v\|_1$ if $a \in K$). Since<br />

$$i(\langle u, T_1 v\rangle - \langle T_1 u, v\rangle) = -i \lim_{x \to b} [u, \overline{v}](x) + i[u, \overline{v}](a)$$<br />

we see that $i \lim_{x \to b} \bigl(u'(x)\overline{v(x)} - u(x)\overline{v'(x)}\bigr)$, the boundary form at $b$, is also a bounded Hermitian form on $D_1$. Since the forms at $a$ and $b$ vanish if $u$ is in the domain of $T_c$, i.e., if $u$ vanishes near $a$ and $b$, it follows that they also vanish if $u \in D_0$. In particular, if $u \in D_0$, then $u(a) = u'(a) = 0$. Now $T_0$ is the adjoint of $T_1$ so it follows that this is the only condition at $a$ for an element of $D_1$ to be in $D_0$, since this guarantees that the form at $a$ vanishes. Of course, $u \in D_0$ also requires that the form at $b$ vanishes.<br />

Now let $T_a$ be the closure of the restriction of $T_1$ to those elements of $D_1$ which vanish near $b$, and let $D_a$ be the domain of $T_a$. Then Lemma 10.7 and the boundedness of the forms at $a$ and $b$ show that the boundary form at $b$ vanishes on $D_a$ and that $\dim D_1/D_0 \ge \dim D_a/D_0 \ge 2$. We obtain the following theorem.<br />

Theorem 10.9. If the interval $I$ has one regular endpoint $a$, then $n_+ = n_- \ge 1$. If $n_+ = n_- = 1$, then the boundary form at the singular endpoint vanishes on $D_1$, and any selfadjoint restriction of $T_1$ is given by a boundary condition of the form (10.7) at $a$ and no condition at all at $b$.<br />

Proof. If $n_+ = n_- = 0$ then $T_1 = T_0$ so that $T_1$ is selfadjoint, according to Theorem 9.1. But then we can not have $\dim D_1/D_0 \ge 2$. If $n_+ = n_- = 1$, then $2 = \dim D_1/D_0 \ge \dim D_a/D_0 \ge 2$ so that we must have $D_1 = D_a$. Thus the boundary form at the singular endpoint vanishes on $D_1$, and the boundary form at $a$ vanishes precisely if we impose a boundary condition of the form (10.7). □<br />

If n+ = n− = 2 we obta<strong>in</strong> a selfadjo<strong>in</strong>t restriction of T1 by impos<strong>in</strong>g<br />

two appropriate boundary conditions. One of them can be a condition<br />

of the form (10.7), and then a condition at the s<strong>in</strong>gular endpo<strong>in</strong>t also<br />

has to be imposed. There are also selfadjo<strong>in</strong>t restrictions obta<strong>in</strong>ed by<br />

impos<strong>in</strong>g coupled boundary conditions. See Exercise 10.3.<br />

Whether one obta<strong>in</strong>s deficiency <strong>in</strong>dices 1 or 2 when one endpo<strong>in</strong>t is<br />

regular clearly only depends on conditions near the s<strong>in</strong>gular endpo<strong>in</strong>t.



It is customary to say that a s<strong>in</strong>gular endpo<strong>in</strong>t is <strong>in</strong> the limit po<strong>in</strong>t condition<br />

if the deficiency <strong>in</strong>dices are 1 and <strong>in</strong> the limit circle condition if<br />

the deficiency <strong>in</strong>dices are 2. The term<strong>in</strong>ology derives from the methods<br />

Weyl [12] used <strong>in</strong> 1910 to construct the resolvent of a Sturm-Liouville<br />

operator.<br />

If an <strong>in</strong>terval has two s<strong>in</strong>gular endpo<strong>in</strong>ts <strong>in</strong> the limit po<strong>in</strong>t condition<br />

it is clear that T1 is selfadjo<strong>in</strong>t, s<strong>in</strong>ce the boundary form vanishes on<br />

D1. No boundary conditions are therefore required <strong>in</strong> this case, and<br />

one often says that the operator Tc is essentially selfadjo<strong>in</strong>t, s<strong>in</strong>ce its<br />

closure T0 = T1 is selfadjo<strong>in</strong>t. If one or both of the endpo<strong>in</strong>ts are <strong>in</strong> the<br />

limit circle condition, we have a situation similar to when the endpo<strong>in</strong>t<br />

is regular, and need to impose boundary conditions of a similar type.<br />

Note, however, that at a limit circle endpo<strong>in</strong>t the limits of u(x) and<br />

u ′ (x) for an element u ∈ D1 do not necessarily exist. To formulate the<br />

boundary conditions <strong>in</strong> explicit terms one may <strong>in</strong>stead use the idea of<br />

Exercise 10.3.<br />

It is clearly an important problem to f<strong>in</strong>d explicit conditions on the<br />

<strong>in</strong>terval and the coefficient q which guarantee limit po<strong>in</strong>t or limit circle<br />

conditions. A large number of different criterions for this are available<br />

today. We end the chapter by prov<strong>in</strong>g a simple criterion, known already<br />

to Weyl (with a more complicated proof), for the limit po<strong>in</strong>t condition<br />

at an <strong>in</strong>f<strong>in</strong>ite <strong>in</strong>terval endpo<strong>in</strong>t.<br />

Theorem 10.10. Suppose q is bounded from below near ∞. Then<br />

(10.3) is <strong>in</strong> the limit po<strong>in</strong>t condition at ∞.<br />

Proof. Suppose q > C on [a, ∞). Let u be the solution of −u′′ +<br />

qu = (C + i)u with initial data u(a) = 1, u′(a) = 0. We have<br />

\[ (\operatorname{Re}(u'\overline{u}))' = \operatorname{Re}(|u'|^2 + u''\overline{u}) = |u'|^2 + \operatorname{Re}((q - C - i)|u|^2) = |u'|^2 + (q - C)|u|^2 \ge 0. \]<br />

Thus $\operatorname{Re}(u'\overline{u})$ is increasing and $\operatorname{Re}(u'(a)\overline{u(a)}) = 0$, so $\operatorname{Re}(u'\overline{u}) \ge 0$. But<br />

$(|u|^2)' = 2\operatorname{Re}(u'\overline{u})$, so $|u|^2$ is increasing. It follows that $|u|^2 \ge 1$ so that<br />

one cannot have u ∈ L²(a, ∞). Thus the deficiency indices are < 2.<br />

For a more general result, see Exercise 10.4.<br />
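The monotonicity argument in the proof of Theorem 10.10 can be watched numerically. The following is only a sketch under stated assumptions: the potential q(x) = x², the interval [0, 3], the constant C = −1 and the classical Runge-Kutta step are illustrative choices that do not come from the text.<br />

```python
# Sketch: integrate -u'' + q u = (C + i) u with u(a) = 1, u'(a) = 0
# and record |u|^2. The potential q and all names are illustrative.

def q(x):
    return x * x              # assumed example potential, bounded below by 0


C = -1.0                      # any constant with q > C on [a, oo)
LAM = complex(C, 1.0)         # the spectral parameter C + i used in the proof


def rhs(x, u, du):
    """First order form of the equation: (u, u')' = (u', (q - lambda) u)."""
    return du, (q(x) - LAM) * u


def solve(a=0.0, b=3.0, steps=3000):
    """Classical RK4 scheme; returns the sampled values of |u|^2 on [a, b]."""
    h = (b - a) / steps
    x, u, du = a, 1.0 + 0.0j, 0.0 + 0.0j
    values = [abs(u) ** 2]
    for _ in range(steps):
        k1u, k1d = rhs(x, u, du)
        k2u, k2d = rhs(x + h / 2, u + h / 2 * k1u, du + h / 2 * k1d)
        k3u, k3d = rhs(x + h / 2, u + h / 2 * k2u, du + h / 2 * k2d)
        k4u, k4d = rhs(x + h, u + h * k3u, du + h * k3d)
        u += h / 6 * (k1u + 2 * k2u + 2 * k3u + k4u)
        du += h / 6 * (k1d + 2 * k2d + 2 * k3d + k4d)
        x += h
        values.append(abs(u) ** 2)
    return values


mod2 = solve()
# |u|^2 should never decrease (up to rounding) and should stay >= 1
nondecreasing = all(nxt >= cur - 1e-9 for cur, nxt in zip(mod2, mod2[1:]))
print(min(mod2), nondecreasing)
```

Since |u|² never drops below 1 in such a run, the computed solution is visibly not square integrable near ∞, which is the mechanism the proof uses.<br />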

Exercises for Chapter 10<br />

Exercise 10.1. Show that a symplectic 2 × 2 matrix S is of the<br />

form e iθ P where θ ∈ R and P is a real 2 × 2 matrix with determ<strong>in</strong>ant<br />

1. Also show that the <strong>in</strong>verses and adjo<strong>in</strong>ts of symplectic matrices are<br />

symplectic.<br />

Exercise 10.2 (Hard!). Suppose that all solutions of −v ′′ +qv = λv<br />

are <strong>in</strong> L 2 (I) for some real or non-real λ. Show that this is then true<br />

for all complex λ.



H<strong>in</strong>t: If −u ′′ + qu = µu, write this as −u ′′ + (q − λ)u = (µ − λ)u<br />

and use the variation of constants formula, th<strong>in</strong>k<strong>in</strong>g of (µ − λ)u as an<br />

<strong>in</strong>homogeneous term, to write down an <strong>in</strong>tegral equation for u <strong>in</strong> terms<br />

of solutions of −v ′′ + qv = λv. Us<strong>in</strong>g an <strong>in</strong>itial po<strong>in</strong>t sufficiently close<br />

to an endpo<strong>in</strong>t use estimates <strong>in</strong> this <strong>in</strong>tegral equation to show that u<br />

is square <strong>in</strong>tegrable near the endpo<strong>in</strong>t.<br />

Exercise 10.3. Show that<br />

[u1, v1][u2, v2] − [u1, v2][u2, v1] = [u1, u2][v1, v2]<br />

for differentiable functions u1, u2, v1, v2. Next show that if [v1, v2] = 1,<br />

then the boundary form for u1, u2 ∈ D1 at b equals $\lim_{x\to b}$([u1, v1][u2, v2] −<br />

[u1, v2][u2, v1]). Furthermore, show that if −v′′ + qv = λv and −u′′ +<br />

qu = f then ([u, v]) ′ = (f − λu)v. F<strong>in</strong>ally show that if all solutions of<br />

−v ′′ + qv = λv are <strong>in</strong> L 2 (a, b) and if u ∈ D1 then the limit at b of [u, v]<br />

exists.<br />

Conclude that <strong>in</strong> the case n+ = n− = 2 selfadjo<strong>in</strong>t boundary conditions<br />

may be described by conditions on the values of [u, v1], [u, v2] at<br />

the endpo<strong>in</strong>ts of exactly the same form as we described them for the<br />

regular case on the values of u, u ′ .<br />

Exercise 10.4. Show that (10.3) is <strong>in</strong> the limit po<strong>in</strong>t condition at<br />

∞ if q = q0 + q1 where q0 is bounded from below and q1 ∈ L 1 (0, ∞).<br />

Hint: First show that $|u(x)|^2 \le 2\int_0^x |u'||u|$ if u(0) = 0, then multiply<br />

by |q1| and integrate. Conclude that there is a constant A so that<br />

$\int_0^x |q_1||u|^2 \le \int_0^x |u'|^2 + A\int_0^x |u|^2$ for all x > 0. Now show, similar to the<br />

proof of Theorem 10.10, that |u| 2 is <strong>in</strong>creas<strong>in</strong>g if u(0) = 0, u ′ (0) = 1<br />

and u satisfies −u ′′ + qu = λu for an appropriate λ.


CHAPTER 11<br />

Sturm-Liouville equations<br />

The spectral theorem we proved <strong>in</strong> Chapter 7 is very powerful, but<br />

sometimes its abstract nature is a drawback, and one needs a more explicit<br />

expansion, analogous to Fourier series or Fourier transforms. A<br />

general theorem of this type was proved by von Neumann <strong>in</strong> 1949, but<br />

it is still of a fairly abstract nature. It can be applied to elliptic partial<br />

differential equations (G˚ard<strong>in</strong>g around 1952), but gives more satisfactory<br />

results when applied to ord<strong>in</strong>ary differential equations. How to<br />

do this was described by G˚ard<strong>in</strong>g <strong>in</strong> an appendix to John, Bers and<br />

Schechter: Partial Differential Equations, (1964). A slightly more general<br />

situation was treated <strong>in</strong> [2]. For Sturm-Liouville equations one<br />

can, however, as easily obta<strong>in</strong> an expansion theorem directly. We will<br />

do that <strong>in</strong> this chapter.<br />

As <strong>in</strong> our proof of the spectral theorem, we will deduce our results<br />

from properties of the resolvent, but now need to have a more explicit<br />

description of the resolvent operator. The first step is to prove that the<br />

resolvent is actually an <strong>in</strong>tegral operator. First note that all elements<br />

of D1 are cont<strong>in</strong>uously differentiable with locally absolutely cont<strong>in</strong>uous<br />

derivative, and accord<strong>in</strong>g to Lemma 10.8 po<strong>in</strong>t evaluations of elements<br />

of D1 (and their derivatives) are locally uniformly bounded l<strong>in</strong>ear forms<br />

on D1.<br />

If T is a selfadjo<strong>in</strong>t realization of (10.3) <strong>in</strong> L 2 (I) its resolvent Rλ is a<br />

bounded operator on L 2 (I) for every λ <strong>in</strong> the resolvent set. If E denotes<br />

the identity on L 2 (I) we have (T − λ)Rλ = E so that T Rλ = E + λRλ.<br />

Thus $\|TR_\lambda\| \le 1 + |\lambda|\|R_\lambda\|$. Since Rλu is in the domain of T we<br />

may also view the resolvent as an operator Rλ : L²(I) → D1, where<br />

D1 is viewed as a Hilbert space provided with the graph norm, as<br />

on page 65. This operator is bounded since $\|R_\lambda u\|_1^2 = \|R_\lambda u\|^2 + \|TR_\lambda u\|^2 \le (\|R_\lambda\|^2 + (|\lambda|\|R_\lambda\| + 1)^2)\|u\|^2$. It is also clear that the<br />

analyticity of Rλ implies the analyticity of T Rλ = E + λRλ, and therefore<br />

the analyticity of Rλ : L 2 (I) → D1. We obta<strong>in</strong> the follow<strong>in</strong>g<br />

theorem.<br />

Theorem 11.1. Suppose I is an <strong>in</strong>terval, and that T is a selfadjo<strong>in</strong>t<br />

realization <strong>in</strong> L 2 (I) of the equation (10.3). Then the resolvent Rλ of T<br />

may be viewed as a bounded l<strong>in</strong>ear map from L 2 (I) to C 1 (K), for any<br />

compact sub<strong>in</strong>terval K of I, which depends analytically on λ ∈ ρ(T ),<br />

in the uniform operator topology. Furthermore, there exists Green’s<br />

function g(x, y, λ), which is in L²(I) as a function of y for every x ∈ I<br />

and such that $R_\lambda u(x) = \langle u, \overline{g(x,\cdot,\lambda)}\rangle$ for any u ∈ L²(I). There is also<br />

a kernel g1(x, y, λ) in L²(I) as a function of y for every x ∈ I such<br />

that $(R_\lambda u)'(x) = \langle u, \overline{g_1(x,\cdot,\lambda)}\rangle$ for any u ∈ L²(I).<br />

Proof. We already noted that ρ(T) ∋ λ ↦ Rλ ∈ B(L²(I), D1) is<br />

analytic in the uniform operator topology. Furthermore, the restriction<br />

operator IK : D1 → C¹(K) is bounded and independent of λ. Hence<br />

ρ(T) ∋ λ ↦ IKRλ is analytic in the uniform operator topology. In<br />

particular, for fixed λ ∈ ρ(T) and any x ∈ I, the linear form L²(I) ∋<br />

u ↦ (IKRλu)(x) = Rλu(x) is (locally uniformly) bounded. By Riesz’<br />

representation theorem we have $R_\lambda u(x) = \langle u, \overline{g(x,\cdot,\lambda)}\rangle$, where y ↦<br />

g(x, y, λ) is in L²(I). Similarly, since L² ∋ u ↦ (Rλu)′(x) is a bounded<br />

l<strong>in</strong>ear form for each x ∈ I the kernel g1 exists. <br />

Among other th<strong>in</strong>gs, Theorem 11.1 tells us that if uj → u <strong>in</strong> L 2 (I),<br />

then Rλuj → Rλu <strong>in</strong> C 1 (K), so that Rλuj and its derivative converge<br />

locally uniformly. This is actually true even if uj just converges weakly,<br />

but all we need is the follow<strong>in</strong>g weaker result.<br />

Lemma 11.2. Suppose Rλ is the resolvent of a selfadjo<strong>in</strong>t relation T<br />

as above. Then if uj ⇀ 0 weakly <strong>in</strong> L 2 (I), it follows that both Rλuj → 0<br />

and (Rλuj) ′ → 0 po<strong>in</strong>twise and locally boundedly.<br />

Proof. $R_\lambda u_j(x) = \langle u_j, \overline{g(x,\cdot,\lambda)}\rangle \to 0$ since y ↦ g(x, y, λ) is in<br />

L²(I) for any x ∈ I. Now let K be a compact subinterval of I. A<br />

weakly convergent sequence in L²(I) is bounded, so since Rλ maps<br />

L²(I) boundedly into C¹(K), it follows that Rλuj(x) is bounded independently<br />

of j and x for x ∈ K. Similarly for the sequence of derivatives.<br />

<br />

Corollary 11.3. If the <strong>in</strong>terval I is compact, then any selfadjo<strong>in</strong>t<br />

restriction T of T1 has compact resolvent. Hence T has a complete<br />

orthonormal sequence of eigenfunctions <strong>in</strong> L 2 (I).<br />

Proof. Suppose uj ⇀ 0 weakly <strong>in</strong> L 2 (I). If I is compact, then<br />

Lemma 11.2 implies that Rλuj → 0 po<strong>in</strong>twise and boundedly <strong>in</strong> I,<br />

and hence by dom<strong>in</strong>ated convergence Rλuj → 0 <strong>in</strong> L 2 (I). Thus Rλ is<br />

compact. The last statement follows from Theorem 8.3.<br />

For a different proof, see Corollary 11.7. <br />

If T has compact resolvent, then the generalized Fourier series of<br />

any u ∈ L 2 (I) converges to u <strong>in</strong> L 2 (I). For functions <strong>in</strong> the doma<strong>in</strong> of<br />

T much stronger convergence is obta<strong>in</strong>ed.<br />

Corollary 11.4. Suppose T has a complete orthonormal sequence<br />

of eigenfunctions <strong>in</strong> L 2 (I). If u is <strong>in</strong> the doma<strong>in</strong> of T , then the generalized<br />

Fourier series of u, as well as the differentiated series, converges<br />

locally uniformly <strong>in</strong> I. In particular, if I is compact, the convergence<br />

is uniform <strong>in</strong> I.



Proof. Suppose u is <strong>in</strong> the doma<strong>in</strong> of T , i.e., T u = v for some<br />

v ∈ L²(I), and let $\tilde v = v - iu$, so that $u = R_i\tilde v$. If e is an eigenfunction<br />

of T with eigenvalue λ we have Te = λe or (T + i)e = (λ + i)e so that<br />

$R_{-i}e = e/(\lambda + i)$. It follows that $\langle u, e\rangle e = \langle R_i\tilde v, e\rangle e = \langle \tilde v, R_{-i}e\rangle e = \frac{1}{\lambda - i}\langle \tilde v, e\rangle e = \langle \tilde v, e\rangle R_ie$. If $s_Nu$ denotes the N:th partial sum of the<br />

Fourier series for u it follows that $s_Nu = R_is_N\tilde v$, where $s_N\tilde v$ is the N:th<br />

partial sum for $\tilde v$. Since $s_N\tilde v \to \tilde v$ in L²(I), it follows from Theorem 11.1<br />

and the remark after it that $s_Nu \to u$ in C¹(K), for any compact<br />

sub<strong>in</strong>terval K of I. <br />

The convergence is actually even better than the corollary shows,<br />

s<strong>in</strong>ce it is absolute and uniform (see Exercise 11.2).<br />

Example 11.5. Consider the equation −u′′ = λu, first in L²(−π, π),<br />

with periodic boundary conditions u(−π) = u(π), u′(−π) = u′(π). The<br />

general solution is u(x) = A cos(√λ x) + B sin(√λ x), where A, B are<br />

constants. The boundary conditions may be viewed as linear equations<br />

for determining the constants A and B, and if there is going to be a<br />

non-trivial solution, the determinant must vanish. The determinant is<br />

\[ \begin{vmatrix} 0 & 2\sin(\sqrt{\lambda}\,\pi) \\ -2\sin(\sqrt{\lambda}\,\pi) & 0 \end{vmatrix} = 4\sin^2(\sqrt{\lambda}\,\pi) \]<br />

so that λ = k², where k ∈ N. For each eigenvalue k² > 0 we have<br />

two linearly independent eigenfunctions cos(kx) and sin(kx). For the<br />

eigenvalue 0 the eigenfunction is $\frac{1}{\sqrt{2}}$. These functions are orthonormal<br />

if we use the scalar product $\langle u, v\rangle = \frac{1}{\pi}\int_{-\pi}^{\pi} u\overline{v}$ (check!). We obtain the<br />

classical (real) Fourier series $f(x) = \frac{a_0}{2} + \sum_{k=1}^{\infty}(a_k\cos kx + b_k\sin kx)$,<br />

where a0 = 〈f, 1〉, ak = 〈f(x), cos kx〉 for k > 0, and bk = 〈f(x), sin kx〉.<br />

In this case Corollary 11.4 states that the series for u as well as that<br />

for u ′ converge uniformly if u is cont<strong>in</strong>uously differentiable with an<br />

absolutely cont<strong>in</strong>uous derivative such that u ′′ ∈ L 2 (−π, π).<br />

Now consider the same equation <strong>in</strong> L 2 (0, π), with separated boundary<br />

conditions u(0) = 0 and u(π) = 0. Applying this to the general<br />

solution we obtain first A = 0 and then B sin(√λ π) = 0, so<br />

a non-trivial solution exists only if λ = k² for a positive integer k.<br />

Thus the eigenfunctions are sin x, sin 2x, . . . . These are orthonormal<br />

if the scalar product used is $\langle u, v\rangle = \frac{2}{\pi}\int_0^{\pi} u\overline{v}$. We obtain a sine series<br />

$f(x) = \sum_{k=1}^{\infty} b_k\sin(kx)$, where bk = 〈f(x), sin kx〉. This is the<br />

series expansion relevant to the vibrat<strong>in</strong>g str<strong>in</strong>g problem discussed <strong>in</strong><br />

Chapter 0 (if the length of the str<strong>in</strong>g is π).<br />

Finally, consider the same equation, still in L²(0, π), but now with<br />

separated boundary conditions u′(0) = 0 and u′(π) = 0. Applying this<br />

to the general solution we obtain first B = 0 and then A sin(√λ π) = 0,<br />

so a non-trivial solution requires λ = k² for a non-negative integer k.<br />

Thus the eigenfunctions are $\frac{1}{\sqrt{2}}$, cos x, cos 2x, . . . . These are orthonormal<br />

with the same scalar product as in the previous example. We<br />

obtain a cosine series $f(x) = \frac{a_0}{2} + \sum_{k=1}^{\infty} a_k\cos(kx)$, where a0 = 〈f, 1〉<br />

and ak = 〈f(x), cos kx〉.<br />

We have thus retrieved some of the classical versions of Fourier<br />

series, but it is clear that many other variants are obtained by simply<br />

vary<strong>in</strong>g the boundary conditions, and that many more examples are<br />

obta<strong>in</strong>ed by choos<strong>in</strong>g a non-zero q <strong>in</strong> (10.3).<br />
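The sine series of Example 11.5 is easy to test numerically. The sketch below checks orthonormality of sin kx under the scalar product (2/π)∫ over (0, π) and compares a partial sum with the function itself. The sample function f(x) = x(π − x), the midpoint quadrature and the cut-off at 15 terms are assumptions made only for this illustration.<br />

```python
import math

N_QUAD = 2000  # number of midpoint-rule cells (illustrative choice)


def inner(f, g):
    """Scalar product <f, g> = (2/pi) * integral_0^pi f g for real f, g."""
    h = math.pi / N_QUAD
    total = sum(f((j + 0.5) * h) * g((j + 0.5) * h) for j in range(N_QUAD))
    return 2.0 / math.pi * h * total


def e(k):
    """k-th Dirichlet eigenfunction sin(kx) of -u'' = lambda u on (0, pi)."""
    return lambda x: math.sin(k * x)


def f(x):
    """Sample element of the domain: f(0) = f(pi) = 0 and f'' in L^2."""
    return x * (math.pi - x)


# Fourier sine coefficients b_k = <f, sin kx> for k = 1, ..., 15
coeffs = [inner(f, e(k)) for k in range(1, 16)]


def partial_sum(x):
    """15-term partial sum of the sine series of f at the point x."""
    return sum(b * math.sin(k * x) for k, b in enumerate(coeffs, start=1))


print(inner(e(2), e(2)), inner(e(2), e(3)))      # close to 1 and 0
print(f(math.pi / 2), partial_sum(math.pi / 2))  # the two values should be close
```

Because this f satisfies the boundary conditions and has f′′ in L², Corollary 11.4 predicts uniform convergence, which is consistent with the small gap between the two printed values.<br />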

We now have a satisfactory eigenfunction expansion theory for regular<br />

boundary value problems, so we turn next to s<strong>in</strong>gular problems.<br />

We then need to take a much closer look at Green’s function. We shall<br />

here primarily look at the case of separated boundary conditions for<br />

I = [a, b) where a is a regular endpo<strong>in</strong>t and b possibly s<strong>in</strong>gular, and<br />

refer the reader to the theory of Chapter 15 for the general case. With<br />

this assumption Green’s function has a particularly simple structure.<br />

Assume that ϕ, θ are solutions of −u ′′ + qu = λu with <strong>in</strong>itial data<br />

ϕ(a, λ) = − s<strong>in</strong> α, ϕ ′ (a, λ) = cos α and θ(a, λ) = cos α, θ ′ (a, λ) = s<strong>in</strong> α.<br />

Theorem 11.6. Suppose I = [a, b) with a regular, and that T is<br />

given by the separated condition (10.7) at a, and another separated condition<br />

at b if needed, i.e., if b is regular or <strong>in</strong> the limit circle condition.<br />

If Im λ ≠ 0, then g(x, y, λ) = ϕ(min(x, y), λ)ψ(max(x, y), λ) where ψ is<br />

called the Weyl solution and is given by ψ(x, λ) = θ(x, λ) + m(λ)ϕ(x, λ).<br />

Here m(λ) is called the Weyl-Titchmarsh m-coefficient and is a Nevanl<strong>in</strong>na<br />

function <strong>in</strong> the sense of Chapter 6. The kernel g1 is g1(x, y, λ) =<br />

ϕ ′ (x, λ)ψ(y, λ) if x < y and g1(x, y, λ) = ϕ(y, λ)ψ ′ (x, λ) if x > y.<br />

Proof. It is easily verified that [θ, ϕ] = 1. Now ϕ satisfies the<br />

boundary condition at a and can therefore only satisfy the boundary<br />

condition at b if λ is an eigenvalue and thus real. On the other hand,<br />

there will be a solution <strong>in</strong> L2 (a, b) satisfy<strong>in</strong>g the boundary condition<br />

at b, s<strong>in</strong>ce if deficiency <strong>in</strong>dices are 1 there is no condition at b, and if<br />

deficiency <strong>in</strong>dices are 2, then the condition at b is a l<strong>in</strong>ear, homogeneous<br />

condition on a two-dimensional space, which leaves a space of dimension<br />

1. Thus we may f<strong>in</strong>d a unique m(λ) so that ψ = θ + mϕ satisfies the<br />

boundary condition at b. It follows that [ψ, ϕ] = [θ, ϕ] + m[ϕ, ϕ] = 1.<br />

Now setting $v(x) = \langle u, \overline{g(x,\cdot,\lambda)}\rangle$ and assuming that u ∈ L²(a, b)<br />

has compact support we obtain<br />

\[ v(x) = \psi(x,\lambda)\int_a^x u\varphi(\cdot,\lambda) + \varphi(x,\lambda)\int_x^b u\psi(\cdot,\lambda), \]<br />

so that $v(a) = -\sin\alpha\int_a^b u\psi(\cdot,\lambda)$. Differentiating we obtain<br />

\[ v'(x) = \psi'(x,\lambda)\int_a^x u\varphi(\cdot,\lambda) + \varphi'(x,\lambda)\int_x^b u\psi(\cdot,\lambda), \tag{11.1} \]<br />



since the other two terms obtained cancel. Thus $v'(a) = \cos\alpha\int_a^b u\psi(\cdot,\lambda)$,<br />

so v satisfies the boundary condition at a. If x is to the right of the support<br />

of u we obtain $v(x) = \psi(x,\lambda)\int_a^b u\varphi(\cdot,\lambda)$ so that v also satisfies the<br />

boundary condition at b, be<strong>in</strong>g a multiple of ψ near b. Differentiat<strong>in</strong>g<br />

aga<strong>in</strong> we obta<strong>in</strong><br />

−v ′′ (x) + (q(x) − λ)v(x) = [ψ, ϕ]u(x) = u(x).<br />

It follows that v = Rλu and, s<strong>in</strong>ce compactly supported functions are<br />

dense <strong>in</strong> L2 (a, b), that g(x, y, λ) is Green’s function for our operator.<br />

From (11.1) now follows that the kernel g1 is as stated.<br />

It remains to show that m is a Nevanlinna function. If u and v both<br />

have compact supports in I we have<br />

\[ \langle R_\lambda u, v\rangle = \iint g(x,y,\lambda)u(y)\overline{v(x)}\,dx\,dy, \]<br />

the double integral being absolutely convergent. Similarly<br />

\[ \langle u, R_{\overline{\lambda}}v\rangle = \iint \overline{g(y,x,\overline{\lambda})}\,u(y)\overline{v(x)}\,dx\,dy, \]<br />

and since the integrals are equal for all u, v by Theorem 5.2.2 we obtain<br />

$g(x,y,\lambda) = \overline{g(y,x,\overline{\lambda})}$ or, if x < y,<br />

\[ \varphi(x,\lambda)\theta(y,\lambda) + \varphi(x,\lambda)\varphi(y,\lambda)m(\lambda) = \varphi(x,\lambda)\theta(y,\lambda) + \varphi(x,\lambda)\varphi(y,\lambda)\overline{m(\overline{\lambda})}, \]<br />

since $\overline{\varphi(\cdot,\overline{\lambda})} = \varphi(\cdot,\lambda)$ and similarly for θ. Since ϕ(x, λ) ≠ 0 for non-real<br />

λ (why?) it follows that $m(\lambda) = \overline{m(\overline{\lambda})}$. Now λ ↦ Rλu(x) is analytic<br />

for non-real λ and for compactly supported u<br />

\[ R_\lambda u(x) = \theta(x,\lambda)\int_a^x u\varphi(\cdot,\lambda) + \varphi(x,\lambda)\int_x^b u\theta(\cdot,\lambda) + m(\lambda)\varphi(x,\lambda)\int_a^b u\varphi(\cdot,\lambda). \]<br />

The first two terms on the right are obviously entire functions accord<strong>in</strong>g<br />

to Theorem 10.1, as is the coefficient of m(λ), and s<strong>in</strong>ce by choice<br />

of u we may always assume that this coefficient is non-zero <strong>in</strong> a neighborhood<br />

of any given λ it follows that m(λ) is analytic for non-real<br />

λ.<br />

Finally, integration by parts shows that<br />

\[ \lambda\int_a^x |\psi|^2 = \bigl[-\psi'\overline{\psi}\bigr]_a^x + \int_a^x (|\psi'|^2 + q|\psi|^2). \]<br />



Taking the imaginary part of this and using the fact that ψ satisfies<br />

the boundary condition at b so that $\operatorname{Im}(\psi'\overline{\psi}) \to 0$ at b we obtain<br />

\[ 0 \le \int_a^b |\psi(\cdot,\lambda)|^2 = \frac{\operatorname{Im} m(\lambda)}{\operatorname{Im}\lambda}, \tag{11.2} \]<br />

since a simple calculation shows that $\operatorname{Im}(\psi'(a,\lambda)\overline{\psi(a,\lambda)}) = \operatorname{Im} m(\lambda)$. It<br />

follows that m has all the required properties of a Nevanl<strong>in</strong>na function.<br />
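Identity (11.2) can be checked in a case where everything is explicit. The sketch below assumes q = 0 on [0, π], α = 0 in the initial data for ϕ and θ, and the Dirichlet condition ψ(π) = 0 at the right endpoint. These choices, the value λ = 1 + i and the Simpson quadrature are illustrative assumptions, not taken from the text.<br />

```python
import cmath
import math

LAM = 1.0 + 1.0j          # any non-real spectral parameter
K = cmath.sqrt(LAM)


def phi(x):
    """Solution with phi(0) = 0, phi'(0) = 1 (alpha = 0)."""
    return cmath.sin(K * x) / K


def theta(x):
    """Solution with theta(0) = 1, theta'(0) = 0 (alpha = 0)."""
    return cmath.cos(K * x)


# m is fixed by requiring psi = theta + m * phi to vanish at b = pi
M = -theta(math.pi) / phi(math.pi)


def psi(x):
    """Weyl solution for this example."""
    return theta(x) + M * phi(x)


def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    s += 4 * sum(f(a + (2 * j - 1) * h) for j in range(1, n // 2 + 1))
    s += 2 * sum(f(a + 2 * j * h) for j in range(1, n // 2))
    return s * h / 3


norm2 = simpson(lambda x: abs(psi(x)) ** 2, 0.0, math.pi)
ratio = M.imag / LAM.imag
print(norm2, ratio)       # the two numbers should agree
```

Here m(λ) = −√λ cot(√λ π), and the agreement of the two printed numbers is exactly (11.2). In particular Im m(λ) and Im λ have the same sign, as required of a Nevanlinna function.<br />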

<br />

Before we proceed, we note the follow<strong>in</strong>g corollary, which completes<br />

our results for the case of a discrete spectrum.<br />

Corollary 11.7. Suppose both endpo<strong>in</strong>ts of I are either regular or<br />

<strong>in</strong> the limit circle condition. Then for any selfadjo<strong>in</strong>t realization T the<br />

resolvent is compact. Thus there is a complete orthonormal sequence<br />

of eigenfunctions.<br />

Proof. By Theorem 8.4 it is enough to prove the corollary when<br />

T is given by separated boundary conditions. But as <strong>in</strong> the proof<br />

of Theorem 11.6 we can then f<strong>in</strong>d non-trivial solutions ψ−(·, λ) and<br />

ψ+(·, λ) of −v ′′ + qv = λv satisfy<strong>in</strong>g the boundary conditions to the<br />

left and right respectively. If Im λ ≠ 0 the solutions ψ±(·, λ) cannot be<br />

l<strong>in</strong>early dependent, s<strong>in</strong>ce this would give a non-real eigenvalue for T .<br />

We may therefore assume [ψ+, ψ−] = 1 by multiply<strong>in</strong>g ψ−, if necessary,<br />

by a constant. But then it is seen that ψ−(m<strong>in</strong>(x, y), λ)ψ+(max(x, y), λ)<br />

is Green’s function for T just as <strong>in</strong> the proof of Theorem 11.6.<br />

It is clear that the assumption implies that deficiency indices equal<br />

2, so that ψ± are in L²(I). However, an easy calculation now shows<br />

that<br />

\[ \iint_{I\times I} |g(x,y,\lambda)|^2\,dx\,dy \le 2\|\psi_-\|^2\|\psi_+\|^2 < \infty. \]<br />

Thus, accord<strong>in</strong>g to Theorem 8.7, the resolvent is a <strong>Hilbert</strong>-Schmidt<br />

operator, so that it is compact. <br />

If at least one of the <strong>in</strong>terval endpo<strong>in</strong>ts is s<strong>in</strong>gular and <strong>in</strong> the limit<br />

po<strong>in</strong>t condition the resolvent may not be compact (but it can be!). In<br />

this case the only boundary condition will be a separated boundary<br />

condition at the other endpo<strong>in</strong>t, unless this is also <strong>in</strong> the limit po<strong>in</strong>t<br />

condition, when no boundary conditions at all are required.<br />

We now return to the situation treated <strong>in</strong> Theorem 11.6 when I =<br />

[a, b) with a regular, and T is given by the separated condition (10.7)<br />

at a, and another separated condition at b if needed. Since the m-coefficient<br />

is a Nevanlinna function there is a unique increasing and<br />

left-continuous function ρ with ρ(0) = 0 and unique real<br />

numbers A and B ≥ 0 such that<br />

\[ m(\lambda) = A + B\lambda + \int_{-\infty}^{\infty}\Bigl(\frac{1}{t-\lambda} - \frac{t}{t^2+1}\Bigr)\,d\rho(t). \]<br />

The spectral measure dρ gives rise to a Hilbert space $L^2_\rho$, which<br />

consists of those functions û which are measurable with respect to dρ<br />

and for which $\|\hat u\|_\rho^2 = \int_{-\infty}^{\infty}|\hat u|^2\,d\rho$ is finite. Alternatively, we may think of<br />

L 2 ρ as the completion <strong>in</strong> this norm of compactly supported, cont<strong>in</strong>uous<br />

functions. These alternative def<strong>in</strong>itions give the same space, but we<br />

will not prove this here. We denote the scalar product <strong>in</strong> L 2 ρ by 〈·, ·〉ρ.<br />

The ma<strong>in</strong> result of this chapter is the follow<strong>in</strong>g.<br />

Theorem 11.8.<br />

(1) If u ∈ L²(a, b) the integral $\int_a^x u\varphi(\cdot,t)$ converges in $L^2_\rho$ as x → b.<br />

The limit is called the generalized Fourier transform of u and<br />

is denoted by F(u) or û. We write this as û(t) = 〈u, ϕ(·, t)〉,<br />

although the integral may not converge pointwise.<br />

(2) The mapping u ↦ û is unitary between L²(a, b) and $L^2_\rho$ so that<br />

the Parseval formula 〈u, v〉 = 〈û, v̂〉ρ is valid if u, v ∈ L²(a, b).<br />

(3) The integral $\int_K \hat u(t)\varphi(x,t)\,d\rho(t)$ converges in L²(a, b) as K →<br />

R through compact intervals. If û = F(u) the limit is u, so<br />

the integral is the inverse of the generalized Fourier transform.<br />

Again, we write u(x) = 〈û, ϕ(x, ·)〉ρ for u ∈ L²(a, b), although<br />

the integral may not converge pointwise.<br />

(4) Let E∆ denote the spectral projector of T for the interval ∆.<br />

Then $E_\Delta u(x) = \int_\Delta \hat u\,\varphi(x,\cdot)\,d\rho$.<br />

(5) If u ∈ D(T) then F(Tu)(t) = tû(t). Conversely, if û and tû(t)<br />

are in $L^2_\rho$, then F⁻¹(û) ∈ D(T).<br />

Before we prove this theorem, let us <strong>in</strong>terpret it <strong>in</strong> terms of the<br />

spectral theorem. If the <strong>in</strong>terval ∆ shr<strong>in</strong>ks to a po<strong>in</strong>t t, then E∆ tends<br />

to zero, unless t is an eigenvalue, <strong>in</strong> which case we obta<strong>in</strong> the projection<br />

on the eigenspace. By (4) this means that eigenvalues are precisely<br />

those po<strong>in</strong>ts at which the function ρ has a (jump) discont<strong>in</strong>uity; cont<strong>in</strong>uous<br />

spectrum thus corresponds to po<strong>in</strong>ts where ρ is cont<strong>in</strong>uous, but<br />

which are still po<strong>in</strong>ts of <strong>in</strong>crease for ρ, i.e., there is no neighborhood of<br />

the po<strong>in</strong>t where ρ is constant. In terms of measure theory, this means<br />

that the atomic part of the measure dρ determ<strong>in</strong>es the eigenvalues, and<br />

the diffuse part of dρ determ<strong>in</strong>es the cont<strong>in</strong>uous spectrum.<br />
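For a concrete view of how jumps of ρ encode eigenvalues, take again the assumed example q = 0 on [0, π] with α = 0 and a Dirichlet condition at π, for which m(λ) = −√λ cot(√λ π). In the representation of m a jump of size r at t0 contributes the term r/(t0 − λ), so (t0 − λ)m(λ) should approach r as λ → t0. Part (4) suggests that r equals 1/‖ϕ(·, t0)‖², which is 2k²/π at t0 = k² in this example. The sketch below is a numerical check of that expectation only. The function names and the offset eps are hypothetical choices.<br />

```python
import cmath
import math


def m(lam):
    """m-coefficient of the assumed example: -u'' = lam*u on [0, pi], psi(pi) = 0."""
    k = cmath.sqrt(lam)
    return -k * cmath.cos(k * math.pi) / cmath.sin(k * math.pi)


def jump(k, eps=1e-6):
    """Estimate the jump of rho at t = k^2 from the pole of m at that point."""
    t = k * k
    lam = complex(t, eps)            # approach the pole from the upper half plane
    return ((t - lam) * m(lam)).real


for k in (1, 2, 3):
    # second column: numerical estimate, third column: 1/||phi(., k^2)||^2
    print(k, jump(k), 2 * k * k / math.pi)
```

In this example the spectral measure is purely atomic, with atoms at the eigenvalues 1, 4, 9, . . . , so the inverse transform in (3) reduces to the sine series of Example 11.5.<br />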

We will prove Theorem 11.8 through a long (but f<strong>in</strong>ite!) sequence<br />

of lemmas. First note that for u ∈ L²(a, b) with compact support in<br />

[a, b) the function $\hat u(\lambda) = \langle u, \overline{\varphi(\cdot,\lambda)}\rangle$ is an entire function of λ since<br />

ϕ(x, λ) is entire, locally uniformly in x, according to Theorem 10.1.<br />

Lemma 11.9. The function $\langle R_\lambda u, v\rangle - m(\lambda)\hat u(\lambda)\overline{\hat v(\overline{\lambda})}$ is entire for<br />

all u, v ∈ L²(a, b) with compact supports in [a, b).<br />



Proof. If the supports are inside [a, c], direct calculation shows<br />

that the function is<br />

\[ \int_a^c \Bigl(\theta(x,\lambda)\int_a^x u\varphi(\cdot,\lambda) + \varphi(x,\lambda)\int_x^c u\theta(\cdot,\lambda)\Bigr)\overline{v(x)}\,dx. \]<br />

This is obviously an entire function of λ. <br />

Lemma 11.10. Let σ be increasing and differentiable at 0. Then<br />

$\int_{-1}^{1} ds\int_{-1}^{1}\frac{d\sigma(t)}{\sqrt{t^2+s^2}}$ converges.<br />

Proof. Integrating by parts we have, for s ≠ 0,<br />

\[ \int_{-1}^{1}\frac{d\sigma(t)}{\sqrt{t^2+s^2}} = \frac{\sigma(1)-\sigma(-1)}{\sqrt{1+s^2}} - \int_{-1}^{1}\frac{\sigma(t)-\sigma(0)}{t}\Bigl(t\,\frac{d}{dt}\frac{1}{\sqrt{t^2+s^2}}\Bigr)\,dt. \]<br />

The first factor in the last integral is bounded since σ′(0) exists, and the<br />

second factor is negative since $(t^2+s^2)^{-1/2}$ decreases with |t|. Furthermore,<br />

the integral with respect to t of the second factor is integrable<br />

with respect to s, by calculation (check this!). Thus the double integral<br />

is absolutely convergent. <br />
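As a sanity check of Lemma 11.10, the simplest admissible choice σ(t) = t (an assumption made here only for illustration) allows the double integral to be computed in closed form. The inner integral has a logarithmic singularity at s = 0, which is integrable:<br />

```latex
\[
\int_{-1}^{1}\frac{dt}{\sqrt{t^2+s^2}} = 2\operatorname{arsinh}\frac{1}{|s|}
\sim 2\log\frac{2}{|s|} \quad\text{as } s\to 0,
\]
\[
\int_{-1}^{1} ds\int_{-1}^{1}\frac{dt}{\sqrt{t^2+s^2}}
= 4\int_0^1 \operatorname{arsinh}\frac{1}{s}\,ds
= 4\Bigl[s\operatorname{arsinh}\frac{1}{s}\Bigr]_0^1 + 4\int_0^1\frac{ds}{\sqrt{s^2+1}}
= 8\log(1+\sqrt{2}).
\]
```

The general case in the lemma differs from this one only through the bounded factor (σ(t) − σ(0))/t.<br />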

As usual we denote the spectral projectors belong<strong>in</strong>g to T by Et.<br />

Lemma 11.11. Let u ∈ L 2 (a, b) have compact support <strong>in</strong> [a, b) and<br />

assume c < d to be po<strong>in</strong>ts of differentiability for both 〈Etu, u〉 and ρ(t).<br />

Then<br />

\[ \langle E_du, u\rangle - \langle E_cu, u\rangle = \int_c^d |\hat u(t)|^2\,d\rho(t). \tag{11.3} \]<br />

Proof. Let $\Gamma$ be the positively oriented rectangle with corners in $c \pm i$, $d \pm i$. According to Lemma 11.9
\[
\oint_\Gamma \langle R_\lambda u, u\rangle\,d\lambda = \oint_\Gamma \hat u(\lambda)\overline{\hat u(\bar\lambda)}\,m(\lambda)\,d\lambda
\]
if either of these integrals exist. However, by Lemma 11.9,
\[
\oint_\Gamma \hat u(\lambda)\overline{\hat u(\bar\lambda)}\,m(\lambda)\,d\lambda
= \oint_\Gamma \hat u(\lambda)\overline{\hat u(\bar\lambda)} \int_{-\infty}^{\infty}\Bigl(\frac{1}{t-\lambda}-\frac{t}{t^2+1}\Bigr)d\rho(t)\,d\lambda .
\]
The double integral is absolutely convergent except perhaps where $t = \lambda$. The difficulty is thus caused by
\[
\int_{-1}^{1} ds \int_{\mu-1}^{\mu+1} \hat u(\mu+is)\overline{\hat u(\mu-is)}\,\frac{d\rho(t)}{t-\mu-is}
\]
for $\mu = c, d$. However, Lemma 11.10 ensures the absolute convergence of these integrals. Changing the order of integration gives
\[
\oint_\Gamma \hat u(\lambda)\overline{\hat u(\bar\lambda)}\,m(\lambda)\,d\lambda
= \int_{-\infty}^{\infty}\oint_\Gamma \hat u(\lambda)\overline{\hat u(\bar\lambda)}\Bigl(\frac{1}{t-\lambda}-\frac{t}{t^2+1}\Bigr)d\lambda\,d\rho(t)
= -2\pi i\int_c^d |\hat u(t)|^2\,d\rho(t)
\]
since for $c < t < d$ the residue of the inner integral is $-|\hat u(t)|^2\,d\rho(t)$, whereas $t = c, d$ do not carry any mass and the inner integrand is regular for $t < c$ and $t > d$.

Similarly we have
\[
\oint_\Gamma \langle R_\lambda u, u\rangle\,d\lambda
= \int_{-\infty}^{\infty} d\langle E_t u, u\rangle \oint_\Gamma \frac{d\lambda}{t-\lambda}
= -2\pi i\int_c^d d\langle E_t u, u\rangle ,
\]
which completes the proof.

Lemma 11.12. If $u \in L^2(a,b)$ the generalized Fourier transform $\hat u \in L^2_\rho$ exists as the $L^2_\rho$-limit of $\int_a^x u\varphi(\cdot,t)$ as $x \to b$. Furthermore,
\[
\langle E_t u, v\rangle = \int_{-\infty}^{t} \hat u\,\overline{\hat v}\,d\rho .
\]
In particular, $\langle u, v\rangle = \langle \hat u, \hat v\rangle_\rho$ if $u$ and $v \in L^2(a,b)$.

Proof. If $u$ has compact support, Lemma 11.11 shows that (11.3) holds for a dense set of values $c, d$, since functions of bounded variation are a.e. differentiable. Since both $E_t$ and $\rho$ are left-continuous we obtain, by letting $d \uparrow t$, $c \to -\infty$ through such values,
\[
\langle E_t u, v\rangle = \int_{-\infty}^{t} \hat u(t)\,\overline{\hat v(t)}\,d\rho(t)
\]
when $u, v$ have compact supports; first for $u = v$ and then in general by polarization. As $t \to \infty$ we also obtain that $\langle u, v\rangle = \langle \hat u, \hat v\rangle_\rho$ when $u$ and $v$ have compact supports.

For arbitrary $u \in L^2(a,b)$ we set, for $c \in (a,b)$,
\[
u_c(x) = \begin{cases} u(x) & \text{for } x < c \\ 0 & \text{otherwise} \end{cases}
\]
and obtain a transform $\hat u_c$. If also $d \in (a,b)$ it follows that $\|\hat u_c - \hat u_d\|_\rho = \|u_c - u_d\|$, and since $u_c \to u$ in $L^2(a,b)$ as $c \to b$, Cauchy's convergence principle shows that $\hat u_c$ converges to an element $\hat u \in L^2_\rho$ as $c \to b$. The lemma now follows in full generality by continuity.

Note that we have proved that $\mathcal F$ is an isometry from $L^2(a,b)$ to $L^2_\rho$.

Lemma 11.13. The integral $\int_K \hat u\varphi(x,\cdot)\,d\rho$ is in $L^2(a,b)$ if $K$ is a compact interval and $\hat u \in L^2_\rho$, and as $K \to \mathbb R$ the integral converges in $L^2(a,b)$. The limit $\mathcal F^{-1}(\hat u)$ is called the inverse transform of $\hat u$. If $u \in L^2(a,b)$ then $\mathcal F^{-1}(\mathcal F(u)) = u$. $\mathcal F^{-1}(\hat u) = 0$ if and only if $\hat u$ is orthogonal in $L^2_\rho$ to all generalized Fourier transforms.

Proof. If $\hat u \in L^2_\rho$ has compact support, then $u(x) = \langle \hat u, \varphi(x,\cdot)\rangle_\rho$ is continuous, so $u_c \in L^2(a,b)$ for $c \in (a,b)$, and has a transform $\hat u_c$. We have
\[
\|u_c\|^2 = \int_a^c \Bigl(\int_{-\infty}^{\infty} \hat u\varphi(x,\cdot)\,d\rho\Bigr)\overline{u(x)}\,dx .
\]
Considered as a double integral this is absolutely convergent, so changing the order of integration we obtain
\[
\|u_c\|^2 = \int_{-\infty}^{\infty} \overline{\Bigl(\int_a^c u\varphi(\cdot,t)\Bigr)}\,\hat u(t)\,d\rho(t)
= \langle \hat u, \hat u_c\rangle_\rho \le \|\hat u\|_\rho\|\hat u_c\|_\rho = \|\hat u\|_\rho\|u_c\| ,
\]
according to Lemma 11.12. Hence $\|u_c\| \le \|\hat u\|_\rho$, so $u \in L^2(a,b)$, and $\|u\| \le \|\hat u\|_\rho$. If now $\hat u \in L^2_\rho$ is arbitrary, this inequality shows (as in the proof of Lemma 11.12) that $\int_K \hat u(t)\varphi(x,t)\,d\rho(t)$ converges in $L^2(a,b)$ as $K \to \mathbb R$ through compact intervals; call the limit $u_1$. If $v \in L^2(a,b)$, $\hat v$ is its generalized Fourier transform, $K$ is a compact interval, and $c \in (a,b)$, we have
\[
\int_K \overline{\Bigl(\int_a^c v(x)\varphi(x,t)\,dx\Bigr)}\,\hat u(t)\,d\rho(t)
= \int_a^c \overline{v(x)} \int_K \hat u(t)\varphi(x,t)\,d\rho(t)\,dx
\]
by absolute convergence. Letting $c \to b$ and $K \to \mathbb R$ we obtain $\langle \hat u, \hat v\rangle_\rho = \langle u_1, v\rangle$. If $\hat u$ is the transform of $u$, then by Lemma 11.12 $u_1 - u$ is orthogonal to $L^2(a,b)$, so $u_1 = u$. Similarly, $u_1 = 0$ precisely if $\hat u$ is orthogonal to all transforms.

We have shown the inverse transform to be the adjoint of the transform as an operator from $L^2(a,b)$ into $L^2_\rho$. The basic remaining difficulty is to prove that the transform is surjective, i.e., according to Lemma 11.13, that the inverse transform is injective. The following lemma will enable us to prove this.

Lemma 11.14. The transform of $R_\lambda u$ is $\hat u(t)/(t-\lambda)$.



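Proof. By Lemma 11.12, $\langle E_t u, v\rangle = \int_{-\infty}^{t} \hat u\,\overline{\hat v}\,d\rho$, so that
\[
\langle R_\lambda u, v\rangle = \int_{-\infty}^{\infty} \frac{d\langle E_t u, v\rangle}{t-\lambda}
= \int_{-\infty}^{\infty} \frac{\hat u(t)\overline{\hat v(t)}\,d\rho(t)}{t-\lambda}
= \langle \hat u(t)/(t-\lambda), \hat v(t)\rangle_\rho .
\]
By properties of the resolvent
\[
\|R_\lambda u\|^2 = \frac{1}{2i\operatorname{Im}\lambda}\langle R_\lambda u - R_{\bar\lambda} u, u\rangle
= \int_{-\infty}^{\infty} \frac{d\langle E_t u, u\rangle}{|t-\lambda|^2}
= \|\hat u(t)/(t-\lambda)\|_\rho^2 .
\]
Setting $v = R_\lambda u$ and using Lemma 11.12, it therefore follows that $\|\hat u(t)/(t-\lambda)\|_\rho^2 = \langle \hat u(t)/(t-\lambda), \mathcal F(R_\lambda u)\rangle_\rho = \|\mathcal F(R_\lambda u)\|_\rho^2$. It follows that we have $\|\hat u(t)/(t-\lambda) - \mathcal F(R_\lambda u)\|_\rho = 0$, which was to be proved.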

Lemma 11.15. The generalized Fourier transform is unitary from $L^2(a,b)$ to $L^2_\rho$ and the inverse transform is the inverse of this map.

Proof. According to Lemma 11.13 we need only show that if $\hat u \in L^2_\rho$ has inverse transform 0, then $\hat u = 0$. Now, according to Lemma 11.14, $\mathcal F(v)(t)/(t-\lambda)$ is a transform for all $v \in L^2(a,b)$ and non-real $\lambda$. Thus we have $\langle \hat u(t)/(t-\lambda), \mathcal F(v)(t)\rangle_\rho = 0$ for all non-real $\lambda$ if $\hat u$ is orthogonal to all transforms. But we can view this scalar product as the Stieltjes transform of the measure $\int_{-\infty}^{t} \hat u\,\overline{\mathcal F(v)}\,d\rho$, so applying the inversion formula Lemma 6.5 we have $\int_K \hat u\,\overline{\mathcal F(v)}\,d\rho = 0$ for all compact intervals $K$, and all $v \in L^2(a,b)$. Thus the cutoff of $\hat u$, which equals $\hat u$ in $K$ and 0 outside, is also orthogonal to all transforms, i.e., has inverse transform 0 according to Lemma 11.13. It follows that
\[
v(x) = \int_K \hat u(t)\varphi(x,t)\,d\rho(t)
\]
is the zero element of $L^2(a,b)$ for any compact interval $K$. Differentiating under the integral sign we also see that $v'(x) = \int_K \hat u\varphi'(x,\cdot)\,d\rho$ is the zero element of $L^2(a,b)$. But these functions are continuous, so they are pointwise 0. Now $0 = v'(a)\cos\alpha - v(a)\sin\alpha = \int_K \hat u\,d\rho$. Thus $\hat u\,d\rho$ is the zero measure, so that $\hat u = 0$ as an element of $L^2_\rho$.

Lemma 11.16. If $u \in D(T)$, then $\mathcal F(Tu)(t) = t\hat u(t)$. Conversely, if $\hat u$ and $t\hat u(t)$ are in $L^2_\rho$, then $\mathcal F^{-1}(\hat u) \in D(T)$.

Proof. We have $u \in D(T)$ if and only if $u = R_\lambda(Tu - \lambda u)$, which holds if and only if $\hat u(t) = (\mathcal F(Tu)(t) - \lambda\hat u(t))/(t-\lambda)$, i.e., $\mathcal F(Tu)(t) = t\hat u(t)$, according to Lemmas 11.14 and 11.15.

This completes the proof of Theorem 11.8. We also have the following analogue of Corollary 11.4.



Theorem 11.17. Suppose $u \in D(T)$. Then the inverse transform $\langle \hat u, \varphi(x,\cdot)\rangle_\rho$ converges locally uniformly to $u(x)$.

Proof. The proof is very similar to that of Corollary 11.4. Put $v = (T - i)u$ so that $u = R_i v$. Let $K$ be a compact interval, and put $u_K(x) = \int_K \hat u(t)\varphi(x,t)\,d\rho(t) = \mathcal F^{-1}(\chi\hat u)(x)$, where $\chi$ is the characteristic function of $K$. Define $v_K$ similarly. Then by Lemma 11.14
\[
R_i v_K = \mathcal F^{-1}\Bigl(\frac{\chi(t)\hat v(t)}{t-i}\Bigr) = \mathcal F^{-1}(\chi\hat u) = u_K .
\]
Since $v_K \to v$ in $L^2(a,b)$ as $K \to \mathbb R$, it follows from Theorem 11.1 that $u_K \to u$ in $C^1(L)$ as $K \to \mathbb R$, for any compact subinterval $L$ of $[a,b)$.

Example 11.18 (Sine and cosine transforms). Let us interpret Theorem 11.8 for the case of the equation $-u'' = \lambda u$ on the interval $[0,\infty)$. We shall look at the cases when the boundary condition at 0 is either a Dirichlet condition ($\alpha = 0$ in (10.7)) or a Neumann condition ($\alpha = \pi/2$). The general solution of the equation is $u(x) = Ae^{\sqrt{-\lambda}x} + Be^{-\sqrt{-\lambda}x}$. Let the root be the principal branch, i.e., the branch where the real part is $\ge 0$. Then the only solutions in $L^2(0,\infty)$ are, unless $\lambda \ge 0$, the multiples of $e^{-\sqrt{-\lambda}x} = \cos(i\sqrt{-\lambda}x) + i\sin(i\sqrt{-\lambda}x)$. It follows that the equation is in the limit point condition at infinity (this is also a consequence of Theorem 10.10).

With a Dirichlet condition at 0 we have $\theta(x,\lambda) = \cos(i\sqrt{-\lambda}x)$ and $\varphi(x,\lambda) = -i\sin(i\sqrt{-\lambda}x)/\sqrt{-\lambda}$. It follows that the m-function is $m_D(\lambda) = -\sqrt{-\lambda}$. Similarly, the m-function in the case of a Neumann condition at 0 is $m_N(\lambda) = 1/\sqrt{-\lambda}$, using again the principal branch of the root.
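The passage from these m-functions to spectral densities can be checked numerically. The sketch below assumes the spectral measure is absolutely continuous, so that its density at a real point $t$ is approximated by $\operatorname{Im} m(t + i\varepsilon)/\pi$ for small $\varepsilon > 0$; the function names and the value of `eps` are illustrative choices, not part of the notes.

```python
import cmath
import math

def m_dirichlet(lam: complex) -> complex:
    # m_D(lambda) = -sqrt(-lambda); cmath.sqrt is the principal branch (Re >= 0)
    return -cmath.sqrt(-lam)

def m_neumann(lam: complex) -> complex:
    # m_N(lambda) = 1/sqrt(-lambda), same branch of the root
    return 1 / cmath.sqrt(-lam)

def density(m, t: float, eps: float = 1e-8) -> float:
    # Approximate density of the spectral measure at t (assumed absolutely
    # continuous there): Im m(t + i*eps) / pi for a small eps > 0.
    return m(complex(t, eps)).imag / math.pi

if __name__ == "__main__":
    for t in (0.5, 1.0, 4.0):
        # compare with sqrt(t)/pi (Dirichlet) and 1/(pi*sqrt(t)) (Neumann)
        print(t, density(m_dirichlet, t), math.sqrt(t) / math.pi)
        print(t, density(m_neumann, t), 1 / (math.pi * math.sqrt(t)))
    # on the negative half-axis both densities are (numerically) zero
    print(density(m_dirichlet, -1.0), density(m_neumann, -1.0))
```

The printed pairs should agree to several digits, consistent with densities $\sqrt t/\pi$ and $1/(\pi\sqrt t)$ on $t > 0$ and with no mass on the negative half-axis.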

Using the Stieltjes inversion formula Lemma 6.5 we see that the corresponding spectral measures are given by $d\rho_D(t) = \frac{1}{\pi}\sqrt{t}\,dt$ for $t \ge 0$, $d\rho_D = 0$ in $(-\infty,0)$, respectively $d\rho_N(t) = \frac{dt}{\pi\sqrt{t}}$ for $t \ge 0$, $d\rho_N = 0$ in $(-\infty,0)$. If $u \in L^2(0,\infty)$ and we define
\[
\hat u(t) = \int_0^\infty u(x)\frac{\sin(\sqrt{t}x)}{\sqrt{t}}\,dx ,
\]
as a generalized integral converging in $L^2_{\rho_D}$, then the inversion formula reads
\[
u(x) = \frac{1}{\pi}\int_0^\infty \hat u(t)\sin(\sqrt{t}x)\,dt .
\]
In this case one usually changes variable in the transform and defines the sine transform $S(u)(\xi) = \int_0^\infty u(x)\sin(\xi x)\,dx = \xi\hat u(\xi^2)$. Changing variable to $\xi = \sqrt{t}$ in the inversion formula above then shows that
\[
u(x) = \frac{2}{\pi}\int_0^\infty S(u)(\xi)\sin(\xi x)\,d\xi .
\]
Similarly, if we set $\hat u(t) = \int_0^\infty u(x)\cos(\sqrt{t}x)\,dx$ the inversion formula obtained is
\[
u(x) = \frac{1}{\pi}\int_0^\infty \hat u(t)\frac{\cos(\sqrt{t}x)}{\sqrt{t}}\,dt .
\]
In this case it is again common to use $\xi = \sqrt{t}$ as the transform variable, so one defines the cosine transform $C(u)(\xi) = \int_0^\infty u(x)\cos(\xi x)\,dx$. Changing variables in the inversion formula above then gives the inversion formula
\[
u(x) = \frac{2}{\pi}\int_0^\infty C(u)(\xi)\cos(\xi x)\,d\xi
\]
for the cosine transform.

Note that there are no eigenvalues in either of these cases; the spectrum is purely continuous.
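As a numerical illustration of the sine transform and its inversion formula $u(x) = \frac{2}{\pi}\int_0^\infty S(u)(\xi)\sin(\xi x)\,d\xi$, the sketch below uses the test function $u(x) = xe^{-x^2/2}$, for which $S(u)(\xi) = \xi\sqrt{\pi/2}\,e^{-\xi^2/2}$ in closed form. The test function, the cutoff replacing $\infty$, and the Simpson quadrature are assumptions made only for this example.

```python
import math

def simpson(f, a, b, n=2000):
    # composite Simpson rule with n (even) subintervals
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

def u(x):
    # test function (an arbitrary choice with Gaussian decay)
    return x * math.exp(-x * x / 2)

CUTOFF = 12.0  # assumption: tails beyond this point are negligible

def sine_transform(xi):
    # S(u)(xi) = int_0^infty u(x) sin(xi x) dx, truncated at CUTOFF
    return simpson(lambda x: u(x) * math.sin(xi * x), 0.0, CUTOFF)

def invert(x, n=400):
    # u(x) = (2/pi) int_0^infty S(u)(xi) sin(xi x) d xi, truncated at CUTOFF
    return 2 / math.pi * simpson(
        lambda xi: sine_transform(xi) * math.sin(xi * x), 0.0, CUTOFF, n)

if __name__ == "__main__":
    xi = 1.5
    print(sine_transform(xi), xi * math.sqrt(math.pi / 2) * math.exp(-xi * xi / 2))
    print(invert(1.0), u(1.0))
```

Both printed pairs should agree closely; the factor $2/\pi$ in the inversion is exactly what the change of variable $\xi = \sqrt t$ in $d\rho_D$ produces.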

Exercises for Chapter 11<br />

Exercise 11.1. Show that if $K$ is a compact interval, then $C^1(K)$ is a Banach space with the norm $\sup_{x\in K}|u(x)| + \sup_{x\in K}|u'(x)|$.

If you know some topology, also show that if $I$ is an arbitrary interval, then $C(I)$ is a Fréchet space (a linear Hausdorff space with the topology given by a countable family of seminorms, which is also complete), under the topology of locally uniform convergence.

Exercise 11.2. With the assumptions of Corollary 11.4 the Fourier series for $u$ in the domain of $T$ actually converges absolutely and locally uniformly to $u$. If $\lambda_1, \lambda_2, \dots$ are the eigenvalues and $e_1, e_2, \dots$ the corresponding orthonormal eigenfunctions, use Parseval's formula to show that, pointwise in $x$, $\|g(x,\cdot,\lambda)\|^2 = \sum_j \bigl|\frac{e_j(x)}{\lambda_j-\lambda}\bigr|^2$, with natural notation. Then show that as an $L^2(I)$-valued function $x \mapsto g(x,\cdot,\lambda)$ is locally bounded, i.e., $x \mapsto \|g(x,\cdot,\lambda)\|$ is bounded on any compact subinterval of $I$.

If $v = R_\lambda u$ and $\hat u_j$ is the $j$th Fourier coefficient of $u$, then $\hat v_j = \langle R_\lambda u, e_j\rangle = \langle u, R_{\bar\lambda} e_j\rangle = \hat u_j/(\lambda_j - \lambda)$. Show that this implies that $\sum_{j>n} |\hat v_j e_j(x)|$ tends locally uniformly to 0.


CHAPTER 12<br />

Inverse spectral theory<br />

In this chapter we continue to study the simple Sturm-Liouville equation $-u'' + qu = \lambda u$, on an interval with at least one regular endpoint. Our aim is to give some results on inverse spectral theory, i.e., questions related to the determination of the equation, in this case the potential $q$, from spectral data, such as eigenvalues, spectral measures or similar things. Our object of study is the eigenvalue problem
\begin{align}
-u'' + qu &= \lambda u \quad\text{on } [0,b), \tag{12.1}\\
u(0)\cos\alpha + u'(0)\sin\alpha &= 0. \tag{12.2}
\end{align}

Here $\alpha$ is an arbitrary, fixed number in $[0,\pi)$, so that the boundary condition is an arbitrary separated boundary condition. We assume $q \in L^1_{\mathrm{loc}}[0,b)$, i.e., $q$ integrable on any interval $[0,c]$ with $c \in (0,b)$, so that 0 is a regular endpoint for the equation. The other endpoint $b$ may be infinite or finite, in the latter case singular or regular. If the deficiency indices for the equation in $L^2(0,b)$ are $(1,1)$ the operator corresponding to (12.1), (12.2) is selfadjoint; if they are $(2,2)$ a boundary condition at $b$ is required to obtain a selfadjoint operator. We assume that, if necessary, a choice of boundary condition at $b$ is made, so that we are dealing with a selfadjoint operator which we will call $T$.

If the deficiency indices are $(2,2)$ we know the spectrum is discrete (Theorem 11.7), but when the deficiency indices are $(1,1)$ the spectrum can be of any type. As in Chapter 11, let $\varphi$ and $\theta$ be solutions of (12.1) satisfying initial conditions
\[
\begin{cases} \varphi(0,\lambda) = -\sin\alpha \\ \varphi'(0,\lambda) = \cos\alpha , \end{cases}
\qquad
\begin{cases} \theta(0,\lambda) = \cos\alpha \\ \theta'(0,\lambda) = \sin\alpha . \end{cases}
\tag{12.3}
\]
Then Green's function for $T$ is given by
\[
g(x,y,\lambda) = \varphi(\min(x,y),\lambda)\,\psi(\max(x,y),\lambda)
\]
where $\psi(x,\lambda) = \theta(x,\lambda) + m(\lambda)\varphi(x,\lambda)$ and the Titchmarsh-Weyl m-function $m(\lambda)$ is determined so that $\psi$ satisfies the boundary condition at $b$. In particular $\psi \in L^2(0,b)$. Let the Nevanlinna representation of $m$ be

\[
m(\lambda) = A + B\lambda + \int_{-\infty}^{\infty}\Bigl(\frac{1}{t-\lambda} - \frac{t}{t^2+1}\Bigr)d\rho(t),
\]
where $A \in \mathbb R$, $B \ge 0$ and $\rho$ increases ($d\rho$ is a positive measure) and $\int_{-\infty}^{\infty}\frac{d\rho(t)}{t^2+1} < \infty$. The transform space $L^2_\rho$ consists of those functions $\hat u$, measurable with respect to $d\rho$, for which $\|\hat u\|_\rho^2 = \int_{-\infty}^{\infty}|\hat u|^2\,d\rho$ is finite.

The generalized Fourier transform of $u \in L^2(0,b)$ is
\[
\hat u(t) = \int_0^b u(x)\varphi(x,t)\,dx ,
\]
converging in $L^2_\rho$, and with inverse given by
\[
u(x) = \int_{-\infty}^{\infty} \hat u(t)\varphi(x,t)\,d\rho(t),
\]
which converges in $L^2(0,b)$. Furthermore, $\|u\| = \|\hat u\|_\rho$ (Parseval) and $u \in D(T)$ if and only if $\hat u$ and $t\hat u(t) \in L^2_\rho$, and then $\widehat{Tu}(t) = t\hat u(t)$.

In the case when one has a discrete spectrum, which means that the spectrum consists of isolated eigenvalues (of finite multiplicity), the function $\rho$ is a step function, with a step at each eigenvalue. Suppose the eigenvalues are $\lambda_1, \lambda_2, \dots$ and that the size of the step is $c_j = \lim_{\varepsilon\downarrow 0}(\rho(\lambda_j+\varepsilon) - \rho(\lambda_j-\varepsilon))$. Then the inverse transform takes the form
\[
u(x) = \sum_{j=1}^{\infty} \hat u(\lambda_j)\varphi(x,\lambda_j)c_j ,
\]
where $\hat u(\lambda_j) = \langle u, \varphi(\cdot,\lambda_j)\rangle$. For $u = \varphi(\cdot,\lambda_j)$ only the $j$th term survives, since eigenfunctions belonging to different eigenvalues are orthogonal, so the expansion becomes $\varphi(x,\lambda_j) = \|\varphi(\cdot,\lambda_j)\|^2\varphi(x,\lambda_j)c_j$. It follows that $c_j = \|\varphi(\cdot,\lambda_j)\|^{-2}$. Note that $\varphi(\cdot,\lambda_j)$ is an eigenfunction associated with $\lambda_j$, so the jump $c_j$ of $\rho$ at $\lambda_j$ is the so-called normalization constant for the eigenfunction. The name comes from the fact that a normalized eigenfunction is given by $e_j = \sqrt{c_j}\,\varphi(\cdot,\lambda_j)$. We have shown the following proposition.

Proposition 12.1. In the case of a discrete spectrum, knowledge of the spectral function $\rho$ is equivalent to knowing the eigenvalues and the corresponding normalization constants.
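A concrete instance of Proposition 12.1 may be helpful. The sketch below takes the simplest example, chosen here only for illustration: $q = 0$ on $[0,\pi]$ with Dirichlet conditions at both ends, so that $\lambda_j = j^2$, $\varphi(x,\lambda) = \sin(\sqrt\lambda x)/\sqrt\lambda$ and $c_j = \|\varphi(\cdot,\lambda_j)\|^{-2} = 2j^2/\pi$. It computes the normalization constants by quadrature and sums the expansion for the test function $u(x) = x(\pi - x)$; the helper names are hypothetical.

```python
import math

B = math.pi  # right endpoint; q = 0 and Dirichlet conditions at 0 and B

def simpson(f, a, b, n=2000):
    # composite Simpson rule with n (even) subintervals
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

def phi(x, lam):
    # solution of -u'' = lam*u with phi(0) = 0, phi'(0) = 1 (alpha = 0)
    r = math.sqrt(lam)
    return math.sin(r * x) / r

def normalization_constant(lam):
    # c_j = 1 / ||phi(., lam_j)||^2, the jump of rho at the eigenvalue
    return 1.0 / simpson(lambda y: phi(y, lam) ** 2, 0.0, B)

def u(x):
    # test function in the domain of T (vanishes at both endpoints)
    return x * (math.pi - x)

def expansion(x, terms=60):
    # partial sum of  sum_j  u_hat(lam_j) * phi(x, lam_j) * c_j
    total = 0.0
    for j in range(1, terms + 1):
        lam = float(j * j)
        u_hat = simpson(lambda y: u(y) * phi(y, lam), 0.0, B)
        total += u_hat * phi(x, lam) * normalization_constant(lam)
    return total

if __name__ == "__main__":
    print(normalization_constant(9.0), 2 * 9 / math.pi)
    print(expansion(1.0), u(1.0))
```

Knowing only the pairs $(\lambda_j, c_j)$ is enough to write down the step function $\rho$, and conversely, which is the content of the proposition.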

1. Asymptotics of the m-function<br />

In order to discuss some results in inverse spectral theory we need a few results on the asymptotic behavior of the m-function for large $\lambda$. We denote by $m_\alpha(\lambda)$ the m-function for the boundary condition (12.2) and some fixed boundary condition at $b$. The following theorem is a simplified version of a result from [3].

Theorem 12.2. We have
\[
m_0(\lambda) = -\sqrt{-\lambda} + o(|\lambda|^{1/2})
\]
as $\lambda \to \infty$ along any non-real ray¹. Similarly, for $0 < \alpha < \pi$,
\[
m_\alpha(\lambda) = \cot\alpha + (\sqrt{-\lambda}\sin^2\alpha)^{-1} + o(|\lambda|^{-1/2})
\]
as $\lambda \to \infty$ along any non-real ray.

By a non-real ray we always mean a half-line starting at the origin which is not part of the real line. Here and later the square root is always the principal branch, i.e., the branch with a positive real part.
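The leading asymptotics $m_0(\lambda) \sim -\sqrt{-\lambda}$ can be observed in a case where the m-function is explicit. The sketch below assumes $q = 0$ on $[0,b]$ with a Dirichlet condition at $b$, where the Weyl solution is a multiple of $\sinh(k(b-x))$ with $k = \sqrt{-\lambda}$, so that $m_0(\lambda) = \psi'(0)/\psi(0) = -k\coth(kb)$. This special case is chosen only for illustration and is not taken from the notes.

```python
import cmath

def m0_free_dirichlet(lam: complex, b: float = 1.0) -> complex:
    # q = 0 on [0, b], Dirichlet condition at b (assumed example):
    # psi(x) = sinh(k (b - x)), so m_0 = psi'(0)/psi(0) = -k / tanh(k b)
    k = cmath.sqrt(-lam)  # principal branch, Re k >= 0
    return -k / cmath.tanh(k * b)

def ratio(lam: complex, b: float = 1.0) -> complex:
    # m_0(lambda) / (-sqrt(-lambda)); should tend to 1 along non-real rays
    return m0_free_dirichlet(lam, b) / (-cmath.sqrt(-lam))

if __name__ == "__main__":
    direction = cmath.exp(1j * cmath.pi / 3)  # a non-real ray
    for r in (1.0, 10.0, 100.0, 1000.0):
        print(r, abs(ratio(r * direction) - 1))
```

The deviation from 1 decays like $2e^{-2b\operatorname{Re}\sqrt{-\lambda}}$, which illustrates why the boundary condition at $b$ does not influence the leading term.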

Now note that, up to constant multiples, the Weyl solution $\psi$ is determined by the boundary condition at $b$. For $\alpha = 0$ we have $\psi'(0,\lambda)/\psi(0,\lambda) = m_0(\lambda)$, so keeping a fixed boundary condition at $b$ we obtain $m_0(\lambda) = (\sin\alpha + m_\alpha(\lambda)\cos\alpha)/(\cos\alpha - m_\alpha(\lambda)\sin\alpha)$. Solving for $m_\alpha$ gives
\[
m_\alpha(\lambda) = \frac{\cos\alpha\, m_0(\lambda) - \sin\alpha}{\sin\alpha\, m_0(\lambda) + \cos\alpha}
= \cot\alpha - (m_0(\lambda)\sin^2\alpha)^{-1} + \frac{\cos\alpha}{m_0(\lambda)\sin^2\alpha\,(m_0(\lambda)\sin\alpha + \cos\alpha)} .
\]
Thus, the formula for $m_0$ immediately implies that for $m_\alpha$, $0 < \alpha < \pi$, so that we only have to prove the formula for $m_0$. This will require good asymptotic estimates of the solutions $\varphi$ and $\theta$.
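The algebra behind the expansion of $m_\alpha$ in terms of $m_0$ is short. In the worked step below the abbreviations $s = \sin\alpha$, $c = \cos\alpha$ and $m = m_0(\lambda)$ are introduced only for this display.

```latex
\begin{align*}
\frac{cm-s}{sm+c}-\frac{c}{s}
  &= \frac{s(cm-s)-c(sm+c)}{s(sm+c)} = -\frac{1}{s(sm+c)},\\
-\frac{1}{s(sm+c)}+\frac{1}{ms^{2}}
  &= \frac{-ms+(sm+c)}{ms^{2}(sm+c)} = \frac{c}{ms^{2}(sm+c)}.
\end{align*}
```

Adding the two lines gives $m_\alpha = \cot\alpha - (m s^2)^{-1} + c/(m s^2(sm + c))$. If $m_0(\lambda) = -\sqrt{-\lambda}\,(1 + o(1))$, the middle term equals $(\sqrt{-\lambda}\sin^2\alpha)^{-1}(1 + o(1))$ and the last term is $O(|\lambda|^{-1})$, which is the asymptotic formula claimed for $0 < \alpha < \pi$.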

Lemma 12.3. If $u$ solves $-u'' + qu = \lambda u$ with fixed initial data in 0 one has
\begin{align}
u(x) ={}& u(0)\Bigl(\cosh(x\sqrt{-\lambda}) + O(1)\bigl(e^{\int_0^x |q|/\sqrt{|\lambda|}} - 1\bigr)e^{x\sqrt{-\lambda}}\Bigr) \notag\\
&+ \frac{u'(0)}{\sqrt{-\lambda}}\Bigl(\sinh(x\sqrt{-\lambda}) + O(1)\bigl(e^{\int_0^x |q|/\sqrt{|\lambda|}} - 1\bigr)e^{x\sqrt{-\lambda}}\Bigr), \tag{12.4}
\end{align}
uniformly in $x$, $\lambda$.

Proof. Solving the equation $u'' + \lambda u = f$ and then replacing $f$ by $qu$ gives
\[
u(x) = \cosh(kx)u(0) + \frac{\sinh(kx)}{k}u'(0) + \int_0^x \frac{\sinh(k(x-t))}{k}q(t)u(t)\,dt, \tag{12.5}
\]
where we have written $k$ for $\sqrt{-\lambda}$. Setting
\[
g(x) = \Bigl|u(x) - \cosh(kx)u(0) - \frac{\sinh(kx)}{k}u'(0)\Bigr|e^{-x\operatorname{Re}k}
\]

1 If g is a positive function the notation f(λ) = o(g(λ)) as λ → ∞ means<br />

f(λ)/g(λ) → 0 as λ → ∞.



easy estimates give
\[
g(x) \le \frac{c(\lambda)}{|k|}\int_0^x |q| + \frac{1}{|k|}\int_0^x |q|g ,
\]
where $c(\lambda) = |u(0)| + |u'(0)|/|k|$. Integrating after multiplying by the integrating factor $|q(x)|\exp(-\int_0^x |q|/|k|)$ we obtain
\[
g(x) \le c(\lambda)\bigl(e^{\int_0^x |q|/\sqrt{|\lambda|}} - 1\bigr).
\]
The estimate for $u$ follows immediately from this.
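The integral equation $u(x) = \cosh(kx)u(0) + \frac{\sinh(kx)}{k}u'(0) + \int_0^x \frac{\sinh(k(x-t))}{k}q(t)u(t)\,dt$ with $k = \sqrt{-\lambda}$, and the resulting bound $g(x) \le c(\lambda)(e^{\int_0^x|q|/|k|} - 1)$, can be tested numerically. The sketch below is only an illustration under assumed data: the potential, the value of $\lambda$, the initial values, the Runge-Kutta integrator and the trapezoidal quadrature are all choices made here.

```python
import cmath
import math

def rk4(q, lam, u0, du0, x_end, n):
    """Integrate -u'' + q u = lam u on [0, x_end]; return grid values of u."""
    h = x_end / n
    u, v = complex(u0), complex(du0)
    us = [u]

    def acc(x, w):  # u'' = (q(x) - lam) u
        return (q(x) - lam) * w

    for i in range(n):
        x = i * h
        k1u, k1v = v, acc(x, u)
        k2u, k2v = v + h / 2 * k1v, acc(x + h / 2, u + h / 2 * k1u)
        k3u, k3v = v + h / 2 * k2v, acc(x + h / 2, u + h / 2 * k2u)
        k4u, k4v = v + h * k3v, acc(x + h, u + h * k3u)
        u += h / 6 * (k1u + 2 * k2u + 2 * k3u + k4u)
        v += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        us.append(u)
    return us

def check(q, lam, u0, du0, x_end=1.0, n=2000):
    """Return (residual of the integral equation, g(x_end), bound on g)."""
    k = cmath.sqrt(-lam)  # principal branch
    h = x_end / n
    us = rk4(q, lam, u0, du0, x_end, n)
    free = cmath.cosh(k * x_end) * u0 + cmath.sinh(k * x_end) / k * du0
    # trapezoidal rule for the Volterra term
    vals = [cmath.sinh(k * (x_end - i * h)) / k * q(i * h) * us[i]
            for i in range(n + 1)]
    integral = h * (sum(vals) - (vals[0] + vals[-1]) / 2)
    residual = abs(us[-1] - free - integral)
    g = abs(us[-1] - free) * math.exp(-x_end * k.real)
    aq = [abs(q(i * h)) for i in range(n + 1)]
    q_int = h * (sum(aq) - (aq[0] + aq[-1]) / 2)
    bound = (abs(u0) + abs(du0) / abs(k)) * (math.exp(q_int / abs(k)) - 1)
    return residual, g, bound

if __name__ == "__main__":
    print(check(lambda x: math.cos(3 * x), complex(5, 5), 1.0, 0.5))
```

The residual should be at the level of the quadrature error, and the computed $g$ should lie below the bound.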

Proof of Theorem 12.2. As noted, we only need to prove the theorem for $\alpha = 0$, so assume this. Now let $\lambda = r\mu$, where $\mu$ is in some fixed, compact subset of $\mathbb C \setminus \mathbb R$, and $r > 0$ is large. We define $\varphi_r(x,\mu) = \sqrt r\,\varphi(x/\sqrt r, r\mu)$ and $\theta_r(x,\mu) = \theta(x/\sqrt r, r\mu)$. Then $\varphi_r$ and $\theta_r$ satisfy the initial conditions (12.3) for $\alpha = 0$ and satisfy the equation $-u'' + q_r u = \mu u$ on $(0, b\sqrt r)$, where $q_r(x) = q(x/\sqrt r)/r$ (check!!). From Lemma 12.3 it immediately follows that, locally uniformly in $x$, $\mu$, we have $\varphi_r(x,\mu) \to \frac{\sinh(x\sqrt{-\mu})}{\sqrt{-\mu}}$ and $\theta_r(x,\mu) \to \cosh(x\sqrt{-\mu})$ as $r \to \infty$.
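The scaling claims for $\varphi_r$ and $\theta_r$ can be verified by the chain rule. The following worked check writes $u_r(x) = u(x/\sqrt r, r\mu)$ for any solution $u$ of $-u'' + qu = r\mu\,u$; the notation $u_r$ is introduced only here.

```latex
\begin{align*}
u_r''(x) &= \tfrac{1}{r}\,u''(x/\sqrt r,\,r\mu)
          = \tfrac{1}{r}\bigl(q(x/\sqrt r)-r\mu\bigr)u_r(x)
          = \bigl(q_r(x)-\mu\bigr)u_r(x),\\
\varphi_r(0,\mu) &= \sqrt r\,\varphi(0,r\mu)=0,
\qquad
\varphi_r'(0,\mu) = \sqrt r\cdot\tfrac{1}{\sqrt r}\,\varphi'(0,r\mu)=1,\\
\theta_r(0,\mu) &= \theta(0,r\mu)=1,
\qquad
\theta_r'(0,\mu) = \tfrac{1}{\sqrt r}\,\theta'(0,r\mu)=0,\\
\int_0^x |q_r| &= \frac{1}{\sqrt r}\int_0^{x/\sqrt r}|q| \longrightarrow 0
\quad (r\to\infty).
\end{align*}
```

The last line shows that the error factor $e^{\int_0^x|q_r|/\sqrt{|\mu|}} - 1$ in Lemma 12.3 tends to 0, locally uniformly in $x$ and $\mu$, which is what gives the stated limits.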

Now let $m_r(\mu) = \frac{m(r\mu)}{\sqrt r}$ and make a change of variable $x = y/\sqrt r$ in (11.2). This gives $\int_0^{b\sqrt r}|\theta_r + m_r\varphi_r|^2 = \operatorname{Im}(m_r(\mu))/\operatorname{Im}\mu$, so if $c > 0$ we have
\[
\int_0^c |\theta_r(\cdot,\mu) + m_r(\mu)\varphi_r(\cdot,\mu)|^2 \le \frac{\operatorname{Im} m_r(\mu)}{\operatorname{Im}\mu} \tag{12.6}
\]
as soon as $b\sqrt r \ge c$. The inequality may be rewritten as
\[
|m_r(\mu) - C_r| \le R_r ,
\]
where $C_r$ and $R_r$ are easily expressed in terms of $\int_0^c \theta_r\overline{\varphi_r}$, $\int_0^c |\theta_r|^2$ and $\int_0^c |\varphi_r|^2$. The inequality therefore confines $m_r$ to a disk $K_r(c)$, and it is clear from Lemma 12.3 that as $r \to \infty$ the coefficients converge, locally uniformly for $\mu \in \mathbb C \setminus \mathbb R$, to those in the corresponding disk $K(c)$ for the case $q = 0$. Therefore, given any neighborhood $\Omega$ of $K(c)$, we must have $m_r(\mu) \in \Omega$ for all sufficiently large $r$.

This is true for any $c > 0$, and it is obvious from (12.6) that $K(c)$ decreases as a function of $c$. We shall show presently that only the point $-\sqrt{-\mu}$ is common to all $K(c)$, and then it follows that $m_r(\mu) \to -\sqrt{-\mu}$, locally uniformly for $\mu \in \mathbb C \setminus \mathbb R$. But this means that $m(\lambda) = -\sqrt{-\lambda}(1 + o(1))$ as $\lambda \to \infty$ in a closed, non-real sector with vertex at the origin, and thus proves the theorem.

It remains to show that $\bigcap_{c>0} K(c) = \{-\sqrt{-\mu}\}$. But any point $\ell$ in the intersection corresponds to a solution $u(x) = \cosh(x\sqrt{-\mu}) + \ell\sinh(x\sqrt{-\mu})/\sqrt{-\mu}$ with $u \in L^2(0,\infty)$, since we have $\int_0^c |u|^2 \le \frac{\operatorname{Im}\ell}{\operatorname{Im}\mu}$ for all $c > 0$. Thus the only possible value is $\ell = -\sqrt{-\mu}$. On the other hand, the equation with $q = 0$ has a Weyl solution on $[0,\infty)$, so that in fact this value of $\ell$ gives a point which is in all $K(c)$. This may of course also be verified directly (do it!). The proof is now complete.
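The direct verification that $\ell = -\sqrt{-\mu}$ lies in every disk $K(c)$ can be written out as follows. This is one possible way to do the suggested check, with $k = \sqrt{-\mu}$ (principal branch) used as an abbreviation.

```latex
\begin{align*}
u(x) &= \cosh(kx)-\sinh(kx) = e^{-kx},
\qquad
\int_0^c |u|^2 = \int_0^c e^{-2x\operatorname{Re}k}\,dx
 = \frac{1-e^{-2c\operatorname{Re}k}}{2\operatorname{Re}k} < \frac{1}{2\operatorname{Re}k},\\
k^2 &= -\mu \;\Longrightarrow\; \operatorname{Im}\mu = -2\operatorname{Re}k\,\operatorname{Im}k
\;\Longrightarrow\;
\frac{\operatorname{Im}\ell}{\operatorname{Im}\mu}
 = \frac{-\operatorname{Im}k}{-2\operatorname{Re}k\,\operatorname{Im}k}
 = \frac{1}{2\operatorname{Re}k}.
\end{align*}
```

Here $\operatorname{Im}k \ne 0$ and $\operatorname{Re}k > 0$ because $\mu$ is non-real. Hence $\int_0^c|u|^2 \le \operatorname{Im}\ell/\operatorname{Im}\mu$ for every $c > 0$, with equality only in the limit $c \to \infty$.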

2. Uniqueness theorems<br />

Given $q$, $b$, and the boundary conditions, one may in principle determine $m$ and thus $d\rho$. We will take as our basic inverse problem to determine $q$ (and possibly $b$ and the boundary conditions) when $d\rho$ is given. Around 1950 Gelfand and Levitan [9] gave a rather complete solution to this problem. Their solution includes uniqueness, i.e., a proof that different boundary value problems cannot yield the same spectral measure; reconstruction, i.e., a method (an integral equation) whereby one, at least in principle, can determine $q$ from the spectral measure; and characterization, i.e., a description of those measures that are spectral measures for some equation.

To discuss the full Gelfand-Levitan theory here would take us too far afield. Instead we will confine ourselves to the problem of uniqueness, i.e., to show that two different operators cannot have the same spectral measure. This problem was solved independently by Borg [8] and Marčenko [10] just before the Gelfand-Levitan theory appeared. To state the theorem we introduce, in addition to the operator $T$, another similar operator $\tilde T$, corresponding to a boundary condition of the form (12.2), but with an angle $\tilde\alpha \in [0,\pi)$, an interval $[0,\tilde b)$, a potential $\tilde q$ and, if needed, a boundary condition at $\tilde b$. Let the corresponding spectral measure be $d\tilde\rho$.

Theorem 12.4 (Borg-Marčenko). If $d\rho = d\tilde\rho$, then $\tilde T = T$, i.e., $\tilde\alpha = \alpha$, $\tilde b = b$ and $\tilde q = q$.

A few years ago Barry Simon [11] proved a 'local' version of this uniqueness theorem. This was a product of a new strategy developed by Simon for obtaining the results of Gelfand and Levitan. I will give my own proof [6], which is quite elementary and does not use the machinery of Simon. We will use the same idea to prove Theorem 12.4.

In order to state Simon's theorem, one should first note that knowing $m$ is essentially equivalent to knowing $d\rho$, at least if the boundary condition (12.2) is known. Knowing $m$ one can in fact find $d\rho$ via the Stieltjes inversion formula, and knowing $d\rho$ one may calculate the integral in the representation of $m$. By Theorem 12.2 we always have $B = 0$, and $A$ may be determined (if $\alpha \ne 0$) since we also have $m(i\nu) \to \cot\alpha$ as $\nu \to \pm\infty$. We denote the m-functions associated with $T$ and $\tilde T$ by $m$ and $\tilde m$ respectively. Then Simon's theorem is the following.

Theorem 12.5 (Simon). Suppose that $0 < a \le \min(b,\tilde b)$. Then $\alpha = \tilde\alpha$ and $q = \tilde q$ a.e. on $(0,a)$ if $(m(\lambda) - \tilde m(\lambda))e^{2(a-\varepsilon)\operatorname{Re}\sqrt{-\lambda}} \to 0$ for every $\varepsilon > 0$ as $\lambda \to \infty$ along some non-real ray. Conversely, if $\alpha = \tilde\alpha$ and $q = \tilde q$ on $(0,a)$, then $(m(\lambda) - \tilde m(\lambda))e^{2(a-\varepsilon)\operatorname{Re}\sqrt{-\lambda}} \to 0$ for every $\varepsilon > 0$ as $\lambda \to \infty$ along any non-real ray.

We will prove both theorems by the same method, the crucial point of which is the following lemma.

Lemma 12.6. For any fixed $x \in (0,b)$ we have $\varphi(x,\lambda)\psi(x,\lambda) \to 0$ as $\lambda \to \infty$ along a non-real ray.

Note that $\varphi(x,\lambda)\psi(x,\lambda)$ is Green's function on the diagonal $x = y$. We shall postpone the proof a moment and see how the theorems follow from it. We first have a corollary.

Corollary 12.7. Suppose $\alpha = \tilde\alpha = 0$ or $\alpha \ne 0 \ne \tilde\alpha$. Then both $\tilde\varphi(x,\lambda)\psi(x,\lambda)$ and $\varphi(x,\lambda)\tilde\psi(x,\lambda)$ tend to 0 as $\lambda \to \infty$ along a non-real ray, locally uniformly in $x$.

Proof. Clearly (12.4) implies that for fixed $x$ and $\tilde\alpha \ne 0$ we have
\[
\varphi(x,\lambda)/\tilde\varphi(x,\lambda) \to \sin\alpha/\sin\tilde\alpha \quad\text{as } \lambda \to \infty
\]
along a non-real ray. If $\alpha = \tilde\alpha = 0$ we instead obtain the limit 1, so the corollary follows from Lemma 12.6.

We shall also need a standard theorem from complex analysis, which is a slight elaboration of the maximum principle.

Theorem 12.8 (Phragmén-Lindelöf). Suppose $f$ is analytic in a closed sector bounded by two rays from the origin, that it is bounded on the rays, and that $|f(z)| \le Ae^{B|z|^{1/2}}$ in the sector, for some constants $A$ and $B$. Then $f$ is bounded in the sector.

This is just one of the simplest versions of a general class of theorems, which are all known under the names of Phragmén and Lindelöf. Proofs are given in many textbooks on complex analysis, but for the reader's convenience we also give a proof here.

Proof. We may without loss of generality assume that the rays are given by the angles $\pm\beta$. Let $\varepsilon > 0$ and $F(z) = e^{-\varepsilon z^\gamma}f(z)$, where $1/2 < \gamma < \pi/(2\beta)$ and the branch of $z^\gamma$ is chosen to be positive real for positive real $z$. Now, for $z = re^{\pm i\beta}$ we have $|F(z)| = e^{-\varepsilon r^\gamma\cos(\beta\gamma)}|f(z)|$, where $\cos(\beta\gamma) > 0$. Let $M$ be a bound for $f$ on the rays. Then we have $|F(z)| \le M$ on the rays.

For $z = Re^{i\delta}$ with $|\delta| \le \beta$ we have
\[
|F(z)| \le A\exp(BR^{1/2} - \varepsilon R^\gamma\cos(\beta\gamma))
\]
which tends to 0 as $R \to \infty$. Thus, on all circular sectors bounded by the rays we have $|F(z)| \le M$ on the boundary if the radius $R$ is sufficiently large. By the maximum principle this also holds in the interior of the circular sector. Since $R$ can be chosen arbitrarily large, the bound is valid in the entire domain bounded by the rays. It follows that if $z$ is in this domain, then $|f(z)| \le Me^{\varepsilon|z|^\gamma}$, and letting $\varepsilon \to 0$ we obtain the desired result.

Proof of Theorem 12.4. According to the Nevanlinna representation formula for m and m̃ their difference m − m̃ is a constant C, since the linear term Bλ is always absent by the asymptotic formulas of Theorem 12.2. In particular, since Dirichlet m-functions are always unbounded near ∞ on a non-real ray and all others are bounded, we must have either α ≠ 0 ≠ α̃ or α = 0 = α̃ if dρ = dρ̃. Thus, according to Corollary 12.7, the difference ϕ̃(x, λ)ψ(x, λ) − ϕ(x, λ)ψ̃(x, λ) tends to 0 as λ → ∞ along a non-real ray. This difference is

ϕ̃(x, λ)θ(x, λ) − ϕ(x, λ)θ̃(x, λ) + Cϕ(x, λ)ϕ̃(x, λ),

which is an entire function of λ tending to 0 along non-real rays, and it may be bounded by a multiple of $e^{B|\lambda|^{1/2}}$ for some constant B according to (12.4). By Theorem 12.8 such a function is bounded in the entire plane, and therefore constant by Liouville's theorem, hence identically 0 since the limit is zero along the rays. It follows that

θ(x, λ)/ϕ(x, λ) = θ̃(x, λ)/ϕ̃(x, λ) − C

for all x, λ. Differentiating with respect to x, using the fact that θ′ϕ − θϕ′ = 1, we obtain ϕ²(x, λ) = ϕ̃²(x, λ). Taking the logarithmic derivative of this we obtain ϕ′(x, λ)/ϕ(x, λ) = ϕ̃′(x, λ)/ϕ̃(x, λ). For x = 0 this gives α = α̃, and thus that m and m̃ are asymptotically the same. Thus C = 0, so that m = m̃. Differentiating once more we obtain ϕ″/ϕ = ϕ̃″/ϕ̃ which means that q = q̃ on min(b, b̃). From this follows that ϕ = ϕ̃ and θ = θ̃, and thus also ψ = ψ̃, on min(b, b̃). This implies that b = b̃, since otherwise ψ (or ψ̃) would satisfy selfadjoint boundary conditions both at b and b̃, so that ψ would be an eigenfunction to a non-real eigenvalue for a selfadjoint operator. Since ψ = ψ̃ also the boundary conditions at b = b̃ (if any) are the same. It follows that T = T̃. □

Proof of Theorem 12.5. Our starting point is that if α = α̃ the functions ϕ̃(x, λ)ψ(x, λ) and ϕ(x, λ)ψ̃(x, λ) tend to 0 as λ → ∞ along a non-real ray. Their difference is

(12.7) ϕ̃(x, λ)θ(x, λ) − ϕ(x, λ)θ̃(x, λ) + (m(λ) − m̃(λ))ϕ(x, λ)ϕ̃(x, λ).

Suppose first that α = α̃ and q = q̃ on (0, a). Then the first two terms cancel on (0, a), so that (m(λ) − m̃(λ))ϕ(x, λ)ϕ̃(x, λ) → 0 as λ → ∞ along non-real rays if x ∈ (0, a). By (12.4) this implies that $(m(\lambda) - \tilde m(\lambda))e^{2(a-\varepsilon)\operatorname{Re}\sqrt{-\lambda}} \to 0$ as λ → ∞ along any non-real ray.

Conversely, the estimate for m − m̃ implies first that α = α̃ and then that for 0 < x < a the last term of (12.7) tends to 0 according to assumption and (12.4), so that the entire function ϕ̃(x, λ)θ(x, λ) − ϕ(x, λ)θ̃(x, λ) of λ also tends to 0 along a non-real ray, and by symmetry also along its conjugate. However, as in the proof of Theorem 12.4 this entire function is bounded by $e^{B|\lambda|^{1/2}}$ for some constant B, so by the Phragmén-Lindelöf theorem it vanishes for all x ∈ (0, a). It follows that q = q̃ in (0, a) exactly as in the proof of Theorem 12.4. □

It only remains to prove Lemma 12.6.

Proof of Lemma 12.6. Note that for α = 0 (Dirichlet's boundary condition) we have ψ(0, λ) = 1 and ψ′(0, λ) = m(λ). Since only ψ and its multiples satisfy the boundary condition at b, we have m(λ) = u′(0, λ)/u(0, λ) for any solution of −u″ + qu = λu satisfying the boundary condition at b. But consider now the interval [a, b) for 0 < a < b and the corresponding operator generated by our differential equation in $L^2(a, b)$ with the Dirichlet boundary condition at a and the same boundary condition as before at b. It follows that its m-function is given by ψ′(a, λ)/ψ(a, λ). Similarly, −ϕ′(a, λ)/ϕ(a, λ) is the m-function corresponding to the interval (0, a], considering a as the initial point, provided with the Dirichlet boundary condition, and using the boundary condition (12.2) at 0. The change in sign is due to the fact that the initial point of the interval is now to the right of the other end point.

Now, since ϕψ′ − ϕ′ψ ≡ 1 we have

1/(ϕψ) = (ϕψ′ − ϕ′ψ)/(ϕψ) = ψ′/ψ − ϕ′/ϕ,

so this is a sum of two Dirichlet m-functions. According to Theorem 12.2 all such m-functions are asymptotic to $-\sqrt{-\lambda}$ as λ → ∞ along a non-real ray, which immediately implies that ϕ(a, λ)ψ(a, λ) → 0. □
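
The argument can be illustrated in the simplest case. The sketch below is not from the notes; it assumes q = 0 on [0, ∞) with the Dirichlet condition at 0, and it assumes the normalization ϕ(0) = 0, ϕ′(0) = −1, chosen so that ϕψ′ − ϕ′ψ = 1. With κ = √(−λ) this gives ϕ = −sinh(κx)/κ and ψ = e^{−κx}. The helper names `phi`, `psi` and `check` are hypothetical.

```python
import cmath

def phi(x, lam):
    """Solution of -u'' = lam*u with phi(0) = 0, phi'(0) = -1 (assumed normalization)."""
    kappa = cmath.sqrt(-lam)  # principal branch: Re kappa > 0 for non-real lam
    return -cmath.sinh(kappa * x) / kappa

def psi(x, lam):
    """Solution with psi(0) = 1 that is square integrable near infinity."""
    kappa = cmath.sqrt(-lam)
    return cmath.exp(-kappa * x)

def check(a, lam):
    """Return the Wronskian and both sides of 1/(phi psi) = psi'/psi - phi'/phi at x = a."""
    kappa = cmath.sqrt(-lam)
    dphi = -cmath.cosh(kappa * a)
    dpsi = -kappa * cmath.exp(-kappa * a)
    wronskian = phi(a, lam) * dpsi - dphi * psi(a, lam)
    lhs = 1 / (phi(a, lam) * psi(a, lam))
    rhs = dpsi / psi(a, lam) - dphi / phi(a, lam)
    return wronskian, lhs, rhs

# |phi(a, lam) psi(a, lam)| along the ray lam = i*t, at a = 1.
products = [abs(phi(1.0, 1j * t) * psi(1.0, 1j * t)) for t in (1e1, 1e2, 1e3, 1e4)]
print(products)
print(check(1.0, 100j))
```

In this example both ψ′/ψ = −κ and −ϕ′/ϕ = −κ coth(κa) are asymptotic to −√(−λ), so the product ϕψ behaves like −1/(2√(−λ)).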

We make some final remarks. One may generalize Simon's theorem to the more general Sturm-Liouville equation −(pu′)′ + qu = λu, where 1/p and q are real-valued and locally integrable, provided one can show appropriate growth estimates for the solutions and that ϕ̃ψ → 0 as before. I showed in [4, 5] that ϕ̃ψ → 0 in the appropriate manner, provided 1/p is in $L^r_{\mathrm{loc}}$ for some r > 1 and q − q̃ is in $L^{r'}_{\mathrm{loc}}$, where r′ is the conjugate exponent to r. For example, if 1/p is locally bounded, local integrability of q and q̃ is enough. Simon's theorem therefore generalizes to this situation. The condition on the m-functions then has to be replaced by $m(\lambda) - \tilde m(\lambda) = O\bigl(\exp\bigl(-2\int_0^{a-\varepsilon} \operatorname{Re}\sqrt{-\lambda/p}\,\bigr)\bigr)$.

As far as the original Borg-Marčenko theorem is concerned, it is now well known exactly to what extent the coefficients p, q and w in the equation (10.1), as well as the interval and boundary conditions, are determined by the spectral measure, see [7].


CHAPTER 13<br />

First order systems<br />

We shall here study the spectral theory of general first order systems

(13.1) Ju′ + Qu = Wv

where J is a constant n × n matrix which is invertible and skew-Hermitian (i.e., J* = −J) and the coefficients Q and W are n × n matrix-valued functions which are locally integrable on I. In addition Q is assumed Hermitian and W positive semi-definite. As we shall see, these properties ensure the proper symmetry of the differential expression. The functions u and v are n × 1 matrix-valued on I. In the special case when n is even and $J = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix}$, I being the unit matrix of order n/2, systems of the form (13.1) are usually called Hamiltonian systems.

The following existence and uniqueness theorem is fundamental.

Theorem 13.1. Suppose A is an n × n matrix-valued function with locally integrable entries in an interval I, and that B is an n × 1 matrix-valued function, also locally integrable in I. Assume further that c ∈ I and C is an n × 1 matrix. Then the initial value problem

$$\begin{cases} u' + Au = B & \text{in } I, \\ u(c) = C, \end{cases}$$

has a unique n × 1 matrix-valued solution u with locally absolutely continuous entries defined in I.

The theorem has the following immediate consequence.

Corollary 13.2. The set of solutions to u′ + Au = 0 in I is an n-dimensional linear space.

Proofs for Theorem 13.1 and Corollary 13.2 are given in Appendix C. We will apply them for $A = J^{-1}(Q - \lambda W)$, where λ ∈ C, and $B = J^{-1}Wv$.
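
To make this reduction concrete, the following minimal sketch (not taken from the notes) writes the free equation −u″ = λu as a system of the form (13.1) and integrates the resulting initial value problem u′ + Au = B numerically. The choices U = (u, −u′), Q = diag(0, −1), W = diag(1, 0), the value λ = 1, and the classical Runge-Kutta helper `rk4` are all illustrative assumptions.

```python
import math

# Assumed example: -u'' = lam*u written as J U' + Q U = lam W U with U = (u, -u').
J_INV = [[0.0, -1.0], [1.0, 0.0]]  # inverse of J = [[0, 1], [-1, 0]]
Q = [[0.0, 0.0], [0.0, -1.0]]
W = [[1.0, 0.0], [0.0, 0.0]]
LAM = 1.0

def coefficient_a():
    """A = J^{-1}(Q - lam W), as in the text."""
    M = [[Q[i][j] - LAM * W[i][j] for j in range(2)] for i in range(2)]
    return [[sum(J_INV[i][k] * M[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = coefficient_a()

def rhs(x, u):
    """Right-hand side of u' = B - A u with B = 0 (homogeneous case)."""
    return [-sum(A[i][j] * u[j] for j in range(2)) for i in range(2)]

def rk4(f, u0, x0, x1, steps=1000):
    """Classical fourth order Runge-Kutta integration from x0 to x1."""
    h = (x1 - x0) / steps
    x, u = x0, list(u0)
    for _ in range(steps):
        k1 = f(x, u)
        k2 = f(x + h / 2, [ui + h / 2 * ki for ui, ki in zip(u, k1)])
        k3 = f(x + h / 2, [ui + h / 2 * ki for ui, ki in zip(u, k2)])
        k4 = f(x + h, [ui + h * ki for ui, ki in zip(u, k3)])
        u = [ui + h / 6 * (a + 2 * b + 2 * c + d)
             for ui, a, b, c, d in zip(u, k1, k2, k3, k4)]
        x += h
    return u

# u(0) = 0, u'(0) = 1, i.e. U(0) = (0, -1); the exact solution is u = sin x.
U_END = rk4(rhs, [0.0, -1.0], 0.0, 1.0)
print(U_END[0], math.sin(1.0))
```

The numerical value of the first component at x = 1 can be compared with sin 1, and the second with −cos 1.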

We shall study (13.1) in the Hilbert space $L^2_W$ of equivalence classes of n × 1 matrix-valued Lebesgue measurable functions u for which u*Wu is integrable over I. In this space the scalar product is $\langle u, v\rangle = \int_I v^* W u$. Two functions u and ũ are considered equivalent if the integral $\int_I (u - \tilde u)^* W (u - \tilde u) = 0$. Note that this means that they can be very different pointwise. For example, in the case of the system equivalent to (10.1) the second component of an element of $L^2_W$ is completely undetermined.


Since W is assumed locally integrable it is clear that constant n × 1 matrices are locally in $L^2_W$ so (each component of) Wu is locally integrable if $u \in L^2_W$. It is also clear that u and ũ are two different representatives of the same equivalence class in $L^2_W$ precisely if Wu = Wũ almost everywhere (Exercise 13.1).
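
As a small illustration of how coarse the equivalence classes can be, the sketch below (an assumption-laden example, not part of the notes) takes n = 2, I = (0, 1) and W = diag(w, 0) with w(x) = 1 + x, and approximates the scalar product by the midpoint rule. The helper `w_inner` and the sample functions are hypothetical.

```python
import math

def w_inner(u, v, w, a=0.0, b=1.0, steps=2000):
    """Midpoint-rule approximation of <u, v> = int_a^b v* W u for W = diag(w, 0)."""
    h = (b - a) / steps
    total = 0.0
    for k in range(steps):
        x = a + (k + 0.5) * h
        # Only first components enter, since the second diagonal entry of W is 0.
        total += complex(v(x)[0]).conjugate() * w(x) * u(x)[0] * h
    return total

def w(x):
    return 1.0 + x

def u(x):
    return (x, math.sin(x))

def u_tilde(x):
    return (x, 100.0)  # same first component, very different second component

def diff(x):
    return (u(x)[0] - u_tilde(x)[0], u(x)[1] - u_tilde(x)[1])

distance_squared = w_inner(diff, diff, w)
norm_squared = w_inner(u, u, w)
print(distance_squared, norm_squared)
```

Here u and ũ differ everywhere in the second component but satisfy Wu = Wũ, so they represent the same element; the norm only sees the first component, and its square is ∫₀¹ x²(1 + x) dx = 7/12.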

Example 13.3. Any standard scalar differential equation may be written in the form (13.1) with a constant, skew-Hermitian J. If it is possible to do this so that Q and W are Hermitian, the differential equation is called formally symmetric. We have already seen this in the case of the Sturm-Liouville equation (10.1), which will be formally symmetric if p, q and w are real-valued. The first order scalar equation iu′ + qu = wv is already of the proper form and formally symmetric if q and w are real-valued. The fourth order equation (p2u″)″ − (p1u′)′ + p0u = wv may be written in the form (13.1) by setting

$$U = \begin{pmatrix} u \\ u' \\ (p_2u'')' - p_1u' \\ -p_2u'' \end{pmatrix}, \quad J = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ -1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \end{pmatrix} \quad\text{and}\quad Q = \begin{pmatrix} p_0 & 0 & 0 & 0 \\ 0 & p_1 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & -1/p_2 \end{pmatrix},$$

as is readily seen, and it will be formally symmetric if the coefficients w, p0, p1 and p2 are real-valued.
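
The claim "as is readily seen" can be spot-checked numerically. The sketch below is an illustration rather than a proof: it assumes constant coefficients p0, p1, p2 (with p0 as the zeroth order coefficient) and the test function u = e^{kx}, for which all derivatives are explicit. The function name `residual` and the sample values are hypothetical.

```python
import cmath

def residual(k, p0, p1, p2, x=0.3):
    """Return the rows of J U' + Q U and the scalar expression, for u = exp(k x)."""
    e = cmath.exp(k * x)
    u, du, d2u, d3u = e, k * e, k**2 * e, k**3 * e
    # U = (u, u', (p2 u'')' - p1 u', -p2 u'') with constant coefficients.
    U = [u, du, p2 * d3u - p1 * du, -p2 * d2u]
    dU = [k * c for c in U]  # every component is a constant multiple of exp(k x)
    J = [[0, 0, 1, 0], [0, 0, 0, 1], [-1, 0, 0, 0], [0, -1, 0, 0]]
    Q = [[p0, 0, 0, 0], [0, p1, 1, 0], [0, 1, 0, 0], [0, 0, 0, -1 / p2]]
    rows = [sum(J[i][j] * dU[j] + Q[i][j] * U[j] for j in range(4)) for i in range(4)]
    scalar = (p2 * k**4 - p1 * k**2 + p0) * e  # (p2 u'')'' - (p1 u')' + p0 u
    return rows, scalar

rows, scalar = residual(k=0.7 + 0.2j, p0=2.0, p1=3.0, p2=5.0)
print(rows, scalar)
```

The first row reproduces the scalar fourth order expression, and the remaining three rows vanish, so the system is equivalent to the scalar equation when W has w in the upper left corner and zeros elsewhere.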

In order to get a spectral theory for (13.1) it is convenient to use the theory of symmetric relations, since it is sometimes not possible to find a densely defined symmetric operator realizing the equation. Consequently, we must begin by defining a minimal relation, show that it is symmetric, calculate its adjoint and find the selfadjoint restrictions of the adjoint. We define the minimal relation T0 to be the closure in $L^2_W \oplus L^2_W$ of the set of pairs (u, v) of elements in $L^2_W$ with compact support in the interior of I (i.e., which are 0 outside some compact subinterval of the interior of I which may be different for different pairs (u, v)) and such that u is locally absolutely continuous and satisfies the equation Ju′ + Qu = Wv. This relation between u and v may or may not be an operator (Exercise 13.2).

The next step is to calculate the adjoint of T0. In order to do this, we shall again use the classical variation of constants formula, now in a more general form than in Lemma 10.4. Below we always assume that c is a fixed (but arbitrary) point in I. Let F(x, λ) be an n × n matrix-valued solution of JF′ + QF = λWF with F(c, λ) invertible. This means precisely that the columns of F are a basis for the solutions of (13.1) for v = λu. Such a solution is called a fundamental matrix for this equation. We will always in addition suppose that S = F(c, λ) is independent of λ and symplectic, i.e., S*JS = J. We may for example take S equal to the n × n unit matrix or, if J is unitary, S = J.

Lemma 13.4. We have $F^*(x, \bar\lambda) J F(x, \lambda) = J$ for any complex λ and x ∈ I. The solution u of Ju′ + Qu = λWu + Wv with initial data u(c) = 0 is given by

$$u(x) = F(x, \lambda) J^{-1} \int_c^x F^*(y, \bar\lambda) W(y) v(y)\,dy. \tag{13.2}$$

Proof. We have $(F^*(x, \bar\lambda) J F(x, \lambda))' = -(JF'(x, \bar\lambda))^* F(x, \lambda) + F^*(x, \bar\lambda) J F'(x, \lambda) = 0$ using the differential equation. It follows that $F^*(x, \bar\lambda) J F(x, \lambda)$ is constant. Since it equals J for x = c this is its value for all x ∈ I. It follows that $J^{-1} F^*(x, \bar\lambda)$ is the inverse matrix of $JF(x, \lambda)$. Straightforward differentiation now shows that (13.2) solves the equation. □
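
The first identity of Lemma 13.4 can be tested numerically in a constant-coefficient case, reading the first factor as the adjoint of F(x, λ̄); for real λ this is the same as F(x, λ). The sketch below is an illustration under assumptions not in the notes: n = 2, c = 0, S = I, the sample matrices Q and W, and a truncated power series for the matrix exponential. All helper names are hypothetical.

```python
J = [[0, 1], [-1, 0]]
J_INV = [[0, -1], [1, 0]]
Q = [[1, 0], [0, -1]]   # Hermitian (sample choice)
W = [[1, 0], [0, 0]]    # positive semi-definite (sample choice)

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

def adjoint(X):
    """Conjugate transpose."""
    return [[X[j][i].conjugate() for j in range(len(X))] for i in range(len(X[0]))]

def expm(X, terms=60):
    """Matrix exponential by a truncated power series (adequate for small matrices)."""
    n = len(X)
    result = [[complex(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, X)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def fundamental(x, lam):
    """F(x, lam) = exp(-x J^{-1}(Q - lam W)): J F' + Q F = lam W F and F(0, lam) = I."""
    M = [[Q[i][j] - lam * W[i][j] for j in range(2)] for i in range(2)]
    A = matmul(J_INV, M)
    return expm([[-x * a for a in row] for row in A])

def deviation(X, Y):
    return max(abs(X[i][j] - Y[i][j]) for i in range(len(X)) for j in range(len(X[0])))

lam, x = 1 + 2j, 1.0
F = fundamental(x, lam)
F_bar = fundamental(x, lam.conjugate())
with_conjugate = matmul(matmul(adjoint(F_bar), J), F)
without_conjugate = matmul(matmul(adjoint(F), J), F)
print(deviation(with_conjugate, J), deviation(without_conjugate, J))
```

For this non-real λ the product built from F(x, λ̄) reproduces J to rounding error, while the product built from F(x, λ) alone does not, which shows why the conjugate matters.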

Corollary 13.5. If $v \in L^2_W$ with compact support in I then (13.1) has a solution u with compact support in I if and only if $\int_I v^* W u_0 = 0$ for all solutions u0 of the homogeneous equation (13.1) with v = 0.

Proof. If we choose c to the left of the support of v, then by Lemma 13.4 the function $u(x) = F(x) J^{-1} \int_c^x F^* W v$ is the only solution of (13.1) which vanishes to the left of c. Since $F(x)J^{-1}$ is invertible (13.1) has a solution of compact support if and only if $\int_I F^* W v = 0$. But the columns of F are linearly independent so they are a basis for the solutions of the homogeneous equation. The corollary follows. □

Lemma 13.6. Suppose $(u, v) \in T_0^*$. Then there is a representative of the equivalence class u, also denoted by u, which is absolutely continuous and satisfies Ju′ + Qu = Wv. Conversely, if this holds, then $(u, v) \in T_0^*$.

Proof. Let u1 be a solution of Ju1′ + Qu1 = Wv and assume (u0, v0) ∈ T0 has compact support. Integrating by parts we get

$$\int_I v^* W u_0 = \int_I (Ju_1' + Qu_1)^* u_0 = \int_I u_1^* (Ju_0' + Qu_0) = \int_I u_1^* W v_0.$$

This proves the converse part of the lemma. We also have 0 = ⟨u0, v⟩ − ⟨v0, u⟩ = ⟨v0, u1 − u⟩. Here v0 is an arbitrary compactly supported element of $L^2_W$ for which there exists a compactly supported element $u_0 \in L^2_W$ satisfying Ju0′ + Qu0 = Wv0. By Corollary 13.5 it follows that u1 − u solves the homogeneous equation, i.e., u solves (13.1). □

It now follows that T0 is symmetric and that its adjoint is given by the maximal relation T1 consisting of all pairs (u, v) in $L^2_W \times L^2_W$ such that u is (the equivalence class of) a locally absolutely continuous function for which Ju′ + Qu = Wv. We can now apply the theory of Chapter 9.2. The deficiency indices of T0 are accordingly the number of solutions of Ju′ + Qu = iWu and Ju′ + Qu = −iWu respectively which are linearly independent in $L^2_W$. Since there are altogether only n (pointwise) linearly independent solutions of these equations the deficiency indices can be no larger than n; in particular they are both finite. We now make the following basic assumption.

Assumption 13.7. If K is a sufficiently large, compact subinterval of I there is no non-trivial solution of Ju′ + Qu = 0 with $\int_K u^* W u = 0$.

Note that if there is a solution with u*Wu = 0, then Wu = 0 so u actually also solves Ju′ + Qu = λWu for any complex λ. The assumption automatically holds if (13.1) is equivalent to a Sturm-Liouville equation, or more generally an equation of the types discussed in Example 13.3 and Exercise 13.3. One reason for making the assumption is that it ensures that the deficiency indices of T0 are precisely equal to the dimensions of the spaces of those solutions of Ju′ + Qu = ±iWu which have finite norm, but the assumption will be even more important in the next chapter.

According to Corollary 9.15 there will be selfadjoint realizations of (13.1) precisely if the deficiency indices are equal. We will in the rest of this chapter assume that a selfadjoint extension of T0 exists. Some simple criteria that ensure this are given in the following proposition, but if these do not apply it can, in a concrete case, be very difficult to determine whether there are selfadjoint realizations or not.

Proposition 13.8. The minimal relation T0 has equal deficiency indices if either of the following conditions is satisfied:

(1) J, Q and W are real-valued.

(2) The interval I is compact.

Proof. If $u \in L^2_W$ satisfies Ju′ + Qu = λWu and the coefficients are real-valued, then conjugation shows that ū is still in $L^2_W$ and Jū′ + Qū = λ̄Wū. There is therefore a one-to-one correspondence between $D_\lambda$ and $D_{\bar\lambda}$ which obviously preserves linear independence. It follows that n+ = n−.

If I is compact, then solutions of Ju′ + Qu = λWu are absolutely continuous in I, and W is integrable in I, so that all solutions are in $L^2_W$. Thus n+ = n− = n. □

Example 13.9. Note that J* = −J, so J can be real-valued only if n is even (show this!). Suppose u solves the equation $\sum_{k=0}^{m} (p_k u^{(k)})^{(k)} = iwu$ where the coefficients p0, ..., pm are real-valued and w > 0. Then ū satisfies $\sum_{k=0}^{m} (p_k \bar u^{(k)})^{(k)} = -iw\bar u$. It follows that if (13.1) is equivalent to an equation of this form, then its deficiency indices are always equal so that selfadjoint realizations exist. This is in particular the case for the Sturm-Liouville equation (10.1).

We will now take a closer look at how selfadjoint realizations are determined as restrictions of the maximal relation. Suppose (u1, v1) and (u2, v2) ∈ T1. Then the boundary form (cf. Chapter 9) is

$$\begin{aligned}
\langle (u_1, v_1), U(u_2, v_2)\rangle &= i\int_I (v_2^* W u_1 - u_2^* W v_1) \\
&= i\int_I \bigl((Ju_2' + Qu_2)^* u_1 - u_2^* (Ju_1' + Qu_1)\bigr) \\
&= -i\int_I (u_2^* J u_1)' = -i \lim_{K \to I}\, [u_2^* J u_1]_K,
\end{aligned} \tag{13.3}$$

the limit being taken over compact subintervals K of I. We must restrict T1 so that this vanishes. As in Chapter 10 this means that the restriction of T1 to a selfadjoint relation T is obtained by boundary conditions since the limit clearly only depends on the values of u1 and u2 in arbitrarily small neighborhoods of the endpoints of I.

An endpoint is called regular if it is a finite number and Q and W are integrable near the endpoint. Otherwise the endpoint is singular. If both endpoints are regular, we again say that we are dealing with a regular problem. We have a singular problem if at least one of the endpoints is infinite, or if at least one of Q and W is not integrable on I.

Consider now the regular case. Since it is clear that both deficiency indices equal n in the regular case there are always selfadjoint realizations. To see what they look like, let ũ be the boundary value of (u, v) ∈ T1, i.e., $\tilde u = \begin{pmatrix} u(a) \\ u(b) \end{pmatrix}$. Also put $B = \begin{pmatrix} iJ & 0 \\ 0 & -iJ \end{pmatrix}$ so that the boundary form is $\tilde u_2^* B \tilde u_1$. Now if $u \in D_i$ then ⟨u, Uu⟩ = ⟨u, u⟩ so that the boundary form is positive definite on $D_i$. Similarly it is negative definite on $D_{-i}$ (cf. Corollary 9.17). Since $\dim D_i \oplus D_{-i} = 2n$ the rank of the boundary form is 2n on this space so that the boundary values of this space, and a fortiori those of T1, range through all of $\mathbb{C}^{2n}$. Since ⟨T1, UT0⟩ = 0 it follows that the boundary value of any element of T0 is 0.

Conversely, to guarantee that ⟨T1, Uu⟩ = 0 for some u ∈ T1 it is obviously enough that the boundary value of u vanishes. Hence the minimal relation consists exactly of those elements of the maximal relation which have boundary value 0. It is now clear that any maximal symmetric restriction of T1 is obtained by restricting the boundary values to a maximal subspace of $\mathbb{C}^{2n}$ for which the boundary form vanishes, a so-called maximal isotropic space for B. We know, since the deficiency indices are finite and equal, that all such maximal symmetric restrictions are actually selfadjoint (Corollary 9.15). Since the problem of finding maximal isotropic spaces of B is a purely algebraic one we consider the problem of identifying all selfadjoint restrictions of T1 solved in the regular case. See also Exercise 13.4.


Clearly all these restrictions are obtained by restricting the boundary values of elements in T1 to certain n-dimensional subspaces of $\mathbb{C}^{2n}$, i.e., by imposing n linear, homogeneous boundary conditions on T1. We consider a few special cases. One selfadjoint realization is obtained by imposing periodic boundary conditions u(b) = u(a) or more generally u(b) = Su(a) where S is a fixed matrix satisfying S*JS = J. As already mentioned, such a matrix S is often called symplectic, at least in the case when S is real, and $J = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix}$, so that n is even.
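
That a coupling u(b) = Su(a) with S*JS = J makes the boundary form vanish can be verified directly, since then u2(b)*Ju1(b) = u2(a)*S*JSu1(a) = u2(a)*Ju1(a). The following sketch checks this for n = 2; the sample vectors, the two matrices and the helper names are illustrative assumptions, not taken from the notes.

```python
J = [[0, 1], [-1, 0]]

def apply(S, u):
    """Matrix-vector product S u."""
    return [sum(S[i][j] * u[j] for j in range(len(u))) for i in range(len(S))]

def sesq(u2, u1):
    """The sesquilinear expression u2* J u1."""
    Ju1 = apply(J, u1)
    return sum(complex(a).conjugate() * b for a, b in zip(u2, Ju1))

def boundary_form(S, u1a, u2a):
    """-i [u2* J u1] between a and b when u(b) = S u(a) for both functions."""
    u1b, u2b = apply(S, u1a), apply(S, u2a)
    return -1j * (sesq(u2b, u1b) - sesq(u2a, u1a))

u1a, u2a = [1 + 1j, 2], [0.5, -1j]       # arbitrary boundary values at a
symplectic = [[1, 2], [0, 1]]            # real with determinant 1, so S*JS = J
not_symplectic = [[2, 0], [0, 1]]        # here S*JS = 2J

print(boundary_form(symplectic, u1a, u2a))
print(boundary_form(not_symplectic, u1a, u2a))
```

The first value is zero, the second is not, so only the symplectic coupling gives a symmetric restriction in this example.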

Another possibility occurs if the invertible Hermitian matrix iJ has an equal number of positive and negative eigenvalues (this obviously requires n to be even). In that case we may impose separated boundary conditions, i.e., conditions that make both u*(a)Ju(a) and u*(b)Ju(b) vanish. Boundary conditions which are not separated are called coupled. It must be emphasized that for n > 2 there are selfadjoint realizations which are determined by some conditions imposed only on the value at one of the endpoints, and some conditions involving the values at both endpoints.

Let us now turn to the general, not necessarily regular case. We first need to briefly discuss Hermitian forms of finite rank. If B is a Hermitian form on a linear space L we set $L_B = \{u \in L \mid B(u, L) = 0\}$ which is a subspace of L. The rank of B is codim $L_B$ (= dim $L/L_B$). In the sequel we assume that B has finite rank. If M is a subspace on which the form B is non-degenerate, i.e., there is no non-zero element u ∈ M such that B(u, v) = 0 for all v ∈ M, then we must have $L_B \cap M = \{0\}$ so that M has to be finite-dimensional. This means, of course, that after introducing a basis in M the form B is given on M by an invertible matrix. If B is non-degenerate on M, then for every u ∈ L there is a unique element v ∈ M (the B-projection of u on M) such that B(u − v, M) = 0 (Exercise 13.5). If B is non-degenerate on M, but not on any proper superspace of M, we say that M is maximal non-degenerate for B. Of course this means exactly that $L_B \cap M = \{0\}$ and dim M = rank B so that $L = M \dotplus L_B$ as a direct sum.

We call a subspace P of L on which B is positive definite a maximal positive definite space for B if P has no proper superspaces on which B is positive definite. If B is positive definite on P, then clearly dim P ≤ rank B. It follows that forms of finite rank always have maximal positive definite spaces. Similarly for negative definite spaces.

Proposition 13.10 (Sylvester's law of inertia). Suppose B is a Hermitian form of finite rank on a linear space L. Then all maximal positive definite subspaces for B have the same dimension. Similarly for maximal negative definite subspaces.

Proof. Suppose P is maximal positive definite for B and that P̃ is another positive definite space for B. Then the B-projection on P is injective as a linear map $B_P : \tilde P \to P$. For if not, there exists a non-zero u ∈ P̃ such that B(u, P) = 0. But then B is positive definite on the linear hull of u and P, since $B(\alpha u + \beta v, \alpha u + \beta v) = |\alpha|^2 B(u, u) + |\beta|^2 B(v, v)$ for any v ∈ P. This contradicts the maximality of P as a positive definite space. From the standard fact $\dim \tilde P = \dim B_P(\tilde P) + \dim\{u \in \tilde P \mid B_P u = 0\}$ it now follows that dim P̃ ≤ dim P. By symmetry all maximal positive definite subspaces for B have the same dimension. Similarly, all maximal negative definite spaces for B have the same dimension. □

If P is any maximal positive definite subspace, and N any maximal negative definite subspace, for B, we set r+ = dim P and r− = dim N. The pair (r+, r−) is called the signature of the form B.

Proposition 13.11. Suppose P and N are maximal as positive and negative definite subspaces for a Hermitian form B of finite rank. Then P ∩ N = {0}, the direct sum $P \dotplus N$ is a maximal non-degenerate space for B, and rank B = r+ + r−.

Proof. Clearly B cannot be both positive and negative on the same vector u, so P ∩ N = {0}. B is obviously (check!) non-degenerate on $P \dotplus N$, and if $P \dotplus N$ is not maximal there exists $u \notin P \dotplus N$ such that B is non-degenerate on the linear hull M of u and $P \dotplus N$. We may assume $B(u, P \dotplus N) = 0$, since otherwise we can subtract from u its B-projection on $P \dotplus N$. We cannot have B(u, u) = 0 since B would then be degenerate on M. But if B(u, u) > 0, then B would be positive definite on the linear hull of u and P, contradicting the maximality of P. Similarly, if B(u, u) < 0 we would get a contradiction to the maximality of N. Therefore $P \dotplus N$ is maximal non-degenerate so that r+ + r− = rank B. □

Two Hermitian forms Ba and Bb of finite rank are said to be independent if each has a maximal non-degenerate space Ma respectively Mb such that Ba(Mb, L) = Bb(Ma, L) = 0. It is then clear that Ma ∩ Mb = {0} and that $M_a \dotplus M_b$ is maximal non-degenerate for Bb − Ba. If $(r^a_+, r^a_-)$ and $(r^b_+, r^b_-)$ are the signatures of Ba and Bb respectively it follows that $(r^b_+ + r^a_-, r^b_- + r^a_+)$ is the signature of Bb − Ba.

Now consider (13.3) and suppose I = (a, b). If $\mathbf{u}_1 = (u_1, v_1)$ and $\mathbf{u}_2 = (u_2, v_2) \in T_1$ then $-iu_2^* J u_1$ has a limit both in a and b by (13.3). We denote these limits $B_a(\mathbf{u}_1, \mathbf{u}_2)$ and $B_b(\mathbf{u}_1, \mathbf{u}_2)$ respectively and call them the boundary forms at a and b respectively. Clearly Ba and Bb are Hermitian forms on T1. Being limits of forms of rank n they both have ranks ≤ n (Exercise 13.6). They are also independent. This follows from the next lemma.

Lemma 13.12. Suppose (u, v) ∈ T1. Then there exists (u1, v1) in T1 such that (u1, v1) = (u, v) in a right neighborhood of a and (u1, v1) vanishes in a left neighborhood of b.

Proof. Let [c, d] be a compact subinterval of I = (a, b) such that $\int_c^d F^*(\cdot, \lambda) W F(\cdot, \lambda)$ is invertible and put v1 = v in (a, c] and v1 ≡ 0 in [d, b). Now let

$$u_1(x) = F(x, \lambda)\Bigl(u(c) + J^{-1} \int_c^x F^*(y, \lambda) W(y) v_1(y)\,dy\Bigr).$$

It is clear that u1 = u in (a, c] and if we choose v1 appropriately in [c, d] we can achieve that u1 ≡ 0 in [d, b). In fact, setting $v_1(x) = -F(x, \lambda)\bigl(\int_c^d F^*(\cdot, \lambda) W F(\cdot, \lambda)\bigr)^{-1} J u(c)$ in this interval will do. □

It follows that (u − u1, v − v1) ∈ T1 is 0 near a and equals (u, v) near b. We can therefore find a maximal non-degenerate space for Ba consisting of elements of T1 vanishing near b. Similarly, we can find a maximal non-degenerate space for Bb consisting of elements of T1 vanishing near a. Thus Ba and Bb are independent, as claimed. Since the signature of the complete boundary form Bb − Ba is (n+, n−) the independence of Ba and Bb implies that $n_+ = r^a_- + r^b_+$ and $n_- = r^a_+ + r^b_-$, using the notation introduced above for the signatures of Ba and Bb. According to Corollary 9.15 T1 has selfadjoint restrictions precisely if n+ = n−. Reasoning as in the regular case it follows that there are selfadjoint restrictions defined by separated boundary conditions precisely if $r^a_+ = r^a_-$ and $r^b_+ = r^b_-$, from which n+ = n− follows. In fact, from any two of these relations the third clearly follows.

Consider finally the case when $a$ is a regular endpoint but $b$ is possibly<br />
singular. In this case $B_a$ is given by $B_a(u_1, u_2) = iu_2(a)^* J u_1(a)$, with<br />
notation as above. Clearly $r^a_+$ is the number of positive eigenvalues of<br />
$iJ$ and $r^a_-$ the number of negative eigenvalues. It follows that selfadjoint<br />
restrictions of $T_1$ defined by separated boundary conditions exist if and<br />
only if the deficiency indices are equal and $iJ$ has an equal number<br />
of positive and negative eigenvalues; in particular $n$ must be even. In<br />
the Sturm-Liouville case all these conditions are fulfilled, as we already<br />
know.<br />
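The eigenvalue count for $iJ$ is easy to carry out by hand in the smallest cases. The following is only a worked sketch: the case $n = 1$ uses $J = -i$ as in Example 14.6, while the $2 \times 2$ matrix $J$ is a standard skew-Hermitian choice assumed here for illustration and is not quoted from the text.<br />

```latex
% n = 1, J = -i: a single positive eigenvalue, so separated conditions are impossible.
iJ = i\cdot(-i) = 1, \qquad (r^a_+, r^a_-) = (1, 0).

% n = 2, J assumed to be the matrix below (J^* = -J):
J = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \qquad
iJ = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad
\det(iJ - \mu) = \mu^2 - 1,

% so the eigenvalues are \mu = \pm 1 and
(r^a_+, r^a_-) = (1, 1).
```

In the first case $n$ is odd, which fits the fact that Example 14.6 uses a periodic (non-separated) boundary condition; in the second case the numbers of positive and negative eigenvalues agree.<br />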

Exercises for Chapter 13<br />

Exercise 13.1. Show that $u$ and $\tilde u$ are elements of the same equivalence<br />
class in $L^2_W$ if and only if $Wu = W\tilde u$ a.e.<br />

Exercise 13.2. Verify that $T_0$ is the graph of an operator if (13.1)<br />
is equivalent to an equation of the type (10.1) (or more generally an<br />
equation of the type discussed in Exercise 13.3) and $w > 0$ a.e. in<br />
$I$. Also show that in this case Assumption 13.7 holds. Try to show<br />
this assuming only that $w \ge 0$ but $w > 0$ on a subset of $I$ of positive<br />
measure (this is considerably harder).<br />


Exercise 13.3. Show that the differential equation $iu''' = wv$ (here<br />
$i = \sqrt{-1}$) can be written in the form (13.1).<br />
Also show that the equation $\sum_{k=0}^m (p_k u^{(k)})^{(k)} = wv$ can be written<br />
in this form if the coefficients $w$ and $p_0, p_1, \dots, p_m$ satisfy appropriate<br />
conditions (state these conditions!).<br />
Hint: Put $U = \begin{pmatrix} u \\ hu' \\ hu'' \end{pmatrix}$ in the first case. In the second case, let $U$ be<br />
the matrix with $2m$ rows $u_0, \dots, u_{2m-1}$ where $u_j = u^{(j)}$ and<br />
$u_{m+j} = (-1)^j \sum_{k=j+1}^m (p_k u^{(k)})^{(k-j-1)}$ for $j = 0, \dots, m - 1$.<br />

Exercise 13.4. Find all selfadjoint realizations of a regular Sturm-Liouville<br />
equation. More generally, assume $J^{-1} = J^* = -J$ and show<br />
that the eigenvalues of $B$ are $\pm 1$, both with multiplicity $n$. Then<br />
describe all maximal isotropic spaces for $B$.<br />

Exercise 13.5. Suppose $B$ is a Hermitian form of finite rank on<br />
a Hilbert space $L$, and that $B$ is non-degenerate on a subspace $M$.<br />
Show that for any $u \in L$ there is a unique $v \in M$, the $B$-projection on<br />
$M$, such that $B(u - v, M) = 0$. Also show that $B(u - v, L) = 0$ holds<br />
for all $u \in L$ if and only if $M$ is maximal non-degenerate.<br />

Exercise 13.6. Suppose $B_1, B_2, \dots$ is a sequence of Hermitian<br />
forms on $L$ with finite rank, all of signature $(r_+, r_-)$, and suppose<br />
$B_j(u, v) \to B(u, v)$ as $j \to \infty$, for any $u, v \in L$. Show that $B$ is<br />
a Hermitian form on $L$ of finite rank with signature $(s_+, s_-)$, where $s_+ \le r_+$ and<br />
$s_- \le r_-$.<br />


CHAPTER 14<br />

Eigenfunction expansions<br />

Just as in Chapter 11, we will deduce our results for the system<br />
(13.1) from a detailed description of the resolvent. As before we will<br />
prove that the resolvent is actually an integral operator. To see this,<br />
first note that according to Lemma 13.6 all elements of $D_1$ are locally<br />
absolutely continuous, in particular they are in $C(I)$. The set $C(I)$ becomes<br />
a Fréchet space if provided with the topology of locally uniform<br />
convergence; with a little loss of elegance we may restrict ourselves to<br />
considering $C(K)$ for an arbitrary compact subinterval $K \subset I$. This is a<br />
Banach space with norm $\|u\|_K = \sup_{x \in K}|u(x)|$, $|\cdot|$ denoting the norm<br />
of an $n \times 1$ matrix (Exercise 14.1). The set $T_1$ is a closed subspace of<br />
$H \oplus H$, since $T_1$ is a closed relation. It follows from Assumption 13.7<br />
that the map $T_1 \ni (u, v) \mapsto u \in C(I)$ is well defined, i.e., there cannot<br />
be two different locally absolutely continuous functions $u$ in the same<br />
$L^2_W$-equivalence class satisfying (13.1) for the same $v$. The restriction<br />
map $I_K : T_1 \ni (u, v) \mapsto u \in C(K)$ is therefore a linear map between<br />
Banach spaces.<br />

Proposition 14.1. For every compact subinterval $K \subset I$ there<br />
exists a constant $C_K$ such that<br />
\[ \|u\|_K \le C_K \|(u, v)\|_W \]<br />
for any $(u, v) \in T_1$.<br />

Proof. We shall show that the restriction map $I_K$ is a closed operator<br />
if $K$ is sufficiently large. Since $I_K$ is everywhere defined in the<br />
Hilbert space $T_1$ it follows by the closed graph theorem (Appendix A)<br />
that $I_K$ is a bounded operator, which is the statement of the proposition.<br />
Now suppose $(u_j, v_j) \to (u, v)$ in $T_1$ and $u_j \to \tilde u$ in $C(K)$. We<br />
must show that $I_K(u, v) = \tilde u$, i.e., $u = \tilde u$ pointwise in $K$. We have<br />
$0 \le \int_K (u - u_j)^* W (u - u_j) \le \|u - u_j\|^2$ and by Lemma 13.4<br />
\[ u_j(x) = F(x, \lambda)\Bigl(u_j(c) + J^{-1} \int_c^x F^*(y, \lambda) W(y) v_j(y)\,dy\Bigr), \]<br />
so letting $j \to \infty$ it is clear that $\int_K (u - \tilde u)^* W (u - \tilde u) = 0$ and that $\tilde u$<br />
satisfies $J\tilde u' + Q\tilde u = Wv$, so Assumption 13.7 shows that $u - \tilde u = 0$<br />
pointwise in $K$ if $K$ is sufficiently large. Hence $I_K$ is closed, and we<br />
are done.<br />


We can now show that the resolvent is an integral operator. First<br />
note that if $T$ is a selfadjoint realization of (13.1), i.e., a selfadjoint<br />
restriction of $T_1$, then setting $H_T = \overline{D(T)}$ the resolvent $R_\lambda$ of the operator<br />
part $\tilde T$ of $T$ is an operator on $H_T$, defined for $\lambda \in \rho(\tilde T)$. We define<br />
the resolvent set $\rho(T) = \rho(\tilde T)$ and extend $R_\lambda$ to all of $L^2_W$ by setting<br />
$R_\lambda H_\infty = 0$, and it is then clear that the resolvent has all the properties<br />
of Theorems 5.2 and 5.3; the only difference is that the resolvent<br />
is perhaps no longer injective. Given $u \in L^2_W$ we obtain the element$^1$<br />
$(R_\lambda u, \lambda R_\lambda u + u) \in T_1$, so we may also view the resolvent as an operator<br />
$\tilde R_\lambda : L^2_W \to T_1$. This operator is bounded since<br />
$\|(R_\lambda u, \lambda R_\lambda u + u)\|_W \le ((1 + |\lambda|)\|R_\lambda\| + 1)\|u\|_W$.<br />
Hence $\|\tilde R_\lambda\| \le (1 + |\lambda|)\|R_\lambda\| + 1$, where $\|R_\lambda\|$<br />
is the norm of $R_\lambda$ as an operator on $H_T$. It is also clear that the analyticity<br />
of $R_\lambda$ implies the analyticity of $\tilde R_\lambda$. We obtain the following<br />
theorem.<br />

Theorem 14.2. Suppose $I$ is an arbitrary interval, and that $T$ is a<br />
selfadjoint realization in $L^2_W$ of the system (13.1), satisfying Assumption<br />
13.7. Then the resolvent $R_\lambda$ of $T$ may be viewed as a bounded<br />
linear map from $L^2_W$ to $C(K)$, for any compact subinterval $K$ of $I$,<br />
which depends analytically on $\lambda \in \rho(T)$, in the uniform operator topology.<br />
Furthermore, there exists Green’s function $G(x, y, \lambda)$, an $n \times n$<br />
matrix-valued function, such that $R_\lambda u(x) = \langle u, G^*(x, \cdot, \lambda)\rangle_W$ for any<br />
$u \in L^2_W$. The columns of $y \mapsto G^*(x, y, \lambda)$ are in $H_T = \overline{D(T)}$ for any<br />
$x \in I$.<br />

Proof. We already noted that $\rho(T) \ni \lambda \mapsto \tilde R_\lambda \in B(L^2_W, T_1)$ is<br />
analytic in the uniform operator topology. Furthermore, the restriction<br />
operator $I_K : T_1 \to C(K)$ is bounded and independent of $\lambda$. Hence<br />
$\rho(T) \ni \lambda \mapsto I_K \tilde R_\lambda$ is analytic in the uniform operator topology. In<br />
particular, for fixed $\lambda \in \rho(T)$ and any $x \in I$, the components of the<br />
linear map $L^2_W \ni u \mapsto (I_K \tilde R_\lambda u)(x) = R_\lambda u(x)$ are bounded linear forms.<br />
By Riesz’ representation theorem we have $R_\lambda u(x) = \langle u, G^*(x, \cdot, \lambda)\rangle_W$,<br />
where the columns of $y \mapsto G^*(x, y, \lambda)$ are in $L^2_W$. Since $R_\lambda u = 0$ for<br />
$u \in H_\infty$ it follows that the columns of $G^*(x, \cdot, \lambda)$ are actually in $H_T$<br />
for each $x \in I$.<br />

Among other things, Theorem 14.2 tells us that if $u_j \to u$ in $L^2_W$,<br />
then $R_\lambda u_j \to R_\lambda u$ in $C(K)$, so that $R_\lambda u_j$ converges locally uniformly.<br />
This is actually true even if $u_j$ just converges weakly, but all we need<br />
is the following weaker result.<br />

Lemma 14.3. Suppose $R_\lambda$ is the resolvent of a selfadjoint relation<br />
$T$ as above. Then if $u_j \rightharpoonup 0$ weakly in $L^2_W$, it follows that $R_\lambda u_j \to 0$<br />
pointwise and locally boundedly.<br />

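$^1$ $u = u_T + u_\infty$ with $u_T \in H_T$ and $(0, u_\infty) \in T$, and $\tilde T R_\lambda u_T = (\tilde T - \lambda) R_\lambda u_T + \lambda R_\lambda u_T = u_T + \lambda R_\lambda u$.<br />
Thus $(R_\lambda u, \lambda R_\lambda u + u) = (R_\lambda u, \lambda R_\lambda u + u_T) + (0, u_\infty) \in T \subset T_1$.<br />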



Proof. $R_\lambda u_j(x) = \langle u_j, G^*(x, \cdot, \lambda)\rangle_W \to 0$ since the columns of<br />
$y \mapsto G^*(x, y, \lambda)$ are in $L^2_W$ for any $x \in I$. Now let $K$ be a compact<br />
subinterval of $I$. A weakly convergent sequence in $L^2_W$ is bounded, so<br />
since $R_\lambda$ maps $L^2_W$ boundedly into $C(K)$, it follows that $R_\lambda u_j(x)$ is<br />
bounded independently of $j$ and $x$ for $x \in K$.<br />

Corollary 14.4. If the interval $I$ is compact, then any selfadjoint<br />
restriction $T$ of $T_1$ has compact resolvent. Hence $T$ has a complete<br />
orthonormal sequence of eigenfunctions in $H_T$.<br />

Proof. Suppose $u_j \rightharpoonup 0$ weakly in $L^2_W$. If $I$ is compact, then<br />
Lemma 14.3 implies that $R_\lambda u_j \to 0$ pointwise and boundedly in $I$,<br />
and hence by dominated convergence $R_\lambda u_j \to 0$ in $L^2_W$. Thus $R_\lambda$ is<br />
compact. The last statement follows from Theorem 8.3.<br />
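To see the mechanism of this proof in a concrete case, one can look at the operator of Example 14.6 below, whose orthonormal eigenfunctions are $e_k(x) = e^{ikx}/\sqrt{2\pi}$ with eigenvalues $k \in \mathbb{Z}$. The following lines are only an illustration of the statement, not part of the argument.<br />

```latex
% e_k tends weakly to 0 as |k| -> infinity (Bessel's inequality), while \|e_k\| = 1.
% Since \tilde T e_k = k e_k, the resolvent acts by
R_\lambda e_k = \frac{e_k}{k - \lambda}, \qquad \lambda \in \rho(T),
% so that
\sup_{x \in [-\pi, \pi]} |R_\lambda e_k(x)| = \frac{1}{\sqrt{2\pi}\,|k - \lambda|} \to 0,
\qquad
\|R_\lambda e_k\| = \frac{1}{|k - \lambda|} \to 0 .
```

A weakly null sequence is thus mapped to a sequence that converges in norm, which is exactly the compactness used in the corollary.<br />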

If $T$ has compact resolvent, then the generalized Fourier series of<br />
any $u \in H_T$ converges to $u$ in $L^2_W$; if we just have $u \in L^2_W$ the series<br />
converges to the projection of $u$ onto $H_T$. For functions in the domain<br />
of $T$ much stronger convergence is obtained.<br />

Corollary 14.5. Suppose $T$ has a complete orthonormal sequence<br />
of eigenfunctions in $H_T$. If $u \in D(T)$, then the generalized Fourier series<br />
of $u$ converges locally uniformly in $I$. In particular, if $I$ is compact,<br />
the convergence is uniform in $I$.<br />

Proof. Suppose $u \in D(T) = D(\tilde T)$, i.e., $\tilde T u = v$ for some $v \in H_T$,<br />
and let $\tilde v = v - iu$, so that $u = R_i \tilde v$. If $e$ is an eigenfunction of<br />
$T$ with eigenvalue $\lambda$ we have $\tilde T e = \lambda e$ or $(\tilde T + i)e = (\lambda + i)e$ so<br />
that $R_{-i} e = e/(\lambda + i)$. It follows that<br />
\[ \langle u, e\rangle_W e = \langle R_i \tilde v, e\rangle_W e = \langle \tilde v, R_{-i} e\rangle_W e = \frac{1}{\lambda - i} \langle \tilde v, e\rangle_W e = \langle \tilde v, e\rangle_W R_i e. \]<br />
If $s_N u$ denotes the $N$:th partial<br />
sum of the Fourier series for $u$ it follows that $s_N u = R_i s_N \tilde v$, where $s_N \tilde v$<br />
is the $N$:th partial sum for $\tilde v$. Since $s_N \tilde v \to \tilde v$ in $H_T$, it follows from<br />
Theorem 14.2 and the remark after it that $s_N u \to u$ in $C(K)$, for any<br />
compact subinterval $K$ of $I$.<br />

The convergence is actually even better than the corollary shows,<br />
since it is absolute and uniform (see Exercise 14.2).<br />

Example 14.6. Consider the operator of Example 4.8, which is<br />
$-i\frac{d}{dx}$ considered in $L^2(-\pi, \pi)$, with the boundary condition $u(-\pi) = u(\pi)$.<br />
This is a regular, selfadjoint realization of (13.1) for $n = 1$,<br />
$J = -i$, $Q = 0$ and $W = 1$, and it is clear that $H_\infty = \{0\}$. Hence there<br />
is a complete orthonormal sequence of eigenfunctions in $L^2(-\pi, \pi)$.<br />
The solutions of $-iu' = \lambda u$ are the multiples of $e^{i\lambda x}$, and the boundary<br />
condition implies that $\lambda$ is an integer. We obtain the classical<br />
(complex) Fourier series expansion<br />
\[ u(x) = \sum_{k=-\infty}^{\infty} \hat u_k e^{ikx}, \qquad \hat u_k = \frac{1}{2\pi} \int_{-\pi}^{\pi} u(x) e^{-ikx}\,dx. \]<br />
According to our results, the series converges in<br />
$L^2(-\pi, \pi)$ for any $u \in L^2(-\pi, \pi)$, and uniformly if $u$ is in the domain of the operator, i.e., absolutely continuous<br />
with derivative in $L^2(-\pi, \pi)$ and $u(-\pi) = u(\pi)$.<br />
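The uniform convergence claimed in Example 14.6 can be observed numerically. The sketch below is an illustration and not part of the notes: it takes $u(x) = |x|$, which is absolutely continuous with $u(-\pi) = u(\pi)$ and derivative in $L^2(-\pi, \pi)$, approximates the coefficients $\hat u_k$ by a midpoint rule, and measures the largest error of the $N$:th partial sum on a grid. The function names and the quadrature are assumptions made for this example.<br />

```python
import cmath
import math


def fourier_coefficient(u, k, n=2000):
    """Approximate (1/2pi) * integral over (-pi, pi) of u(x) e^{-ikx} dx (midpoint rule)."""
    h = 2 * math.pi / n
    total = 0j
    for j in range(n):
        x = -math.pi + (j + 0.5) * h
        total += u(x) * cmath.exp(-1j * k * x)
    return total * h / (2 * math.pi)


def partial_sum(coeffs, x):
    """Evaluate the sum of u_k e^{ikx} over the keys k of coeffs."""
    return sum(c * cmath.exp(1j * k * x) for k, c in coeffs.items())


def sup_error(u, big_n, samples=101):
    """Largest error of the N-th partial sum on a uniform grid in [-pi, pi]."""
    coeffs = {k: fourier_coefficient(u, k) for k in range(-big_n, big_n + 1)}
    worst = 0.0
    for j in range(samples):
        x = -math.pi + 2 * math.pi * j / (samples - 1)
        worst = max(worst, abs(partial_sum(coeffs, x) - u(x)))
    return worst


if __name__ == "__main__":
    # The printed errors should shrink as N grows.
    for big_n in (5, 20, 50):
        print(big_n, sup_error(abs, big_n))
```

For a function that violates the boundary condition, such as $u(x) = x$, the same experiment shows errors near $\pm\pi$ that do not go to zero, in line with the requirement $u \in D(T)$ in Corollary 14.5.<br />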



Exercises for Chapter 14<br />

Exercise 14.1. Show that if $K$ is a compact interval, then $C(K)$ is<br />
a Banach space with the norm $\sup_{x \in K}|u(x)|$. Also show that if $I$ is an<br />
arbitrary interval, then $C(I)$ is a Fréchet space (a linear Hausdorff space<br />
with the topology given by a countable family of seminorms, which is<br />
also complete), under the topology of locally uniform convergence.<br />

Exercise 14.2. With the assumptions of Corollary 14.5 the Fourier<br />
series for $u \in D(T)$ actually converges absolutely and uniformly to $u$.<br />
This may be proved just as for the case of a Sturm-Liouville equation,<br />
which was considered in Exercise 11.2. Do it!<br />


CHAPTER 15<br />

S<strong>in</strong>gular problems<br />

We now have a satisfactory eigenfunction expansion theory for regular<br />
boundary value problems, so we turn next to singular problems.<br />
We then need to take a much closer look at Green’s function. To do<br />
this, we fix an arbitrary point $c \in I$; if $I$ contains one of its endpoints,<br />
this is the preferred choice for $c$. Next, let $F(x, \lambda)$ be a fundamental<br />
matrix for $JF' + QF = \lambda WF$ with $\lambda$-independent, symplectic initial<br />
data in $c$. We will need the following theorem.<br />

Theorem 15.1. A solution $u(x, \lambda)$ of $Ju' + Qu = \lambda Wu$ with initial<br />
data independent of $\lambda$ is an entire function of $\lambda$, locally uniformly with<br />
respect to $x$.<br />

This means that $u(x, \lambda)$ is analytic as a function of $\lambda$ in the whole<br />
complex plane, and that the difference quotients $\frac{1}{h}(u(x, \lambda + h) - u(x, \lambda))$<br />
converge locally uniformly in $x$ as $h \to 0$. The proof is given in Appendix<br />
C. We can now give the following detailed description of Green’s<br />
function.<br />

Theorem 15.2. Green’s function has the following properties:<br />
(1) For $\lambda \in \rho(T)$ we have $R_\lambda u(x) = \langle u, G^*(x, \cdot, \lambda)\rangle_W$.<br />
(2) As functions of $y$ the columns of $G^*(x, y, \lambda)$ satisfy the equation<br />
$Ju' + Qu = \bar\lambda Wu$ for $y \ne x$.<br />
(3) As functions of $y$, the columns of $G^*(x, y, \lambda)$ satisfy the boundary<br />
conditions that determine $T$ as a restriction of $T_1$, for any<br />
$x$ interior to $I$.<br />
(4) $G^*(x, y, \lambda) = G(y, x, \bar\lambda)$, for all $x, y \in I$ and $\lambda \in \rho(T)$.<br />
(5) $G(x, y, \lambda) - G(x, y, \mu) = (\lambda - \mu)\langle G^*(y, \cdot, \bar\mu), G^*(x, \cdot, \lambda)\rangle_W = (\lambda - \mu) R_\lambda G^*(y, \cdot, \bar\mu)(x)$,<br />
for all $x, y \in I$ and $\lambda, \mu \in \rho(T)$.<br />
Furthermore, there exists an $n \times n$ matrix-valued function $M(\lambda)$, defined<br />
in $\rho(T)$ and satisfying $M^*(\lambda) = M(\bar\lambda)$, such that<br />
\[ G(x, y, \lambda) = F(x, \lambda)\bigl(M(\lambda) \pm \tfrac{1}{2} J^{-1}\bigr) F^*(y, \bar\lambda), \tag{15.1} \]<br />
where the sign of $\tfrac{1}{2} J^{-1}$ should be positive for $x > y$, negative for $x < y$.<br />

Proof. We already know (1). Now let $K$ be a compact subinterval<br />
of $I$, $(u, v) \in T_0$ with support in $K$, and suppose $x \notin K$. We have<br />
$u \in D(T_0) \subset D(T)$ and $(u, v) = (u, \lambda u + (v - \lambda u))$ so that $u = R_\lambda(v - \lambda u)$.<br />


We obtain<br />
\[ 0 = u(x) = R_\lambda(v - \lambda u)(x) = \langle v - \lambda u, G^*(x, \cdot, \lambda)\rangle_W = \langle v, G^*(x, \cdot, \lambda)\rangle_W - \langle u, \bar\lambda G^*(x, \cdot, \lambda)\rangle_W. \]<br />
But according to Lemma 13.6 this means that each column of $y \mapsto G^*(x, y, \lambda)$<br />
is in the domain of the maximal relation for (13.1) on the<br />
intervals $I \cap (-\infty, x)$ and $I \cap (x, \infty)$ and satisfies the equation<br />
$Ju' + Qu = \bar\lambda Wu$ on these intervals, so (2) follows. It also follows that we<br />
have<br />
\[ G^*(x, y, \lambda) = \begin{cases} F(y, \bar\lambda) P_+^*(x, \lambda), & y < x \\ F(y, \bar\lambda) P_-^*(x, \lambda), & y > x, \end{cases} \]<br />
for some $n \times n$ matrix-valued functions $P_+$ and $P_-$.<br />

If $u$ is compactly supported and in $L^2_W$ we have, for $x$ outside the<br />
convex hull of the support of $u$,<br />
\[ R_\lambda u(x) = P_\pm(x, \lambda)\langle u, F(\cdot, \bar\lambda)\rangle_W. \tag{15.2} \]<br />
The function $v = R_\lambda u$ satisfies the equation $Jv' + Qv = \lambda Wv + Wu$,<br />
so we may write $P_\pm(x, \lambda) = F(x, \lambda) H_\pm(\lambda)$, and since $R_\lambda u \in D(T)$<br />
it certainly satisfies the boundary conditions determining $T$. If the<br />
support of $u$ is large enough the scalar product in (15.2) can be any<br />
column vector, in view of Assumption 13.7, so for every $y$ each column<br />
of $x \mapsto G(x, y, \lambda)$ also satisfies the boundary conditions determining $T$.<br />
This proves (3). If the endpoints of $I$ are $a$ and $b$ respectively we now<br />
have<br />
\[ R_\lambda u(x) = F(x, \lambda)\Bigl(\int_a^x H_+(\lambda) F^*(\cdot, \bar\lambda) W u + \int_x^b H_-(\lambda) F^*(\cdot, \bar\lambda) W u\Bigr). \]<br />
Differentiating this we obtain<br />
\[ J(R_\lambda u)' + (Q - \lambda W) R_\lambda u = J F(x, \lambda)(H_+(\lambda) - H_-(\lambda)) F^*(x, \bar\lambda) W(x) u(x), \]<br />
so $J F(x, \lambda)(H_+(\lambda) - H_-(\lambda)) F^*(x, \bar\lambda)$ should be$^1$ the unit matrix. In<br />
view of the fact that $J F(x, \lambda)$ is the inverse of $J^{-1} F^*(x, \bar\lambda)$ this means<br />
that $H_+(\lambda) - H_-(\lambda) = J^{-1}$. If we define $M(\lambda) = (H_-(\lambda) + H_+(\lambda))/2$,<br />
so that $H_\pm(\lambda) = M(\lambda) \pm \tfrac{1}{2} J^{-1}$, we now obtain (15.1).<br />

If now $u$ and $v$ both have compact supports we have<br />
\[ \langle R_\lambda u, v\rangle_W = \iint v^*(x) W(x) G(x, y, \lambda) W(y) u(y)\,dx\,dy, \]<br />
the double integral being absolutely convergent. Similarly<br />
\[ \langle u, R_{\bar\lambda} v\rangle_W = \iint v^*(x) W(x) G^*(y, x, \bar\lambda) W(y) u(y)\,dx\,dy, \]<br />
and since the integrals are equal by Theorem 5.2 (2) and<br />
$G(x, y, \lambda) - G^*(y, x, \bar\lambda) = F(x, \lambda)(M(\lambda) - M^*(\bar\lambda)) F^*(y, \bar\lambda)$ we obtain<br />
\[ \langle F(\cdot, \lambda), v\rangle_W (M(\lambda) - M^*(\bar\lambda)) \langle u, F(\cdot, \bar\lambda)\rangle_W = 0. \]<br />
By Assumption 13.7 this implies that $M(\lambda) = M^*(\bar\lambda)$ and thus (4).<br />

$^1$ Actually, one must again argue using Assumption 13.7. We leave the details<br />
to the reader.<br />

Finally, to prove (5) we use the resolvent relation Theorem 5.2(3).<br />
For $u \in L^2_W$ this gives<br />
\[ \langle u, G^*(x, \cdot, \lambda) - G^*(x, \cdot, \mu)\rangle_W = R_\lambda u(x) - R_\mu u(x) = (\lambda - \mu) R_\lambda R_\mu u(x) \]<br />
\[ = (\lambda - \mu)\langle R_\mu u, G^*(x, \cdot, \lambda)\rangle_W = \langle u, (\bar\lambda - \bar\mu) R_{\bar\mu} G^*(x, \cdot, \lambda)\rangle_W. \]<br />
Now<br />
\[ R_{\bar\mu} G^*(x, \cdot, \lambda)(y) = \langle G^*(x, \cdot, \lambda), G^*(y, \cdot, \bar\mu)\rangle_W. \]<br />
Thus<br />
\[ G(x, y, \lambda) - G(x, y, \mu) = (\lambda - \mu)\langle G^*(y, \cdot, \bar\mu), G^*(x, \cdot, \lambda)\rangle_W = (\lambda - \mu) R_\lambda G^*(y, \cdot, \bar\mu)(x), \]<br />
since both sides are clearly in $D(T)$. This proves (5).<br />
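Formula (15.1) becomes quite explicit in the scalar case of Example 14.6. The following is a sketch under stated assumptions, not taken from the notes: with $n = 1$, $J = -i$, $W = 1$, $c = 0$ and $F(x, \lambda) = e^{i\lambda x}$, and reading the last factor of (15.1) as the adjoint of $F(y, \bar\lambda)$, one gets $G(x, y, \lambda) = e^{i\lambda(x - y)}(M(\lambda) \pm \tfrac{i}{2})$. Requiring periodicity in $x$ then gives $M(\lambda) = -\tfrac12 \cot(\pi\lambda)$; this value was worked out for the illustration. The code checks numerically that integrating against this kernel maps $e^{ikx}$ to $e^{ikx}/(k - \lambda)$, as the resolvent of an operator with eigenvalue $k$ must. All function names are hypothetical.<br />

```python
import cmath
import math


def m_function(lam):
    """M(lambda) = -cot(pi*lambda)/2 for the periodic problem (assumed, see lead-in)."""
    return -0.5 * cmath.cos(math.pi * lam) / cmath.sin(math.pi * lam)


def green(x, y, lam):
    """Scalar instance of (15.1) with F(x, lam) = exp(i*lam*x) and J^{-1} = i."""
    sign = 1.0 if x > y else -1.0
    return cmath.exp(1j * lam * (x - y)) * (m_function(lam) + sign * 0.5j)


def midpoint(f, a, b, n):
    """Midpoint rule for a complex-valued integrand on [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (j + 0.5) * h) for j in range(n))


def resolvent(u, x, lam, n=4000):
    """Integral of G(x, y, lam) u(y) over (-pi, pi), split at the jump y = x."""
    def integrand(y):
        return green(x, y, lam) * u(y)
    return midpoint(integrand, -math.pi, x, n) + midpoint(integrand, x, math.pi, n)


if __name__ == "__main__":
    lam, k, x = 0.5 + 1.0j, 2, 0.3
    value = resolvent(lambda y: cmath.exp(1j * k * y), x, lam)
    # The two printed numbers should agree to several digits.
    print(value, cmath.exp(1j * k * x) / (k - lam))
```

The same kernel also satisfies the symmetry in property (4): the complex conjugate of `green(x, y, lam)` equals `green(y, x, lam.conjugate())`.<br />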

Before we proceed, we note the following corollary, which completes<br />
our results for the case of a discrete spectrum.<br />

Corollary 15.3. Suppose for some non-real $\lambda$ that all solutions<br />
of $Ju' + Qu = \lambda Wu$ and $Ju' + Qu = \bar\lambda Wu$ are in $L^2_W$. Then for any<br />
selfadjoint realization $T$ the resolvent of $T$ is compact.<br />

In other words, if the deficiency indices are maximal, then the resolvent<br />
is compact. Actually, the assumptions are here a bit stronger<br />
than needed. In fact, it is not difficult to show (Exercise 15.1) that if<br />
all solutions are in $L^2_W$ for some $\lambda$, real or not, then the same is true<br />
for all $\lambda$.<br />

Proof. One could use a version of Theorem 8.7 valid for $L^2_W$<br />
and show that $R_\lambda$ is a Hilbert-Schmidt operator. Here is an alternative<br />
proof. Suppose $u_j \rightharpoonup 0$ weakly in $L^2_W$ and let $I = (a, b)$.<br />
Then $\int_a^x F^*(y, \bar\lambda) W(y) u_j(y)\,dy$ and $\int_x^b F^*(y, \bar\lambda) W(y) u_j(y)\,dy$ are both<br />
bounded uniformly with respect to $x$ by Cauchy-Schwarz, since the<br />
columns of $F(\cdot, \bar\lambda)$ are in $L^2_W$. The latter fact also shows that the integrals<br />
tend pointwise to 0 as $j \to \infty$. Since also the columns of $F(\cdot, \lambda)$<br />
are in $L^2_W$ it follows that $R_\lambda u_j \to 0$ strongly in $L^2_W$ by dominated<br />
convergence.<br />



We will give an expansion theorem generalizing the Fourier series<br />
expansion obtained for a discrete spectrum. The first step is the following<br />
lemma.<br />

Lemma 15.4. Let $M(\lambda)$ be as in Theorem 15.2. Then there is a<br />
unique increasing and left-continuous matrix-valued function $P$ with<br />
$P(0) = 0$ and unique Hermitian matrices $A$ and $B \ge 0$ such that<br />
\[ M(\lambda) = A + B\lambda + \int_{-\infty}^{\infty} \Bigl(\frac{1}{t - \lambda} - \frac{t}{t^2 + 1}\Bigr)\,dP(t). \tag{15.3} \]<br />

Proof. If $S = F(c, \lambda)$ Theorem 15.2.(5) gives<br />
\[ S(M(\lambda) - M(\mu))S^* = (\lambda - \mu) R_\lambda G^*(c, \cdot, \bar\mu)(c), \]<br />
where the constant matrix $S$ is invertible. Thus $M(\lambda)$ is analytic in<br />
$\rho(T)$, since the resolvent $R_\lambda : L^2_W \to C(K)$ is. Furthermore, for $\mu = \bar\lambda$<br />
non-real we obtain<br />
\[ \frac{1}{2i\,\mathrm{Im}\,\lambda}(M(\lambda) - M^*(\lambda)) = \frac{1}{2i\,\mathrm{Im}\,\lambda}(M(\lambda) - M(\bar\lambda)) = S^{-1}\langle G^*(c, \cdot, \lambda), G^*(c, \cdot, \lambda)\rangle_W (S^{-1})^* \ge 0. \]<br />
Thus $M$ is a ‘matrix-valued Nevanlinna function’. We now obtain<br />
the representation (15.3) by applying Theorem 6.1 to the Nevanlinna<br />
function $m(\lambda, u) = u^* M(\lambda) u$ where $u$ is an $n \times 1$-matrix. Clearly the<br />
quantities $\alpha$, $\beta$ and $\rho$ in the representation (6.1) are Hermitian forms<br />
in $u$, so (15.3) follows.<br />
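A scalar example may help to see what (15.3) says. By the classical partial-fraction expansion of the cotangent, the function $m(\lambda) = -\tfrac12\cot(\pi\lambda)$ has the form (15.3) with $A = B = 0$ and $P$ a step function with jumps $1/(2\pi)$ at the integers. One would expect this to be the $M$-function of a problem with eigenvalues at the integers, such as Example 14.6, but that identification is an assumption of this sketch and is not made in the text. The code compares the closed form with a symmetric truncation of the integral in (15.3) and evaluates the quotient $\mathrm{Im}\,m(\lambda)/\mathrm{Im}\,\lambda$, which should be non-negative for a Nevanlinna function. Names and the cutoff are illustrative.<br />

```python
import cmath
import math


def m_closed(lam):
    """Closed form m(lambda) = -cot(pi*lambda)/2."""
    return -0.5 * cmath.cos(math.pi * lam) / cmath.sin(math.pi * lam)


def m_series(lam, cutoff=20000):
    """Truncation of (15.3) with A = B = 0 and jumps 1/(2*pi) of P at the integers."""
    total = 0j
    for t in range(-cutoff, cutoff + 1):
        total += 1.0 / (t - lam) - t / (t * t + 1.0)
    return total / (2 * math.pi)


def nevanlinna_quotient(m, lam):
    """Im m(lambda) / Im lambda, non-negative for a Nevanlinna function."""
    return m(lam).imag / lam.imag


if __name__ == "__main__":
    for lam in (0.3 + 0.7j, -1.2 + 0.1j, 2.5 - 0.4j):
        print(lam, m_closed(lam), m_series(lam), nevanlinna_quotient(m_closed, lam))
```

The truncated sum converges slowly (the error is of order $|\lambda|$ divided by the cutoff), so the comparison is meaningful only to a few digits.<br />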

The function $P$ is called the spectral matrix for $T$. We now define<br />
the Hilbert space $L^2_P$ in the following way. We consider $n \times 1$ matrix-valued<br />
Borel functions $\hat u$, so that they are measurable with respect to<br />
all elements of $dP$, and for which the integral $\int_{-\infty}^{\infty} \hat u^*(t)\,dP(t)\,\hat u(t) < \infty$.<br />
The elements of $L^2_P$ are equivalence classes of such functions, two<br />
functions $u$, $v$ being equivalent if they are equal a.e. with respect to<br />
$dP$, i.e., if $dP\,(u - v)$ has all elements equal to the zero measure. We<br />
denote the scalar product in this space by $\langle\cdot, \cdot\rangle_P$ and the norm by<br />
$\|\cdot\|_P$. Note that one may write the scalar product in a somewhat more<br />
familiar way by using the Radon-Nikodym theorem to find a measure<br />
$d\mu$ with respect to which all the entries in $dP$ are absolutely continuous;<br />
one may for example let $d\mu$ be the sum of all diagonal elements in<br />
$dP$. One then has $dP = \Omega\,d\mu$, where $\Omega$ is a non-negative matrix of<br />
functions locally integrable with respect to $d\mu$, and the scalar product<br />
is $\langle \hat u, \hat v\rangle_P = \int_{-\infty}^{\infty} \hat v^* \Omega \hat u\,d\mu$. Alternatively, we define $L^2_P$ as the completion<br />
of compactly supported, continuous $n \times 1$ matrix-valued functions with<br />
respect to the norm $\|\cdot\|_P$. These alternative definitions give the same<br />
space (Exercise 15.2). The main result of this chapter is the following.<br />



Theorem 15.5.<br />
(1) The integral $\int_K F^*(y, t) W(y) u(y)\,dy$ converges in $L^2_P$ for $u \in L^2_W$<br />
as $K \to I$ through compact subintervals of $I$. The limit<br />
is called the generalized Fourier transform of $u$ and is<br />
denoted by $\mathcal{F}(u)$ or $\hat u$. We write this as $\hat u(t) = \langle u, F(\cdot, t)\rangle_W$,<br />
although the integral may not converge pointwise.<br />
(2) The mapping $u \mapsto \hat u$ has kernel $H_\infty$ and is unitary between $H_T$<br />
and $L^2_P$ so that the Parseval formula $\langle u, v\rangle_W = \langle \hat u, \hat v\rangle_P$ holds<br />
if $u, v \in L^2_W$ and at least one of them is in $H_T$.<br />
(3) The integral $\int_K F(x, t)\,dP(t)\,\hat u(t)$ converges in $H_T$ as $K \to \mathbb{R}$<br />
through compact intervals. If $\hat u = \mathcal{F}(u)$ the limit is $P_T u$, where<br />
$P_T$ is the orthogonal projection onto $H_T$. In particular the integral<br />
is the inverse of the generalized Fourier transform on<br />
$H_T$. Again, we write $u(x) = \langle \hat u, F^*(x, \cdot)\rangle_P$ for $u \in H_T$, although<br />
the integral may not converge pointwise.<br />
(4) Let $E_\Delta$ denote the spectral projector of $\tilde T$ for the interval $\Delta$.<br />
Then $E_\Delta u(x) = \int_\Delta F(x, t)\,dP(t)\,\hat u(t)$.<br />
(5) If $(u, v) \in T$ then $\mathcal{F}(v)(t) = t\hat u(t)$. Conversely, if $\hat u$ and $t\hat u(t)$<br />
are in $L^2_P$, then $\mathcal{F}^{-1}(\hat u) \in D(T)$.<br />
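For orientation, here is what the statements of the theorem reduce to in the scalar case of Example 14.6. This is a sketch under assumptions that are not stated in the text: the base point is taken as $c = 0$, so that $F(x, t) = e^{itx}$, and the spectral function $P$ is taken to be a step function with jumps $1/(2\pi)$ at the integers.<br />

```latex
% Generalized Fourier transform (W = 1, I = (-\pi, \pi)):
\hat u(t) = \int_{-\pi}^{\pi} e^{-ity}\, u(y)\, dy .

% Only the values at integers matter in L^2_P, and Parseval's formula reads
\int_{-\pi}^{\pi} |u(x)|^2\, dx
  = \int_{-\infty}^{\infty} |\hat u(t)|^2\, dP(t)
  = \frac{1}{2\pi} \sum_{k=-\infty}^{\infty} |\hat u(k)|^2 .

% Inverse transform:
u(x) = \int_{-\infty}^{\infty} e^{itx}\, dP(t)\, \hat u(t)
     = \frac{1}{2\pi} \sum_{k=-\infty}^{\infty} \hat u(k)\, e^{ikx} .
```

With $\hat u_k = \hat u(k)/(2\pi)$ these are the classical formulas of Example 14.6.<br />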

We will prove Theorem 15.5 through a sequence of lemmas. First<br />
note that for $u \in L^2_W$ with compact support, the function $\hat u(\lambda) = \langle u, F(\cdot, \bar\lambda)\rangle_W$<br />
is an entire, matrix-valued function of $\lambda$ since $F(x, \lambda)$,<br />
and thus also $F^*(x, \bar\lambda)$, is entire, locally uniformly in $x$, according to<br />
Theorem 15.1.<br />

Lemma 15.6. The function $\langle R_\lambda u, v\rangle_W - \hat v^*(\bar\lambda) M(\lambda) \hat u(\lambda)$ is entire<br />
for all $u, v \in L^2_W$ with compact supports.<br />

Proof. If the supports are inside $[a, b]$, direct calculation shows<br />
that the function is<br />
\[ \frac{1}{2} \int_a^b \Bigl(\int_a^x - \int_x^b\Bigr) v^*(x) W(x) F(x, \lambda) J^{-1} F^*(y, \bar\lambda) W(y) u(y)\,dy\,dx. \]<br />
This is obviously an entire function of $\lambda$.<br />

As usual we denote the spectral projectors belonging to $T$ (i.e.,<br />
those belonging to $\tilde T$) by $E_t$.<br />

Lemma 15.7. Let $u \in L^2_W$ have compact support and assume $a < b$<br />
to be points of differentiability for both $\langle E_t u, u\rangle$ and $P(t)$. Then<br />
\[ \langle E_b u, u\rangle - \langle E_a u, u\rangle = \int_a^b \hat u^*(t)\,dP(t)\,\hat u(t). \tag{15.4} \]<br />



Proof. Let $\Gamma$ be the positively oriented rectangle with corners in<br />
$a \pm i$, $b \pm i$. According to Lemma 15.6<br />
\[ \int_\Gamma \langle R_\lambda u, u\rangle\,d\lambda = \int_\Gamma \hat u^*(\bar\lambda) M(\lambda) \hat u(\lambda)\,d\lambda \]<br />
if either of these integrals exists. However, by Lemma 15.4,<br />
\[ \int_\Gamma \hat u^*(\bar\lambda) M(\lambda) \hat u(\lambda)\,d\lambda = \int_\Gamma \hat u^*(\bar\lambda) \int_{-\infty}^{\infty} \Bigl(\frac{1}{t - \lambda} - \frac{t}{t^2 + 1}\Bigr)\,dP(t)\,\hat u(\lambda)\,d\lambda. \]<br />
The double integral is absolutely convergent except perhaps where $t = \lambda$.<br />
The difficulty is thus caused by<br />
\[ \int_{-1}^{1} ds \int_{\mu - 1}^{\mu + 1} \frac{\hat u^*(\mu - is)\,dP(t)\,\hat u(\mu + is)}{t - \mu - is} \]<br />
for $\mu = a, b$. However, Lemma 11.10 ensures the absolute convergence<br />
of these integrals. Changing the order of integration gives<br />
\[ \int_\Gamma \hat u^*(\bar\lambda) M(\lambda) \hat u(\lambda)\,d\lambda = \int_{-\infty}^{\infty} \int_\Gamma \hat u^*(\bar\lambda)\,dP(t)\,\hat u(\lambda) \Bigl(\frac{1}{t - \lambda} - \frac{t}{t^2 + 1}\Bigr)\,d\lambda = -2\pi i \int_a^b \hat u^*(t)\,dP(t)\,\hat u(t) \]<br />
since for $a < t < b$ the residue of the inner integral is $-\hat u^*(t)\,dP(t)\,\hat u(t)$<br />
whereas $t = a, b$ do not carry any mass and the inner integrand is<br />
regular for $t < a$ and $t > b$.<br />
Similarly we have<br />
\[ \int_\Gamma \langle R_\lambda u, u\rangle\,d\lambda = \int_{-\infty}^{\infty} d\langle E_t u, u\rangle \int_\Gamma \frac{d\lambda}{t - \lambda} = -2\pi i \int_a^b d\langle E_t u, u\rangle \]<br />
which completes the proof.<br />
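The residue computation used twice in this proof can be checked numerically in the scalar case. The sketch below, with illustrative names and a simple midpoint quadrature that are not part of the notes, integrates the kernel $\frac{1}{t - \lambda} - \frac{t}{t^2 + 1}$ of (15.3) over the positively oriented rectangle with corners $a \pm i$, $b \pm i$. The result should be close to $-2\pi i$ for $a < t < b$ and close to 0 for $t$ outside $[a, b]$.<br />

```python
import math


def kernel(t, lam):
    """Integrand from (15.3): 1/(t - lambda) - t/(t^2 + 1)."""
    return 1.0 / (t - lam) - t / (t * t + 1.0)


def segment_integral(f, z0, z1, n=4000):
    """Midpoint rule for the line integral of f along the segment from z0 to z1."""
    dz = (z1 - z0) / n
    return dz * sum(f(z0 + (j + 0.5) * dz) for j in range(n))


def rectangle_integral(t, a, b, n=4000):
    """Integral of kernel(t, .) over the counterclockwise rectangle with corners a-i, b-i, b+i, a+i."""
    corners = [a - 1j, b - 1j, b + 1j, a + 1j, a - 1j]
    return sum(
        segment_integral(lambda lam: kernel(t, lam), corners[i], corners[i + 1], n)
        for i in range(4)
    )


if __name__ == "__main__":
    print(rectangle_integral(0.5, 0.0, 2.0), -2j * math.pi)  # t inside (a, b)
    print(rectangle_integral(3.0, 0.0, 2.0))                 # t outside [a, b]
```

The term $t/(t^2 + 1)$ does not depend on $\lambda$ and contributes nothing to the closed contour integral; only the simple pole of $1/(t - \lambda)$ at $\lambda = t$ matters.<br />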

Lemma 15.8. If $u \in L^2_W$ the generalized Fourier transform $\hat u \in L^2_P$<br />
exists as the $L^2_P$-limit of $\int_K F^*(y, t) W(y) u(y)\,dy$ as $K \to I$ through<br />
compact subintervals of $I$. Furthermore,<br />
\[ \langle E_t u, v\rangle_W = \int_{-\infty}^{t} \hat v^*(t)\,dP(t)\,\hat u(t). \]<br />
In particular, $\langle P_T u, v\rangle_W = \langle \hat u, \hat v\rangle_P$ if $u$ and $v \in L^2_W$.<br />



Proof. If $u$ has compact support Lemma 15.7 shows that (15.4)<br />
holds for a dense set of values $a$, $b$ since functions of bounded variation<br />
are a.e. differentiable. Since both $E_t$ and $P$ are left-continuous we<br />
obtain, by letting $b \uparrow t$, $a \to -\infty$ through such values,<br />
\[ \langle E_t u, v\rangle = \int_{-\infty}^{t} \hat v^*(t)\,dP(t)\,\hat u(t) \]<br />
when $u$, $v$ have compact supports; first for $u = v$ and then in general<br />
by polarization. If $P_T$ is the projection of $L^2_W$ onto $H_T$ we obtain as<br />
$t \to \infty$ also that $\langle P_T u, v\rangle_W = \langle \hat u, \hat v\rangle_P$ when $u$ and $v$ have compact<br />
supports.<br />

For arbitrary $u \in L^2_W$ we set, for a compact subinterval $K$ of $I$,<br />
\[ u_K(x) = \begin{cases} u(x) & \text{for } x \in K \\ 0 & \text{otherwise} \end{cases} \]<br />
and obtain a transform $\hat u_K$. If $L$ is another compact subinterval of $I$ it<br />
follows that $\|\hat u_K - \hat u_L\|_P = \|P_T(u_K - u_L)\|_W \le \|u_K - u_L\|_W$, and since<br />
$u_K \to u$ in $L^2_W$ as $K \to I$, Cauchy’s convergence principle shows that<br />
$\hat u_K$ converges to an element $\hat u \in L^2_P$ as $K \to I$. The lemma now follows<br />
in full generality by continuity.<br />

Note that we have proved that $\mathcal{F}$ is an isometry on $H_T$, and a<br />
partial isometry on $L^2_W$.<br />

Lemma 15.9. The integral $\int_K F(x, t)\,dP(t)\,\hat u(t)$ is in $H_T$ if $K$ is a<br />
compact interval and $\hat u \in L^2_P$, and as $K \to \mathbb{R}$ the integral converges in<br />
$H_T$. The limit $\mathcal{F}^{-1}(\hat u)$ is called the inverse transform of $\hat u$. If $u \in L^2_W$<br />
then $\mathcal{F}^{-1}(\mathcal{F}(u)) = P_T u$. $\mathcal{F}^{-1}(\hat u) = 0$ if and only if $\hat u$ is orthogonal in<br />
$L^2_P$ to all generalized Fourier transforms.<br />

Proof. If $\hat u \in L^2_P$ has compact support, then $u(x) = \langle \hat u, F^*(x,\cdot)\rangle_P$ is continuous, so $u_K \in L^2_W$ for compact subintervals $K$ of $I$, and has a transform $\hat u_K$. We have

\[ \|u_K\|_W^2 = \int_K u^*(x)W(x)\Bigl(\int_{-\infty}^{\infty} F(x,t)\,dP(t)\,\hat u(t)\Bigr)dx. \]

Considered as a double integral this is absolutely convergent, so changing the order of integration we obtain

\[ \|u_K\|_W^2 = \int_{-\infty}^{\infty}\Bigl(\int_K F^*(x,t)W(x)u(x)\,dx\Bigr)^{*} dP(t)\,\hat u(t) = \langle \hat u, \hat u_K\rangle_P \le \|\hat u\|_P \|\hat u_K\|_P \le \|\hat u\|_P \|u_K\|_W, \]

according to Lemma 15.8. Hence $\|u_K\|_W \le \|\hat u\|_P$, so $u \in L^2_W$, and $\|u\|_W \le \|\hat u\|_P$. If now $\hat u \in L^2_P$ is arbitrary, this inequality shows (like in the proof of Lemma 15.8) that $\int_K F(x,t)\,dP(t)\,\hat u(t)$ converges in $L^2_W$ as $K \to \mathbb R$ through compact intervals; call the limit $u_1$. If $v \in L^2_W$, $\hat v$ is its generalized Fourier transform, $K$ is a compact interval, and $L$ a compact subinterval of $I$, we have

\[ \int_K \Bigl(\int_L F^*(x,t)W(x)v(x)\,dx\Bigr)^{*} dP(t)\,\hat u(t) = \int_L v^*(x)W(x)\int_K F(x,t)\,dP(t)\,\hat u(t)\,dx \]

by absolute convergence. Letting $L \to I$ and $K \to \mathbb R$ we obtain $\langle \hat u, \hat v\rangle_P = \langle u_1, v\rangle_W$. If $\hat u$ is the transform of $u$, then by Lemma 15.8 $u_1 - u$ is orthogonal to $H_T$, so $u_1 = P_T u$. Similarly, $u_1 = 0$ precisely if $\hat u$ is orthogonal to all transforms. □

We have shown the inverse transform to be the adjoint of the transform as an operator from $L^2_W$ into $L^2_P$. The basic remaining difficulty is to prove that the transform is surjective, i.e., according to Lemma 15.9, that the inverse transform is injective. The following lemma will enable us to prove this.

Lemma 15.10. The transform of $R_\lambda u$ is $\hat u(t)/(t - \lambda)$.

Proof. By Lemma 15.8, $\langle E_t u, v\rangle_W = \int_{-\infty}^{t} \hat v^*\,dP\,\hat u$, so that

\[ \langle R_\lambda u, v\rangle_W = \int_{-\infty}^{\infty} \frac{d\langle E_t u, v\rangle}{t - \lambda} = \int_{-\infty}^{\infty} \frac{\hat v^*(t)\,dP(t)\,\hat u(t)}{t - \lambda} = \langle \hat u(t)/(t - \lambda), \hat v(t)\rangle_P. \]

By properties of the resolvent

\[ \|R_\lambda u\|^2 = \frac{1}{2i \operatorname{Im}\lambda}\langle R_\lambda u - R_{\bar\lambda} u, u\rangle_W = \int_{-\infty}^{\infty} \frac{d\langle E_t u, u\rangle_W}{|t - \lambda|^2} = \|\hat u(t)/(t - \lambda)\|_P^2. \]

Setting $v = R_\lambda u$ and using Lemma 15.8, it therefore follows that $\|\hat u(t)/(t - \lambda)\|_P^2 = \langle \hat u(t)/(t - \lambda), \mathcal F(R_\lambda u)\rangle_P = \|\mathcal F(R_\lambda u)\|_P^2$. It follows that we have $\|\hat u(t)/(t - \lambda) - \mathcal F(R_\lambda u)\|_P = 0$, which was to be proved. □

Lemma 15.11. The generalized Fourier transform is unitary from $H_T$ to $L^2_P$ and the inverse transform is the inverse of this map.



Proof. According to Lemma 15.9 we need only show that if $\hat u \in L^2_P$ has inverse transform 0, then $\hat u = 0$. Now, according to Lemma 15.10, $\mathcal F(v)(t)/(t - \lambda)$ is a transform for all $v \in L^2_W$ and non-real $\lambda$. Thus we have $\langle \hat u(t)/(t - \lambda), \mathcal F(v)(t)\rangle_P = 0$ for all non-real $\lambda$ if $\hat u$ is orthogonal to all transforms. But we can view this scalar product as the Stieltjes transform of the measure $\int_{-\infty}^{t} \mathcal F(v)^*\,dP\,\hat u$, so applying the inversion formula Lemma 6.5 we have $\int_K \mathcal F(v)^*\,dP\,\hat u = 0$ for all compact intervals $K$, and all $v \in L^2_W$. Thus the cutoff of $\hat u$, which equals $\hat u$ in $K$ and 0 outside, is also orthogonal to all transforms, i.e., has inverse transform 0 according to Lemma 15.9. It follows that

\[ \int_K F(x,t)\,dP(t)\,\hat u(t) \]

is the zero-element of $L^2_W$ for any compact interval $K$. Now multiply this from the left with $F^*(x,s)W(x)$ and integrate with respect to $x$ over a large compact subinterval $L \subset I$. We obtain

\[ \int_K B(s,t)\,dP(t)\,\hat u(t) = 0 \quad \text{for every } s, \]

where $B(s,t) = \int_L F^*(x,s)W(x)F(x,t)\,dx$. Thus $B(s,t)\,dP(t)\,\hat u(t)$ is the zero measure for all $s$. By Assumption 13.7 the matrix $B(s,t)$ is invertible for $s = t$, so by continuity it is, given $s$, invertible for $t$ sufficiently close to $s$. Thus, varying $s$, it follows that $dP(t)\,\hat u(t)$ is the zero measure in a neighborhood of every point. But this means that $\hat u = 0$ as an element of $L^2_P$. □

Lemma 15.12. If $(u, v) \in T$, then $\hat v(t) = t\hat u(t)$. Conversely, if $\hat u$ and $t\hat u(t)$ are in $L^2_P$, then $\mathcal F^{-1}(\hat u) \in D(T)$.

Proof. We have $(u, v) \in T$ if and only if $u = R_\lambda(v - \lambda u)$, which holds if and only if $\hat u(t) = (\hat v(t) - \lambda\hat u(t))/(t - \lambda)$, i.e., $\hat v(t) = t\hat u(t)$, according to Lemmas 15.10 and 15.11. □

This completes the proof of Theorem 15.5. We also have the following analogue of Corollary 14.5.

Theorem 15.13. Suppose $u \in D(T)$. Then the inverse transform $\langle \hat u, F^*(x,\cdot)\rangle_P$ converges locally uniformly to $u(x)$.

Proof. The proof is very similar to that of Corollary 14.5. Put $v = (\tilde T - i)u$ so that $v \in H_T$ and $u = R_i v$. Let $K$ be a compact interval, and put $u_K(x) = \int_K F(x,t)\,dP(t)\,\hat u(t) = \mathcal F^{-1}(\chi\hat u)(x)$, where $\chi$ is the characteristic function for $K$. Define $v_K$ similarly. Then by Lemma 15.10

\[ R_i v_K = \mathcal F^{-1}\Bigl(\frac{\chi(t)\hat v(t)}{t - i}\Bigr) = \mathcal F^{-1}(\chi\hat u) = u_K. \]

Since $v_K \to v$ in $L^2_W$ as $K \to \mathbb R$, it follows from Theorem 14.2 that $u_K \to u$ in $C(L)$ as $K \to \mathbb R$, for any compact subinterval $L$ of $I$. □

Example 15.14. Let us interpret Theorem 15.5 for the case of the operator of Example 4.6, Green's function of which is given in Example 8.8. Comparing (8.2) with (15.1), we see that $M(\lambda) = i/2$ for $\lambda$ in the upper half plane. By Lemma 6.5 the corresponding spectral measure is $P(t) = \lim_{\varepsilon \to 0} \frac{1}{\pi}\int_0^t \operatorname{Im} M(\mu + i\varepsilon)\,d\mu = \frac{t}{2\pi}$. This means that if $f \in L^2(\mathbb R)$, then as $a, b \to \infty$ the integral $\int_{-a}^{b} f(x)\,e^{-ixt}\,dx$ converges in the sense of $L^2(\mathbb R)$ to a function $\hat f \in L^2(\mathbb R)$. Furthermore the integral $\frac{1}{2\pi}\int_{-a}^{b} \hat f(t)\,e^{ixt}\,dt$ converges in the same sense to $f$ as $a$ and $b \to \infty$. We also conclude that $\int_{-\infty}^{\infty} |f|^2 = \frac{1}{2\pi}\int_{-\infty}^{\infty} |\hat f|^2$. Finally, if $f$ is locally absolutely continuous and together with its derivative in $L^2(\mathbb R)$, then the transform of $-if'$ is $t\hat f(t)$ and conversely, if $\hat f$ and $t\hat f(t)$ are both in $L^2(\mathbb R)$, then the inverse transform of $\hat f$ is locally absolutely continuous, and its derivative is in $L^2(\mathbb R)$ and is the inverse transform of $it\hat f(t)$. We also get from Theorem 15.13 that if $f$ has these properties, then the inverse transform of $\hat f$ converges absolutely and locally uniformly to $f$. Actually, it is here easy to see that the convergence is uniform on the whole axis, but nevertheless it is clear that we have retrieved all the basic properties of the classical Fourier transform.
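The Parseval relation of Example 15.14 can be checked numerically. The following is only a sketch under stated assumptions: the test function $f(x) = e^{-x^2/2}$, the truncation interval $[-A, A]$, the grid size and the trapezoidal rule are all illustrative choices made here, not part of the text. For this $f$ both sides should equal $\sqrt{\pi}$.

```python
import cmath
import math

def trapezoid(values, h):
    """Composite trapezoidal rule for equally spaced samples."""
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))

def f(x):
    # Illustrative test function (an assumption): a Gaussian in L^2(R).
    return math.exp(-0.5 * x * x)

A, N = 10.0, 400                      # truncation [-A, A], number of subintervals
h = 2 * A / N
grid = [-A + k * h for k in range(N + 1)]

def f_hat(t):
    # \hat f(t) = \int f(x) e^{-ixt} dx, truncated to [-A, A]
    return trapezoid([f(x) * cmath.exp(-1j * x * t) for x in grid], h)

# Left side: \int |f|^2.  Right side: (1/2pi) \int |\hat f|^2.
norm_f_sq = trapezoid([abs(f(x)) ** 2 for x in grid], h)
norm_fhat_sq = trapezoid([abs(f_hat(t)) ** 2 for t in grid], h) / (2 * math.pi)

print(norm_f_sq, norm_fhat_sq, math.sqrt(math.pi))
```

The factor $1/(2\pi)$ on the transform side is exactly the density of the spectral measure $P(t) = t/(2\pi)$ found in the example.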

Exercises for Chapter 15<br />

Exercise 15.1. Use, e.g., estimates in the variation of constants formula Lemma 13.4 for $v = (\lambda - \mu)u$ to show that if all columns of $F(x, \mu)$ are in $L^2_W$, then so are those of $F(x, \lambda)$.

Exercise 15.2. Show that the two definitions of $L^2_P$ given in the text are equivalent. What needs to be proved is that any measurable $n \times 1$ matrix-valued function with finite norm can be approximated in norm by a similar function which is $C_0^\infty$.

Hint: Use a cut off and convolution with a $C_0^\infty$-function of small support.

Exercise 15.3. In Lemma 15.9 it is claimed that for every compact interval $K$ the integral $\int_K F(x,t)\,dP(t)\,\hat u(t) \in H_T$, but this is never proved; or is it? Clarify this point!

Exercise 15.4. Consider, as in the beginning of Chapter 10, the first order system corresponding to a general Sturm-Liouville equation

\[ -(pu')' + qu = \lambda wu \quad \text{on } [a, b), \]

where $1/p$, $q$ and $w$ are integrable on any interval $[a, x]$, $x \in (a, b)$. Also assume that $p$ and $q$ are real-valued functions and $w \ge 0$ and not a.e. equal to 0. Consider a selfadjoint realization given by separated boundary conditions (cf. Chapters 10 and 13). This will be a condition at $a$, and if the boundary form does not vanish at $b$, also a condition at $b$. Choose the point $c = 0$ and the fundamental matrix $F$ such that its first column satisfies the boundary condition at $a$. Show that

\[ M(\lambda) = \begin{pmatrix} m(\lambda) & \tfrac12 \\ \tfrac12 & 0 \end{pmatrix}, \]

where the Titchmarsh-Weyl function $m(\lambda)$ is a scalar-valued Nevanlinna function.

Now write $F = \begin{pmatrix} \varphi & \theta \\ -p\varphi' & -p\theta' \end{pmatrix}$. Show that there is a scalar Green's function for the operator given by

\[ g(x, y, \lambda) = \begin{cases} \varphi(x, \lambda)\psi(y, \lambda), & x < y, \\ \psi(x, \lambda)\varphi(y, \lambda), & y < x, \end{cases} \]

where $\psi(x, \lambda) = \theta(x, \lambda) + m(\lambda)\varphi(x, \lambda)$, with the property that the solution of $-(pu')' + qu = \lambda wu + wv$ which is in $L^2_w$ and satisfies the boundary conditions is given by $u(x) = R_\lambda v(x) = \int_0^\infty g(x, y, \lambda)v(y)\,dy$.

Show also that the spectral matrix $P = \begin{pmatrix} \rho & 0 \\ 0 & 0 \end{pmatrix}$, where the spectral function $\rho$ is the function in the representation (6.1) for the function $m(\lambda)$, and that

\[ \operatorname{Im} m(\lambda) = \operatorname{Im}\lambda \int_a^b |\psi(x, \lambda)|^2\,dx. \]

Finally show that the generalized Fourier transform of $\psi$ is always given by $\hat\psi(t, \lambda) = 1/(t - \lambda)$.

Thus the spectral theory for the general Sturm-Liouville equation has precisely the same basic features as for the simple case treated in Chapter 11.


APPENDIX A<br />

Functional analysis<br />

In this appendix we will give the proofs of some standard theorems from functional analysis. They are all valid in more general situations than stated here. As is usual, our proofs will be based upon the following important theorem. We have stated it for a Banach space, but the proof would be the same in any complete, metric space.

Theorem A.1 (Baire). Suppose $B$ is a Banach space and $F_1, F_2, \dots$ a sequence of closed subsets of $B$. If all $F_n$ fail to have interior points, so does $\bigcup_{n=1}^\infty F_n$. In particular, the union is a proper subset of $B$.

Proof. Let $B_0 = \{x \in B \mid \|x - x_0\| \le R_0\}$ be an arbitrary closed ball. We must show that it cannot be contained in $\bigcup_{n=1}^\infty F_n$. We do this by first selecting a decreasing sequence of closed balls $B_0 \supset B_1 \supset B_2 \supset \cdots$ such that the radii $R_n \to 0$ and $B_n \cap F_n = \emptyset$ for each $n$. But if we already have chosen $B_0, \dots, B_n$ we can find a point $x_{n+1} \in B_n$ (in the interior of $B_n$) which is not contained in $F_{n+1}$, since $F_{n+1}$ has no interior points. Since $F_{n+1}$ is closed we can choose a closed ball $B_{n+1} \subset B_n$, centered at $x_{n+1}$, and which does not intersect $F_{n+1}$. If we also make sure that the radius $R_{n+1}$ is at most half of the radius $R_n$ of $B_n$, it follows by induction that we may find a sequence of balls as required.

For $k > n$ we have $x_k \in B_n$ so that $\|x_k - x_n\| \le R_n \to 0$, so that $x_1, x_2, \dots$ is a Cauchy sequence, and thus converges to a limit $x$. We have $x \in B_n$ for every $n$ since $x_k \in B_n$ for $k > n$ and $B_n$ is closed. Thus $x$ is not contained in any $F_n$. $B_0$ being arbitrary, it follows that no ball is contained in $\bigcup_{n=1}^\infty F_n$, which therefore has no interior points, and the proof is complete. □

A set which is a subset of the union of countably many closed sets without interior points is said to be of the first category. More picturesquely such a set is said to be meager. Meager subsets of $\mathbb R^n$ have many properties in common with, or analogous to, sets of Lebesgue measure zero. There is no direct connection, however, since a meager set may have positive measure, and a set of measure zero does not have to be meager. A set which is not meager is said to be of the second category, or to be non-meager (how about fat?). The basic properties of meager sets are the following.


Proposition A.2. A subset of a meager set is meager, a countable union of meager sets is meager, and no meager set has an interior point.

Proof. The first two claims are left as exercises for the reader to verify; the third claim is Baire's theorem. □

The following theorem is one of the cornerstones of functional analysis.

Theorem A.3 (Banach). Suppose $B_1$ and $B_2$ are Banach spaces and $T : B_1 \to B_2$ a bounded, injective (one-to-one) linear map. If the range of $T$ is not meager, in particular if it is all of $B_2$, then $T$ has a bounded inverse, and the range is all of $B_2$.

Proof. We denote the norm in $B_j$ by $\|\cdot\|_j$. Let

\[ A_n = \{Tx \mid \|x\|_1 \le n\} \]

be the image of the closed ball with radius $n$, centered at 0 in $B_1$. The balls expand to all of $B_1$ as $n \to \infty$, so the range of $T$ is $\bigcup_{n=1}^\infty A_n \subset \bigcup_{n=1}^\infty \overline{A_n}$. The range not being meager, at least one $\overline{A_n}$ must have an interior point $y_0$. Thus we can find $r > 0$ so that $\{y_0 + y \mid \|y\|_2 < r\} \subset \overline{A_n}$. Since $\overline{A_n}$ is symmetric with respect to the origin, also $-y_0 + y \in \overline{A_n}$ if $\|y\|_2 < r$. Furthermore, $\overline{A_n}$ is convex, as the closure of (the linear image of) a convex set. It follows that $y = \frac12((y_0 + y) + (-y_0 + y)) \in \overline{A_n}$. Thus 0 is an interior point of $\overline{A_n}$. Since all $\overline{A_n}$ are similar ($\overline{A_n} = n\overline{A_1}$), 0 is also an interior point of $\overline{A_1}$. This means that there is a number $C > 0$, such that any $y \in B_2$ for which $\|y\|_2 \le C$ is in $\overline{A_1}$. For such $y$ we may therefore find $x \in B_1$ with $\|x\|_1 \le 1$, such that $Tx$ is arbitrarily close to $y$. For example, we may find $x \in B_1$ with $\|x\|_1 \le 1$ such that $\|y - Tx\|_2 \le \frac12 C$. For arbitrary non-zero $y \in B_2$ we set $\tilde y = \frac{C}{\|y\|_2}\,y$, and then have $\|\tilde y\|_2 = C$, so we can find $\tilde x$ with $\|\tilde x\|_1 \le 1$ and $\|\tilde y - T\tilde x\|_2 \le \frac12 C$. Setting $x = \frac{\|y\|_2}{C}\,\tilde x$ we obtain

\[ \|x\|_1 \le \tfrac{1}{C}\|y\|_2 \quad\text{and}\quad \|y - Tx\|_2 \le \tfrac12\|y\|_2. \tag{A.1} \]

Thus, to any $y \in B_2$ we may find $x \in B_1$ so that (A.1) holds (for $y = 0$, take $x = 0$).

We now construct two sequences $\{x_j\}_{j=0}^\infty$ and $\{y_j\}_{j=0}^\infty$, in $B_1$ respectively $B_2$, by first setting $y_0 = y$. If $y_n$ is already defined, we define $x_n$ and $y_{n+1}$ so that $\|x_n\|_1 \le \frac1C\|y_n\|_2$, $y_{n+1} = y_n - Tx_n$, and $\|y_{n+1}\|_2 \le \frac12\|y_n\|_2$. We obtain $\|y_n\|_2 \le 2^{-n}\|y\|_2$ and $\|x_n\|_1 \le \frac1C 2^{-n}\|y\|_2$ from this. Furthermore, $Tx_n = y_n - y_{n+1}$, so adding we obtain $T(\sum_{j=0}^n x_j) = y - y_{n+1} \to y$ as $n \to \infty$. But the series $\sum_{j=0}^\infty \|x_j\|_1$ converges, since it is dominated by $\frac1C\|y\|_2\sum_{j=0}^\infty 2^{-j} = \frac2C\|y\|_2$. Since $B_1$ is complete, the series $\sum_{j=0}^\infty x_j$ therefore converges to some $x \in B_1$ satisfying $\|x\|_1 \le \frac2C\|y\|_2$, and since $T$ is continuous we also obtain $Tx = y$. In other words, we can solve $Tx = y$ for any $y \in B_2$, so the inverse of $T$ is defined everywhere, and the inverse is bounded by $\frac2C$, so it is continuous. The proof is complete. □
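The successive approximation in the second half of the proof can be watched at work in a finite-dimensional toy case. The sketch below is an illustration only: the matrix `T`, the approximate inverse `S` and the use of the maximum norm on $\mathbb R^2$ are assumptions made here. They are chosen so that $x = Sy$ satisfies (A.1) with $1/C = 1/2$, since in the maximum norm the matrix $I - TS$ has norm $1/2$.

```python
def mat_vec(M, v):
    """Matrix-vector product for small dense matrices given as lists of rows."""
    return [sum(m * c for m, c in zip(row, v)) for row in M]

def sup_norm(v):
    return max(abs(c) for c in v)

# Hypothetical data: T is the map to invert, S an approximate solver with
# ||y - T S y|| <= ||y|| / 2 and ||S y|| <= ||y|| / 2 in the maximum norm.
T = [[2.0, 1.0], [1.0, 3.0]]
S = [[0.5, 0.0], [0.0, 1.0 / 3.0]]

def solve_by_successive_approximation(y, steps=60):
    x = [0.0] * len(y)
    residual = list(y)                                        # y_0 = y
    for _ in range(steps):
        x_n = mat_vec(S, residual)                            # x_n satisfies (A.1) for y_n
        x = [a + b for a, b in zip(x, x_n)]                   # partial sum x_0 + ... + x_n
        Tx_n = mat_vec(T, x_n)
        residual = [r - t for r, t in zip(residual, Tx_n)]    # y_{n+1} = y_n - T x_n
    return x, residual

x, residual = solve_by_successive_approximation([1.0, 1.0])
print(x, sup_norm(residual))
```

Each pass at least halves the residual, so the partial sums converge geometrically to the solution of $Tx = y$, exactly as the series $\sum x_j$ does in the proof.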

In these notes we do not actually use Banach's theorem, but the following simple corollary (which is actually equivalent to Banach's theorem). Recall that a linear map $T : B_1 \to B_2$ is called closed if the graph $\{(u, Tu) \mid u \in D(T)\}$ is a closed subset of $B_1 \oplus B_2$. Equivalently, $u_j \to u$ in $B_1$ and $Tu_j \to v$ in $B_2$ together imply that $u \in D(T)$ and $Tu = v$.

Corollary A.4 (Closed graph theorem). Suppose $T$ is a closed linear operator $T : B_1 \to B_2$, defined on all of $B_1$. Then $T$ is bounded.

Proof. The graph $\{(u, Tu) \mid u \in B_1\}$ is by assumption a Banach space with norm $\|(u, Tu)\| = \|u\|_1 + \|Tu\|_2$, where $\|\cdot\|_j$ denotes the norm of $B_j$. The map $(u, Tu) \mapsto u$ is linear, defined in this Banach space, with range equal to $B_1$, and it has norm $\le 1$. It is obviously injective, so by Banach's theorem the inverse is bounded, i.e., there is a constant $C$ so that $\|(u, Tu)\| \le C\|u\|_1$. Hence also $\|Tu\|_2 \le C\|u\|_1$, so that $T$ is bounded. □

In Chapter 3 we used the Banach-Steinhaus theorem, Theorem 3.10. Since no extra effort is involved, we prove the following slightly more general theorem.

Theorem A.5 (Banach-Steinhaus; uniform boundedness principle). Suppose $B$ is a Banach space, $L$ a normed linear space, and $M$ a subset of the set $\mathcal L(B, L)$ of all bounded, linear maps from $B$ into $L$. Suppose $M$ is pointwise bounded, i.e., for each $x \in B$ there exists a constant $C_x$ such that $\|Tx\|_L \le C_x$ for every $T \in M$. Then $M$ is uniformly bounded, i.e., there is a constant $C$ such that $\|Tx\|_L \le C\|x\|_B$ for all $x \in B$ and all $T \in M$.

Proof. Put $F_n = \{x \in B \mid \|Tx\|_L \le n \text{ for all } T \in M\}$. Then $F_n$ is closed, as the intersection of the closed sets which are inverse images of the closed interval $[0, n]$ under the continuous functions $B \ni x \mapsto \|Tx\|_L \in \mathbb R$, $T \in M$. The assumption means that $\bigcup_{n=1}^\infty F_n = B$. By Baire's theorem at least one $F_n$ must have an interior point. Since $F_n$ is convex (if $x, y \in F_n$ and $0 \le t \le 1$, then $\|tTx + (1 - t)Ty\|_L \le t\|Tx\|_L + (1 - t)\|Ty\|_L \le n$) and symmetric with respect to the origin it follows, like in the proof of Banach's theorem, that 0 is an interior point in $F_n$. Thus, for some $r > 0$ we have $\|Tx\|_L \le n$ for all $T \in M$, if $\|x\|_B \le r$. By homogeneity it follows that $\|Tx\|_L \le \frac{n}{r}\|x\|_B$ for all $T \in M$ and $x \in B$. □


APPENDIX B<br />

Stieltjes integrals

The Riemann-Stieltjes integral is a simple generalization of the (one-dimensional) Riemann integral. To define it, let $f$ and $g$ be two functions defined on the compact interval $[a, b]$. For every partition $\Delta = \{x_j\}_{j=0}^n$ of $[a, b]$, i.e., $a = x_0 < x_1 < \cdots < x_n = b$, we let the mesh of $\Delta$ be $|\Delta| = \max_k (x_k - x_{k-1})$. This is the length of the longest subinterval of $[a, b]$ in the partition. We also choose from each subinterval $[x_{k-1}, x_k]$ a point $\xi_k$ and form the sum

\[ s = \sum_{k=1}^{n} f(\xi_k)(g(x_k) - g(x_{k-1})). \]

Now suppose that $s$ tends to a limit as $|\Delta| \to 0$ independently of the partition $\Delta$ and choice of the points $\xi_k$. The exact meaning of this is the following: There exists a number $I$ such that for every $\varepsilon > 0$ there is a $\delta > 0$ such that $|s - I| < \varepsilon$ as soon as $|\Delta| < \delta$. In this case we say that the integrand $f$ is Riemann-Stieltjes integrable with respect to the integrator $g$ and that the corresponding integral equals $I$. We denote this integral by $\int_a^b f(x)\,dg(x)$ or simply $\int_a^b f\,dg$. The choice $g(x) = x$ gives us, of course, the ordinary Riemann integral.
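The defining sum is easy to evaluate numerically. The following sketch is an illustration only: the helper names, the uniform partition and the example $f(x) = x$, $g(x) = x^2$ on $[0, 1]$ are assumptions made here. For this example $\int_0^1 f\,dg = \int_0^1 2x^2\,dx = 2/3$, and the sums approach this value whichever points $\xi_k$ are chosen.

```python
def riemann_stieltjes_sum(f, g, partition, tags):
    """s = sum_k f(xi_k) * (g(x_k) - g(x_{k-1})) for the given partition and tags."""
    return sum(f(xi) * (g(x1) - g(x0))
               for xi, x0, x1 in zip(tags, partition, partition[1:]))

def uniform_partition(a, b, n):
    """Partition a = x_0 < ... < x_n = b with mesh (b - a) / n."""
    return [a + (b - a) * k / n for k in range(n + 1)]

f = lambda x: x          # integrand (illustrative choice)
g = lambda x: x * x      # integrator (illustrative choice, non-decreasing on [0, 1])

xs = uniform_partition(0.0, 1.0, 1000)
left_sum = riemann_stieltjes_sum(f, g, xs, xs[:-1])    # xi_k = x_{k-1}
right_sum = riemann_stieltjes_sum(f, g, xs, xs[1:])    # xi_k = x_k

print(left_sum, right_sum, 2.0 / 3.0)
```

Since $f$ and $g$ are both increasing here, the two tag choices bracket the integral, and the gap between them shrinks with the mesh.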

Proposition B.1. A function $f$ is integrable with respect to a function $g$ if and only if for every $\varepsilon > 0$ there exists a $\delta > 0$ such that for any two partitions $\Delta$ and $\Delta'$ and the corresponding sums $s$ and $s'$, we have $|s - s'| < \varepsilon$ as soon as $|\Delta|$ and $|\Delta'|$ are both $< \delta$.

This is of course a version of the Cauchy convergence principle. We leave the proof as an exercise (Exercise B.1). From the definition the following calculation rules follow immediately (Exercise B.2).

(1) $\int_a^b f_1\,dg + \int_a^b f_2\,dg = \int_a^b (f_1 + f_2)\,dg$,

(2) $C\int_a^b f\,dg = \int_a^b Cf\,dg$,

(3) $\int_a^b f\,dg_1 + \int_a^b f\,dg_2 = \int_a^b f\,d(g_1 + g_2)$,

(4) $C\int_a^b f\,dg = \int_a^b f\,d(Cg)$,

(5) $\int_a^b f\,dg = \int_a^d f\,dg + \int_d^b f\,dg$ for $a < d < b$,

where $f$, $f_1$, $f_2$, $g$, $g_1$ and $g_2$ are functions, $C$ a constant and the formulas should be interpreted to mean that if the integrals to the left of the equality sign exist, then so do the integrals to the right, and equality holds.

Proposition B.2 (Change of variables). Suppose that $h$ is continuous and increasing and $f$ is integrable with respect to $g$ over $[h(a), h(b)]$. Then the composite function $f \circ h$ is integrable with respect to $g \circ h$ over $[a, b]$ and

\[ \int_{h(a)}^{h(b)} f\,dg = \int_a^b f \circ h\;d(g \circ h). \]

We leave the proof also of this proposition to the reader (Exercise B.3). The formula for integration by parts takes the following nicely symmetric form in the context of the Stieltjes integral.

Theorem B.3 (Integration by parts). If $f$ is integrable with respect to $g$, then $g$ is also integrable with respect to $f$ and

\[ \int_a^b g\,df = f(b)g(b) - f(a)g(a) - \int_a^b f\,dg. \]

Proof. Let $a = x_0 < x_1 < \cdots < x_n = b$ be a partition $\Delta$ of $[a, b]$ and suppose $x_{k-1} \le \xi_k \le x_k$, $k = 1, \dots, n$. Set $\xi_0 = a$, $\xi_{n+1} = b$. Then $a = \xi_0 \le \xi_1 \le \cdots \le \xi_{n+1} = b$ gives a partition $\Delta'$ (one discards any $\xi_{k+1}$ which is equal to $\xi_k$) of $[a, b]$ for which $|\Delta'| \le 2|\Delta|$ (check this!). We have $\xi_k \le x_k \le \xi_{k+1}$ and

\[ s = \sum_{k=1}^{n} g(\xi_k)(f(x_k) - f(x_{k-1})) = \sum_{k=1}^{n} g(\xi_k)f(x_k) - \sum_{k=0}^{n-1} g(\xi_{k+1})f(x_k) = f(b)g(b) - f(a)g(a) - \sum_{k=0}^{n} f(x_k)(g(\xi_{k+1}) - g(\xi_k)). \]

If $|\Delta| \to 0$ we have $|\Delta'| \to 0$, so the last sum converges to $\int_a^b f\,dg$ (note that if $\xi_{k+1} = \xi_k$ then the corresponding term in the sum is 0). It follows that $s$ converges to $f(b)g(b) - f(a)g(a) - \int_a^b f\,dg$ and the theorem follows. □
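The summation by parts in the proof of Theorem B.3 is an exact identity between two Riemann-Stieltjes sums, which can be checked directly. The sketch below is illustrative only: the functions $f = \sin$, $g = \exp$ on $[0, 1]$, the uniform partition and the midpoint tags are assumptions made here, and `rs_sum` is a hypothetical helper name.

```python
import math

def rs_sum(integrand, integrator, points, tags):
    """Riemann-Stieltjes sum: sum of integrand(tag) * (integrator(q) - integrator(p))."""
    return sum(integrand(t) * (integrator(q) - integrator(p))
               for t, p, q in zip(tags, points, points[1:]))

f, g = math.sin, math.exp            # illustrative smooth choices
a, b, n = 0.0, 1.0, 200

xs = [a + (b - a) * k / n for k in range(n + 1)]     # partition Delta: x_0, ..., x_n
xis = [(p + q) / 2 for p, q in zip(xs, xs[1:])]      # tags xi_1, ..., xi_n
xis_ext = [a] + xis + [b]                            # partition Delta': xi_0, ..., xi_{n+1}

s = rs_sum(g, f, xs, xis)               # sum for the integral of g df over Delta
s_prime = rs_sum(f, g, xis_ext, xs)     # sum for the integral of f dg over Delta', tags x_0..x_n
boundary = f(b) * g(b) - f(a) * g(a)

# Closed form of the integral of e^x cos x over [0, 1], for comparison.
exact = (math.e * (math.sin(1.0) + math.cos(1.0)) - 1.0) / 2.0
print(s, boundary - s_prime, exact)
```

The first two printed numbers agree up to rounding for any partition and any tags, which is the content of the displayed identity; only the comparison with `exact` depends on the mesh being small.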



Note that Theorem B.3 is a statement about the Riemann-Stieltjes integral; for more general (Lebesgue-Stieltjes) integrals it is not true without further assumptions about $f$ and $g$. The reason is that the Riemann-Stieltjes integrals cannot exist if $f$ and $g$ have discontinuities in common (Exercise B.4), whereas the Lebesgue-Stieltjes integrals exist as soon as $f$ and $g$ are, for example, both monotone. In such a case the integration by parts formula only holds under additional assumptions, for example if $f$ is continuous to the right and $g$ to the left in any common point of discontinuity, or if both $f$ and $g$ are normal, i.e., their values at points of discontinuity are the averages of the corresponding left and right hand limits.

So far we don't know that any function is integrable with respect to any other (except for $g(x) = x$ which is the case of the Riemann integral).

Theorem B.4. If $g$ is non-decreasing on $[a, b]$, then every continuous function $f$ is integrable with respect to $g$ and we have

\[ \Bigl|\int_a^b f\,dg\Bigr| \le \max_{[a,b]}|f|\,(g(b) - g(a)). \]

Proof. Let $\Delta'$ and $\Delta''$ be partitions $a = x'_0 < x'_1 < \cdots < x'_m = b$ and $a = x''_0 < x''_1 < \cdots < x''_n = b$ of $[a, b]$ and consider the corresponding Riemann-Stieltjes sums $s' = \sum_{k=1}^{m} f(\xi'_k)(g(x'_k) - g(x'_{k-1}))$ and $s'' = \sum_{k=1}^{n} f(\xi''_k)(g(x''_k) - g(x''_{k-1}))$. If we introduce the partition $\Delta = \Delta' \cup \Delta''$, supposing it to be $a = x_0 < x_1 < \cdots < x_p = b$, we can write

\[ s' - s'' = \sum_{j=1}^{p} (f(\xi'_{k_j}) - f(\xi''_{q_j}))(g(x_j) - g(x_{j-1})) \]

where $k_j = k$ for all $j$ for which $[x_{j-1}, x_j] \subset [x'_{k-1}, x'_k]$ and $q_j = k$ for all $j$ for which $[x_{j-1}, x_j] \subset [x''_{k-1}, x''_k]$ (check this carefully!). Thus, for all $j$, $\xi'_{k_j}$ and $x_j$ are in the same subinterval of the partition $\Delta'$, and $\xi''_{q_j}$ and $x_j$ in the same subinterval of the partition $\Delta''$. It follows that $|\xi'_{k_j} - \xi''_{q_j}| \le |\xi'_{k_j} - x_j| + |\xi''_{q_j} - x_j| \le |\Delta'| + |\Delta''|$ for all $j$. Since $f$ is uniformly continuous on $[a, b]$, this means that given $\varepsilon > 0$, then $|f(\xi'_{k_j}) - f(\xi''_{q_j})| \le \varepsilon$ if $|\Delta'|$ and $|\Delta''|$ are both small enough. It follows that $|s' - s''| \le \varepsilon\sum_{j=1}^{p} |g(x_j) - g(x_{j-1})| = \varepsilon(g(b) - g(a))$ for small enough $|\Delta'|$ and $|\Delta''|$. Thus $f$ is integrable with respect to $g$ according to Proposition B.1. We also have $|s'| \le \sum_{k=1}^{m} |f(\xi'_k)||g(x'_k) - g(x'_{k-1})| \le \max|f|\,(g(b) - g(a))$ so the proof is complete. □

As a generalization of Theorem B.4 we may of course take $g$ to be any function which is the difference of two non-decreasing functions. Such a function is called a function of bounded variation. We shall briefly discuss such functions; the main point is that they are characterized by having finite total variation.



Definition B.5. Let $f$ be a real-valued function defined on $[a, b]$. Then the total variation of $f$ over $[a, b]$ is

\[ V(f) = \sup_{\Delta} \sum_{k=1}^{n} |f(x_k) - f(x_{k-1})|, \tag{B.1} \]

the supremum taken over all partitions $\Delta = \{x_0, x_1, \dots, x_n\}$ of $[a, b]$. We have $0 \le V(f) \le +\infty$, and if $V(f)$ is finite, we say that $f$ has bounded variation on $[a, b]$.

When the interval considered is not obvious from the context, one may write the total variation of $f$ over $[a, b]$ as $V_a^b(f)$; another common notation is $\int_a^b |df|$. As we mentioned above, a function of bounded variation can also be characterized as a function which is the difference of two non-decreasing functions.
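For a sampled function the sum in (B.1) over a fixed partition gives a lower bound for $V(f)$, and splitting each increment into its positive and negative part produces two non-decreasing sequences whose difference recovers the samples. The sketch below is an illustration only: the function name, the choice $f = \sin$ on $[0, 2\pi]$ (whose total variation is 4) and the uniform partition are assumptions made here.

```python
import math

def variation_functions(values):
    """Discrete total, positive and negative variation along a fixed partition.

    `values` are samples f(x_0), ..., f(x_n); entry k of each returned list
    refers to the subinterval [x_0, x_k].
    """
    total, pos, neg = [0.0], [0.0], [0.0]
    for prev, cur in zip(values, values[1:]):
        d = cur - prev
        pos.append(pos[-1] + max(d, 0.0))     # accumulates the increases of f
        neg.append(neg[-1] + max(-d, 0.0))    # accumulates the decreases of f
        total.append(total[-1] + abs(d))      # the sum (B.1) up to x_k
    return total, pos, neg

n = 1000
xs = [2 * math.pi * k / n for k in range(n + 1)]
fs = [math.sin(x) for x in xs]
T, P, N = variation_functions(fs)

print(T[-1])    # lower bound for the total variation of sin over [0, 2*pi]
```

Here the partition contains the turning points $\pi/2$ and $3\pi/2$ (up to rounding), so the lower bound essentially attains the supremum; for a general $f$ one would have to refine the partition.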

Theorem B.6.

(1) The total variation $V_a^b(f)$ is an interval additive function, i.e., if $a < x < b$ we have $V_a^x(f) + V_x^b(f) = V_a^b(f)$.

(2) A function of bounded variation on an interval $[a, b]$ may be written as the difference of two non-decreasing functions. Conversely, any such difference is of bounded variation.

(3) If $f$ is of bounded variation on $[a, b]$, then there are non-decreasing functions $P$ and $N$, such that $f(x) = f(a) + P(x) - N(x)$, called the positive and negative variation functions of $f$ on $[a, b]$, with the following property: For any pair of non-decreasing functions $u$, $v$ for which $f = u - v$, we have $u(x) \ge u(a) + P(x)$ and $v(x) \ge v(a) + N(x)$ for $a \le x \le b$.

Proof. It is clear that if $a < x < b$ and $\Delta$, $\Delta'$ are partitions of $[a, x]$ respectively $[x, b]$, then $\Delta \cup \Delta'$ is a partition of $[a, b]$; the corresponding sum is therefore $\le V_a^b(f)$. Taking supremum over $\Delta$ and then $\Delta'$ it follows that $V_a^x(f) + V_x^b(f) \le V_a^b(f)$. On the other hand, in calculating $V_a^b(f)$, we may restrict ourselves to partitions $\Delta$ containing $x$, since adding new points can only increase the sum (B.1). If $\Delta = \{x_0, \dots, x_n\}$ and $x = x_p$ we have $\sum_{k=1}^{p} |f(x_k) - f(x_{k-1})| \le V_a^x(f)$ respectively $\sum_{k=p+1}^{n} |f(x_k) - f(x_{k-1})| \le V_x^b(f)$. Taking supremum over all $\Delta$ we obtain $V_a^b(f) \le V_a^x(f) + V_x^b(f)$. The interval additivity of the total variation follows.

Setting $T(x) = V_a^x(f)$, the function $T$ is finite in $[a, b]$; it is called the total variation function of $f$ over $[a, b]$. Since by interval additivity $T(y) - T(x) = V_x^y(f) \ge |f(y) - f(x)| \ge \pm(f(y) - f(x))$ if $a \le x \le y \le b$, it also follows that $T$ is non-decreasing, as are $P = \frac{1}{2}(T + f - f(a))$ and $N = \frac{1}{2}(T - f + f(a))$. But then $f = (f(a) + P) - N$ is a splitting of $f$ into a difference of non-decreasing functions. Note also that $T = P + N$. Conversely, if $u$ and $v$ are non-decreasing functions on $[a, b]$ and $\{x_0, \dots, x_n\}$ a partition of $[a, x]$, $a < x \le b$, then

\[ \sum_{k=1}^{n} |(u(x_k) - v(x_k)) - (u(x_{k-1}) - v(x_{k-1}))| \le \sum_{k=1}^{n} |u(x_k) - u(x_{k-1})| + \sum_{k=1}^{n} |v(x_k) - v(x_{k-1})| = u(x) - u(a) + v(x) - v(a), \]

so that $V_a^x(u - v) \le u(x) + v(x) - (u(a) + v(a))$. In particular, for $x = b$ this shows that $u - v$ is of bounded variation on $[a, b]$. The inequality also shows that if $f = u - v$, then

\[ P(x) = \tfrac{1}{2}(T(x) + f(x) - f(a)) \le \tfrac{1}{2}(u(x) - u(a) + v(x) - v(a) + f(x) - f(a)) = u(x) - u(a). \]

Similarly one shows that $N(x) \le v(x) - v(a)$ so that the proof is complete. □
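The formulas $P = \frac{1}{2}(T + f - f(a))$ and $N = \frac{1}{2}(T - f + f(a))$ from the proof can be tried out on sampled data. The sketch below is only an illustration: the function name `variation_functions` and the sample values are invented for the example, and it assumes that $f$ is monotone between consecutive sample points, so that the sums over the grid equal the true variation up to each grid point.

```python
def variation_functions(values):
    """Given samples f(x_0), ..., f(x_n), return lists T, P, N on the same grid.

    Assumption: f is monotone between consecutive sample points, so the sum
    over the grid equals the total variation of f up to each x_k.
    """
    f_a = values[0]
    T, P, N = [0.0], [0.0], [0.0]
    for prev, cur in zip(values, values[1:]):
        T.append(T[-1] + abs(cur - prev))       # total variation function
        P.append(0.5 * (T[-1] + cur - f_a))     # positive variation function
        N.append(0.5 * (T[-1] - cur + f_a))     # negative variation function
    return T, P, N

samples = [0.0, 1.0, 0.5, 2.0, 2.0, -1.0]
T, P, N = variation_functions(samples)
print(T, P, N)
```

On these samples $P$ accumulates only the upward steps and $N$ only the downward ones, and $f(x_k) = f(a) + P(x_k) - N(x_k)$ holds at every grid point, in line with part (3) of the theorem.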

We remark that a complex-valued function (of a real variable) is said to be of bounded variation if its real and imaginary parts are. If $T_r$ and $T_i$ are the total variation functions of the real and imaginary parts of $f$, then one defines the total variation function of $f$ to be $T = \sqrt{T_r^2 + T_i^2}$ (sometimes the definition $T = T_r + T_i$ is used). One may also use Definition B.5 for complex-valued functions, and then it is easily seen that $\sqrt{T_r^2 + T_i^2} \le T \le T_r + T_i$.

Since a monotone function can have only jump discontinuities, and at most countably many of them, functions of bounded variation can also have at most countably many discontinuities, all of them jump discontinuities. Moreover, it is easy to see that the positive and negative variation functions (and therefore the total variation function) are continuous wherever $f$ is (Exercise B.7).

Corollary B.7. If $g$ is of bounded variation on $[a, b]$, then every continuous function $f$ is integrable with respect to $g$ and we have

\[ \Bigl| \int_a^b f \, dg \Bigr| \le \max_{[a,b]} |f| \, V_a^b(g). \tag{B.2} \]

Proof. The integrability statement follows immediately from Theorem B.4 on writing $g$ as the difference of non-decreasing functions. To obtain the inequality, consider a Riemann-Stieltjes sum

\[ s = \sum_{k=1}^{n} f(\xi_k)(g(x_k) - g(x_{k-1})). \]

We obtain

\[ |s| \le \sum_{k=1}^{n} |f(\xi_k)||g(x_k) - g(x_{k-1})| \le \max_{[a,b]} |f| \sum_{k=1}^{n} |g(x_k) - g(x_{k-1})| \le \max_{[a,b]} |f| \, V_a^b(g). \]

Since this inequality holds for all Riemann-Stieltjes sums, it also holds for their limit, which is $\int_a^b f \, dg$. □
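The estimate in this proof can be observed numerically. The following Python sketch is an illustration under assumed data, not part of the notes: it takes $f = \cos$ and the integrator $g(x) = |x - 1|$ on $[0, 2]$, for which $V_0^2(g) = 2$, and forms a Riemann-Stieltjes sum with the tags chosen as left endpoints. The helper name `rs_sum` is hypothetical.

```python
import math

def rs_sum(f, g, partition):
    """Riemann-Stieltjes sum with tags xi_k chosen as left endpoints x_{k-1}."""
    return sum(f(x0) * (g(x1) - g(x0)) for x0, x1 in zip(partition, partition[1:]))

def g(x):
    # Example integrator (an assumption of this sketch): decreasing on [0, 1]
    # and increasing on [1, 2], so its total variation over [0, 2] is 2.
    return abs(x - 1.0)

n = 1000
partition = [2.0 * k / n for k in range(n + 1)]
s = rs_sum(math.cos, g, partition)

bound = 1.0 * 2.0                              # max |cos| on [0, 2] times V_0^2(g)
exact = math.sin(2.0) - 2.0 * math.sin(1.0)    # -int_0^1 cos + int_1^2 cos
print(s, exact, bound)
```

Every such sum stays within the bound (B.2), and for fine partitions it approaches the value of the integral, here $\sin 2 - 2 \sin 1$.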

In some cases a Stieltjes integral reduces to an ordinary Lebesgue integral.

Theorem B.8. Suppose $f$ is continuous and $g$ absolutely continuous on $[a, b]$. Then $fg' \in L^1(a, b)$ and $\int_a^b f \, dg = \int_a^b f(x) g'(x) \, dx$, where the second integral is a Lebesgue integral.

The proof of Theorem B.8 is left as an exercise (Exercise B.8).
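A quick numerical sanity check of the formula in Theorem B.8 is sketched below, with $f = \cos$ and $g(x) = x^2$ on $[0, 1]$ chosen purely for illustration. Since this $g$ is smooth, the Lebesgue integral is an ordinary Riemann integral and can be approximated by the midpoint rule. The helper names `rs_sum_midpoint` and `midpoint_integral` are assumptions of the sketch.

```python
import math

def rs_sum_midpoint(f, g, a, b, n):
    """Riemann-Stieltjes sum on a uniform partition, tagged at midpoints."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        x0, x1 = a + k * h, a + (k + 1) * h
        total += f(0.5 * (x0 + x1)) * (g(x1) - g(x0))
    return total

def midpoint_integral(F, a, b, n):
    """Midpoint-rule approximation of the ordinary integral of F over [a, b]."""
    h = (b - a) / n
    return h * sum(F(a + (k + 0.5) * h) for k in range(n))

f = math.cos
g = lambda x: x * x          # absolutely continuous example integrator
g_prime = lambda x: 2.0 * x  # its derivative

stieltjes = rs_sum_midpoint(f, g, 0.0, 1.0, 1000)
ordinary = midpoint_integral(lambda x: f(x) * g_prime(x), 0.0, 1.0, 1000)
exact = 2.0 * (math.sin(1.0) + math.cos(1.0) - 1.0)  # int_0^1 2x cos x dx
print(stieltjes, ordinary, exact)
```

Both approximations agree with the closed-form value to several digits. This is of course no substitute for the proof requested in Exercise B.8.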



Exercises for Appendix B

Exercise B.1. Prove Proposition B.1.

Exercise B.2. Prove the calculation rules (1)–(5).

Exercise B.3. Prove Proposition B.2.

Exercise B.4. Show that if $f$ and $g$ have a common point of discontinuity in $[a, b]$, then $f$ is not Riemann-Stieltjes integrable with respect to $g$ over $[a, b]$.

Exercise B.5. Show that if $f$ is absolutely continuous on $[a, b]$, then $f$ is of bounded variation on $[a, b]$, and $V_a^b(f) = \int_a^b |f'|$.

Hint: First show $V_a^b(f) \ge \int_a^b |f'|$. To show the other direction, write (B.1) in the form $\int_a^b \varphi f'$ for a step function $\varphi$ and use Hölder's inequality.

Exercise B.6. Show that the set of all functions of bounded variation on an interval $[a, b]$ is made into a normed linear space by setting $\|f\| = |f(a)| + V_a^b(f)$. Convergence in this norm is called convergence in variation. Show that convergence in variation implies uniform convergence, and that the normed space just introduced is complete (any Cauchy sequence of functions in the space converges in variation to a function of bounded variation).

Exercise B.7. Show that a monotone function can have at most countably many discontinuities, all of them jump discontinuities. Also show that if a function of bounded variation is continuous to the left (right) at a point, then so are its positive and negative variation functions, and that only if the function jumps up (down) will the positive (negative) variation function have a jump.

Hint: How many jumps of size $> 1/j$ can there be?

Exercise B.8. Prove Theorem B.8. Also show that if $g$ is absolutely continuous on $[a, b]$, then any Riemann integrable $f$ is integrable with respect to $g$ and the same formula holds.

Hint: $\sum f(\xi_k)(g(x_k) - g(x_{k-1})) = \int_a^b \varphi g'$ where $\varphi$ is a step function converging to $f$.

Exercise B.9. Suppose $f$, $g$ are continuous and $\rho$ of bounded variation in $(a, b)$. Put $\sigma(t) = \int_c^t f(s) \, d\rho(s)$ for some $c \in (a, b)$. Show that

\[ \int_a^b g(t) \, d\sigma(t) = \int_a^b g(t) f(t) \, d\rho(t). \]

Hint: Integrate both sides by parts, first replacing $(a, b)$ by an arbitrary compact subinterval.


APPENDIX C

Linear first order systems

In this appendix we will prove some standard results about linear first order systems of differential equations which are used in the text. We will prove no more than we actually need, although the theorems have easy generalizations to non-linear equations, more complicated parameter dependence, etc. The first result is the standard existence and uniqueness theorem, Theorem 13.1, which also implies Theorem 10.1.

Theorem. Suppose $A$ is an $n \times n$ matrix-valued function with locally integrable entries in an interval $I$, and that $B$ is an $n \times 1$ matrix-valued function, locally integrable in $I$. Assume further that $c \in I$ and $C$ is an $n \times 1$ matrix. Then the initial value problem

\[ \begin{cases} u' = Au + B & \text{in } I, \\ u(c) = C, \end{cases} \tag{C.1} \]

has a unique $n \times 1$ matrix-valued solution $u$ with locally absolutely continuous entries defined in $I$.

Corollaries 13.2 and 10.2 are immediate consequences of the theorem.

Corollary. Let $A$ and $I$ be as in the previous theorem. Then the set of solutions to $u' = Au$ in $I$ is an $n$-dimensional linear space.

Proof. It is clear that any linear combination of solutions is also a solution, so the set of solutions is a linear space. We must show that it has dimension $n$. Let $u_k$ solve the initial value problem with $u_k(c)$ equal to the $k$-th column of the $n \times n$ unit matrix. If $u$ is any solution of the equation, and the components of $u(c)$ are $x_1, \dots, x_n$, then the function $x_1 u_1 + \dots + x_n u_n$ is also a solution with the same initial data. It therefore coincides with $u$, and it is clear that no other linear combination of $u_1, \dots, u_n$ has the same initial data as $u$. It follows that $u_1, \dots, u_n$ is a basis for the space of solutions, which therefore is $n$-dimensional. □

Finally we shall prove Theorem 15.1.

Theorem. A solution $u(x, \lambda)$ of $Ju' + Qu = \lambda W u$ with initial data independent of $\lambda$ is an entire function of $\lambda$, locally uniformly with respect to $x$.

If we integrate the differential equation in (C.1) from $c$ to $x$, using the initial data, we get the integral equation

\[ u(x) = H(x) + \int_c^x Au, \tag{C.2} \]

where $H(x) = C + \int_c^x B$. Conversely, if $u$ is continuous and solves (C.2), then $u$ has initial data $H(c) = C$ and is locally absolutely continuous (being an integral function). Differentiation gives $u' = Au + B$, so that the initial value problem is equivalent to the integral equation (C.2). In the case of Theorem 13.1, we put $A = J^{-1}(Q - \lambda W)$ and $B = 0$ to get an equation of the form (C.1). We therefore need to show the following theorems.

Theorem C.1. Suppose $A$ has locally integrable, and $H$ locally absolutely continuous, elements. Then the integral equation (C.2) has a unique, locally absolutely continuous solution.

Theorem C.2. Suppose that $A$ depends analytically on a parameter $\lambda$, in the sense that there is a matrix $A'(x, \lambda)$ which is locally integrable with respect to $x$, and such that $\int_J \bigl| \frac{1}{h}(A(x, \lambda + h) - A(x, \lambda)) - A'(x, \lambda) \bigr| \to 0$ as $h \to 0$, for all compact subintervals $J$ of $I$, and all $\lambda$ in some open set $\Omega \subset \mathbb{C}$. Then the solution $u(x, \lambda)$ of (C.2) is analytic for $\lambda \in \Omega$, locally uniformly in $x$.

Proof of Theorem C.1. We will find a series expansion for the solution. To do this, we set $u_0 = H$, and if $u_k$ is already defined, we set $u_{k+1}(x) = \int_c^x A u_k$. It is then clear that $u_k$ is defined for $k = 0, 1, \dots$ inductively, and all $u_k$ are (absolutely) continuous. I claim that

\[ \sup_{[c,x]} |u_k| \le \sup_{[c,x]} |H| \, \frac{1}{k!} \Bigl( \int_c^x |A| \Bigr)^k \quad \text{for } k = 0, 1, \dots, \]

for $x > c$, and a similar inequality with $c$ and $x$ interchanged for $x < c$. Here $|\cdot|$ denotes a norm on $n$-vectors, and also the corresponding subordinate matrix norm (so that $|Au| \le |A||u|$). Indeed, the inequality is trivial for $k = 0$, and supposing it valid for $k$, we obtain

\[ |u_{k+1}(x)| \le \int_c^x |A||u_k| \le \frac{1}{k!} \sup_{[c,x]} |H| \int_c^x \Bigl( \int_c^t |A| \Bigr)^k |A(t)| \, dt = \frac{1}{(k+1)!} \sup_{[c,x]} |H| \Bigl( \int_c^x |A| \Bigr)^{k+1}, \]

for $c < x$, and a similar inequality for $x < c$. Since the right-hand side is non-decreasing in $x$, the same bound holds for $\sup_{[c,x]} |u_{k+1}|$. It follows that the series $u = \sum_{k=0}^{\infty} u_k$ is absolutely and uniformly convergent on any compact subinterval of $I$. Therefore $u$ is continuous, and

\[ u(x) = \sum_{k=0}^{\infty} u_k(x) = H(x) + \sum_{k=0}^{\infty} \int_c^x A u_k = H(x) + \int_c^x A \sum_{k=0}^{\infty} u_k = H(x) + \int_c^x Au. \]

Thus (C.2) has a solution. To prove the uniqueness, we need the following lemma.

Lemma C.3 (Gronwall). Suppose $f \in C(I)$ is real-valued, $h$ is a non-negative constant, and $g$ is a locally integrable and non-negative function. Suppose that $0 \le f(x) \le h + \bigl| \int_c^x g f \bigr|$ for $x \in I$. Then $f(x) \le h \exp\bigl( \bigl| \int_c^x g \bigr| \bigr)$ for $x \in I$.

The uniqueness of the solution of (C.2) follows directly from this. For suppose $v$ is the difference of two solutions. Then $v(x) = \int_c^x Av$, so setting $f = |v|$ and $g = |A|$ we obtain $0 \le f(x) \le \bigl| \int_c^x g f \bigr|$. Hence $f \equiv 0$ by Lemma C.3 (with $h = 0$), and thus $v \equiv 0$, so that (C.2) has at most one solution. □
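The series construction in the proof of Theorem C.1 can be imitated numerically. The sketch below is only an illustration under assumed data: it takes the constant matrix $A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, $H \equiv (1, 0)$ and $c = 0$, for which the solution of (C.2) is $(\cos x, -\sin x)$, and it replaces the integrals by a cumulative trapezoidal rule on a uniform grid. The function names and the grid parameters are hypothetical choices, not taken from the notes.

```python
import math

def matvec(M, v):
    """Product of a matrix (list of rows) with a vector (list)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def cumulative_trapezoid(ys, h):
    """Approximate x -> integral from c to x of the vector samples ys."""
    out = [[0.0] * len(ys[0])]
    for y0, y1 in zip(ys, ys[1:]):
        out.append([acc + 0.5 * h * (p + q) for acc, p, q in zip(out[-1], y0, y1)])
    return out

def picard_series(A, H, c, x_end, n_grid=2000, n_terms=25):
    """Sum u_0 + u_1 + ... + u_{n_terms} with u_0 = H, u_{k+1}(x) = int_c^x A u_k."""
    h = (x_end - c) / n_grid
    xs = [c + j * h for j in range(n_grid + 1)]
    term = [list(H(x)) for x in xs]                 # u_0 = H
    total = [list(v) for v in term]
    for _ in range(n_terms):
        integrand = [matvec(A(x), v) for x, v in zip(xs, term)]
        term = cumulative_trapezoid(integrand, h)   # next term u_{k+1}
        total = [[s + t for s, t in zip(sv, tv)] for sv, tv in zip(total, term)]
    return xs, total

A = lambda x: [[0.0, 1.0], [-1.0, 0.0]]   # example coefficient matrix (assumption)
H = lambda x: [1.0, 0.0]                  # constant H, i.e. B = 0 and C = (1, 0)
xs, u = picard_series(A, H, 0.0, 1.0)
print(u[-1], [math.cos(1.0), -math.sin(1.0)])
```

The factor $1/k!$ in the estimate of the proof explains why a modest number of terms suffices here; the remaining discrepancy comes from the quadrature, not from truncating the series.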

It remains to prove the lemma.

Proof of Lemma C.3. We will prove the lemma for $c < x$, leaving the other case as an exercise for the reader. Set $F(x) = h + \int_c^x g f$. Then $f \le F$ and $F' = gf$ so that $F' \le gF$. Multiplying by the integrating factor $\exp(-\int_c^x g)$ we get $\frac{d}{dx} \bigl( F(x) \exp(-\int_c^x g) \bigr) \le 0$ so that $F(x) \exp(-\int_c^x g)$ is non-increasing. Thus $F(x) \exp(-\int_c^x g) \le F(c) = h$ for $x \ge c$. We obtain $f(x) \le F(x) \le h \exp(\int_c^x g)$, which was to be proved. □
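A concrete instance may help to see what Lemma C.3 asserts. The data below are an illustrative choice, not taken from the notes: $I = [0, \infty)$, $c = 0$, $h = 1$, $g \equiv 1$ and $f(x) = 1 + x$. The first line checks the hypothesis, the second states the conclusion.

```latex
% Illustrative data (an assumption): I = [0, \infty), c = 0, h = 1, g \equiv 1, f(x) = 1 + x.
\begin{align*}
h + \int_0^x g f &= 1 + \int_0^x (1 + t)\,dt = 1 + x + \tfrac{1}{2}x^2 \;\ge\; 1 + x = f(x), \\
f(x) = 1 + x &\le h \exp\Bigl(\int_0^x g\Bigr) = e^x \qquad (x \ge 0).
\end{align*}
```

In this case the lemma reduces to the elementary inequality $1 + x \le e^x$ for $x \ge 0$.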

Proof of Theorem C.2. It is clear by their definitions that the functions $u_k$ in the proof of Theorem C.1 are analytic in $\Omega$ as functions of $\lambda$, locally uniformly in $x$ (this is a trivial induction). But the solution $u$ is the locally uniform limit, in $x$, $\lambda$, of the partial sums $\sum_{k=0}^{j} u_k$. Since uniform limits of analytic functions are analytic, we are done.


