
MAT2101
Applied Mathematics
Faculty of Sciences
Electronic Study Book

Written by
Tony Roberts, David Mander & Tim Passmore
Department of Mathematics & Computing
Faculty of Sciences
The University of Southern Queensland


Preface

This unit emphasises developing applications side by side with mathematical concepts and techniques. Please let us know of any errors in the study book as soon as you suspect them. This feedback will improve our unit year by year.

Some parts of the unit are in the mainstream, and other parts are included for a richer picture. As you read you will see that we have endeavoured to convey the importance of the various concepts and sections. For example, concepts and formulae in the "aims" and the Summaries are the most essential. In support of this, the reading you have been asked to do has been classified by requests to "study", "read" or "peruse", in order of decreasing importance.

For your convenience we have in places suggested specific problems that you should try, sending in your answers to us for feedback. These problems are a minimum that you should be able to do immediately. Our feedback will help you learn the more difficult aspects of the course. Ensure you make use of us. Send in your work by post, by fax or perhaps by e-mailing scanned work.

Associated with this study guide are Matlab scripts to enhance your ability to probe the problems and concepts and thus to improve learning.

As part of our commitment to the highest quality of teaching we also provide this study guide in electronic format. Note several aspects of the electronic form:

• the electronic form is displayed using Adobe's Acrobat Reader (with bookmarks);
• for electronic convenience the page size is different and so the page numbering is totally different to the printed version;
• clickable links allow rapid navigation around the electronic document to make it easier to connect widespread parts of the unit;
• and some links to outside material have also been encoded.

Information about mathematical figures in the history of the topics has been gleaned from various sources including
http://www-groups.dcs.st-and.ac.uk/~history/index.html

Reading 0.A Now read Chapters 1 and 2 in Kreyszig to refresh your memory of aspects of differential equations that were introduced in your previous mathematics.


Table of Contents

Preface ii

I Modelling dynamics with differential equations 1
1 Systems of differential equations 4
2 Scientists must write 43
3 Describing the conservation of material 84
4 The dynamics of momentum 113

II Structure, algebra and approximation of applied functions 126
5 The nature of infinite series 129
6 Series solutions of differential equations give special functions 185
7 Linear transforms and their eigenvectors on inner product spaces 252


Part I
Modelling dynamics with differential equations


Part contents

1 Systems of differential equations 4
  1.1 Systems of linear differential equations 7
  1.2 Qualitative solution of nonlinear, first-order systems of ode's 23
  1.3 Summary 41
2 Scientists must write 43
  2.1 Basics of mathematical writing 45
  2.2 LaTeX 49
3 Describing the conservation of material 84
  3.1 Eulerian description of motion 86
  3.2 Conservation of mass 96
  3.3 Car traffic 100
  3.4 Summary 112
4 The dynamics of momentum 113
  4.1 Conservation of momentum 115
  4.2 Dynamics of ideal gases 119
  4.3 Equations of quasi-one-dimensional blood flow 124
  4.4 Summary 125


Module 1
Systems of differential equations

In the 17th century Isaac Newton published his famous universal laws of motion which, in essence, showed how physical systems could be described by differential equations. The motions of planets, falling apples, billiard balls and flying arrows could all be described in terms of the forces acting to produce changes in motion.

During the last 25 years we have seen how scientists using these same laws of motion, and computers to solve complex systems of differential equations, have been able to navigate the Voyager spacecraft, with amazing precision, to rendezvous in space with Jupiter, Saturn and the outer planets of our solar system. Given the governing differential equations and a set of initial conditions, the future motion can be predicted.

In this module we use differential equations to model physical systems and describe and predict their behaviour under a variety of conditions.

Module contents

1.1 Systems of linear differential equations 7
  1.1.1 Case study: the motion of a mass on a spring 7
  1.1.2 Conversion of the order of differential equations 8
  1.1.3 The phase plane and phase portrait of the mass-spring system 11
  1.1.4 Trajectories in the phase plane of a linear system 14
  1.1.5 Classification and stability of fixed points 17
1.2 Qualitative solution of nonlinear, first-order systems of ode's 23
  1.2.1 Linearisation using the Jacobian 35
  1.2.2 Answers to selected Exercises 40
1.3 Summary 41

The text for this module is Chapter 3 in Kreyszig Advanced Engineering Mathematics, 8th ed, Wiley. References to the text use the format [K,reference].


Main aims:

• to write differential equations as a system of first-order differential equations;
• to classify general solutions near any fixed point or equilibrium;
• to predict the qualitative nature of solutions near fixed points or equilibria;
• to introduce the technique of linearisation;
• to patch together the pictures near each fixed point to obtain a global understanding of the solutions.


1.1 Systems of linear differential equations

You have solved some ordinary differential equations (ode's) in first year mathematics; these differential equations and their solutions are often used to describe the motion of some mechanical or otherwise evolving system. For example, the motion of a mass on a spring is discussed briefly next in §1.1.1. However, for many purposes it is much better to recast a differential equation as a system of first-order differential equations. For example, this is necessary to analyse "chaos."¹ In this section we lay the foundations for the analysis of systems of differential equations.

1.1.1 Case study: the motion of a mass on a spring

Kreyszig shows [K,pp158–9] that the motion of a mass attached to a spring, if there are no friction or damping forces, is governed by the single second-order ode
\[ my'' = -ky , \tag{1.1} \]
where y = y(t) is the displacement at time t of the mass from its rest position where the spring is unstretched, y' = dy/dt, and k and m are constants, m being the mass and k describing the 'stiffness' of the spring.

(This equation comes directly from Newton's Second Law, that applied force = mass × acceleration: the minus sign says that the force of the spring opposes the motion of the mass, and the acceleration is y''.)

¹The topic of chaos is explored in the fourth year course mat4102.


From first-year mathematics we know that this ode may be re-written as
\[ y'' + \frac{k}{m}\,y = 0 , \]
and its general solution is
\[ y(t) = A_1 \cos\sqrt{\frac{k}{m}}\,t + A_2 \sin\sqrt{\frac{k}{m}}\,t , \tag{1.2} \]
for constants A_1 and A_2. This solution describes an unending oscillation in time with constant angular frequency \(\omega = \sqrt{k/m}\).

1.1.2 Conversion of the order of differential equations

To illustrate the approach we take in general, the second-order differential equation (1.1) describing a spring is here re-written as a system of two first-order equations.

Introduce two new variables y_1 and y_2 and put
\[ y_1 = y , \quad\text{and}\quad y_2 = y' . \]
Then using (1.1) we also describe the motion of the spring by the first-order system
\[ y_1' = y_2 , \qquad y_2' = -\frac{k}{m}\,y_1 . \]


In matrix form, with \(\omega = \sqrt{k/m}\),
\[ \begin{bmatrix} y_1' \\ y_2' \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ -\omega^2 & 0 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} , \quad\text{or}\quad \mathbf{y}' = A\mathbf{y} , \tag{1.3} \]
where
\[ \mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} \quad\text{and}\quad A = \begin{bmatrix} 0 & 1 \\ -\omega^2 & 0 \end{bmatrix} . \]
Similarly, many higher order differential equations are reduced to first-order systems.

(It is convenient to use the angular frequency ω instead of k/m in the matrix formulation.)
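As an aside (the unit's own scripts are in Matlab; here is an equivalent illustrative sketch in Python with NumPy, using a sample value of ω), the matrix formulation is easy to probe numerically: the coefficient matrix A has the purely imaginary eigenvalue pair ±iω, which is why the mass oscillates forever.

```python
import numpy as np

omega = 2.0  # sample angular frequency, omega = sqrt(k/m); not a value from the text
A = np.array([[0.0, 1.0],
              [-omega**2, 0.0]])  # coefficient matrix of y' = A y in (1.3)

# eigenvalues of A: the purely imaginary pair +/- i*omega
eigenvalues = np.linalg.eigvals(A)
print(sorted(eigenvalues, key=lambda z: z.imag))
```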

Reading 1.A Kreyszig Chapter 3: read §3.0, then study §3.1 [K,p152–8] on modelling with systems of differential equations and their solutions.

Example 1.1: rewrite the following ode as a first-order system:
\[ y''' + 7y'' - 4y' + 8y = 0 . \]

Solution: Define new variables
\[ y_1 = y , \quad y_2 = y' \quad\text{and}\quad y_3 = y'' , \]


then
\[ y_1' = y_2 , \qquad y_2' = y_3 , \qquad y_3' = -8y_1 + 4y_2 - 7y_3 , \]
which is written in matrix form as
\[ \begin{bmatrix} y_1' \\ y_2' \\ y_3' \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ -8 & 4 & -7 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} . \]
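The matrix above is the companion matrix of the characteristic polynomial λ³ + 7λ² − 4λ + 8, so its eigenvalues are exactly the roots of that polynomial. A quick numerical cross-check (an illustrative Python/NumPy aside, not part of the unit materials):

```python
import numpy as np

# companion matrix from Example 1.1 for y''' + 7y'' - 4y' + 8y = 0
A = np.array([[0, 1, 0],
              [0, 0, 1],
              [-8, 4, -7]], dtype=float)

eig = np.sort_complex(np.linalg.eigvals(A))
# roots of lambda^3 + 7*lambda^2 - 4*lambda + 8
roots = np.sort_complex(np.roots([1, 7, -4, 8]))
print(np.allclose(eig, roots))  # the two computations agree
```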

Exercise 1.2: Convert the following to first-order systems:

(a) y''' + 12y'' − 5y' + 11y = 0;
(b) y'' + αy' + cy = 0, with α and c constants;
(c) y'''' + 7y'' = 9y.

Activity 1.B Do problems from §3.1 [K,p158] and Exercise 1.2 above. Send in to the examiner for feedback at least Q9.

Reading 1.C Read §3.2 [K,p159–161] for some background theory.


1.1.3 The phase plane and phase portrait of the mass-spring system

The point of writing the spring equation (1.1) as the two dimensional matrix system (1.3) is that we now have a 2-D description of the motion of the mass in terms of its position, y_1 = y, and its velocity, y_2 = y'. Plotting y_2 against y_1 gives a graph, known as a phase portrait, of the motion of the mass on the spring. At each point in the phase plane, illustrated by the little pictures in Figure 1.1, the mass-spring system has a specific combination of extension and velocity.

At each time, t, a single point on the phase plane is plotted corresponding to the position and velocity of the mass. Over time the system traverses a path in the phase plane, known as a trajectory or an orbit. For a given set of initial conditions a trajectory for the mass might appear as in Figure 1.1. There are two points, corresponding to the left/right extremes of the ellipse, where the velocity y_2 = 0 and the displacement y_1 is extreme, meaning that the mass is instantaneously at rest and the spring has reached maximum compression/extension. A moment later the mass has changed direction and is picking up speed; at the top/bottom extremes of the ellipse the mass is moving through y_1 = 0 where the spring is unstretched and the speed is maximal. At other times the velocity and displacement have values intermediate between these extremes.

Since there is no friction (an ideal case) the motion just keeps repeating itself indefinitely.


Figure 1.1: mass spring phase plane (horizontal axis y_1 = y, vertical axis y_2 = y') showing that at each point in the phase plane a little picture displays the unique state of the system quantified by its position y = y_1 and velocity y' = y_2. The green ellipse shows a possible orbit or trajectory of the mass-spring system, the path through the states, over time.
over time.


Example 1.3: We show all this by solving (1.3), which is a homogeneous linear system with constant coefficient matrix A. Kreyszig shows [K,p163, Theorem 1] that the general solution will be of the form
\[ \mathbf{y} = c_1 \mathbf{x}^{(1)} e^{\lambda_1 t} + c_2 \mathbf{x}^{(2)} e^{\lambda_2 t} \tag{1.4} \]
where c_j are arbitrary complex constants, λ_j are the eigenvalues of A and x^{(j)} the corresponding eigenvectors. The characteristic equation is
\[ \det(\lambda I - A) = \begin{vmatrix} \lambda & -1 \\ \omega^2 & \lambda \end{vmatrix} = \lambda^2 + \omega^2 = 0 \]
yielding λ_1 = iω and λ_2 = −iω. For the eigenvectors solve
\[ \begin{bmatrix} \lambda & -1 \\ \omega^2 & \lambda \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \]
to get
\[ \mathbf{x}^{(1)} = \begin{bmatrix} 1 \\ i\omega \end{bmatrix} \text{ for } \lambda = \lambda_1 = i\omega , \quad\text{and}\quad \mathbf{x}^{(2)} = \begin{bmatrix} 1 \\ -i\omega \end{bmatrix} \text{ for } \lambda = \lambda_2 = -i\omega . \]
So the general solution to the system is:
\[ \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = c_1 \begin{bmatrix} 1 \\ i\omega \end{bmatrix} e^{i\omega t} + c_2 \begin{bmatrix} 1 \\ -i\omega \end{bmatrix} e^{-i\omega t} . \tag{1.5} \]


Exercise 1.4: Show by choosing
\[ c_1 = \tfrac{1}{2}(A_1 - iA_2) \quad\text{and}\quad c_2 = \tfrac{1}{2}(A_1 + iA_2) \]
in the general solution above, that you recover the solution (1.2).

Normally, the constants c_1 and c_2 will be chosen so that the solution (1.5) is real, in which case the plot of y_1 versus y_2, or of (1.2) and its derivative, will generate an ellipse.
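One way to see the ellipse: along any real solution the quantity ω²y_1² + y_2² stays constant, and that level curve is an ellipse in the phase plane. A short illustrative check (in Python/NumPy rather than the unit's Matlab; the values of ω, A_1 and A_2 are sample choices, not from the text):

```python
import numpy as np

omega, A1, A2 = 2.0, 1.0, 0.5  # sample values for illustration
t = np.linspace(0.0, 10.0, 1000)
y1 = A1*np.cos(omega*t) + A2*np.sin(omega*t)            # solution (1.2)
y2 = omega*(-A1*np.sin(omega*t) + A2*np.cos(omega*t))   # its derivative

# omega^2*y1^2 + y2^2 is conserved along the trajectory: the ellipse
invariant = omega**2 * y1**2 + y2**2
print(invariant.max() - invariant.min() < 1e-9)  # prints True
```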

1.1.4 Trajectories in the phase plane of a linear system

The big advantage of the phase plane is that we qualitatively see how the dynamics of a system will evolve. For example, in the mass-spring system we know
\[ y_1' = y_2 \quad\text{and}\quad y_2' = -\omega^2 y_1 . \]
That is, the rate of change of the position vector y is (y_2, −ω²y_1).² Thus at each point in the phase plane we can tell the direction that the system evolves by drawing an arrow as in the following plot.

²We use the row-vector in parentheses, such as (y_1, y_2, …, y_n), to denote the corresponding column vector.


[Quiver plot of the vector field; horizontal axis y_1 = y, vertical axis y_2 = y'. Generated and animated by the following Matlab:]

[y1,y2] = meshgrid(linspace(-1,1,7));
u = y2; v = -0.88*y1;
quiver(y1,y2,u,v)

% then try the following code to
% simulate evolution of this DE
hold on
y1 = 0.05; y2 = 0.6; dt = 0.01;
pt = plot(y1,y2,'r*','erase','xor');
drawnow
for t = dt:dt:20
  dy1 = y2; dy2 = -0.88*y1;
  y1 = y1 + dt*dy1; y2 = y2 + dt*dy2;
  set(pt,'xdata',y1,'ydata',y2)
end

The green curve shows the trajectory taken by the system: the set of states it goes through as time evolves. See how the evolution arrows are tangent to the trajectory, as they must point along the direction of evolution. In this subsection we look at the few different sorts of pictures generically seen in two dimensions.

Reading 1.D Study Kreyszig §3.3 [K,pp162–9]: take note of the phase plane pictures in Fig. 78–82 [K,pp165–6], and ignore the irrelevant distinction between an "improper node" and a "proper node".


Exercise 1.5: Some systems of differential equations evolve according to the vectors plotted in 2-D below. For each system, visualise the trajectories of the system and classify the origin at the centre point as either a node, saddle, centre or spiral point.

[Six vector-field plots, labelled (a)–(f).]


Activity 1.E Do exercises in Problem Set 3.3 [K,pp169–170] and Exercise 1.5 above. Send in to the examiner for feedback at least Q1 & 10.

1.1.5 Classification and stability of fixed points

These pictures of the dynamics near the origin allow us to answer very important qualitative questions about the solutions of differential equations. In application, we principally concern ourselves with things that can be observed. Thus we need to predict what may be observed and what cannot be observed. This is expressed via the notion of stability. Loosely, a fixed point (or critical point), the origin for linear systems, is stable if all nearby solutions stay nearby for all time and thus could be observed: a pendulum hanging downwards, for example. Whereas a critical point is unstable if at least one nearby solution escapes from the neighbourhood of the critical point, and thus cannot be expected to be observed because we expect the escape to occur: a pencil is impossible to balance on its sharp tip, for example.

Reading 1.F Study §3.4 [K,pp170–5], especially the definition of stability and its consequences.

Kreyszig, as do many texts, writes the conditions for stability and classification in terms of the coefficients of the characteristic polynomial. While this may be slightly more convenient in 2-D, it is usually easier to remember the conditions directly in terms of the eigenvalues. This is for two reasons: the classification then proceeds systematically to higher dimensions; and it is easy to remember the details because the dynamics are simply those of exp(λ_j t).

Thus we urge you to classify the fixed points of two-dimensional linear systems according to the eigenvalues of their coefficient matrix, A. The results are summarised in this table.

Eigenvalues λ_j            Condition                      Fixed point (0, 0)
Complex                    R(λ_j) = 0 for j = 1, 2        stable centre
(R denotes real part)      R(λ_j) > 0 for j = 1, 2        unstable spiral
                           R(λ_j) < 0 for j = 1, 2        stable spiral
Real                       λ_1, λ_2 > 0                   unstable node
                           λ_1, λ_2 < 0                   stable node
                           λ_1 < 0 < λ_2                  unstable saddle

However, cases not covered by the above table, the so-called degenerate cases, have to be considered on their own merits.
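The table translates directly into a small decision procedure on the eigenvalues of A. As an illustrative sketch (Python/NumPy, an aside from the unit's Matlab; degenerate cases simply fall through):

```python
import numpy as np

def classify_fixed_point(A, tol=1e-12):
    """Classify the origin of the 2-D linear system y' = A y per the table above."""
    l1, l2 = np.linalg.eigvals(A)
    if abs(l1.imag) > tol:  # complex conjugate pair of eigenvalues
        if abs(l1.real) <= tol:
            return "stable centre"
        return "unstable spiral" if l1.real > 0 else "stable spiral"
    a, b = sorted([l1.real, l2.real])  # both eigenvalues real
    if a > 0:
        return "unstable node"
    if b < 0:
        return "stable node"
    if a < 0 < b:
        return "unstable saddle"
    return "degenerate: consider on its own merits"

# the mass-spring matrix (1.3) with omega = 2 gives a centre
print(classify_fixed_point(np.array([[0.0, 1.0], [-4.0, 0.0]])))  # prints "stable centre"
```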

Activity 1.G Do Problem Set 3.4 [K,p174–5]. Send in to the examiner for feedback at least Q2, 4 & 14.


In higher dimensions the stability of a fixed point is most easily expressed in terms of the eigenvalues of the corresponding coefficient matrix. Based upon the generic solution [K,p163]
\[ \mathbf{y} = c_1 \mathbf{x}^{(1)} e^{\lambda_1 t} + \cdots + c_n \mathbf{x}^{(n)} e^{\lambda_n t} , \]
and the behaviour of exp(λ_j t) we deduce:

• the fixed point y = 0 is unstable if R(λ_j) > 0 for at least one j, as then at least that component exp(λ_j t) in the solution will grow and lead solutions away from the fixed point;
• the fixed point y = 0 is stable if R(λ_j) ≤ 0 for all j, as then all the components exp(λ_j t) in the solution will decay or just oscillate;
• unless the exceptional degenerate case occurs where R(λ_j) ≤ 0 for all j but two or more pairs of eigenvalues with R(λ_j) = 0 also have equal imaginary part, say ω, when the general solution will have the growing component cte^{iωt} from the degeneracy [K,p167] to cause the fixed point to be unstable.

(Note: the case R(λ_j) = 0 is very delicate as it is exactly on the dividing line between stability and instability, and hence any small effect will tip the dynamics from one to the other.)

Eigenvalues in two or three dimensional problems may be calculated by hand. In higher dimensions we typically resort to computer numerics.

Example 1.6: Determine the stability of the origin in the three-dimensional systems of Problems 7–9 in Problem Set 3.3 [K,p169].


Solution: use Matlab to compute the eigenvalues as follows:

>> a7=[10 -10 -4;-10 1 -14;-4 -14 -2];
>> eig(a7)
ans =
   18.0000
    9.0000
  -18.0000
>> a8=[-3 -1 2; 0 -4 2; 0 1 -5];
>> eig(a8)
ans =
   -3
   -3
   -6
>> a9=[-1 -4 2;2 5 -1;2 2 2];
>> eig(a9)
ans =
   -0.0000
    3.0000
    3.0000

Thus (here all the eigenvalues are real) the origin is:

• unstable in Problem 7 as at least one eigenvalue (here two) is positive;
• stable in Problem 8 as all eigenvalues are negative (the multiple eigenvalue −3 introduces the component te^{−3t} but this still decays);
• unstable in Problem 9 as at least one eigenvalue is positive.
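The same computation works in any numerical environment. Here is an equivalent illustrative sketch in Python/NumPy applying the rule from the bullet points above (unstable as soon as one eigenvalue has positive real part):

```python
import numpy as np

# the three coefficient matrices from Example 1.6
matrices = {
    7: [[10, -10, -4], [-10, 1, -14], [-4, -14, -2]],
    8: [[-3, -1, 2], [0, -4, 2], [0, 1, -5]],
    9: [[-1, -4, 2], [2, 5, -1], [2, 2, 2]],
}
for problem, A in matrices.items():
    eig = np.linalg.eigvals(np.array(A, dtype=float))
    verdict = "unstable" if (eig.real > 1e-9).any() else "stable"
    print(problem, np.round(np.sort(eig.real), 4), verdict)
```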

Exercise 1.7: A function y(t) is governed by the third-order equation
\[ y''' + 5y'' - 2y' - 6y = 0 . \]

(a) By introducing appropriate variables show how this can be expressed as a linear system of first-order ode's.
(b) Write down the general solution of the system.
(c) Describe the nature of the fixed point at (0, 0, 0).

Exercise 1.8: Prepare a phase plane diagram for the following system:
\[ \frac{dx}{dt} = -x - 3y , \qquad \frac{dy}{dt} = 2x - 3y . \]

(a) Find the real solution to this system when x(0) = 4 and y(0) = 1.
(b) Sketch on your phase plane the solution curve above.


1.2 Qualitative solution of nonlinear, first-order systems of ode's

Systems which have complex or physically interesting behaviour are governed by nonlinear differential equations. It is usually impossible to solve such equations algebraically, but phase portraits can give a rough overview of what solutions look like. Near each fixed point of the system, the solution is dominated by the linear terms in the differential equations, and so for each fixed point one of the pictures examined in the previous section applies. After considering each fixed point, all the little pictures are reasonably joined together to give a global overview of the solutions.

The techniques you study here will be developed further in later modules.

Reading 1.H Study §3.5 [K,pp180–6] and all its examples, as the understanding of this section is the main purpose of this module.

Note the main steps used in the analysis of nonlinear systems:

• set up a mathematical model, converting to a system of first-order differential equations if necessary;
• determine all the fixed points (critical points) of the system;
• use linearisation to examine the dynamics near each fixed point;
• fill in the trajectories in phase space in a sensible way.

This last step is often aided by determining isoclines: the curves in the phase plane where trajectories have constant slope.

Example 1.9: Prepare a phase plane diagram of the following system: locate its fixed points and determine the nature of each fixed point by linearisation. Find the approximate linear solution near each of the fixed points. Find some isoclines. Sketch in trajectories.
\[ \frac{dx}{dt} = 3x - xy , \qquad \frac{dy}{dt} = y - x^2 y . \]

Aside: the whole phase plot is easy enough to do with Matlab: below is the sort of picture we will work towards. However, we use mathematical analysis.


[Quiver plot of the vector field over −3 ≤ x ≤ 3, −1 ≤ y ≤ 5, generated by:]

[x,y] = meshgrid(-2:.4:2,-.8:.4:5);
dx = 3*x - x.*y;
dy = y - x.*x.*y;
quiver(x,y,dx,dy);

Solution:

• It is useful to know where the fixed points are before we do the phase plot. So set the right-hand sides to zero:
\[ 3x - xy = x(3 - y) = 0 \;\Rightarrow\; x = 0 \text{ or } y = 3 , \]
\[ y - x^2 y = y(1 - x^2) = 0 \;\Rightarrow\; y = 0 \text{ or } x = \pm 1 . \]
Putting x = 0 from the first equation forces y = 0 from the second, whereas y = 3 from the first equation forces x = ±1 from the second. So there are three fixed points: (0, 0), (1, 3) and (−1, 3). We should make sure the phase plot includes these points as shown below.

[Phase-plane quiver plot over −3 ≤ x ≤ 3, −1 ≤ y ≤ 4, including the three fixed points.]

• Consider each of the fixed points in turn.

  – To linearise near the fixed point (1, 3), make the change of variable
\[ x = 1 + X(t) , \quad\text{and}\quad y = 3 + Y(t) \]
where X(t) and Y(t) are small. Then x' = X', y' = Y' and
\[ X' = 3(1 + X) - (1 + X)(3 + Y) = -Y - XY , \]
\[ Y' = (3 + Y) - (1 + X)^2 (3 + Y) = -6X - 3X^2 - 2XY - X^2 Y . \]
Since X and Y are small, all the nonlinear quadratic and cubic terms in X and Y are negligible compared to the linear terms and we approximate as the linear system
\[ \begin{bmatrix} X' \\ Y' \end{bmatrix} \approx \begin{bmatrix} 0 & -1 \\ -6 & 0 \end{bmatrix} \begin{bmatrix} X \\ Y \end{bmatrix} . \tag{1.6} \]
The coefficient matrix has eigenvalues \(\lambda_1 = \sqrt{6}\) and \(\lambda_2 = -\sqrt{6}\). Thus (1, 3) is a saddle point.

  – Similarly near (−1, 3),
\[ X' = 3(-1 + X) - (-1 + X)(3 + Y) = Y - XY , \]
\[ Y' = (3 + Y) - (-1 + X)^2 (3 + Y) = 6X - 3X^2 + 2XY - X^2 Y , \]
linearises to
\[ \begin{bmatrix} X' \\ Y' \end{bmatrix} \approx \begin{bmatrix} 0 & 1 \\ 6 & 0 \end{bmatrix} \begin{bmatrix} X \\ Y \end{bmatrix} . \]
The eigenvalues are again \(\pm\sqrt{6}\), so (−1, 3) is also a saddle point.

  – Lastly, near (0, 0), x and y are small, so ignoring nonlinear terms in these original variables we have
\[ \begin{bmatrix} x' \\ y' \end{bmatrix} \approx \begin{bmatrix} 3 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} , \]
which has eigenvalues 1 and 3, and hence (0, 0) is an unstable node.
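These three linearisations can be cross-checked at once by evaluating the Jacobian matrix of the right-hand sides (3x − xy, y − x²y) at each fixed point and computing its eigenvalues; an illustrative Python/NumPy aside:

```python
import numpy as np

def jacobian(x, y):
    """Jacobian of the right-hand sides (3x - xy, y - x^2 y) from Example 1.9."""
    return np.array([[3 - y, -x],
                     [-2*x*y, 1 - x**2]])

# eigenvalues at each fixed point: (0,0) node, (1,3) and (-1,3) saddles
for point in [(0.0, 0.0), (1.0, 3.0), (-1.0, 3.0)]:
    eig = np.linalg.eigvals(jacobian(*point))
    print(point, np.round(np.sort(eig.real), 3))
```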

• To predict trajectories it is most important to explore the neighbourhood of each fixed point.<br />

– Here the simplest fixed point is the origin (0, 0), as its linearisation (see above) is simply<br />

\[ x' = 3x \quad\text{and}\quad y' = y . \]

Immediately write down the general solution of each of these basic differential equations separately:<br />

\[ x = c_1 e^{3t} \quad\text{and}\quad y = c_2 e^{t} . \]

Writing this in vector notation,<br />

\[
\begin{bmatrix} x \\ y \end{bmatrix} =
c_1 \begin{bmatrix} 1 \\ 0 \end{bmatrix} e^{3t} +
c_2 \begin{bmatrix} 0 \\ 1 \end{bmatrix} e^{t},
\]


see the eigenvectors are (1, 0) and (0, 1) corresponding to the<br />

eigenvalues 3 and 1 respectively. Thus in the direction (1, 0)<br />

solutions grow three times faster than in the (0, 1) direction.<br />

Hence we draw the little local picture below.<br />

[Figure: local phase portrait near the unstable node at the origin, −3 ≤ x ≤ 3, −1 ≤ y ≤ 4.]<br />

– To find approximate solutions near (±1, 3) simultaneously we need to find the eigenvectors of the coefficient matrix (being careful with the sign of x):<br />

\[
A = \begin{bmatrix} 0 & \mp 1 \\ \mp 6 & 0 \end{bmatrix}.
\]

So solve<br />

\[
(\lambda I - A) v =
\begin{bmatrix} \lambda & \pm 1 \\ \pm 6 & \lambda \end{bmatrix}
\begin{bmatrix} v_1 \\ v_2 \end{bmatrix} =
\begin{bmatrix} 0 \\ 0 \end{bmatrix},
\]

which yields eigenvectors<br />

\[
v^{(1)} = \begin{bmatrix} \mp 1 \\ \sqrt{6} \end{bmatrix}
\quad\text{for } \lambda_1 = \sqrt{6},
\qquad\text{and}\qquad
v^{(2)} = \begin{bmatrix} \pm 1 \\ \sqrt{6} \end{bmatrix}
\quad\text{for } \lambda_2 = -\sqrt{6} .
\]
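To confirm that these really are eigenvectors, one can check Av = λv directly. Here is a quick sketch in Python (not part of the unit's materials) for the fixed point (1, 3), taking the upper signs:

```python
import math

# Coefficient matrix near (1, 3) (upper signs above): A = [[0, -1], [-6, 0]].
A = [[0.0, -1.0], [-6.0, 0.0]]
lam = math.sqrt(6)
v = [-1.0, math.sqrt(6)]                 # claimed eigenvector for lambda_1 = sqrt(6)

Av = [A[0][0] * v[0] + A[0][1] * v[1],   # matrix-vector product A v
      A[1][0] * v[0] + A[1][1] * v[1]]
lam_v = [lam * v[0], lam * v[1]]         # lambda v
print(Av, lam_v)   # both are [-sqrt(6), 6]
```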

The linear solution of (1.6) near (±1, 3) will be [K,p163, Theorem 1]:<br />

\[
\begin{bmatrix} X \\ Y \end{bmatrix} =
c_1 \begin{bmatrix} \mp 1 \\ \sqrt{6} \end{bmatrix} e^{\sqrt{6}\,t} +
c_2 \begin{bmatrix} \pm 1 \\ \sqrt{6} \end{bmatrix} e^{-\sqrt{6}\,t},
\]

or, in terms of the original variables,<br />

\[
\begin{bmatrix} x \\ y \end{bmatrix} =
\begin{bmatrix} \pm 1 \\ 3 \end{bmatrix} +
c_1 \begin{bmatrix} \mp 1 \\ \sqrt{6} \end{bmatrix} e^{\sqrt{6}\,t} +
c_2 \begin{bmatrix} \pm 1 \\ \sqrt{6} \end{bmatrix} e^{-\sqrt{6}\,t},
\]


where c1 and c2 are arbitrary constants. Thus for these two saddle points there will be exponential growth in the direction (∓1, √6), towards the top-left and bottom-right for the fixed point (1, 3), and decay in the direction (±1, √6), the top-right and bottom-left directions for (1, 3). Hence we sketch the local pictures shown below for the two fixed points (±1, 3).<br />

[Figure: local phase portraits near the two saddle points (±1, 3), −3 ≤ x ≤ 3, −1 ≤ y ≤ 4.]<br />

• The isoclines help fill in the picture. The two easiest isoclines are


where the trajectories are horizontal, slope zero, obtained by finding where y′ = 0, and where the trajectories are vertical, infinite slope, obtained by finding where x′ = 0.<br />

– y′ = y − x²y = 0 whenever y = 0 or x = ±1. Thus all trajectories are horizontal when they cross the three red dot-dashed lines shown below.<br />

– x′ = 3x − xy = 0 whenever x = 0 or y = 3. Thus all trajectories are vertical when they cross the two magenta dashed lines plotted below.<br />
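The crossing directions are easy to verify by sampling the vector field. The following Python sketch (not part of the unit's materials) confirms that y′ vanishes on the first family of isoclines, so trajectories cross them horizontally, and that x′ vanishes on the second family, so trajectories cross those vertically:

```python
def field(x, y):
    """The vector field (x', y') = (3x - xy, y - x^2 y)."""
    return 3*x - x*y, y - x*x*y

# On the lines x = +-1 (and y = 0) the second component y' vanishes,
# so the motion there is purely horizontal.
for y in (-1.0, 0.5, 2.0):
    print(field(1.0, y), field(-1.0, y))   # second entry of each pair is 0

# On the lines x = 0 and y = 3 the first component x' vanishes,
# so the motion there is purely vertical.
print(field(0.0, 2.0), field(2.0, 3.0))    # first entry of each pair is 0
```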


[Figure: the isoclines on the phase plane: y = 0 and x = ±1 (red dot-dashed), x = 0 and y = 3 (magenta dashed), −3 ≤ x ≤ 3, −1 ≤ y ≤ 4.]<br />

• Lastly, use all this information to qualitatively sketch in trajectories<br />

such as the solid green lines shown next.


[Figure: the assembled phase portrait with representative trajectories (solid green), −3 ≤ x ≤ 3, −1 ≤ y ≤ 4.]<br />

Activity 1.I Do exercises from Problem Set 3.5 [K,p183]. Send at least Q5 & Q8 to the examiner for feedback.<br />


The qualitative methods developed here generalise to higher dimensional systems, but it is much harder to “fill in” the phase space because of the intricate contortions permitted for trajectories in dimensions higher than two. That is why chaos (in its mathematical sense) only occurs in autonomous differential systems with three or more components.<br />

1.2.1 Linearisation using the Jacobian<br />

When exploring the dynamics of systems we analyse the linear dynamics in the neighbourhood of each fixed point. Previously we found the linear dynamics by a change of variable and subsequent neglect of nonlinear terms. This is a useful technique, but it is a little laborious even in straightforward situations. The alternative we explore here is to obtain the matrix of coefficients simply by evaluating the “derivative” of the differential system (the Jacobian).<br />

Consider a nonlinear, first-order system of ODEs of the form<br />

\[ x' = f(x, y), \qquad y' = g(x, y), \]

where we have taken a 2-D example, but the extension to an n-dimensional system is straightforward. Also assume that the system is autonomous, i.e. there is no explicit t-dependence on the right-hand sides. Suppose that (x0, y0) is a fixed point for the system, i.e. f(x0, y0) = g(x0, y0) = 0, and<br />


consider a point (x, y) nearby. We now appeal to Taylor’s theorem in two dimensions, that we will explore in detail in a later module, §5.4. Taylor’s theorem allows us to approximate f and g in the neighbourhood of the fixed point as<br />

\begin{align*}
f(x, y) &\approx f(x_0, y_0)
 + (x - x_0) \left.\frac{\partial f}{\partial x}\right|_{(x_0, y_0)}
 + (y - y_0) \left.\frac{\partial f}{\partial y}\right|_{(x_0, y_0)}, \\
g(x, y) &\approx g(x_0, y_0)
 + (x - x_0) \left.\frac{\partial g}{\partial x}\right|_{(x_0, y_0)}
 + (y - y_0) \left.\frac{\partial g}{\partial y}\right|_{(x_0, y_0)}.
\end{align*}

Since (x0, y0) is a fixed point, f(x0, y0) = g(x0, y0) = 0; so in the neighbourhood of the fixed point the evolution is governed by the linear system<br />

\begin{align*}
x' &\approx \left.\frac{\partial f}{\partial x}\right|_{(x_0, y_0)} (x - x_0)
 + \left.\frac{\partial f}{\partial y}\right|_{(x_0, y_0)} (y - y_0), \\
y' &\approx \left.\frac{\partial g}{\partial x}\right|_{(x_0, y_0)} (x - x_0)
 + \left.\frac{\partial g}{\partial y}\right|_{(x_0, y_0)} (y - y_0).
\end{align*}

Thus, making the change of variable<br />

\[ x = x_0 + X(t) \;\Rightarrow\; x' = X', \qquad y = y_0 + Y(t) \;\Rightarrow\; y' = Y', \]

the linearised system is then<br />

\[
\begin{bmatrix} X' \\ Y' \end{bmatrix} \approx
\begin{bmatrix}
 \dfrac{\partial f}{\partial x} & \dfrac{\partial f}{\partial y} \\[1ex]
 \dfrac{\partial g}{\partial x} & \dfrac{\partial g}{\partial y}
\end{bmatrix}
\begin{bmatrix} X \\ Y \end{bmatrix}
\]


where all the derivatives in the matrix are evaluated at the fixed point (x0, y0). A common notation for the matrix appearing here is<br />

\[
J(x, y) = \frac{\partial(f, g)}{\partial(x, y)} =
\begin{bmatrix}
 \dfrac{\partial f}{\partial x} & \dfrac{\partial f}{\partial y} \\[1ex]
 \dfrac{\partial g}{\partial x} & \dfrac{\partial g}{\partial y}
\end{bmatrix},
\]

which is called the Jacobian³ and must be evaluated at the fixed point in question. This Jacobian matrix evaluated at a fixed point is the matrix of coefficients of the linearised dynamics, and hence its eigenvalues and eigenvectors determine the stability and classification of the fixed point.<br />

Example 1.10: Use the Jacobian in Example 1.9. In our previous worked example we had<br />

\[ \dot{x} = f(x, y) = 3x - xy, \qquad \dot{y} = g(x, y) = y - x^2 y, \]

so the Jacobian is<br />

\[
J = \frac{\partial(f, g)}{\partial(x, y)} =
\begin{bmatrix} 3 - y & -x \\ -2xy & 1 - x^2 \end{bmatrix},
\]

which when evaluated at the fixed points (1, 3), (−1, 3) and (0, 0) gives the respective coefficient matrices<br />

\[
\begin{bmatrix} 0 & -1 \\ -6 & 0 \end{bmatrix}, \qquad
\begin{bmatrix} 0 & 1 \\ 6 & 0 \end{bmatrix}
\qquad\text{and}\qquad
\begin{bmatrix} 3 & 0 \\ 0 & 1 \end{bmatrix},
\]

³ After Carl Gustav Jacob Jacobi (1804–51), a German mathematician and professor at Königsberg, noted for work in elliptic functions, number theory and differential determinants.<br />


as seen previously. The eigenvalues of these matrices determine the stability of each corresponding fixed point, as found earlier.<br />
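As a supplementary numerical cross-check (a Python sketch; Python is not part of this unit's materials), central differences approximate each partial derivative of (f, g) and reproduce the three coefficient matrices at the fixed points:

```python
def f(x, y):
    return 3*x - x*y          # x' = f(x, y)

def g(x, y):
    return y - x*x*y          # y' = g(x, y)

def jacobian(x, y, h=1e-6):
    """Central-difference approximation to the Jacobian of (f, g) at (x, y)."""
    return [[(f(x + h, y) - f(x - h, y)) / (2*h),
             (f(x, y + h) - f(x, y - h)) / (2*h)],
            [(g(x + h, y) - g(x - h, y)) / (2*h),
             (g(x, y + h) - g(x, y - h)) / (2*h)]]

# Agrees with the analytic J = [[3 - y, -x], [-2xy, 1 - x^2]] at each point.
for point in [(1.0, 3.0), (-1.0, 3.0), (0.0, 0.0)]:
    print(point, jacobian(*point))
```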

In n dimensions the Jacobian matrix is<br />

\[
J = \frac{\partial(f_1, f_2, \ldots, f_n)}{\partial(x_1, x_2, \ldots, x_n)} =
\begin{bmatrix}
 \dfrac{\partial f_1}{\partial x_1} & \dfrac{\partial f_1}{\partial x_2} & \cdots & \dfrac{\partial f_1}{\partial x_n} \\[1ex]
 \dfrac{\partial f_2}{\partial x_1} & \dfrac{\partial f_2}{\partial x_2} & \cdots & \dfrac{\partial f_2}{\partial x_n} \\[1ex]
 \vdots & \vdots & \ddots & \vdots \\[1ex]
 \dfrac{\partial f_n}{\partial x_1} & \dfrac{\partial f_n}{\partial x_2} & \cdots & \dfrac{\partial f_n}{\partial x_n}
\end{bmatrix},
\]

and similarly determines the behaviour of the linearised dynamics near any fixed point.<br />

Activity 1.J Do problems 5–11 from Problem Set 3.5 [K,p183] using the Jacobian to determine the coefficient matrices of the linear dynamics about each of the fixed points. Send at least Q7 & Q10 to the examiner for feedback.<br />

Exercise 1.11: Consider the following system of equations:<br />

\[
\frac{dx}{dt} = -4y + y^3, \qquad
\frac{dy}{dt} = 5x - y - xy^2 .
\]


(a) Find the fixed points of this system.<br />

(b) Compute the Jacobian and evaluate it at each fixed point. From your results classify each of the fixed points.<br />

(c) Find a general linearised solution near (0, 0).<br />

Exercise 1.12: A predator-prey population model (not the Lotka-Volterra model) is governed by the equations<br />

\[
y_1' = y_1 (1 - y_1) - \tfrac{3}{2} y_2 , \qquad
y_2' = \tfrac{1}{2} y_1 - y_2 .
\]

(a) Deduce that the only critical points of this system of equations are (0, 0) and (1/4, 1/8).<br />

(b) By linearising the system in the neighbourhood of each critical point, show that these points are a saddle and a stable spiral respectively.<br />

(c) Based upon the eigenvectors at (0, 0) and the nature of the fixed points, sketch some representative trajectories for the system.<br />

(d) Write down the general solution to the linearised system in the neighbourhood of the origin (0, 0).<br />


1.2.2 Answers to selected Exercises<br />

1.11 (a) (0, 0) and ±(2, 2). (b) The origin is a stable spiral as λ = (−1 ± i√79)/2, and ±(2, 2) are saddles as λ = (−9 ± √113)/2 ≈ 0.8151 and −9.8151.<br />

1.12 (b) eigenvalues are λ = ±1/2 and λ = (−1 ± i√3)/4 respectively. (c) v = (3, 1) and v = (1, 1) corresponding to eigenvalues λ = 1/2 and −1/2 respectively. (d) y = c1 (3, 1) e^{t/2} + c2 (1, 1) e^{−t/2}.<br />
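These answers can be cross-checked numerically. The following Python sketch (not part of the unit's materials) computes the eigenvalues of each linearisation from its trace and determinant; the Jacobian of Exercise 1.11, evaluated from the printed system, is [[0, −4], [5, −1]] at the origin and [[0, 8], [1, −9]] at (2, 2).

```python
import cmath

def eigvals2(a, b, c, d):
    """Eigenvalues of the 2x2 matrix [[a, b], [c, d]], from its
    characteristic polynomial lambda^2 - (a + d)*lambda + (a*d - b*c) = 0."""
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2

# Exercise 1.11: Jacobian [[0, -4], [5, -1]] at (0, 0) -> stable spiral
print(eigvals2(0, -4, 5, -1))        # (-1 +- i*sqrt(79))/2

# Exercise 1.11: Jacobian [[0, 8], [1, -9]] at (2, 2) -> saddle
print(eigvals2(0, 8, 1, -9))         # about 0.8151 and -9.8151

# Exercise 1.12: linearisations at (0, 0) and at (1/4, 1/8)
print(eigvals2(1, -1.5, 0.5, -1))    # +-1/2: a saddle
print(eigvals2(0.5, -1.5, 0.5, -1))  # (-1 +- i*sqrt(3))/4: stable spiral
```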


1.3 Summary<br />

• The behaviour of physical systems, such as mechanical and electrical systems, is described using differential equations (§1.1.1 for example).<br />

• Higher order differential equations can be reduced to systems of first-order differential equations by introducing more dependent variables (§1.1.2).<br />

• Solutions to 2-D systems of ODEs are graphically represented on the phase plane (§1.1.3). Higher dimensional systems may also be represented using a phase space, but imagination is required above 3-D. For a given set of initial conditions, the system evolves along a trajectory or orbit in the phase space, which describes how the system evolves in time (§1.1.4).<br />

• At a critical point or fixed point, the system undergoes no further evolution in time (§1.1.4).<br />

For a first-order system of ODEs in the form y′ = f(y), a fixed point occurs wherever f(y) = 0 (§1.2).<br />

• Linear first-order systems with constant coefficients are written y′ = Ay for some matrix A. Their general solution (equation (1.4)) is written down in terms of the eigenvalues and eigenvectors of A, provided the eigenvectors form a basis for the phase space (§1.1.4).<br />

Such linear systems have only the origin as a fixed point.<br />


• In particular, in 2-D, the solution to a linear first-order system with constant coefficients must⁴ be one of six basic types: a centre, stable or unstable node, stable or unstable spiral, or a saddle point (§1.1.5).<br />

• Nonlinear systems are capable of more complex behaviour and usually have more than one fixed point. Nevertheless the behaviour of the system near a fixed point is dominated by linear terms in the ODE, and approximate local solutions are found via linearisation. It is usually not possible to write down a general solution for a nonlinear system, so approximations and phase plane methods are useful to build up a working picture of how these systems behave (§1.2).<br />

Activity 1.K Do Chapter 3 Review Problems 1–23, 28–30 and 35–38 [K,pp190–2].<br />

⁴ That is, unless no basis of eigenvectors exists, see [K,p168].<br />


Module 2<br />

Scientists must write<br />

Module contents<br />

2.1 Basics of mathematical writing . . . 45<br />
2.2 LaTeX . . . 49<br />
 2.2.1 Why LaTeX? . . . 50<br />
 2.2.2 Start to use LaTeX . . . 51<br />
 2.2.3 Simple mathematics goes in-line . . . 59<br />
 2.2.4 List environments usefully organise . . . 61<br />
 2.2.5 Complex mathematics is displayed . . . 63<br />
 2.2.6 Figures float . . . 75<br />
 2.2.7 Summary . . . 78<br />
 2.2.8 Many mathematical symbols in LaTeX . . . 79<br />

Developing technical communication is essential as preparation for the workplace and advanced study. In this module we help you to structure, prepare and deliver small documents of technical material. This module is to be studied in parallel with the first module in preparation for your first assignment.<br />

In your assignments you will be required to use your skills in technical writing for certain exercises. These exercises will not only be graded on mathematical content, but also on the style and manner of the technical and English expression.<br />

The first section (§2.1) discusses the composition of mathematical writing. Although mathematical writing has much in common with non-technical writing, there are many distinctions and extensions. Some of the common problem areas are identified and discussed. These and basic aspects of writing will be assessed in the specified exercises in the assignments.<br />

The second section (§2.2) introduces you to LaTeX, the open standard for high quality typesetting of scientific and general documents (there are two alternative pronunciations of LaTeX: either “lay-teck” or “lah-teck”). As well as typesetting documents, LaTeX provides a convenient standard for the communication of mathematics in plain text such as e-mails—your e-mail enquiries to us should be phrased using the syntax and grammar of LaTeX. It is compulsory for you to use LaTeX for the specified exercises.<br />


2.1 Basics of mathematical writing<br />

In this first introduction to the writing of technical documents involving mathematics we focus on incorporating mathematical equations, symbols and structures into a short expository document. This mathematical detail is based upon basic communications concepts that we first summarise and which I expect to be familiar to you.<br />

Basic written communication: Successful writing in any discipline is based on certain elements and these are summarised below. It could be useful to read Chapters 4 and 6 from Communication: A Foundation Course, by S. Tyler, C. Kossen and C. Ryan, rev. edn.<br />

• Analysing the task (what you are being asked to write about).<br />

• Analysing the audience (to whom you are writing).<br />

• Developing a thesis statement (what you intend to prove).<br />

• Deciding on your main points (how you intend to prove or support your thesis statement).<br />

• Logical sequence of points (developing a coherent argument).<br />

I recommend international students also read Chapter 5, “When English is a foreign language”, from Handbook of writing for the mathematical sciences, by N. J. Higham, 2nd edition.<br />


Mathematical writing has special features<br />

Reading 2.A Study, noting the comments below, Chapter 3 “Mathematical writing” from Handbook of writing for the mathematical sciences, by N. J. Higham, 2nd edition.<br />

§3.1 What is a theorem? This is of interest, but we will not worry about whether you call results theorems, propositions, or lemmas. However, we will look for a well structured argument in your writing—you will need to state clearly what your main results are.<br />

§3.2 Proofs It is essential to help readers, and to show you appreciate the role of the various parts of an argument, by annotating the argument accordingly.<br />

§3.3 The role of examples Although I expect this to be irrelevant to your assignments, if necessary, introduce a generality by a preliminary specific example.<br />

§3.4 Definitions Only define terms if they are new and are needed in several places.<br />

§3.5 Notation Endeavour to choose a notation that is consistent and not confusing.<br />


§3.6 Words versus symbols Readers typically have difficulty remembering the meaning of symbols that you have introduced. Even though you know what they mean, use words rather than symbols wherever reasonable.<br />

§3.7 Displaying equations The crucial point in this section is the first sentence: “An equation is displayed when it needs to be numbered, when it would be hard to read if placed in-line, or when it merits special attention …”; otherwise typeset equations in-line.<br />
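This guideline can be sketched in LaTeX source as follows (a small illustrative example; the mathematical content is borrowed from Module 1):

```latex
% Simple mathematics sits in-line, within the sentence, between $ signs:
The eigenvalues are $\lambda = \pm\sqrt{6}$, so the fixed point is a saddle.

% Mathematics that merits special attention is displayed, and the
% equation environment numbers it automatically:
\begin{equation}
  x(t) = c_1 e^{3t} \quad\mbox{and}\quad y(t) = c_2 e^{t} .
\end{equation}
```

Note how the displayed equation still ends with a full stop: displayed or not, mathematics forms part of the sentence.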

§3.8 Parallelism A subtle aspect—we will not assess this.<br />

§3.9 Dos and Don’ts of mathematical writing has lots of good tips.<br />

Punctuating expressions It is important to remember that mathematical content, whether expressions, equations or derivations, must form an integral part of a sentence. Write and punctuate accordingly.<br />

Otiose symbols Avoid gratuitous symbols.<br />

Placement of symbols Endeavour to be clear.<br />

“The” or “A” addresses a small but common error.<br />

Notational synonyms Strive to find, among all readable and clear possibilities, an aesthetically pleasing version of your mathematical expressions.<br />
expressions.


Referencing equations Analogous to §3.6, a descriptive word or two helps remind readers of the subject of an equation that you reference.<br />

Miscellaneous In my opinion the most important of these are:<br />

• standard mathematical functions are set in roman, not italic;<br />

• avoid stacked fractions in superscripts and subscripts;<br />

• avoid tall in-line expressions;<br />

• choose the correct ellipsis;<br />

• avoid ambiguity in slashed fractions.<br />

If you see our study guides failing any of the above, then please inform us.<br />


2.2 LaTeX<br />

Good writing deserves the best reproduction. Here we introduce you to LaTeX, the world’s best package for typesetting technical and other documents. You will use LaTeX for at least the specified exercises.<br />

From: pete@nospam <br />

there is simply nothing better.<br />

i started learning latex on my own few years ago, and at work i<br />

write all my reports using it. every one tells me how good the<br />

reports look and they wonder how i do it, since everyone else uses<br />

word and their report do not half look as good.<br />

i try to keep it a secret, but sometimes i am forced to tell, but<br />

most will not even try it because they think it is hard to do this<br />

way, but they do not know that it is actually easier. in latex i<br />

concentrate on the logic and content <strong>of</strong> my report and let latex<br />

worry about how to format it and typeset it. so with latex i am<br />

much faster than with those gui word processors.<br />

tex and latex makes writing more fun.


2.2.1 Why LaTeX?<br />

• LaTeX is arguably the premier typesetting package in the world. Donald Knuth and Leslie Lamport have distilled for us the wisdom, accumulated over hundreds of years, of many generations of printers.<br />

• The LaTeX system typesets documents with line and page breaks to maximise readability and appeal by avoiding as far as possible poor breaks and hyphenation.<br />

• It is simply the best package for documents containing mathematics. “TeX can print virtually any mathematical thought that comes into your head, and print it beautifully.” (Herbert S. Wilf, 1986)<br />

• It is free on virtually every computer in the world.<br />

• It is portable—stick to the standard commands and everyone can read and exchange documents.<br />

• The source file uses standard keyboard characters so it can be read by eye or posted by e-mail with no problems associated with different versions or binary files.<br />

• LaTeX has the reputation of being hard, but as a mark-up language it is effectively the same as HTML!<br />

• Weakness: it is not usually WYSIWYG.<br />


LaTeX is a very powerful typesetting system. Here we only introduce you to some basics of LaTeX. The idea is to provide you with enough to typeset the specified assignment questions. There is much more that you may learn to extend your use of LaTeX.<br />

As well as the guide written here there is a wealth of support information for LaTeX on the internet. More LaTeX information and many links to further sources are to be found at http://www.sci.usq.edu.au/staff/robertsa/LaTeX/latexintro.<br />

For further reading I suggest:<br />

• Chapter 13 “TeX and LaTeX” of Handbook of writing for the mathematical sciences, by N. J. Higham, 2nd edition; and<br />

• Learning LaTeX by D. F. Griffiths and N. J. Higham.<br />

2.2.2 Start to use LaTeX<br />

Install LaTeX on your computer: if you run Windows, the Department cd-rom set has a copy of LaTeX (called MiKTeX, essential) and the shareware editor WinEdt (helpful, but not essential) for you to install; if you use Linux, LaTeX is included as an optional part of the Linux release; if you use a Macintosh, obtain OzTeX, and I recommend the editor Alpha. Install whichever is appropriate for you. For Windows follow the instructions on the Maths & Computing cd-rom. If the installation fails, still make progress by following the instructions in a subsequent paragraph.<br />


Execute LaTeX on a Windows computer:<br />

1. Prepare a plain text file in any simple editing application such as notepad, or preferably in WinEdt. Your LaTeX source forms the text of this file; name the file with the .tex extension, for example, first.tex.<br />

2. Execute the LaTeX application giving as input your source file, for example, first.tex.<br />

3. If there are errors, correct your source and redo the previous step.<br />

4. View the beautifully typeset “dvi” file generated by LaTeX, for example first.dvi, using the application yap.<br />

If your execution fails, still make progress by following the instructions in the following paragraph.<br />

If you do not have access to a computer with LaTeX: we have provided a web interface to LaTeX. The following are its instructions.<br />

1. Prepare a plain text file in any simple editing application, such as notepad, with your LaTeX source and preferably name it with the .tex extension, for example, first.tex (although notepad likes to insist on a .txt extension, which is also acceptable).<br />


2. Point your internet browser to the web address http://www.sci.usq.edu.au/latex, and enter your usqconnect username and password when requested.<br />

3. Click the Browse... button and navigate around the file system on your computer to your LaTeX source file, for example, first.tex.<br />

4. Click the Submit Document button.<br />

5. Wait, hopefully no more than a few seconds, for a new web page to appear saying “The PDF file is available for download”, in which case click on the link PDF file and view your beautiful document in Adobe Acrobat Reader.<br />

6. If a serious error has occurred, download the log file and use the error messages to guide fixing your document, then return to Step 2. It is a good idea to download the log file and check for non-fatal errors in any case.<br />

Your first document: you need to prepare a text file of the content of your composition interspersed with LaTeX commands. First tell LaTeX the sort of document you will be typesetting. For our straightforward needs in this course you will use the article style typeset in a 12pt font. Around the document text that you wish to typeset, you need<br />

\documentclass[12pt,a4paper]{article}<br />

\begin{document}


...<br />

\end{document}<br />

The three dots above denote the place where the content text is to be placed.<br />

Second, specify the title and author of the document using the \title{...}, \author{...} and \maketitle commands, in the following manner.<br />

\documentclass[12pt,a4paper]{article}<br />

\begin{document}<br />

\title{Assignment 1, Question 3: The<br />

importance of being fractal}<br />

\author{Ben Hall, Q99123456}<br />

...<br />

\end{document}<br />

Just those seven lines form a complete, though pointless, document. Try it. Type the above into a file (perhaps named first.tex), then run it through LaTeX and view the result. These seven lines form the skeleton of all our LaTeX documents. Contact us if there is any problem.<br />

Now put in some information. Simply type the text of your document in place of the three dots in the above skeleton. For example:<br />


\documentclass[12pt,a4paper]{article}<br />

\begin{document}<br />

\title{Assignment 1, Question 3: The<br />

importance of being fractal}<br />

\maketitle<br />

Fractal geometry, largely inspired by Benoit Mandelbrot<br />

during the sixties and seventies, is one of the great<br />

advances in mathematics for two thousand years. Given<br />

the rich and diverse power of developments in<br />

mathematics and its applications, this is a remarkable<br />

claim.<br />

Often presented as being just a part of modern chaos<br />

theory, fractals are momentous in their own right.<br />

Euclid's geometry describes the world around us in<br />

terms of points, lines and planes---for two thousand<br />

years these have formed the somewhat limited repertoire<br />

of basic geometric objects with which to describe the<br />

universe. Fractals immeasurably enhance this<br />

world-view by providing a description of much around us<br />

that is rough and fragmented---of objects that have<br />

structure on many sizes. Examples include: coastlines,<br />

rivers, plant distributions, architecture, wind gusts,<br />

music, and the cardiovascular system.<br />

\end{document}<br />

Paragraphs are indicated by introducing a blank line, which tells LaTeX where one paragraph ends and another begins. All other line breaks in the source are treated as simply blank characters. Line breaks in your source do not correspond to line breaks in the typeset document.¹ Type a few paragraphs, each of a couple of sentences, and typeset your own document.<br />

Longer documents need sections: although you may not need sectioning to answer simple questions, most documents do. In LaTeX automatically numbered sections and their titles are specified by the \section{...} command. Wherever you want to start a new section, just put this command with the title of the section within the braces. See the two sections in the following example.<br />

\documentclass[12pt,a4paper]{article}<br />

\begin{document}<br />

\title{Assignment 1, Question 3: The<br />

importance of being fractal}<br />

¹ In fact, this is one brilliant aspect of TeX: Knuth programmed a sophisticated optimisation scheme to determine the very best line breaks to be made in a paragraph. He incorporated the knowledge of the best printers accumulated over hundreds of years.<br />


Module 2. Scientists must write 57<br />

\author{Ben Hall, Q99123456}<br />

\maketitle<br />

Fractal geometry, largely inspired by Benoit Mandelbrot
during the sixties and seventies, is one of the great
advances in mathematics for two thousand years. Given
the rich and diverse power of developments in
mathematics and its applications, this is a remarkable
claim.
\section{Some fractal models}
Before discussing in detail the common feature of the
previously mentioned examples, I present a few
examples of fractals and the type of physical
applications that they have.
\section{Scaling and dimensionality}
The common theme in these examples is not just that
they have detail on many lengths, but also that the
structure at any scale is much the same at any other
scale---the coastline around a continent looks just
like any small part of the coastline.
\end{document}

However, for many purposes you will want to emphasise the main point of a paragraph by using the \paragraph{...} command to introduce a simple statement at the start. For example,

\paragraph{Construct the Cantor set:} start with a
bar of some length; then remove its middle third to
leave two separate thirds; then remove the middle
thirds of these to leave four separate ninths; then
remove the middle thirds of these to obtain eight
separate twenty-sevenths; and so on. Eventually we
just obtain a scattered dust of points.

produces the following:

  Construct the Cantor set: start with a bar of some length;
  then remove its middle third to leave two separate thirds; then
  remove the middle thirds of these to leave four separate ninths;
  then remove the middle thirds of these to obtain eight separate
  twenty-sevenths; and so on. Eventually we just obtain a scattered
  dust of points.


I like using \paragraph commands. For example, at the start of this subsection they were used to highlight the actions to take if you could install LaTeX on your computer, or not, as the case may be.

Note that the characters slosh, \, and braces, { and }, are special characters to LaTeX as they are used to invoke the typesetting mark-up commands. There are other special characters in LaTeX of which to be wary:

• the percent sign, %, causes LaTeX to ignore the rest of the line, so you may comment the document if needed;
• the dollar, $, used to delineate in-line mathematics;
• the ampersand, &, for tabbing;
• the underscore and caret, _ and ^, for subscripts and superscripts;
• the hash, #, and the tilde, ~.

To actually get any of these last nine characters (except the slosh \) to appear in the final typeset document, just precede them by a slosh (a backslash).
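For instance, a quick sketch of my own (not from the study book) showing some of these escapes in running text:

```latex
% The percent sign starts a comment, so escape it to print it.
Interest rose by 5\% on the \$20 fee for item \#3;
tables use \& between columns, and file\_name.tex
names the source. Braces are printed by \{ and \},
and the slosh itself by \textbackslash.
```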

2.2.3 Simple mathematics goes in-line

Mathematics to be typeset in-line with the text must be contained within matching dollar signs $...$ . For example, Newton's law F = ma is typeset by $F=ma$ . Note the different font used for mathematical letters (called math italic): it is imperative that all mathematics be typeset in a mathematics environment (even if the mathematics is just a single letter symbol), and not in the roman font that is the default for text; equally, it is imperative that non-mathematical text is not placed within a mathematics environment. For example,

Newton's law is $F= m a$
for force $ F $, mass $m$
and acceleration $a$.

is typeset as: Newton's law is F = ma for force F, mass m and acceleration a.

See that in any mathematics environment, blank characters are totally ignored.

Subscripts and superscripts are typeset in a mathematics environment using the underscore and the caret character respectively. For example, $d^{-1}$ and $d^2$ typeset d⁻¹ and d². The theorem of Pythagoras, a² + b² = c², is obtained by $a^2+b^2=c^2$ . Similarly, subscripts are generated by the underscore character _ : for example, the recursion for the Fibonacci numbers, Fₙ₊₂ = Fₙ₊₁ + Fₙ, is typeset by $F_{n+2}=F_{n+1}+F_n$ . Single character scripts need no enclosing braces.
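A couple more script sketches of my own, showing where the braces are needed:

```latex
% Braces group multi-character scripts; single characters need none.
$x_1^2+x_2^2$, $a_{ij}$, $e^{-t^2}$ and $2^{2^n}$
typeset as a sum of squares, a matrix entry, a
Gaussian-style exponential, and a tower of powers.
```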

LaTeX has an enormously wide variety of symbols to help typeset mathematics. For example, \times to get a times sign, ×; \propto to get a proportional-to symbol, ∝; \pi to get the Greek letter π; and similarly for the whole Greek alphabet. The names of these symbols have to be followed by a non-alphabetic character, often a blank. See §2.2.8 for tables of just some of the vast number of symbols available in LaTeX.

Fractions are typeset using \frac{}{} (with two arguments in braces): for example, 1/n − 1 = 3 only if n = 1/(2×2), typeset using $\frac{1}{n}-1=3$ and $n=\frac{1}{2\times 2}$ . For in-line mathematics use only simple expressions in fractions, as otherwise the result becomes too hard to read.

2.2.4 List environments usefully organise

Extremely useful are the list environments, of which I describe two. Use them wherever you have a sequence of steps (perhaps in a mathematical argument) or a list of things (perhaps describing an algorithm). The format for a bulleted list is

\begin{itemize}
\item ...
\item ...
...
\end{itemize}

You might use such a list to clearly structure an argument such as:

\begin{itemize}
\item To solve the differential equation $y''-y'-2y=0$,

\item
substitute the exponential $y=\exp(\lambda t)$,

\item and deduce that $\lambda^2-\lambda-2=0$.
\end{itemize}

(the blank lines are optional) which is typeset as

• To solve the differential equation y′′ − y′ − 2y = 0,
• substitute the exponential y = exp(λt),
• and deduce that λ² − λ − 2 = 0.

For an example of a numbered list, see that I used a numbered list at the beginning of §2.2.2 to advise you of the steps to follow to start using LaTeX. The general format for a numbered list is

\begin{enumerate}
\item ...
\item ...
...
\end{enumerate}

Lists may be nested within lists, to a maximum depth of four.
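For instance, a minimal sketch of my own of one list nested inside another (the inner itemize is indented here only for readability):

```latex
\begin{enumerate}
\item Prepare the source file:
  \begin{itemize}
  \item choose a document class;
  \item type the text, paragraph by paragraph.
  \end{itemize}
\item Run LaTeX to typeset the document.
\end{enumerate}
```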


2.2.5 Complex mathematics is displayed

Recall, to include mathematics in-line with the text, use $...$ . However, often mathematics is sufficiently complicated that it is displayed, centred, on a line by itself. For this purpose use the displaymath or equation environments. The difference between the two is that the equation environment automatically typesets a useful labelling number alongside the mathematics, whereas the displaymath environment does not. See the following examples:

\begin{displaymath}
a^2+b^2=c^2
\end{displaymath}

\begin{equation}
\log x=\int_1^x
\frac{1}{t}dt
\end{equation}

which typeset, respectively, as

  a² + b² = c²

  log x = ∫₁ˣ (1/t) dt        (2.1)

Relations

LaTeX knows to typeset extra space around relations such as = and \approx (≈), including inequalities such as <, >, \leq (≤) and \geq (≥), and also set relations such as \in (∈) or \subset (⊂).


Delimiters

Delimiters, such as parentheses, brackets, and braces, come in various sizes to cope with the different sub-expressions that they surround. The easiest way to get the size of a delimiter nearly correct is to use the modifying commands as in \left(...\right) . Note that \left and \right must be used in pairs so that LaTeX can determine the size of the intervening mathematics. See how the delimiters are of reasonable size in these two examples:

\begin{displaymath}
\left(a+b\right)\left[1
-\frac{b}{a+b}\right]=a\,;
\end{displaymath}

\begin{displaymath}
\sqrt{|xy|}\leq\left|\frac
{x+y}{2}\right|\,.
\end{displaymath}

which typeset as

  (a + b)[1 − b/(a + b)] = a ;

  √|xy| ≤ |(x + y)/2| .

Spacing

In the previous two examples I used \,; and \,. to punctuate at the end of the equations. Both in and out of mathematics, LaTeX provides the commands:

• \, to typeset a thin space;
• \␣ to typeset a normal space;
• \quad to typeset a quad space;
• and \! to typeset some negative space!

Use these to space the mathematics where needed. Integrals often need a bit of help with their spacing, as in

\begin{displaymath}
\int\!\!\!\int xy^2\,dx\,dy
=\frac{1}{6}x^2y^3\,,
\end{displaymath}

which typesets as ∬ xy² dx dy = (1/6) x² y³ ,

whereas vector problems often lead to statements such as

\begin{displaymath}
u=\frac{-y}{x^2+y^2}
\quad\mbox{and}\quad
v=\frac{x}{x^2+y^2}\,.
\end{displaymath}

which typesets as u = −y/(x² + y²) and v = x/(x² + y²) .


Use:

• thin spaces, \,, to separate the infinitesimals from each other and from the integrand in integrals, and to separate punctuation from an equation or expression;
• some negative space, \!\!\!, in multi-dimensional integrals to bring the integral signs closer together;
• \quad to separate two or more equations or text on the one line.

Observe the use of the \mbox{...} command to include a few words of ordinary (roman) text, with its spacing, within a mathematics environment.

Arrays

Frequently we need to set mathematics in a tabular format. For example, arrays are typeset within a mathematics environment by the array environment. The structure is

\begin{array}{argument}
... & ... & ... & ... \\
... & ... & ... & ... \\
... & ... & ... & ...
\end{array}


for an array of three rows and four columns. The argument consists of the letters r, c or l to indicate that the corresponding columns are to be typeset right justified, centred or left justified. An array example is

\begin{displaymath}
\left[\begin{array}{ccc}
1 & x & 0 \\
0 & 1 & -1
\end{array}\right]
\left[\begin{array}{c}
1 \\ y \\ 1
\end{array}\right]
=\left[ \begin{array}{c}
1+xy \\
y-1
\end{array} \right]\,,
\end{displaymath}

which typesets as

  [ 1  x   0 ] [ 1 ]   [ 1 + xy ]
  [ 0  1  -1 ] [ y ] = [ y - 1  ] ,
               [ 1 ]

or in a case statement such as

\begin{displaymath}
|x|=\left\{\begin{array}{rl}
x\,, & \mbox{if }x\geq 0\,, \\
-x\,, & \mbox{if }x< 0\,.
\end{array}\right.
\end{displaymath}

which typesets as

  |x| = {  x ,  if x ≥ 0 ,
        { -x ,  if x < 0 .

Many arrays have lots of dots all over the place, as in

\begin{displaymath}
\left[\begin{array}{ccccc}
-2&1&0&\cdots&0 \\
1&-2&1&\cdots&0 \\
0&1&-2&\ddots&\vdots \\
\vdots&\vdots&\ddots&\ddots&1\\
0&0&\cdots&1&-2
\end{array}\right]
\end{displaymath}

which typesets as

  [ -2   1   0   ⋯   0 ]
  [  1  -2   1   ⋯   0 ]
  [  0   1  -2   ⋱   ⋮ ]
  [  ⋮   ⋮   ⋱   ⋱   1 ]
  [  0   0   ⋯   1  -2 ]

Equation arrays

Often we want to align related equations together, or to align each line of a multi-line derivation. The eqnarray mathematics environment does this. The format is the same as an array environment, except that the eqnarray environment automatically assumes three columns: the left column right justified; the centre column centred; and the right column left justified:

\begin{eqnarray}
... & ... & ... \\
... & ... & ... \\
... & ... & ...
\end{eqnarray}

Each line will be numbered by LaTeX, unless you specify \nonumber in a line, or unless you use the * form of eqnarray.

For example, in the flow of a fluid film we may report

\begin{eqnarray}
u&=&\epsilon^2 k_{xxx}\sin y\,,\\
v&=&\epsilon^3 k_{xxx} y\,, \\
p&=&\epsilon k_{xx}\,.
\end{eqnarray}

which typesets as

  u = ε² k_xxx sin y ,    (2.2)
  v = ε³ k_xxx y ,        (2.3)
  p = ε k_xx .            (2.4)

Alternatively, the curl of a vector field (u, v, w) may be written with only one equation number:

\begin{eqnarray}
\omega_1 & = &
\frac{\partial w}{\partial y}
-\frac{\partial v}{\partial z}
\,, \nonumber \\
\omega_2 & = &
\frac{\partial u}{\partial z}
-\frac{\partial w}{\partial x}
\,, \\
\omega_3 & = &
\frac{\partial v}{\partial x}
-\frac{\partial u}{\partial y}
\,. \nonumber
\end{eqnarray}

which typesets as

  ω₁ = ∂w/∂y − ∂v/∂z ,
  ω₂ = ∂u/∂z − ∂w/∂x ,    (2.5)
  ω₃ = ∂v/∂x − ∂u/∂y .

Whereas a derivation may look like

\begin{eqnarray*}
&&
(p\wedge q)\vee(p\wedge\neg q)\\
& = & p\wedge(q\vee\neg q)
\quad\mbox{distributive law}\\
& = & p\wedge T
\quad\mbox{excluded middle} \\
& = & p
\quad\mbox{by identity}
\end{eqnarray*}

which typesets as

    (p ∧ q) ∨ (p ∧ ¬q)
  = p ∧ (q ∨ ¬q)    distributive law
  = p ∧ T           excluded middle
  = p               by identity

Functions

LaTeX knows how to typeset a lot of mathematical functions.

• Trigonometric and other elementary functions are defined by the obvious corresponding command name. Two examples are \sin x and \exp(i\theta) . Observe that trigonometric and other elementary functions are typeset properly, in roman, even to the extent of automatically providing a thin space if followed by a single symbol argument:

\begin{displaymath}
\exp(i\theta)=\cos\theta
+i\sin\theta\,,
\end{displaymath}

\begin{displaymath}
\sinh(\log x)=\frac{1}{2}
\left(x-\frac{1}{x}\right)\,.
\end{displaymath}

which typeset as

  exp(iθ) = cos θ + i sin θ ,

  sinh(log x) = (1/2)(x − 1/x) .

• Subscripts on more complicated functions, such as \lim_{...} and \max_{...}, are appropriately placed under the function name.

\begin{displaymath}
\lim_{q\to\infty}\|f(x)\|_q
=\max_{x}|f(x)|\,.
\end{displaymath}

which typesets as

  lim_{q→∞} ‖f(x)‖_q = max_x |f(x)| .

• And the same goes for both sub- and superscripts on large operators such as \sum, \prod, etc.

\begin{displaymath}
e^x=\sum_{n=0}^\infty
\frac{x^n}{n!}\,,\quad
n!=\prod_{i=1}^n i\,.
\end{displaymath}

which typesets as

  eˣ = Σ_{n=0}^∞ xⁿ/n! ,    n! = ∏_{i=1}^n i .

In in-line mathematics, however, the scripts are automatically placed to the side in order to conserve vertical space and to strive for uniform vertical spacing, as in 1/(1 − x) = Σ_{n=0}^∞ xⁿ, obtained from

$1/(1-x)=\sum_{n=0}^\infty x^n$

Accents

Common mathematical accents over a single character, say a, are: \bar a for ā; \tilde a for ã; \hat a for â; \dot a for ȧ; \ddot a for ä; and \vec a for a⃗ . Two examples:

\begin{displaymath}
\bar f=\frac{1}{L}
\int_0^L f(x)\,dx\,,
\end{displaymath}

\begin{displaymath}
\dot{\vec \omega}=
\vec r\times\vec I\,.
\end{displaymath}

which typeset as

  f̄ = (1/L) ∫₀ᴸ f(x) dx ,

  ω⃗̇ = r⃗ × I⃗ .

Command definition

LaTeX provides a facility for you to define your very own commands. Most useful commands involve arguments; I give three of my favourites. The first two, with two arguments, define partial derivative commands

\newcommand{\D}[2]{\frac{\partial #2}{\partial #1}}
\newcommand{\DD}[2]{\frac{\partial^2 #2}{\partial #1^2}}
\renewcommand{\vec}[1]{{\bf #1}}

and the last, with one argument, redefines the \vec command to denote vectors by boldface characters (rather than an arrow accent). Note that within a definition, #n denotes a placeholder for the nth supplied argument. This vector identity will serve nicely to illustrate two of the new commands:

\begin{displaymath}
\nabla\times\vec q
=\vec i\left(\D yw-\D zv\right)
+\vec j\left(\D zu-\D xw\right)
+\vec k\left(\D xv-\D yu\right)
\end{displaymath}

which typesets as

  ∇ × q = i(∂w/∂y − ∂v/∂z) + j(∂u/∂z − ∂w/∂x) + k(∂v/∂x − ∂u/∂y)

with q, i, j and k in boldface. You will have noticed that LaTeX is very verbose. Many people define their own abbreviations for the common commands so that they are quicker to type. My advice is: do not do this; it makes your LaTeX much less portable and harder to read. Instead, set up your editor to cater for the verbosity; use command definitions only to give you new logical facilities, such as the partial differentiation.
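The second-derivative command \DD was defined above but not used in the identity; a quick sketch of my own showing both commands together:

```latex
% The one-dimensional diffusion equation via \D and \DD:
\begin{displaymath}
\D tc = \kappa\DD xc\,,
\end{displaymath}
% \D tc expands to \frac{\partial c}{\partial t} and
% \DD xc to \frac{\partial^2 c}{\partial x^2}, so the
% derivatives appear as proper partial-derivative fractions.
```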

2.2.6 Figures float

Often we illustrate or support discussion with a figure. Usually figures are big, so they make a mess of the pagination of a document. The solution adopted by professional printers and LaTeX is generally to place figures at the top or bottom of a page, or on a page by themselves, near where the author specifies. That is, the location of a figure "floats". The usual way to include a figure in LaTeX is as follows.

1. Create a postscript file of the drawing from whatever application is being used to generate the figure. For example, in Matlab draw the figure then execute the command

print -depsc2 filename.eps

to store in the file filename.eps the encapsulated postscript to draw the figure. Users of Windows may have trouble generating postscript from other applications as Microsoft generally does not distribute the postscript printer driver; if needed, get it.

2. Then place in the preamble (that part of the document between the \documentclass command and the \begin{document} environment) the command \usepackage{graphicx} . This tells LaTeX to load information about how to include graphics.

3. Somewhere near where you want the figure, include the figure environment

\begin{figure}
\centerline{\includegraphics{...}}
\caption{...}
\end{figure}

where the argument of the \includegraphics command is the full filename, and the argument to the \caption command is text describing the figure.

4. Or use this version to scale the picture up/down to the width of the page:

\begin{figure}
\includegraphics[width=0.9\textwidth]{...}
\caption{...}
\end{figure}

The optional [width=0.9\textwidth] scales the figure to 90% of the width of the typeset text: change it if desired; leave it out in order to reproduce the figure unscaled.

For example, the following commands draw and place Figure 2.1 somewhere near here (on page 78), but not precisely here, as LaTeX has chosen a better place for it.

\begin{figure}
\centerline{\includegraphics{cantor.eps}}
\caption{steps in the construction of a Cantor set.}
\end{figure}


[Figure 2.1: steps in the construction of a Cantor set.]

Note: I strongly recommend that you generate any graphic at about the same size as it is to appear. This is so that the title, label and legend information is actually readable and the line thicknesses are creditable. Astonishingly, some people shrink a figure by a factor of about three and still expect the captions and labelling to be readable!

2.2.7 Summary

• LaTeX is the best.
• In §2.2.2 you saw how to prepare and process simple documents in LaTeX, complete with titles and sectioning. Then simple mathematics could go in-line, as seen in §2.2.3.
• Note that lists, §2.2.4, provide a useful structure for mathematical derivations as well as for lists of points.
• Complicated mathematics is displayed. As discussed in §2.2.5, there is a multitude of LaTeX structures and commands to help you do this. Make your displayed mathematics a work of art, but keep it as simple as possible so LaTeX works for you.
• Lastly, we often need to typeset a figure in a mathematical document, as described in §2.2.6. LaTeX floats these to a reasonable position.

LaTeX can do so much more for you too: automatic cross-referencing, tables, table of contents, footnotes, marginal notes, hypertext links, bibliographies, indexing, two-sided printing, two-column printing, colour, different fonts, a vast number of mathematical symbols, music, etc. Here we have presented the basics.

2.2.8 Many mathematical symbols in LaTeX


α \alpha      β \beta        γ \gamma      δ \delta
ɛ \epsilon    ε \varepsilon  ζ \zeta       η \eta
θ \theta      ϑ \vartheta    ι \iota       κ \kappa
λ \lambda     µ \mu          ν \nu         ξ \xi
π \pi         ϖ \varpi       ρ \rho        ϱ \varrho
σ \sigma      ς \varsigma    τ \tau        υ \upsilon
φ \phi        ϕ \varphi      χ \chi        ψ \psi
ω \omega

Table 2.1: Lowercase Greek letters

Γ \Gamma      ∆ \Delta       Θ \Theta      Λ \Lambda
Ξ \Xi         Π \Pi          Σ \Sigma      Υ \Upsilon
Φ \Phi        Ψ \Psi         Ω \Omega

Table 2.2: Uppercase Greek letters

± \pm         ∩ \cap         ⋄ \diamond          ⊕ \oplus
∓ \mp         ∪ \cup         △ \bigtriangleup    ⊖ \ominus
× \times      ⊎ \uplus       ▽ \bigtriangledown  ⊗ \otimes
÷ \div        ⊓ \sqcap       ⊳ \triangleleft     ⊘ \oslash
∗ \ast        ⊔ \sqcup       ⊲ \triangleright    ⊙ \odot
⋆ \star       ∨ \vee         ∧ \wedge            ○ \bigcirc
† \dagger     \ \setminus    ∐ \amalg            ◦ \circ
‡ \ddagger    · \cdot        ≀ \wr               • \bullet

Table 2.3: Binary operation symbols


≤ \leq        ≥ \geq         ≡ \equiv      |= \models
≺ \prec       ≻ \succ        ∼ \sim        ⊥ \perp
≼ \preceq     ≽ \succeq      ≃ \simeq      | \mid
≪ \ll         ≫ \gg          ≍ \asymp      ‖ \parallel
⊂ \subset     ⊃ \supset      ≈ \approx     ⊲⊳ \bowtie
⊆ \subseteq   ⊇ \supseteq    ≅ \cong       ⊲⊳ \Join
⊑ \sqsubseteq ⊒ \sqsupseteq  ≠ \neq        ⌣ \smile
∈ \in         ∋ \ni          ≐ \doteq      ⌢ \frown
⊢ \vdash      ⊣ \dashv       ∝ \propto

Table 2.4: Binary relations

ℵ \aleph      ′ \prime       ∀ \forall     ∞ \infty
ℏ \hbar       ∅ \emptyset    ∃ \exists
ı \imath      ∇ \nabla       ¬ \neg
ȷ \jmath      √ \surd        ♭ \flat       △ \triangle
ℓ \ell        ⊤ \top         ♮ \natural    ♣ \clubsuit
℘ \wp         ⊥ \bot         ♯ \sharp      ♦ \diamondsuit
ℜ \Re         ‖ \|           \ \backslash  ♥ \heartsuit
ℑ \Im         ∠ \angle       ∂ \partial    ♠ \spadesuit

Table 2.5: Miscellaneous symbols


← \leftarrow        ⟵ \longleftarrow         ↑ \uparrow
⇐ \Leftarrow        ⟸ \Longleftarrow         ⇑ \Uparrow
→ \rightarrow       ⟶ \longrightarrow        ↓ \downarrow
⇒ \Rightarrow       ⟹ \Longrightarrow        ⇓ \Downarrow
↔ \leftrightarrow   ⟷ \longleftrightarrow    ↕ \updownarrow
⇔ \Leftrightarrow   ⟺ \Longleftrightarrow    ⇕ \Updownarrow
↦ \mapsto           ⟼ \longmapsto            ↗ \nearrow
↩ \hookleftarrow    ↪ \hookrightarrow        ↘ \searrow
↼ \leftharpoonup    ⇀ \rightharpoonup        ↙ \swarrow
↽ \leftharpoondown  ⇁ \rightharpoondown      ↖ \nwarrow
⇌ \rightleftharpoons

Table 2.6: Arrow symbols

( (           ) )            ↑ \uparrow
[ [           ] ]            ↓ \downarrow
{ \{          } \}           ↕ \updownarrow
⌊ \lfloor     ⌋ \rfloor      ⇑ \Uparrow
⌈ \lceil      ⌉ \rceil       ⇓ \Downarrow
⟨ \langle     ⟩ \rangle      ⇕ \Updownarrow
/ /           \ \backslash
| |           ‖ \|

Table 2.7: Delimiters


∑ \sum        ⋂ \bigcap      ⊙ \bigodot
∏ \prod       ⋃ \bigcup      ⊗ \bigotimes
∐ \coprod     ⊔ \bigsqcup    ⊕ \bigoplus
∫ \int        ⋁ \bigvee      ⊎ \biguplus
∮ \oint       ⋀ \bigwedge

Table 2.8: Variable-sized symbols

û \hat{u}     ú \acute{u}    ū \bar{u}     u̇ \dot{u}
ǔ \check{u}   ù \grave{u}    u⃗ \vec{u}     ü \ddot{u}
ŭ \breve{u}   ũ \tilde{u}

Table 2.9: Math accents


Module 3

Describing the conservation of material

In this module we begin to learn how to mathematically model the flow of material such as water and air. On our human scale, such material appears smooth and continuous, albeit made of uncounted billions of tiny molecules. We are led to treat it as smooth in a mathematical description. The first task is to discover how to describe the movement of material. Only then do we move on to encode physical principles in mathematical terms that tell us how the various properties of the material interact and evolve. Solutions of the resulting mathematical models exhibit the rich variety of behaviour we see and use in our everyday life.


Module contents

3.1 Eulerian description of motion . . . . . . . . . . . . . . . 86
    3.1.1 Exercises . . . . . . . . . . . . . . . . . . . . . . 89
3.2 Conservation of mass . . . . . . . . . . . . . . . . . . . . 96
    3.2.1 Exercises . . . . . . . . . . . . . . . . . . . . . . 97
3.3 Car traffic . . . . . . . . . . . . . . . . . . . . . . . . 100
    3.3.1 Exercises . . . . . . . . . . . . . . . . . . . . . . 105
    3.3.2 Age-structured populations are another application . . 108
    3.3.3 Answers to selected Exercises . . . . . . . . . . . . 110
3.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . 112

The text for this module is by A. J. Roberts, A one-dimensional introduction to continuum mechanics, World Scientific. References to the text use the format [R,reference].


3.1 Eulerian description of motion

What does it mean to say: "the tide moves water in an estuary with a velocity 5 cos(x/100 − t/3) km/h"? Where will the fallout from the Chernobyl nuclear reactor accident be carried by the wind? Answers to such questions require an understanding of how the bulk movement of a material may be described by mathematics. In this section we begin to do this.

Main aims: the most important aims of the section are to

• understand the Eulerian description¹ of movement [R,§1.3], and
• introduce, understand and use the material derivative [R,§1.4]

  Df/Dt = ∂f/∂t + v ∂f/∂x ,    (3.1)

where v denotes the velocity of the material.

Reading 3.A Read all of Chapter 1 [R,pp1–10]. Especially study §1.3–4 and work through Examples 1.2–4.

¹ Leonhard Euler (1707–83), born in Switzerland, developed the application of differential equations to the world around us. He worked prolifically in hydraulics, ballistics, geometry, optics, magnetism and electricity. He also introduced much modern notation, such as i = √−1.
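To see definition (3.1) in action, here is a small worked illustration of my own (not from [R]): take the field f = x e⁻ᵗ carried by the velocity v = x. Then

```latex
\begin{displaymath}
  \frac{Df}{Dt}=\frac{\partial f}{\partial t}+v\frac{\partial f}{\partial x}
  = -x\,e^{-t} + x\,e^{-t} = 0\,,
\end{displaymath}
```

so this f is constant following the moving material, even though an Eulerian observer at fixed x sees f decay in time.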


Module 3. Describing the conservation <strong>of</strong> material 87<br />

The discussion in §1.2 [R,p6–7] and the Example 1.2–3 shown in Figure 1.4<br />

is readily realised. Get a rubber band and initially hold lightly tensioned<br />

between your two hands. Then move your hands apart. This is roughly the<br />

deformation discussed in Example 1.2–3. Use a bit <strong>of</strong> “liquid paper” or a<br />

texta to put some dots on the rubber band. Watch how the dots move as<br />

you stretch the rubber band. These could be the Lagrangian particle paths<br />

discussed in Example 1.2.<br />

Oceanographers <strong>of</strong>ten drop “floats” to drift with the ocean currents. These<br />

floaters are Lagrangian because they are carried with the moving water.<br />

They are used to help determine ocean circulation which, for example, in<br />

turn helps us model the ocean-atmosp<strong>here</strong> system to predict the nature <strong>of</strong><br />

global warming.<br />

The nature of the material derivative is also illustrated in car traffic. An observer sitting on the side of the road is an Eulerian observer of the traffic. He/she would see, for example, a tight bunch of cars quickly passing by, and so the observed change in time of the density, the time derivative, would be quite high. However, a driver in one of the cars in the bunch is a Lagrangian observer: stuck in the bunch for a long time, the moving driver observes rates of change in time which are much lower. Using the material derivative, a stationary observer is able to determine how the moving driver will see the surrounding traffic, and vice versa.
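In symbols, the material derivative that converts between the two viewpoints is, for any field $f(x,t)$,

```latex
% Material (Lagrangian) rate of change = local (Eulerian) rate of change
% plus the change advected past the fixed point by the velocity v:
\frac{Df}{Dt} = \frac{\partial f}{\partial t} + v\,\frac{\partial f}{\partial x}\,.
```

This is exactly the form that reappears in the momentum equation of Module 4.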

Example 3.1: worked Problem 1.4 Here I outline the steps in an answer to Problem 1.4 [R,p10]. Work through the details. (“ξ” is the Greek letter “xi” and corresponds to the English “x”.)

(a) $dx_L/dt = v_L$ is the velocity of the particle at $x_L$ at time $t$, which $= v_E(x_L, t)$ by its definition.

(b) From part (a)

• $\dfrac{dx_L}{dt} = v_E(x_L, t) = \dfrac{2t\,x_L}{1+t^2} + 1 + t^2$, which is a linear, first-order, ordinary differential equation for $x_L$.

• Recall [K,§1.7] that we multiply by an integrating factor (what is it?) to solve this class of differential equations analytically, to find

• $x_L = (1+t^2)(t+C)$ is the general solution for some integration constant $C$.

• But you know that at time $t = 0$ particles have their initial position, namely $x_L(\xi, 0) = \xi$, which determines the integration constant to be just $C = \xi$.

• Thus $x_L = (1+t^2)(t+\xi)$.

• Then use this analytic solution to find that the endpoints of $[0, 1]$ get carried to the endpoints of $[10, 15]$.

(c) Straightforwardly check your answers are:

• $\xi_E = \dfrac{x}{1+t^2} - t$ by rearranging $x_L$;

• $v_L = 1 + 3t^2 + 2t\xi$ by differentiating $x_L$;

• $a_L = 6t + 2\xi$ by differentiating $v_L$;

• and then confirm that $a_L(\xi_E, t) = \dfrac{Dv_E}{Dt}$.
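For part (b), the integrating-factor working runs as follows (a sketch; fill in each step yourself):

```latex
% Rewrite as  dx_L/dt - \frac{2t}{1+t^2}\,x_L = 1 + t^2 .
% The integrating factor is
% \exp\bigl(-\int \tfrac{2t}{1+t^2}\,dt\bigr) = \tfrac{1}{1+t^2}, so
\frac{d}{dt}\!\left[\frac{x_L}{1+t^2}\right]
  = \frac{1}{1+t^2}\frac{dx_L}{dt} - \frac{2t}{(1+t^2)^2}\,x_L
  = \frac{1+t^2}{1+t^2} = 1\,,
\qquad\text{whence}\qquad
\frac{x_L}{1+t^2} = t + C\,.
```

For part (c), check that $\dfrac{Dv_E}{Dt} = \dfrac{\partial v_E}{\partial t} + v_E\dfrac{\partial v_E}{\partial x} = 4t + \dfrac{2x}{1+t^2}$, which equals $a_L(\xi_E,t) = 6t + 2\bigl(\frac{x}{1+t^2} - t\bigr)$, as required.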



3.1.1 Exercises<br />

Activity 3.B Do Problems 1.1 [R,p5], 1.3, 1.5 [R,pp10–1], and the Exercises<br />

3.2–3.6. Send in to the examiner for feedback at least Problem 1.3<br />

[R,p10] and Exercise 3.2.<br />

Ex. 3.2: Sketch the velocity field, v(x), corresponding to the particle paths shown in the following picture. Note that $v = \dfrac{dx}{dt} = \dfrac{1}{dt/dx} = \dfrac{1}{\text{slope}}$, and so v(x) at any x is the reciprocal of the slope of the particle path through that x.

[Particle paths in the tx-plane: t from 0 to 5, x from 0 to 10.]

Ex. 3.3: This time the velocity field, v(x, t), depends upon time. For particle<br />

paths shown below, sketch the velocity field at times t = 2 and<br />

t = 4.<br />

[Particle paths in the tx-plane: t from 0 to 5, x from 0 to 10.]

Ex. 3.4: Consider the movement <strong>of</strong> some material in a one-dimensional continuum.<br />

Sketch the velocity field, v(x, t), at times t = 1.5 and t = 3.5<br />

corresponding to the particle paths shown below.


[Particle paths in the tx-plane: t from 0 to 4, x from 0 to 10.]

Ex. 3.5: For the steady (time independent) velocity field shown below, sketch<br />

the acceleration field obtained from the material derivative.


[Velocity field v(x): v ranging over −0.5 to 1, x from 0 to 8.]

Ex. 3.6: Suppose particles of a continuum accelerate at a = sin 2x; use the material derivative to determine the corresponding steady velocity field v(x), given that v = 0 at x = 0.

Ex. 3.7: Some particle paths are shown in the following picture:


[Particle paths in the tx-plane: t from 0 to 5, x from 0 to 10.]

Given that the material was <strong>of</strong> uniform density at time t = 0, say<br />

ρ(x, 0) = 1, sketch the density <strong>of</strong> the material at time t = 3. Also<br />

sketch a graph <strong>of</strong> the particles’ velocity versus x at time t = 4.<br />

Example 3.8: worked Problem 1.2 Problem 1.2 [R,p5] leads into Section 3.3 where we model the flow of car traffic as a continuum. This problem is a little difficult, but shows how some algebra leads us to deduce that we may treat car traffic on a road as a material continuum!

(a) The probability of having n cars in a stretch of road of length $x + \delta x$, $P_n(x+\delta x)$, equals the probability of n cars in length x and none in length δx, together with the probability of n − 1 cars in length x and one car in length δx.

• Hence $P_n(x+\delta x) = P_n(x)(1 - \lambda\,\delta x) + P_{n-1}(x)\,\lambda\,\delta x$.

• Rearrange to $\dfrac{P_n(x+\delta x) - P_n(x)}{\delta x} + \lambda P_n = \lambda P_{n-1}$.

• Thus as $\delta x \to 0$, $\dfrac{dP_n}{dx} + \lambda P_n = \lambda P_{n-1}$ is a differential equation for $P_n$.

• Substitute the expression $P_n = (\lambda x)^n e^{-\lambda x}/n!$ to show that it satisfies the differential equation.

• One should also show that $\int_0^\infty P_n(x)\,dx = 1$ in order for $P_n$ to be a proper probability distribution. Use induction on n and integration by parts to do so. (Should any other property be checked?)

(b) Deduce:

(i) $P_n(0) = 0$, for $n \ge 1$, is the probability of n cars fitting on a stretch of road of length 0;

(ii) $P_0(L) = e^{-\lambda L}$ is the exponentially decaying probability of no cars on a length L;

(iii) $P_1(L) = \lambda L\,e^{-\lambda L}$ is the probability of finding just one car in a length L; it reasonably rises from zero with L, but over lengths bigger than $1/\lambda$ it decays to zero as more and more cars are likely on long lengths of road.

(c) The expected number of cars on a length x of road is
$$\begin{aligned}
\bar n(x) &= \sum_{n=0}^{\infty} n\,P_n(x) &&\text{by definition of expectation}\\
&= \sum_{n=1}^{\infty} n\,\frac{(\lambda x)^n e^{-\lambda x}}{n!} &&\text{by the value of } P_n\\
&= \lambda x\, e^{-\lambda x} \sum_{n=1}^{\infty} \frac{(\lambda x)^{n-1}}{(n-1)!} &&\text{rearranging factors}\\
&= \lambda x &&\text{by the Taylor series for } e^{\lambda x}.
\end{aligned}$$
By similar machinations deduce that the variance $\sigma^2(x) = \overline{(n - \bar n)^2} = \overline{n^2} - \bar n^2$ is also just $\lambda x$.

Thus using an averaging length L, the average density of cars is $\rho = \bar n/L = \lambda$. Since $\bar n$ typically has random fluctuations of size $\sigma = \sqrt{\lambda L}$, this estimate of the density has fluctuations of size $\sigma/L = \sqrt{\lambda/L} \to 0$ for large L. Thus averaging works, and so cars on a road may be viewed as a continuum!
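The “similar machinations” for the variance go most cleanly through the factorial moment (a worked sketch; verify each sum):

```latex
\overline{n(n-1)}
  = \sum_{n=2}^{\infty} n(n-1)\,\frac{(\lambda x)^n e^{-\lambda x}}{n!}
  = (\lambda x)^2 e^{-\lambda x} \sum_{n=2}^{\infty} \frac{(\lambda x)^{n-2}}{(n-2)!}
  = (\lambda x)^2 ,
```

so that $\sigma^2 = \overline{n^2} - \bar n^2 = \overline{n(n-1)} + \bar n - \bar n^2 = (\lambda x)^2 + \lambda x - (\lambda x)^2 = \lambda x$, as claimed.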


3.2 Conservation of mass

A biologist, physicist and mathematician were in a bar. They watch two people enter a house across the street. A little later, they see three people leave the house. The biologist says, “They must have reproduced.” The physicist says, “We must have misinterpreted the initial input.” The mathematician says, “If one more person enters the house, there will be no one inside.”

Once we know how to describe motion and properties, we then have to deduce how these relate to each other. Principles based upon conservation enable us to do this. Based upon the conservation of mass we deduce a differential equation of widespread importance.

Main aims: the most important aims of this section are:

• to understand how identifying physical processes in and on a slice of the continuum leads to a partial differential equation to solve;

• the derivation of the continuity equation
$$\frac{\partial\rho}{\partial t} + \frac{\partial}{\partial x}(\rho v) = 0\,. \qquad(3.2)$$

Reading 3.C Study all of Section 2.1 [R,pp13–16].
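A sketch of where (3.2) comes from (the full argument is in [R,§2.1]): the mass in any interval $[a,b]$ changes only through the flux $\rho v$ carried across its ends, so

```latex
\frac{d}{dt}\int_a^b \rho\,dx
  = \rho v\big|_{x=a} - \rho v\big|_{x=b}
  = -\int_a^b \frac{\partial}{\partial x}(\rho v)\,dx\,.
% Since the interval [a,b] is arbitrary, the integrands must balance pointwise:
\int_a^b \left[\frac{\partial\rho}{\partial t}
  + \frac{\partial}{\partial x}(\rho v)\right] dx = 0
\quad\Longrightarrow\quad
\frac{\partial\rho}{\partial t} + \frac{\partial}{\partial x}(\rho v) = 0\,.
```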



3.2.1 Exercises<br />

Activity 3.D Do Problem 2.1 [R,p15] and Exercises 3.9–3.12. Send in to the examiner for feedback at least Exercises 3.10–3.12.

Ex. 3.9: At some time the density <strong>of</strong> a material just happens to be constant<br />

in x and the velocity field is as drawn below<br />

[Velocity field v(x): v ranging over −0.5 to 1, x from 0 to 8.]

Use the continuity equation (3.2) to identify w<strong>here</strong> the density is increasing<br />

in time and w<strong>here</strong> the density is decreasing.<br />

Ex. 3.10: Suppose the density at some fixed station x evolves in time according<br />

to the following picture


[Density ρ at the fixed station against time t: ρ from 0.5 to 2.5, t from 0 to 6.]

Can you justifiably deduce anything from the continuity equation about<br />

the velocity field v in the neighbourhood <strong>of</strong> x? If so, what?<br />

Ex. 3.11: Suppose the density and velocity at some time t are such that the<br />

product ρv is as shown in the following picture<br />

[The product ρv against x: ρv from 0.5 to 1.5, x from 0 to 6.]

Can you justifiably deduce anything about the evolution <strong>of</strong> the density<br />

field ρ? If so, what?<br />

Ex. 3.12: Consider an interval [a, b] <strong>of</strong> a continuum and investigate the rate<br />

at which material is carried into and out <strong>of</strong> the interval. Suppose the<br />

velocity and density at the left-hand side is v(a) = 1 + t 2 and ρ(a) = 2,<br />

while that at the right-hand side is v(b) = (1+t) 2 and ρ(b) = 1/(1+t 2 ).<br />

At what net rate is matter being carried into the interval? How much<br />

is carried in between times t = 0 and t = 1?<br />

Ex. 3.13: Suppose the density field of a one-dimensional continuum is ρ = exp[sin(t − x)] and the velocity field is v = cos(t − x). What is the flux of material past x = 0 as a function of time? How much material passes x = 0 in the time interval [0, π/2]?



3.3 Car traffic<br />

As far as the laws <strong>of</strong> mathematics refer to reality, they are not<br />

certain, and as far as they are certain, they do not refer to reality.<br />

Albert Einstein<br />

One application <strong>of</strong> continuum modelling is to car traffic. We explore the<br />

modelling <strong>here</strong>, and from the mathematical model deduce phenomena that<br />

are seen on the roads.<br />

Main aims:<br />

the most important aims <strong>of</strong> this section are:<br />

• to appreciate the continuum modelling <strong>of</strong> car traffic;<br />

• the use <strong>of</strong> experimental results to formulate a complete problem;<br />

• the use <strong>of</strong> the classic approach <strong>of</strong> seeking equilibria and then linearisation<br />

to gain a preliminary understanding <strong>of</strong> the dynamics (as in Module<br />

1 but in vastly higher dimension).<br />

• and to introduce the basic features <strong>of</strong> the method <strong>of</strong> characteristics for<br />

solving nonlinear partial differential equations.<br />

Reading 3.E Study all of Section 2.2 [R,pp16–37].


Note that the method of characteristics is not just an algebraic technique: geometry and graphical drawing play a crucial role. This is a feature that many people find difficult, as they predominantly see mathematics purely as algebraic manipulation. But for the method of characteristics the graphical element is essential.

Example 3.14: Given the car flux–density relation $Q(\rho) = \rho(1 - \rho/150)$ cars per minute, where ρ is measured in cars per km, $0 \le \rho \le 150$:

(a) Draw the graph of characteristics for the evolution of a denser patch of cars for which the initial density is $\rho_0(x) = 25 + 50\,e^{-x^2/2}$ cars per km.

(b) Hence graph the predicted solution ρ(x, t) at times t = 0, 1, 2 and 3 minutes.

Solution: First, deduce the wave speed $c(\rho) = Q'(\rho) = 1 - \rho/75$ km per minute. Then tabulate the initial density field, the wave speed (the “slope” of the characteristics passing through all the initial points) and thus also the equation of the characteristic $x = s + c_0(s)\,t$:


   s    ρ₀      c₀ = c[ρ₀]   characteristic      t = 1    t = 2    t = 3
  −4    25.02   0.6664       x = −4 + 0.67 t     −3.33    −2.67    −2.00
  −3    25.56   0.6593       x = −3 + 0.66 t     −2.34    −1.68    −1.02
  −2    31.77   0.5764       x = −2 + 0.58 t     −1.42    −0.85    −0.27
  −1    55.33   0.2623       x = −1 + 0.26 t     −0.74    −0.47    −0.21
   0    75      0            x = 0 + 0 t          0        0        0
   1    55.33   0.2623       x = 1 + 0.26 t       1.26     1.52     1.79
   2    31.77   0.5764       x = 2 + 0.58 t       2.58     3.15     3.73
   3    25.56   0.6593       x = 3 + 0.66 t       3.66     4.32     4.98
   4    25.02   0.6664       x = 4 + 0.67 t       4.67     5.33     6.00

Also tabulated is the position, x value, <strong>of</strong> each characteristic at three<br />

later times. At these positions the density is the value <strong>of</strong> ρ 0 in the same<br />

row. On each <strong>of</strong> the characteristics, plotted below in a characteristic<br />

diagram, the density is constant, namely the value it had initially, as<br />

tabulated in the legend.


[Characteristic diagram in the tx-plane, 0 ≤ t ≤ 3, −4 ≤ x ≤ 6; the legend lists the constant density carried by each characteristic: 25.02, 25.56, 31.77, 55.33, 75, 55.33, 31.77, 25.56, 25.02.]

s = (-4:4)';                  % labels s of the characteristics
rho0 = 25+50*exp(-s.^2/2);    % initial density rho_0(s) on each characteristic
c0 = 1-rho0/75;               % wave speed c(rho_0), the slope dx/dt
t = 0:3;
x = s*ones(size(t))+c0*t;     % characteristics x = s + c_0(s) t
plot(x,t)                     % each row of x plotted against t
legend(num2str(rho0))
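For readers checking the tabulated values without Matlab, the same computation in Python (a sketch assuming numpy is available; it mirrors the Matlab snippet above):

```python
import numpy as np

s = np.arange(-4, 5)              # characteristic labels s = -4, ..., 4
rho0 = 25 + 50*np.exp(-s**2/2)    # initial density rho_0(s), cars per km
c0 = 1 - rho0/75                  # wave speed c(rho_0) = 1 - rho_0/75, km/min
for t in (1, 2, 3):
    # position x = s + c_0(s) t of each characteristic at time t
    print("t =", t, np.round(s + c0*t, 2))
```

The printed positions reproduce the last three columns of the table, for example −3.33 for s = −4 at t = 1.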

Then at any time t compute, from the equation for each characteristic given above, the value of x at which you will find the density ρ₀(s). Plotting these points gives the following curves for the density ρ(x, t) at each time t.


[Density profiles ρ(x, t) at times t = 0, 1, 2, 3: ρ axis from 0 to 70 cars/km, x from −4 to 4.]

s = linspace(-4,4)';          % fine grid of characteristic labels
rho0 = 25+50*exp(-s.^2/2);    % initial density rho_0(s)
c0 = 1-rho0/75;               % wave speed c(rho_0)
t = 0:3;
x = s*ones(size(t))+c0*t;     % position of each characteristic at each time
plot(x,rho0)                  % density rho_0(s) against x, one curve per time

Observe how, over time, the density steepens at the back <strong>of</strong> a bunch <strong>of</strong><br />

cars, and lessens at the front.<br />

In the traffic light example in the textbook you have to imagine that all values of density occur at the mathematical point x = 0. Physically, the initial density will rise smoothly from 0 some distance in front of the traffic lights to the jamming density some distance behind them. The reason is that density is only defined by choosing some averaging length; when this averaging length contains part queue and part empty road in front of the lights, you get an intermediate value of the density. However, this physically smooth transition occurs in the mathematical model at the mathematical point x = 0. Thus draw characteristics corresponding to every value of density emanating from x = 0.

3.3.1 Exercises<br />

Activity 3.F Do Problems 2.2–6 [R,pp34–6] and Exercises 3.15–3.16. Send in to the examiner for feedback at least Problems 2.2 & 2.3(a) [R,pp34–5], and Ex. 3.16(a) below. (Note: in the last line of Prob. 2.6(d), “dr/dt” should be “dρ/dt”.)

Ex. 3.15: Assume the flux Q(ρ) = ρ(1 − ρ/150)(1 − ρ/300) cars per minute<br />

w<strong>here</strong> the density ρ is in cars per km.<br />

(a) A uniform stream <strong>of</strong> cars is travelling at 50 km/hr. Approximately<br />

what density is the car traffic? At location x = 0, say, a group<br />

<strong>of</strong> cars leave the road to view the scenery, fill up with petrol, etc.<br />

Because they leave, the car traffic density is decreased locally: at<br />

what speed does the patch <strong>of</strong> low density travel in the car traffic?<br />

(b) Repeat the above for the case when the cars travel at 20 km/hr<br />

(because they are at a higher density).


Ex. 3.16: (a) Show that a constant $\rho(x, t) = \rho_*$ is an equilibrium solution (a fixed point) of
$$\frac{\partial\rho}{\partial t} = c(\rho)\,\frac{\partial^2\rho}{\partial x^2}\,.$$
Argue that small fluctuations to ρ about $\rho_*$, say $\hat\rho(x, t)$, then obey the differential equation $\dfrac{\partial\hat\rho}{\partial t} = c_*\dfrac{\partial^2\hat\rho}{\partial x^2}$ (approximately), where $c_* = c(\rho_*)$.

(b) Repeat the above for the differential equation
$$\frac{\partial\rho}{\partial t} = \frac{\partial}{\partial x}\!\left[c(\rho)\,\frac{\partial\rho}{\partial x}\right].$$

Ex. 3.17: The initial value problem $\dfrac{\partial\rho}{\partial t} + \rho\dfrac{\partial\rho}{\partial x} = 0$ such that $\rho(x, 0) = \rho_0(x)$ has solution $\rho = \rho_0(s)$ on the characteristics $x = s + \rho_0(s)t$. Regard $x = s + \rho_0(s)t$ as an implicit equation for the function $s(x, t)$, then differentiate it to find implicit formulae for $\dfrac{\partial s}{\partial t}$ and $\dfrac{\partial s}{\partial x}$. Hence show that $\rho = \rho_0[s(x, t)]$ does indeed satisfy the governing differential equation $\dfrac{\partial\rho}{\partial t} + \rho\dfrac{\partial\rho}{\partial x} = 0$.

Ex. 3.18: In Figure 3.1 is drawn the wave speed c(ρ) as a function <strong>of</strong> density<br />

ρ for car traffic along some road. Sketch the corresponding car<br />

flux-density relation Q(ρ). Estimate the value <strong>of</strong> the density corresponding<br />

to the maximum flux <strong>of</strong> cars.<br />

Also shown in Figure 3.1 is a plot <strong>of</strong> some initial car density field<br />

ρ 0 (x). Draw, with a little care, in the tx-plane characteristic curves<br />

for the subsequent evolution <strong>of</strong> car traffic; draw enough so that you<br />

then can draw a predicted density field ρ(x, t) at time t = 1 minute.


[Top panel: wave speed c(ρ) in km/min, ranging over −0.5 to 1, against density ρ from 0 to 150 cars/km. Bottom panel: initial car density ρ₀(x), from 0 to 150 cars/km, against x from −3 to 3 km.]

Figure 3.1:



Approximately w<strong>here</strong> and when do you estimate a “traffic shock” will<br />

form? Give reasons.<br />

3.3.2 Age-structured populations: another application

This example of age-structured populations is introduced to show a slightly different use of the continuity equation. The same important concepts are used in a different application.

Consider a population <strong>of</strong> individuals <strong>of</strong> some species, either plant or animal.<br />

We study the structure <strong>of</strong> ages <strong>of</strong> the individuals in the population (not their<br />

spatial structure as in car traffic and other applications explored later).<br />

• Let x denote the age <strong>of</strong> individuals, in years say, and use fractions <strong>of</strong><br />

years by letting the age x be a real number (not just integers). Let<br />

t denote time as usual, in years also. Then the density ρ(x, t) is the<br />

average number <strong>of</strong> individuals in the population who at time t have age<br />

approximately x.<br />

• Now, the individuals who cross an age x are precisely those that are at age x; hence the flux is q = ρ. Equivalently, imagine that individuals are aging at a rate v = 1 year per year (obviously!), and so again the flux q = ρv = ρ.
again the flux q = ρv = ρ.



• Individual plants or animals will die due to accidents, disease, old age, etc. For simplicity, here just assume death only by accident with a constant probability; ignore old age and other mortal enemies. Then this is an example of the process introduced in Problem 2.1 where individuals are removed at some rate. The expected number of individuals to die at age x, and hence be removed from the population, is proportional to the number at that age, namely ρ. Thus include a “source” term r = −λρ on the right-hand side of the continuity equation (negative because deaths remove individuals).

• A continuity equation for the age structure ($\partial\rho/\partial t + \partial q/\partial x = r$) is thus
$$\frac{\partial\rho}{\partial t} + \frac{\partial\rho}{\partial x} = -\lambda\rho\,.$$

• For example, a steady age population is found by assuming ∂ρ/∂t = 0<br />

and solving this equation. Then ρ = Ce −λx which shows the exponentially<br />

decreasing numbers <strong>of</strong> individuals at age x as fatal accidents<br />

almost inevitably happen to an individual sooner or later.<br />

• But what is the integration constant C? A partial differential equation generally needs boundary conditions, and so far I have not supplied it with any. Here we need to specify some birth-rate. A constant rate of births could reflect a scientist continually preparing new young cultures to place in the population: specified by, say, ρ(0, t) = C.

A more sophisticated model says that the number of births depends upon the number and age of the parent population. One simple example arises by assuming that each individual, constantly and independently of age, gives rise to new individuals: then $\rho(0, t) = \mu\int_0^\infty \rho\,dx$ for some birth-rate μ. Determine the integration constant C as a function of the birth-rate μ.
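For the sophisticated model, substituting the steady distribution into the birth condition gives a consistency requirement (my own working; check it against your answer):

```latex
C = \rho(0,t)
  = \mu \int_0^\infty C\,e^{-\lambda x}\,dx
  = \frac{\mu C}{\lambda}\,,
```

so a non-zero steady age structure exists only when the birth-rate balances the death-rate, $\mu = \lambda$; the constant $C$ is then not fixed by $\mu$ but by the total population $N = \int_0^\infty \rho\,dx = C/\lambda$, that is, $C = \lambda N$.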

3.3.3 Answers to selected Exercises

3.6 $v = \sqrt{1 - \cos 2x}$

3.10 No.

3.11 Yes. The rate of change in density with time has the opposite sign to the slope of ρv.

3.12 Rate of matter increase is $1 + 2t^2 - \dfrac{2t}{1+t^2}$. Total is $1 + \tfrac{2}{3} - \log 2$.

3.15 (a) ρ = 17.33 cars per km and c = 40.40 km per hour; (b) ρ = 81.39 cars per km and c = −11.17 km per hour.
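The quoted values for Exercise 3.15 can be checked numerically; a Python sketch (my own check, with the flux Q taken from the exercise statement):

```python
import numpy as np

def V(rho):
    """Car speed V = Q/rho = (1 - rho/150)(1 - rho/300), km per minute."""
    return (1 - rho/150) * (1 - rho/300)

def c(rho):
    """Wave speed c = Q'(rho), where Q = rho - rho**2/100 + rho**3/45000."""
    return 1 - rho/50 + rho**2/15000

def density_for_speed(v_kmh):
    """Smaller root of V(rho) = v: the physical branch with rho <= 150."""
    v = v_kmh/60                              # convert km/hr to km/min
    return min(np.roots([1/45000, -1/100, 1 - v]).real)

for v_kmh in (50, 20):
    rho = density_for_speed(v_kmh)
    print(v_kmh, round(rho, 2), round(60*c(rho), 2))  # rho cars/km, c km/hr
```

This reproduces ρ = 17.33, c = 40.40 km/hr in part (a), and ρ = 81.39, c = −11.17 km/hr in part (b).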

Prob. 1.1 (a) $\rho = \frac{1}{L}(2n+1)$ where $n = \left[\frac{L}{2}\right]$; (b) $p = \frac{2n+1}{L}\left\{2 + 10^{-6}\,n(n+1)/3\right\}$.

Prob. 1.3 (a) Plot $x = 1/(1+t)$; (b) $v_L = -\xi^2/(1+\xi t)^2$ and $a_L = 2\xi^3/(1+\xi t)^3$; (c) $v_E = -x^2$; (d) determine $Dv_E/Dt = 2x^3$.

Prob. 1.5 The label ξ is constant for each particle.

Prob. 2.5 (a) $\dfrac{\partial s}{\partial x} = \dfrac{1}{1 + c_0'(s)t}$ and $\dfrac{\partial s}{\partial t} = \dfrac{-c_0(s)}{1 + c_0'(s)t}$; (b) a shock!

Prob. 2.6 (a) Because the radioactive material is conserved. (b) Characteristics are $x = (s+t)(1+t^2)$. (c) Follows from $s = x/(1+t^2) - t$. (d) Characteristics stay the same, but there is decay along each characteristic.



3.4 Summary<br />

• The continuum approximation leads to describing density, velocity and<br />

stress/pressure fields, for example, as functions <strong>of</strong> position x and time t<br />

(§3.1).<br />

• Conservation of material leads to the continuity equation (§3.2)
$$\frac{\partial\rho}{\partial t} + \frac{\partial}{\partial x}(\rho v) = 0\,.$$

• Typically, experimental observations are needed to complete the set <strong>of</strong><br />

continuum equations. For example, v = V (ρ) for cars (§3.3).<br />

• Linearisation <strong>of</strong> dynamical equations about convenient equilibria leads<br />

to approximate solutions which allow us to make useful predictions<br />

about the dynamics that occur in the applications.<br />

• The method <strong>of</strong> characteristics, based upon the chain rule and a graphical<br />

approach, leads to exact solutions <strong>of</strong> the partial differential equations<br />

describing nonlinear waves and shocks.


Module 4<br />

The dynamics <strong>of</strong> momentum<br />

Mass conservation is just one powerful principle in modelling the dynamics<br />

<strong>of</strong> material. Another fundamental principle for mechanical systems is the<br />

conservation <strong>of</strong> momentum. This is explored in this module and applied to<br />

the dynamics <strong>of</strong> gases and blood.<br />

Module contents<br />

4.1 Conservation <strong>of</strong> momentum . . . . . . . . . . . . . . . 115<br />

4.1.1 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 116<br />

4.2 Dynamics <strong>of</strong> ideal gases . . . . . . . . . . . . . . . . . . 119



4.2.1 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 120<br />

4.3 Equations <strong>of</strong> quasi-one-dimensional blood flow . . . . 124<br />

4.3.1 Answers to selected Exercises . . . . . . . . . . . . . . . 124<br />

4.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . 125



4.1 Conservation <strong>of</strong> momentum<br />

Balls in flight and other bodies obey Newton’s laws <strong>of</strong> motion, in particular<br />

F = ma, or in words, the net applied force F causes a body with mass m<br />

to move with acceleration a. These rules apply when the body is rigid. But<br />

many bodies are not. Many materials flex and compress or expand. What<br />

rules apply then? We find out in this section that Newton’s laws still apply<br />

in a novel manner.<br />

Main aims: This section largely repeats the arguments for conservation of mass (§3.2), although applied in a more sophisticated fashion. The most important aims are:

• to understand how identifying the physical processes in and on a slice of continuum leads to a partial differential equation;

• to derive the momentum equation
$$\rho\,\frac{Dv}{Dt} = \rho\left(\frac{\partial v}{\partial t} + v\,\frac{\partial v}{\partial x}\right) = F + \frac{\partial\sigma}{\partial x}\,. \qquad(4.1)$$

Reading 4.A Study Section 3.1 [R,pp47–51].
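A sketch of how (4.1) arises (details in [R,§3.1]): apply Newton's second law to a thin material slice of width δx, with body force F per unit length and stress σ pulling on its two ends:

```latex
% mass of slice: \rho\,\delta x ;  acceleration: Dv/Dt ;
% forces: body force F\,\delta x plus net stress \sigma(x+\delta x)-\sigma(x).
\rho\,\delta x\,\frac{Dv}{Dt}
  = F\,\delta x + \sigma(x+\delta x) - \sigma(x)\,;
% divide by \delta x and let \delta x \to 0 :
\rho\,\frac{Dv}{Dt} = F + \frac{\partial \sigma}{\partial x}\,.
```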



4.1.1 Exercises<br />

Activity 4.B Do Problems 3.1–2 [R,pp51–2] and the exercises below. Send in to the examiner for feedback at least Exercises 4.1 and 4.3 below.

Ex. 4.1: Consider a body extending over some range <strong>of</strong> x which is initially<br />

at rest and then is accelerated into motion by the gravitational body<br />

force F = −gρ. Show that v = −gt (independent <strong>of</strong> x) satisfies the<br />

momentum equation (4.1) and explain why this describes a body falling<br />

freely.<br />

Ex. 4.2: A material moves according to the velocity field $v = \dfrac{2tx}{1+t^2} + 1 + t^2$ and has a constant density field ρ. How much momentum is in the interval [0, 1] of the material? As a function of time t, what is the rate at which momentum enters the interval [0, 1] through being carried across the ends x = 0 and x = 1?

Ex. 4.3: A material body has no applied body force, F = 0, but has an internal<br />

pattern <strong>of</strong> stress σ(x) shown below. Sketch the resultant material<br />

acceleration. For simplicity assume the body has constant density in x<br />

at this particular time.


[Stress σ(x): σ ranging over −1 to 1, x from 0 to 8.]

Example 4.4: worked Problem 3.3 In outline [R,p52].

(a) The continuity equation $\frac{\partial\rho}{\partial t} + \frac{\partial}{\partial x}(\rho v) = 0$ when ρ is constant reduces to just $\frac{\partial v}{\partial x} = 0$, which implies v may not depend upon x and hence only depends upon t.

(b) With stress σ = −p, body force F = −Cv, and v independent of x, the momentum equation (4.1) reduces to

• $\dfrac{\partial p}{\partial x} = -\left(Cv + \rho\dfrac{\partial v}{\partial t}\right)$, which, since the right-hand side is independent of x, is integrated to

• $p = -\left(Cv + \rho\dfrac{\partial v}{\partial t}\right)x + D$ for some integration constant D independent of x,

• and thus observe the pressure is linear in x.

(c) Substituting x = 0 determines D = p₀(t). Substituting x = L then determines the given differential equation for v(t).

(d) The differential equation is a linear, first-order differential equation and so may be solved by multiplying by the integrating factor $e^{Ct/\rho}$. Treating $\frac{p_0 - p_L}{L}$ as a constant, obtain the solution
$$v = \frac{p_0 - p_L}{LC}\left(1 - e^{-Ct/\rho}\right).$$
Interpret the solution to see that the flow exponentially quickly approaches the steady, long-term flow-rate of $(p_0 - p_L)/(LC)$, representing a balance between the driving pressure drop, $p_0 - p_L$, and the total viscous drag, LC.
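The integrating-factor steps for part (d), writing the pressure drop as $p_0 - p_L$ consistent with $D = p_0$ at $x = 0$ from part (c) (a worked sketch):

```latex
% From parts (b)-(c): \rho\,dv/dt + C v = (p_0 - p_L)/L , with v(0) = 0.
% Multiply by the integrating factor e^{Ct/\rho}:
\frac{d}{dt}\!\left( v\,e^{Ct/\rho} \right)
  = \frac{p_0 - p_L}{\rho L}\, e^{Ct/\rho}\,,
\qquad\text{so}\qquad
v\,e^{Ct/\rho} = \frac{p_0 - p_L}{LC}\left( e^{Ct/\rho} - 1 \right),
\quad
v = \frac{p_0 - p_L}{LC}\left( 1 - e^{-Ct/\rho} \right).
```

The flow is positive when $p_0 > p_L$, that is, when it is driven from the high-pressure end at $x = 0$.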



4.2 Dynamics <strong>of</strong> ideal gases<br />

Perhaps the next simplest mechanical continuum is that formed by ideal<br />

gases. For example, air is an ideal gas to a very good approximation. Here<br />

we show how to use the two conservation equations, one for material and one<br />

for momentum, to deduce the nature and propagation <strong>of</strong> sound. We extend<br />

the analysis to a description <strong>of</strong> a sonic boom such as that generated by a<br />

supersonic plane.<br />

Main aims:

• to understand the need to supplement the partial differential equations by an equation of state;

• to see the wave equation arise in the linearised dynamics of the mathematical model
$$\frac{\partial^2 u}{\partial t^2} = c^2\,\frac{\partial^2 u}{\partial x^2}\,. \qquad(4.2)$$

• to show further use of the basics of the method of characteristics.

Reading 4.C Study Section 3.2 [R,pp52–8]. (Note: in [R,p55], twice $g(x - c_* t)$ should read $g(x + c_* t)$.)
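The corrected sign is transparent in the general (d'Alembert) solution of the wave equation (4.2): a right-travelling part is a function of $x - c_* t$ and a left-travelling part of $x + c_* t$,

```latex
u(x,t) = f(x - c_* t) + g(x + c_* t)\,,
\qquad\text{since}\qquad
\frac{\partial^2 u}{\partial t^2}
  = c_*^2\left( f'' + g'' \right)
  = c_*^2\,\frac{\partial^2 u}{\partial x^2}\,.
```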



4.2.1 Exercises<br />

Activity 4.D Do Problem 3.4 [R,p58].

Example 4.5: worked Problem 3.5 In outline [R,p58], fill in the details.<br />

(a) Reproduce the argument on [R,pp24–5] for velocity v instead <strong>of</strong> ρ,<br />

and for k + v instead <strong>of</strong> c(ρ).<br />

(b) Draw a characteristic diagram for 0 ≤ x ≤ 4.5 and 0 ≤ t ≤ 4 as in<br />

Figure 4.1.<br />

• Characteristics emanating from the x-axis (x > 0) have slope<br />

k(= 1) as the velocity v = 0 on them all from the initial state.<br />

For example, the characteristic emanating from (x, t) = (1, 0)<br />

is the line x = 1 + t, and on this line we know that v = 0.<br />

• Characteristics emanating from the t-axis (t > 0) have differing<br />

slopes <strong>of</strong> 1 + v for the prescribed v. For example,<br />

the characteristic emanating from (x, t) = (0, 1/2) is the line<br />

x = ( )<br />

1 + 1<br />

2π (t − 1/2), and on this line v = 1/2π.<br />

By looking at the value <strong>of</strong> v on each characteristic at each <strong>of</strong><br />

the two times t = 2 and t = 4, draw the solution curves for the<br />

velocity v as seen in Figure 4.2. Evidently from the intersection<br />

<strong>of</strong> the characteristics the first shock needs to form at some time<br />

between roughly t = 2 and t = 2.5.<br />

A Matlab<br />

program to animate<br />

the characteristic<br />

solution is in<br />

fanreal.m, try it.


t<br />

Module 4. The dynamics <strong>of</strong> momentum 121<br />

4<br />

3.5<br />

3<br />

2.5<br />

2<br />

1.5<br />

1<br />

0.5<br />

0<br />

0 0.5 1 1.5 2 2.5 3 3.5 4 4.5<br />

x<br />

Figure 4.1: characteristic diagram for a fan blowing air into a long pipe.


Module 4. The dynamics <strong>of</strong> momentum 122<br />

0.2<br />

t=2<br />

0.1<br />

v<br />

0<br />

-0.1<br />

-0.2<br />

0 0.5 1 1.5 2 2.5 3 3.5 4 4.5<br />

x<br />

0.2<br />

t=4<br />

0.1<br />

v<br />

0<br />

-0.1<br />

-0.2<br />

0 0.5 1 1.5 2 2.5 3 3.5 4 4.5<br />

x<br />

Figure 4.2: velocity field predicted at two times by the characteristic solution.<br />

Note the multi-valued solution indicating the need for “shocks”.


Module 4. The dynamics <strong>of</strong> momentum 123<br />

Exercise 4.6: For an ideal gas with γ = 1 the continuity and momentum<br />

equations are<br />

∂ρ<br />

∂t + ∂(ρv)<br />

[ ∂v<br />

∂x = 0 and ρ ∂t + v ∂v ]<br />

= −k 2 ∂ρ<br />

∂x ∂x .<br />

Linearise about the fixed point v = 0 and ρ = ρ ∗ and then combine the<br />

linearised equations to deduce that sound, density-velocity fluctuations,<br />

obey the wave equation<br />

∂ 2ˆv<br />

∂t 2<br />

= k2<br />

∂2ˆv<br />

∂x 2 .


Module 4. The dynamics <strong>of</strong> momentum 124<br />

4.3 Equations <strong>of</strong> quasi-one-dimensional blood flow<br />

Main aims:<br />

• to generalise the continuity and momentum equations to situations<br />

w<strong>here</strong> the cross-sectional area <strong>of</strong> a continuum varies in space and time;<br />

• to see how to model the dynamics <strong>of</strong> blood flowing through an elastic<br />

artery by the forced wave equation.<br />

Reading 4.E Study Section 5.1–2, [R,pp111–123].<br />

Activity 4.F Do Problems 5.1–2, [R,pp123–4]. Send in to the examiner for<br />

feedback at least Prob. 5.1.<br />

4.3.1 Answers to selected Exercises<br />

4.2 ρ ( t<br />

+ 1 + t 2) , −ρ [ ]<br />

2t + 4t2<br />

1+t 2 (1+t 2 ) 2<br />

Prob. 3.1 (a) F i+1 = −(4 − i)mg<br />

(b) σ = −(L − x)ρg<br />

Prob. 3.2 ∂(ρv) + ∂(ρv2 )<br />

∂t ∂x<br />

= F + ∂σ<br />

∂x + ru<br />

Prob. 3.4 (a) ρ = [ ρ 2/5<br />

0 − 2 5 gx/k2] 5/2<br />

(b) ∂v + ( ρ 1/5<br />

∂t ∗ k + 6v/5 ) ∂v<br />

= 0 ∂x


Module 4. The dynamics <strong>of</strong> momentum 125<br />

4.4 Summary<br />

• The principle <strong>of</strong> conservation <strong>of</strong> momentum leads to the momentum<br />

equation (§4.1) ( ∂v<br />

ρ<br />

∂t + v ∂v )<br />

= F + ∂σ<br />

∂x ∂x .<br />

• Typically, experimental observations are needed to complete the set <strong>of</strong><br />

continuum equations. For example, σ = −p ∝ −ρ γ−1 for gasses (§4.2).<br />

• Linearisation <strong>of</strong> dynamical equations about convenient equilibria leads<br />

to approximate solutions which allow us to make useful predictions<br />

about the dynamics that occur in the applications.<br />

• In situations w<strong>here</strong> the continuum varies in cross-sectional area, say<br />

A(x, t), the continuity equation becomes (§4.3)<br />

∂<br />

∂t (Aρ) + ∂<br />

∂x (Aρv) = 0 ,<br />

and the momentum equation is<br />

[ ∂v<br />

Aρ<br />

∂t + v ∂v ]<br />

= F 1 + ∂<br />

∂x ∂x (Aσ) .<br />

• The material and muscles <strong>of</strong> an artery suggest a linear model, Hooke’s<br />

law for arteries, p = p ∗ +α(R−R ∗ )+P (x, t), relating the pressure (−σ)<br />

to the varying radius <strong>of</strong> the artery and including the muscle applied<br />

pressure P (§4.3).


Part II<br />

Structure, algebra and<br />

approximation <strong>of</strong> applied functions


Part contents<br />

5 The nature <strong>of</strong> infinite series 129<br />

5.1 Introduction to summing an infinite series . . . . . . . . . . . 132<br />

5.2 Establishing when a series converges . . . . . . . . . . . . . . 142<br />

5.3 Power series . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147<br />

5.4 Taylor’s theorem in n-dimensions . . . . . . . . . . . . . . . . 166<br />

5.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183<br />

6 Series solutions <strong>of</strong> differential equations give special functions<br />

185<br />

6.1 Power series method leads to Legendre polynomials . . . . . . 189<br />

6.2 Frobenius method is needed to describe Bessel functions . . . 195<br />

6.3 Computer algebra for repetitive tasks . . . . . . . . . . . . . . 206


PART CONTENTS 128<br />

6.4 The orthogonal solutions to second order differential equations 245<br />

6.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249<br />

7 Linear transforms and their eigenvectors on inner product<br />

spaces 252<br />

7.1 Inner product spaces . . . . . . . . . . . . . . . . . . . . . . . 255<br />

7.2 The nature <strong>of</strong> linear transformations . . . . . . . . . . . . . . 269<br />

7.3 Revision <strong>of</strong> eigenvalues and eigenvectors . . . . . . . . . . . . 284<br />

7.4 Diagonalisation transformation . . . . . . . . . . . . . . . . . 289<br />

7.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 312


Module 5<br />

The nature <strong>of</strong> infinite series<br />

Quite <strong>of</strong>ten we use power series to approximately solve differential equations<br />

(see the next module). For example, an exact solution to the differential<br />

equation y ′′ = 6y 2 is y = 1/(1+x) 2 . But suppose, using techniques developed<br />

in the next module, we knew only the power series approximate solution<br />

y = 1 − 2x + 3x 2 − 4x 3 + 5x 4 − · · · : how can we sensibly ascribe a value to<br />

such an infinite sum?<br />

This module will focus on the following question:<br />

How is it possible, quite generally, to add up infinitely many numbers<br />

and still obtain a sum which is finite and sensible?


Module 5. The nature <strong>of</strong> infinite series 130<br />

A set <strong>of</strong> infinitely many numbers will be called a ‘sequence’ and when the<br />

numbers in a sequence are added together they are said to form an ‘infinite<br />

series’. If an infinite series has a finite sum it is said to ‘converge’ and if not,<br />

to ‘diverge’.<br />

Though not couched in these terms, our question has a long history in mathematics,<br />

beginning with the work <strong>of</strong> the Greek philosopher Zeno <strong>of</strong> Elea<br />

in the 5th century B.C. Zeno is noted for having posed four ‘paradoxes’<br />

which showed that in order to understand fundamental concepts like motion,<br />

change, continuity and infinity, one must resolve questions like the one we<br />

have before us. In turn, it was essential that these concepts be placed on<br />

a firm mathematical foundation to allow the complete development <strong>of</strong> differential<br />

and integral calculus, begun by Newton and Leibnitz in the 17th<br />

century.<br />

Module contents<br />

5.1 Introduction to summing an infinite series . . . . . . 132<br />

5.1.1 Zeno’s Second Paradox: Achilles and the Tortoise . . . . 133<br />

5.1.2 Case studies: using partial sums . . . . . . . . . . . . . 134<br />

5.1.3 Case study: the harmonic series diverges . . . . . . . . . 139<br />

5.2 Establishing when a series converges . . . . . . . . . . 142<br />

5.2.1 Absolute and conditional convergence . . . . . . . . . . 143<br />

5.2.2 Tests for the convergence <strong>of</strong> series . . . . . . . . . . . . 145


Module 5. The nature <strong>of</strong> infinite series 131<br />

5.3 Power series . . . . . . . . . . . . . . . . . . . . . . . . . 147<br />

5.3.1 Functions from power series . . . . . . . . . . . . . . . . 151<br />

5.3.2 Taylor and Maclaurin Series . . . . . . . . . . . . . . . . 152<br />

5.3.3 Truncation error for Taylor series . . . . . . . . . . . . . 155<br />

5.3.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 164<br />

5.4 Taylor’s theorem in n-dimensions . . . . . . . . . . . . 166<br />

5.4.1 Identify local maxima and minima . . . . . . . . . . . . 170<br />

5.4.2 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 180<br />

5.4.3 Answers to selected Exercises . . . . . . . . . . . . . . . 182<br />

5.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . 183


Module 5. The nature <strong>of</strong> infinite series 132<br />

5.1 Introduction to summing an infinite series<br />

Suppose then that we have an infinite sequence <strong>of</strong> real numbers and, for<br />

simplicity, that all the numbers are positive. Intuitively, it is clear that if<br />

all the numbers remain about the same size, or if they progressively increase<br />

in size, such as 1, 2, 3, 4,. . . , then when the numbers are added together<br />

their sum will grow without limit. On the other hand, if the numbers grow<br />

progressively smaller in size, so that when the numbers are added together<br />

each successive number contributes less and less to the overall sum, such as<br />

1, 1/2, 1/4, 1/8, . . . , then it might be possible for the sum to remain finite.<br />

Now suppose that we allow the sequence to contain both positive and negative<br />

numbers. The negative numbers will tend to cancel out the contributions<br />

which the positive numbers make to the sum. If the negative numbers are<br />

randomly interspersed among the positive numbers, then the effect that they<br />

might have on the sum is difficult to assess. However, in many practical<br />

situations, the negative terms alternate with the positive ones and this is<br />

easier to handle. These intuitive ideas will be developed more fully below.<br />

Main aims:<br />

• introduce some examples <strong>of</strong> summing an infinite series;<br />

• show examples <strong>of</strong> when a sum cannot be found.


Module 5. The nature <strong>of</strong> infinite series 133<br />

5.1.1 Zeno’s Second Paradox: Achilles and the Tortoise<br />

Consider just one <strong>of</strong> Zeno’s paradoxes which, in modern units, could be<br />

expressed as follows.<br />

Achilles, who runs 10 times faster than a tortoise, set <strong>of</strong>f to chase one<br />

100 metres away. At the same time the tortoise began to crawl away from<br />

him. By the time Achilles reached the point w<strong>here</strong> the tortoise started, the<br />

tortoise was 10 m away. Achilles continued the chase but, upon reaching the<br />

tortoises previous position, the tortoise had moved and was now 1 m away.<br />

Achilles continued for another metre, but yet again the tortoise had moved<br />

further. This apparently continues forever: the tortoise has always moved<br />

by the time Achilles had reached w<strong>here</strong> it last was. Evidently, Achilles was<br />

never able to catch the tortoise.<br />

This conclusion is clearly absurd. We know from experience that tortoises<br />

are relatively easy to catch. Zeno was concerned with finding the fault in<br />

his logic. Had he the use <strong>of</strong> a modern number system, much <strong>of</strong> his problem<br />

would have disappeared: the total distance run by Achilles in chasing the<br />

tortoise is<br />

100 + 10 + 1 + 0.1 + 0.01 + 0.001 + · · · = 111.11111 . . . metres.<br />

To suggest that Achilles could not cover this distance is to say that he would<br />

never be able to run 112m, which he certainly could. The problem is with the<br />

word never. The ancient Greeks apparently thought that it was impossible


Module 5. The nature <strong>of</strong> infinite series 134<br />

to sum an infinite set <strong>of</strong> numbers and arrive at a finite sum. This is refuted<br />

in our number system by the commonplace notion <strong>of</strong> a recurring decimal, for<br />

example<br />

1<br />

3 = 0.333333 . . .<br />

= 3 10 + 3<br />

10 2 + 3<br />

10 3 + 3<br />

10 4 + 3<br />

10 5 + · · ·<br />

w<strong>here</strong> the right-hand side is the sum <strong>of</strong> an infinite series and the left-hand<br />

side is its clearly finite sum. Thus this infinite series converges to the value<br />

1/3.<br />

This is an example<br />

<strong>of</strong> a convergent<br />

geometric series.<br />

Definition 5.1 Given an infinite sequence <strong>of</strong> numbers that we wish to sum,<br />

say z 1 , z 2 , z 3 , . . . , we define the partial sums S n = ∑ n<br />

k=1 z k and say that the<br />

infinite series, the infinite sum, ∑ ∞<br />

k=1 z k converges to the value lim n→∞ S n if<br />

this limit exists.<br />

5.1.2 Case studies: using partial sums<br />

Example 5.1: establishing convergence from partial sums.<br />

show that:<br />

∞∑ 1<br />

k(k + 1) = 1 .<br />

k=1<br />

Here I


Module 5. The nature <strong>of</strong> infinite series 135<br />

• Begin by considering the sequence {S n } <strong>of</strong> partial sums:<br />

S 1 =<br />

S 2 =<br />

S 3 =<br />

1<br />

(the first term)<br />

1 × 2<br />

1<br />

1 × 2 + 1 (the sum <strong>of</strong> the first 2 terms)<br />

2 × 3<br />

1<br />

1 × 2 + 1<br />

2 × 3 + 1 (the sum <strong>of</strong> the first 3 terms)<br />

3 × 4<br />

.<br />

S n =<br />

1<br />

1 × 2 + 1<br />

2 × 3 + 1<br />

3 × 4 + · · · + 1<br />

n(n + 1) .<br />

• Finding a sum for this series requires that we find a limit for the<br />

sequence {S n }. To proceed, note that<br />

then<br />

S n =<br />

1<br />

n(n + 1) = 1 n − 1<br />

n + 1<br />

( 1<br />

1 − 1 ( 1<br />

+<br />

2)<br />

2 − 1 ( 1<br />

+ · · · +<br />

3)<br />

n − 1 − 1 ) ( 1<br />

+<br />

n n − 1 )<br />

.<br />

n + 1<br />

• Clearly, all terms cancel except the first and last, a process known<br />

as a telescopic sum and this leaves:<br />

S n = 1 − 1<br />

n + 1 .


Module 5. The nature <strong>of</strong> infinite series 136<br />

• It follows that:<br />

(<br />

lim S n = lim 1 − 1 )<br />

= 1 .<br />

n→∞ n→∞ n + 1<br />

Thus ∑ ∞ 1<br />

k=1<br />

below.<br />

k(k+1)<br />

converges and its sum is 1, as is seen in the table<br />

n nth term z n partial sum S n<br />

1 0.5 0.5<br />

10 0.0090909090909 0.909090909090<br />

50 0.0003921568627 0.980392156862<br />

100 0.0000990099009 0.990099009900<br />

200 0.0000248756218 0.995024875621<br />

500 0.0000039920159 0.998003992015<br />

1000 0.0000009990009 0.999000999000<br />

10 000 0.0000000099990 0.999900009999<br />

This result is also displayed graphically using Matlab


Module 5. The nature <strong>of</strong> infinite series 137<br />

1<br />

0.9<br />

S n<br />

0.8<br />

0.7<br />

0.6<br />

S n<br />

0.5<br />

0 5 10 15 20 25 30 35 40 45 50<br />

n<br />

1<br />

0.9<br />

0.8<br />

0.7<br />

0.6<br />

0.5<br />

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1<br />

1/n<br />

n=50;<br />

k=1:n;<br />

s=cumsum(1./(k.*(k+1)));<br />

subplot(2,1,1)<br />

plot(k,s,’+’,k,1+zeros(size(k)),’--’)<br />

subplot(2,1,2)<br />

plot(1./k,s,’.’,0,1,’o’)<br />

The top plot shows the partial sums converging to the limit as n → ∞<br />

and the bottom plot shows the same limit, plotted as a circle in the<br />

top left corner, but perhaps more convincingly as 1/n → 0 (equivalent<br />

to n → ∞) by plotting S n against 1/n.<br />

Example 5.2: establishing divergence from partial sums. The series<br />

∞∑<br />

(−1) k+1 = 1 − 1 + 1 − 1 + · · ·<br />

k=1


Module 5. The nature <strong>of</strong> infinite series 138<br />

is divergent, for the partial sum<br />

n∑<br />

S n = (−1) k+1 = 1 − 1 + 1 − 1 + · · · + (−1) n+1 =<br />

k=1<br />

{<br />

0 , if n is even,<br />

1 , if n is odd.<br />

Thus the sequence <strong>of</strong> partial sums is {1, 0, 1, 0, . . .} which has no limit.<br />

This example provides a good illustration <strong>of</strong> the absurdities which can<br />

arise from supposing that a limit exists when, in fact, it does not.<br />

• Suppose that the above series has a limit, say S, then<br />

S = 1 − 1 + 1 − 1 + 1 − 1 + · · ·<br />

= 1 − (1 − 1 + 1 − 1 + 1 − · · ·)<br />

= 1 − S<br />

⇒ 2S = 1<br />

⇒ S = 1 2 .<br />

• However, it is equally valid (actually equally invalid) to argue<br />

S = 1 − 1 + 1 − 1 + 1 − 1 + · · ·<br />

= (1 − 1) + (1 − 1) + (1 − 1) + · · ·<br />

= 0 ,<br />

• or again,<br />

S = 1 − (1 − 1) − (1 − 1) − (1 − 1) − · · ·<br />

= 1 .<br />

In this context<br />

divergence has<br />

nothing to do with a<br />

differential operator!<br />

It means that an


Module 5. The nature <strong>of</strong> infinite series 139<br />

• We cannot sensibly ascribe any particular value to the sum and<br />

hence we say that the series is divergent.<br />

5.1.3 Case study: the harmonic series diverges<br />

The infinite series<br />

∞∑<br />

k=1<br />

1<br />

k = 1 + 1 2 + 1 3 + 1 4 + · · ·<br />

is called the harmonic series. We will show in a moment that the harmonic<br />

series diverges, which is important in connection with Kreyszig’s caution<br />

[K,p735] that the terms in a series getting inexorably smaller, z k → 0, is not<br />

a sufficient condition for the series to converge.<br />

Example 5.3: The harmonic series is divergent The pro<strong>of</strong> is by contradiction,<br />

i.e. assume that the series converges to some value H, and<br />

show that this leads to a contradiction.<br />

Let H = 1 + 1 2 + 1 3 + 1 4 + 1 5 + 1 6 + 1 7 + · · · ,<br />

E = 1 2 + 1 4 + 1 6 + · · · ,<br />

O = 1 + 1 3 + 1 5 + 1 7 + · · · .<br />

E represents the<br />

sum <strong>of</strong> the even<br />

terms and O the<br />

sum <strong>of</strong> the odd<br />

terms. Since they<br />

are both sub-sets <strong>of</strong><br />

H, they must<br />

converge if H does.


Module 5. The nature <strong>of</strong> infinite series 140<br />

Now observe three facts which form the contradiction.<br />

• Since the harmonic series has simply been partitioned into a series<br />

<strong>of</strong> its even terms and a series <strong>of</strong> its odd terms, we must have<br />

H = E + O .<br />

• Since for all n, the nth term <strong>of</strong> O is larger than the nth term <strong>of</strong><br />

E, it follows that<br />

O > E<br />

which means that O contributes more than half <strong>of</strong> the total <strong>of</strong> H,<br />

so that E must contribute less than half <strong>of</strong> the total.<br />

• Taking a common factor <strong>of</strong> 1/2 out <strong>of</strong> each term <strong>of</strong> E allows us to<br />

rewrite E as<br />

E = 1 (1 + 1 2 2 + 1 3 + 1 4 + 1 5 + 1 ·)<br />

6 + · · ,<br />

or E = 1 2 H ,<br />

which contradicts the previous observation that E must be less<br />

than half H.<br />

In spite <strong>of</strong> the fact that 1/k → 0 as k → ∞, the harmonic series is<br />

divergent, a famous example originally discovered by Nicole d’Oresme<br />

in the 14th century. It should be noted however, that the harmonic<br />

series diverges very slowly: after fifteen thousand terms the sum has


Module 5. The nature <strong>of</strong> infinite series 141<br />

grown to 10.1931 and after one million terms, to only 14.3927, yet it<br />

does diverge!<br />

5<br />

4<br />

S n<br />

3<br />

2<br />

S n<br />

1<br />

0 5 10 15 20 25 30 35 40 45 50<br />

n<br />

4.5<br />

4<br />

3.5<br />

3<br />

2.5<br />

2<br />

1.5<br />

1<br />

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1<br />

1/n<br />

n=50;<br />

k=1:n;<br />

s=cumsum(1./k);<br />

subplot(2,1,1)<br />

plot(k,s,’+’,k,log(2*k+1),’--’)<br />

subplot(2,1,2)<br />

plot(1./k,s,’+’)


Module 5. The nature <strong>of</strong> infinite series 142<br />

5.2 Establishing when a series converges<br />

Main aims:<br />

• introduce the two types <strong>of</strong> convergence when summing an infinite series:<br />

absolute convergence is robust, and conditional convergence which is,<br />

in a sense, marginal.<br />

• develop and use three tests for convergence <strong>of</strong> the sum <strong>of</strong> a series.<br />

Reading 5.A Study Section 14.1 in Kreyszig [K,pp732–40].<br />

Note:<br />

• Chapter 14 deals with sequences and series <strong>of</strong> complex numbers, but<br />

the same theory applies if the numbers are real.<br />

• Remember the distinction between a sequence and a series: an infinite<br />

series is summed to give a sequence <strong>of</strong> partial sums.<br />

• Cauchy’s convergence principle for series also applies to a sequence in<br />

the form, paraphrasing that on [K,p735], that


Module 5. The nature <strong>of</strong> infinite series 143<br />

Theorem 5.2 A sequence S n converges if and only if for every ɛ > 0<br />

(no matter how small), we can find an N (depending upon ɛ in general)<br />

such that |S n − S m | < ɛ for all n, m > N.<br />

Cauchy’s principle is extremely useful, especially in more difficult problems,<br />

because we can test rigorously for convergence without actually<br />

knowing the value <strong>of</strong> the limit to which the sequence or series converges!<br />

But we will see little <strong>of</strong> this aspect in this unit.<br />

5.2.1 Absolute and conditional convergence<br />

The notions <strong>of</strong> absolute convergence and conditional convergence are well<br />

illustrated by contrasting the harmonic series, which diverges, with the alternating<br />

harmonic series,<br />

∞∑<br />

(−1) k+1 1<br />

k=1<br />

k = 1 − 1 2 + 1 3 − 1 4 + · · · ,<br />

which converges [K,p736, Example 3], but only just.


Module 5. The nature <strong>of</strong> infinite series 144<br />

1<br />

0.9<br />

S n<br />

0.8<br />

0.7<br />

0.6<br />

S n<br />

0.5<br />

0 5 10 15 20 25 30 35 40 45 50<br />

n<br />

1<br />

0.9<br />

0.8<br />

0.7<br />

0.6<br />

0.5<br />

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1<br />

1/n<br />

n=50;<br />

k=1:n;<br />

s=cumsum((-1).^(k-1)./k);<br />

subplot(2,1,1)<br />

plot(k,s,’+’,k,log(2)+zeros(1,n),’--’)<br />

subplot(2,1,2)<br />

plot(1./k,s,’+’,0,log(2),’o’)<br />

Essentially, the alternation <strong>of</strong> sign produces some degree <strong>of</strong> cancellation in<br />

successive terms which is sufficient to allow the series to converge w<strong>here</strong>as<br />

the harmonic series, which has terms <strong>of</strong> the same size but all positive, fails<br />

to converge. In this situation the convergence is conditional.<br />

On the other hand, an absolutely convergent series such as<br />

∞∑<br />

(−1) k+1 1 k 2<br />

k=1<br />

converges absolutely because the sum <strong>of</strong> the absolute values <strong>of</strong> terms<br />

∞∑<br />

1 ∣ ∣∣∣ ∞∑ 1<br />

∣ (−1)k+1 =<br />

k 2 k 2<br />

k=1<br />

k=1


Module 5. The nature <strong>of</strong> infinite series 145<br />

converges, though obviously to a different sum (see Using the Comparison<br />

test in §§5.2.2). Here its terms z k → 0 fast enough to ensure convergence<br />

even though all terms are positive.<br />

5.2.2 Tests for the convergence <strong>of</strong> series<br />

The comparison test, ratio test and root test which Kreyszig establishes in<br />

Theorems 5–10 <strong>of</strong> §14.1 are very useful tools in determining whether a given<br />

series converges. Notice that they do not tell you what the sum <strong>of</strong> the series<br />

may be, other methods are needed for that. The ratio test is the most<br />

important <strong>of</strong> these.<br />

Often geometric series are useful in applications <strong>of</strong> the comparison test since<br />

their convergence is easily established [K,Theorem 9, p739].<br />

Example 5.4: using the Comparison test. The series ∑ ∞<br />

k=1 1/k 2 is convergent,<br />

for<br />

• a sneaky way to write this is<br />

∞∑<br />

k=1<br />

• Now observe that<br />

1<br />

k 2 = 1 + ∞ ∑<br />

k=2<br />

1<br />

k 2 = 1 + ∞ ∑<br />

k=1<br />

1<br />

(k + 1) 2 < 1<br />

k(k + 1) ,<br />

1<br />

(k + 1) 2 .


Module 5. The nature <strong>of</strong> infinite series 146<br />

• then ∑ ∞<br />

k=1 1/(k + 1) 2 converges by comparison with ∑ ∞<br />

k=1 1/[k(k + 1)]<br />

which was shown to converge in the worked example in §§5.1.2.<br />

• Hence the series ∑ ∞<br />

k=1 1/k 2 converges as displayed below.<br />

1.8<br />

1.6<br />

S n<br />

1.4<br />

1.2<br />

S n<br />

1<br />

0 5 10 15 20 25 30 35 40 45 50<br />

n<br />

1.8<br />

1.6<br />

1.4<br />

1.2<br />

1<br />

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1<br />

1/n<br />

n=50;<br />

k=1:n;<br />

s=cumsum(1./k.^2);<br />

subplot(2,1,1)<br />

plot(k,s,’+’,k,pi^2/6+zeros(1,n),’--’)<br />

subplot(2,1,2)<br />

plot(1./k,s,’+’,0,pi^2/6,’o’)<br />

Activity 5.B Do examples from Problem Set 14.1 [K,p730]. Send in to the<br />

examiner for feedback at least Q3, 7, 12 & 13.


Module 5. The nature <strong>of</strong> infinite series 147<br />

5.3 Power series<br />

We are interested in power series such as the “solution” y = 1 − 2x + 3x 2 −<br />

4x 3 + 5x 4 − · · · <strong>of</strong> the differential equation (1 + x) 2 y ′′ = 6y. This power<br />

series and its properties will depend upon x, for example: at x = 0 it is<br />

y = 1 − 0 + 0 − 0 + 0 − · · · which trivially converges to y = 1; at x = 1 it<br />

clearly diverges as the terms in the series 1 − 2 + 3 − 4 + 5 − · · · increase in<br />

magnitude; at x = 1/2 it converges and so we might say y(1/2) ≈ 1 − 1 +<br />

3/4 − 1/2 + 5/16 = 9/16, but what then is the error? how good may we<br />

expect the linear approximation, y = 1 − 2x? This section addresses these<br />

questions:<br />

• how does convergence depend upon x in such a power series?<br />

• what sort <strong>of</strong> error may we expect in any finite truncation <strong>of</strong> the infinite<br />

series?<br />

Main aims:<br />

• to show that within their domain <strong>of</strong> convergence, power series define<br />

well-behaved functions <strong>of</strong> x (or z);<br />

• conversely, the Taylor or Maclaurin series <strong>of</strong> a function generally converges<br />

to the function in some domain;


Module 5. The nature <strong>of</strong> infinite series 148<br />

• to deduce an expression that usefully estimates the error in using a<br />

Taylor series approximation.<br />

A power series is an infinite series with terms that involve a variable; Kreyszig<br />

uses a complex variable z, but the theory applies equally to real power series,<br />

w<strong>here</strong> we might use x to represent a real variable. Thus a power series like<br />

∞∑<br />

a n z n = a 0 + a 1 z + a 2 z 2 + · · ·<br />

n=0<br />

involves both constant coefficients, a 0 , a 1 , a 2 ,. . . , and increasing powers <strong>of</strong><br />

a complex or real variable z, roughly like an “infinite polynomial”. Notice<br />

that we start the summation at n = 0 to allow for a constant term a 0 , not<br />

depending on z, but the convergence or divergence <strong>of</strong> the resulting series is<br />

determined by the value <strong>of</strong> z, as well as by the coefficients.<br />

Reading 5.C Study [K,pp741–5, §14.2], particularly Radius <strong>of</strong> convergence.<br />

Example 5.5: Write down the centre and determine the radius <strong>of</strong> convergence<br />

<strong>of</strong> the power series 1 − 2x + 3x 2 − 4x 3 + 5x 4 − · · ·.


Module 5. The nature <strong>of</strong> infinite series 149<br />

Solution: Clearly this has centre <strong>of</strong> expansion x = 0 as it is written<br />

in powers <strong>of</strong> x = (x − 0). To determine its radius <strong>of</strong> convergence note<br />

that the power series is ∑ ∞<br />

n=0 (n + 1)(−1) n x n ; that is, its nth coefficient<br />

is a n = (−1) n (n + 1). Use the ratio test<br />

a n+1 x n+1 ∣ ∣∣∣∣ =<br />

∣ a n x n<br />

∣<br />

(−1) n+1 (n + 2)x n+1<br />

(−1) n (n + 1)x n ∣ ∣∣∣∣<br />

=<br />

∣ n + 2 ∣∣∣ ∣<br />

n + 1 x → |x| as n → ∞ ,<br />

which is less than 1 if and only if |x| < 1. Thus the radius <strong>of</strong> convergence<br />

is R = 1 and we expect the power series to usefully converge for −1 <<br />

x < 1.<br />

This analysis holds<br />

if x is either real or<br />

complex.<br />

Sometimes a power series only involves only even or odd powers <strong>of</strong> x−c (or x)<br />

in which case the radius <strong>of</strong> convergence is best determined from that in terms<br />

<strong>of</strong> (x − c) 2 (or x 2 ). The following example shows the sort <strong>of</strong> considerations<br />

that could be applied.<br />

Example 5.6: convergence in x 2 Consider the power series for<br />

sin x = x − 1 6 x3 + 1<br />

120 x5 − · · · = ∑<br />

n odd<br />

(−1) (n−1)/2<br />

and show it converges for all x. A direct application <strong>of</strong> the ratio test<br />

fails because the ratio <strong>of</strong> consecutive terms is either 0 or ∞ as all the<br />

n!<br />

x n ,


Module 5. The nature <strong>of</strong> infinite series 150<br />

terms in even powers are zero! However, recast the series as<br />

sin x =<br />

= x<br />

∞∑ (−1) n<br />

(2n + 1)! x2n+1<br />

n=0<br />

∞∑<br />

n=0<br />

1<br />

(2n + 1)! zn<br />

upon letting z = −x 2 and extracting the common factor <strong>of</strong> x from the<br />

series. Then it is straightforward to show that the series ∑ ∞ 1<br />

n=0 (2n+1)! zn<br />

converges for all z from the ratio test:<br />

a n+1 z n+1 ∣ ∣∣∣∣ =<br />

∣ a n z n<br />

∣<br />

z n+1 /(2n + 3)!<br />

z n /(2n + 1)!<br />

∣ ∣∣∣∣<br />

∣ = z<br />

(2n + 3)(2n + 2) ∣ → 0 as n → ∞ ,<br />

for all z. Since it converges for all z = −x 2 , the original series must<br />

correspondingly converge for all x.<br />

Other substitutions may be used to analyse the convergence <strong>of</strong> power series<br />

with other patterns <strong>of</strong> zero terms.<br />

Activity 5.D Do problems in Problem Set 14.2 [K,p745]. Send in to the<br />

examiner for feedback at least Q2 & 4.


Module 5. The nature <strong>of</strong> infinite series 151<br />

5.3.1 Functions from power series<br />

The key point <strong>of</strong> this subsection is that at every point z for which such a<br />

power series converges, we can use its sum to define the value <strong>of</strong> a function<br />

f(z).<br />

∞∑<br />

f(z) = a n z n = a 0 + a 1 z + a 2 z 2 + · · · (5.1)<br />

n=0<br />

Kreyszig shows that such functions f(z), called analytic functions, have nice
properties: they are continuous, differentiable and integrable at every point
inside their radius of convergence. Also their derivatives and integrals are
found exactly as you would hope, by differentiating or integrating the
power series term-by-term.

Reading 5.E Study all of §14.3 [K,pp746–8] except for the subsection Power
series represent analytic functions which you need only read.
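Term-by-term differentiation can be seen at work on the geometric series ∑ z^n = 1/(1 − z): differentiating each term gives ∑ n z^{n−1}, which should sum to 1/(1 − z)^2. A small Python check (for illustration only; the function names are ours):

```python
def geometric(z, terms=200):
    """Partial sum of sum_{n>=0} z**n, which converges to 1/(1-z) for |z| < 1."""
    return sum(z ** n for n in range(terms))

def geometric_derivative(z, terms=200):
    """Term-by-term derivative of the geometric series: sum_{n>=1} n*z**(n-1)."""
    return sum(n * z ** (n - 1) for n in range(1, terms))

z = 0.3
assert abs(geometric(z) - 1 / (1 - z)) < 1e-12
# Differentiating term-by-term reproduces the derivative of the sum, 1/(1-z)**2.
assert abs(geometric_derivative(z) - 1 / (1 - z) ** 2) < 1e-12
```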

Exercise 5.7: Suppose function f(x) is defined by a power series in (x − c)
with some nonzero radius of convergence R:

    f(x) = ∑_{k=0}^∞ a_k (x − c)^k
         = a_0 + a_1 (x − c) + · · · + a_n (x − c)^n + · · ·
    ∀x such that |x − c| < R .

(Recall that ∀ is short for “for all.”)



By differentiating f repeatedly with respect to x and evaluating each
derivative at x = c, show that

    f^{(n)}(c)/n! = a_n    for n = 0, 1, 2, . . . .

Given that we have finally established convergence of an infinite sum, and
that we can differentiate a power series, this exercise can now be done.
Most importantly, it establishes that the power series representation of any
function f(x) about x = c is unique and is its Taylor series.

Activity 5.F Do the above exercise and problems in Problem Set 14.3
[K,pp750–1]. Send in to the examiner for feedback at least Q3 & 4.

5.3.2 Taylor and Maclaurin Series

The English mathematician Brook Taylor (1685–1731) and the Scottish
mathematician Colin Maclaurin (1698–1746) pioneered this work for real
power series. Taylor presented his results for power series in (x − c), while
Maclaurin’s name is associated with power series in x.



If a function f can be represented by a power series in (x − c), with radius
of convergence R, then

    f(x) = f(c) + f′(c)(x − c) + f′′(c)/2! (x − c)^2 + · · · + f^{(n)}(c)/n! (x − c)^n + · · ·

for all x such that |x − c| < R. This series representation is called the Taylor
series in (x − c) of the function f.

You have shown in Exercise 5.7 that if f has a power series representation
then it must be the Taylor series, i.e. there is only one power series in (x − c)
corresponding to a given function f.

When c = 0, the Taylor series gives a power series in x called the Maclaurin
series. The Maclaurin series representation of function f is:

    f(x) = f(0) + f′(0)x + f′′(0)/2! x^2 + · · · + f^{(n)}(0)/n! x^n + · · ·

for all x such that |x| < R. Note that in the Maclaurin series, all the
derivatives of f are evaluated at 0, and the interval of convergence has its
centre at 0.

The Taylor series in (x − c) of a function f is usually referred to as the ‘Taylor
series expansion of f about c’, while the Maclaurin series of f is the ‘Taylor
series expansion of f about 0’.



Example 5.8: finding a Maclaurin series. Assuming that f(x) = e^x can
be represented by a power series in x, we find its Maclaurin series as
follows. Firstly, find f and its derivatives at x = 0:

    f(x) = e^x     ⇒ f(0) = 1
    f′(x) = e^x    ⇒ f′(0) = 1
    f′′(x) = e^x   ⇒ f′′(0) = 1
    ...

Hence the Maclaurin series is:

    f(x) = e^x = 1 + x + x^2/2! + x^3/3! + x^4/4! + · · · = ∑_{k=0}^∞ x^k/k! .

Reading 5.G Study the part of §14.4 [K,pp754–7] from Power series as Taylor
series to the end of the section inclusive.

Note the pivotal role of the power series properties of uniqueness,
differentiability and integrability.

Exercise 5.9: Find the radius and interval of convergence for the power
series

    f(x) = ∑_{n=0}^∞ (x − 2)^{n+1} / ((n + 1) 3^{n+1}) .



Find the sum of the series for f(x), thus writing an expression for f(x)
not involving an infinite series. Hint: consider f′(x).

Activity 5.H Do problems from Problem Set 14.4 [K,pp757–9]. Send in to
the examiner for feedback at least Q2, 10 & 19.

5.3.3 Truncation error for Taylor series

Some of the earliest work on power series was done by the Scots mathematician
James Gregory (1638–1675). He developed a power series method for
interpolating table values for functions. The idea of using power series to
estimate function values remained a prime motivation for later workers like
Taylor. For example, putting x = 1 in the Maclaurin series for e^x we obtain:

    e = exp(1) = 1 + 1 + 1/2! + 1/3! + 1/4! + · · · = ∑_{k=0}^∞ 1/k!
      ≈ 2.718281828459045235360287 . . . .

Now e is a transcendental number, i.e. it is not the root of any algebraic
equation and its value is an infinite, non-recurring decimal. In fact the only



way of representing the number e exactly is as the sum of an infinite series.
To estimate its value, though, we have to take a partial sum of the series,
and in doing so we make a truncation error. With computers, it is now
possible to compute e to hundreds, thousands, or even millions of decimal
places. This is far greater accuracy than was ever dreamed of by Gregory,
but every expansion involves an error and we should know something about
these errors.
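How quickly the partial sums close in on e can be seen numerically. A short Python sketch (an illustration only; the function name e_partial_sum is ours, and the bound used is the first omitted term, which dominates the tail of this fast-converging series):

```python
import math

def e_partial_sum(n):
    """Partial sum 1 + 1/1! + ... + 1/n! of the series for e."""
    return sum(1 / math.factorial(k) for k in range(n + 1))

# The truncation error shrinks factorially fast with n: the first
# omitted term 1/(n+1)! dominates the tail of the series.
for n in (5, 10, 15):
    error = abs(math.e - e_partial_sum(n))
    assert error < 2 / math.factorial(n + 1)
```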

Consider the Taylor series of a function f about c:

    f(x) = f(c) + f′(c)(x − c) + f′′(c)/2! (x − c)^2 + · · · + f^{(n)}(c)/n! (x − c)^n + · · ·
    ∀x such that |x − c| < R .    (5.2)

Truncate the series after terms up to order n to form an nth degree polynomial
approximation to f(x):

    P_n(x) = f(c) + f′(c)(x − c) + f′′(c)/2! (x − c)^2 + · · · + f^{(n)}(c)/n! (x − c)^n ,

where P_n(x) is called the Taylor polynomial of degree n for f at c. (Taylor
polynomials are like the partial sums of a series.) The truncation error made
in such an approximation is:

    R_n(x) = f(x) − P_n(x) ,

where R_n(x) is called the remainder term for an nth order approximation.



Example 5.10: Taylor polynomials approximate the function. Consider
the power series 1 − 2x + 3x^2 − 4x^3 + 5x^4 − · · · discussed earlier,
which we claimed is the power series for y = 1/(1 + x)^2. The first few
Taylor polynomials are:

    P_0(x) = 1 ,
    P_1(x) = 1 − 2x ,
    P_2(x) = 1 − 2x + 3x^2 ,
    P_3(x) = 1 − 2x + 3x^2 − 4x^3 .

These are plotted below with 1/(1 + x)^2 plotted dashed:

[Figure: “f(x) and approximations”: P_0(x), P_1(x), P_2(x) and P_3(x)
plotted over −0.5 ≤ x ≤ 0.5, with f(x) = 1/(1 + x)^2 dashed. Drawn in
Matlab by:]

    x=linspace(-0.5,0.5);
    p=[ones(size(x))
    -2*x
    3*x.^2
    -4*x.^3];
    p=cumsum(p);
    plot(x',p',x,1./(1+x).^2,'--')

Observe that all Taylor polynomials are accurate sufficiently close to
the centre of expansion x = 0. The error, or remainder, away from
x = 0 is given by the distance from a curve to the exact dashed line
and is different for each polynomial.

Example 5.11: A 1st order Taylor polynomial for f(x) = e^x about
x = 1:

    f(x) = f(c) + f′(c)(x − c) + R_1(x) .

Here f(c) = f′(c) = e^1 = e, so

    e^x = e + e(x − 1) + R_1(x)
        = e x + R_1(x) .

The following theorem shows one way to estimate the remainder, R_n(x).

Theorem 5.3 (Lagrange’s remainder) Let f be a function which has n + 1
derivatives that are continuous on some interval I containing c. Then, for
every x ∈ I, there exists a number, u, between x and c, such that:

    f(x) = f(c) + f′(c)(x − c) + · · · + f^{(n)}(c)/n! (x − c)^n + R_n(x)
         = P_n(x) + R_n(x)

where Lagrange’s remainder is

    R_n(x) = f^{(n+1)}(u)/(n + 1)! (x − c)^{n+1} .    (5.3)

Example 5.12: Lagrange’s remainder. Examine the simple example of
the cubic f(x) = 1 + x + x^3. It has a Taylor series about x = 0
which is just itself (this is why this example is simple). The linear
Taylor polynomial approximation to f(x) is simply P_1(x) = 1 + x.
By inspection we know that its error is the remainder R_1(x) = x^3.
However, in complicated cases we will not know this and we have to
see what the theorem can tell us. Here it tells us that there exists a u,
0 ≤ u ≤ x, such that

    R_1(x) = f′′(u)/2! x^2 = (6u/2) x^2 = 3u x^2 .

Here, because we already know R_1(x) = x^3, we identify the correct
u = x/3, which is indeed between 0 and x. In general we will not know
R_1(x) exactly, but because 0 ≤ u ≤ x we will be able to say that the
remainder, the error, R_1(x) ≤ 3x^3, as 3u x^2 ≤ 3x^3 for 0 ≤ u ≤ x. Thus
we can often place a bound on the error in a Taylor polynomial.
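The bound in Example 5.12 is easily confirmed numerically. A minimal Python check (illustration only; the names f and p1 are ours):

```python
def f(x):
    """The cubic of Example 5.12."""
    return 1 + x + x ** 3

def p1(x):
    """Its linear Taylor polynomial about 0."""
    return 1 + x

for x in (0.1, 0.5, 1.0):
    remainder = f(x) - p1(x)          # exactly x**3 for this cubic
    assert abs(remainder - x ** 3) < 1e-12
    # Lagrange: remainder = 3*u*x**2 for some 0 <= u <= x,
    # so it is at most 3*x**3.
    assert remainder <= 3 * x ** 3
```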

Proof:


Module 5. The nature <strong>of</strong> infinite series 160<br />

• Since x is a fixed point in I with x ≠ c, let g be a function of t, defined
  as follows:

      g(t) = f(x) − f(t) − f′(t)(x − t) − f′′(t)/2! (x − t)^2 − · · ·
             − f^{(n)}(t)/n! (x − t)^n − R_n(x) (x − t)^{n+1}/(x − c)^{n+1} .

  The reason for defining g in this way is that differentiating with respect
  to t has a telescoping effect. For example:

      d/dt [−f(t) − f′(t)(x − t)] = −f′(t) + f′(t) − f′′(t)(x − t)
                                  = −f′′(t)(x − t) .

• The net result is that g′(t) simplifies to:

      g′(t) = −f^{(n+1)}(t)/n! (x − t)^n + (n + 1) R_n(x) (x − t)^n/(x − c)^{n+1}

  for all t between x and c. Also note that, for fixed x,

      g(c) = f(x) − P_n(x) − R_n(x) = 0 ,

  and

      g(x) = f(x) − f(x) − 0 − · · · − 0 = 0 .

  Thus we have g(c) = g(x) = 0 and g is differentiable between x and c.
  Moreover, g is continuous throughout I, since f and its derivatives are
  continuous. This includes c, x and all points in between.

• Therefore, g satisfies the conditions for Rolle’s theorem¹, and it follows
  that there is a number u between x and c for which g′(u) = 0. Now
  substituting t = u in g′(t) gives:

      g′(u) = −f^{(n+1)}(u)/n! (x − u)^n + (n + 1) R_n(x) (x − u)^n/(x − c)^{n+1} = 0
      ⇒ R_n(x) = f^{(n+1)}(u)/(n + 1)! (x − c)^{n+1} .    ♠

Note that when applying this result, we do not expect to be able to find the
exact value of u. If we could do that, then making an approximation to f
would not have been necessary. Rather, we try to find bounds for f^{(n+1)}(u)
from which we can estimate how large the remainder R_n(x) might become,
as in the worked example below.

Lastly, suppose we approximate a function f by some Taylor polynomial, so
that:

    f(x) = P_n(x) + R_n(x) ,

¹ Those unfamiliar with Rolle’s theorem should consult either of the following:
  – Mizrahi & Sullivan: Calculus & Analytic Geometry (3rd Edition); Wadsworth (1990), Chapter 11.
  – Larson, Hostetler & Edwards: Calculus (5th Edition); Heath (1994), Chapter 8.



or equivalently,

    P_n(x) = f(x) − R_n(x) .

Taking limits as n → ∞, the left-hand side will give the whole Taylor series
for f, and on the right, f(x) does not depend on n. Thus a necessary and
sufficient condition for the Taylor series to converge to f is that:

    lim_{n→∞} R_n(x) = lim_{n→∞} f^{(n+1)}(u)/(n + 1)! (x − c)^{n+1} = 0 .

Example 5.13: determining the accuracy of an approximation. Use a Taylor
polynomial of degree 5 for sin x about x = 0 to estimate sin(0.1) and
bound the accuracy of the approximation using Lagrange’s remainder.

• Start by calculating derivatives:

      f(x) = sin x        ⇒ f(0) = 0
      f′(x) = cos x       ⇒ f′(0) = 1
      f′′(x) = −sin x     ⇒ f′′(0) = 0
      f′′′(x) = −cos x    ⇒ f′′′(0) = −1
      f^{(4)}(x) = sin x  ⇒ f^{(4)}(0) = 0
      f^{(5)}(x) = cos x  ⇒ f^{(5)}(0) = 1
      f^{(6)}(x) = −sin x .

• Now

      sin x ≈ P_5(x) = x − x^3/3! + x^5/5!

  and

      R_5(x) = f^{(6)}(u)/6! x^6 = −sin u · x^6/6!
      for some number u with 0 ≤ u ≤ 0.1 .

• Using the above to approximate sin(0.1):

      sin(0.1) ≈ P_5(0.1) = 0.1 − (0.1)^3/3! + (0.1)^5/5!
               = 0.1 − 0.000166667 + 0.000000083
               = 0.099833416

  and the remainder is given by

      R_5(0.1) = −sin u · (0.1)^6/6! .

• Since the sine function is increasing on the interval [0, 0.1] we must
  have 0 ≤ sin u < 1, so

      −0.000000001 ≈ −(0.1)^6/6! < R_5(0.1) = −sin u · (0.1)^6/6! ≤ 0

  and we conclude that

      0.099833416 − 0.000000001 ≤ sin(0.1) ≤ 0.099833416

  or

      0.099833415 ≤ sin(0.1) ≤ 0.099833416 .
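A numerical check of this calculation, illustrated in Python (the name p5 is ours; the book's own computations use Matlab):

```python
import math

def p5(x):
    """Degree-5 Taylor polynomial of sin about 0."""
    return x - x ** 3 / math.factorial(3) + x ** 5 / math.factorial(5)

estimate = p5(0.1)
bound = 0.1 ** 6 / math.factorial(6)   # |R_5(0.1)| <= (0.1)^6/6! since |sin u| <= 1

# The true value lies within the Lagrange bound of the estimate, and here
# the remainder is negative, so the polynomial over-shoots very slightly.
assert abs(math.sin(0.1) - estimate) <= bound
assert math.sin(0.1) <= estimate
```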



Activity 5.I Do problems 5.14–5.16 from Exercises 5.3.4.

5.3.4 Exercises

Ex. 5.14: Bound the error on the Taylor polynomial P_2(x) (about x = 0) as
an approximation to 1/(1 + x)^2 over the interval −1/2 < x < 1/2. What
would the bound be if we were only interested in 0 ≤ x < 1/2?

Ex. 5.15: Use a Taylor polynomial of degree 2 about x = 0 for e^x to estimate
e^{0.1} and bound the accuracy of the approximation using Lagrange’s
remainder.

Ex. 5.16: Use a Taylor polynomial of degree 3 to estimate f(x) = e^{2x} at
x = 0.1, and use Lagrange’s remainder theorem to determine an error
bound for your estimate.

Ex. 5.17: Use a Taylor polynomial of degree 4 about x = 0 for log(1 + x) to
estimate log(1.2) and bound the accuracy of the approximation using
Lagrange’s remainder (note: log denotes the natural logarithm).
Lagrange’s remainder (note: log denotes the natural logarithm).



Ex. 5.18: Find the Maclaurin series for the function f(x) = arctan x and
determine its radius of convergence. Hint: the Maclaurin series for
1/(1 + x^2) = 1 − x^2 + x^4 − x^6 + · · · .

Ex. 5.19: Consider the function defined by the infinite series

    g(x) = ∑_{n=1}^∞ [ 1/(n 2^n) + ((−1)^n + 1)/2^n ] (x + 1)^n .

Find the region in which this series converges.



5.4 Taylor’s theorem in n-dimensions

It is useful to generalise Taylor’s result to functions of several variables. An
outline of the three variable case is presented below, from which generalisation
to other cases is straightforward.

Main aims:

• generalise Taylor series to many independent variables;

• use this generalisation to find and characterise maxima and minima of
  functions of many variables.

Given a function f(x, y, z) we seek an expansion for f(x + h, y + p, z + q)
at some ‘nearby’ point, where the expansion is written in terms of f and its
derivatives and powers of h, p and q.

Exercise 5.20: By setting x − c = h in equation (5.2) show that the Taylor
series of a real function, f(x), centred at x can be written

    f(x + h) = f(x) + h f′(x) + h^2/2! f′′(x) + · · · + h^n/n! f^{(n)}(x) + · · ·

assuming, of course, that |h| < R, the radius of convergence of the
power series at x.



The implication of this expansion is that the value of an analytic function
at points x + h ‘nearby’ to x is entirely determined by the values
of f and its derivatives at the point x, and the separation, h. This is
useful, particularly if the radius of convergence about x is not small.
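This shift property can be sketched numerically: given only the derivatives of f at x, we can recover f(x + h). A Python illustration (the function name taylor_shift is ours; for f = exp every derivative at x equals e^x, making the check easy):

```python
import math

def taylor_shift(derivs_at_x, h):
    """Evaluate f(x+h) from derivatives of f at x: sum_n h^n/n! f^(n)(x)."""
    return sum(d * h ** n / math.factorial(n)
               for n, d in enumerate(derivs_at_x))

# For f = exp, every derivative at x equals e^x, so the list is constant.
x, h = 1.0, 0.3
derivs = [math.exp(x)] * 30
assert abs(taylor_shift(derivs, h) - math.exp(x + h)) < 1e-9
```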

Outline of a Taylor series for a function of three variables: begin by
using the single variable Taylor expansion derived in Exercise 5.20.

• First, vary x only, holding y and z constant; then

      f(x + h, y + p, z + q) = f + h ∂f/∂x + h^2/2! ∂^2f/∂x^2 + h^3/3! ∂^3f/∂x^3 + · · ·

  where f and all derivatives are evaluated at (x, y + p, z + q). (Since
  only one variable changes, all derivatives are partial derivatives.)

• Now hold x and z constant in this series and do the expansion for y + p;
  f and its derivatives are now evaluated at (x, y, z + q).

• Now hold x and y constant and do the expansion for z + q. Collect together
  all terms with the same total order of differentiation and obtain
  the following result.
the following result.



f(x + h, y + p, z + q)
    = ( f + h ∂f/∂x + p ∂f/∂y + q ∂f/∂z )
    + 1/2! ( h^2 ∂^2f/∂x^2 + p^2 ∂^2f/∂y^2 + q^2 ∂^2f/∂z^2
             + 2hp ∂^2f/∂x∂y + 2hq ∂^2f/∂x∂z + 2pq ∂^2f/∂y∂z )
    + 1/3! ( h^3 ∂^3f/∂x^3 + 2 similar terms + 3h^2 p ∂^3f/∂x^2∂y
             + 5 similar terms + 6hpq ∂^3f/∂x∂y∂z )
    + · · ·

where f and all its derivatives are evaluated at (x, y, z). This is expressed
more compactly in terms of the displacement vector H = h i + p j + q k as:

    f(x + h, y + p, z + q) = f + (H · ∇)f + 1/2! (H · ∇)^2 f + 1/3! (H · ∇)^3 f
                             + · · · + 1/n! (H · ∇)^n f + · · ·

where

    H · ∇ ≡ ( h ∂/∂x + p ∂/∂y + q ∂/∂z )

and (H · ∇)^n f means do the operation H · ∇ to f, then to the result, then
to the result of that, etc., until the operation has been done n times.
(Recall from first year mathematics that the gradient of f is
∇f = i ∂f/∂x + j ∂f/∂y + k ∂f/∂z.)

Our work on extrema requires only the terms up to second order.



Example 5.21: Find up to the second-order terms of the multi-variable
Taylor series of f(x, y) = cos x e^{2y} about (x, y) = (0, 0).

Solution: “Up to the second-order terms” includes (H · ∇)^2 f but
excludes all third derivative terms. Now, using subscripts to denote
partial differentiation:

• f(0, 0) = 1;
• f_x = −sin x e^{2y} so f_x(0, 0) = 0;
• f_y = 2 cos x e^{2y} so f_y(0, 0) = 2;
• f_xx = −cos x e^{2y} so f_xx(0, 0) = −1;
• f_xy = −2 sin x e^{2y} so f_xy(0, 0) = 0;
• f_yy = 4 cos x e^{2y} so f_yy(0, 0) = 4.

Hence the second-order truncation of the Taylor series is

    f(h, p) ≈ f + (h f_x + p f_y) + 1/2 ( h^2 f_xx + 2hp f_xy + p^2 f_yy )
            = 1 + 2p − 1/2 h^2 + 2p^2 .

Note: as f(x, y) is the product of a function of x and a function of y,
namely cos x and e^{2y}, this answer is quite sensibly the product of the
two single variable, second-order Taylor polynomials, namely 1 − x^2/2
and 1 + 2y + 2y^2.
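The quadratic truncation of Example 5.21 can be tested numerically: near the origin its error should be third order in the displacement. A Python sketch (illustration only; the names f and quadratic_approx, and the rough third-order bound, are ours):

```python
import math

def f(x, y):
    """The function of Example 5.21."""
    return math.cos(x) * math.exp(2 * y)

def quadratic_approx(h, p):
    """Second-order Taylor truncation of cos(x) e^(2y) about (0, 0)."""
    return 1 + 2 * p - h ** 2 / 2 + 2 * p ** 2

# For these small displacements the error stays below (|h|+|p|)^3,
# consistent with a third-order remainder.
for h, p in ((0.1, 0.05), (-0.05, 0.1), (0.02, -0.03)):
    assert abs(f(h, p) - quadratic_approx(h, p)) < (abs(h) + abs(p)) ** 3
```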



Activity 5.J Do Problem 5.23 in Exercises 5.4.2 [p180]. Send in to the
examiner for feedback at least part (b).

5.4.1 Identify local maxima and minima

The 3D surface plotted in the following graph contains several peaks and
a trough. The highest peak is a global maximum, the trough is a global
minimum, and the two smaller peaks are called local maxima. Collectively,
such points are known as extrema. A local maximum is higher than all points
nearby, but a global maximum is the highest of all points on the surface.
Minima are defined analogously.



[Figure: a 3D surface with several peaks and a trough. Drawn in Matlab by:]

    surfc(peaks(40)+8)

The location and study of extrema is frequently important. For example,
suppose the height z of the surface above the xy-plane represents the
temperature of a chemical reaction as quantities x and y of two reactants are
added; it may be essential to know how high or low the temperature can go
in order to properly contain the reaction.

Mathematically, a 3D surface is represented explicitly as z = f(x, y), or
implicitly by F(x, y, z) = C for some constant C. In first-year mathematics
courses we saw that local extrema occur at stationary points where

    ∂f/∂x = ∂f/∂y = 0 ,



so that all directional derivatives of f vanish at a stationary point, or
equivalently the tangent plane to the surface is horizontal, which means that
the normal to the surface must be in the same direction as the z-axis: that is,
parallel, ∇F ‖ k. There are stationary points, called saddle points, which
satisfy these conditions but are neither minima nor maxima. In the following
figure the origin (0, 0, 0) is a saddle point. In the plane x = 0, moving
along the dashed line, the origin appears to be a local maximum, but in the
plane y = 0, along the solid line, a local minimum. The behaviour of nearby
points depends on the direction in which (0, 0, 0) is approached, which defines
a saddle point. It is neither a local minimum nor local maximum.

[Figure: the saddle surface z = 2x^2 − 5y^2 over −5 ≤ x, y ≤ 5, with the
section in the plane x = 0 dashed and the section in the plane y = 0 solid.
Drawn in Matlab by:]

    x=linspace(-5,5); y=x;
    [X Y]=meshgrid(x,y);
    Z=2*X.^2-5*Y.^2;
    surfl(X,Y,Z)
Activity 5.K Do Problem 5.24 in Exercises 5.4.2.



Algebraically, extrema are characterised using Taylor’s formula in n-dimensions.
For example in 2-D, suppose (a, b) is a local extremum of f(x, y); then compare
the value of f(a, b) with nearby points f(a + h, b + p), where h, p are
small:

• if all nearby values of f are greater than f(a, b) then (a, b) is a local
  minimum;

• if all nearby values of f are less than f(a, b) then (a, b) is a local
  maximum;

• otherwise (a, b) is a saddle point.

Taylor’s theorem gives

    f(a + h, b + p) = f(a, b) + h f_x(a, b) + p f_y(a, b)
        + 1/2! ( h^2 f_xx(a, b) + p^2 f_yy(a, b) + 2hp f_xy(a, b) )
        + higher order terms.

(Subscripts x and y to a function f are used to denote partial derivatives
with respect to the subscript variable.)

Now f_x(a, b) = f_y(a, b) = 0, since (a, b) is an extremum, and terms which
are cubic and higher order in (h, p) are negligible compared to the quadratic
term, so

    f(a + h, b + p) − f(a, b) ≈ 1/2 Q(h, p)    (5.4)



where the quadratic terms

    Q = f_xx h^2 + 2 f_xy hp + f_yy p^2
      = [ h p ] [ f_xx  f_xy ] [ h ]
                [ f_yx  f_yy ] [ p ]
      = h^T H h ,    (5.5)

where all the second-order derivatives are evaluated at (a, b), and where the
vector h = (h, p).

Definition 5.4 In (5.5) Q(h) has been written as the quadratic form Q = h^T H h:

• h^T H h is called the Hessian² of f at the point (a, b);

• the symmetric matrix H of second derivatives is called the Hessian
  matrix;

• such a quadratic form, Q, is said to be positive definite if Q(h) > 0 for
  all h ≠ 0;

• and is said to be negative definite if Q(h) < 0 for all h ≠ 0.

From (5.4):

• if Q is positive definite then f(a + h, b + p) − f(a, b) > 0 (at least near
  enough to (a, b)) and so (a, b) is a local minimum;
enough to (a, b)) and so (a, b) is a local minimum;<br />

2 Ludwig Otto Hesse introduced these in 1884.



• if Q is negative definite then (a, b) is a local maximum;

• otherwise, (a, b) could be a saddle point, but it could also mean that we
  need information from the “higher order terms” neglected in forming
  the approximation (5.4).

Observe that the Hessian matrix, in n-D,

    H = [ ∂^2f/∂x_i∂x_j ]

      = [ ∂^2f/∂x_1^2      ∂^2f/∂x_1∂x_2   · · ·   ∂^2f/∂x_1∂x_n ]
        [ ∂^2f/∂x_2∂x_1    ∂^2f/∂x_2^2     · · ·   ∂^2f/∂x_2∂x_n ]
        [      ...              ...         ...         ...      ]
        [ ∂^2f/∂x_n∂x_1    ∂^2f/∂x_n∂x_2   · · ·   ∂^2f/∂x_n^2   ]

(evaluated at a stationary point) is symmetric and so has real eigenvalues
and orthogonal eigenvectors. Recall from first-year mathematics that we can
thus diagonalise H = P D P^T, where the columns of P are the normalised
eigenvectors of H and where the matrix D is diagonal with the eigenvalues
of H along its diagonal. (See Kreyszig §7.5 [K,p392–8] for another summary
of diagonalisation.) Make a change of variable so that the axes of the
r = (r, s) coordinate system are aligned along the principal directions of the
quadratic Q. An example is seen in the graph below where the r and s axes
are chosen to fit to the nature of the quadratic (whose contours are shown)
with Hessian matrix

    H = [ −8  4 ]
        [  4 −4 ] .



[Figure: contours of the quadratic Q in the (h, p) plane, with the rotated
r and s axes aligned along the principal directions of Q.]

The appropriate change of variable is

    r = P^T h , equivalently h = P r ,

so that in the new coordinate system the quadratic simplifies to give

    Q = h^T H h = r^T P^T H P r = r^T D r .

But D is diagonal with diagonal entries the eigenvalues of H: namely D =
diag(λ_1, . . . , λ_n) in n dimensions. Thus in the r coordinate system the
quadratic is

    Q = λ_1 r_1^2 + · · · + λ_n r_n^2 .    (5.6)

From this we readily deduce the shape of the quadratic and hence the nature
of the stationary point:

• if all eigenvalues of H are positive then Q is positive definite, as all
  terms in (5.6) are positive, and the stationary point is a local minimum;

• if all eigenvalues are negative then Q is negative definite, as all terms
  in (5.6) are negative, and the stationary point is a local maximum;

• if some eigenvalues are positive and some are negative then the stationary
  point is a saddle point, as we can increase the value of Q by
  moving in some directions and decrease the value of Q by moving in
  other directions;

• lastly, if the eigenvalues are all positive or all negative except some that
  are precisely zero, then the neglected higher order terms in f need to
  be taken into account.
be taken into account.



Example 5.22: analyse the behaviour of z = f(x, y) = x^3 + 4xy − 2y^2 + 8
at its stationary points.

Before beginning the analysis, Matlab draws the following surface z =
f(x, y):

[Figure: the surface z = x^3 + 4xy − 2y^2 + 8 over −3 ≤ x ≤ 2, −3 ≤ y ≤ 2.]



Solution:<br />

First find the stationary points:<br />

∂f<br />

∂x = 3x2 + 4y<br />

∂f<br />

∂y<br />

= 4x − 4y<br />

setting both <strong>of</strong> these equal to 0 gives x = y and 3x 2 +4x = x(3x+4) = 0.<br />

So the stationary points are (0, 0) and (− 4, − 4 ). Now find the second<br />

3 3<br />

order derivatives:<br />

Thus<br />

∂ 2 f<br />

∂x 2 = 6x ,<br />

∂ 2 f<br />

∂y 2 = −4 ,<br />

∂ 2 f<br />

∂x∂y = 4 .<br />

(0, 0) the Hessian matrix is<br />

H =<br />

[<br />

0 4<br />

4 −4<br />

]<br />

and hence the characteristic polynomial is<br />

|λI − H| = λ 2 + 4λ − 16 .<br />

This is an upwards parabola which is −16 at λ = 0 and hence<br />

t<strong>here</strong> must be one 0 for negative λ and one for positive λ. Hence<br />

the two eigenvalues have opposite sign and so (0, 0) is a saddle<br />

point.


Module 5. The nature <strong>of</strong> infinite series 180<br />

(−4/3, −4/3) the Hessian matrix is<br />

H = [ −8  4 ]<br />
    [  4 −4 ]<br />

and hence the characteristic polynomial is<br />

|λI − H| = λ^2 + 12λ + 16 .<br />

This is an upwards parabola which is +16 at λ = 0 and hence<br />

both zeros have to occur for λ of the same sign. Since the slope of the<br />

parabola is positive when λ = 0, namely 12, then both zeros occur<br />

for negative λ. Hence both (all) eigenvalues are negative and so<br />

(−4/3, −4/3) is a local maximum.<br />
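The two classifications above are easily checked by machine. The following is a minimal sketch (an illustration, not part of the study book) that classifies a stationary point of a two-variable function from the determinant and trace of its 2 × 2 Hessian, since for a symmetric 2 × 2 matrix the eigenvalues have product det H and sum trace H:

```python
def classify(fxx, fyy, fxy):
    """Classify a stationary point from its 2x2 Hessian [[fxx, fxy], [fxy, fyy]].

    The two eigenvalues multiply to det(H) and sum to trace(H), so their
    signs follow from the determinant and trace alone."""
    det = fxx * fyy - fxy ** 2
    if det < 0:
        return "saddle point"      # eigenvalues of opposite sign
    if det > 0:
        return "local minimum" if fxx + fyy > 0 else "local maximum"
    return "degenerate"            # a zero eigenvalue: need higher-order terms

# Hessians of f(x, y) = x^3 + 4xy - 2y^2 + 8 from Example 5.22:
print(classify(0, -4, 4))    # at (0, 0) → saddle point
print(classify(-8, -4, 4))   # at (-4/3, -4/3) → local maximum
```

This reproduces the conclusions reached by hand above.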

Activity 5.L Do problems 5.24–5.25 from Exercises 5.4.2. Send in to the<br />

examiner for feedback at least Ex. 5.25(a) and (d).<br />

5.4.2 Exercises<br />

Ex. 5.23: Find up to the second-order terms of the multi-variable Taylor<br />

series of the following functions about the specified points:



(a) f(x, y) = cos x e^{2y} about (π/2, 0);<br />

(b) f(x, y) = (x + y)/(1 + y) about (0, 0);<br />

(c) f(x, y, z) = e^x √(1 + y^2 + z^2) about (0, 2, 2).<br />

Ex. 5.24: If z = f(x, y) = 2x^2 − 5y^2 show that f_x(0, 0) = f_y(0, 0) = 0 and<br />

hence that (0, 0) is a stationary point.<br />

Re-write the equation for the surface in the form F(x, y, z) = C, for<br />

some constant C, and show that ∇F ‖ k at (0, 0, 0), proving again it<br />

is a stationary point.<br />

Ex. 5.25: Find the stationary points of the given functions and then determine<br />

whether they are local maxima, local minima, or saddle points.<br />

(a) f(x, y) = x^2 − y^2 + xy<br />

(b) f(x, y) = x^2 + y^2 − xy<br />

(c) f(x, y) = x^2 − 3xy + 5x − 2y + 6y^2 + 8<br />

(d) f(x, y) = log(x^2 + y^2 + 1).<br />

(e) f(x, y) = x^5 y + xy^5 + xy .<br />

Ex. 5.26: Find the three stationary points of f(x, y) = x^2 + y^2 + 2 cos(x + y)<br />

and classify the stationary point at the origin.<br />

Ex. 5.27: Analyse the behaviour of f(x, y) = x^3 + 6xy + 3y^2 + 5 at its<br />

stationary points.



5.4.3 Answers to selected Exercises<br />

5.18 x − (1/3)x^3 + (1/5)x^5 − · · · with radius of convergence 1.<br />

5.19 −3 < x < 1<br />

5.23 (a) f(x, y) ≈ −x − 2xy<br />

(b) x + y − xy − y^2<br />

(c) 3 + 3x + (2/3)y + (2/3)z + (3/2)x^2 + (5/54)y^2 + (5/54)z^2 + (2/3)xy + (2/3)xz − (4/27)yz<br />

5.24 For example F = z − 2x^2 + 5y^2 = 0 whence ∇F = −4xi + 10yj + k = k<br />

at x = y = z = 0.<br />

5.25 (a) (0, 0), a saddle point<br />

(b) (0, 0), a local minimum<br />

(c) (−18/5, −11/15), a local minimum<br />

(d) (0, 0), a local minimum<br />

(e) (0, 0), a saddle point.



5.5 Summary<br />

• Tests like the Comparison test, Ratio test and Root test (§§5.2.2) are<br />

useful in determining whether a given infinite series converges or diverges,<br />

but they do not establish what its sum may be. For that it may<br />

be necessary to use direct arguments based on partial sums (§§5.1.2),<br />

or to resort to direct numerical evaluation.<br />

• Infinite series that converge absolutely are “robust”, whereas series<br />

that converge conditionally rely on delicate cancellation of terms in the<br />

series (§§5.2.1).<br />

• Power series are complex/real infinite series with terms involving increasing<br />

powers of a complex/real variable z/x, around a given centre,<br />

c (§5.3). Generally, they converge (absolutely) within a disc of the<br />

complex plane, or an interval of the real line, centred at c. The radius<br />

of this disc is called the radius of convergence, but convergence is not<br />

guaranteed (and conditional at best) on the edge of the disc.<br />

• Power series are used to define functions which are continuous, differentiable<br />

and integrable within their radii of convergence (5.3.1). Conversely,<br />

a given analytic function, f(z), can be represented by a power<br />

series expansion about some centre, c, and this expansion is unique,<br />

being the Taylor series of f about c, or when c = 0, the Maclaurin<br />

series (§§5.3.2).



• Truncated Taylor series, or Taylor polynomials, are used to compute<br />

approximate values for functions (§§5.3.3). The accuracy of these approximations<br />

may be estimated with Lagrange’s remainder (5.3).<br />

• Taylor series are generalised to functions of more than one variable<br />

(§5.4). This is used, for example, in describing the nature of the stationary<br />

points of functions of several variables, where the first-order<br />

derivatives vanish (§§5.4.1). Such points will be local minima, local<br />

maxima or saddle points depending upon the eigenvalues of the Hessian<br />

matrix of second-order derivatives.<br />

Activity 5.M Do representatives of Problems 1–5 and 16–35 from the Chapter<br />

14 Review [K,pp767–8].


Module 6<br />

Series solutions of differential<br />

equations give special functions<br />

“Although this may seem a paradox, all exact science is dominated<br />

by the idea of approximation”<br />

Bertrand Russell<br />

We have seen how linear ordinary differential equations (ode’s) are solved if<br />

they have constant coefficients (Module 1). Higher-order ode’s are first represented<br />

as linear systems of first-order ode’s and then the general solution<br />

will be typically of the form (1.4). The power series solution is the standard<br />

method for solving linear ordinary differential equations with variable


Module 6. Series solutions of differential equations give special functions 186<br />

coefficients. It gives solutions in the form of power series, hence the name.<br />

Power series are also the paramount method for solving otherwise intractable<br />

nonlinear differential equations.<br />

How do variable coefficients arise in differential equations? Perhaps it is best<br />

to first explain how constant coefficients arise. Constant coefficients arise<br />

because one part of space looks very much like another; thus the mathematical<br />

expression of the processes at each point in space is the same and hence<br />

the differential equation modelling the processes is everywhere the same. We<br />

saw this in earlier modules on continuum mechanics. Conversely, differential<br />

equations with variable coefficients arise when different points in space have<br />

different properties. I give two examples:<br />

• look at the waves near a beach. They curve in towards the beach,<br />

steepen and break. Let x measure distance from the beach and h(x)<br />

denote the depth of the water (small near the shore and larger further<br />

away), then the height, y(x), of the waves satisfies a differential equation<br />

of the form h(x)y′′ + h′(x)y′ = · · · with coefficients depending upon<br />

the local water depth;<br />

• in finance, the Black–Scholes equation is used to estimate the current<br />

value of future transactions (see the course on Advanced Mathematics).<br />

Letting s denote the price of a stock, then the value v(s) satisfies a<br />

differential equation of the form rsv′ + (1/2)β^2 s^2 v′′ = · · · where r is the bank<br />

interest rate and β measures how volatile the stock is. The variable<br />

coefficients, rs and (1/2)β^2 s^2, arise because returns are relative to the<br />

investment.



These are two examples of where variable coefficient differential equations<br />

arise. This module supplies tools for the analytic solution of such variable<br />

coefficient differential equations.<br />

In this module we develop not only the general principles and methods, but<br />

also apply them to differential equations that commonly arise in physical<br />

problems. In practice all we do is simply try a power series solution and<br />

see what solutions we obtain, §6.1. This works except when the coefficient<br />

of the highest derivative is zero; in this case we are more inventive, §6.2.<br />

The solutions of these important differential equations have special properties<br />

that make them widely useful, though perhaps not quite so useful as<br />

trigonometric and exponential functions. Called Legendre polynomials and<br />

Bessel functions, these are examples of a wide class of special functions. Inspired<br />

by these examples we then develop Sturm–Liouville theory, in §6.4, to<br />

tell us useful and general properties about the solutions of a wide class of<br />

differential equations.<br />

We also introduce a little computer algebra, §6.3, to help with the repetitive<br />

analysis of this module and to attack nonlinear ode’s.<br />

Module contents<br />

6.1 Power series method leads to Legendre polynomials 189<br />

6.1.1 Introduction to the power series method . . . . . . . . . 190<br />

6.1.2 Legendre’s equation and Legendre polynomials . . . . . 192



6.2 Frobenius method is needed to describe Bessel functions<br />

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195<br />

6.2.1 Frobenius extends the method . . . . . . . . . . . . . . 196<br />

6.2.2 Bessel functions are used in circular geometries . . . . . 201<br />

6.3 Computer algebra for repetitive tasks . . . . . . . . . 206<br />

6.3.1 Introducing reduce . . . . . . . . . . . . . . . . . . . . 208<br />

6.3.2 Introduction to the iterative method . . . . . . . . . . . 211<br />

6.3.3 Iteration is very flexible . . . . . . . . . . . . . . . . . . 222<br />

6.3.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 239<br />

6.3.5 Summary of some reduce commands . . . . . . . . . . 242<br />

6.4 The orthogonal solutions to second order differential<br />

equations . . . . . . . . . . . . . . . . . . . . . . . . . . 245<br />

6.4.1 Answers to selected Exercises . . . . . . . . . . . . . . . 247<br />

6.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . 249



6.1 Power series method leads to Legendre polynomials<br />

In this first section we introduce the fundamental ideas of the power series<br />

method. These ideas are applied to standard differential equations that we<br />

could readily solve other ways. Do not be misled: this is only so that we can<br />

compare the results to the known solutions. The power series method is very<br />

powerful and is applied to even immensely difficult mathematical problems.<br />

Main aims:<br />

• use the uniqueness of power series representations to derive power series<br />

solutions of differential equations;<br />

• see how the method leads to linearly independent power series solutions;<br />

• find the polynomial solutions of Legendre’s equation as an example of<br />

the method.<br />

Note: in this module we will generally seek a solution y as a function of the<br />

independent variable x.



6.1.1 Introduction to the power series method<br />

To find power series solutions to differential equations we simply substitute<br />

a power series and see the logical consequences. In particular, see how neatly<br />

we get two linearly independent solutions <strong>of</strong> a second order ode.<br />

Reading 6.A Study Kreyszig §4.1 [K,pp194–8] and note especially how the<br />

examples work.<br />

Activity 6.B Do problems from Problem Set 4.1 [K,p198]—find the general<br />

solutions in terms <strong>of</strong> arbitrary “integration” constants. Verify for a few<br />

<strong>of</strong> these that the power series method yields the Taylor series expansion<br />

<strong>of</strong> the general analytic solution obtained by well known methods. Send<br />

in to the examiner for feedback at least Q1 & 7.<br />

Most <strong>of</strong> the theoretical basis for using power series to represent functions was<br />

developed in Module 5.<br />

Reading 6.C Read §4.2 [K,pp198–204], but make sure you review the sections<br />

on Shifting summation indices [K,pp202–3] and Existence of power<br />

series solutions [K,pp203–4].<br />

Four important points are the following.



• By the uniqueness of power series coefficients, the zero function must<br />

have zero coefficients. Thus when we compute the left-hand side of a<br />

differential equation as a power series and the right-hand side is zero,<br />

then the coefficient of each power on the left-hand side has to be zero.<br />

This determines the equations for the power series coefficients.<br />

• Being able to shift summation indices is an important skill to learn in<br />

order to quickly develop power series solutions.<br />

• Power series solutions to linear ordinary differential equations exist<br />

and converge for some non-zero radius provided that the coefficient<br />

functions of the differential equation are well-behaved: namely they<br />

can all be expanded in convergent Taylor series and the coefficient of<br />

the highest derivative in the ode does not vanish at the expansion<br />

point.<br />

• Well-behaved functions are called analytic.<br />

Example 6.1: shifting summation indices Perhaps the easiest way to<br />

learn how to shift summation indices is to: write out the first few terms<br />

in the sum; then rewrite as a new sum in the desired form. Usually<br />

the aim is to make the exponent of x the variable of summation. For<br />

example, consider the second derivative<br />

y′′ = Σ_{m=0}^{∞} m(m − 1)a_m x^{m−2} ;



writing out the first 7 terms,<br />

= 0 + 0 + 2·1·a_2 + 3·2·a_3 x + 4·3·a_4 x^2<br />

+ 5·4·a_5 x^3 + 6·5·a_6 x^4 + · · ·<br />

then, rewriting in terms of the exponent of x,<br />

= Σ_{m=0}^{∞} (m + 2)(m + 1)a_{m+2} x^m .<br />

We may use the same summation variable m, or something different<br />

if we wish, because m is a parameter to the sum: it has no meaning<br />

outside of the sum in which it is used, and thus is allowed to mean<br />

different things in different sums.<br />
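As a quick sanity check (an illustration, not from the study book), the shifted and unshifted forms of the sum can be compared numerically for an arbitrary, hypothetical truncated coefficient sequence:

```python
# Compare sum_{m>=2} m(m-1) a_m x^(m-2) with the index-shifted form
# sum_{m>=0} (m+2)(m+1) a_{m+2} x^m for a made-up finite sequence a_0..a_6.
a = [3.0, -1.0, 0.5, 2.0, -0.25, 1.5, 4.0]
x = 0.7

unshifted = sum(m * (m - 1) * a[m] * x ** (m - 2) for m in range(2, len(a)))
shifted = sum((m + 2) * (m + 1) * a[m + 2] * x ** m for m in range(len(a) - 2))

print(abs(unshifted - shifted) < 1e-12)  # → True: the two forms agree
```

The two sums contain exactly the same terms, merely indexed differently, so they agree to rounding error.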

Activity 6.D Do problems from Problem Set 4.2 [K,pp204–5]. Send in to<br />

the examiner for feedback at least Q5, 15 & 23.<br />

6.1.2 Legendre’s equation and Legendre polynomials<br />

In many applications of mathematics we often have a need to solve problems<br />

in a spherical domain or on the surface of a sphere. This might be because we<br />

study the internal dynamics of a star, the weather in the global atmosphere,



the dynamics of a ball, or the deformation of a near spherical drop of water.<br />

In all these cases the differential equations describing the material take a<br />

similar form because of the spherical symmetry. This form leads to Legendre’s<br />

equation,<br />

(1 − x^2)y′′ − 2xy′ + n(n + 1)y = 0 , (6.1)<br />

whose solutions we now explore using the techniques of power series.<br />

When solving problems on a sphere such as the earth: x = sin(latitude)<br />

so that x = ±1 corresponds to the North and South poles and x = 0 the<br />

equator. Consequently, in applications we require that the solutions are well<br />

behaved (analytic) at x = ±1. See that this is an essential ingredient in the<br />

analysis.<br />

We will concentrate on the solutions of Legendre’s equation for integer n.<br />

For example:<br />

n = 1 y = P_1(x) = x satisfies (1 − x^2)y′′ − 2xy′ + 2y = 0 ;<br />

n = 2 y = P_2(x) = (1/2)(3x^2 − 1) satisfies (1 − x^2)y′′ − 2xy′ + 6y = 0 .<br />

But what is the other independent solution for each case? And what about<br />

other values of n?<br />

(The classic differential equations in spherical geometry will be derived and<br />

discussed in the course on Vector calculus and partial differential equations.)<br />

Reading 6.E Study Kreyszig §4.3 [K,pp205–8]. Note that Legendre polynomials<br />

arise as solutions when the parameter n to Legendre’s equation<br />

is integral, n ≥ 0.



Legendre polynomials and associated Legendre functions are readily computed<br />

with Matlab. See below for code to plot Legendre polynomials.<br />

x=linspace(-1,1);<br />
pp=[];<br />
for n=1:4<br />
  p=legendre(n,x);<br />
  pp(n,:)=p(1,:);<br />
end<br />
plot(x,pp)<br />

[Figure: plot of the Legendre polynomials P_1(x), P_2(x), P_3(x) and P_4(x) for −1 ≤ x ≤ 1.]<br />
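The first few Legendre polynomials can also be generated without Matlab, via Bonnet's recurrence (k + 1)P_{k+1}(x) = (2k + 1)x P_k(x) − k P_{k−1}(x). A minimal pure-Python sketch (an illustration, not part of the study book):

```python
def legendre_P(n, x):
    """Evaluate the Legendre polynomial P_n(x) by Bonnet's recurrence:
    (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}, starting from P_0 = 1, P_1 = x."""
    p_prev, p = 1.0, x
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

# P_2(x) should match the (1/2)(3x^2 - 1) quoted earlier for n = 2:
print(legendre_P(2, 0.5))          # → -0.125
print((3 * 0.5 ** 2 - 1) / 2)      # → -0.125
```

The recurrence is exactly how numerical libraries build these polynomials for plotting.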

Activity 6.F Do problems 1–9, 11 & 12 from Problem Set 4.3 [K,pp209–10].<br />

Send in to the examiner for feedback at least Q1, 4 & 8.



6.2 Frobenius method is needed to describe Bessel<br />

functions<br />

We are often interested in mathematically formulating and solving problems<br />

in a circular geometry, for example: the vibrations of a drum; the development<br />

of blood flow along nearly circular arteries and veins; and propagation of<br />

light down an optical fibre. In these circumstances we use polar coordinates<br />

(r, θ) to describe the cross-sectional structures in these circular domains.<br />

Then the unknown fields, say u, are expressed as u = f(r) cos nθ where integer<br />

n parametrises the structure around the circular domain, whence we are<br />

led to solve ode’s for f(r) of the form<br />

f′′ + (1/r)f′ − (n^2/r^2)f = 0 .<br />

Not only does such an equation have variable coefficients, it also has badly<br />

behaved coefficients as r → 0, the very centre of the domain! In this section<br />

we extend the power series method to cope with these interesting sorts of<br />

problems.<br />

(Such solutions of partial differential equations are developed in the course mat2102.)<br />

Main aims:<br />

• generalise the power series method to cope with singular differential<br />

equations via the indicial equation;



• see how the different cases that can arise lead to the different Bessel<br />

function solutions of Bessel’s equation.<br />

6.2.1 Frobenius extends the method<br />

The key to analysing such more general problems, called the Frobenius method,<br />

is to seek a power series in a slightly more general form. For a problem expressed<br />

as an ode for y(x) all we need do is to introduce a prefactor to the<br />

power series of x^r where r is some real or complex number to be determined<br />

as needed.¹ That is, we seek solutions in the form<br />

y(x) = x^r Σ_{m=0}^{∞} a_m x^m = x^r (a_0 + a_1 x + a_2 x^2 + a_3 x^3 + · · ·) . (6.2)

Example 6.2: Find the first few terms in a generalised power series solution<br />

to the ode 4x^2 y′′ + x^2 y′ + y = 0 expanded about the centre x = 0.<br />

Solution:<br />

Substitute the more general power series form<br />

y(x) = a_0 x^r + a_1 x^{r+1} + a_2 x^{r+2} + · · · ,<br />

¹ In more tricky problems still we may resort to not only having a prefactor of x^r, but<br />

also expanding in non-integral powers of x. Trying y(x) = Σ_{m=0}^{∞} a_m x^{r+qm} for some real<br />

or complex r and q is very powerful. But we will not explore this.



noting that its derivatives are<br />

y′ = r a_0 x^{r−1} + (r + 1)a_1 x^r + (r + 2)a_2 x^{r+1} + · · ·<br />

y′′ = r(r − 1)a_0 x^{r−2} + (r + 1)r a_1 x^{r−1} + (r + 2)(r + 1)a_2 x^r + · · · ,<br />

then the ode becomes<br />

4r(r − 1)a_0 x^r + 4(r + 1)r a_1 x^{r+1} + 4(r + 2)(r + 1)a_2 x^{r+2} + · · ·<br />

+ r a_0 x^{r+1} + (r + 1)a_1 x^{r+2} + · · ·<br />

+ a_0 x^r + a_1 x^{r+1} + a_2 x^{r+2} + · · · = 0 .<br />

As before, the fundamental principle is that the complicated generalised<br />

power series on the left-hand side can only be equal to the zero on<br />

the right-hand side if all the coefficients of each power of x vanish.<br />

Grouping all terms in x^r, x^{r+1} and x^{r+2} we must have:<br />

[4r(r − 1) + 1] a_0 = 0 ,<br />

[4(r + 1)r + 1] a_1 + r a_0 = 0 ,<br />

and [4(r + 2)(r + 1) + 1] a_2 + (r + 1)a_1 = 0 .<br />

• Now, without loss of generality we may assume that a_0 ≠ 0.² Thus<br />

we arrive at the indicial equation for r, that 4r(r − 1) + 1 = 0.<br />

This is simply a quadratic for r which factors to (2r − 1)^2 = 0,<br />

thus r = 1/2 and the prefactor to the power series must be simply<br />

√x. a_0 is not constrained (other than being non-zero).<br />

² If a_0 = 0 then we are effectively seeking a power series of the form y = x^{r+1}(a_1 +<br />

a_2 x + · · ·) which is not any different in principle.



• The second equation above, from coefficients of x^{r+1}, says that<br />

a_1 = −r a_0/[4(r + 1)r + 1]. But we know r = 1/2 and hence this<br />

determines a_1 = −a_0/8.<br />

• Similarly, the third equation above, from coefficients of x^{r+2}, says<br />

that a_2 = −(r + 1)a_1/[4(r + 2)(r + 1) + 1]. Hence a_2 = −3a_1/32 =<br />

+3a_0/256.<br />

Thus a power series solution to the ode is<br />

y_1(x) = a_0 √x (1 − (1/8)x + (3/256)x^2 + · · ·) ,<br />

where a_0 is an arbitrary constant.<br />
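The coefficient relations above extend to all orders: the coefficient of x^{r+m} gives [4(r + m)(r + m − 1) + 1]a_m + (r + m − 1)a_{m−1} = 0 for m ≥ 1. A short sketch (an illustration, not from the study book) iterating this recurrence exactly with r = 1/2:

```python
from fractions import Fraction

# Recurrence for 4x^2 y'' + x^2 y' + y = 0 with y = x^r sum a_m x^m, r = 1/2:
# [4(r+m)(r+m-1) + 1] a_m + (r+m-1) a_{m-1} = 0   for m >= 1.
r = Fraction(1, 2)
a = [Fraction(1)]  # take a_0 = 1
for m in range(1, 6):
    s = r + m
    a.append(-(s - 1) * a[m - 1] / (4 * s * (s - 1) + 1))

print(a[1], a[2])  # → -1/8 3/256, matching the hand computation
```

Exact rational arithmetic makes it easy to confirm a_1 = −1/8 and a_2 = 3/256.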

This example leads to two questions: when does the Frobenius method work?<br />

And what happened to the second (linearly independent) solution that must<br />

exist for this second-order ode?

Reading 6.G Study Kreyszig §4.4 [K,pp211–6].<br />

Example 6.3: Find the first few orders in the expansion of a second linearly<br />

independent solution of the ode in Example 6.2.



Solution: See that Example 6.2 is an example of Case 2, when the<br />

indicial equation has a double root. Hence expect a second linearly<br />

independent solution to be<br />

y_2(x) = y_1(x) log x + √x (b_1 x + b_2 x^2 + · · ·) .<br />

Note the omission of b_0 in this expansion in order to avoid introducing<br />

an arbitrary multiple of y_1—we could leave b_0 in, but we would pointlessly<br />

reproduce some of the earlier analysis. Differentiating y_2 leads<br />

to<br />

y′_2 = y′_1 log x + y_1 x^{−1} + (3/2)b_1 x^{1/2} + (5/2)b_2 x^{3/2} + · · · ,<br />

y′′_2 = y′′_1 log x + 2y′_1 x^{−1} − y_1 x^{−2} + (3/4)b_1 x^{−1/2} + (15/4)b_2 x^{1/2} + · · · .<br />

Substitute these into the differential equation:<br />

4x^2 y′′_2 :  4x^2 y′′_1 log x + 8xy′_1 − 4y_1 + 3b_1 x^{3/2} + 15b_2 x^{5/2} + · · ·<br />

x^2 y′_2 :  + x^2 y′_1 log x + xy_1 + (3/2)b_1 x^{5/2} + · · ·<br />

y_2 :  + y_1 log x + b_1 x^{3/2} + b_2 x^{5/2} + · · · = 0 .<br />

• The three terms involving log x immediately cancel because y_1(x)<br />

satisfies the ode.<br />

• Also 8xy′_1 − 4y_1 + xy_1 (upon setting a_0 = 1 in y_1 for simplicity)<br />

becomes just x^{5/2}/16 + · · · —the x^{1/2} term disappears because the<br />

indicial equation has a double root, and the x^{3/2} term disappears<br />

by chance.



• Thus grouping all terms in x^{3/2} and setting its coefficient to zero<br />

leads to 4b_1 = 0, that is b_1 = 0.<br />

• Grouping all terms in x^{5/2} and setting its coefficient to zero leads<br />

to 1/16 + (3/2)b_1 + 16b_2 = 0. Hence b_2 = −1/256.<br />

A second linearly independent solution is thus<br />

y_2 = y_1(x) log x + √x (−(1/256)x^2 + · · ·) .

Note:<br />

• A regular point of a linear ode is any point where all the coefficient<br />

functions are analytic, namely they all have Taylor series expansions<br />

that have a non-zero radius of convergence, and the coefficient of the<br />

highest derivative is non-zero.<br />

If a point is not regular, then it is called a singular point. Singular<br />

points for an ode often arise because of a degeneracy of the coordinate<br />

system and have nothing to do with the subject of the application of<br />

the mathematics. For example, in polar coordinates the point r = 0<br />

is degenerate because all angles θ meet there, but the centre of a<br />

circular domain is usually completely undistinguished, just an ordinary<br />

point of the domain, in the application.



• Convergent Taylor series centred about a regular point can always be<br />

found for the general solution <strong>of</strong> an ode. At singular points, a more<br />

general power series expansion may be needed.<br />

• The Frobenius method straightforwardly applies to higher order differential<br />

equations as well.<br />

Activity 6.H Do problems from Problem Set 4.4 [K,pp216–7]. Send in to<br />

the examiner for feedback at least Q4 & 7.<br />

6.2.2 Bessel functions are used in circular geometries<br />

Bessel’s equation,<br />

x^2 y′′ + xy′ + (x^2 − ν^2)y = 0 , (6.3)<br />

arises in circular or cylindrical geometries (where the variable x would represent<br />

the radial distance). For example, y(x) could represent the deflection,<br />

as a function of radius, of the membrane of a circular drum; or y(x) could<br />

represent the cross-pipe structure in the blood flow along a near circular<br />

artery. Indeed the differential equation mentioned in Example 6.2 is a variant<br />

of Bessel’s differential equation. We now solve this sort of equation using<br />

Frobenius’ method. The solutions for integer ν that we find, Bessel functions<br />

of the first kind, are plotted below.<br />

(The letter “ν” is the Greek letter “nu”, corresponding to the English “n”.)<br />



[Figure: plot of the Bessel functions J_0(x), J_1(x), J_2(x), J_3(x) and J_4(x)<br />

for 0 ≤ x ≤ 10, produced by the Matlab code:]<br />

x=linspace(0,10);<br />
j=besselj((0:4)',x);<br />
plot(x,j)<br />

Reading 6.I Study Kreyszig §4.5 [K,pp218–225].<br />

Positive order Bessel functions are relevant. In applications the Bessel<br />

functions of order ν ≥ 0 are the ones of interest. Observe that as x → 0, the<br />

Bessel functions J_ν(x) ∼ a_0 x^ν, which tends to zero if the order ν is positive,<br />

but goes to infinity if the order ν is negative. In most applications the variable<br />

x is the radius r. Thus x → 0 corresponds to approaching the centre of the<br />

domain. The general solution to Bessel’s equation is y = c_1 J_ν(x) + c_2 J_{−ν}(x),<br />

but in applications we usually cannot tolerate solutions going to infinity and



so the arbitrary constant c_2 = 0 in order to eliminate the bad behaviour of<br />

Bessel functions of negative order. This just leaves the physically interesting<br />

solution to be y = c_1 J_ν(x) for ν ≥ 0.<br />

Variable transforms are useful. Now that we are investigating ode’s with<br />

variable coefficients we find a much richer range of possible ode’s. Some<br />

of these may be transformed into a well-studied ode such as Bessel’s or<br />

Legendre’s equations. For example, if we can deduce that the solutions to a<br />

strange ode are J_ν(x^2) or P_n(√x) then we immediately know lots about the<br />

solutions. Thus one useful technique is that of transforming an ode from one<br />

form into another, hopefully well-known, form.<br />

Example 6.4: transform an ode to Bessel’s equation Consider, as an<br />

example, Problem 3 in Problem Set 4.5 of Kreyszig, p226. The task is<br />

to transform the ode in y(x) into Bessel’s ode for y(z) where z = x^2.<br />

Then we would be able to say that the solution is known to be y ∝<br />

J_ν(z) for some ν, and hence know the solution to the original ode is<br />

y ∝ J_ν(x^2) .<br />

The challenge is to transform the x-derivatives in the original ode,<br />

x^2 y′′ + xy′ + (4x^4 − 1/4)y = 0, into derivatives with respect to z. We do<br />

this using the chain rule. Among many equally valid routes, see the<br />

logic in the following for both the first and second derivatives.<br />

dy/dx = dy/dz × dz/dx    by the chain rule



= (dy/dz) 2x    as z = x^2<br />

= 2√z dy/dz    as x = √z .<br />

Then<br />

d^2y/dx^2 = d/dx (dy/dx)<br />

= d/dx (2√z dy/dz)    by the above expression for dy/dx<br />

= d/dz (2√z dy/dz) × dz/dx    by the chain rule<br />

= 2√z d/dz (2√z dy/dz)    as z = x^2<br />

= 2 dy/dz + 4z d^2y/dz^2    by the derivative of a product.<br />

Then substitute these derivatives into the original ode to deduce the equivalent ode

x² (2 dy/dz + 4z d²y/dz²) + x (2√z dy/dz) + (4x⁴ − 1/4) y = 0 ;

that is, using x = √z,

4z² d²y/dz² + 4z dy/dz + (4z² − 1/4) y = 0 ;

upon dividing by 4,

z² d²y/dz² + z dy/dz + (z² − 1/16) y = 0 .

This is Bessel's ode for y(z) with parameter ν = 1/4. Thus its solutions are, for example, y ∝ J_1/4(z) = J_1/4(x²) .
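The chain-rule identity derived in this example, d²y/dx² = 2 dy/dz + 4z d²y/dz² at z = x², is easy to sanity-check numerically. The following Python sketch (our own illustration only; the unit's software is reduce) compares a finite-difference second derivative of y(x²) against the identity for the sample choice y(z) = sin z:

```python
import math

def f(x):
    # f(x) = y(x^2) with the sample choice y(z) = sin z
    return math.sin(x * x)

def d2f_numeric(x, h=1e-4):
    # central second difference approximating d^2f/dx^2
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

def d2f_identity(x):
    # the derived identity: d^2y/dx^2 = 2*dy/dz + 4*z*d^2y/dz^2 at z = x^2,
    # using dy/dz = cos z and d^2y/dz^2 = -sin z for y = sin z
    z = x * x
    return 2.0 * math.cos(z) - 4.0 * z * math.sin(z)

for x in (0.3, 0.7, 1.1):
    assert abs(d2f_numeric(x) - d2f_identity(x)) < 1e-5
print("chain-rule identity confirmed")
```

Any smooth y(z) would do here; sin z is convenient because both z-derivatives are known exactly.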

Activity 6.J Do problems from Problem Set 4.5 [K, 226–7]. Send in to the examiner for feedback at least Q4 & 11.

Exercise 6.5: Consider the differential equation 2y″ − 4xy′ + (4x² − 6)y = 0 .

(a) Briefly explain why you would expect it to have power series solutions in the form of the Maclaurin series y = Σ_{n=0}^∞ aₙxⁿ .

(b) Hence construct the first few terms in a power series, with errors O(x⁴), of the solution to the differential equation with y(0) = 1 and y′(0) = 0.



6.3 Computer algebra for repetitive tasks

The whole of the developments and operations of analysis are now capable of being executed by machinery. . . . As soon as an Analytical Engine exists, it will necessarily guide the future course of science. Charles Babbage in Passages from the Life of a Philosopher (London 1864)

"On two occasions I have been asked [by members of Parliament!], 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question." Charles Babbage

Software packages for computer algebra can do incredibly sophisticated analysis. However, mostly we want computers to do the tedious repetitive tasks—those where it is worth investing our time to make sure the computer is doing what we want. Developing power series solutions of differential equations is an ideal application.

Main aims:

• see how computer algebra can be usefully employed to do tedious tasks;

• use simple iteration to develop power series solutions of linear and nonlinear differential equations;

• make iteration more flexible by basing it upon the residual of the governing equations.

We will use the free demonstration copies of reduce³ available from:

windows PC: ftp://ftp.maths.bath.ac.uk/pub/algebra, download in binary demored.exe and demored.img;

linux: ftp://ftp.zib.de/pub/reduce/demo/linux;

Macintosh: ftp://ftp.maths.bath.ac.uk/pub/algebra, download and unpack demored.hqx.

Check that you can start and run reduce; it should open up a window saying something like

REDUCE 3.6, patched to 30 Aug 98...
1:

³ There is always a limitation in using free, demonstration copies. Here the main restriction on the demonstration version is that "garbage collection" is disabled in reduce. What that means in practice is that only small to medium amounts of computer algebra can be done before having to restart reduce. It is probably best to solve one problem at a time, restarting reduce in between each problem.

(We generally use such a coloured, teletype font for computer instructions and dialogue.)


The "1:" is a prompt for a command: type quit; followed by the return or enter key for reduce to finish. If this works, you can run reduce.

• If you cannot get reduce to execute on your computer system, contact us for help. However, in the meantime you may start your work by using a telnet application to connect over the internet to the computer marlene.zib.de,⁴ login as reducet and with an empty password. A reduce session will start for you; it is a little slow but at least you can make progress with your work.

• A summary of the reduce commands that we will use is given in §6.3.5.

• A simple introduction to reduce is given in the following Section 6.3.1.

• http://www.zib.de/Symbolik/reduce/Overview/Overview.html is an on-line overview of the capabilities of reduce.

• http://www.uni-koeln.de/cgi-bin/redref/redr_dir.html gives extensive online help on the commands and syntax of reduce.

6.3.1 Introducing reduce

• Start reduce in Unix by typing reduce in a command window. To exit from reduce type the command quit; followed by the enter key.

⁴ Courtesy of Konrad-Zuse-Zentrum für Informationstechnik, Berlin

• Note: all reduce statements must be terminated with a semi-colon. Do not forget. They are subsequently executed by pressing the enter key.

• reduce uses exact arithmetic by default: for example, to find 100! in full gory detail type factorial(100); then enter (I will not mention the enter key again unless necessary).

• Identifiers, usually single letters, denote either variables or expressions: in f:=2*x^2+3*x-5; the identifier x is a variable whereas f, after the assignment with :=, contains the above expression; similarly after g:=x^2-x-6; then g contains an algebraic expression.

• Expressions may be added with f+g; subtracted with f-g; multiplied with f*g; divided with f/g; exponentiated with f^3; etc.

• Straightforward equations may be solved (by default equal to zero): solve(x^2-x-6,x); or through using an expression previously found such as solve(f,x); .
such as solve(f,x); .


Systems of equations may be solved by giving a list (enclosed in braces) of equations and a list of variables to be determined. For example, solve({x-y=2,a*x+y=0},{x,y}); returns the solution parametrised by a.

• Basic calculus is a snap:

differentiation uses the function df, as in df(f,x); to find the first derivative; or df(g,x,x); for the second; or df(sin(x*y),x,y); for a mixed derivative. The product rule for differentiation is verified for the above two functions by df(f*g,x)-df(f,x)*g-f*df(g,x); reducing to zero.

integration is similar, int(f,x); giving the integral of the polynomial in f, without an integration constant, but perhaps more impressive is the almost instant integration of int(x^5*cos(x^2),x); . Note that repeated integration must be done by repeated invocations of int, not by further arguments as for df. Instead, for example, int(f,x,0,2); will give you the definite integral from 0 to 2.

• One can substitute an expression for a variable in another expression. For example, the composition f(g(x)) is computed by sub(x=g,f); .

• reduce allows you to use many lines for the one command: a command is not terminated until the semi-colon is typed. reduce alerts you to the fact that you are still entering the one command by displaying the prompt again. Thus if you forget the semi-colon, just type a semi-colon at the new prompt and then the enter key to execute what you had typed on the previous lines.

• If reduce displays an error message along the lines of Declare xxx operator ? then you have probably mistyped something and the best answer is to type N then enter.
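The df and int demonstrations above are easy to mimic for polynomials in any language. The following Python sketch (our own illustration, not part of the unit's software) represents a polynomial by its list of coefficients and checks the same product-rule identity, with exact rational arithmetic from the standard library:

```python
from fractions import Fraction

# A polynomial is a list of coefficients [a0, a1, a2, ...], lowest power first.

def dif(p):
    # analogue of reduce's df(p,x): term-by-term derivative
    return [Fraction(n) * c for n, c in enumerate(p)][1:]

def integ(p):
    # analogue of int(p,x): the antiderivative that is zero at x = 0
    return [Fraction(0)] + [c / (n + 1) for n, c in enumerate(p)]

def mul(p, q):
    # polynomial product by convolution of coefficients
    r = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

def sub(p, q):
    # p - q, padding the shorter list with zeros
    n = max(len(p), len(q))
    p = list(p) + [Fraction(0)] * (n - len(p))
    q = list(q) + [Fraction(0)] * (n - len(q))
    return [a - b for a, b in zip(p, q)]

f = [Fraction(-5), Fraction(3), Fraction(2)]   # f = 2x^2 + 3x - 5
g = [Fraction(-6), Fraction(-1), Fraction(1)]  # g = x^2 - x - 6

# product rule: df(f*g,x) - df(f,x)*g - f*df(g,x) reduces to zero
residual = sub(sub(dif(mul(f, g)), mul(dif(f), g)), mul(f, dif(g)))
assert all(c == 0 for c in residual)

# and integration undoes differentiation (the integration constant here is zero)
assert dif(integ(f)) == f
print("product rule verified")
```

The same coefficient-list machinery is reused in the iteration examples of the next subsection.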

6.3.2 Introduction to the iterative method

Computers are extremely good at repeating the same thing many times over. We use this aspect to find power series solutions of some simple differential equations, and then some "horrible" nonlinear differential equations. The ideas are developed by example.

Example 6.6: The solution to y″ + y = 0, y(0) = 1 and y′(0) = 0 is y = cos x. Find the Maclaurin series solution by iteration, first by hand and secondly using computer algebra.

Solution: Rearrange this ode to y″ = −y and then formally integrate twice to y = −∫∫ y dx dx. These integrals on the right-hand side are indefinite integrals, so implicit constants of integration, say a + bx, should appear on the right-hand side. But we know that the cosine solution


Module 6. Series solutions <strong>of</strong> differential equations give special functions 212<br />

to y ′′ + y = 0 has y(0) = 1 and y ′ (0) = 0 so surely we should set a = 1<br />

and b = 0 to account for these initial conditions. Thus<br />

∫∫<br />

y = 1 − y dx dx (6.4)<br />

w<strong>here</strong> <strong>here</strong> the integrals are implicitly the definite integral from 0 to<br />

x. This rearrangement incorporates the information <strong>of</strong> the ode and its<br />

initial conditions.<br />

In this form we readily find its power series solution by iteration: given<br />

an approximation y n (x) we find a new approximation by evaluating<br />

∫∫<br />

y n+1 = 1 − y n dx dx .<br />

First try by hand starting from y₀ = 1:

• y₁ = 1 − ∫∫ 1 dx dx = 1 − (1/2)x² ;

• y₂ = 1 − ∫∫ (1 − (1/2)x²) dx dx = 1 − (1/2)x² + (1/24)x⁴ .

See these are the first few terms in the Maclaurin series for cos x. (Interestingly, this is Picard iteration, which is also used to prove existence of solutions to ode's.) Now try using reduce to do the algebra:

• first type the three commands on div;, off allfac; and on revpri; (do not forget the semi-colon to logically terminate each command and the return or enter key to get reduce to actually execute the line you have typed)—these commands tell reduce to format its output in a nice way for power series;


• second set a variable to the first approximation by typing y:=1; which assigns the value one to the variable y;

• type y:=1-int(int(y,x),x); to assign the first approximation, y₁ = 1 − x²/2, to the variable y—int(y,x) computes an integral with respect to x of whatever is in y; fortunately for us, for polynomial y it computes the integral which is zero at x = 0;

• type y:=1-int(int(y,x),x); again to compute y₂, etc;

• iterative loops are standard in computer languages and computer algebra is no exception, so type for n:=3:8 do y:=1-int(int(y,x),x); to compute further iterations. But nothing was printed, so finally type y; to see the resulting power series for cos x.

The entire dialogue should look like this:

1: on div;

2: off allfac;

3: on revpri;

4: y:=1;

y := 1

5: y:=1-int(int(y,x),x);

          1   2
y := 1 - ---*x
          2

6: y:=1-int(int(y,x),x);

          1   2    1    4
y := 1 - ---*x  + ----*x
          2        24

8: for n:=3:8 do y:=1-int(int(y,x),x);

9: y;

     1   2    1    4     1    6      1      8       1        10
1 - ---*x  + ----*x  - -----*x  + -------*x  - ---------*x
     2        24        720        40320        3628800

       1        12         1         14            1          16
 + -----------*x   - -------------*x   + ----------------*x
    479001600         87178291200         20922789888000
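The same iteration is easy to replicate outside reduce. Here is a Python sketch (our own illustration, using exact rational arithmetic from the standard library) of the iteration yₙ₊₁ = 1 − ∫∫ yₙ, truncating the series at x¹⁶:

```python
from fractions import Fraction
from math import factorial

def double_integral(p):
    # definite double integral from 0 to x of [a0, a1, ...]:
    # each x^n becomes x^(n+2)/((n+1)(n+2))
    return [Fraction(0), Fraction(0)] + [c / ((n + 1) * (n + 2)) for n, c in enumerate(p)]

N = 16                      # truncate the series at x^N
y = [Fraction(1)]           # initial approximation y0 = 1
for _ in range(10):         # iterate y <- 1 - int(int(y,x),x)
    yy = double_integral(y)[:N + 1]
    y = [Fraction(1) - yy[0]] + [-c for c in yy[1:]]

# the iterates settle on the Maclaurin series of cos x:
# the coefficient of x^(2k) is (-1)^k/(2k)! and odd coefficients vanish
for k in range(9):
    assert y[2 * k] == Fraction((-1) ** k, factorial(2 * k))
assert all(y[m] == 0 for m in range(1, N + 1, 2))
print("matches the series of cos x up to x^16")
```

Each pass of the loop fixes two more orders of the series, just as each reduce iterate above gained the next term.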

Example 6.7: Find the general Maclaurin series solution to y″ + y = 0 using computer algebra (reduce).


Solution: In the previous example we built in the specific initial conditions appropriate to y = cos x, namely y(0) = 1 and y′(0) = 0. If we make the integration constants arbitrary, by iterating

y = a + bx − ∫∫ y dx dx ,

then we recover the general solution parametrised by a and b where y(0) = a and y′(0) = b. Let's do it. Start reduce:

• type factor a,b; to get reduce to group all terms in a and all terms in b;

• set the initial value to something simple satisfying the initial conditions: y:=a+b*x;

• iterate for n:=1:4 do write y:=a+b*x-int(int(y,x),x); using the write command to print each iterate.

The dialogue is (always remember to start reduce with on div; off allfac; on revpri;):

4: factor a,b;

5: y:=a+b*x;

y := b*x + a

6: for n:=1:4 do write y:=a+b*x-int(int(y,x),x);

            1   3            1   2
y := b*(x - ---*x ) + a*(1 - ---*x )
             6                2

            1   3     1    5            1   2    1    4
y := b*(x - ---*x  + -----*x ) + a*(1 - ---*x  + ----*x )
             6         120               2        24

            1   3     1    5      1    7
y := b*(x - ---*x  + -----*x  - ------*x )
             6         120        5040

           1   2    1    4     1    6
 + a*(1 - ---*x  + ----*x  - -----*x )
           2        24        720

            1   3     1    5      1    7       1     9
y := b*(x - ---*x  + -----*x  - ------*x  + --------*x )
             6         120        5040        362880

           1   2    1    4     1    6      1      8
 + a*(1 - ---*x  + ----*x  - -----*x  + -------*x )
           2        24        720        40320

See how easily this generates the Maclaurin series for y = a cos x + b sin x.
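By linearity the a-part and the b-part of this iteration can be developed separately. A Python sketch (again our own illustration) of the b-part alone, iterating y ← x − ∫∫ y from y₀ = x, reproduces the sin x series:

```python
from fractions import Fraction
from math import factorial

def double_integral(p):
    # definite double integral from 0 to x: x^n -> x^(n+2)/((n+1)(n+2))
    return [Fraction(0), Fraction(0)] + [c / ((n + 1) * (n + 2)) for n, c in enumerate(p)]

N = 15
y = [Fraction(0), Fraction(1)]          # y0 = x, that is a = 0 and b = 1
for _ in range(10):                     # iterate y <- x - int(int(y,x),x)
    yy = double_integral(y)[:N + 1]
    base = [Fraction(0), Fraction(1)] + [Fraction(0)] * (len(yy) - 2)
    y = [a - c for a, c in zip(base, yy)]

# the coefficient of x^(2k+1) is (-1)^k/(2k+1)!, the Maclaurin series of sin x
for k in range(8):
    assert y[2 * k + 1] == Fraction((-1) ** k, factorial(2 * k + 1))
print("matches the series of sin x up to x^15")
```

Starting from 1 instead of x recovers the cos x series of the previous example, and the general solution is a times one run plus b times the other.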

Now let's try something rather hard—in fact, almost impossible to solve quantitatively except via power series methods. We now use precisely the same iteration to solve a nonlinear ode!

Example 6.8: Find the Maclaurin series solution to the nonlinear ode y″ = 6y², y(0) = 1 and y′(0) = −2.

Before solving this as a power series (by my design its exact solution just happens to be y = 1/(1 + x)²), investigate it qualitatively using techniques developed earlier by considering it as a system of first-order differential equations. Introduce z(x) = y′; then the equivalent system is

y′ = z ,   z′ = 6y² .

Hence the evolution in the phase plane is dictated by the arrows shown below, with the particular trajectory starting from the initial condition (1, −2) shown in green:
(1, −2) shown in green:


Module 6. Series solutions <strong>of</strong> differential equations give special functions 218<br />

z=y'<br />

0<br />

-0.2<br />

-0.4<br />

-0.6<br />

-0.8<br />

-1<br />

-1.2<br />

-1.4<br />

-1.6<br />

-1.8<br />

-2<br />

-0.5 0 0.5 1<br />

y<br />

[y,z]=meshgrid(-.5:.1:1,-2:.1:0);<br />

quiver(y,z,z,6*y.^2)<br />

hold on<br />

x=linspace(0,7);<br />

y=1./(1+x).^2;<br />

z=-2./(1+x).^3;<br />

plot(y,z,’g’)<br />

hold <strong>of</strong>f<br />

Solution: Now we find its power series solution! As before, recast the ode in the following form that also incorporates the initial conditions, by formally integrating the ode twice:

y = 1 − 2x + 6 ∫∫ y² dx dx ,   (6.5)

where again the repeated x integral is assumed done so that each integral is zero at x = 0. Then iterate, starting from y₀ = 1 − 2x say:

y₁ = 1 − 2x + 6 ∫∫ (1 − 4x + 4x²) dx dx
   = 1 − 2x + 3x² − 4x³ + 2x⁴ ;

y₂ = 1 − 2x + 6 ∫∫ (1 − 2x + 3x² − 4x³ + 2x⁴)² dx dx
   = 1 − 2x + 6 ∫∫ (1 − 4x + 10x² − 20x³ + 29x⁴ − 32x⁵ + 28x⁶ − 16x⁷ + 4x⁸) dx dx
   = 1 − 2x + 3x² − 4x³ + 5x⁴ − 6x⁵ + (29/5)x⁶ − (32/7)x⁷ + 3x⁸ − (4/3)x⁹ + (4/15)x¹⁰ .

This is quickly becoming horrible. But that is just why computers are made. Before rushing in to use reduce, observe that here the quadratic nonlinearity y² is going to generate very high powers of x, most of which we do not want. For example, in y₂ above the terms up to x⁵ are correct, but all the higher powers are as yet wrong.⁵ Another

⁵ The quadratic nonlinearity y² rapidly generates high powers of x in the expressions. However, the iteration plods along, only getting one or two orders of x more accurate each iteration.


iteration would generate a 22nd order polynomial for y₃, of which only the first 8 coefficients are correct; the rest are rubbish. In reduce we discard such high order terms in a power series by using, for example, the command let x^10=>0; which tells reduce to discard, set to zero, or otherwise ignore all terms with a power of x of ten or more. This is just what we want. Thus here the dialogue would be:

5: let x^10=>0;

6: y:=1-2*x;

y := 1 - 2*x

7: for n:=1:5 do write y:=1-2*x+6*int(int(y^2,x),x);

                2      3      4
y := 1 - 2*x + 3*x  - 4*x  + 2*x

                2      3      4      5    29    6    32    7      8
y := 1 - 2*x + 3*x  - 4*x  + 5*x  - 6*x  + ----*x  - ----*x  + 3*x
                                            5         7

    4   9
 - ---*x
    3

                2      3      4      5      6      7    306    8
y := 1 - 2*x + 3*x  - 4*x  + 5*x  - 6*x  + 7*x  - 8*x  + -----*x
                                                          35

    316    9
 - -----*x
     35

                2      3      4      5      6      7      8       9
y := 1 - 2*x + 3*x  - 4*x  + 5*x  - 6*x  + 7*x  - 8*x  + 9*x  - 10*x

                2      3      4      5      6      7      8       9
y := 1 - 2*x + 3*x  - 4*x  + 5*x  - 6*x  + 7*x  - 8*x  + 9*x  - 10*x

See how the iteration settles on the correct power series, with all terms of power ten or higher neglected. Check this satisfies the ode by computing the residual df(y,x,x)-6*y^2; (df(y,x) computes the derivative of y with respect to x and df(y,x,x) computes the second derivative); the result is zero except for two terms in x⁸ and x⁹ which would cancel with the second derivative of the absent tenth and eleventh order terms. We thus triumphantly write the solution of this nonlinear ode as

y = 1 − 2x + 3x² − 4x³ + 5x⁴ − 6x⁵ + 7x⁶ − 8x⁷ + 9x⁸ − 10x⁹ + O(x¹⁰) ,

where O(x¹⁰) (read "order of x¹⁰") tells us that the error in the power series, the neglected terms, are x¹⁰ or higher powers.
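A Python sketch of the same truncated iteration (our own illustration; the list truncation plays the role of let x^10=>0;) confirms the pattern of coefficients:

```python
from fractions import Fraction

N = 10  # discard x^10 and higher, mimicking "let x^10=>0;"

def mul(p, q):
    # product of coefficient lists, truncated at x^N
    r = [Fraction(0)] * min(N, len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j < N:
                r[i + j] += a * b
    return r

def double_integral(p):
    # definite double integral from 0 to x, truncated at x^N
    return ([Fraction(0), Fraction(0)] +
            [c / ((n + 1) * (n + 2)) for n, c in enumerate(p)])[:N]

y = [Fraction(1), Fraction(-2)]              # y0 = 1 - 2x
for _ in range(8):                           # y <- 1 - 2x + 6*int(int(y^2))
    yy = double_integral(mul(y, y))
    y = [Fraction(1) + 6 * yy[0], Fraction(-2) + 6 * yy[1]] + [6 * c for c in yy[2:]]

# the exact solution 1/(1+x)^2 has coefficients (-1)^n (n+1)
assert y == [Fraction((-1) ** n * (n + 1)) for n in range(N)]
print(y)
```

As in the reduce run, the iterates stop changing once every retained coefficient is correct.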

In the above three examples we have developed the Taylor series about x = 0, the Maclaurin series. To find Taylor series about any point x = c it is simply a matter of changing the independent variable to, for example, t = x − c and then finding the Maclaurin series in t. We will continue to find only Maclaurin series because that is all we need to also find other power series solutions.

Activity 6.K Do Problems 6.13–6.17 in the Exercises 6.3.4, p239. Send in to the examiner for feedback at least Ex. 6.13 & 6.14.

6.3.3 Iteration is very flexible

So far we have simply rearranged an ode in order to derive an iteration that will generate the desired power series solution.⁶ In this subsection we discuss why this strategy works at all, and what extension we need in order to solve a very wide range of differential equations.

The iteration works because integration is basically a smoothing operation. This smoothing tends to reduce errors in a power series. For example, suppose an error was O(x³), so that it is roughly about 10⁻³ when x = 0.1 say:

⁶ What we have done is rather remarkable. In the course on Numerical Computing you will learn about fixed point iteration as a method of solving linear and nonlinear equations. In fact we have done precisely fixed point iteration here. The remarkable difference is that in Numerical Computing you will simply find the one number that satisfies a given equation; here you have found the function, via its power series, that satisfies the given differential equation—a much more difficult task. Nonetheless the strategy of appropriately rearranging the equation and iterating works.


then integrating it twice will lead to an error O(x⁵) in the integral, which is much smaller in magnitude, roughly 10⁻⁵ when x = 0.1. Conversely, differentiation magnifies errors: two derivatives of an error O(x³) becomes an error O(x) which, at roughly 10⁻¹ when x = 0.1, is much larger. To make errors smaller, equivalently to push them to higher powers in x, we generally need to integrate. Thus an integral reformulation of an ode is the basis for a successful iterative solution.

The other question is: how do we know how many iterations should be performed? The answer here is simple: keep iterating until there is no more change to the solution. One consequence of the answer, though, is that we have to keep track of the change in the approximations. A good way to find the change in an approximation is to solve for it explicitly. But first we have to find an equation for the small change in the approximate solution at each iteration. This leads us to a powerful iterative framework, based upon the residual of the ode, which we develop and explore by example.

Example 6.9: Legendre functions. Use iteration to find the general Maclaurin series solutions to Legendre's equation (6.1), written here as

(1 − x²)y″ − 2xy′ + ky = 0   for k = n(n + 1) ,

to an error O(x¹⁰) for initial conditions y(0) = 1 and y′(0) = 0.

Solution: Immediately an initial approximation is

y₀ = 1 ,


as this satisfies the initial conditions. The iterative challenge is: given a known approximation yₙ, find an improved solution

yₙ₊₁(x) = yₙ(x) + ŷₙ(x) ,

(ŷ is read as "y-hat") where ŷₙ is the as yet unknown change in the approximation that we have to find. Now substitute this form for yₙ₊₁ into the ode and rearrange to put all the known terms on the right-hand side and all the unknown on the left:

−(1 − x²)ŷₙ″ + 2xŷₙ′ − kŷₙ = (1 − x²)yₙ″ − 2xyₙ′ + kyₙ .

This looks like a differential equation for the as yet unknown change ŷₙ, forced by the known right-hand side, the residual of Legendre's equation evaluated at the current approximation, Rₙ = (1 − x²)yₙ″ − 2xyₙ′ + kyₙ. For example, the first residual, from y₀ = 1, is R₀ = k. But this ode for the change is far too complicated—indeed if we could solve it exactly then the problem would be over immediately. Instead we seek a simplification to make the ode for ŷₙ tractable while still useful. The general principles of the simplification are that in any terms involving ŷₙ:

• near the point of expansion x = 0, x is much smaller than 1 and x² is even smaller still, thus we neglect higher powers of x relative to lower powers—so in this example we replace the (1 − x²) factor by 1 because the x² is negligible in comparison to 1 for the small x near the point of expansion;
near the point <strong>of</strong> expansion;


Module 6. Series solutions <strong>of</strong> differential equations give special functions 225<br />

• also, though be careful, because differentiation increases errors as<br />

differentiation by x corresponds roughly to lowering the power<br />

<strong>of</strong> x by 1 (equivalently it roughly corresponds to dividing by x)<br />

we neglect low order derivatives <strong>of</strong> ŷ n (provided they are not also<br />

divided by x)—so in this example xŷ n ′ is roughly <strong>of</strong> the same “size”<br />

as ŷ n because the derivative makes it larger but the multiplication<br />

by x cancels this effect, but both <strong>of</strong> these terms are smaller than<br />

ŷ n ′′ which is roughly 1/x2 times larger.<br />

After this simplification, the ode for the change then reduces to

−ŷₙ″ = Rₙ(x) = (1 − x²)yₙ″ − 2xyₙ′ + kyₙ .

In the first iteration, as R₀ = k, ŷ₀″ = −k, which upon integrating twice leads to the requisite change being ŷ₀ = −kx²/2.

But what about the constants of integration? In this approach the initial approximation satisfies the initial conditions y(0) = 1 and y′(0) = 0. We ensure these are satisfied by all approximations by ensuring all the changes ŷₙ satisfy the corresponding homogeneous initial conditions ŷₙ(0) = ŷₙ′(0) = 0. Thus, for example, the change ŷ₀ above is indeed correct. Hence the next approximation is y₁ = 1 − kx²/2.

We could continue doing this by hand, but the plan is to use computer algebra to do the tediously repetitious iteration.

• The initial approximation is set simply by y:=1;

• We wish to discard any powers generated of O(x¹⁰), so include the declaration let x^10=>0;

• To iterate until the change is negligible use the repeat loop, namely repeat ... until r=0; where we will use r to store the residual and the change.

• The repeat-until construct in reduce, unlike many other computing languages, expects only a single statement between the repeat and the until—we bracket the multiple statements needed inside with a begin ... end

• Inside the loop:

– compute the residual, r:=(1-x^2)*df(y,x,x)-2*x*df(y,x)+k*y;

– compute the change, r:=-int(int(r,x),x);

– update the approximation, write y:=y+r;
The reduce dialogue might be:<br />

4: y:=1;<br />

y := 1<br />

5: let x^10=>0;<br />

6: repeat begin<br />

6: r:=(1-x^2)*df(y,x,x)-2*x*df(y,x)+k*y;<br />

6: r:=-int(int(r,x),x);<br />

6: write y:=y+r;


Module 6. Series solutions <strong>of</strong> differential equations give special functions 227<br />

6: end until r=0;<br />

1 2<br />

y := 1 - ---*k*x<br />

2<br />

1 2 1 4 1 2 4<br />

y := 1 - ---*k*x - ---*k*x + ----*k *x<br />

2 4 24<br />

1 2 1 4 1 6 1 2 4 13 2 6<br />

y := 1 - ---*k*x - ---*k*x - ---*k*x + ----*k *x + -----*k *x<br />

2 4 6 24 360<br />

1 3 6<br />

- -----*k *x<br />

720<br />

1 2 1 4 1 6 1 8 1 2 4<br />

y := 1 - ---*k*x - ---*k*x - ---*k*x - ---*k*x + ----*k *x<br />

2 4 6 8 24<br />

13 2 6 101 2 8 1 3 6 17 3 8<br />

+ -----*k *x + ------*k *x - -----*k *x - -------*k *x<br />

360 3360 720 10080<br />

1 4 8<br />

+ -------*k *x<br />

40320


Module 6. Series solutions <strong>of</strong> differential equations give special functions 228<br />

1 2 1 4 1 6 1 8 1 2 4<br />

y := 1 - ---*k*x - ---*k*x - ---*k*x - ---*k*x + ----*k *x<br />

2 4 6 8 24<br />

13 2 6 101 2 8 1 3 6 17 3 8<br />

+ -----*k *x + ------*k *x - -----*k *x - -------*k *x<br />

360 3360 720 10080<br />

1 4 8<br />

+ -------*k *x<br />

40320<br />

It is painful having to retype the entire loop any time one typing mistake is made. Instead prepare a file, called say leg.red, containing the reduce commands (including an extra end; at the end):

on div; off allfac; on revpri;
factor x;
y:=1;
let x^10=>0;
repeat begin
r:=(1-x^2)*df(y,x,x)-2*x*df(y,x)+k*y;
r:=-int(int(r,x),x);
write y:=y+r;
end until r=0;
end;
end;


then start reduce and get all these commands executed by typing in "leg.red"; The output gives the desired Maclaurin series to be

y = 1 − (1/2)k x² + ((1/24)k² − (1/4)k) x⁴ − ((1/720)k³ − (13/360)k² + (1/6)k) x⁶
  + ((1/40320)k⁴ − (17/10080)k³ + (101/3360)k² − (1/8)k) x⁸ + O(x¹⁰) .
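The repeat-until loop above translates directly into other languages. In this Python sketch (our own illustration) we fix the parameter at k = 6, that is n = 2—our choice, made because then the even series must terminate: the loop converges to y = 1 − 3x², proportional to the Legendre polynomial P₂(x):

```python
from fractions import Fraction

N = 10                 # work to errors O(x^10), mimicking "let x^10=>0;"
k = Fraction(6)        # k = n(n+1) with n = 2

def pad(p, n):
    return list(p) + [Fraction(0)] * (n - len(p))

def dif(p):
    return [Fraction(m) * c for m, c in enumerate(p)][1:]

def residual(y):
    # R = (1 - x^2) y'' - 2x y' + k y, truncated at x^N
    d1, d2, y = pad(dif(y), N), pad(dif(dif(y)), N), pad(y, N)
    return [d2[m] - (d2[m - 2] if m >= 2 else 0)
            - 2 * (d1[m - 1] if m >= 1 else 0) + k * y[m] for m in range(N)]

def neg_double_integral(p):
    # the change: minus the double integral, zero at x = 0, truncated
    return ([Fraction(0), Fraction(0)] +
            [-c / ((m + 1) * (m + 2)) for m, c in enumerate(p)])[:N]

y = [Fraction(1)]      # initial approximation: y(0) = 1, y'(0) = 0
while True:            # repeat ... until r = 0
    change = neg_double_integral(residual(y))
    if all(c == 0 for c in change):
        break
    y = [a + b for a, b in zip(pad(y, N), change)]

assert y[:3] == [Fraction(1), Fraction(0), Fraction(-3)]
assert all(c == 0 for c in y[3:])
print(y[:3])
```

Substituting the symbol k back for 6 in the general series above reproduces the same first terms: with k = 6 the x⁴, x⁶ and x⁸ coefficients all vanish.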

Example 6.10: Find the Maclaurin series solution, to errors O(x¹⁰), to the nonlinear ode y″ + (1 + x)y′ − 6y² = 0 such that y(0) = 1 and y′(0) = −1.

Solution: Again immediately write down an initial approximation consistent with the initial conditions: namely y₀ = 1 − x. Then, given a known approximation, say yₙ(x), seek an improved approximation yₙ₊₁(x) = yₙ(x) + ŷₙ(x) where ŷₙ(x) is the as yet unknown change. Substitute into the differential equation and rearrange to deduce the following ode for the change:

−ŷₙ″ − (1 + x)ŷₙ′ + 6ŷₙ² + 12yₙŷₙ = Rₙ = yₙ″ + (1 + x)yₙ′ − 6yₙ² ,

where here, as always, Rₙ(x) is the known residual evaluated for the current approximation. Now simplify the left-hand side:
current approximation. Now simplify the left-hand side:


Module 6. Series solutions <strong>of</strong> differential equations give special functions 230<br />

• since x is “small” (in the power series expansion) 1 + x ≈ 1 and<br />

similarly y n ≈ 1 from the initial condition y(0) = 1 so the lefthand<br />

side first simplifies to<br />

−ŷ ′′<br />

n − ŷ′ n + 6ŷ2 n + 12ŷ n ;<br />

• but also the change ŷ n must be small (as each ŷ n is to make a small<br />

improvement in the solution) and so ŷn 2 must be much smaller still<br />

and should be neglected—for example we typically expect the first<br />

change ŷ 0 to be O(x 2 ) whence ŷ0 2 = O(x4 ) which is much smaller<br />

and negligible—hence the left-hand side simplifies further to<br />

\[ -\hat y_n'' - \hat y_n' + 12\hat y_n\,; \]<br />

• lastly, differentiation effectively decreases the order of any term so<br />

that the second derivative term dominates the others above and<br />

so the ode for the change becomes simply<br />

\[ -\hat y_n'' = R_n = y_n'' + (1+x)y_n' - 6y_n^2\,. \]<br />

For example, the first iteration starts by computing the residual<br />

R 0 = 0 + (1 + x)(−1) − 6(1 − x) 2 = −7 + 11x − 6x 2 .<br />

Then changing sign and integrating twice gives the first change<br />

\[ \hat y_0 = -\iint R_0\,dx\,dx = \tfrac{7}{2}x^2 - \tfrac{11}{6}x^3 + \tfrac{1}{2}x^4\,, \]<br />



after recalling that we need to satisfy homogeneous initial conditions<br />

ŷ ′ n(0) = ŷ n (0) = 0 for the changes in order to ensure the solution<br />

satisfies the specified initial conditions. Thus the new approximation<br />

is<br />

\[ y_1 = 1 - x + \tfrac{7}{2}x^2 - \tfrac{11}{6}x^3 + \tfrac{1}{2}x^4\,. \]<br />

Now investigate further with computer algebra. First create a file, say<br />

nod.red with<br />

on div; off allfac; on revpri;<br />

y:=1-x;<br />

let x^10=>0;<br />

repeat begin<br />

r:=df(y,x,x)+(1+x)*df(y,x)-6*y^2;<br />

r:=-int(int(r,x),x);<br />

y:=y+r;<br />

end until r=0;<br />

y:=y;<br />

end;<br />

Second, executing the commands using the in statement produces the<br />

output below<br />

2: in "nod.red";<br />

on div;<br />

off allfac;<br />



on revpri;<br />

y:=1-x;<br />

y := 1 - x<br />

let x^10=>0;<br />

repeat begin<br />

r:=df(y,x,x)+(1+x)*df(y,x)-6*y^2;<br />

r:=-int(int(r,x),x);<br />

y:=y+r;<br />

end until r=0;<br />

y:=y;<br />

y := 1 - x + (7/2)*x^2 - 3*x^3 + (25/6)*x^4 - (257/60)*x^5 + (219/40)*x^6<br />

     - (1433/252)*x^7 + (6355/1008)*x^8 - (199277/30240)*x^9<br />

end;<br />

Thus conclude that the Maclaurin series solution is<br />

\[ y = 1 - x + \tfrac{7}{2}x^2 - 3x^3 + \tfrac{25}{6}x^4 - \tfrac{257}{60}x^5 + \tfrac{219}{40}x^6 - \tfrac{1433}{252}x^7 + \tfrac{6355}{1008}x^8 - \tfrac{199277}{30240}x^9 + O(x^{10})\,. \]<br />

The following are the principles seen in this iterative approach to finding<br />

power series solutions to linear and nonlinear ode’s.<br />

• Make an initial approximation consistent with the initial conditions of<br />

the ode.<br />

• Seek as simple an ode as possible for successive corrections by substituting y n+1 =<br />

y n + ŷ n into the differential equation, grouping all the known terms<br />

into the residual R n , and then neglecting all but the dominant terms<br />

involving the change ŷ n :<br />

– neglect all nonlinear terms in the small change ŷ n ;<br />

– approximate all coefficient factors by the lowest order term in x;<br />

– and, counting each derivative with respect to x as equivalent to a<br />

division by x, keep only those terms <strong>of</strong> lowest order in x.<br />

This process is close kin to the linearisation that we employed in Module<br />

1 and will employ in later modules.<br />

• Iteratively make changes as guided by the residuals until the changes<br />

are zero to some order of error in x. This is handily done by computer<br />

algebra.
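These principles are not tied to reduce. As an illustration only (not part of the course software, and with helper names of our own invention), the following Python sketch mirrors the iteration of Example 6.10: it stores a Maclaurin series as a list of exact fraction coefficients, discards terms of order x 10 and beyond, and corrects by the residual until the change vanishes.<br />

```python
from fractions import Fraction as F

N = 10  # work to errors O(x^N), mirroring the statement  let x^10=>0;

def trunc(p):
    """Discard coefficients of x^N and higher."""
    return p[:N]

def add(p, q):
    r = [F(0)] * max(len(p), len(q))
    for i, c in enumerate(p):
        r[i] += c
    for i, c in enumerate(q):
        r[i] += c
    return trunc(r)

def mul(p, q):
    r = [F(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return trunc(r)

def diff(p):
    """d/dx of a coefficient list."""
    return [F(i) * c for i, c in enumerate(p)][1:] or [F(0)]

def integ(p):
    """Integral with zero constant of integration."""
    return [F(0)] + [c / (i + 1) for i, c in enumerate(p)]

# initial approximation y0 = 1 - x, consistent with y(0) = 1, y'(0) = -1
y = [F(1), F(-1)]
for _ in range(20):  # a bounded loop guards against accidental infinite iteration
    # residual R = y'' + (1+x)y' - 6y^2 of the current approximation
    r = add(add(diff(diff(y)), mul([F(1), F(1)], diff(y))),
            [-6 * c for c in mul(y, y)])
    change = trunc([-c for c in integ(integ(r))])  # solve -yhat'' = R
    y = add(y, change)
    if all(c == 0 for c in change):
        break

print(y)  # coefficients start 1, -1, 7/2, -3, 25/6, matching the series above
```

The arithmetic is identical to that of nod.red, so the loop reproduces the coefficients through the x 9 term of the series stated above.<br />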



Warning: when testing computer algebra code, do not use the repeat-until<br />

loop; while testing use a for-do loop to ensure that you do not get<br />

stuck in an infinite loop. Only when you are sure that your code works<br />

do you replace the for-do loop with a repeat-until loop.<br />

Applying these principles becomes more involved when we develop<br />

power series about a singular point of an ode. We investigate a<br />

couple of examples.<br />

Example 6.11: Bessel function of order 0. Find the power series solution<br />

of x 2 y ′′ + xy ′ + x 2 y = 0 that is well-behaved at x = 0 to an error<br />

O(x 10 )—namely find the low-orders of a power series proportional to<br />

the Bessel function J 0 (x).<br />

Solution: First find and solve the indicial equation by substituting<br />

y = x r + O(x r+1 ). Here the ode becomes<br />

\[ x^2y'' + xy' + x^2y = r(r-1)x^r + rx^r + x^{r+2} + O(x^{r+1}) = r^2x^r + O(x^{r+1})\,, \]<br />

as the x r+2 term is absorbed into the error term O(x r+1 ).<br />

The only way this can be zero for all small x is if r 2 = 0. This leads, as<br />

discussed in Kreyszig [K,§4.4], to the homogeneous solutions of the ode<br />

being approximately y ≈ a+b log x. The logarithm is not well-behaved<br />

as x → 0 hence we set b = 0 and just seek solutions that tend to a<br />

constant as x → 0. Without loss of generality, because we can multiply<br />

by a constant later, we choose to find solutions such that y(0) = 1.



Second we make an initial approximation to the solution. After the<br />

above discussion of the indicial equation, choose y 0 = 1.<br />

Third, given a known approximation y n (x), seek an improved approximation<br />

y n+1 (x) = y n (x) + ŷ n (x) where ŷ n (x) is some small change.<br />

Substitute this into the ode, neglect x 2 ŷ n because it is two orders of x<br />

smaller than either x 2 ŷ n ′′ or xŷ′ n , and deduce that ŷ n must satisfy<br />

\[ -x^2\hat y_n'' - x\hat y_n' = R_n = x^2y_n'' + xy_n' + x^2y_n\,. \tag{6.6} \]<br />

Solving this for the correction ŷ n is no longer simply a matter of integrating<br />

twice.<br />

However, rearranging the form of the ode (6.6) we again express the<br />

solution in terms of two integrations. All we need to do is to notice<br />

that the left-hand side is identical to −x(xŷ ′ n ) ′ whence<br />

\[ -x(x\hat y_n')' = R_n \iff x\hat y_n' = -\int \frac{R_n}{x}\,dx \iff \hat y_n = -\int \frac{1}{x}\int \frac{R_n}{x}\,dx\,dx\,. \]<br />

Apply this iteration here.<br />

(a) In the first iteration y 0 = 1 so the residual R 0 = x 2 . Thus<br />

\[ \hat y_0 = -\int \frac{1}{x}\int \frac{x^2}{x}\,dx\,dx \]<br />



\[ = -\int \frac{1}{x}\left(\tfrac{1}{2}x^2 + b\right)dx = -\tfrac{1}{4}x^2 - b\log x + a \]<br />

for integration constants a and b.<br />

Note the freedom to include a − b log x into ŷ 0 , but we cannot<br />

tolerate any component in log x, as it behaves badly at x = 0, so<br />

b = 0, and a has to be chosen zero in order to ensure y n (0) = 1.<br />

(This argument applies at all iterations.) Hence y 1 = 1 − x 2 /4.<br />

(b) In the second iteration R 1 = −x 4 /4. Thus, setting the integration<br />

constants to zero as before,<br />

\[ \hat y_1 = -\int \frac{1}{x}\int \frac{-x^4/4}{x}\,dx\,dx = -\int \frac{1}{x}\left(-\frac{x^4}{16}\right)dx = \frac{x^4}{64}\,. \]<br />

Hence y 2 = 1 − x 2 /4 + x 4 /64.<br />

For a computer algebra program, proceed as in earlier examples but<br />

modify the two integrations as in<br />

y:=1;<br />

let x^10=>0;



repeat begin<br />

r:=x^2*df(y,x,x)+x*df(y,x)+x^2*y;<br />

r:=-int(int(r/x,x)/x,x);<br />

write y:=y+r;<br />

end until r=0;<br />

Execute this code and see the solution is<br />

\[ y = J_0(x) = 1 - \tfrac{1}{4}x^2 + \tfrac{1}{64}x^4 - \tfrac{1}{2304}x^6 + \tfrac{1}{147456}x^8 + O(x^{10})\,. \]<br />
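As a cross-check without reduce (again an illustrative Python sketch with invented helper names, not the course code), the modified double integration (dividing by x inside each integral) can be carried out on exact fraction coefficients:<br />

```python
from fractions import Fraction as F

N = 10  # work to errors O(x^N)

def trunc(p):
    return p[:N]

def add(p, q):
    r = [F(0)] * max(len(p), len(q))
    for i, c in enumerate(p):
        r[i] += c
    for i, c in enumerate(q):
        r[i] += c
    return trunc(r)

def diff(p):
    return [F(i) * c for i, c in enumerate(p)][1:] or [F(0)]

def integ(p):
    return [F(0)] + [c / (i + 1) for i, c in enumerate(p)]

def xmul(p, k):
    """Multiply a series by x^k."""
    return trunc([F(0)] * k + list(p))

def xdiv(p):
    """Divide by x; valid here because the constant term vanishes."""
    assert p[0] == 0
    return p[1:]

y = [F(1)]  # initial approximation y0 = 1, from the indicial discussion
for _ in range(20):  # bounded loop while experimenting
    # residual R = x^2 y'' + x y' + x^2 y
    r = add(add(xmul(diff(diff(y)), 2), xmul(diff(y), 1)), xmul(y, 2))
    # change = -int( (1/x) int( R/x dx ) dx ), integration constants zero
    change = trunc([-c for c in integ(xdiv(integ(xdiv(r))))])
    y = add(y, change)
    if all(c == 0 for c in change):
        break

# even coefficients: 1, -1/4, 1/64, -1/2304, 1/147456 (the J0 series above)
print(y)
```

The changes arrive at orders x 2 , x 4 , x 6 , x 8 , exactly as in the two hand iterations above, before the residual vanishes to the working order.<br />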

Example 6.12: Bessel functions <strong>of</strong> order 0. Find the power series expansion<br />

about x = 0, to errors O(x 10), of the general solution to Bessel’s<br />

equation with ν = 0, namely x 2 y ′′ + xy ′ + x 2 y = 0.<br />

Solution: The indicial equation shows that in general the dominant<br />

component in the solution is a + b log x for any a and b. (See that these<br />

were also naturally obtained in the integration constants of the previous<br />

example.) Use this as the first approximation y 0 and see what ensues.<br />

The derivation of the equation for the iterative changes, Eqn (6.6),<br />

remains the same.<br />

Including the command factor b,a,log; to improve the appearance<br />

of the printing and setting the initial approximation to y 0 = a + b log x,<br />

the reduce code is as before, namely



factor b,a,log;<br />

y:=a+b*log(x);<br />

let x^10=>0;<br />

repeat begin<br />

r:=x^2*df(y,x,x)+x*df(y,x)+x^2*y;<br />

r:=-int(int(r/x,x)/x,x);<br />

write y:=y+r;<br />

end until r=0;<br />

Run this code to see the result<br />

y := a*(1 - (1/4)*x^2 + (1/64)*x^4 - (1/2304)*x^6 + (1/147456)*x^8)<br />

     + b*((1/4)*x^2 - (3/128)*x^4 + (11/13824)*x^6 - (25/1769472)*x^8)<br />

     + log(x)*b*(1 - (1/4)*x^2 + (1/64)*x^4 - (1/2304)*x^6 + (1/147456)*x^8)<br />

That is, as Kreyszig assures us for double roots [K,p213], the general<br />

solution is of the form y = ay 1 (x) + by 2 (x) where here<br />

\[ y_1 = 1 - \tfrac{1}{4}x^2 + \tfrac{1}{64}x^4 - \tfrac{1}{2304}x^6 + \tfrac{1}{147456}x^8 + O(x^{10})\,, \]<br />

\[ y_2 = y_1(x)\log x + \tfrac{1}{4}x^2 - \tfrac{3}{128}x^4 + \tfrac{11}{13824}x^6 - \tfrac{25}{1769472}x^8 + O(x^{10})\,. \]<br />



This framework of using residuals to improve approximate solutions, getting<br />

computers to do the tedious algebra, can be adapted to a wide variety of<br />

problems. The iteration will improve an approximation provided the changes<br />

deduced from the residuals are appropriate, that is, provided a simple and sensible<br />

approximation to the equation for the changes has been derived. But the ultimate<br />

result depends only upon being able to evaluate the residuals correctly<br />

and being able to drive them to zero to some level of accuracy.<br />

Activity 6.L Do problems 6.18–6.23 in the Exercises set 6.3.4, p239. Send<br />

in to the examiner for feedback at least 6.18 & 6.20.<br />

6.3.4 Exercises<br />

Ex. 6.13: Modify the iteration of Example 6.6 to find the Maclaurin series<br />

solution to the ode y ′′ − 2y = 0 such that y(0) = 1 and y ′ (0) = 0 using<br />

reduce, to errors O(x 10 ).<br />

Ex. 6.14: Similarly use reduce to find the Maclaurin series solution to<br />

errors O(x 15 ) to the ode y ′′ + xy = 0 such that y(0) = a and y ′ (0) = b<br />

(remember to factor b,a;). The Maclaurin series multiplied by a and<br />

b are those of two linearly independent solutions to Airy’s equation<br />

mentioned in Kreyszig [K,p198,p958–60].



Ex. 6.15: Use reduce to find the Maclaurin series of the solution to y ′ =<br />

cos(x)y such that y(0) = 1 to errors O(x 10 ). Hint: replace cos x<br />

in the code by its Maclaurin series; you may use that factorial(n)<br />

in reduce computes n!. Compare your answer to that <strong>of</strong> the exact<br />

analytic solution obtained by recognising the ode is separable.<br />

Ex. 6.16: Modify the analysis <strong>of</strong> Example 6.8 to use reduce to find the<br />

Maclaurin series solution to errors O(x 10 ) of the nonlinear ode y ′′ = 6y 2<br />

such that y(0) = 1 and y ′ (0) = b where b is some arbitrary constant.<br />

Note: because this is a nonlinear ode the solution depends nonlinearly<br />

upon b, in contrast to linear ode’s which would show a linear<br />

dependence only.<br />

Ex. 6.17: Use reduce to find the Maclaurin series solution of the nonlinear<br />

ode y ′′ = (1+x)y 3 to errors O(x 10 ) such that y(0) = 2 and y ′ (0) = −3.<br />

Ex. 6.18: Modify the reduce computer algebra of Example 6.9 to find the<br />

Maclaurin series of the general solution to Legendre’s equation in the<br />

specific case k = 3 to an error O(x 10 ).<br />

Ex. 6.19: Modify the arguments and the reduce computer algebra of Example<br />

6.9 to find the Maclaurin series, to an error O(x 10 ), of the general<br />

solution to the following three odes:<br />

(a) (x − 2)y ′ = xy ;<br />

(b) (1 − x 2 )y ′ = 2xy ;



(c) y ′′ − 4xy ′ + (4x 2 − 2)y = 0 .<br />

Ex. 6.20: Modify the computer algebra code for Example 6.11 to find the<br />

Maclaurin series, to errors O(x 10 ), of the well-behaved solution of the<br />

nonlinear ode x 2 y ′′ + x 2 y ′ + xy 3 = 0 such that y(0) = 2.<br />

Ex. 6.21: Use reduce to help you find the power series about x = 0, to<br />

errors O(x 10 ), of the well-behaved solutions of the ode x 2 y ′′ + x 3 y ′ +<br />

(x 2 − 2)y = 0. Hint: x 2 y ′′ − 2y = (x 4 (y/x 2 ) ′ ) ′ . Then modify your<br />

reduce code to find the power series of the one parameter family of<br />

well-behaved solutions to the nonlinear ode x 2 y ′′ +x 3 y ′ +(x 2 −2)y+y 2 =<br />

0.<br />

Ex. 6.22: Use reduce to help find the power series about x = 0, to errors<br />

O(x 20 ), of the well-behaved solutions of the ode xy ′′ + 3y ′ + 3x 2 y = 0.<br />

Hint: xy ′′ + 3y ′ = (x 3 y ′ ) ′ /x 2 .<br />

Ex. 6.23: Find the power series expansions about x = 0, to errors O(x 10 ),<br />

for the two parameter general solution to the linear ode x 2 y ′′ −sin(x)y ′ +<br />

y = 0, with the aid of computer algebra. Hint: expand sin x in a<br />

Maclaurin series and write x 2 y ′′ − xy ′ in the form x 2−p (x p y ′ ) ′ .<br />

Ex. 6.24: Following is some reduce code to iteratively find a power series<br />

solution to an ode: what is the differential equation it purports to<br />

solve? and its initial conditions? what is the value of y after the first<br />

iteration of the repeat loop? what is the order of error in the computed<br />

power series after the loop terminates?



on div; <strong>of</strong>f allfac; on revpri;<br />

y:=2*x;<br />

let x^20=>0;<br />

repeat begin<br />

r:=(1-x^3)*df(y,x,x)-(y^2-x^2)*df(y,x);<br />

r:=-int(int(r,x),x);<br />

write y:=y+r;<br />

end until r=0;<br />

6.3.5 Summary <strong>of</strong> some reduce commands<br />

“the different branches <strong>of</strong> Arithmetic—Ambition, Distraction, Uglification<br />

and Derision.” the Mock Turtle in Alice in Wonderland<br />

by Lewis Carroll<br />

• reduce instructions must be terminated and separated by a semicolon.<br />

• quit; or bye; terminates reduce execution.<br />

• Use on div;, off allfac; and on revpri; to improve the printing<br />

of power series.<br />

• := is the assignment operator.



• The normal arithmetic operators are: +, -, *, / and ^ for addition,<br />

subtraction, multiplication, division and exponentiation respectively.<br />

• write will display the result of an expression, although reduce automatically<br />

displays the results of each command that is not in a loop.<br />

• int(y,x) will provide an integral of the expression in y with respect<br />

to the variable x, provided reduce can actually do the integral.<br />

• df(y,x) returns the derivative of the expression in y with respect to<br />

the variable x; df(y,x,z) will return the second derivative of y with<br />

respect to x and z.<br />

• factorial(n) returns the value of n!.<br />

• for n:=2:5 do, for example, will repeat whatever statement follows<br />

for values of the variable used, here n, over the range specified in the<br />

command, here from 2 to 5.<br />

• The let statement does pattern matching and replacement; for example<br />

let x^15=>0; tells reduce to subsequently discard any term<br />

involving x to the power fifteen or more.<br />

• repeat...until... will repeatedly execute a statement until the given<br />

condition is true.<br />

• begin...end is used to group statements into one; end; is also used<br />

to terminate reading in a file of reduce commands.<br />



• in "..."; tells reduce to execute the commands contained in the<br />

specified file.



6.4 The orthogonal solutions to second order differential<br />

equations<br />

Power series give us very powerful methods of deriving solutions to specific<br />

differential equations. But in order to guide us we need to know more about<br />

the structure of solutions to ode’s. Sturm-Liouville theory tells us how different<br />

solutions of an ode relate to each other (they are orthogonal), and<br />

something about their nature. This then allows us to usefully write functions<br />

in terms of families of solutions to an ode.<br />

In this section we identify patterns that occur across a wide range of ode’s.<br />

This is mathematics at a higher level—it brings together into the framework<br />

of Sturm-Liouville theory a variety of ode’s and their solutions. The<br />

task here is not the solution of actual problems, but the appreciation of the<br />

synthesis of wide-ranging phenomena in the solutions of ode’s.<br />

Main aims:<br />

• see that Legendre and Bessel equations are examples of Sturm-Liouville<br />

equations;<br />

• show that important properties such as reality of eigenvalues and orthogonality<br />

of eigenfunctions can be deduced from the differential equation.<br />



The simplest examples of functions displaying the properties that we investigate<br />

are the trigonometric functions and their harmonics, sin nx and cos nx<br />

for integer n. The properties are derived from their differential equation<br />

y ′′ + n 2 y = 0.<br />

The family of ode’s we consider are those in the form of the Sturm-Liouville<br />

equation<br />

[r(x)y ′ ] ′ + [q(x) + λp(x)]y = 0 , (6.7)<br />

where p, q and r are given functions and λ is a constant which is often a<br />

parameter to the problem. Many second-order ode’s are put into this form.<br />
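For example, Legendre’s equation fits this template after a rewrite into self-adjoint form (the same form quoted in the Summary, §6.5), and the harmonic equation y ′′ + n 2 y = 0 above is already of this form:<br />

```latex
% Legendre:  (1 - x^2)y'' - 2xy' + n(n+1)y = 0  rearranges to
\[ \bigl[(1-x^2)\,y'\bigr]' + n(n+1)\,y = 0\,, \]
% which is (6.7) with r(x) = 1 - x^2, q(x) = 0, p(x) = 1 and
% eigenvalue parameter lambda = n(n+1).
%
% Harmonics:  y'' + n^2 y = 0  is (6.7) with r(x) = 1, q(x) = 0,
% p(x) = 1 and lambda = n^2.
```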

Reading 6.M Study Kreyszig §4.7 [K,pp233–8], including the proof of Reality<br />

of eigenvalues in Appendix 4 [K,pA70].<br />

Recall that orthogonality is just a grand word for being at right angles. These<br />

properties of the orthogonality of eigenfunctions and the reality of eigenvalues<br />

λ are reminiscent of the similar properties for eigenvectors and eigenvalues<br />

of symmetric matrices. This is no accident and the connection is explored<br />

further in Module 7.<br />

Orthogonality implies oscillations: consider how a family of functions y n (x)<br />

can all be orthogonal to each other. First y 0 (x) can be fairly boring, such as<br />

the constant P 0 (x) or cos(0·x). Secondly, y 1 (x) has to change sign somewhere,<br />



as seen in P 1 (x) or cos(x), so that the integral ∫ b a y 0 (x)y 1 (x) dx can be zero by<br />

orthogonality. Thirdly, y 2 (x) has to be orthogonal to both y 0 (x) and y 1 (x)<br />

so it must oscillate a couple of times, as seen in P 2 (x). And so on—as we<br />

consider further y n (x) we find that successive y n (x) must have more and more<br />

oscillations in order to maintain orthogonality. This is seen for example in the<br />

families P n (x) and cos(nx). It holds very widely: solutions of Sturm-Liouville<br />

problems have more oscillations the higher the value of the corresponding<br />

eigenvalue. (This can be proved, but we will not do so here.)<br />
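This orthogonality is quickly checked computationally for the first few Legendre polynomials (an illustrative Python sketch of ours, using the standard polynomials P 0 = 1, P 1 = x, P 2 = (3x 2 − 1)/2 and exact arithmetic):<br />

```python
from fractions import Fraction as F

# The first three Legendre polynomials as coefficient lists [a0, a1, ...].
P0 = [F(1)]
P1 = [F(0), F(1)]
P2 = [F(-1, 2), F(0), F(3, 2)]        # P2(x) = (3x^2 - 1)/2

def integrate_product(p, q):
    """Exact integral of p(x)*q(x) over [-1, 1]."""
    prod = [F(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            prod[i + j] += a * b
    # x^m integrates over [-1,1] to 2/(m+1) for even m, and to 0 for odd m
    return sum((c * 2) / (m + 1) for m, c in enumerate(prod) if m % 2 == 0)

print(integrate_product(P0, P1),
      integrate_product(P0, P2),
      integrate_product(P1, P2))      # each cross-integral vanishes
print(integrate_product(P2, P2))      # but P2 is not orthogonal to itself
```

The three cross-integrals vanish, as the argument above requires, while ∫ P 2 2 dx is the non-zero value 2/5.<br />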

Activity 6.N Do problems from Problem Set 4.7 [K,pp238–9]. Send in to<br />

the examiner for feedback at least Q4, 7 & 15.<br />

6.4.1 Answers to selected Exercises<br />

6.5 (a) expect power series solution because the coefficient functions are all<br />

well behaved at x = 0 and the leading coefficient of y ′′ is not zero.<br />

(b) \( y = 1 + \tfrac{3}{2}x^4 + O(x^4) \)<br />

6.13 \( y = 1 + x^2 + \tfrac{1}{6}x^4 + \tfrac{1}{90}x^6 + \tfrac{1}{2520}x^8 + O(x^{10}) \)<br />

6.14 \( y = b\left(x - \tfrac{1}{12}x^4 + \tfrac{1}{504}x^7 - \tfrac{1}{45360}x^{10} + \tfrac{1}{7076160}x^{13}\right) + a\left(1 - \tfrac{1}{6}x^3 + \tfrac{1}{180}x^6 - \tfrac{1}{12960}x^9 + \tfrac{1}{1710720}x^{12}\right) + O(x^{15}) \)<br />



6.15 \( y = 1 + x + \tfrac{1}{2}x^2 - \tfrac{1}{8}x^4 - \tfrac{1}{15}x^5 - \tfrac{1}{240}x^6 + \tfrac{1}{90}x^7 + \tfrac{31}{5760}x^8 + \tfrac{1}{5670}x^9 + O(x^{10}) \)<br />

6.16 \( y = 1 + 3x^2 + 3x^4 + 3x^6 + \tfrac{18}{7}x^8 + b\left(x + 2x^3 + 3x^5 + \tfrac{24}{7}x^7 + \tfrac{25}{7}x^9\right) + b^2\left(\tfrac{1}{2}x^4 + x^6 + \tfrac{45}{28}x^8\right) + b^3\left(\tfrac{1}{7}x^7 + \tfrac{5}{14}x^9\right) + O(x^{10}) \)<br />

6.17 \( y = 2 - 3x + 4x^2 - \tfrac{14}{3}x^3 + \tfrac{11}{2}x^4 - \tfrac{25}{4}x^5 + \tfrac{211}{30}x^6 - \tfrac{47}{6}x^7 + \tfrac{2081}{240}x^8 - \tfrac{41243}{4320}x^9 + O(x^{10}) \)<br />

6.18 \( y = a\left(1 - \tfrac{3}{2}x^2 - \tfrac{3}{8}x^4 - \tfrac{17}{80}x^6 - \tfrac{663}{4480}x^8\right) + b\left(x - \tfrac{1}{6}x^3 - \tfrac{3}{40}x^5 - \tfrac{27}{560}x^7 - \tfrac{159}{4480}x^9\right) + O(x^{10}) \)<br />

6.20 \( y = 2 - 4x + 4x^2 - \tfrac{32}{9}x^3 + \tfrac{26}{9}x^4 - \tfrac{56}{25}x^5 + \tfrac{3404}{2025}x^6 - \tfrac{832}{675}x^7 + \tfrac{1199}{1350}x^8 - \tfrac{4142}{6561}x^9 + O(x^{10}) \)<br />

6.21 Well behaved solutions are proportional to \( y = x^2 - \tfrac{3}{10}x^4 + \tfrac{3}{56}x^6 - \tfrac{1}{144}x^8 + O(x^{10}) \). The nonlinear solutions parametrised by a, the coefficient of<br />

the quadratic term, are \( y = a\left(x^2 - \tfrac{3}{10}x^4 + \tfrac{3}{56}x^6 - \tfrac{1}{144}x^8\right) + a^2\left(-\tfrac{1}{10}x^4 + \tfrac{11}{280}x^6 - \tfrac{661}{75600}x^8\right) + a^3\left(\tfrac{1}{140}x^6 - \tfrac{11}{3150}x^8\right) - \tfrac{17}{37800}a^4x^8 + O(x^{10}) \).<br />

6.22 Well behaved solutions are proportional to \( y = 1 - \tfrac{1}{5}x^3 + \tfrac{1}{80}x^6 - \tfrac{1}{2640}x^9 + \tfrac{1}{147840}x^{12} - \tfrac{1}{12566400}x^{15} + \tfrac{1}{1507968000}x^{18} + O(x^{20}) \)<br />

6.23 \( y = (a + b\log x)\left(x - \tfrac{1}{24}x^3 + \tfrac{7}{3840}x^5 - \tfrac{89}{1161216}x^7 + \tfrac{6721}{2229534720}x^9\right) + b\left(\tfrac{1}{23040}x^5 + \tfrac{11}{11612160}x^7 - \tfrac{5951}{44590694400}x^9\right) + O(x^{10}) \).<br />

6.24 \( (1 - x^3)y'' - (y^2 - x^2)y' = 0 \), such that y(0) = 0 and y ′ (0) = 2.<br />

\( y^{(1)} = 2x + \tfrac{1}{2}x^4 \). The ultimate error is \( O(x^{20}) \).<br />



6.5 Summary<br />

• Power series give a powerful general method for solving linear and<br />

nonlinear ordinary differential equations (ode’s). At a regular point<br />

(§§6.2.1) solutions of an ode are developed in the form of Taylor or<br />

Maclaurin series (§§6.1.1):<br />

\[ y(x) = \sum_{m=0}^{\infty} a_m (x-c)^m = a_0 + a_1(x-c) + a_2(x-c)^2 + \cdots\,. \]<br />

Because of the uniqueness of a power series representation, the constants<br />

a m are determined by equating coefficients of like powers of x − c<br />

(§§6.1.1).<br />

• Legendre polynomials, P n (x), are an example of special functions:<br />

– are the only non-singular solutions of Legendre’s equation which<br />

is, in Sturm-Liouville form, [(1 − x 2 )y ′ ] ′ + n(n + 1)y = 0 (§§6.1.2);<br />

– and are orthogonal over the interval [−1, 1] with weight function<br />

p(x) = 1 (§§6.4).<br />

• At a singular point (but not “too” singular §§6.2.1) Frobenius asserts<br />

solutions may be developed in the modified power series:<br />

\[ y(x) = (x-c)^r \sum_{m=0}^{\infty} a_m (x-c)^m = a_0(x-c)^r + a_1(x-c)^{r+1} + a_2(x-c)^{r+2} + \cdots\,. \]<br />



The exponent r is determined from the indicial equation obtained from<br />

the term of lowest order after substituting into the ode.<br />

• In applying the Frobenius method (§§6.2.1) to second-order ode’s there are<br />

generally two roots r 1 ≥ r 2 to the indicial equation and consequently<br />

three cases are distinguished (taking c = 0 for simplicity):<br />

– distinct roots not differing by an integer are straightforward—a<br />

basis for the solutions is<br />

\[ y_1(x) = x^{r_1}\left(a_0 + a_1x + a_2x^2 + \cdots\right), \qquad y_2(x) = x^{r_2}\left(b_0 + b_1x + b_2x^2 + \cdots\right); \]<br />

– a double root, r 1 = r 2 , when a basis is<br />

\[ y_1(x) = x^{r_1}\left(a_0 + a_1x + a_2x^2 + \cdots\right), \qquad y_2(x) = y_1(x)\log x + x^{r_1}\left(b_1x + b_2x^2 + \cdots\right); \]<br />

– roots differing by an integer when a basis is<br />

\[ y_1(x) = x^{r_1}\left(a_0 + a_1x + a_2x^2 + \cdots\right), \qquad y_2(x) = k\,y_1(x)\log x + x^{r_2}\left(b_0 + b_1x + b_2x^2 + \cdots\right); \]<br />

• Bessel functions, J ν (x) and Y ν (x), are special functions and<br />

– are solutions of Bessel’s equation (§§6.2.2) x 2 y ′′ + xy ′ + (x 2 − ν 2 )y =<br />

0 or, in Sturm-Liouville form, \( [xy']' + \left(x - \tfrac{\nu^2}{x}\right)y = 0 \);<br />



– are orthogonal over intervals with x > 0 in several senses (§§6.4)<br />

• The iterative construction of power series solutions is an ideal application<br />

of computer algebra (§§6.3.2) for linear and nonlinear problems<br />

provided we discard unwanted high-order terms.<br />

• A good iterative method is (§§6.3.3): given an approximate solution<br />

y(x), to seek small changes ŷ(x) so that y(x) + ŷ(x) is a better approximation.<br />

Such changes are determined from the residual of the<br />

governing equations.<br />

• Many ode’s of importance may be written in the form of the Sturm-<br />

Liouville equation (6.7), [r(x)y ′ ] ′ + [q(x) + λp(x)]y = 0:<br />

– non-zero solutions only exist for particular values of λ = λ n , called<br />

the eigenvalues, which are necessarily real;<br />

– the corresponding eigenfunctions are all orthogonal with weight<br />

function p(x) (§§6.4).<br />

Activity 6.O Do problems from Chapter 4 Review [K,pp247–8].


Module 7<br />

Linear transforms and their<br />

eigenvectors on inner product<br />

spaces<br />

Recall the work on differential equations and their orthogonal solutions that<br />

we finished in Module 6. Many of the properties we touched upon there are<br />

very similar to some that you have met in linear algebra before. The time<br />

has come to bring these two strands together.<br />

But solutions of differential equations involve the infinite flexibility of functions.<br />

We will see that functions act very much like vectors. But on any


Module 7. Linear transforms and their eigenvectors on inner product spaces 253<br />

finite interval there are not just an infinite number of functions, there is an<br />

infinite variety of functions. For example, in §7.4.3 we use the infinite number<br />

of solutions to Sturm-Liouville problems to describe any other solution function.<br />

But “infinity” is a slippery concept, so we now are very careful about<br />

how to establish the mathematical basis. First we create a basic structure<br />

for space, then the properties of mappings between spaces, and lastly the<br />

representation of these mappings by simple matrices of coefficients.<br />

Ultimately the development of a common setting allows us to draw simple<br />

vector pictures even when discussing concepts in extremely complicated situations<br />

such as the space of all continuous functions.<br />

Sturm-Liouville theory introduced in §6.4 is very close to properties of eigenvalues<br />

and eigenvectors of matrices. In this module we bring both within<br />

a unified view using the abstract theory of inner product spaces. We then<br />

extend the combined view a little further.<br />

Module contents<br />

7.1 Inner product spaces . . . . . . . . . . . . . . . . . . . 255<br />

7.1.1 Vector spaces form the universe . . . . . . . . . . . . . . 255<br />

7.1.2 Inner products give distances and angles . . . . . . . . . 262<br />

7.1.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 266<br />

7.2 The nature <strong>of</strong> linear transformations . . . . . . . . . . 269<br />

7.2.1 The universe <strong>of</strong> linear transformations . . . . . . . . . . 269



7.2.2 Adjoint operators . . . . . . . . . . . . . . . . . . . . . . 273<br />

7.2.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 282<br />

7.3 Revision <strong>of</strong> eigenvalues and eigenvectors . . . . . . . 284<br />

7.3.1 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 287<br />

7.4 Diagonalisation transformation . . . . . . . . . . . . . 289<br />

7.4.1 Adjoint eigenvectors diagonalise operators . . . . . . . . 290<br />

7.4.2 Orthogonal eigenvectors <strong>of</strong> self-adjoint operators . . . . 300<br />

7.4.3 Expansions in orthogonal eigenfunctions . . . . . . . . . 303<br />

7.4.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 308<br />

7.4.5 Answers to selected Exercises . . . . . . . . . . . . . . . 310<br />

7.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . 312



7.1 Inner product spaces

Here we establish the basic abstract structure of the spaces in which the analysis of linear algebra and of differential and integral equations takes place. The abstract concepts are supported by examples met in your earlier mathematics. The approach is to build up the structures and properties that are needed from an axiomatic base.

Main aims:

• to develop vector spaces and their properties from their basic axioms, and to understand how functions and IR^n are unified in this framework;
• to see how the definition of inner products leads to a unified view of the useful notions of length, angles and orthogonality;
• to show how familiar relations and inequalities generalise to many situations.

7.1.1 Vector spaces form the universe

The first step is to define the fundamental axioms of vector spaces.¹

¹ An entertaining and accurate introduction to vector spaces is available at http://ciips.ee.uwa.edu.au/~gregg/Linalg/node86.html



Reading 7.A Study the first three pages of §6.8 in Kreyszig [K, pp358–60]. Note the basic properties of vector addition and scalar multiplication on a vector space: closure, commutativity, associativity, distributivity, and the existence of the zero vector 0, the negative of a vector, and the multiplicative identity 1. As for ordinary vectors, there exist the concepts of linear combination, linear independence, dimensionality (both finite and infinite), and a basis.

In our study we will stay within the realm of real vector spaces.

Example 7.1: quadratic polynomials. Show that the set V of all quadratic polynomials (including those with zero coefficients) forms a vector space under the usual operations of addition and scalar multiplication, write down a basis for the vector space, and deduce that it is of dimension 3.

Solution: Denote by, for example, a the quadratic polynomial a_0 + a_1 x + a_2 x^2.

• Then "vector" (polynomial) addition a + b = (a_0 + b_0) + (a_1 + b_1)x + (a_2 + b_2)x^2 clearly gives another quadratic polynomial in V, and so V is closed under addition.

• By definition and commutativity of ordinary addition:
  a + b = (a_0 + b_0) + (a_1 + b_1)x + (a_2 + b_2)x^2
        = (b_0 + a_0) + (b_1 + a_1)x + (b_2 + a_2)x^2
        = b + a .

• Similarly for associativity:
  (u + v) + w = [(u_0 + v_0) + w_0] + [(u_1 + v_1) + w_1]x + [(u_2 + v_2) + w_2]x^2
              = [u_0 + (v_0 + w_0)] + [u_1 + (v_1 + w_1)]x + [u_2 + (v_2 + w_2)]x^2
              = u + (v + w) .

• Clearly the zero vector, 0, is the zero polynomial 0 + 0x + 0x^2, as a + 0 = (a_0 + 0) + (a_1 + 0)x + (a_2 + 0)x^2 = a.

• Now scalar multiplication, defined as ca = (ca_0) + (ca_1)x + (ca_2)x^2, clearly gives another quadratic and so V is closed under scalar multiplication.

• By definition and distributivity of ordinary multiplication:
  c(a + b) = c(a_0 + b_0) + c(a_1 + b_1)x + c(a_2 + b_2)x^2
           = (ca_0 + cb_0) + (ca_1 + cb_1)x + (ca_2 + cb_2)x^2
           = (ca_0) + (ca_1)x + (ca_2)x^2 + (cb_0) + (cb_1)x + (cb_2)x^2
           = (ca) + (cb) .

• Similarly for
  (c + d)a = (c + d)a_0 + (c + d)a_1 x + (c + d)a_2 x^2
           = (ca_0 + da_0) + (ca_1 + da_1)x + (ca_2 + da_2)x^2
           = (ca_0) + (ca_1)x + (ca_2)x^2 + (da_0) + (da_1)x + (da_2)x^2
           = (ca) + (da) .

• Again, by definition and associativity of ordinary multiplication,
  c(da) = c[(da_0) + (da_1)x + (da_2)x^2]
        = c[d(a_0 + a_1 x + a_2 x^2)]
        = (cd)a .

• Lastly, the number 1 clearly serves as the identity for scalar multiplication.

Thus this system forms a vector space. A basis for the vector space could be simply the powers of x in {1, x, x^2}, which in fact we used to show the vector space properties. Note that 1, x and x^2 are linearly independent quadratics because one cannot find a non-trivial linear combination of them that is the zero quadratic, that is, zero for all x. Another basis for the vector space could be the first three Legendre polynomials: P_0(x) = 1, P_1(x) = x and P_2(x) = -1/2 + (3/2)x^2. Since the number of basis vectors is necessarily three, then so is the dimensionality.
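Though not part of the printed notes, the coefficient view of Example 7.1 is easy to check numerically: a quadratic a_0 + a_1 x + a_2 x^2 corresponds to the coefficient triple (a_0, a_1, a_2), and the vector space operations become componentwise arithmetic. In the following Python sketch the representation and helper names are our own; it also expresses x^2 in the Legendre basis {P_0, P_1, P_2}.

```python
# Represent the quadratic a0 + a1*x + a2*x^2 by its coefficient triple.
def add(a, b):
    return tuple(ai + bi for ai, bi in zip(a, b))

def scale(c, a):
    return tuple(c * ai for ai in a)

a, b = (1.0, 2.0, 3.0), (4.0, -1.0, 0.5)
commutes = add(a, b) == add(b, a)                 # a + b = b + a
distributes = scale(2.0, add(a, b)) == add(scale(2.0, a), scale(2.0, b))

# First three Legendre polynomials as coefficient triples:
P0, P1, P2 = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (-0.5, 0.0, 1.5)
# x^2 = (1/3)*P0 + (2/3)*P2, so the Legendre triple also spans V.
x_squared = add(scale(1.0 / 3.0, P0), scale(2.0 / 3.0, P2))
```

The last line recovers the coefficient triple of x^2, illustrating that either basis serves equally well.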



Example 7.2: Show that sets with set union (∪) as the addition operator cannot form a vector space.

Solution: Denote the "vectors", namely the subsets of some universal set U, by capital letters such as A and B.

(a) Clearly A + B = A ∪ B is a set in U, so "vector addition" is closed.

(b) Also clearly, A + B = A ∪ B = B ∪ A = B + A, so "vector addition" satisfies commutativity.

(c) Set union is associative, so (A + B) + C = (A ∪ B) ∪ C = A ∪ (B ∪ C) = A + (B + C) ensures the associativity of this "vector addition."

(d) Now A + 0 = A ∪ 0 = A can only hold for all sets A if the zero vector is 0 = ∅, the empty set.

(e) But then there is no negative for every set A, as clearly there is in general no set B (which we would like to denote by −A) such that A + B = A ∪ B = ∅.

Because of the failure of this property, we cannot form a vector space.

Definition 7.1 A square integrable function on the interval [a, b] is a function, say f(x), for which ∫_a^b [f(x)]^2 dx is finite valued. The set of all square integrable functions on [a, b] is denoted L_2[a, b].



Example 7.3: Argue that L_2[a, b] is a vector space under the usual addition and scalar multiplication of functions.

Solution: Denote the "vectors" by lower case letters such as f, g and h for the functions f(x), g(x) and h(x) respectively. Consider each property in turn.

• Define f + g to be the function with the value f(x) + g(x) for all x ∈ [a, b]. But is it necessarily in L_2[a, b], namely square integrable? Note the following inequality for any numbers a and b (remember this inequality: it comes from the parallelogram equality):
  (a + b)^2 = 2a^2 + 2b^2 − (a − b)^2 ≤ 2a^2 + 2b^2 .
  Apply this pointwise to the functions f and g:
  ∫_a^b (f + g)^2 dx ≤ ∫_a^b 2f^2 + 2g^2 dx = 2∫_a^b f^2 dx + 2∫_a^b g^2 dx ,
  and since the right-hand side is a finite upper bound for the non-negative integral on the left, f + g must be in L_2[a, b] and addition is closed.

• Also commutativity, f + g = g + f, follows from pointwise commutativity, f(x) + g(x) = g(x) + f(x).

• Similarly for associativity.

• Clearly the "zero vector" is the zero function, as f(x) + 0 = f(x) for all x.

• The "negative" of f is simply its pointwise negative −f(x); −f is clearly in L_2[a, b] if f is.

• L_2[a, b] is closed under scalar multiplication as ∫_a^b [cf(x)]^2 dx = c^2 ∫_a^b f^2 dx, which is finite for all finite c and square integrable f.

• As above, distributivity and associativity of scalar multiplication follow immediately from pointwise properties.

• Lastly, the identity for scalar multiplication is the function that is 1 for all x ∈ [a, b], as 1.f(x) = f(x).
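As a quick numeric illustration of the closure argument above (our own sketch, not part of the printed text), a crude midpoint rule confirms the bound ∫(f + g)^2 dx ≤ 2∫f^2 dx + 2∫g^2 dx for two sample functions on [0, 1]:

```python
import math

def integrate(h, a=0.0, b=1.0, n=1000):
    """Midpoint-rule approximation of the integral of h over [a, b]."""
    dx = (b - a) / n
    return sum(h(a + (i + 0.5) * dx) for i in range(n)) * dx

f, g = math.sin, lambda x: x * x
lhs = integrate(lambda x: (f(x) + g(x)) ** 2)        # int (f+g)^2
rhs = 2 * integrate(lambda x: f(x) ** 2) + 2 * integrate(lambda x: g(x) ** 2)
```

Here lhs comes out below rhs, exactly as the pointwise inequality (a + b)^2 ≤ 2a^2 + 2b^2 guarantees.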

Often we only want to consider subsets of a vector space. For example, when solving a differential equation with boundary conditions we only need to consider those "vectors" in the vector space of functions which satisfy the boundary conditions. The notion of a subspace of a vector space is very useful from time to time. The proof of the following theorem follows directly from the properties of a vector space.

Theorem 7.2 A subset U of a vector space V is a vector space itself if it is closed under vector addition and scalar multiplication. Such a subset is then called a vector subspace.

Example 7.4: The set of vectors lying on any one line through the origin in the plane forms a vector subspace. Clearly, given a fixed line U through the origin: any two vectors lying in the line U add to another vector in U; any scalar multiple of a vector in the line U is also a vector in U. Thus such a line U is closed, is a subset of the vector space of the plane, and therefore is a vector subspace.

Activity 7.B Do Problems 1–12 in Problem Set 6.8 [K, p364], and 7.6–7.10 from Exercises 7.1.3. Send in to the examiner for feedback at least Q1, Q7 and Ex. 7.6.

7.1.2 Inner products give distances and angles

One of our fundamental needs is the notion of distance and angles. For example, only then can we determine the errors in an approximation. A generalisation of the vector dot product to an inner product serves this purpose in any vector space.

Reading 7.C Study the brief subsection on Inner Product Spaces in Kreyszig §6.8 [K, pp361–2].

Note, Kreyszig uses round brackets (parentheses) to denote a general inner product, (u, v), whereas I prefer the angle brackets, 〈u, v〉, as they are less likely to be mistaken for a vector with two components, and will use them throughout this study guide. Inner products occur so extensively in mathematics that one often uses many different types of brackets for different inner products on different vector spaces.

Definition 7.3 An inner product on a real vector space V is a real function 〈u, v〉 of each u and v in V such that the following properties hold:

1. linearity: 〈au + bv, w〉 = a〈u, w〉 + b〈v, w〉 for all real a and b, and all vectors u, v and w in V;

2. symmetry: 〈u, v〉 = 〈v, u〉 for all vectors u and v in V;

3. positivity: 〈v, v〉 ≥ 0 for all v, with equality holding only if v = 0.

A vector space with an inner product is called an inner product space.

Example 7.5: For functions f and g in L_2[a, b], determine whether 〈f, g〉 = ∫_a^b fg dx forms an inner product.

Solution: Since fg = (1/4)[(f + g)^2 − (f − g)^2], thus
∫_a^b fg dx = (1/4)[ ∫_a^b (f + g)^2 dx − ∫_a^b (f − g)^2 dx ] ,
which is always a finite real number as f ± g are in L_2[a, b].

linearity: for all c, d and functions f, g and h in L_2[a, b]:
〈cf + dg, h〉 = ∫_a^b (cf + dg)h dx = ∫_a^b cfh dx + ∫_a^b dgh dx = c〈f, h〉 + d〈g, h〉 .

symmetry: Clearly 〈f, g〉 = ∫_a^b fg dx = ∫_a^b gf dx = 〈g, f〉.

positivity: Also clearly 〈f, f〉 = ∫_a^b f^2 dx ≥ 0 as the integrand f^2 ≥ 0. However, 〈f, f〉 can be 0 without f being precisely zero. (Infinite dimensional function spaces are tricky.) For example, consider f(x) = 0 everywhere on [a, b] except for a finite number of points at which it takes some non-zero value; then 〈f, f〉 = ∫_a^b f^2 dx = 0 but f is not zero. Strictly speaking this 〈 , 〉 is not an inner product on L_2[a, b].

However, we can patch the definitions. Refine the definition of square integrable functions so that a "vector" f in L_2[a, b] is the set of all functions which are the same except at some number of isolated points. Then all the necessary properties of an inner product space follow, including that 〈f, f〉 = 0 only if f is the zero "vector" (the set of functions such that ∫_a^b f^2 dx = 0).

With an inner product defined, the definition of distance between two vectors follows immediately.

Definition 7.4 For vectors u and v in an inner product space, the length or norm of u is ‖u‖ = √〈u, u〉. (Thus the distance between u and v is ‖u − v‖.) A vector of norm 1 is called a unit vector.

Note especially the consequent Schwarz inequality, also known as the Cauchy-Schwarz inequality, the triangle inequality, and the parallelogram equality. These relations are familiar in 2- and 3-dimensional geometry, and now we know they also hold even for very esoteric vector spaces. It means that schematic diagrams we draw on paper are still relevant to infinite dimensional inner product spaces.

Inner products not only provide the notion of distance, they are also intimately tied up with the notion of angles and hence orthogonality. This underpins the orthogonality we discussed (§6.4) in the infinite number of eigenfunctions of Sturm-Liouville problems.



Definition 7.5 The angle θ between two vectors u and v in an inner product space is determined from

〈u, v〉 = ‖u‖ ‖v‖ cos θ , that is, θ = arccos( 〈u, v〉 / (‖u‖ ‖v‖) ) . (7.1)

Consequently, two vectors are orthogonal if their inner product 〈u, v〉 = 0.

Observe how the Cauchy-Schwarz inequality ensures that there is always a well defined angle between any two non-zero vectors of an inner product space. This leads to being able to characterise vectors for which 〈u, v〉 = 0 as being orthogonal, that is, at right angles, and leads to a generalised Pythagoras theorem.
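For instance (a numeric sketch of our own, using the L_2 inner product on [0, π]), sin x and cos x come out orthogonal, so the angle formula (7.1) returns a right angle:

```python
import math

def inner(f, g, a, b, n=2000):
    """Midpoint-rule approximation of <f, g> = int_a^b f g dx."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) * g(a + (i + 0.5) * dx)
               for i in range(n)) * dx

def norm(f, a, b):
    return math.sqrt(inner(f, f, a, b))

ip = inner(math.sin, math.cos, 0.0, math.pi)      # approximately 0
theta = math.acos(ip / (norm(math.sin, 0.0, math.pi)
                        * norm(math.cos, 0.0, math.pi)))
```

Here theta equals π/2 to within the quadrature error, confirming the orthogonality.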

Activity 7.D Do problems from Exercises 7.11–7.13. Send in to the examiner for feedback at least Ex. 7.13.

7.1.3 Exercises

Ex. 7.6: Show that sets of objects with set intersection, ∩, as the addition operator cannot form a vector space.

Ex. 7.7: Argue that the set of infinite sequences, denoted by IR^∞ and composed of elements of the form a = (a_1, a_2, a_3, ...), forms a vector space.

Ex. 7.8: Determine whether the following functions are in L_2 on the given interval:
(a) sin x on [0, π];
(b) cos x on (−∞, ∞);
(c) e^{−x} on [0, ∞);
(d) x^{−1/4} on [0, 1];
(e) 1/√x on [0, 4];
(f) x^{−3/4} on [1, ∞).

Ex. 7.9: Let L_2^w[a, b] denote the set of functions for which the weighted integral ∫_a^b w(x)[f(x)]^2 dx is finite for some positive weight function w(x). Argue that L_2^w[a, b] is a vector space under the usual addition and scalar multiplication of functions.

Ex. 7.10: Argue that the space C^n[a, b],² of all functions with n continuous derivatives on [a, b], forms a vector space under addition and scalar multiplication of functions.

Ex. 7.11: By considering ‖u + v‖^2 and using the Cauchy-Schwarz inequality, prove the triangle inequality ‖u + v‖ ≤ ‖u‖ + ‖v‖ for all vectors in an inner product space.

Ex. 7.12: Argue that the subset U of IR^∞ for which Σ_{i=1}^∞ a_i^2 is finite forms a subspace with the inner product 〈a, b〉 = Σ_{i=1}^∞ a_i b_i.

Ex. 7.13: Let u = x and v = x^2 with the inner product 〈f, g〉 = ∫_0^1 fg dx. What are the norms of u and v? What is the angle between u and v?

² Often the space C^0[a, b] of continuous functions on [a, b] is written as just C[a, b].



7.2 The nature of linear transformations

We need to start considering functions defined on vector spaces. The simplest examples are functions of many variables. But we will have to move to dealing with functions of an infinite number of variables and even functions of functions! In fact you are already intimately familiar with the examples of differentiation and integration: d/dx sin x = cos x, d/dx e^{x^2} = 2x e^{x^2} and d/dx (2√x) = 1/√x, so differentiation takes a function as an argument, such as sin x, and returns a function as a result, such as cos x. Here we investigate the simplest functions on a vector space, the linear transformations. They are the "straight lines" of vector spaces that in later units will form a basis for understanding quite general transformations.

Main aims:

• to show that familiar operations on functions are examples of linear transformations;
• to see that the adjoint is the general analogue of the transpose.

7.2.1 The universe of linear transformations

Reading 7.E Study the last part, Linear Transformations, of Kreyszig §6.8 [K, pp362–4].



Definition 7.6 If F : V → W is a function from the vector space V into the vector space W, then F is called a linear transformation if

1. F(u + v) = F(u) + F(v) for all vectors u and v in V, and

2. F(cu) = cF(u) for all vectors u in V and scalars c.

A linear transformation is also called a linear operator.

Example 7.14: Show that the differential operator L = d^2/dx^2 + x d/dx is a linear transformation from C^2[a, b] into C^0[a, b].

Solution: Since L involves at most the second derivative, the range and domain are clearly appropriate.

(a) Observe
L(f + g) = d^2/dx^2 (f + g) + x d/dx (f + g)
         = d^2 f/dx^2 + d^2 g/dx^2 + x df/dx + x dg/dx
         = d^2 f/dx^2 + x df/dx + d^2 g/dx^2 + x dg/dx
         = Lf + Lg ,

(b) and
L(cf) = d^2/dx^2 (cf) + x d/dx (cf)
      = c d^2 f/dx^2 + cx df/dx
      = cLf ,

which are the requisite properties for any functions f and g and any scalar c.
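The two linearity properties can also be checked numerically (our own sketch, not from the printed notes: L is approximated by central finite differences at a sample point, and since finite differencing is itself linear in the function values, both identities hold to within roundoff):

```python
import math

def L(f, x, h=1e-3):
    """Central-difference approximation of (d^2/dx^2 + x d/dx) f at x."""
    d2 = (f(x + h) - 2.0 * f(x) + f(x - h)) / h**2
    d1 = (f(x + h) - f(x - h)) / (2.0 * h)
    return d2 + x * d1

f, g, c, x0 = math.sin, math.exp, 3.0, 0.7
additive = L(lambda t: f(t) + g(t), x0) - (L(f, x0) + L(g, x0))
homogeneous = L(lambda t: c * f(t), x0) - c * L(f, x0)
```

Both residuals are tiny: the discretisation inherits the linearity of L exactly, up to floating-point roundoff.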

Example 7.15: Argue that the integral L(f) = ∫_a^b K(x, y)f(y) dy is a linear operator from L_2[a, b] into itself, that is, from and to square integrable functions, provided that K is bounded, |K(x, y)| ≤ k, for x and y in the interval [a, b].

For example, if a = 0, b = 1 and K(x, y) = x − y (the bound is k = 1), then L(x^2) = x/3 − 1/4 and L(sin πx) = (2x − 1)/π.

Solution: As with many infinite dimensional vector space problems, the overwhelming difficulty lies in confirming the range of the function L. Thus we first dispense with the straightforward part of showing that it is linear:
is linear:



(a)
L(f + g) = ∫_a^b K(x, y)(f(y) + g(y)) dy
         = ∫_a^b K(x, y)f(y) + K(x, y)g(y) dy
         = ∫_a^b K(x, y)f(y) dy + ∫_a^b K(x, y)g(y) dy
         = L(f) + L(g) ;

(b)
L(cf) = ∫_a^b K(x, y)cf(y) dy
      = c ∫_a^b K(x, y)f(y) dy
      = cL(f) .

For f ∈ L_2[a, b] we know ∫_a^b f^2(x) dx is finite. We now need to prove that g(x) = ∫_a^b K(x, y)f(y) dy is also in L_2[a, b]. To help, we define the inner product for any f and g, 〈f, g〉 = ∫_a^b fg dx, and use the Cauchy-Schwarz inequality, 〈f, g〉^2 ≤ ‖f‖^2 ‖g‖^2. Consider

∫_a^b g^2 dx = ∫_a^b L(f)L(f) dx
  = ∫_a^b ∫_a^b K(x, y)f(y) dy ∫_a^b K(x, z)f(z) dz dx   (as any variable may be used for y in L(f))
  = ∫_a^b ∫_a^b [ ∫_a^b K(x, y)K(x, z) dx ] f(y)f(z) dy dz   (rearranging)
  ≤ ∫_a^b ∫_a^b [ ∫_a^b |K(x, y)|.|K(x, z)| dx ] |f(y)|.|f(z)| dy dz   (the inner integral is ≤ (b − a)k^2)
  ≤ k^2 (b − a) ∫_a^b |f(y)| dy ∫_a^b |f(z)| dz
  = k^2 (b − a) 〈1, |f|〉^2   (by definition of the inner product)
  ≤ k^2 (b − a) ‖1‖^2 ‖f‖^2   (by Cauchy-Schwarz)
  = k^2 (b − a)^2 ∫_a^b f^2(x) dx   (by definition of the norms).

This bound on the integral is finite and thus g = L(f) is necessarily square integrable, that is, in L_2[a, b].
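The worked values quoted above are easy to confirm with a crude quadrature (our own check, not in the printed notes): with K(x, y) = x − y on [0, 1], L(x^2) should match x/3 − 1/4 and L(sin πy) should match (2x − 1)/π.

```python
import math

def L(f, x, n=4000):
    """Midpoint-rule approximation of int_0^1 (x - y) f(y) dy."""
    dy = 1.0 / n
    return sum((x - (i + 0.5) * dy) * f((i + 0.5) * dy)
               for i in range(n)) * dy

x = 0.3
v1 = L(lambda y: y * y, x)                  # expect x/3 - 1/4
v2 = L(lambda y: math.sin(math.pi * y), x)  # expect (2x - 1)/pi
```

Both quadrature values agree with the closed forms to several decimal places.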

Activity 7.F Do Exercise 7.23 herein.

7.2.2 Adjoint operators

Recall that the transpose of a matrix often crops up in solving matrix problems. For example, the least-squares solution of an overdetermined system Ax = b is found by solving A^T Ax = A^T b. Also, the eigenvalues of a symmetric matrix, one for which A^T = A, are always real. For general linear transforms we define the equivalent notion of an adjoint operator.

Definition 7.7 The adjoint of a linear operator L mapping a subspace V into a subspace U of an inner product space W is the operator L† such that 〈u, Lv〉 = 〈L†u, v〉 for all vectors u ∈ U and v ∈ V.

If L† = L and U = V then L is called self-adjoint.

Example 7.16: A† = A^T. The adjoint of a matrix is its transpose using the usual inner product 〈u, v〉 = u^T v. For all u and v, consider:

〈u, Av〉 = u^T Av   (by the inner product definition)
        = (A^T u)^T v   (by transpose properties)
        = 〈A^T u, v〉   (by the inner product definition),

and hence the adjoint is A^T. Clearly a symmetric matrix is self-adjoint.
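A small concrete check of this identity (our own sketch, with an arbitrary 2 × 2 matrix and test vectors) that 〈u, Av〉 = 〈A^T u, v〉:

```python
A = [[1.0, 2.0], [3.0, 4.0]]
u, v = [0.5, -1.0], [2.0, 0.25]

def matvec(M, x):
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

At = [[A[j][i] for j in range(2)] for i in range(2)]   # the transpose
lhs = dot(u, matvec(A, v))      # <u, Av>
rhs = dot(matvec(At, u), v)     # <A^T u, v>
```

The two inner products agree exactly, as the transpose argument predicts.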

Theorem 7.8 Some straightforward properties of the adjoint follow. For any linear operators L and M:

1. (L†)† = L;

2. (L + M)† = L† + M†;

3. (LM)† = M†L†;

4. the adjoint depends upon the inner product: if the inner product is changed then so does the adjoint.

The proofs of these properties are left as Exercise 7.26.

Example 7.17: The shear transformation of the plane in the horizontal x-direction with parameter k is T(x, y) = (x + ky, y). This has matrix

A = [ 1  k ]
    [ 0  1 ]

so that T(x, y) = A[x; y]. Thus from Example 7.16 its adjoint must have matrix

A^T = [ 1  0 ]
      [ k  1 ]

so that T†(x, y) = A^T [x; y] = [x; kx + y]. Thus T† is the shear transformation in the vertical y-direction with parameter k.

But what is T† if we use a weighted inner product? Say the inner product on the plane is defined as 〈(u, v), (x, y)〉 = 2xu + yv, so that we weight the horizontal direction more than the vertical direction. Then for all u, v, x and y

〈(u, v), T(x, y)〉 = 〈(u, v), (x + ky, y)〉
                  = 2u(x + ky) + vy
                  = 2ux + (2ku + v)y
                  = 〈(u, 2ku + v), (x, y)〉

and so T†(x, y) = (x, 2kx + y), which is again a shear in the vertical but now with parameter 2k. The adjoint depends upon the choice of inner product.
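The weighted-adjoint claim can be verified numerically too (our own sketch; the value k = 1.5 and the test points are arbitrary): with 〈(u, v), (x, y)〉 = 2xu + yv, the operator T†(x, y) = (x, 2kx + y) satisfies the defining identity.

```python
k = 1.5

def ip(p, q):
    """Weighted inner product <(u,v),(x,y)> = 2*x*u + y*v on the plane."""
    return 2.0 * p[0] * q[0] + p[1] * q[1]

def T(p):        # horizontal shear, parameter k
    return (p[0] + k * p[1], p[1])

def T_dag(p):    # claimed adjoint: vertical shear, parameter 2k
    return (p[0], 2.0 * k * p[0] + p[1])

uv, xy = (0.7, -2.0), (3.0, 0.5)
lhs = ip(uv, T(xy))        # <(u,v), T(x,y)>
rhs = ip(T_dag(uv), xy)    # <T_dag(u,v), (x,y)>
```

Changing the weights in `ip` changes which operator satisfies the identity, illustrating point 4 of Theorem 7.8.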

Example 7.18: Find the adjoint of the linear operator

Lf = ∫_a^b K(x, y)f(y) dy .

Solution: Using the inner product 〈f, g〉 = ∫_a^b f(x)g(x) dx, we have

〈f, Lg〉 = ∫_a^b f(x) ∫_a^b K(x, y)g(y) dy dx   (by the definitions)
        = ∫_a^b [ ∫_a^b f(x)K(x, y) dx ] g(y) dy   (swapping the order of integration)
        = ∫_a^b [ ∫_a^b K(y, x)f(y) dy ] g(x) dx   (swapping the roles of x and y)
        = 〈 ∫_a^b K(y, x)f(y) dy , g 〉 ,

and so the adjoint is L†f = ∫_a^b K(y, x)f(y) dy. (This is analogous to the matrix transpose.) It is not the same as L because the arguments of K are interchanged, unless K(y, x) = K(x, y) for all x and y, in which case K is called symmetric and then the operator L is self-adjoint.

Example 7.19: the adjoint of d/dx is almost −d/dx! Consider this differentiation operator over C^1[a, b] with the usual inner product 〈f, g〉 = ∫_a^b f(x)g(x) dx. Now

〈f, dg/dx〉 = ∫_a^b f (dg/dx) dx
           = [fg]_a^b − ∫_a^b (df/dx) g dx   (by integration by parts)
           = f(b)g(b) − f(a)g(a) + 〈−df/dx, g〉 .

The inner product appearing here indeed suggests that (d/dx)† = −d/dx, but the exact identity required by the definition of the adjoint actually does not hold unless we also have f(b)g(b) − f(a)g(a) = 0. (This is a usual difficulty in function spaces.)



• One way to ensure this is to restrict the set of functions on which the adjoint is defined to the subspace U of C^1[a, b] of functions that are zero at a and b; with f(a) = f(b) = 0, then f(b)g(b) − f(a)g(a) = 0 and hence (d/dx)† = −d/dx on U.

• Another way, more aesthetically pleasing, is to restrict d/dx to the subspace V of functions zero at a, and restrict (d/dx)† to the subspace U (redefined) of functions zero at b (or vice versa); then f(b)g(b) − f(a)g(a) = 0 as g(a) = 0 and f(b) = 0, and hence (d/dx)† = −d/dx.
dx<br />

The previous example begins to show that initial and/or boundary conditions are an integral part of operators and their adjoints.

Example 7.20: L = d^2/dx^2 is self-adjoint on the subspace V of C^2[a, b] of functions that are zero at x = a and b, using the usual inner product.

Solution: Just consider (using dashes for derivatives)

〈f, Lg〉 = ∫_a^b fg″ dx
        = [fg′]_a^b − ∫_a^b f′g′ dx   (integrating by parts once)
        = [fg′ − f′g]_a^b + ∫_a^b f″g dx   (integrating by parts again)
        = [fg′ − f′g]_a^b + 〈Lf, g〉
        = [fg′]_a^b + 〈Lf, g〉   (as g(a) = g(b) = 0 for g ∈ V)
        = 〈Lf, g〉

for f ∈ V, as then f(a) = f(b) = 0. Thus L† = L on V and L is self-adjoint.

Example 7.21: Find the adjoint of L = d^2/dx^2 + x d/dx on the subspace V = {g ∈ C^2[a, b] | g′(a) = g(b) = 0}, using the usual inner product.

Solution: Consider

〈f, Lg〉 = ∫_a^b f(g″ + xg′) dx
        = ∫_a^b fg″ + xfg′ dx
        = [fg′ + xfg]_a^b − ∫_a^b f′g′ + (xf)′g dx   (integrating by parts)
        = f(b)g′(b) − af(a)g(a) − ∫_a^b f′g′ + (xf′ + f)g dx   (as g′(a) = g(b) = 0)
        = f(b)g′(b) − af(a)g(a) − [f′g]_a^b + ∫_a^b (f″ − xf′ − f)g dx   (integrating f′g′ by parts)
        = f(b)g′(b) − af(a)g(a) + f′(a)g(a) + 〈f″ − xf′ − f, g〉   (since g(b) = 0).

Thus L† = d^2/dx^2 − x d/dx − 1 provided we restrict it to the subspace U of functions f such that f(b)g′(b) + [−af(a) + f′(a)]g(a) = 0. Now we cannot control g′(b) nor g(a), as these may vary over any values in V. Thus we require that f(b) = 0 and f′(a) = af(a): the adjoint is L† = d^2/dx^2 − x d/dx − 1 over the subspace

U = {f ∈ C^2[a, b] | f′(a) − af(a) = f(b) = 0} .

Example 7.22: Sturm-Liouville Show that Sturm-Liouville operators are<br />

self-adjoint in the usual inner product on suitable subsets <strong>of</strong> C 2 [a, b].<br />

Solution:<br />

As seen in §6.4 the Sturm-Liouville equation is<br />

Lg = [r(x)g ′ ] ′ + [q(x) + λp(x)]g = 0 ,<br />

for some functions p, q and r. Let’s consider the subspace <strong>of</strong> functions<br />

satisfying the quite general boundary conditions kg(a) + lg ′ (a) = 0 and


Module 7. Linear transforms and their eigenvectors on inner product spaces281<br />

mg(b) + ng ′ (b) = 0. Then<br />

〈f, Lg〉 =<br />

=<br />

∫ b<br />

a<br />

∫ b<br />

a<br />

f{(rg ′ ) ′ + (q + λp)g} dx<br />

f(rg ′ ) ′ + (q + λp)fg dx<br />

∫ b<br />

= [rfg ′ − rf ′ g] b a + (f ′ r) ′ g + (q + λp)fg dx<br />

a<br />

after integrating f(rg ′ ) ′ by parts twice<br />

= [r(fg ′ − f ′ g)] b a<br />

+ 〈Lf, g〉 .<br />

Thus for L to be self-adjoint we either need r(x) to be zero at the endpoints or, the case we consider here, fg′ − f′g = 0 at the two ends. If l ≠ 0 then, where all functions are evaluated at x = a, g′ = −kg/l and hence
fg′ − f′g = −kfg/l − f′g = −(kf + lf′)g/l = 0
if and only if kf + lf′ = 0, since g could have any value. Similarly for the other end point x = b. If it happens that l = 0 then even easier arguments apply. We have shown that the functions f for the adjoint satisfy the same boundary conditions as for L itself. Thus Sturm-Liouville operators are self-adjoint.

Observe in these examples how a differential operator together with suitable boundary conditions is very naturally complemented by the adjoint and its boundary conditions. If the operator and boundary conditions are used to form a well-posed differential equation, then so do the adjoint and its boundary conditions form a well-posed differential equation. Soon we will see that solutions to an ode are usefully related to those of its adjoint. The relationship is particularly useful when the operator is self-adjoint.

7.2.3 Exercises<br />

Ex. 7.23: Argue that the set of linear transforms from a vector space U to a vector space V, L : U → V, is itself a vector space under operator addition, where L + M is the transformation that, applied to any vector u ∈ U, gives L(u) + M(u), and the operation of scalar multiplication, where cL is the transformation that, applied to any vector u, gives cL(u). (For example, all the transformations of the plane, possibly represented as all 2 × 2 matrices, themselves form a vector space.)

Ex. 7.24: Show that the adjoint of a matrix A under the weighted inner product 〈u, v〉 = uᵀBv, for some suitable weight matrix B, is A† = (BAB⁻¹)ᵀ.
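A quick numerical experiment makes the claimed formula plausible by checking the defining property 〈u, Av〉 = 〈A†u, v〉. This is a sketch in Python with NumPy (not the unit's Matlab); the particular matrices A and B here are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))          # an arbitrary matrix
M = rng.standard_normal((3, 3))
B = M @ M.T + 3 * np.eye(3)              # a symmetric positive-definite weight matrix

A_adj = (B @ A @ np.linalg.inv(B)).T     # the claimed adjoint under <u, v> = u^T B v

u = rng.standard_normal(3)
v = rng.standard_normal(3)
print(np.isclose(u @ B @ (A @ v), (A_adj @ u) @ B @ v))   # True
```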

Ex. 7.25: Describe the adjoint of the transformation of the plane, T, which is rotation by an angle θ.

Ex. 7.26: Use the definition of the adjoint to prove the first three properties listed in Theorem 7.8.



Ex. 7.27: Find the adjoint of Lf = ∫_a^b K(x, y)f(y) dy under the weighted inner product 〈f, g〉 = ∫_a^b w(x)f(x)g(x) dx.

Ex. 7.28: Find the adjoints of the following differential operators L and the subspaces they operate on:
(a) Lf = df/dx such that 2f(0) + 5f(1) = 0;
(b) Lf = f′′ + 3f′ + 4f such that f(0) = 0 and f′(1) = 0;
(c) Lf = f′′′ + f′ such that f(0) = 0, f′(0) = 2f′′(1), f′(1) = 3f(1).
Use the inner product 〈f, g〉 = ∫_0^1 fg dx.

Ex. 7.29: Find the adjoints of the following differential operators L and the subspaces they operate on:
(a) Lf = df/dx such that f(0) = 3f(1);
(b) Lf = f′′ + 2f′ + f such that f(0) = f(1) = 0;
(c) Lf = f′′′ such that f(0) = 0, f′(0) = 2f′′(1), f(1) = f′(1).
Use the inner product 〈f, g〉 = ∫_0^1 fg dx.



7.3 Revision of eigenvalues and eigenvectors

Reading 7.G Start by recalling familiar properties of eigenvalues and eigenvectors of matrices by revising the material in Kreyszig §7.1–2 [K, pp371–81].

The critical facets:
• a matrix A has a non-zero eigenvector v and eigenvalue λ if Av = λv, that is, if the action of A is simply to stretch or compress v (possibly reversing direction if λ < 0);
• eigenvalues are the solutions of the characteristic equation det(λI − A) = 0; the spectrum of A is the set of eigenvalues of A;
• for any given eigenvector v, any scalar multiple is also an eigenvector, so each eigenvector is a basis for a subspace; thus we seek linearly independent eigenvectors to avoid duplication;
• counted according to their multiplicity, there are precisely n eigenvalues (possibly complex) of an n × n matrix;
• if the n eigenvalues of an n × n matrix are distinct, then there are precisely n linearly independent eigenvectors; however, if one or more eigenvalues are repeated, then the matrix may have fewer than n linearly independent eigenvectors;



• in Matlab,
– poly(a) returns the coefficients of the characteristic polynomial det(λI − A),
– eig(a) returns a vector of eigenvalues,
– whereas [p,d]=eig(a) returns a diagonal matrix D of eigenvalues and a matrix P whose columns are eigenvectors, so that A = PDP⁻¹.
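For those working without Matlab, the same computation can be sketched in Python with NumPy (an assumption of convenience; numpy.linalg.eig plays the role of eig):

```python
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 3.0]])
evals, P = np.linalg.eig(A)      # like Matlab's [p,d] = eig(a)
D = np.diag(evals)
print(np.sort(evals))            # the spectrum of A: eigenvalues 2 and 4
print(np.allclose(A, P @ D @ np.linalg.inv(P)))   # True: A = P D P⁻¹
```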

Activity 7.H Ensure you can do the problems in Kreyszig Problem Sets 7.1<br />

and 7.2 [K,pp375–6 & pp379–81], and Exercises 7.31 and 7.32. Send in<br />

to the examiner for feedback at least Ex. 7.31(a) & 7.32(a).<br />

Many of the above properties hold on function spaces. However, the determinant is not defined there, so other methods have to be used to find the eigenvalues and eigenvectors.

Example 7.30: Find the eigenvalues and eigenfunctions corresponding to non-zero eigenvalues of the linear transformation ∫_0^1 (x + y)f(y) dy over the space of continuously differentiable functions.
Solution:
We seek non-trivial solutions to
∫_0^1 (x + y)f(y) dy = λf(x).



• Expanding the integral on the left-hand side, observe
x ∫_0^1 f(y) dy + ∫_0^1 y f(y) dy = λf(x).
• As the left-hand side is a linear function of x, then so must be the right-hand side and hence so is f(x) (unless the left-hand side is zero, which then implies λ = 0). Try f(x) = Ax + B in the equation to deduce
x(A/2 + B) + (A/3 + B/2) = λAx + λB.
• This has to hold for all x, and so the coefficients on the two sides must be equal: A/2 + B = λA and A/3 + B/2 = λB, which in matrix form is the matrix eigenproblem
[ 1/2 1 ; 1/3 1/2 ] (A, B) = λ(A, B).
• The characteristic equation for this 2 × 2 matrix is (λ − 1/2)² − 1/3 = 0, with solutions the non-zero eigenvalues λ = 1/2 ± 1/√3.
• The corresponding eigenvectors of the 2 × 2 problem are clearly proportional to (±√3, 1), which in terms of the functions in the function space are simply those proportional to the eigenfunctions f = ±√3 x + 1.
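These eigenvalues can be checked numerically by discretising the integral operator as a matrix. The sketch below is in Python with NumPy (rather than the unit's Matlab); the midpoint rule and the grid size n = 400 are arbitrary choices for illustration.

```python
import numpy as np

# Midpoint-rule discretisation of (Lf)(x) = integral_0^1 (x+y) f(y) dy as an n-by-n matrix
n = 400
y = (np.arange(n) + 0.5) / n
K = (y[:, None] + y[None, :]) / n
evals = np.sort(np.linalg.eigvals(K).real)
print(evals[0], evals[-1])   # approx 1/2 - 1/sqrt(3) and 1/2 + 1/sqrt(3); the rest approx 0
```

Because the kernel x + y has rank two, only two eigenvalues are (numerically) non-zero, matching the analysis above.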



7.3.1 Exercises<br />

Ex. 7.31: Each of the four pictures plotted below shows the effect on vectors in the plane of a different transformation of the plane obtained by multiplying by a different 2 × 2 matrix. In each picture there are seven different coloured dashed vectors terminated by open circles; call them u_i. In each picture the vectors resulting from multiplying by a matrix A are also plotted, say v_i = Au_i, and drawn as solid lines terminated by "*". Using the fact that the action of a matrix is to just stretch its eigenvectors by a factor λ (and reverse direction if λ < 0), draw on each picture your best estimate of the two directions corresponding to the two different families of eigenvectors of a 2 × 2 transformation. Label them with a rough estimate of the corresponding eigenvalue.



[Figure: four panels (a)–(d), each showing the seven dashed vectors u_i and the corresponding solid vectors v_i = Au_i for a different 2 × 2 matrix A.]

Ex. 7.32: Find the only non-zero eigenvalues and the corresponding eigenfunctions of the following linear transformations:
(a) Lf = ∫_0^1 2(x − y)f(y) dy;
(b) Lf = ∫_a^b exp(x − y)f(y) dy;
(c) Lf = ∫_0^π cos(x + y)f(y) dy.



7.4 Diagonalisation transformation<br />

The orthogonal solutions of a Sturm-Liouville problem can be used to solve simply inhomogeneous differential equations of the form [r(x)y′]′ + [q(x) + µp(x)]y = f(x). The trick is to express the right-hand side f(x) as a sum over the eigenfunctions of the differential operator. We now explore how this is analogous to the diagonalisation of matrices and proceed to develop further properties.
Main aims:
• show that eigenvectors (eigenfunctions) of the adjoint operator are used to find expansions in the eigenvectors (eigenfunctions) of an operator;
• because the adjoint eigenvectors are orthogonal to the eigenvectors, see that eigenvectors of a self-adjoint operator are orthogonal;
• an eigenfunction expansion solution to the Sturm-Liouville type problem [r(x)y′]′ + [q(x) + µp(x)]y = f(x) is the linear combination of eigenfunctions ∑_j y_j(x) 〈f/p, y_j〉 / (µ − λ_j) for any specific value of the parameter µ.



7.4.1 Adjoint eigenvectors diagonalise operators<br />

Example 7.33: Consider solving the simple linear equation
Ax = [ 3 1 ; 1 3 ] x = (3, −1) = (1, 1) + 2(1, −1).
The rearrangement on the very right of this equation is motivated because I happen to know that (1, ±1) are eigenvectors of the matrix A; they correspond to eigenvalues 4 and 2. Knowing this we solve the linear equation using the "method of undetermined coefficients" by guessing a solution in the form of a linear combination of the two eigenvectors:
x = a(1, 1) + b(1, −1).
Substituting this into the linear equation and noting that A(1, 1) is 4(1, 1) and A(1, −1) is 2(1, −1),
Ax = (3, −1) becomes 4a(1, 1) + 2b(1, −1) = (1, 1) + 2(1, −1).
Equating coefficients on both sides shows 4a = 1 and 2b = 2, that is, a = 1/4 and b = 1; hence the solution is
x = (1/4)(1, 1) + (1, −1) = (5/4, −3/4).
4 1 −1 −3/4



The process just given here is the same as that we use in the eigenfunction expansion of solutions of odes in §7.4.3. Look at and compare with Example 7.37. This process is intimately tied up with the diagonalisation of matrices and linear operators, because the solution of linear equations with a diagonal operator is near trivial, as shown above.

Reading 7.I Study Kreyszig §7.5 [K,pp392–6] (overlook Theorem 4 and Example<br />

3).<br />

Using diagonalisation, A = PDP⁻¹, the solution of the system Ax = b is written as x = PD⁻¹P⁻¹b, which is easy to compute as D⁻¹ is simply the diagonal matrix of reciprocals of the eigenvalues. More explicitly we might write this as x = Pa = ∑_j v_j a_j, where the "amplitudes" a_j of the eigenvectors v_j in the solution are given by a_j = (w_j · b)/λ_j, where w_j comes from the jth row of P⁻¹. Implicitly, this general result is used in the introductory example.
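In code, the diagonalisation formula reads off directly. A sketch in Python with NumPy, applied to the matrix of Example 7.33:

```python
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 3.0]])
b = np.array([3.0, -1.0])
evals, P = np.linalg.eig(A)
x = P @ np.diag(1.0 / evals) @ np.linalg.inv(P) @ b   # x = P D^{-1} P^{-1} b
print(x)   # [ 1.25 -0.75], agreeing with Example 7.33
```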

Activity 7.J Do problems 1–6 and 10–22 from Problem Set 7.5 [K,pp397–8]<br />

Theorem 7.11 on deriving the eigenfunction expansion of solutions to inhomogeneous Sturm-Liouville problems, seen in action in Example 7.37, is equivalent to using a diagonalisation.
We proceed to determine how to "diagonalise operators", not just matrices. The general setting is of linear transforms A acting on some vector space with an inner product 〈 , 〉. The definition of eigenvalues and eigenvectors proceeds as before. However, there is one new twist that is useful.



Definition 7.9 For a linear transformation A the eigenvectors of A† are called the left-eigenvectors or adjoint eigenvectors of A.
They are called left-eigenvectors because, in the case of a matrix A, the defining equation Aᵀw = λw is, upon transposing, equivalent to wᵀA = λwᵀ, in which wᵀ appears to the left of A. Three important properties are the following.
identical spectrum The spectrum of the adjoint A†, the set of eigenvalues, is the same as that of A. This is easily seen in finite dimensions because the characteristic polynomials of a matrix and its transpose are the same.

orthogonal eigenvectors Any left-eigenvector and ordinary eigenvector corresponding to distinct eigenvalues are orthogonal.
To see this, suppose v_i is an eigenvector corresponding to eigenvalue λ_i and w_j is a left-eigenvector corresponding to eigenvalue λ_j ≠ λ_i. Then consider
λ_i 〈w_j, v_i〉 = 〈w_j, λ_i v_i〉
= 〈w_j, Av_i〉 by definition of eigenvector
= 〈A†w_j, v_i〉 by definition of adjoint
= 〈λ_j w_j, v_i〉 by definition of left-eigenvector
= λ_j 〈w_j, v_i〉.



Since λ_i ≠ λ_j, the only way the extreme sides of this equation can be equal is if the common inner product factor is zero. Thus w_j and v_i are orthogonal.

eigen-expansion Thus, if we have a complete set of eigenvectors and we normalise the left-eigenvectors so that the inner product with each partner eigenvector is 〈w_j, v_j〉 = 1, then any vector may be decomposed as the linear combination of the eigenvectors u = ∑_i 〈w_i, u〉 v_i.
Since the eigenvectors are complete, there exists some linear combination u = ∑_i a_i v_i. Take the inner product of this with w_j to determine
〈w_j, u〉 = 〈w_j, ∑_i a_i v_i〉
= ∑_i a_i 〈w_j, v_i〉 by linearity of the inner product
= ∑_i a_i δ_ij by orthonormalisation of w_j (a common convention is that δ_ij is 1 if i = j and 0 otherwise)
= a_j as all other terms are 0.
Hence the "amplitude" of v_i in the linear combination is a_i = 〈w_i, u〉 as claimed.

For a matrix A one may form the matrices of eigenvectors,
P = [v₁ | v₂ | … | v_n] and Q = [w₁ | w₂ | … | w_n];
then, using the usual dot product as the inner product, observe that QᵀP is the matrix of inner products 〈w_i, v_j〉, which is zero everywhere except on the diagonal, where each w_i has been normalised so the diagonal entries are 1. Thus Qᵀ = P⁻¹; that is, the rows of P⁻¹ are the normalised left-eigenvectors of A.

Example 7.34: Compute the eigenvalues, eigenvectors and left-eigenvectors of the matrix appearing in the linear equation
[ 1 −1 ; −4 1 ] u = (2, 2),
and hence solve the linear equation.

Solution:
Call the matrix A; then the characteristic equation is
det(A − λI) = (λ − 1)² − 4 = 0,
which has as solutions the eigenvalues λ₁ = −1 and λ₂ = 3.
• Any eigenvector corresponding to λ₁ = −1 solves
(A − λ₁I)v₁ = [ 2 −1 ; −4 2 ] v₁ = 0,
with all solutions proportional to v₁ = (1, 2).



• Any eigenvector corresponding to λ₂ = 3 solves
(A − λ₂I)v₂ = [ −2 −1 ; −4 −2 ] v₂ = 0,
with all solutions proportional to v₂ = (1, −2).
• The left-eigenvectors satisfy the transposed equations, so for λ₁ = −1:
(Aᵀ − λ₁I)w₁ = [ 2 −4 ; −1 2 ] w₁ = 0,
with all solutions proportional to w₁ = (2, 1). Observe that w₁ · v₂ = 0, as assured by theory, and that w₁ · v₁ = 1 upon scaling w₁ to w₁ = (1/2, 1/4).
• Similarly the left-eigenvector corresponding to λ₂ = 3 solves
(Aᵀ − λ₂I)w₂ = [ −2 −4 ; −1 −2 ] w₂ = 0,
with all solutions proportional to w₂ = (−2, 1). Observe that w₂ · v₁ = 0, as assured by theory, and that w₂ · v₂ = 1 upon scaling w₂ to w₂ = (1/2, −1/4).
Thus the inner products of the left-eigenvectors with the given right-hand side are w₁ · (2, 2) = 3/2 and w₂ · (2, 2) = 1/2, so we know the right-hand side as this linear combination of the eigenvectors:
(2, 2) = (3/2)(1, 2) + (1/2)(1, −2).



Divide each term in the linear combination by the corresponding eigenvalue to obtain the solution
u = −(3/2)(1, 2) + (1/6)(1, −2) = (−4/3, −10/3).
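Example 7.34's left-eigenvector machinery is exactly what the inverse of the eigenvector matrix provides numerically. A sketch in Python with NumPy (eigenvalue ordering from eig may differ from the example's, but the final solution does not depend on it):

```python
import numpy as np

A = np.array([[1.0, -1.0], [-4.0, 1.0]])
b = np.array([2.0, 2.0])
evals, P = np.linalg.eig(A)          # columns of P: eigenvectors v_j
W = np.linalg.inv(P)                 # rows of P^{-1}: normalised left-eigenvectors w_j
amps = (W @ b) / evals               # amplitudes <w_j, b> / lambda_j
u = P @ amps
print(u)                             # approx [-4/3, -10/3]
print(np.allclose(A @ u, b))         # True
```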

Example 7.35: Find the eigenvalues, eigenvectors (eigenfunctions) and normalised adjoint eigenvectors for the linear operator Ly = −d²y/dx² with boundary conditions y(0) = 0 and y′(π) = ½y′(0).
Solution: Use the usual inner product on the domain of the differential equation, namely 〈f, g〉 = ∫_0^π f(x)g(x) dx.
• First, solve the eigenproblem −y′′ = λy such that y(0) = 0 and y′(π) = ½y′(0). The ode has constant coefficients so we expect exponential or trigonometric solutions. Exponential solutions cannot occur because if they did they would have to be of the form y(x) = sinh(√(−λ) x) to satisfy y(0) = 0, but the derivative of this, y′(x) ∝ cosh(√(−λ) x), is monotonic increasing for positive x, and so y′(π) cannot be ½y′(0). Similarly λ = 0 cannot give rise to an eigenfunction. In the last case, to satisfy y(0) = 0 trigonometric



solutions must be of the form y(x) = sin(√λ x). Then the other boundary condition gives
y′(π) = ½y′(0) ⇔ √λ cos(√λ π) = ½√λ
⇔ cos(√λ π) = ½
⇔ √λ = 1/3, 5/3, 7/3, 11/3, …
⇔ √λ_j = j − ½ + (−1)ʲ/6, j = 1, 2, 3, …
⇔ λ_j = (j − ½ + (−1)ʲ/6)², j = 1, 2, 3, …
The corresponding eigenfunctions are v_j(x) = sin[(j − ½ + (−1)ʲ/6)x], plotted below.



[Figure: the eigenfunctions v₁, v₂, v₃, v₄, v₅ plotted against x over 0 ≤ x ≤ π.]

• Second, derive and solve the adjoint. Consider
〈w, Lv〉 = ∫_0^π −wv′′ dx
= [−wv′ + w′v]_0^π + ∫_0^π −w′′v dx
= w′(π)v(π) + [w(0) − ½w(π)]v′(0) + 〈−w′′, v〉,
and therefore the adjoint is L†w = −d²w/dx² with boundary conditions w′(π) = 0 and w(0) = ½w(π). Observe that although the differential part of the adjoint is the same, the operator L is not self-adjoint because the boundary conditions for the adjoint are different to those for L.
In Matlab the eigenfunctions are plotted by
x=linspace(0,pi);
[j,x]=meshgrid(1:5,x);
k=(j-.5+(-1).^j/6);
v=sin(k.*x);
plot(x,v)



By a similar argument to the above, the solutions to the adjoint eigenproblem L†w = λw must be trigonometric. To satisfy w′(π) = 0 the solutions must be of the form w = cos[√λ(π − x)]. Then the other boundary condition gives
w(0) = ½w(π) ⇔ cos(√λ π) = ½ ⇔ √λ = 1/3, 5/3, 7/3, 11/3, …
as for L. The spectrum for L and its adjoint must be the same. The left-eigenfunctions are then found to be w_j(x) ∝ cos[(j − ½ + (−1)ʲ/6)(π − x)]. To normalise, observe
〈cos[(j − ½ + (−1)ʲ/6)(π − x)], v_j(x)〉 = (π/2) sin[(j − ½ + (−1)ʲ/6)π] = (π√3/4)(−1)ʲ⁻¹.
Thus choose
w_j(x) = [4(−1)ʲ⁻¹/(π√3)] cos[(j − ½ + (−1)ʲ/6)(π − x)],
plotted below. A little algebra also confirms that the w_j(x) are orthogonal to the eigenfunctions v_i(x) for i ≠ j.



[Figure: the normalised left-eigenfunctions w₁, w₂, w₃, w₄, w₅ plotted against x over 0 ≤ x ≤ π, generated in Matlab by]
w=cos(k.*(pi-x)) ...
.*(-1).^(j-1)*4/(pi*sqrt(3));
plot(x,w)
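The biorthogonality 〈w_j, v_i〉 = δ_ij claimed in Example 7.35 can be confirmed by quadrature. This is a sketch in Python with NumPy mirroring the Matlab snippets above; the grid size and trapezoidal rule are arbitrary choices.

```python
import numpy as np

x = np.linspace(0.0, np.pi, 20001)
h = x[1] - x[0]

def inner(f, g):
    """Trapezoidal rule for <f, g> = integral of f*g over [0, pi]."""
    fg = f * g
    return h * (fg.sum() - 0.5 * (fg[0] + fg[-1]))

k = lambda j: j - 0.5 + (-1) ** j / 6
w = lambda j: 4 * (-1) ** (j - 1) / (np.pi * np.sqrt(3)) * np.cos(k(j) * (np.pi - x))
v = lambda i: np.sin(k(i) * x)

G = np.array([[inner(w(j), v(i)) for i in range(1, 6)] for j in range(1, 6)])
print(np.round(G, 6))   # the 5x5 identity: <w_j, v_i> = delta_ij
```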

Activity 7.K Do Exercises 7.39–7.40 in §7.4.4. Send in to the examiner for<br />

feedback at least Ex. 7.39(a) & 7.40.<br />

7.4.2 Orthogonal eigenvectors of self-adjoint operators

One immediate consequence of the work in the previous subsection concerns self-adjoint transforms. Clearly, if a transformation is self-adjoint then the left-eigenfunctions, or the left-eigenvectors of a symmetric matrix, are identical to the ordinary eigenvectors. This is because they satisfy precisely the same equations. This and other rather special properties hold for self-adjoint transformations (symmetric matrices).

Reading 7.L Study the properties of orthogonal and symmetric matrices in Kreyszig §7.3 [K, pp381–4].
The theorems apply not just to symmetric matrices but also to self-adjoint operators. As an example, consider the proof of the reality of the eigenvalues.

Theorem 7.10 The eigenvalues of a self-adjoint (real) linear transformation are all real.
Proof: Let L be a self-adjoint linear transformation on some inner product space with eigenvalue λ and corresponding eigenvector v: thus Lv = λv. (This proof is better when set in a complex vector space, but here we compromise by allowing complex eigenvalues and eigenvectors without actually giving a proper setting for them.)
• Take the complex conjugate of this equation to deduce L̄v̄ = λ̄v̄, where the over bar denotes complex conjugation. Since L is real, L̄ = L. Thus λ̄ and v̄ must also be an eigenvalue and eigenvector of L.
• Now consider 〈v̄, Lv〉. On the one hand,
〈v̄, Lv〉 = 〈v̄, λv〉 as v is an eigenvector of L
= λ〈v̄, v〉.
On the other hand,
〈v̄, Lv〉 = 〈Lv̄, v〉 as L is self-adjoint
= 〈λ̄v̄, v〉 as v̄ is an eigenvector of L
= λ̄〈v̄, v〉.
• Hence λ〈v̄, v〉 = λ̄〈v̄, v〉, equivalently
(λ − λ̄)〈v̄, v〉 = 0.
• Under all useful definitions of an inner product 〈v̄, v〉 ≠ 0, and indeed it is real and positive. Thus the only way the previous equation can be satisfied is if λ = λ̄. That is, the eigenvalue must be real. ♠
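The matrix version of this theorem is easy to observe numerically. A sketch in Python with NumPy; the particular random matrix is an arbitrary illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4))
S = M + M.T                          # a real symmetric, hence self-adjoint, matrix
evals = np.linalg.eigvals(S)
print(np.max(np.abs(evals.imag)))    # essentially zero: the spectrum is real
```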

This proof directly echoes that for the reality of the eigenvalues of the Sturm-Liouville problem. The only difference is the generalisation in the Sturm-Liouville problem to differential equations of the form Ly = λp(x)y for some "weight function" p(x). The general theory can of course be extended to cover such more general cases, but the details are more involved so here we do not do so.
do not do so.



Example 7.36: Observe that the linear transformation in Exercise 7.32(c) is self-adjoint, because K(x, y) = K(y, x), and you will have found that the eigenvalues are real and the particular eigenvectors you found were orthogonal.
However, the eigenvalues of the linear transformation in Exercise 7.32(a) are complex valued; this is allowed because it is not self-adjoint.

That an n × n symmetric matrix is (orthogonally) diagonalisable, because it always has n (orthogonal) eigenvectors, is mirrored by the claim of completeness made for eigenfunctions of the Sturm-Liouville problem: that a matrix has n eigenvectors means we can write any vector in ℝⁿ in terms of the eigenvectors; that the eigenfunctions are complete means that we can represent any function on the domain as a linear combination of the eigenfunctions.

Activity 7.M Do problems 1–6 from Problem Set 7.3 in Kreyszig [K, p384], and Exercises 7.42–7.44 herein.

7.4.3 Expansions in orthogonal eigenfunctions<br />

Having seen that we can obtain sets of eigenfunctions as solutions of differential equations, we now show that these can be used to produce a new representation of almost arbitrarily complicated functions. This is advantageous in many circumstances.

For an introductory example, use the Legendre polynomials to solve the ode
(1 − x²)y′′ − 2xy′ + y = 1 + x + x²,
such that y(x) is well-behaved at x = ±1. First, rewrite the right-hand side in terms of Legendre polynomials [K, p208]:
(4/3)P₀(x) + P₁(x) + (2/3)P₂(x) = 4/3 + x + (2/3)·½(3x² − 1) = 1 + x + x².
Second, try a solution in the form y = aP₀(x) + bP₁(x) + cP₂(x) for some constants a, b and c to be determined. Because Legendre polynomials satisfy (1 − x²)Pₙ′′ − 2xPₙ′ = −n(n + 1)Pₙ, the left-hand side of the ode simplifies immensely to just aP₀ − bP₁ − 5cP₂. Lastly, equate coefficients of the Legendre polynomials on the two sides of the equation to deduce a = 4/3, b = −1 and c = −2/15. Hence the solution is
y = (4/3)P₀(x) − P₁(x) − (2/15)P₂(x) = 21/15 − x − (1/5)x².
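Substituting back confirms the solution. A sketch in Python using NumPy's Polynomial class (exact rational arithmetic is not needed here):

```python
import numpy as np
from numpy.polynomial import Polynomial as P

x = P([0, 1])
y = P([21/15, -1, -1/5])    # y = 21/15 - x - x^2/5
lhs = (1 - x**2) * y.deriv(2) - 2 * x * y.deriv(1) + y
print(lhs.trim(1e-12).coef)   # [1. 1. 1.], i.e. 1 + x + x^2
```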

Of course, here this could have been obtained more straightforwardly by simply guessing this polynomial form. But the approach introduced here is much more general, as seen below.

Reading 7.N Study Kreyszig §4.8 [K,pp240–6].



Activity 7.O Do problems from Problem Set 4.8 [K,pp246–7]. Send in to<br />

the examiner for feedback at least Q3 & 7.<br />

Now generalise the earlier example.<br />

Theorem 7.11 The solution to the ode [r(x)y′]′ + [q(x) + µp(x)]y = f(x), subject to some homogeneous boundary conditions, for some constant µ may be written
y = ∑_m [〈f/p, y_m〉 / (µ − λ_m)] y_m(x),
provided µ ≠ λ_m, where λ_m are the eigenvalues and y_m(x) are the orthonormal eigenfunctions of the associated Sturm-Liouville problem.
Proof: Try a solution in the form y(x) = ∑_m a_m y_m(x). Because [r y_m′]′ + q y_m = −λ_m p y_m, the ode becomes
∑_m a_m (µ − λ_m) p(x) y_m(x) = f(x).
Multiply by y_n(x) for any n and integrate over the domain, say [a, b], to deduce
∑_m a_m (µ − λ_m) ∫_a^b p(x) y_m(x) y_n(x) dx = ∫_a^b f(x) y_n(x) dx.
The right-hand side is identical to the inner product, 〈f/p, y_n〉, of f/p and y_n. Because the eigenfunctions are orthonormal with weight p(x), the integral on the left-hand side is 1 if m = n and 0 otherwise. Thus the equation simplifies to a_n(µ − λ_n) = 〈f/p, y_n〉, from which we deduce a_n = 〈f/p, y_n〉/(µ − λ_n) for all n, provided µ ≠ λ_n. ♠

Example 7.37: Use eigenfunction expansion to solve the ode y ′′ + 2y = 1<br />

such that y(0) = y(π) = 0.<br />

Solution: First find the eigenfunctions <strong>of</strong> the associated problem: y ′′ +<br />

λy = 0 such that y(0) = y(π) = 0. Fortunately this is well known to<br />

us: the eigenvalues are √λ n = n 2 and the complete set <strong>of</strong> orthonormal<br />

eigenfunctions are y n = 2/π sin nx.<br />

Second, write the right-hand side, here f(x) = 1 for 0 < x < π, in terms
of the eigenfunctions. From Example 1 [K, pp. 241–2] we know that we can
do this as

    f(x) = 1 = (4/π)(sin x + (1/3) sin 3x + (1/5) sin 5x + · · ·)
    for 0 < x < π .

Lastly, substitution shows that a solution expressed as a sum of the
eigenfunctions just involves dividing each term appearing above by the
corresponding 2 − λ_n; since λ_n = n² for sin nx, the divisors are 1, −7
and −23, and so

    y(x) = (4/π)(sin x − (1/21) sin 3x − (1/115) sin 5x − · · ·) .
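It is worth checking those coefficients numerically. The sketch below (Python, standard library only; the closed-form comparison solution, found by elementary means as y = 1/2 + A cos √2x + B sin √2x fitted to the boundary conditions, is our own cross-check, not part of the study book) sums the series and compares it with the direct solution of y″ + 2y = 1, y(0) = y(π) = 0:

```python
import math

def y_series(x, terms=200):
    # eigenfunction expansion with mu = 2 and lambda_n = n^2:
    # y = (4/pi) * sum over odd n of sin(n x) / (n * (2 - n^2))
    s = sum(math.sin(n * x) / (n * (2 - n * n))
            for n in range(1, 2 * terms, 2))
    return 4 / math.pi * s

def y_exact(x):
    # direct solution of y'' + 2y = 1 with y(0) = y(pi) = 0:
    # y = 1/2 - (1/2) cos(sqrt(2) x) + B sin(sqrt(2) x)
    r = math.sqrt(2)
    B = (math.cos(r * math.pi) - 1) / (2 * math.sin(r * math.pi))
    return 0.5 - 0.5 * math.cos(r * x) + B * math.sin(r * x)

# the partial sum of the expansion matches the exact solution
for x in (0.5, math.pi / 2, 2.5):
    assert abs(y_series(x) - y_exact(x)) < 1e-4
```

The terms decay like 1/n³, so a couple of hundred terms already agree with the exact solution to several decimal places.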


Example 7.38: In Example 7.35 we computed eigenvalues, eigenvectors
(eigenfunctions) and normalised adjoint eigenvectors for the linear
operator Ly = −d²y/dx² with boundary conditions y(0) = 0 and
y′(π) = (1/2) y′(0). Write down the formal eigenfunction expansion of the
solution to the problem −y″ = h(x) with the same boundary conditions.

Solution: We may expand any function in terms of the eigenfunctions as
h(x) = ∑_{j=1}^∞ ⟨w_j, h⟩ v_j(x). Then the formal solution to −y″ = h(x)
is

    y = ∑_{j=1}^∞ (⟨w_j, h⟩ / λ_j) v_j(x) .
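The finite-dimensional analogue makes this formula concrete. Assuming the eigen-data of the matrix [[0, 1], [−6, 5]] from Exercise 7.39(a) (confirmed in the answers: λ = 2, 3, right eigenvectors v₁ = (1, 2), v₂ = (1, 3), and left eigenvectors w₁ = (3, −1), w₂ = (−2, 1), which happen to satisfy w_j · v_j = 1), the sketch solves Ax = h via x = ∑_j (w_j · h)/λ_j v_j:

```python
# Finite-dimensional analogue of the eigenfunction-expansion solution:
# with right eigenvectors v_j and left eigenvectors w_j normalised so
# that w_j . v_k = delta_jk, the solution of A x = h is
#   x = sum_j (w_j . h) / lambda_j * v_j .
A = [[0, 1], [-6, 5]]          # matrix of Exercise 7.39(a)
lams = [2.0, 3.0]
vs = [(1.0, 2.0), (1.0, 3.0)]  # right eigenvectors
ws = [(3.0, -1.0), (-2.0, 1.0)]  # left eigenvectors, w_j . v_j = 1

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(M, x):
    return tuple(dot(row, x) for row in M)

def solve(h):
    # accumulate the expansion term by term
    x = [0.0, 0.0]
    for lam, v, w in zip(lams, vs, ws):
        c = dot(w, h) / lam
        x = [xi + c * vi for xi, vi in zip(x, v)]
    return tuple(x)

h = (1.0, 1.0)
x = solve(h)
assert all(abs(a - b) < 1e-12 for a, b in zip(matvec(A, x), h))
```

Replacing the dot product by the inner product ⟨w_j, h⟩, and the sum over two eigenvectors by an infinite sum, gives exactly the formal solution of the example.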

You should wonder what occurs if the possibility of dividing by a zero
µ − λ_m eventuates in the formal solution of Theorem 7.11. Just as for the
solution of linear algebraic equations, such division by zero indicates
that the Sturm-Liouville differential operator is "singular". Thus if for
any m it happens that µ − λ_m = 0, then either the ode is inconsistent,
indicated by the inner product ⟨f/p, y_m⟩ ≠ 0, or the ode is consistent,
when ⟨f/p, y_m⟩ = 0, and the solution can include an arbitrary multiple of
the corresponding eigenfunction, that is, Ay_m(x) could be added for any
constant A.
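A two-by-two matrix illustrates the alternative. With A = diag(2, 3) and µ = 2 (our own toy example, not from the study book), (A − µI)x = f has no solution unless f is orthogonal to the eigenvector e₁ = (1, 0) for λ = 2, and when it is, any multiple of e₁ may be added to the solution:

```python
# Matrix analogue of the singular case mu = lambda_m: take A = diag(2, 3)
# and mu = 2 equal to the first eigenvalue.  Then (A - mu*I) x = f reads
#   0 * x1 = f1 ,   1 * x2 = f2 ,
# so the system is inconsistent unless <f, e1> = f1 = 0.
mu = 2.0

def residual(x, f):
    # components of (A - mu*I) x - f for A = diag(2, 3)
    return ((2.0 - mu) * x[0] - f[0], (3.0 - mu) * x[1] - f[1])

f_bad = (1.0, 4.0)   # <f, e1> = 1 != 0: row one can never be satisfied
assert all(residual((a, 4.0), f_bad)[0] != 0.0 for a in (0.0, 1.0, -5.0))

f_ok = (0.0, 4.0)    # consistent: x = (a, 4) solves it for ANY constant a,
for a in (0.0, 1.0, -5.0):   # i.e. an arbitrary multiple of e1 is free
    assert residual((a, 4.0), f_ok) == (0.0, 0.0)
```

This is the Fredholm alternative in miniature: solvability requires the forcing to have no component in the singular eigenspace.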



7.4.4 Exercises

Ex. 7.39: Find the eigenvalues, eigenvectors and left-eigenvectors of the
following matrices, and then verify the orthogonality between eigenvectors
and left-eigenvectors:

    (a)  [  0   1 ]
         [ −6   5 ]

    (b)  [ 11  −6 ]
         [ 18 −10 ]

    (c)  [  0   1   3 ]
         [  1   6   9 ]
         [ −1  −5  −8 ]

Ex. 7.40: Deduce the adjoint eigenfunctions that correspond to the nonzero
eigenvalues of the linear operators in Exercise 7.32 (b) and (c). Where
appropriate, verify the orthogonality among the eigenfunctions and these
adjoint eigenfunctions.

Ex. 7.41: Prove that if L is a self-adjoint linear transformation in some
inner product, then eigenvectors from different eigenspaces are orthogonal.

Ex. 7.42: Show that the differential operator Ly = d²/dx²[r(x) d²y/dx²]
such that y(0) = y′(0) = y(L) = y′(L) = 0 is self-adjoint and hence deduce
it has real eigenvalues and a complete set of eigenfunctions. For your
interest, note that the ode Ly = h(x) with these boundary conditions
describes the deflection under a distributed load h(x) of a beam of
varying shape, encoded by r(x), with clamped ends.

Ex. 7.43: Show that the eigenfunctions of the Sturm-Liouville system

    −y″ = λy ,   y(0) = 0 ,   y(1) − 2y′(1) = 0 ,

are sin(√λ x), where the eigenvalues are the positive solutions to
tan √λ = 2√λ. By sketching a graph show that λ_j ≈ (2j − 1)²π²/4 for
large j. Why do we not need to worry about the possibility of complex
eigenvalues?
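A quick numerical check complements the graphical argument. The sketch below (Python; the bisection bracket (jπ, (2j + 1)π/2) comes from the graph you are asked to sketch, and with this labelling of the roots the asymptote reads √λ_j ≈ (2j + 1)π/2, the same statement as the exercise's with the index shifted by one):

```python
import math

def g(u):
    # tan(u) = 2u rewritten as g(u) = sin(u) - 2u*cos(u) = 0
    # to avoid the poles of tan
    return math.sin(u) - 2 * u * math.cos(u)

def nth_root(j, tol=1e-12):
    # The j-th positive root of tan(u) = 2u lies in (j*pi, (2j+1)*pi/2),
    # where tan increases from 0 towards +infinity past the line 2u;
    # g changes sign on that bracket, so bisect.
    lo, hi = j * math.pi + 1e-9, (2 * j + 1) * math.pi / 2 - 1e-9
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# each root satisfies the eigenvalue condition tan(sqrt(lambda)) = 2 sqrt(lambda)
u1 = nth_root(1)
assert abs(math.tan(u1) - 2 * u1) < 1e-6

# and sqrt(lambda_j) approaches the asymptote (2j + 1)*pi/2 as j grows
u20 = nth_root(20)
assert abs(u20 / ((2 * 20 + 1) * math.pi / 2) - 1) < 1e-3
```

The roots sit just below the asymptotes of tan, which is exactly what the sketch shows.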

Ex. 7.44: Similarly find the eigenfunctions of the Sturm-Liouville system

    −y″ = λy ,   y(0) = y′(0) ,   y(π) = 0 ,

and approximate values for the eigenvalues.

Ex. 7.45: A linear operator L is defined by

    Lf = f″ + 4f′ + 3f ,   where f(0) = f′(1) = 0 ,

on a vector space with inner product ⟨f, g⟩ = ∫_0^1 fg dx.

(a) Find the adjoint of L.

(b) Show that f_n(x) = e^(−2x) sin(ω_n x) are eigenfunctions of L
    provided ω_n = 2 tan(ω_n). (For example, ω₁ = 4.2748, ω₂ = 7.5965,
    etc.) What are the corresponding eigenvalues?
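Part (b) can be checked numerically before attempting the algebra. The sketch below (Python) locates ω₁ by bisection on ω cos ω − 2 sin ω = 0, an equivalent form of ω = 2 tan ω that avoids the pole of tan, then confirms the boundary conditions and that Lf₁ = −(1 + ω₁²) f₁ by central differences; the eigenvalue −(1 + ω_n²) used here is our own computed value, and deriving it is the point of the exercise:

```python
import math

def omega1(tol=1e-12):
    # bisection for the first root of omega = 2*tan(omega) beyond pi,
    # written as h(w) = w*cos(w) - 2*sin(w) = 0 to avoid the tan pole
    lo, hi = math.pi, 1.5 * math.pi
    h = lambda w: w * math.cos(w) - 2 * math.sin(w)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if h(lo) * h(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

w = omega1()                      # about 4.2748, as the exercise states
f = lambda x: math.exp(-2 * x) * math.sin(w * x)

# boundary conditions: f(0) = 0 and f'(1) = 0
assert f(0) == 0
dfdx1 = math.exp(-2) * (-2 * math.sin(w) + w * math.cos(w))
assert abs(dfdx1) < 1e-9

# L f = f'' + 4f' + 3f should equal -(1 + w^2) f; check by differences
def Lf(x, h=1e-5):
    d1 = (f(x + h) - f(x - h)) / (2 * h)
    d2 = (f(x + h) - 2 * f(x) + f(x - h)) / h ** 2
    return d2 + 4 * d1 + 3 * f(x)

for x in (0.2, 0.5, 0.9):
    assert abs(Lf(x) - (-(1 + w * w) * f(x))) < 1e-4
```

Such a finite-difference check is a useful habit: it catches sign slips in adjoint and eigenvalue calculations before they propagate.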


7.4.5 Answers to selected Exercises

7.8  (a) yes; (b) no; (c) yes; (d) yes; (e) no; (f) yes.

7.25 The adjoint is rotation by −θ.

7.27 L†f = ∫_a^b w(y) w(x)^(−1) K(y, x) f(y) dy.

7.28 (a) L†f = −df/dx such that f(0) + 2f(1) = 0; (b) L†f = f″ − 3f′ + 4f
     such that f′(1) = 0 and f(0) = 0; (c) L†f = −f‴ − f′ such that
     f(0) = 0, f(1) + 2f′(0) = 0 and 3f″(1) − f′(1) + 3f(1) = 0.

7.29 (a) L†f = −df/dx such that f(1) = 3f(0); (b) L†f = f″ − 2f′ + f such
     that f(−1) = f(1) = 0; (c) L†f = −f‴ such that
     f(0) = f(1) + f′(0) = 0 and f″(1) = f′(1).

7.31 (a) λ = 1, λ = 2; (b) λ = 1, λ = 3; (c) λ = −1.5, λ = 1;
     (d) λ = −0.8, λ = 1.8.

7.32 (a) λ = ±i/√3 with eigenfunctions f(x) = 2x + (−1 ± i/√3);
     (b) λ = b − a and the eigenfunctions are f(x) = e^x; (c) λ = ±π/2
     corresponding to eigenfunctions cos x and sin x.

7.39 (a) λ = 2 and 3, v₁ = (1, 2), v₂ = (1, 3), w₁ = (3, −1) and
     w₂ = (−2, 1); (b) λ = 2 and −1, v₁ = (2, 3), v₂ = (1, 2),
     w₁ = (2, −1) and w₂ = (−3, 2); (c) λ = 1, −1 and −2,
     v₁ = (−1, 2, −1), v₂ = (2, 1, −1), v₃ = (−1, −1, 1), w₁ = (0, 1, 1),
     w₂ = (1, 2, 3) and w₃ = (1, 3, 5).

7.40 (b) w(x) = e^(−x); (c) w₁(x) = (2/π) cos x and w₂(x) = (2/π) sin x.


7.5 Summary

• A vector space with the operations of vector addition and scalar
  multiplication is the foundation for the study of transformations in
  both finite and infinite dimensions (§7.1.1). The ten axioms of a
  vector space are: closure and associativity of vector addition and
  scalar multiplication; commutativity of vector addition; distributivity
  of scalar multiplication over vector addition and of scalar
  multiplication over scalar addition; the existence of a zero vector and
  a negative; and the identity of scalar multiplication by 1.

• A subset of a vector space is a subspace if it is closed under vector
  addition and scalar multiplication.

• The dimension of a vector space is the maximum number of linearly
  independent basis vectors.

• An inner product imbues a vector space with distances, lengths and
  angles (§7.1.2). The three axioms of an inner product (Defn. 7.3) are
  linearity, symmetry and positivity. Inequalities, familiar from plane
  geometry, follow in general.

• Distances and lengths are given by the norm ‖u‖ = √⟨u, u⟩ (Defn. 7.4).
  The angle θ between two vectors is given by
  cos θ = ⟨u, v⟩/(‖u‖ ‖v‖); they are orthogonal if the inner product is
  zero (§7.1.2).


• Matrix multiplication, differential and integral operators are examples
  of linear transformations (§7.2.1).

• A linear transform (operator) is neatly complemented by its adjoint
  (§7.2.2), defined by ⟨u, Lv⟩ = ⟨L†u, v⟩ (Defn. 7.7). A self-adjoint
  transformation, such as the Sturm-Liouville operator, generalises the
  concept of a symmetric matrix.

• The spectrum of a linear transformation and its adjoint are the same,
  and the adjoint or left-eigenvectors are orthogonal to the ordinary
  eigenvectors (§7.4.1). This allows them to be used to extract the
  component of each eigenvector or eigenfunction in any given vector or
  function.

• Thus the eigenvectors of a self-adjoint linear transformation
  (symmetric matrix) are necessarily orthogonal and complete (§7.4.2).
  Solutions of Sturm-Liouville systems provide an important example of
  this property.

• The solution to the ode [r(x)y′]′ + [q(x) + µp(x)]y = f(x), subject to
  some homogeneous boundary conditions, for some constant µ may be
  written

      y = ∑_m ⟨f/p, y_m⟩/(µ − λ_m) y_m(x) ,

  provided µ ≠ λ_m, where λ_m are eigenvalues and y_m(x) are the
  orthonormal eigenfunctions of the associated Sturm-Liouville problem.
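These inner-product notions are concrete enough to compute. A minimal sketch (Python, with the Euclidean dot product standing in for a general inner product; the particular vectors are our own illustration) exercises the norm, the angle formula, and the inequalities named above:

```python
import math

def inner(u, v):
    # the Euclidean dot product; any rule satisfying linearity,
    # symmetry and positivity would serve as an inner product
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    # ||u|| = sqrt(<u, u>)
    return math.sqrt(inner(u, u))

u, v = (1.0, 2.0, 2.0), (2.0, 0.0, -1.0)

# angle from cos(theta) = <u, v> / (||u|| ||v||); here <u, v> = 0,
# so u and v are orthogonal and theta = pi/2
cos_theta = inner(u, v) / (norm(u) * norm(v))
theta = math.acos(cos_theta)
assert abs(theta - math.pi / 2) < 1e-12

# the Cauchy-Schwarz and triangle inequalities follow from the axioms
assert abs(inner(u, v)) <= norm(u) * norm(v) + 1e-12
w = tuple(a + b for a, b in zip(u, v))
assert norm(w) <= norm(u) + norm(v) + 1e-12
```

Swapping `inner` for a weighted integral turns the same few lines into a check on function spaces.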


Index

:=, 242
array, 66, 69
article, 53
author, 54
begin-end, 243
bye, 242
caption, 77
df, 221, 243
displaymath, 63
documentclass, 53
document, 53
end, 243
eqnarray, 68, 69
equation, 63
factorial, 243
factor, 215
for, 213, 243
graphicx, 76
includegraphics, 77
int, 213, 243
in, 244
left, 64
let, 243
maketitle, 54
mbox, 66
nonumber, 69
paragraph, 58
quad, 66
quit, 242
repeat-until, 226, 243
right, 64
section, 56
title, 54
until, 243
usepackage, 76
write, 215, 243


absolute convergence, 143
accents, 73
Achilles, 133
adjoint, 274, 274–283, 292, 298, 299, 313
adjoint eigenfunction, 308
adjoint eigenvectors, 292, 296, 307
age structured populations, 108
Airy's equation, 239
alternating harmonic series, 143
ampersand, 59
analytic, 191, 193
analytic function, 151, 167
angle, 262, 265, 266, 266, 282
array, 67
associativity, 256–261, 312
autonomous, 35
Babbage, 206
basis, 256, 258
Bessel function, 202, 203, 234, 250
Bessel functions of the first kind, 201
Bessel's equation, 201, 202, 237, 250
brace, 59, 60, 64
bracket, 64
bulleted list, 61
car traffic, 93, 94, 100, 105
caret, 59, 60
case statement, 67
Cauchy's convergence, 142
Cauchy-Schwarz inequality, 265, 266, 267, 272
characteristic diagram, 102, 120
characteristic equation, 284
circular geometry, 195
closed, 256, 257, 259–262, 312
coefficients, 148, 191, 195, 197, 198, 220, 304
command definition, 74, 75
commutativity, 256, 257, 259, 260, 312
comparison test, 145
complete, 293, 303, 306, 308
complex conjugate, 301
computer algebra, 206, 207, 211, 214, 231, 233, 234, 240–242
conditional convergence, 143
conservation of mass, 96


conservation of momentum, 125
continuity equation, 96–98, 108, 109, 117, 125
converges, 134
critical point, 17
critical points, 23
definition, 46
definition, command, 74
degenerate case, 18, 19
delimiter, 64
diagonalisation, 291
dimension, 256, 312
dimensionality, 256, 258
directional derivatives, 172
displayed equations, 47
displayed mathematics, 63
distance, 262, 265, 265
distributivity, 256, 257, 261, 312
dollar, 59
dot product, 262
dots, 68
eigenfunction, 246, 251, 285, 287–289, 296, 297, 299, 303, 305–309, 313
eigenfunction expansion, 291, 306, 307
eigenspace, 308
eigenvalue, 37, 246, 251, 284–288, 290–292, 294, 296, 301–303, 305–309, 313
eigenvector, 37, 246, 284, 285, 287, 290–296, 301, 303, 307, 308
elastic artery, 124
elementary functions, 71
ellipsis, 48
empty set, 259
equation of state, 119
equilibrium, 106
Euler, 86
Eulerian description, 86
even powers, 149, 150
example, 46
exponential, 31, 296
extrema, 168, 170, 171, 173
figure, 75, 76
fixed point, 17–19, 23–26, 28, 31, 35–37, 106
float, 76


font, 53, 60
fraction, 61
Frobenius method, 196, 198, 201, 250
geometric series, 145
global maximum, 170
global minimum, 170
harmonic series, 139, 140, 143, 144
hash, 59
Hessian, 174
Hessian matrix, 174, 175, 179, 180
higher dimensions, 18, 19
html, 50
ideal gas, 119
identity, 256, 258, 261, 312
in-line mathematics, 59, 61, 73
indicial equation, 197, 199, 234, 235, 237, 250
infinite series, 134
infinite sum, 134
initial conditions, 212, 215, 223–225, 229, 231
inner product, 262, 263, 263, 265, 268, 272, 274–280, 282, 283, 305, 312
inner product space, 263, 265–267, 274
install LaTeX, 51
isoclines, 24, 24, 31
iteration, 211, 212, 217, 220–223, 225, 230, 235, 236, 239
iterative construction, 251
Jacobi, 37
Jacobian, 35, 37, 37, 38
Lagrange's remainder, 159, 162
Lagrangian particle paths, 87
LaTeX, 49
LaTeX, document, 53
LaTeX, install, 51
LaTeX, web access, 52
least-squares solution, 273
left-eigenvectors, 292, 293–295, 301, 308
Legendre polynomials, 194, 249, 258, 304


Legendre's equation, 193, 223, 224, 240, 249
length, 265
line breaks, 56
linear combination, 256, 258, 289, 293, 295
linear independence, 256
linear operator, 270, 271, 274, 276, 291, 296, 307, 308
linear transform, 270, 282, 285, 288, 291, 301, 303, 308
linear transformation, 269, 270, 313
linearisation, 24, 24, 28, 100, 112, 119, 125, 233
linearity, 263, 312
linearly independent, 258
list environment, 61
local maximum, 170, 172, 173, 175, 177, 180
local minimum, 172, 173, 177, 182
logarithm, 234
Maclaurin series, 153, 211, 212, 214, 216, 217, 221–223, 229, 232, 239–241
mass on a spring, 7
material derivative, 86, 87, 91, 92
mathematical functions, 48, 71
mathematics environment, 60, 66, 68
method of characteristics, 100, 101, 119
method of undetermined coefficients, 290
momentum equation, 115, 117, 125
multiplicity, 284
negative, 256, 259, 261, 312
negative definite, 174, 177
negative space, 65, 66
nonlinear differential equations, 23, 207
nonlinear ode, 217, 217, 221, 229, 233, 240, 241
norm, 265, 265, 312
normal space, 65
notation, 46
numbered list, 62
odd powers, 149


orbit, 11
order of, 221, 230, 233
orthogonal, 245, 246, 249, 251, 265, 266, 266, 289, 292, 293, 299, 303, 308, 312
paragraphs, 56
parallelogram equality, 265
parentheses, 64
partial derivative command, 74
partial sums, 134, 134, 135, 137, 138, 156
percent, 59
phase plane, 11, 11, 14, 24, 217
phase portrait, 11, 23
population model, 108
positive definite, 174, 174, 177
positivity, 263, 312
postscript, 76
power series, 147, 148, 155, 190, 191, 193, 196–198, 201, 211–213, 217, 218, 220–222, 230, 233, 234, 237, 241, 242, 249, 251
power series method, 189, 195, 216
preamble, 76
proof, 46
proof by contradiction, 139
punctuate, 47, 64
Pythagoras theorem, 266
quad space, 65
quadratic form, 174, 174
quadratic polynomials, 256
radius of convergence, 148, 151, 153, 166
range, 270, 271
ratio test, 145, 145, 149, 150
redefine, 74
reduce, 207, 242
regular point, 200, 201, 249
relations, 63
residual, 221, 224, 224, 226, 229, 230, 233, 235, 239, 251
Rolle's theorem, 161
root test, 145
rubber band, 87
saddle point, 27, 28, 31, 172, 172, 173, 175, 177, 179, 181


scalar multiplication, 256, 256–258, 260, 261, 267, 282, 312
Schwarz inequality, 265
sections, 56
self-adjoint, 274, 274, 277–282, 298, 300, 301, 303, 308, 313
sequence, 130, 132, 134, 135, 138, 142
set union, 259
shear transformation, 275
shift summation indices, 191
singular, 249, 307
singular point, 200, 234, 249
slosh, 59
sonic boom, 119
sound, 119
space, 65
special functions, 249, 250
spectrum, 284, 292, 299
square integrable, 259, 259–261, 264, 271, 273
stability, 17, 17, 19, 37
stable, 17, 19, 21
stationary points, 171, 172, 178, 179, 181
structure, 195, 245
Sturm-Liouville, 302, 303
Sturm-Liouville equation, 246, 251, 280
Sturm-Liouville operators, 280, 281
Sturm-Liouville problem, 265, 289, 291, 302, 305, 313
Sturm-Liouville theory, 245
subscript, 59, 60, 72
subspace, 261, 268, 274, 278–280, 283, 284, 312
superscript, 59, 60, 72
symbols, 47, 60, 79
symmetric matrix, 274, 301, 303
symmetry, 263, 312
tabular format, 66
tangent plane, 172
Taylor polynomial, 156, 157–159, 161, 162
Taylor series, 152, 153, 153, 156, 162, 169, 180, 191, 200, 201, 221
201, 221


Taylor's series, multivariable, 166
Taylor's theorem, 166, 173
telescopic sum, 135
telnet, 208
testing, 234
thin space, 65, 66, 71
tilde, 59
trajectory, 11, 11, 15
triangle inequality, 265, 267
truncation error, 156, 156
underscore, 59, 60
uniqueness, 154, 191, 249
unit vector, 265
unstable, 17, 19–21, 28, 42
vector addition, 256, 259, 261, 312
vector space, 255, 256, 258–263, 265–267, 269, 271, 282, 291, 312
vector subspace, 261, 261, 262
wave equation, 119, 124
wave speed, 101
Zeno of Elea, 130
Zeno's Second Paradox, 133
zero vector, 256, 257, 259, 260, 312
