MAT2101
Applied Mathematics
Faculty of Sciences
Electronic Study Book

Written by
Tony Roberts, David Mander & Tim Passmore
Department of Mathematics & Computing
Faculty of Sciences
The University of Southern Queensland
Preface

This unit emphasises developing applications side by side with the development of mathematical concepts and techniques. Please let us know of any errors in the study book as soon as you suspect them. This feedback will improve our unit year by year.

Some parts of the unit are in the mainstream, and other parts are included for a richer picture. As you read you will see that we have endeavoured to convey the importance of the various concepts and sections. For example, concepts and formulae in the "aims" and the summaries are the most essential. In support of this, the reading you have been asked to do has been classified by requests to "study", "read" or "peruse", in order of decreasing importance.

For your convenience we have in places suggested specific problems that you should try, and send your answers in to us for feedback. These problems are a minimum that you should be able to do immediately. Our feedback will help you learn the more difficult aspects of the course. Ensure you make use of us: send in your work by post, by fax, or perhaps by e-mailing scanned work.
Associated with this study guide are Matlab scripts to enhance your ability to probe the problems and concepts, and thus to improve learning.

As part of our commitment to the highest quality of teaching we also provide this study guide in electronic format. Note several aspects of the electronic form:

• the electronic form is displayed using Adobe's Acrobat Reader (with bookmarks);
• for electronic convenience the page size is different, and so the page numbering differs from the printed version;
• clickable links allow rapid navigation around the electronic document, making it easier to connect widespread parts of the unit;
• some links to outside material have also been encoded.

Information about mathematical figures in the history of the topics has been gleaned from various sources including
http://www-groups.dcs.st-and.ac.uk/~history/index.html
Reading 0.A Now read Chapters 1 and 2 in Kreyszig to refresh your memory of aspects of differential equations that were introduced in your previous mathematics.
Table of Contents

Preface

I  Modelling dynamics with differential equations
   1  Systems of differential equations
   2  Scientists must write
   3  Describing the conservation of material
   4  The dynamics of momentum

II Structure, algebra and approximation of applied functions
   5  The nature of infinite series
   6  Series solutions of differential equations give special functions
   7  Linear transforms and their eigenvectors on inner product spaces
Part I

Modelling dynamics with differential equations
Part contents

1  Systems of differential equations
   1.1  Systems of linear differential equations
   1.2  Qualitative solution of nonlinear, first-order systems of ode's
   1.3  Summary

2  Scientists must write
   2.1  Basics of mathematical writing
   2.2  LaTeX

3  Describing the conservation of material
   3.1  Eulerian description of motion
   3.2  Conservation of mass
   3.3  Car traffic
   3.4  Summary

4  The dynamics of momentum
   4.1  Conservation of momentum
   4.2  Dynamics of ideal gases
   4.3  Equations of quasi-one-dimensional blood flow
   4.4  Summary
Module 1

Systems of differential equations

In the 17th century Isaac Newton published his famous universal laws of motion which, in essence, showed how physical systems could be described by differential equations. The motions of planets, falling apples, billiard balls and flying arrows could all be described in terms of the forces acting to produce changes in motion.

During the last 25 years we have seen how scientists, using these same laws of motion and computers to solve complex systems of differential equations, have been able to navigate the Voyager spacecraft, with amazing precision, to rendezvous in space with Jupiter, Saturn and the outer planets of our solar system. Given the governing differential equations and a set of initial conditions, the future motion can be predicted.

In this module we use differential equations to model physical systems and describe and predict their behaviour under a variety of conditions.
Module contents

1.1  Systems of linear differential equations
     1.1.1  Case study: the motion of a mass on a spring
     1.1.2  Conversion of the order of differential equations
     1.1.3  The phase plane and phase portrait of the mass-spring system
     1.1.4  Trajectories in the phase plane of a linear system
     1.1.5  Classification and stability of fixed points
1.2  Qualitative solution of nonlinear, first-order systems of ode's
     1.2.1  Linearisation using the Jacobian
     1.2.2  Answers to selected exercises
1.3  Summary
The text for this module is Chapter 3 in Kreyszig, Advanced Engineering Mathematics, 8th ed., Wiley. References to the text use the format [K, reference].

Main aims:

• to write differential equations as a system of first-order differential equations;
• to classify general solutions near any fixed point or equilibrium;
• to predict the qualitative nature of solutions near fixed points or equilibria;
• to introduce the technique of linearisation;
• to patch together the pictures near each fixed point to obtain a global understanding of the solutions.
1.1 Systems of linear differential equations

You have solved some ordinary differential equations (ode's) in first year mathematics; these differential equations and their solutions are often used to describe the motion of some mechanical or otherwise evolving system. For example, the motion of a mass on a spring is discussed briefly next in §1.1.1. However, for many purposes it is much better to recast a differential equation as a system of first-order differential equations. For example, this is necessary to analyse "chaos" (a topic explored in the fourth year course mat4102). In this section we lay the foundations for the analysis of systems of differential equations.

1.1.1 Case study: the motion of a mass on a spring

Kreyszig shows [K, pp158–9] that the motion of a mass attached to a spring, if there are no friction or damping forces, is governed by the single second-order ode

    m y'' = −k y ,    (1.1)

where y = y(t) is the displacement at time t of the mass from its rest position where the spring is unstretched, y' = dy/dt, and k and m are constants, m being the mass and k describing the 'stiffness' of the spring. This equation comes directly from Newton's second law, that applied force = mass × acceleration: the minus sign says that the force of the spring opposes the motion of the mass, and the acceleration is y''.
From first-year mathematics we know that this ode may be re-written as

    y'' + (k/m) y = 0 ,

and its general solution is

    y(t) = A1 cos(√(k/m) t) + A2 sin(√(k/m) t) ,    (1.2)

for constants A1 and A2. This solution describes an unending oscillation in time with constant angular frequency ω = √(k/m).
1.1.2 Conversion of the order of differential equations

To illustrate the approach we take in general, the second-order differential equation (1.1) describing a spring is here re-written as a system of two first-order equations. Introduce two new variables y1 and y2 and put

    y1 = y  and  y2 = y' .

Then using (1.1) we also describe the motion of the spring by the first-order system

    y1' = y2 ,
    y2' = −(k/m) y1 .
In matrix form, with ω = √(k/m),

    [y1']   [ 0    1] [y1]
    [y2'] = [−ω²   0] [y2] ,   or   y' = A y ,    (1.3)

where

    y = (y1, y2)   and   A = [ 0    1]
                             [−ω²   0] .

It is convenient to use the angular frequency ω instead of k/m in the matrix formulation. Similarly, many higher order differential equations are reduced to first-order systems.

Reading 1.A Kreyszig Chapter 3: read §3.0, then study §3.1 [K, p152–8] on modelling with systems of differential equations and their solutions.
Example 1.1: rewrite the following ode as a first-order system:

    y''' + 7y'' − 4y' + 8y = 0 .

Solution: Define new variables

    y1 = y ,  y2 = y'  and  y3 = y'' ,

then

    y1' = y2 ,
    y2' = y3 ,
    y3' = −8y1 + 4y2 − 7y3 ,

which is written in matrix form as

    [y1']   [ 0   1   0] [y1]
    [y2'] = [ 0   0   1] [y2] .
    [y3']   [−8   4  −7] [y3]
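The pattern of Example 1.1 generalises: the coefficient matrix is the so-called companion matrix of the ode, with ones on the superdiagonal and the negated coefficients in the last row. A minimal sketch (in Python rather than the unit's Matlab; the function name `companion` is our own, not Kreyszig's):

```python
def companion(coeffs):
    """Companion matrix for y^(n) + c[n-1] y^(n-1) + ... + c[0] y = 0.

    coeffs = [c0, c1, ..., c_{n-1}] are the lower-order coefficients,
    so the last row of the matrix is [-c0, -c1, ..., -c_{n-1}].
    """
    n = len(coeffs)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        A[i][i + 1] = 1.0            # encodes y_i' = y_{i+1}
    A[n - 1] = [-c for c in coeffs]  # last row comes from the ode itself
    return A

# y''' + 7y'' - 4y' + 8y = 0  has c0 = 8, c1 = -4, c2 = 7
A = companion([8.0, -4.0, 7.0])
for row in A:
    print(row)
# the last row is [-8.0, 4.0, -7.0], matching Example 1.1
```

The same function handles Exercise 1.2: for instance part (c), y'''' + 7y'' = 9y, is `companion([-9.0, 0.0, 7.0, 0.0])`.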
Exercise 1.2: Convert the following to first-order systems:

(a) y''' + 12y'' − 5y' + 11y = 0;
(b) y'' + αy' + cy = 0, with α and c constants;
(c) y'''' + 7y'' = 9y.

Activity 1.B Do problems from §3.1 [K, p158] and Exercise 1.2 above. Send in to the examiner for feedback at least Q9.

Reading 1.C Read §3.2 [K, p159–161] for some background theory.
1.1.3 The phase plane and phase portrait of the mass-spring system

The point of writing the spring equation (1.1) as the two-dimensional matrix system (1.3) is that we now have a 2-D description of the motion of the mass in terms of its position, y1 = y, and its velocity, y2 = y'. By plotting y2 against y1 a graph, known as a phase portrait, of the motion of the mass on the spring is made. At each point in the phase plane, illustrated by the little pictures in Figure 1.1, the mass-spring system has a specific combination of extension and velocity.

At each time, t, a single point on the phase plane is plotted corresponding to the position and velocity of the mass. Over time the system traverses a path in the phase plane, known as a trajectory or an orbit. For a given set of initial conditions a trajectory for the mass might appear as in Figure 1.1. There are two points, corresponding to the left/right extremes of the ellipse, where the velocity y2 = 0 and the displacement y1 is extreme, meaning that the mass is instantaneously at rest and the spring has reached maximum compression/extension. A moment later the mass has changed direction and is picking up speed; at the top/bottom extremes of the ellipse the mass is moving through y1 = 0 where the spring is unstretched and the speed is maximal. At other times the velocity and displacement have values intermediate between these extremes.

Since there is no friction (an ideal case) the motion just keeps repeating itself indefinitely.
[Figure 1.1: mass-spring phase plane (horizontal axis y1 = y, vertical axis y2 = y'). At each point in the phase plane a little picture displays the unique state of the system quantified by its position y = y1 and velocity y' = y2. The green ellipse shows a possible orbit or trajectory of the mass-spring system, the path through the states, over time.]
Example 1.3: We show all this by solving (1.3), which is a homogeneous linear system with constant coefficient matrix A. Kreyszig shows [K, p163, Theorem 1] that the general solution will be of the form

    y = c1 x(1) e^{λ1 t} + c2 x(2) e^{λ2 t} ,    (1.4)

where the cj are arbitrary complex constants, the λj are the eigenvalues of A, and the x(j) are the corresponding eigenvectors. The characteristic equation is

    det(λI − A) = det [ λ   −1]
                      [ω²    λ] = λ² + ω² = 0 ,

yielding λ1 = iω and λ2 = −iω. For the eigenvectors solve

    [ λ   −1] [v1]   [0]
    [ω²    λ] [v2] = [0]

to get

    x(1) = (1, iω)  for λ = λ1 = iω ,   and   x(2) = (1, −iω)  for λ = λ2 = −iω .

So the general solution to the system is

    (y1, y2) = c1 (1, iω) e^{iωt} + c2 (1, −iω) e^{−iωt} .    (1.5)
Exercise 1.4: Show by choosing

    c1 = ½(A1 − iA2)  and  c2 = ½(A1 + iA2)

in the general solution above, that you recover the solution (1.2).

Normally, the constants c1 and c2 will be chosen so that the solution (1.5) is real, in which case the plot of y1 versus y2, or of (1.2) and its derivative, will generate an ellipse.
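Before attempting the algebra of Exercise 1.4, the claim can be checked numerically. A sketch in Python (the sample values of ω, A1 and A2 are arbitrary):

```python
import cmath
import math

w = 2.0                      # sample angular frequency (arbitrary)
A1, A2 = 1.0, 0.5            # real constants of solution (1.2)
c1 = 0.5 * (A1 - 1j * A2)    # the choices suggested in Exercise 1.4
c2 = 0.5 * (A1 + 1j * A2)

for t in [0.0, 0.3, 1.1, 2.4]:
    # first component of the complex general solution (1.5)
    y_complex = c1 * cmath.exp(1j * w * t) + c2 * cmath.exp(-1j * w * t)
    # the real solution (1.2)
    y_real = A1 * math.cos(w * t) + A2 * math.sin(w * t)
    assert abs(y_complex.imag) < 1e-12           # the combination is real
    assert abs(y_complex.real - y_real) < 1e-12  # and it equals (1.2)
print("these choices of c1 and c2 recover the real solution (1.2)")
```

Note that c1 and c2 are complex conjugates of each other; that is exactly what forces the solution (1.5) to be real.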
1.1.4 Trajectories in the phase plane of a linear system

The big advantage of the phase plane is that we qualitatively see how the dynamics of a system will evolve. For example, in the mass-spring system we know

    y1' = y2  and  y2' = −ω² y1 .

That is, the rate of change of the position vector y is (y2, −ω² y1). (We use the row vector in parentheses, such as (y1, y2, ..., yn), to denote the corresponding column vector.) Thus at each point in the phase plane we can tell the direction that the system evolves by drawing an arrow as in the following plot.
[Figure: arrows showing the direction of evolution at each point of the phase plane (horizontal axis y1 = y, vertical axis y2 = y'), with a green elliptical trajectory, produced by:]

    % plot the field of evolution arrows
    [y1,y2]=meshgrid(linspace(-1,1,7));
    u=y2; v=-0.88*y1;
    quiver(y1,y2,u,v)

    % then try the following code to
    % simulate evolution of this DE
    hold on
    y1=0.05; y2=0.6; dt=0.01;
    pt = plot(y1,y2,'r*','erase','xor');
    drawnow
    for t=dt:dt:20
      dy1=y2; dy2=-0.88*y1;
      y1=y1+dt*dy1; y2=y2+dt*dy2;
      set(pt,'xdata',y1,'ydata',y2)
    end
The green curve shows the trajectory taken by the system: the set of states it goes through as time evolves. See how the evolution arrows are tangent to the trajectory, as they must point along the direction of evolution. In this subsection we look at the few different sorts of pictures generically seen in two dimensions.

Reading 1.D Study Kreyszig §3.3 [K, pp162–9]: take note of the phase plane pictures in Fig. 78–82 [K, pp165–6], and ignore the irrelevant distinction between an "improper node" and a "proper node".
Exercise 1.5: Some systems of differential equations evolve according to the vectors plotted in 2-D below. For each system, visualise the trajectories of the system and classify the origin at the centre point as either a node, saddle, centre or spiral point.

[Six vector-field plots, labelled (a)–(f).]
Activity 1.E Do exercises in Problem Set 3.3 [K, pp169–170] and Exercise 1.5 above. Send in to the examiner for feedback at least Q1 & Q10.
1.1.5 Classification and stability of fixed points

These pictures of the dynamics near the origin allow us to answer very important qualitative questions about the solutions of differential equations. In application, we principally concern ourselves with things that can be observed. Thus we need to predict what may be observed and what cannot. This is expressed via the notion of stability. Loosely, a fixed point (or critical point), the origin for linear systems, is stable if all nearby solutions stay nearby for all time and thus could be observed (a pendulum hanging downwards, for example); whereas a critical point is unstable if at least one nearby solution escapes from the neighbourhood of the critical point, and thus cannot be expected to be observed (a pencil is impossible to balance on its sharp tip, for example).

Reading 1.F Study §3.4 [K, pp170–5], especially the definition of stability and its consequences.

Kreyszig, as do many texts, writes the conditions for stability and classification in terms of the coefficients of the characteristic polynomial. While this may be slightly more convenient in 2-D, it is usually easier to remember the conditions directly in terms of the eigenvalues. This is for two reasons: the classification then proceeds systematically to higher dimensions; and it is easy to remember the details because the dynamics are simply those of exp(λj t).

Thus we urge you to classify the fixed points of two-dimensional linear systems according to the eigenvalues of their coefficient matrix, A. The results are summarised in this table.
    Eigenvalues λj              Condition                     Fixed point (0, 0)
    -------------------------   ---------------------------   ------------------
    Complex                     Re(λj) = 0 for j = 1, 2       stable centre
    (Re denotes real part)      Re(λj) > 0 for j = 1, 2       unstable spiral
                                Re(λj) < 0 for j = 1, 2       stable spiral
    Real                        λ1, λ2 > 0                    unstable node
                                λ1, λ2 < 0                    stable node
                                λ1 < 0 < λ2                   unstable saddle

However, cases not covered by the above table, the so-called degenerate cases, have to be considered on their own merits.
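The table translates directly into code. Below is a sketch in Python (the function name and tolerance are our own choices) that classifies the origin of a 2-D linear system from the trace and determinant of A; degenerate cases are merely flagged, since, as noted above, they need individual attention:

```python
import cmath

def classify_2d(a, b, c, d, tol=1e-12):
    """Classify the fixed point (0,0) of y' = A y for A = [[a, b], [c, d]].

    The eigenvalues of a 2x2 matrix are the roots of
    lambda^2 - (tr A) lambda + det A = 0.
    """
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)
    l1, l2 = (tr + disc) / 2, (tr - disc) / 2
    re1, re2 = l1.real, l2.real
    if abs(l1.imag) > tol:                      # complex-conjugate pair
        if abs(re1) < tol:
            return "stable centre"
        return "unstable spiral" if re1 > 0 else "stable spiral"
    if abs(re1) < tol or abs(re2) < tol:        # a zero eigenvalue
        return "degenerate"
    if re1 > 0 and re2 > 0:
        return "unstable node"
    if re1 < 0 and re2 < 0:
        return "stable node"
    return "unstable saddle"

# the mass-spring matrix (1.3) with omega = 2: eigenvalues +-2i
print(classify_2d(0, 1, -4, 0))   # -> stable centre
```

Applied to the system of Exercise 1.8, `classify_2d(-1, -3, 2, -3)` reports a stable spiral, a useful check on a hand sketch.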
Activity 1.G Do Problem Set 3.4 [K, p174–5]. Send in to the examiner for feedback at least Q2, Q4 & Q14.
In higher dimensions the stability of a fixed point is most easily expressed in terms of the eigenvalues of the corresponding coefficient matrix. Based upon the generic solution [K, p163]

    y = c1 x(1) e^{λ1 t} + · · · + cn x(n) e^{λn t} ,

and the behaviour of exp(λj t), we deduce:

• the fixed point y = 0 is unstable if Re(λj) > 0 for at least one j, as then at least that component exp(λj t) in the solution will grow and lead solutions away from the fixed point;

• the fixed point y = 0 is stable if Re(λj) ≤ 0 for all j, as then all the components exp(λj t) in the solution will decay or just oscillate;

• unless the exceptional degenerate case occurs where Re(λj) ≤ 0 for all j, but two or more pairs of eigenvalues with Re(λj) = 0 also have equal imaginary part, say ω; then the general solution has the growing component c t e^{iωt} from the degeneracy [K, p167], causing the fixed point to be unstable.

Note: the case Re(λj) = 0 is very delicate, as it is exactly on the dividing line between stability and instability, and hence any small effect will tip the dynamics from one to the other.

Eigenvalues in two or three dimensional problems may be calculated by hand. In higher dimensions we typically resort to computer numerics.

Example 1.6: Determine the stability of the origin in the three-dimensional systems of Problems 7–9 in Problem Set 3.3 [K, p169].
Solution: use Matlab to compute the eigenvalues as follows:

    >> a7=[10 -10 -4;-10 1 -14;-4 -14 -2];
    >> eig(a7)
    ans =
       18.0000
        9.0000
      -18.0000
    >> a8=[-3 -1 2; 0 -4 2; 0 1 -5];
    >> eig(a8)
    ans =
       -3
       -3
       -6
    >> a9=[-1 -4 2;2 5 -1;2 2 2];
    >> eig(a9)
    ans =
      -0.0000
       3.0000
       3.0000

Thus (here all the eigenvalues are real) the origin is:

• unstable in Problem 7, as at least one eigenvalue (here two) is positive;

• stable in Problem 8, as all eigenvalues are negative (the multiple eigenvalue −3 introduces the component t e^{−3t}, but this still decays);

• unstable in Problem 9, as at least one eigenvalue is positive.
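When only stability is wanted, the eigenvalues need not be computed at all: the Routh–Hurwitz criterion decides whether every root of the characteristic polynomial has negative real part directly from its coefficients. For a cubic λ³ + a2 λ² + a1 λ + a0 the conditions are a2 > 0, a0 > 0 and a2 a1 > a0. A sketch in Python for the 3×3 matrices above (the function names are ours; note this tests asymptotic stability, all Re λj < 0, so a marginal case such as a zero eigenvalue is also reported False):

```python
def char_poly_3x3(A):
    """Coefficients (a2, a1, a0) of det(lambda I - A) =
    lambda^3 + a2 lambda^2 + a1 lambda + a0."""
    (a, b, c), (d, e, f), (g, h, i) = A
    tr = a + e + i
    minors = (e*i - f*h) + (a*i - c*g) + (a*e - b*d)  # principal 2x2 minors
    det = a*(e*i - f*h) - b*(d*i - f*g) + c*(d*h - e*g)
    return -tr, minors, -det

def is_asymptotically_stable(A):
    """Routh-Hurwitz test for a cubic characteristic polynomial."""
    a2, a1, a0 = char_poly_3x3(A)
    return a2 > 0 and a0 > 0 and a2 * a1 > a0

a7 = [[10, -10, -4], [-10, 1, -14], [-4, -14, -2]]
a8 = [[-3, -1, 2], [0, -4, 2], [0, 1, -5]]
a9 = [[-1, -4, 2], [2, 5, -1], [2, 2, 2]]
print([is_asymptotically_stable(A) for A in (a7, a8, a9)])
# -> [False, True, False]
```

As a cross-check, the matrix a8 with eigenvalues −3, −3, −6 has characteristic polynomial (λ+3)²(λ+6) = λ³ + 12λ² + 45λ + 54, which indeed satisfies 12 > 0, 54 > 0 and 12 × 45 > 54.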
Exercise 1.7: A function y(t) is governed by the third-order equation

    y''' + 5y'' − 2y' − 6y = 0 .

(a) By introducing appropriate variables show how this can be expressed as a linear system of first-order ode's.
(b) Write down the general solution of the system.
(c) Describe the nature of the fixed point at (0, 0, 0).

Exercise 1.8: Prepare a phase plane diagram for the following system:

    dx/dt = −x − 3y ,
    dy/dt = 2x − 3y .

(a) Find the real solution to this system when x(0) = 4 and y(0) = 1.
(b) Sketch on your phase plane the solution curve above.
1.2 Qualitative solution of nonlinear, first-order systems of ode's

Systems which have complex or physically interesting behaviour are governed by nonlinear differential equations. It is usually impossible to solve such equations algebraically, but phase portraits can give a rough overview of what solutions look like. Near each fixed point of the system, the solution is dominated by the linear terms in the differential equations, and so for each fixed point one of the pictures examined in the previous section applies. After considering each fixed point, all the little pictures are reasonably joined together to give a global overview of the solutions.

The techniques you study here will be developed further in later modules.

Reading 1.H Study §3.5 [K, pp180–6] and all its examples, as the understanding of this section is the main purpose of this module.

Note the main steps used in the analysis of nonlinear systems:

• set up a mathematical model, converting to a system of first-order differential equations if necessary;
• determine all the fixed points (critical points) of the system;
• use linearisation to examine the dynamics near each fixed point;
• fill in the trajectories in phase space in a sensible way.

This last step is often aided by determining isoclines: the curves in the phase plane where trajectories have constant slope.
Example 1.9: Prepare a phase plane diagram of the following system: locate its fixed points and determine the nature of each fixed point by linearisation. Find the approximate linear solution near each of the fixed points. Find some isoclines. Sketch in trajectories.

    dx/dt = 3x − xy ,
    dy/dt = y − x²y .

Aside: the whole phase plot is easy enough to do with Matlab; below is the sort of picture we will work towards. However, we use mathematical analysis.
[Figure: Matlab quiver plot of the direction field over roughly −3 ≤ x ≤ 3 and −1 ≤ y ≤ 5, produced by:]

    [x,y]=meshgrid(-2:.4:2,-.8:.4:5);
    dx=3*x-x.*y;
    dy=y-x.*x.*y;
    quiver(x,y,dx,dy);
Solution:

• It is useful to know where the fixed points are before we do the phase plot, so set the right-hand sides to zero:

      3x − xy = x(3 − y) = 0  ⇒  x = 0 or y = 3 ,
      y − x²y = y(1 − x²) = 0  ⇒  y = 0 or x = ±1 .

  Putting x = 0 from the first equation forces y = 0 from the second, whereas y = 3 from the first equation forces x = ±1 from the second. So there are three fixed points: (0, 0), (1, 3) and (−1, 3). We should make sure the phase plot includes these points, as shown below.
[Figure: the direction field replotted over −3 ≤ x ≤ 3 and −1 ≤ y ≤ 4 so that all three fixed points are visible.]
• Consider each of the fixed points in turn.

  – To linearise near the fixed point (1, 3), make the change of variables

        x = 1 + X(t)  and  y = 3 + Y(t) ,

    where X(t) and Y(t) are small. Then x' = X', y' = Y' and

        X' = 3(1 + X) − (1 + X)(3 + Y)
           = −Y − XY ,
        Y' = (3 + Y) − (1 + X)²(3 + Y)
           = −6X − 3X² − 2XY − X²Y .

    Since X and Y are small, all the nonlinear quadratic and cubic terms in X and Y are negligible compared to the linear terms, and we approximate by the linear system

        [X']   [ 0  −1] [X]
        [Y'] ≈ [−6   0] [Y] .    (1.6)

    The coefficient matrix has eigenvalues λ1 = √6 and λ2 = −√6. Thus (1, 3) is a saddle point.
  – Similarly near (−1, 3),

        X' = 3(−1 + X) − (−1 + X)(3 + Y)
           = Y − XY ,
        Y' = (3 + Y) − (−1 + X)²(3 + Y)
           = 6X − 3X² + 2XY − X²Y ,

    which linearises to

        [X']   [0  1] [X]
        [Y'] ≈ [6  0] [Y] .

    The eigenvalues are again ±√6, so (−1, 3) is also a saddle point.
  – Lastly, near (0, 0), x and y are small, so ignoring nonlinear terms in these original variables we have

        [x']   [3  0] [x]
        [y'] ≈ [0  1] [y] ,

    which has eigenvalues 1 and 3, and hence (0, 0) is an unstable node.
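The hand linearisations above can be checked by computing the Jacobian of the right-hand side at each fixed point. A sketch in Python using central differences (the exact partial derivatives, or a computer algebra system, would do equally well):

```python
def f(x, y):
    """Right-hand side of the system in Example 1.9."""
    return 3*x - x*y, y - x*x*y

def jacobian(x, y, h=1e-6):
    """Numerical Jacobian of f by central differences: a quick check
    on the hand-worked linearisation."""
    fx = [(a - b) / (2*h) for a, b in zip(f(x + h, y), f(x - h, y))]
    fy = [(a - b) / (2*h) for a, b in zip(f(x, y + h), f(x, y - h))]
    return [[fx[0], fy[0]], [fx[1], fy[1]]]

# the coefficient matrices found above, at each of the three fixed points
expected = {(1, 3): [[0, -1], [-6, 0]],    # matrix of (1.6)
            (-1, 3): [[0, 1], [6, 0]],
            (0, 0): [[3, 0], [0, 1]]}
for p, E in expected.items():
    J = jacobian(*p)
    assert all(abs(J[i][j] - E[i][j]) < 1e-4
               for i in range(2) for j in range(2))
print("numerical Jacobians match the hand linearisations")
```

This kind of check is cheap insurance against sign slips, which are easy to make near the fixed point (−1, 3).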
• To predict trajectories it is most important to explore the neighbourhood of each fixed point.

  – Here the simplest fixed point is the origin (0, 0), as its linearisation (see above) is simply

        x' = 3x  and  y' = y .

    Immediately write down the general solution of each of these basic differential equations separately:

        x = c1 e^{3t}  and  y = c2 e^{t} .

    Writing this in vector notation,

        (x, y) = c1 (1, 0) e^{3t} + c2 (0, 1) e^{t} ,

    see the eigenvectors are (1, 0) and (0, 1), corresponding to the eigenvalues 3 and 1 respectively. Thus in the direction (1, 0) solutions grow three times faster than in the (0, 1) direction. Hence we draw the little local picture below.
[Figure: local phase portrait near the origin, plotted for −3 ≤ x ≤ 3 and −1 ≤ y ≤ 4.]
– To find approximate solutions near (±1, 3) simultaneously, we need to find the eigenvectors of the coefficient matrix (being careful with the sign of x):
\[
A = \begin{bmatrix} 0 & \mp1 \\ \mp6 & 0 \end{bmatrix}.
\]
So solve
\[
(\lambda I - A)v =
\begin{bmatrix} \lambda & \pm1 \\ \pm6 & \lambda \end{bmatrix}
\begin{bmatrix} v_1 \\ v_2 \end{bmatrix} =
\begin{bmatrix} 0 \\ 0 \end{bmatrix},
\]
which yields eigenvectors
\[
v^{(1)} = \begin{bmatrix} \mp1 \\ \sqrt6 \end{bmatrix}
\quad\text{for } \lambda_1 = \sqrt6\,,
\qquad
v^{(2)} = \begin{bmatrix} \pm1 \\ \sqrt6 \end{bmatrix}
\quad\text{for } \lambda_2 = -\sqrt6\,.
\]
The linear solution of (1.6) near (±1, 3) will be [K, p163, Theorem 1]
\[
\begin{bmatrix} X \\ Y \end{bmatrix} =
c_1 \begin{bmatrix} \mp1 \\ \sqrt6 \end{bmatrix} e^{\sqrt6\,t} +
c_2 \begin{bmatrix} \pm1 \\ \sqrt6 \end{bmatrix} e^{-\sqrt6\,t},
\]
or, in terms of the original variables,
\[
\begin{bmatrix} x \\ y \end{bmatrix} =
\begin{bmatrix} \pm1 \\ 3 \end{bmatrix} +
c_1 \begin{bmatrix} \mp1 \\ \sqrt6 \end{bmatrix} e^{\sqrt6\,t} +
c_2 \begin{bmatrix} \pm1 \\ \sqrt6 \end{bmatrix} e^{-\sqrt6\,t},
\]
where \(c_1\) and \(c_2\) are arbitrary constants. Thus for these two saddle points there will be exponential growth in the direction (∓1, √6), towards the top-left and bottom-right for the fixed point (1, 3), and decay in the direction (±1, √6), the top-right and bottom-left directions for (1, 3). Thus we sketch in the local pictures shown below for the two fixed points (±1, 3).
[Figure: local phase portraits near the two saddle points (±1, 3), plotted for −3 ≤ x ≤ 3 and −1 ≤ y ≤ 4.]
• The isoclines help fill in the picture. The two easiest isoclines are where the trajectories are horizontal, slope zero, obtained by finding where y′ = 0, and where the trajectories are vertical, infinite slope, obtained by finding where x′ = 0.

– y′ = y − x²y = 0 whenever y = 0 or x = ±1. Thus all trajectories are horizontal when they cross the three red dot-dashed lines shown below.

– x′ = 3x − xy = 0 whenever x = 0 or y = 3. Thus all trajectories are vertical when they cross the two magenta dashed lines plotted below.
[Figure: phase plane showing the isoclines, the red dot-dashed lines y = 0 and x = ±1 and the magenta dashed lines x = 0 and y = 3, plotted for −3 ≤ x ≤ 3 and −1 ≤ y ≤ 4.]
• Lastly, use all this information to qualitatively sketch in trajectories, such as the solid green lines shown next.
[Figure: complete phase portrait with representative trajectories (solid green), plotted for −3 ≤ x ≤ 3 and −1 ≤ y ≤ 4.]
Activity 1.I Do exercises from Problem Set 3.5 [K, p183]. Send in to the examiner for feedback at least Q5 & 8.
The qualitative methods developed here generalise to higher-dimensional systems, but it is much harder to "fill in" the phase space because of the intricate contortions permitted for trajectories in dimensions higher than two. That is why chaos (in its mathematical sense) only occurs in differential systems with three or more components.
1.2.1 Linearisation using the Jacobian

When exploring the dynamics of systems we analyse the linear dynamics in the neighbourhood of each fixed point. Previously we found the linear dynamics by a change of variable and subsequent neglect of nonlinear terms. This is a useful technique, but a little laborious even in straightforward situations. The alternative we explore here is to obtain the matrix of coefficients simply by evaluating the "derivative" of the differential system (the Jacobian).
Consider a nonlinear, first-order system of ODEs of the form
\[
x' = f(x, y)\,, \qquad y' = g(x, y)\,,
\]
where we have taken a 2-D example, but the extension to an n-dimensional system is straightforward. Also assume that the system is autonomous, i.e. there is no explicit t-dependence on the right-hand sides. Suppose that \((x_0, y_0)\) is a fixed point for the system, i.e. \(f(x_0, y_0) = g(x_0, y_0) = 0\), and
consider a point (x, y) nearby. We now appeal to Taylor's theorem in two dimensions, which we will explore in detail in a later module, §5.4. Taylor's theorem allows us to approximate f and g in the neighbourhood of the fixed point as
\begin{align*}
f(x, y) &\approx f(x_0, y_0)
 + (x - x_0)\left.\frac{\partial f}{\partial x}\right|_{(x_0,y_0)}
 + (y - y_0)\left.\frac{\partial f}{\partial y}\right|_{(x_0,y_0)},\\
g(x, y) &\approx g(x_0, y_0)
 + (x - x_0)\left.\frac{\partial g}{\partial x}\right|_{(x_0,y_0)}
 + (y - y_0)\left.\frac{\partial g}{\partial y}\right|_{(x_0,y_0)}.
\end{align*}
Since \((x_0, y_0)\) is a fixed point, \(f(x_0, y_0) = g(x_0, y_0) = 0\); so in the neighbourhood of the fixed point the evolution is governed by the linear system
\begin{align*}
x' &\approx \left.\frac{\partial f}{\partial x}\right|_{(x_0,y_0)} (x - x_0)
 + \left.\frac{\partial f}{\partial y}\right|_{(x_0,y_0)} (y - y_0)\,,\\
y' &\approx \left.\frac{\partial g}{\partial x}\right|_{(x_0,y_0)} (x - x_0)
 + \left.\frac{\partial g}{\partial y}\right|_{(x_0,y_0)} (y - y_0)\,.
\end{align*}
Thus, making the change of variable
\[
x = x_0 + X(t) \;\Rightarrow\; x' = X'\,, \qquad
y = y_0 + Y(t) \;\Rightarrow\; y' = Y'\,,
\]
the linearised system is then
\[
\begin{bmatrix} X' \\ Y' \end{bmatrix} \approx
\begin{bmatrix}
\dfrac{\partial f}{\partial x} & \dfrac{\partial f}{\partial y} \\[1ex]
\dfrac{\partial g}{\partial x} & \dfrac{\partial g}{\partial y}
\end{bmatrix}
\begin{bmatrix} X \\ Y \end{bmatrix}
\]
where all the derivatives in the matrix are evaluated at the fixed point \((x_0, y_0)\). A common notation for the matrix appearing here is
\[
J(x, y) = \frac{\partial(f, g)}{\partial(x, y)} =
\begin{bmatrix}
\dfrac{\partial f}{\partial x} & \dfrac{\partial f}{\partial y} \\[1ex]
\dfrac{\partial g}{\partial x} & \dfrac{\partial g}{\partial y}
\end{bmatrix},
\]
which is called the Jacobian³ and must be evaluated at the fixed point in question. This Jacobian matrix evaluated at a fixed point is the matrix of coefficients of the linearised dynamics, and hence its eigenvalues and eigenvectors determine the stability and classification of the fixed point.
Example 1.10: Use the Jacobian in Example 1.9. In our previous worked example we had
\[
\dot x = f(x, y) = 3x - xy\,, \qquad
\dot y = g(x, y) = y - x^2 y\,,
\]
so the Jacobian is
\[
J = \frac{\partial(f, g)}{\partial(x, y)} =
\begin{bmatrix} 3 - y & -x \\ -2xy & 1 - x^2 \end{bmatrix},
\]
which when evaluated at the fixed points (1, 3), (−1, 3) and (0, 0) gives the respective coefficient matrices
\[
\begin{bmatrix} 0 & -1 \\ -6 & 0 \end{bmatrix}, \qquad
\begin{bmatrix} 0 & 1 \\ 6 & 0 \end{bmatrix}
\quad\text{and}\quad
\begin{bmatrix} 3 & 0 \\ 0 & 1 \end{bmatrix},
\]
³ After Karl Jacobi (1804–51), a German mathematician and professor at Königsberg, noted for work in elliptic functions, number theory and differential determinants.
as seen previously. The eigenvalues of these matrices determine the stability of the corresponding fixed points, as found earlier.
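This Jacobian, its evaluation at each fixed point, and the resulting eigenvalues can also be computed symbolically; a sketch with Python's sympy (an assumed dependency, any computer algebra system would do):

```python
import sympy as sp

x, y = sp.symbols('x y')
# Right-hand sides of the system in Example 1.9: x' = f, y' = g.
f = 3*x - x*y
g = y - x**2*y

# The Jacobian matrix d(f, g)/d(x, y).
J = sp.Matrix([f, g]).jacobian([x, y])
print(J)  # [[3 - y, -x], [-2*x*y, 1 - x**2]]

# Evaluate at each fixed point; the eigenvalues classify the point.
for point in [(1, 3), (-1, 3), (0, 0)]:
    A = J.subs({x: point[0], y: point[1]})
    print(point, A.tolist(), A.eigenvals())
```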
In n dimensions the Jacobian matrix is
\[
J = \frac{\partial(f_1, f_2, \ldots, f_n)}{\partial(x_1, x_2, \ldots, x_n)} =
\begin{bmatrix}
\dfrac{\partial f_1}{\partial x_1} & \dfrac{\partial f_1}{\partial x_2} & \cdots & \dfrac{\partial f_1}{\partial x_n} \\[1ex]
\dfrac{\partial f_2}{\partial x_1} & \dfrac{\partial f_2}{\partial x_2} & \cdots & \dfrac{\partial f_2}{\partial x_n} \\[1ex]
\vdots & \vdots & \ddots & \vdots \\[1ex]
\dfrac{\partial f_n}{\partial x_1} & \dfrac{\partial f_n}{\partial x_2} & \cdots & \dfrac{\partial f_n}{\partial x_n}
\end{bmatrix},
\]
and similarly determines the behaviour of the linearised dynamics near any fixed point.
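As an illustration of this n-dimensional formula, here is a sketch for a three-component system. The Lorenz equations are our own choice of example (they are not discussed in the text), picked because, as noted earlier, chaos needs three or more components:

```python
import sympy as sp

x, y, z, sigma, rho, beta = sp.symbols('x y z sigma rho beta')

# The Lorenz system: x' = sigma(y - x), y' = x(rho - z) - y, z' = xy - beta z.
F = sp.Matrix([sigma*(y - x), x*(rho - z) - y, x*y - beta*z])

# The 3x3 Jacobian d(f1, f2, f3)/d(x, y, z).
J = F.jacobian([x, y, z])
print(J)

# Evaluated at the fixed point at the origin, ready for an eigenvalue analysis.
print(J.subs({x: 0, y: 0, z: 0}))
```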
Activity 1.J Do problems 5–11 from Problem Set 3.5 [K, p183] using the Jacobian to determine the coefficient matrices of the linear dynamics about each of the fixed points. Send in to the examiner for feedback at least Q7 & 10.
Exercise 1.11: Consider the following system of equations:
\[
\frac{dx}{dt} = -4y + y^3\,, \qquad
\frac{dy}{dt} = 5x - y - xy^2\,.
\]
(a) Find the fixed points of this system.

(b) Compute the Jacobian and evaluate it at each fixed point. From your results classify each of the fixed points.

(c) Find a general linearised solution near (0, 0).
Exercise 1.12: A predator–prey population model (not the Lotka–Volterra model) is governed by the equations
\[
y_1' = y_1(1 - y_1) - \tfrac{3}{2} y_2\,, \qquad
y_2' = \tfrac{1}{2} y_1 - y_2\,.
\]
(a) Deduce that the only critical points of this system of equations are (0, 0) and (1/4, 1/8).

(b) By linearising the system in the neighbourhood of each critical point, show that these points are a saddle and a stable spiral respectively.

(c) Based upon the eigenvectors at (0, 0) and the nature of the fixed points, sketch some representative trajectories for the system.

(d) Write down the general solution to the linearised system in the neighbourhood of the origin (0, 0).
1.2.2 Answers to selected Exercises

1.11 (a) (0, 0) and ±(2, 2). (b) (0, 0) is a stable spiral as \(\lambda = (-1 \pm i\sqrt{79})/2\); each of ±(2, 2) is a saddle as \(\lambda = (-9 \pm \sqrt{113})/2 \approx 0.8151\) and \(-9.8151\).

1.12 (b) eigenvalues are \(\lambda = \pm 1/2\) and \(\lambda = (-1 \pm i\sqrt3)/4\) respectively. (c) \(v = (3, 1)\) and \(v = (1, 1)\) corresponding to eigenvalues \(\lambda = 1/2\) and \(-1/2\) respectively. (d) \(y = c_1(3, 1)e^{t/2} + c_2(1, 1)e^{-t/2}\).
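The eigenvalues quoted for Exercise 1.12 follow from the two Jacobians of §1.2.1 evaluated at (0, 0) and (1/4, 1/8); a quick numerical check with numpy (assumed available, our addition):

```python
import numpy as np

# Jacobian of Exercise 1.12 at the origin (0, 0)...
J0 = np.array([[1.0, -1.5],
               [0.5, -1.0]])
# ...and at the second critical point (1/4, 1/8).
J1 = np.array([[0.5, -1.5],
               [0.5, -1.0]])

print(np.linalg.eigvals(J0))  # real pair +1/2 and -1/2: a saddle
print(np.linalg.eigvals(J1))  # complex pair, negative real part: stable spiral
```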
1.3 Summary

• The behaviour of physical systems, such as mechanical and electrical systems, is described using differential equations (§1.1.1 for example).

• Higher-order differential equations can be reduced to systems of first-order differential equations by introducing more dependent variables (§1.1.2).

• Solutions to 2-D systems of ODEs are graphically represented on the phase plane (§1.1.3). Higher-dimensional systems may also be represented using a phase space, but imagination is required above 3-D. For a given set of initial conditions, the system evolves along a trajectory or orbit in the phase space, which describes how the system evolves in time (§1.1.4).

• At a critical point or fixed point, the system undergoes no further evolution in time (§1.1.4). For a first-order system of ODEs in the form y′ = f(y), a fixed point occurs wherever f(y) = 0 (§1.2).

• Linear first-order systems with constant coefficients are written y′ = Ay for some matrix A. Their general solution (equation (1.4)) is written down in terms of the eigenvalues and eigenvectors of A, provided the eigenvectors form a basis for the phase space (§1.1.4). Such linear systems have only the origin as a fixed point.
• In particular, in 2-D, the solution to a linear first-order system with constant coefficients must⁴ be one of six basic types: a centre, stable or unstable node, stable or unstable spiral, or a saddle point (§1.1.5).

• Nonlinear systems are capable of more complex behaviour and usually have more than one fixed point. Nevertheless, the behaviour of the system near a fixed point is dominated by the linear terms in the ODE, and approximate local solutions are found via linearisation. It is usually not possible to write down a general solution for a nonlinear system, so approximations and phase-plane methods are useful to build up a working picture of how these systems behave (§1.2).

Activity 1.K Do Chapter 3 Review Problems 1–23, 28–30 and 35–38 [K, pp190–2].

⁴ That is, unless no basis of eigenvectors exists, see [K, p168].
Module 2<br />
Scientists must write<br />
Module contents

2.1 Basics of mathematical writing . . . . . 45
2.2 LaTeX . . . . . 49
  2.2.1 Why LaTeX? . . . . . 50
  2.2.2 Start to use LaTeX . . . . . 51
  2.2.3 Simple mathematics goes in-line . . . . . 59
  2.2.4 List environments usefully organise . . . . . 61
  2.2.5 Complex mathematics is displayed . . . . . 63
  2.2.6 Figures float . . . . . 75
  2.2.7 Summary . . . . . 78
  2.2.8 Many mathematical symbols in LaTeX . . . . . 79
Developing technical communication is essential as preparation for the workplace and advanced study. In this module we help you to structure, prepare and deliver small documents of technical material. This module is to be studied in parallel with the first module in preparation for your first assignment. In your assignments you will be required to use your skills in technical writing for certain exercises. These exercises will be graded not only on mathematical content, but also on the style and manner of the technical and English expression.
The first section (§2.1) discusses the composition of mathematical writing. Although mathematical writing has much in common with non-technical writing, there are many distinctions and extensions. Some of the common problem areas are identified and discussed. These and basic aspects of writing will be assessed in the specified exercises in the assignments.

The second section (§2.2) introduces you to LaTeX, the open standard for high-quality typesetting of scientific and general documents (there are two alternative pronunciations of LaTeX: either "lay-teck" or "lah-teck"). As well as typesetting documents, LaTeX provides a convenient standard for the communication of mathematics in plain text such as e-mails; your e-mail enquiries to us should be phrased using the syntax and grammar of LaTeX. It is compulsory for you to use LaTeX for the specified exercises.
2.1 Basics of mathematical writing

In this first introduction to the writing of technical documents involving mathematics, we focus on incorporating mathematical equations, symbols and structures into a short expository document. This mathematical detail is based upon basic communication concepts that we first summarise and which I expect to be familiar to you.
Basic written communication: Successful writing in any discipline is based on certain elements, summarised below. It could be useful to read Chapters 4 and 6 from Communication: A Foundation Course, by S. Tyler, C. Kossen and C. Ryan, rev. edn.

• Analysing the task (what you are being asked to write about).
• Analysing the audience (to whom you are writing).
• Developing a thesis statement (what you intend to prove).
• Deciding on your main points (how you intend to prove or support your thesis statement).
• Logical sequence of points (developing a coherent argument).

I recommend international students also read Chapter 5, "When English is a foreign language", from Handbook of writing for the mathematical sciences, by N. J. Higham, 2nd edition.
Mathematical writing has special features

Reading 2.A Study, noting the comments below, Chapter 3 "Mathematical writing" from Handbook of writing for the mathematical sciences, by N. J. Higham, 2nd edition.

§3.1 What is a theorem? This is of interest, but we will not worry about whether you call results theorems, propositions, or lemmas. However, we will look for a well-structured argument in your writing; you will need to state clearly what your main results are.

§3.2 Proofs It is essential to help readers, and to show you appreciate the role of the various parts of an argument, by annotating the argument accordingly.

§3.3 The role of examples Although I expect this to be irrelevant to your assignments, if necessary, introduce a generality by a preliminary specific example.

§3.4 Definitions Only define terms if they are new and are needed in several places.

§3.5 Notation Endeavour to choose a notation that is consistent and not confusing.
§3.6 Words versus symbols Readers typically have difficulty remembering the meaning of symbols that you have introduced. Even though you know what they mean, use words rather than symbols wherever reasonable.

§3.7 Displaying equations The crucial point in this section is the first sentence: "An equation is displayed when it needs to be numbered, when it would be hard to read if placed in-line, or when it merits special attention . . ." Otherwise typeset equations in-line.

§3.8 Parallelism A subtle aspect; we will not assess this.

§3.9 Dos and Don'ts of mathematical writing has lots of good tips.

Punctuating expressions It is important to remember that mathematical content, whether expressions, equations or derivations, must form an integral part of a sentence. Write and punctuate accordingly.

Otiose symbols Avoid gratuitous symbols.

Placement of symbols Endeavour to be clear.

"The" or "A" addresses a small but common error.

Notational synonyms Strive to find, among all readable and clear possibilities, an aesthetically pleasing version of your mathematical expressions.
Referencing equations Analogous to §3.6, a descriptive word or two helps remind readers of the subject of an equation that you reference.

Miscellaneous In my opinion the most important of these are:

• standard mathematical functions are set in roman, not italic;
• avoid stacked fractions in superscripts and subscripts;
• avoid tall in-line expressions;
• choose the correct ellipsis;
• avoid ambiguity in slashed fractions.

If you see our study guides failing any of the above, then please inform us.
2.2 LaTeX

Good writing deserves the best reproduction. Here we introduce you to LaTeX, the world's best package for typesetting technical and other documents. You will use LaTeX for at least the specified exercises.

From: pete@nospam

there is simply nothing better.

i started learning latex on my own few years ago, and at work i write all my reports using it. every one tells me how good the reports look and they wonder how i do it, since everyone else uses word and their report do not half look as good.

i try to keep it a secret, but sometimes i am forced to tell, but most will not even try it because they think it is hard to do this way, but they do not know that it is actually easier. in latex i concentrate on the logic and content of my report and let latex worry about how to format it and typeset it. so with latex i am much faster than with those gui word processors.

tex and latex makes writing more fun.
2.2.1 Why LaTeX?

• LaTeX is arguably the premier typesetting package in the world. Donald Knuth and Leslie Lamport have distilled for us the wisdom, accumulated over hundreds of years, of many generations of printers.

• The LaTeX system typesets documents with line and page breaks that maximise readability and appeal by avoiding, as far as possible, poor breaks and hyphenation.

• It is simply the best package for documents containing mathematics. "TeX can print virtually any mathematical thought that comes into your head, and print it beautifully." (Herbert S. Wilf, 1986)

• It is free on virtually every computer in the world.

• It is portable: stick to the standard commands and everyone can read and exchange documents.

• The source file uses standard keyboard characters, so it can be read by eye or posted by e-mail with no problems associated with different versions or binary files.

• LaTeX has the reputation of being hard, but as a mark-up language it is effectively the same as html!

• Weakness: it is not usually wysiwyg.
LaTeX is a very powerful typesetting system. Here we only introduce you to some basics of LaTeX. The idea is to provide you with enough to typeset the specified assignment questions. There is much more that you may learn to extend your use of LaTeX.

As well as the guide written here, there is a wealth of support information for LaTeX on the internet. More LaTeX information and many links to further sources are to be found at http://www.sci.usq.edu.au/staff/robertsa/LaTeX/latexintro. For further reading I suggest:

• Chapter 13 "TeX and LaTeX" of Handbook of writing for the mathematical sciences, by N. J. Higham, 2nd edition; and
• Learning LaTeX by D. F. Griffiths and N. J. Higham.

2.2.2 Start to use LaTeX

Install LaTeX on your computer: if you run Windows, the Department cd-rom set has a copy of LaTeX (called MiKTeX, essential) and the shareware editor WinEdt (helpful, but not essential) for you to install; if you use Linux, LaTeX is included as an optional part of the Linux release; if you use a Macintosh, obtain OzTeX and I recommend the editor Alpha. Install whichever is appropriate for you. For Windows follow the instructions on the Maths & Computing cd-rom. If the installation fails, still make progress by following the instructions in a subsequent paragraph.
Execute LaTeX on a Windows computer:

1. Prepare a plain text file in any simple editing application, such as notepad, or preferably in WinEdt. Your LaTeX source forms the text; name the file with the .tex extension, for example, first.tex.

2. Execute the LaTeX application giving as input your source file, for example, first.tex.

3. If there are errors, correct your source and redo the previous step.

4. View the beautifully typeset "dvi" file generated by LaTeX, for example first.dvi, using the application yap.

If your execution fails, still make progress by following the instructions in the following paragraph.
If you do not have access to a computer with LaTeX: we have provided a web interface to LaTeX. The following are its instructions.

1. Prepare a plain text file in any simple editing application, such as notepad, with your LaTeX source, and preferably name it with the .tex extension, for example, first.tex (although notepad likes to insist on a .txt extension, which is also acceptable).

2. Point your internet browser to the web address http://www.sci.usq.edu.au/latex, and enter your usqconnect username and password when requested.

3. Click the Browse... button and navigate around the file system on your computer to your LaTeX source file, for example, first.tex.

4. Click the Submit Document button.

5. Wait, hopefully no more than a few seconds, for a new web page to appear saying "The PDF file is available for download", in which case click on the link PDF file and view your beautiful document in Adobe Acrobat Reader.

6. If a serious error has occurred, download the log file and use the error messages to guide fixing your document, then return to Step 2. It is a good idea to download the log file and check for non-fatal errors in any case.
Your first document: you need to prepare a text file of the content of your composition interspersed with LaTeX commands. First tell LaTeX the sort of document you will be typesetting. For our straightforward needs in this course you will use the article style typeset in a 12pt size font. Around the document text that you wish to typeset, you need

\documentclass[12pt,a4paper]{article}
\begin{document}
...
\end{document}

The three dots above denote the place where the content text is to be placed. Second, specify the title and author of the document. These are specified in the following manner using the \title{...}, \author{...} and \maketitle commands.

\documentclass[12pt,a4paper]{article}
\begin{document}
\title{Assignment 1, Question 3: The
importance of being fractal}
\author{Ben Hall, Q99123456}
\maketitle
...
\end{document}

Just those seven lines form a complete, though pointless, document. Try it. Type the above into a file (perhaps named first.tex), then run it through LaTeX and view the result. These seven lines form the skeleton of all our LaTeX documents. Contact us if there is any problem.
Now put in some information. Simply type the text of your document in place of the three dots in the above skeleton. For example:

\documentclass[12pt,a4paper]{article}
\begin{document}
\title{Assignment 1, Question 3: The
importance of being fractal}
\author{Ben Hall, Q99123456}
\maketitle

Fractal geometry, largely inspired by Benoit Mandelbrot
during the sixties and seventies, is one of the great
advances in mathematics for two thousand years. Given
the rich and diverse power of developments in
mathematics and its applications, this is a remarkable
claim.

Often presented as being just a part of modern chaos
theory, fractals are momentous in their own right.
Euclid's geometry describes the world around us in
terms of points, lines and planes---for two thousand
years these have formed the somewhat limited repertoire
of basic geometric objects with which to describe the
universe. Fractals immeasurably enhance this
world-view by providing a description of much around us
that is rough and fragmented---of objects that have
structure on many sizes. Examples include: coastlines,
rivers, plant distributions, architecture, wind gusts,
music, and the cardiovascular system.

\end{document}
See that paragraphs are indicated by introducing a blank line wherever one paragraph ends and another begins. All other line breaks in the source are treated as simply blank characters: line breaks in your source do not correspond to line breaks in the typeset document.¹ Type a few paragraphs, each of a couple of sentences, and typeset your own document.

Longer documents need sections: although you may not need sectioning to answer simple questions, most documents do. In LaTeX, automatically numbered sections and their titles are specified by the \section{...} command. Wherever you want to start a new section, just put this command with the title of the section within the braces. See the two sections in the following example
\documentclass[12pt,a4paper]{article}
\begin{document}
\title{Assignment 1, Question 3: The
importance of being fractal}
\author{Ben Hall, Q99123456}
\maketitle

Fractal geometry, largely inspired by Benoit Mandelbrot
during the sixties and seventies, is one of the great
advances in mathematics for two thousand years. Given
the rich and diverse power of developments in
mathematics and its applications, this is a remarkable
claim.

\section{Some fractal models}

Before discussing in detail the common feature of the
previously mentioned examples, I present a few
examples of fractals and the type of physical
applications that they have.

\section{Scaling and dimensionality}

The common theme in these examples is not just that
they have detail on many lengths, but also that the
structure at any scale is much the same at any other
scale---the coastline around a continent looks just
like any small part of the coastline.

\end{document}

¹ In fact, this is one brilliant aspect of TeX: Knuth programmed a sophisticated optimisation scheme to determine the very best line breaks to be made in a paragraph. He incorporated the knowledge of the best printers accumulated over hundreds of years.
However, for many purposes you will want to emphasise the main point of a paragraph by using the \paragraph{...} command to introduce a simple statement at the start. For example,

\paragraph{Construct the Cantor set:} start with a
bar of some length; then remove its middle third to
leave two separate thirds; then remove the middle
thirds of these to leave four separate ninths; then
remove the middle thirds of these to obtain eight
separate twenty-sevenths; and so on. Eventually we
just obtain a scattered dust of points.

produces the following

Construct the Cantor set: start with a bar of some length; then remove its middle third to leave two separate thirds; then remove the middle thirds of these to leave four separate ninths; then remove the middle thirds of these to obtain eight separate twenty-sevenths; and so on. Eventually we just obtain a scattered dust of points.
Module 2. Scientists must write 59<br />
I like using \paragraph commands. For example, at the start of this subsection
they were used to highlight the actions to take if you could install LaTeX
on your computer, or not, as the case may be.

Note that the characters slosh, \, and braces, { and }, are special characters
to LaTeX as they are used to invoke the typesetting mark-up commands.
There are other special characters in LaTeX of which to be wary:

• the percent sign, %, causes LaTeX to ignore the rest of the line, so you
  may comment the document if needed;
• the dollar, $, used to delineate in-line mathematics;
• the ampersand, &, for tabbing;
• the underscore and caret, _ and ^, for subscripts and superscripts;
• the hash, #, and the tilde, ~ .

To actually get any of these last nine characters (except the slosh \) to appear
in the final typeset document, just precede them by a slosh (a backslash).
(Strictly, the caret and tilde are exceptions: \^ and \~ produce accents, so
use \textasciicircum and \textasciitilde for the printed characters.)
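For instance, a line of source such as this one (my own illustrative example)
escapes several of the special characters:

```latex
% Escaped special characters in running text:
Profits rose 25\% \& costs fell \$10 (see item \#2).
```

which typesets as: Profits rose 25% & costs fell $10 (see item #2).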
2.2.3 Simple mathematics goes in-line

Mathematics to be typeset in-line with the text must be contained within
matching dollar signs $...$ . For example, Newton's law F = ma is typeset
by $F=ma$ . Note the different font used for mathematical letters (called math
italic): it is imperative that all mathematics be typeset in a mathematics
environment (even if the mathematics is just a single letter symbol), and not
in the roman font that is the default for text; equally it is imperative that
non-mathematical text is not placed within a mathematics environment. For
example,

Newton's law is $F= m a$
for force $ F $, mass $m$
and acceleration $a$.

is typeset as: Newton's law is F = ma for force F, mass m and acceleration a.
See that in any mathematics environment, blank characters are totally ignored.

Subscripts and superscripts are typeset in a mathematics environment using
the underscore and the caret character respectively. For example, $d^{-1}$
and $d^2$ will typeset d⁻¹ and d². The theorem of Pythagoras is a² + b² = c²,
obtained by $a^2+b^2=c^2$ . Similarly, subscripts are generated by the
underscore character _ : for example, the Fibonacci numbers are obtained by
the recursion Fₙ₊₂ = Fₙ₊₁ + Fₙ typeset by $F_{n+2}=F_{n+1}+F_n$ . Single
character scripts need no enclosing braces.
LaTeX has an enormously wide variety of symbols to help typeset mathematics.
For example, \times to get a times sign, ×; \propto to get a
proportional to symbol, ∝; \pi to get the Greek letter π, and similarly for
the whole Greek alphabet. The names of these symbols have to be followed
by a non-alphabetic character, often a blank. See §2.2.8 for tables of just
some of the vast number of symbols available in LaTeX.

Fractions are typeset using \frac{}{} (with two arguments in braces): the
statement 1/n − 1 = 3 only if n = 1/(2×2) is typeset using $\frac{1}{n}-1=3$
and $n=\frac{1}{2\times 2}$ . For in-line mathematics use only simple
expressions in fractions as otherwise they become too hard to read.
2.2.4 List environments usefully organise

Extremely useful are the list environments of which I describe two. Use
them wherever you have a sequence of steps (perhaps in a mathematical
argument) or a list of things (perhaps describing an algorithm). The format
for a bulleted list is

\begin{itemize}
\item ...
\item ...
...
\end{itemize}

You might use such a list to clearly structure an argument such as:

\begin{itemize}
\item To solve the differential equation $y''-y'-2y=0$,

\item
substitute the exponential $y=\exp(\lambda t)$,

\item and deduce that $\lambda^2-\lambda-2=0$.
\end{itemize}

(the blank lines are optional) which is typeset as

• To solve the differential equation y″ − y′ − 2y = 0,
• substitute the exponential y = exp(λt),
• and deduce that λ² − λ − 2 = 0.

For an example of a numbered list, see that I used a numbered list at the
beginning of §2.2.2 to advise you of the steps to follow to start using LaTeX.
The general format for a numbered list is

\begin{enumerate}
\item ...
\item ...
...
\end{enumerate}

Lists may be nested within lists, to a maximum depth of four.
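To illustrate nesting (my own sketch, not an example from the text), an
enumerate may contain an itemize:

```latex
\begin{enumerate}
\item Install \LaTeX{} on your computer:
  \begin{itemize}
  \item download a distribution;
  \item run its installer.
  \end{itemize}
\item Typeset a first document.
\end{enumerate}
```

The outer items are numbered while the inner items are bulleted, and LaTeX
indents each level automatically.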
2.2.5 Complex mathematics is displayed

Recall, to include mathematics in-line with the text, use $...$ . However,
often mathematics is sufficiently complicated that it is displayed, centred, on
a line by itself. For this purpose use the displaymath or equation environments.
The difference between the two is that the equation environment
automatically typesets a useful labelling number alongside the mathematics,
whereas the displaymath environment does not. See the following examples:

\begin{displaymath}
a^2+b^2=c^2
\end{displaymath}

which typesets a² + b² = c² with no number, and

\begin{equation}
\log x=\int_1^x
\frac{1}{t}dt
\end{equation}

which typesets log x = ∫₁ˣ (1/t) dt together with an equation number, here (2.1).
Relations

LaTeX knows to typeset extra space around relations such as = and \approx (≈),
including inequalities such as <, >, \leq (≤) and \geq (≥), and also set
relations such as \in (∈) or \subset (⊂).
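For instance, in this small display of my own, LaTeX supplies the space
around each relation automatically:

```latex
\begin{displaymath}
x \approx y \leq z \,, \qquad a \in A \subset B \,.
\end{displaymath}
```

which typesets as x ≈ y ≤ z ,  a ∈ A ⊂ B .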
Delimiters

Delimiters, such as parentheses, brackets, and braces, come in various sizes
to cope with different sub-expressions that they surround. The easiest way to
get the size of a delimiter nearly correct is to use the modifying commands as
in \left(...\right) . Note that \left and \right must be used in pairs
so that LaTeX can determine the size of the intervening mathematics.
See how the delimiters are of reasonable size in these two examples:

\begin{displaymath}
\left(a+b\right)\left[1
-\frac{b}{a+b}\right]=a\,;
\end{displaymath}

\begin{displaymath}
\sqrt{|xy|}\leq\left|\frac
{x+y}{2}\right|\,.
\end{displaymath}

These typeset as (a + b)[1 − b/(a + b)] = a ; and √|xy| ≤ |(x + y)/2| ,
with the brackets and bars sized to match the enclosed fractions.
Spacing

In the previous two examples I used \,; and \,. to punctuate at the end
of the equations. Both in and out of mathematics, LaTeX provides the
commands:

• \, to typeset a thin space;
• \␣ (a slosh followed by a space) to typeset a normal space;
• \quad to typeset a quad space;
• and \! to typeset some negative space!

Use these to space the mathematics where needed. Integrals often need a bit
of help with their spacing as in

\begin{displaymath}
\int\!\!\!\int xy^2\,dx\,dy
=\frac{1}{6}x^2y^3\,,
\end{displaymath}

which typesets as ∫∫ xy² dx dy = (1/6)x²y³ , whereas vector problems often
lead to statements such as

\begin{displaymath}
u=\frac{-y}{x^2+y^2}
\quad\mbox{and}\quad
v=\frac{x}{x^2+y^2}\,.
\end{displaymath}

which typesets as u = −y/(x² + y²) and v = x/(x² + y²) .

Use:

• thin spaces, \,, to separate the infinitesimals from each other and from
  the integrand in integrals, and to separate punctuation from an equation
  or expression;
• some negative space, \!\!\!, in multi-dimensional integrals to bring
  the integral signs closer together;
• \quad to separate two or more equations or text on the one line.

Observe the use of the \mbox{...} command to include a few words of
ordinary (roman) text and its spacing within a mathematics environment.
Arrays

Frequently we need to set mathematics in a tabular format. For example,
arrays are typeset within a mathematics environment by the array environment.
The structure is

\begin{array}{argument}
... & ... & ... & ... \\
... & ... & ... & ... \\
... & ... & ... & ...
\end{array}

for an array of three rows and four columns. The argument consists of the
letters r, c or l to indicate that the corresponding columns are to be typeset
right, centred or left justified. An array example is

\begin{displaymath}
\left[\begin{array}{ccc}
1 & x & 0 \\
0 & 1 & -1
\end{array}\right]
\left[\begin{array}{c}
1 \\ y \\ 1
\end{array}\right]
=\left[ \begin{array}{c}
1+xy \\
y-1
\end{array} \right]\,,
\end{displaymath}

which typesets the 2×3 matrix [1 x 0; 0 1 −1] times the column vector
(1, y, 1)ᵀ equal to the column vector (1 + xy, y − 1)ᵀ, with the brackets
sized to suit the arrays,
or in a case statement such as

\begin{displaymath}
|x|=\left\{\begin{array}{rl}
x\,, & \mbox{if }x\geq 0\,, \\
-x\,, & \mbox{if }x< 0\,.
\end{array}\right.
\end{displaymath}

which typesets |x| with a single large left brace selecting x , if x ≥ 0 ,
and −x , if x < 0 . (The \right. supplies the required invisible matching
right delimiter.)
Many arrays have lots of dots all over the place as in

\begin{displaymath}
\left[\begin{array}{ccccc}
-2&1&0&\cdots&0 \\
1&-2&1&\cdots&0 \\
0&1&-2&\ddots&\vdots \\
\vdots&\vdots&\ddots&\ddots&1\\
0&0&\cdots&1&-2
\end{array}\right]
\end{displaymath}

which typesets the familiar tridiagonal matrix with −2 on the diagonal, 1 on
the sub- and super-diagonals, and the dots (\cdots, \vdots, \ddots) marking
the repeating pattern.
Equation arrays

Often we want to align related equations together, or to align each line of a
multi-line derivation. The eqnarray mathematics environment does this.
The format is the same as an array environment, except that the eqnarray
environment automatically assumes three columns: the left column right
justified; the centre, centred; and the right column left justified:

\begin{eqnarray}
... & ... & ... \\
... & ... & ... \\
... & ... & ...
\end{eqnarray}

Each line will be numbered by LaTeX, unless you specify \nonumber in a line,
or unless you use the * form of eqnarray.

For example, in the flow of a fluid film we may report

\begin{eqnarray}
u&=&\epsilon^2 k_{xxx}\sin y\,,\\
v&=&\epsilon^3 k_{xxx} y\,, \\
p&=&\epsilon k_{xx}\,.
\end{eqnarray}

which typesets as the aligned, numbered equations

u = ε²kₓₓₓ sin y ,   (2.2)
v = ε³kₓₓₓ y ,       (2.3)
p = εkₓₓ .           (2.4)
Alternatively, the curl of a vector field (u, v, w) may be written with only
one equation number:

\begin{eqnarray}
\omega_1 & = &
\frac{\partial w}{\partial y}
-\frac{\partial v}{\partial z}
\,, \nonumber \\
\omega_2 & = &
\frac{\partial u}{\partial z}
-\frac{\partial w}{\partial x}
\,, \\
\omega_3 & = &
\frac{\partial v}{\partial x}
-\frac{\partial u}{\partial y}
\,. \nonumber
\end{eqnarray}

which typesets as

ω₁ = ∂w/∂y − ∂v/∂z ,
ω₂ = ∂u/∂z − ∂w/∂x ,   (2.5)
ω₃ = ∂v/∂x − ∂u/∂y .
Whereas a derivation may look like

\begin{eqnarray*}
&&
(p\wedge q)\vee(p\wedge\neg q)\\
& = & p\wedge(q\vee\neg q)
\quad\mbox{distributive law}\\
& = & p\wedge T
\quad\mbox{excluded middle} \\
& = & p
\quad\mbox{by identity}
\end{eqnarray*}

which typesets, unnumbered, as

(p ∧ q) ∨ (p ∧ ¬q)
    = p ∧ (q ∨ ¬q)   distributive law
    = p ∧ T          excluded middle
    = p              by identity
Functions

LaTeX knows how to typeset a lot of mathematical functions.

• Trigonometric and other elementary functions are defined by the obvious
  corresponding command name. Two examples are \sin x or
  \exp(i\theta) . Observe that trigonometric and other elementary
  functions are typeset properly, in roman, even to the extent of automatically
  providing a thin space if followed by a single symbol argument:

  \begin{displaymath}
  \exp(i\theta)=\cos\theta
  +i\sin\theta\,,
  \end{displaymath}

  which typesets as exp(iθ) = cos θ + i sin θ , and

  \begin{displaymath}
  \sinh(\log x)=\frac{1}{2}
  \left(x-\frac{1}{x}\right)\,.
  \end{displaymath}

  which typesets as sinh(log x) = (1/2)(x − 1/x) .

• Subscripts on more complicated functions, such as \lim_{...} and
  \max_{...}, are appropriately placed under the function name.

  \begin{displaymath}
  \lim_{q\to\infty}\|f(x)\|_q
  =\max_{x}|f(x)|\,.
  \end{displaymath}

  which typesets as lim_{q→∞} ‖f(x)‖_q = max_x |f(x)| , with the scripts
  set beneath lim and max in the displayed form.
• And the same goes for both sub- and superscripts on large operators
  such as \sum, \prod, etc.

  \begin{displaymath}
  e^x=\sum_{n=0}^\infty
  \frac{x^n}{n!}\,,\quad
  n!=\prod_{i=1}^n i\,.
  \end{displaymath}

  which typesets as e^x = Σ_{n=0}^∞ x^n/n! ,  n! = Π_{i=1}^n i , with the
  limits placed above and below the large operators. In in-line mathematics,
  however, the scripts are automatically placed to the side in order to
  conserve vertical space and to strive for uniform vertical spacing, as in
  1/(1 − x) = Σ_{n=0}^∞ x^n obtained from

  $1/(1-x)=\sum_{n=0}^\infty x^n$
Accents

Common mathematical accents over a single character, say a, are: \bar a
for ā; \tilde a for ã; \hat a for â; \dot a for ȧ; \ddot a for ä; and \vec a
for a with an arrow accent. Two examples:

\begin{displaymath}
\bar f=\frac{1}{L}
\int_0^L f(x)\,dx\,,
\end{displaymath}

which typesets the average f̄ = (1/L) ∫₀ᴸ f(x) dx , and

\begin{displaymath}
\dot{\vec \omega}=
\vec r\times\vec I\,.
\end{displaymath}

which typesets ω with both a dot and an arrow accent, equal to the cross
product of the arrowed vectors r and I.
Command definition

LaTeX provides a facility for you to define your very own commands. Most
useful commands involve arguments; I give three of my favourites. The first
two, with two arguments, define partial derivative commands

\newcommand{\D}[2]{\frac{\partial #2}{\partial #1}}
\newcommand{\DD}[2]{\frac{\partial^2 #2}{\partial #1^2}}
\renewcommand{\vec}[1]{{\bf #1}}

and the last, with one argument, redefines the \vec command to denote
vectors by boldface characters (rather than have an arrow accent). Note that
within a definition, #n denotes a placeholder for the nth supplied argument.
This vector identity will serve nicely to illustrate two of the new commands:

\begin{displaymath}
\nabla\times\vec q
=\vec i\left(\D yw-\D zv\right)
+\vec j\left(\D zu-\D xw\right)
+\vec k\left(\D xv-\D yu\right)
\end{displaymath}

which typesets, with bold vectors, as

∇ × q = i(∂w/∂y − ∂v/∂z) + j(∂u/∂z − ∂w/∂x) + k(∂v/∂x − ∂u/∂y)

You will have noticed that LaTeX is very verbose. Many people define their
own abbreviations for the common commands so that they are quicker to
type. My advice is: do not do this; it makes your LaTeX much less portable
and harder to read. Instead, set up your editor to cater for the verbosity;
use command definitions only to give you new logical facilities, such as the
partial differentiation.
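For instance, the second-derivative command \DD defined above could
typeset the diffusion equation (an illustrative equation of my choosing, not
one from the text):

```latex
\begin{displaymath}
\D t\theta = \kappa\,\DD x\theta \,,
\end{displaymath}
```

which typesets as ∂θ/∂t = κ ∂²θ/∂x² , with the logical command names
making the source far easier to read than nested \frac expressions.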
2.2.6 Figures float

Often we illustrate or support discussion with a figure. Usually figures are
big. Thus they make a mess of the pagination of a document. The solution
adopted by professional printers and LaTeX is to generally place figures at
the top or bottom of a page, or on a page by itself, near where the author
specifies. That is, the location of a figure "floats". The usual way to include
a figure in LaTeX is as follows.

1. Create a postscript file of the drawing from whatever application is
   being used to generate the figure. For example, in Matlab draw the
   figure then execute the command

   print -depsc2 filename.eps

   to store in the file filename.eps the encapsulated postscript to draw
   the figure. Users of Windows may have trouble generating postscript
   from other applications as Microsoft generally do not distribute the
   postscript printer driver—if needed, get it.

2. Then place in the preamble (that part of the document between the
   \documentclass command and the \begin{document} command)
   the command \usepackage{graphicx} . This tells LaTeX to load information
   about how to include graphics.

3. Somewhere near where you want the figure, include the figure environment

   \begin{figure}
   \centerline{\includegraphics{...}}
   \caption{...}
   \end{figure}

   where the argument of the \includegraphics command is the full filename,
   and the argument to the \caption command is text describing
   the figure.

4. Or use this version to scale the picture up/down to the width of the
   page

   \begin{figure}
   \includegraphics[width=0.9\textwidth]{...}
   \caption{...}
   \end{figure}

   The optional [width=0.9\textwidth] scales the figure to 90% of the
   width of the typeset text: change it if desired; leave it out in order to
   reproduce the figure unscaled.

For example, the following commands draw and place Figure 2.1 somewhere
near here (on page 78), but not precisely here as LaTeX has chosen a better
place for it.

\begin{figure}
\centerline{\includegraphics{cantor.eps}}
\caption{steps in the construction of a Cantor set.}
\end{figure}
[Figure 2.1: steps in the construction of a Cantor set.]
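Putting the steps together, a minimal complete document might read as
follows (a sketch: the filename cantor.eps and the label name are merely
illustrative, and the \label/\ref pair is a taste of LaTeX's automatic
cross-referencing):

```latex
\documentclass[12pt,a4paper]{article}
\usepackage{graphicx}% in the preamble
\begin{document}
Figure~\ref{fig:cantor} shows the first few steps.
\begin{figure}
\centerline{\includegraphics{cantor.eps}}
\caption{steps in the construction of a Cantor set.}
\label{fig:cantor}
\end{figure}
\end{document}
```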
Note: I strongly recommend that you generate any graphic at about the same
size as it is to appear. This is so that the title, label and legend information
is actually readable and the line thicknesses are creditable. Astonishingly,
some people do shrink a figure by a factor of about three and expect the
captions and labelling to be readable!
2.2.7 Summary

• LaTeX is the best.
• In §2.2.2 you saw how to prepare and process simple documents in
  LaTeX complete with titles and sectioning. Then simple mathematics
  could go in-line as seen in §2.2.3.
• But note that lists, §2.2.4, provide a useful structure for mathematical
  derivations as well as lists of points.
• Complicated mathematics is displayed. As discussed in §2.2.5 there is
  a multitude of LaTeX structures and commands to help you do this.
  Make your displayed mathematics a work of art, but keep it as simple
  as possible so LaTeX works for you.
• Lastly, we often need to typeset a figure in a mathematical document
  as described in §2.2.6. LaTeX floats these to a reasonable position.

LaTeX can do so much more for you too: automatic cross-referencing, tables,
table of contents, footnotes, marginal notes, hypertext links, bibliographies,
indexing, two-sided printing, two-column printing, colour, different fonts, a
vast number of mathematical symbols, music, etc. Here we have presented
the basics.
2.2.8 Many mathematical symbols in LaTeX

α \alpha      β \beta        γ \gamma      δ \delta
ɛ \epsilon    ε \varepsilon  ζ \zeta       η \eta
θ \theta      ϑ \vartheta    ι \iota       κ \kappa
λ \lambda     µ \mu          ν \nu         ξ \xi
π \pi         ϖ \varpi       ρ \rho        ϱ \varrho
σ \sigma      ς \varsigma    τ \tau        υ \upsilon
φ \phi        ϕ \varphi      χ \chi        ψ \psi
ω \omega

Table 2.1: Lowercase Greek letters

Γ \Gamma      Δ \Delta       Θ \Theta      Λ \Lambda
Ξ \Xi         Π \Pi          Σ \Sigma      Υ \Upsilon
Φ \Phi        Ψ \Psi         Ω \Omega

Table 2.2: Uppercase Greek letters

± \pm         ∩ \cap         ⋄ \diamond           ⊕ \oplus
∓ \mp         ∪ \cup         △ \bigtriangleup     ⊖ \ominus
× \times      ⊎ \uplus       ▽ \bigtriangledown   ⊗ \otimes
÷ \div        ⊓ \sqcap       ⊳ \triangleleft      ⊘ \oslash
∗ \ast        ⊔ \sqcup       ⊲ \triangleright     ⊙ \odot
⋆ \star       ∨ \vee         ∧ \wedge             ○ \bigcirc
† \dagger     \ \setminus    ∐ \amalg             ◦ \circ
‡ \ddagger    · \cdot        ≀ \wr                • \bullet

Table 2.3: Binary Operation Symbols
≤ \leq         ≥ \geq          ≡ \equiv     |= \models
≺ \prec        ≻ \succ         ∼ \sim       ⊥ \perp
≼ \preceq      ≽ \succeq       ≃ \simeq     | \mid
≪ \ll          ≫ \gg           ≍ \asymp     ‖ \parallel
⊂ \subset      ⊃ \supset       ≈ \approx    ⊲⊳ \bowtie
⊆ \subseteq    ⊇ \supseteq     ≅ \cong      ⊲⊳ \Join
⊑ \sqsubseteq  ⊒ \sqsupseteq   ≠ \neq       ⌣ \smile
∈ \in          ∋ \ni           ≐ \doteq     ⌢ \frown
⊢ \vdash       ⊣ \dashv        ∝ \propto

Table 2.4: Binary relations
ℵ \aleph     ′ \prime       ∀ \forall      ∞ \infty
ℏ \hbar      ∅ \emptyset    ∃ \exists
ı \imath     ∇ \nabla       ¬ \neg
ȷ \jmath     √ \surd        ♭ \flat        △ \triangle
ℓ \ell       ⊤ \top         ♮ \natural     ♣ \clubsuit
℘ \wp        ⊥ \bot         ♯ \sharp       ♦ \diamondsuit
ℜ \Re        ‖ \|           \ \backslash   ♥ \heartsuit
ℑ \Im        ∠ \angle       ∂ \partial     ♠ \spadesuit

Table 2.5: Miscellaneous symbols
← \leftarrow         ←− \longleftarrow        ↑ \uparrow
⇐ \Leftarrow         ⇐= \Longleftarrow        ⇑ \Uparrow
→ \rightarrow        −→ \longrightarrow       ↓ \downarrow
⇒ \Rightarrow        =⇒ \Longrightarrow       ⇓ \Downarrow
↔ \leftrightarrow    ←→ \longleftrightarrow   ↕ \updownarrow
⇔ \Leftrightarrow    ⇐⇒ \Longleftrightarrow   ⇕ \Updownarrow
↦ \mapsto            ↦−→ \longmapsto          ↗ \nearrow
←↪ \hookleftarrow    ↩→ \hookrightarrow       ↘ \searrow
↼ \leftharpoonup     ⇀ \rightharpoonup        ↙ \swarrow
↽ \leftharpoondown   ⇁ \rightharpoondown      ↖ \nwarrow
⇀↽ \rightleftharpoons

Table 2.6: Arrow symbols

( (            ) )            ↑ \uparrow
[ [            ] ]            ↓ \downarrow
{ \{           } \}           ↕ \updownarrow
⌊ \lfloor      ⌋ \rfloor      ⇑ \Uparrow
⌈ \lceil       ⌉ \rceil       ⇓ \Downarrow
〈 \langle      〉 \rangle      ⇕ \Updownarrow
/ /            \ \backslash
| |            ‖ \|

Table 2.7: Delimiters
∑ \sum        ⋂ \bigcap      ⊙ \bigodot
∏ \prod       ⋃ \bigcup      ⊗ \bigotimes
∐ \coprod     ⊔ \bigsqcup    ⊕ \bigoplus
∫ \int        ⋁ \bigvee      ⊎ \biguplus
∮ \oint       ⋀ \bigwedge

Table 2.8: Variable-sized symbols

û \hat{u}     ú \acute{u}    ū \bar{u}     u̇ \dot{u}
ǔ \check{u}   ù \grave{u}    u⃗ \vec{u}     ü \ddot{u}
ŭ \breve{u}   ũ \tilde{u}

Table 2.9: Math accents
Module 3

Describing the conservation of material

In this module we begin to learn how to mathematically model the flow of
material such as water and air. On our human scale, such material appears
smooth and continuous, albeit made of uncounted billions of tiny molecules.
We are led to treat it as smooth in a mathematical description. The first
task is to discover how to describe the movement of material. Only then do
we move on to encode physical principles in mathematical terms that tell us
how the various properties of the material interact and evolve. Solutions of
the resulting mathematical models exhibit the rich variety of behaviour we
see and use in our everyday life.
Module contents

3.1 Eulerian description of motion . . . . . . . . . . . . . . . 86
    3.1.1 Exercises . . . . . . . . . . . . . . . . . . . . . . 89
3.2 Conservation of mass . . . . . . . . . . . . . . . . . . . . 96
    3.2.1 Exercises . . . . . . . . . . . . . . . . . . . . . . 97
3.3 Car traffic . . . . . . . . . . . . . . . . . . . . . . . . 100
    3.3.1 Exercises . . . . . . . . . . . . . . . . . . . . . . 105
    3.3.2 Age structured populations is another application . . . 108
    3.3.3 Answers to selected Exercises . . . . . . . . . . . . 110
3.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . 112

The text for this module is by AJ Roberts, A one-dimensional introduction
to continuum mechanics, World Scientific. References to the text use the
format [R,reference].
3.1 Eulerian description of motion

What does it mean to say: "the tide moves water in an estuary with a velocity
5 cos(x/100 − t/3) km/h"? Where will the fallout from the Chernobyl nuclear
reactor accident be carried by the wind? Answers to such questions require
an understanding of how the bulk movement of a material may be described
by mathematics. In this section we begin to do this.

Main aims: the most important aims of the section are to

• understand the Eulerian description¹ of movement [R,§1.3], and
• to introduce, understand and use the material derivative [R,§1.4]

      Df/Dt = ∂f/∂t + v ∂f/∂x ,   (3.1)

  where v denotes the velocity of the material.

Reading 3.A Read all of Chapter 1 [R,pp1–10]. Especially study §1.3–4
and work through Examples 1.2–4.

¹ Leonhard Euler (1707–83), born in Switzerland, developed the application of differential
equations to the world around us. He worked prolifically in hydraulics, ballistics,
geometry, optics, magnetism and electricity. He also introduced much modern notation
such as i = √−1.
The discussion in §1.2 [R,p6–7] and the Examples 1.2–3 shown in Figure 1.4
are readily realised. Get a rubber band and initially hold it lightly tensioned
between your two hands. Then move your hands apart. This is roughly the
deformation discussed in Examples 1.2–3. Use a bit of "liquid paper" or a
texta to put some dots on the rubber band. Watch how the dots move as
you stretch the rubber band. These could be the Lagrangian particle paths
discussed in Example 1.2.

Oceanographers often drop "floats" to drift with the ocean currents. These
floats are Lagrangian because they are carried with the moving water.
They are used to help determine ocean circulation which, for example, in
turn helps us model the ocean–atmosphere system to predict the nature of
global warming.

The nature of the material derivative is also illustrated in car traffic. An
observer sitting on the side of the road is an Eulerian observer of the traffic.
He/she would see, for example, a tight bunch of cars quickly passing by, and
so the observed change in time of the density, the time derivative, would be
quite high. However, a driver in one of the cars in the bunch is a Lagrangian
observer: the driver would be stuck in the bunch for a long time and so the
moving driver observes rates of change in time which are much lower. Using
the material derivative, a stationary observer is able to determine how the
moving driver will see the surrounding traffic, and vice versa.
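As a small worked illustration of my own (not from [R]), apply the material
derivative (3.1) to the tidal estuary velocity v = 5 cos(x/100 − t/3) quoted
at the start of this section, to find the acceleration of the water itself:

```latex
\begin{eqnarray*}
\frac{Dv}{Dt} & = & \frac{\partial v}{\partial t}
  + v\,\frac{\partial v}{\partial x} \\
 & = & \frac{5}{3}\sin\left(\frac{x}{100}-\frac{t}{3}\right)
  - \frac{1}{4}\sin\left(\frac{x}{100}-\frac{t}{3}\right)
    \cos\left(\frac{x}{100}-\frac{t}{3}\right).
\end{eqnarray*}
```

Note the acceleration is not just ∂v/∂t: the second, advective term arises
because the water is carried through a spatially varying velocity field.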
Example 3.1: worked Problem 1.4 Here I outline the steps in an answer
to Problem 1.4 [R,p10]. Work through the details. ("ξ" is the Greek letter
"xi" and corresponds to the English "x".)

(a) dx_L/dt = v_L is the velocity of the particle at x_L at time t, which
    = v_E(x_L, t) by its definition.

(b) From part (a)

    • dx_L/dt = v_E(x_L, t) = 2t x_L/(1 + t²) + 1 + t², which is a linear,
      first-order, ordinary differential equation for x_L.
    • Recall [K,§1.7] that we multiply by an integrating factor (What
      is it?) to solve analytically this class of differential equations,
      to find
    • x_L = (1 + t²)(t + C) is the general solution for some integration
      constant C.
    • But you know that at time t = 0 particles have their initial
      position, namely x_L(ξ, 0) = ξ, which determines the integration
      constant to be just C = ξ.
    • Thus x_L = (1 + t²)(t + ξ).
    • Then use this analytic solution to find that the endpoints of
      [0, 1] get carried to the endpoints of [10, 15].

(c) Straightforwardly check your answers are:

    • ξ_E = x/(1 + t²) − t by rearranging x_L;
    • v_L = 1 + 3t² + 2tξ by differentiating x_L;
    • a_L = 6t + 2ξ by differentiating v_L;
    • and then confirm that a_L(ξ_E, t) = Dv_E/Dt.
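A sketch of that final check, in case you get stuck (my working, so verify
it yourself): substitute ξ_E into v_L to obtain the Eulerian velocity field,
then apply the material derivative:

```latex
\begin{eqnarray*}
v_E(x,t) & = & 1+3t^2+2t\left(\frac{x}{1+t^2}-t\right)
 \;=\; 1+t^2+\frac{2tx}{1+t^2}\,,\\
\frac{Dv_E}{Dt} & = & \frac{\partial v_E}{\partial t}
 + v_E\,\frac{\partial v_E}{\partial x}
 \;=\; 4t+\frac{2x}{1+t^2}
 \;=\; 6t+2\xi_E \;=\; a_L(\xi_E,t)\,.
\end{eqnarray*}
```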
Module 3. Describing the conservation <strong>of</strong> material 89<br />
3.1.1 Exercises<br />
Activity 3.B Do Problems 1.1 [R,p5], 1.3, 1.5 [R,pp10–1], and the Exercises<br />
3.2–3.6. Send in to the examiner for feedback at least Problem 1.3<br />
[R,p10] and Exercise 3.2.<br />
Ex. 3.2: Sketch the velocity field, v(x), corresponding to the particle paths shown in the following picture. Note that v = dx/dt = 1/(dt/dx) = 1/(slope), and so v(x) at any x is inversely proportional to the slope of the particle path at that x.
[Figure: particle paths in the tx-plane; x from 0 to 10, t from 0 to 5.]
Ex. 3.3: This time the velocity field, v(x, t), depends upon time. For the particle paths shown below, sketch the velocity field at times t = 2 and t = 4.
[Figure: particle paths in the tx-plane; x from 0 to 10, t from 0 to 5.]
Ex. 3.4: Consider the movement of some material in a one-dimensional continuum. Sketch the velocity field, v(x, t), at times t = 1.5 and t = 3.5 corresponding to the particle paths shown below.
[Figure: particle paths in the tx-plane; x from 0 to 10, t from 0 to 4.]
Ex. 3.5: For the steady (time-independent) velocity field shown below, sketch the acceleration field obtained from the material derivative.
[Figure: the velocity v versus x for 0 ≤ x ≤ 8; v ranges between −0.5 and 1.]
Ex. 3.6: Suppose particles of a continuum accelerate at a = sin 2x. Use the material derivative to determine the corresponding steady velocity field v(x), given that v = 0 at x = 0.
Ex. 3.7: Some particle paths are shown in the following picture:
[Figure: particle paths in the tx-plane; x from 0 to 10, t from 0 to 5.]
Given that the material was of uniform density at time t = 0, say ρ(x, 0) = 1, sketch the density of the material at time t = 3. Also sketch a graph of the particles’ velocity versus x at time t = 4.
Example 3.8: worked Problem 1.2 Problem 1.2 [R,p5] leads into Section 3.3 where we model the flow of car traffic as a continuum. This problem is a little difficult but shows how some algebra leads us to deduce that we may treat car traffic on a road as a material continuum!
(a) The probability of having n cars in a stretch of road of length x + δx, P_n(x + δx), equals the probability of n cars in length x and none in length δx, together with the probability of n − 1 cars in length x and one car in length δx.

• Hence P_n(x + δx) = P_n(x)(1 − λδx) + P_{n−1}(x)λδx.

• Rearrange to [P_n(x + δx) − P_n(x)]/δx + λP_n = λP_{n−1}.

• Thus as δx → 0, dP_n/dx + λP_n = λP_{n−1} is a differential equation for P_n.

• Substitute the expression P_n = (λx)^n e^{−λx}/n! to show that it satisfies the differential equation.

• One should also show that ∫₀^∞ P_n(x) dx = 1 in order for P_n to be a proper probability distribution; use induction on n and integration by parts to do so. (Should any other property be checked?)
(b) Deduce:

(i) P_n(0) = 0 is the probability of n cars fitting on a stretch of road of length 0;

(ii) P_0(L) = e^{−λL} is the exponentially decaying probability of no cars on a length L;

(iii) P_1(L) = λLe^{−λL} is the probability of finding just one car in a length L; it reasonably rises from zero with L, but over lengths bigger than 1/λ it decays to zero, as more and more cars become likely on long lengths of road.
(c) The expected number of cars on a length x of road is

n̄(x) = Σ_{n=0}^∞ n P_n(x)   [by definition of expectation]
     = Σ_{n=1}^∞ n (λx)^n e^{−λx}/n!   [by the value of P_n]
     = λx e^{−λx} Σ_{n=1}^∞ (λx)^{n−1}/(n−1)!   [rearranging factors]
     = λx   [by the Taylor series for e^{λx}].

By similar machinations deduce that the variance σ²(x) = E[(n − n̄)²] = E[n²] − n̄² is also just λx.

Thus, using an averaging length L, the average density of cars is ρ = n̄/L = λ. Since n̄ typically has random fluctuations of size σ = √(λL), this estimate of the density has fluctuations of size σ/L = √(λ/L) → 0 for large L. Thus averaging works, and so cars on a road may be viewed as a continuum!
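A quick numerical check of these machinations; a Python sketch in which the parameter values and the truncation point of the infinite sums are illustrative choices:

```python
from math import exp, factorial

def P(n, lam, x):
    # probability of n cars on a stretch of road of length x
    return (lam*x)**n * exp(-lam*x) / factorial(n)

lam, x = 2.0, 3.0                 # cars per km, and length of road in km
N = 60                            # truncation of the sums (ample for lam*x = 6)

total = sum(P(n, lam, x) for n in range(N))
mean  = sum(n*P(n, lam, x) for n in range(N))
var   = sum(n**2 * P(n, lam, x) for n in range(N)) - mean**2

assert abs(total - 1) < 1e-12     # the P_n sum to one over n
assert abs(mean - lam*x) < 1e-9   # expected number of cars is lam*x
assert abs(var - lam*x) < 1e-9    # and so is the variance
```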
3.2 Conservation of mass
A biologist, physicist and mathematician were in a bar. They watch two people enter a house across the street. A little later, they see three people leave the house. The biologist says, “They must have reproduced.” The physicist says, “We must have misinterpreted the initial input.” The mathematician says, “If one more person enters the house, there will be no one inside.”
Once we know how to describe motion and properties, we then have to deduce how these relate to each other. Principles based upon conservation enable us to do this. Based upon the conservation of mass we deduce a differential equation of widespread importance.
Main aims: the most important aims of this section are:

• to understand how identifying physical processes in and on a slice of the continuum leads to a partial differential equation to solve;
• the derivation of the continuity equation

∂ρ/∂t + ∂(ρv)/∂x = 0 .   (3.2)

Reading 3.C Study all of Section 2.1 [R,pp13–16].
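A feel for equation (3.2): material is neither created nor destroyed, so in a discretisation the flux ρv that leaves one cell enters its neighbour, and the total mass cannot change. A minimal Python sketch (the grid, time step and fields are illustrative assumptions, and the centred scheme is for demonstration only, not accuracy):

```python
import math

nx, dx, dt = 100, 0.1, 0.02
rho = [1.0 + 0.5*math.sin(2*math.pi*i/nx) for i in range(nx)]
v   = [0.5 + 0.3*math.cos(2*math.pi*i/nx) for i in range(nx)]

mass0 = sum(rho)*dx                # total mass at t = 0

for step in range(50):             # advance d(rho)/dt + d(rho v)/dx = 0
    q = [rho[i]*v[i] for i in range(nx)]          # flux rho*v
    # periodic, centred flux differences: each flux leaves one cell
    # and enters its neighbour, so the total mass cannot change
    rho = [rho[i] - dt/(2*dx)*(q[(i+1) % nx] - q[i-1]) for i in range(nx)]

mass = sum(rho)*dx
assert abs(mass - mass0) < 1e-10   # mass is conserved to round-off
```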
3.2.1 Exercises

Activity 3.D Do Problem 2.1 [R,p15] and Exercises 3.9–3.12. Send in to the examiner for feedback at least Exercises 3.10–3.12.
Ex. 3.9: At some time the density of a material just happens to be constant in x, and the velocity field is as drawn below.
[Figure: the velocity v versus x for 0 ≤ x ≤ 8; v ranges between −0.5 and 1.]
Use the continuity equation (3.2) to identify where the density is increasing in time and where the density is decreasing.
Ex. 3.10: Suppose the density at some fixed station x evolves in time according to the following picture.
[Figure: the density ρ versus t for 0 ≤ t ≤ 6; ρ ranges between 0.5 and 2.5.]
Can you justifiably deduce anything from the continuity equation about the velocity field v in the neighbourhood of x? If so, what?

Ex. 3.11: Suppose the density and velocity at some time t are such that the product ρv is as shown in the following picture.
[Figure: the product ρv versus x for 0 ≤ x ≤ 6; ρv ranges between 0.5 and 1.5.]
Can you justifiably deduce anything about the evolution of the density field ρ? If so, what?

Ex. 3.12: Consider an interval [a, b] of a continuum and investigate the rate at which material is carried into and out of the interval. Suppose the velocity and density at the left-hand side are v(a) = 1 + t² and ρ(a) = 2, while those at the right-hand side are v(b) = (1 + t)² and ρ(b) = 1/(1 + t²). At what net rate is matter being carried into the interval? How much is carried in between times t = 0 and t = 1?
Ex. 3.13: Suppose the density field of a one-dimensional continuum is ρ = exp[sin(t − x)] and the velocity field is v = cos(t − x). What is the flux of material past x = 0 as a function of time? How much material passes x = 0 in the time interval [0, π/2]?
3.3 Car traffic

As far as the laws of mathematics refer to reality, they are not certain, and as far as they are certain, they do not refer to reality. Albert Einstein

One application of continuum modelling is to car traffic. We explore the modelling here, and from the mathematical model deduce phenomena that are seen on the roads.
Main aims: the most important aims of this section are:

• to appreciate the continuum modelling of car traffic;

• the use of experimental results to formulate a complete problem;

• the use of the classic approach of seeking equilibria and then linearisation to gain a preliminary understanding of the dynamics (as in Module 1 but in vastly higher dimension);

• and to introduce the basic features of the method of characteristics for solving nonlinear partial differential equations.

Reading 3.E Study all of Section 2.2 [R,§2.2,pp16–37].
Note that the method of characteristics is not just an algebraic technique: geometry and graphical drawing play a crucial role. This is a feature that many people find difficult, as they predominantly see mathematics purely as algebraic manipulation. But for the method of characteristics the graphical element is essential.
Example 3.14: Consider the car flux–density relation Q(ρ) = ρ(1 − ρ/150) cars per minute, where ρ is measured in cars per km, 0 ≤ ρ ≤ 150.

(a) Draw the graph of characteristics for the evolution of a denser patch of cars for which the initial density is ρ₀(x) = 25 + 50e^{−x²/2} cars per km.
(b) Hence graph the predicted solution ρ(x, t) at times t = 0, 1, 2 and 3 minutes.

Solution: First, deduce the wave speed c(ρ) = Q′(ρ) = 1 − ρ/75 km per minute. Then tabulate the initial density field, the wave speed (the “slope” of the characteristics passing through all the initial points) and thus also the equation of the characteristic x = s + c₀(s)t:
s    ρ₀     c₀ = c[ρ₀]   characteristic     t = 1   t = 2   t = 3
-4   25.02  0.6664       x = -4 + 0.67t     -3.33   -2.67   -2.00
-3   25.56  0.6593       x = -3 + 0.66t     -2.34   -1.68   -1.02
-2   31.77  0.5764       x = -2 + 0.58t     -1.42   -0.85   -0.27
-1   55.33  0.2623       x = -1 + 0.26t     -0.74   -0.47   -0.21
 0   75     0            x = 0 + 0t          0       0       0
 1   55.33  0.2623       x = 1 + 0.26t       1.26    1.52    1.79
 2   31.77  0.5764       x = 2 + 0.58t       2.58    3.15    3.73
 3   25.56  0.6593       x = 3 + 0.66t       3.66    4.32    4.98
 4   25.02  0.6664       x = 4 + 0.67t       4.67    5.33    6.00
Also tabulated is the position, the x value, of each characteristic at three later times. At these positions the density is the value of ρ₀ in the same row. On each of the characteristics, plotted below in a characteristic diagram, the density is constant, namely the value it had initially, as tabulated in the legend.
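The tabulated values are easily reproduced; a quick Python sketch of the same computation as the Matlab fragments:

```python
from math import exp

def rho0(s):
    # initial density, 25 + 50 exp(-s^2/2) cars per km
    return 25 + 50*exp(-s**2/2)

def c0(s):
    # wave speed c(rho) = Q'(rho) = 1 - rho/75 km per minute
    return 1 - rho0(s)/75

def x(s, t):
    # position at time t of the characteristic labelled s
    return s + c0(s)*t

# reproduce the first row of the table: s = -4 at times t = 1, 2, 3
print(round(rho0(-4), 2), round(c0(-4), 4))        # → 25.02 0.6664
print([round(x(-4, t), 2) for t in (1, 2, 3)])     # → [-3.33, -2.67, -2.0]
```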
[Figure: characteristic diagram in the tx-plane, −4 ≤ x ≤ 6, 0 ≤ t ≤ 3; the legend labels each characteristic with its constant density: 25.02, 25.56, 31.77, 55.33, 75, 55.33, 31.77, 25.56, 25.02.]
s = (-4:4)';                     % labels s of nine characteristics
rho0 = 25+50*exp(-s.^2/2);       % initial density on each characteristic
c0 = 1-rho0/75;                  % wave speed c(rho0), i.e. dx/dt
t = 0:3;
x = s*ones(size(t))+c0*t;        % x = s + c0(s)t along each characteristic
plot(x,t)
legend(num2str(rho0))
Then, at any time t, compute from the equation for each characteristic given above the value of x where you will find the density ρ₀(s). Plotting these points gives the following curves for the density ρ(x, t) at each time t.
[Figure: the density ρ versus x, −4 ≤ x ≤ 4, at the four times t = 0, 1, 2 and 3; ρ ranges from 0 up to 70.]
s = linspace(-4,4)';             % a fine grid of characteristic labels
rho0 = 25+50*exp(-s.^2/2);       % density carried by each characteristic
c0 = 1-rho0/75;                  % wave speed
t = 0:3;
x = s*ones(size(t))+c0*t;        % positions at times t = 0,1,2,3
plot(x,rho0)                     % density rho0(s) is now found at x(s,t)
Observe how, over time, the density steepens at the back of a bunch of cars, and lessens at the front.
In the traffic light example in the textbook you have to imagine that all values of density occur at the mathematical point x = 0. Physically, the initial density will smoothly rise from 0 some distance in front of the traffic lights to the jamming density some distance behind the traffic lights: the reason is that density is only defined by choosing some averaging length, and when this averaging length contains part queue and part empty road in front of the lights you get an intermediate value of the density. However, such a physically smooth transition occurs in the mathematical model at the single mathematical point x = 0. Thus draw characteristics corresponding to every value of density emanating from x = 0.
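For instance, with the flux of Example 3.14, Q(ρ) = ρ(1 − ρ/150), the characteristic through the origin carrying density ρ is x = c(ρ)t with c(ρ) = 1 − ρ/75; inverting gives the expansion-fan density ρ = 75(1 − x/t) between the extreme characteristics x = −t (jammed) and x = t (empty road). A Python sketch under these illustrative assumptions:

```python
def fan_density(x, t, rho_jam=150.0, cmax=1.0):
    # Density after a traffic light at x = 0 turns green at t = 0,
    # for the assumed illustrative flux Q = rho*(1 - rho/rho_jam).
    c = x/t                        # slope of the characteristic through (x,t)
    if c <= -cmax:
        return rho_jam             # still-jammed cars behind the fan
    if c >= cmax:
        return 0.0                 # empty road ahead of the first car
    return rho_jam/2*(1 - c/cmax)  # expansion fan: rho = 75(1 - x/t)

# at the lights themselves the density stays at half the jam density
print(fan_density(0.0, 2.0))       # → 75.0
```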
3.3.1 Exercises

Activity 3.F Do Problems 2.2–6 [R,pp34–6] and Exercises 3.15–3.16. Send in to the examiner for feedback at least Problems 2.2 & 2.3(a) [R,pp34–5], and Ex. 3.16(a) below. (In the last line of Prob. 2.6(d), “dr/dt” should be “dρ/dt”.)
Ex. 3.15: Assume the flux Q(ρ) = ρ(1 − ρ/150)(1 − ρ/300) cars per minute, where the density ρ is in cars per km.

(a) A uniform stream of cars is travelling at 50 km/hr. Approximately what density is the car traffic? At location x = 0, say, a group of cars leave the road to view the scenery, fill up with petrol, etc. Because they leave, the car traffic density is decreased locally: at what speed does the patch of low density travel in the car traffic?

(b) Repeat the above for the case when the cars travel at 20 km/hr (because they are at a higher density).
Ex. 3.16: (a) Show that a constant ρ(x, t) = ρ∗ is an equilibrium solution (a fixed point) of

∂ρ/∂t = c(ρ) ∂²ρ/∂x² .

Argue that small fluctuations to ρ about ρ∗, say ρ̂(x, t), then obey the differential equation ∂ρ̂/∂t = c∗ ∂²ρ̂/∂x² (approximately), where c∗ = c(ρ∗).

(b) Repeat the above for the differential equation

∂ρ/∂t = ∂/∂x [ c(ρ) ∂ρ/∂x ] .
Ex. 3.17: The initial value problem ∂ρ/∂t + ρ ∂ρ/∂x = 0 such that ρ(x, 0) = ρ₀(x) has solution ρ = ρ₀(s) on the characteristics x = s + ρ₀(s)t. Regard x = s + ρ₀(s)t as an implicit equation for the function s(x, t), then differentiate it to find implicit formulae for ∂s/∂t and ∂s/∂x. Hence show that ρ = ρ₀[s(x, t)] does indeed satisfy the governing differential equation ∂ρ/∂t + ρ ∂ρ/∂x = 0 .
Ex. 3.18: In Figure 3.1 is drawn the wave speed c(ρ) as a function of density ρ for car traffic along some road. Sketch the corresponding car flux–density relation Q(ρ). Estimate the value of the density corresponding to the maximum flux of cars.

Also shown in Figure 3.1 is a plot of some initial car density field ρ₀(x). Draw, with a little care, in the tx-plane characteristic curves for the subsequent evolution of car traffic; draw enough so that you then can draw a predicted density field ρ(x, t) at time t = 1 minute.
Figure 3.1: top: the wave speed c(ρ) in km/min versus the density ρ in cars/km, for 0 ≤ ρ ≤ 150; bottom: an initial car density field ρ₀(x) in cars/km versus x in km, for −3 ≤ x ≤ 3.
Module 3. Describing the conservation <strong>of</strong> material 108<br />
Approximately where and when do you estimate a “traffic shock” will form? Give reasons.
3.3.2 Age-structured populations: another application

This example of age-structured populations is introduced to show a slightly different use of the continuity equation. The same important concepts are used in a different application.

Consider a population of individuals of some species, either plant or animal. We study the structure of ages of the individuals in the population (not their spatial structure as in car traffic and other applications explored later).

• Let x denote the age of individuals, in years say, and use fractions of years by letting the age x be a real number (not just an integer). Let t denote time as usual, in years also. Then the density ρ(x, t) is the average number of individuals in the population who at time t have age approximately x.

• Now, the individuals who cross an age x are precisely those that are at age x. Hence the flux is q = ρ. Equivalently, imagine that individuals are aging at a rate v = 1 year per year (obviously!), and so again the flux q = ρv = ρ.
• Individual plants or animals will die due to accidents, disease, old age, etc. For simplicity, here just assume death occurs only by accident with a constant probability; ignore old age and other mortal enemies. Then this is an example of the process introduced in Problem 2.1 where individuals are removed at some rate. The expected number of individuals to die at age x, and hence be removed from the population, is then proportional to the number at that age, namely ρ. Thus include a “source” term r = −λρ on the right-hand side of the continuity equation (negative because deaths remove individuals).

• A continuity equation for the age structure (∂ρ/∂t + ∂q/∂x = r) is thus

∂ρ/∂t + ∂ρ/∂x = −λρ .
• For example, a steady age population is found by assuming ∂ρ/∂t = 0 and solving this equation. Then ρ = Ce^{−λx}, which shows the exponentially decreasing number of individuals at age x, as fatal accidents almost inevitably happen to an individual sooner or later.

• But what is the integration constant C? A partial differential equation generally needs boundary conditions, and so far I have not supplied it with any. Here we need to specify some birth-rate. A constant rate of births could reflect a scientist continually preparing new young cultures to place in the population: specified by, say, ρ(0, t) = C.
A more sophisticated model says that the number of births depends upon the number and age of the parent population. One simple example arises by assuming that each individual, constantly and independently of age, gives rise to new individuals: then ρ(0, t) = μ ∫₀^∞ ρ dx for some birth-rate μ. Determine the integration constant C as a function of the birth-rate μ.
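A short numerical sketch of the steady age structure (the death rate, boundary value and step size are illustrative): integrating dρ/dx = −λρ from ρ(0) = C reproduces the exponential decay ρ = Ce^{−λx}.

```python
from math import exp

lam, C = 0.5, 100.0               # illustrative death rate and birth flux
dx, steps = 0.001, 4000           # integrate the steady equation up to age 4

rho, x = C, 0.0
for _ in range(steps):            # midpoint (RK2) step of d(rho)/dx = -lam*rho
    rho_mid = rho - 0.5*dx*lam*rho
    rho = rho - dx*lam*rho_mid
    x += dx

# compare with the exact steady solution rho = C e^{-lam x}
assert abs(rho - C*exp(-lam*x)) < 1e-3
```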
3.3.3 Answers to selected Exercises

3.6 v = √(1 − cos 2x)

3.10 No.

3.11 Yes. The rate of change in density with time has the opposite sign to the slope of ρv.

3.12 Rate of matter increase is 1 + 2t² − 2t/(1 + t²). Total is 1 + 2/3 − log 2.

3.15 (a) ρ = 17.33 cars per km and c = 40.40 km per hour; (b) ρ = 81.39 cars per km and c = −11.17 km per hour.
Prob. 1.1 (a) ρ = (2n + 1)/L where n = [L/2]; (b) p = (2n + 1)/L · {2 + 10⁻⁶ n(n + 1)/3}
Prob. 1.3 (a) Plot x = 1/(1 + t); (b) v_L = −ξ²/(1 + ξt)² and a_L = 2ξ³/(1 + ξt)³; (c) v_E = −x²; (d) determine Dv_E/Dt = 2x³.

Prob. 1.5 The label ξ is constant for each particle.
Prob. 2.5 (a) ∂s/∂x = 1/[1 + c₀′(s)t] and ∂s/∂t = −c₀(s)/[1 + c₀′(s)t]; (b) a shock!
Prob. 2.6 (a) Because the radioactive material is conserved. (b) The characteristics are x = (s + t)(1 + t²). (c) Follows from s = x/(1 + t²) − t. (d) The characteristics stay the same, but there is decay along each characteristic.
3.4 Summary

• The continuum approximation leads to describing density, velocity and stress/pressure fields, for example, as functions of position x and time t (§3.1).

• Conservation of material leads to the continuity equation (§3.2)

∂ρ/∂t + ∂(ρv)/∂x = 0 .

• Typically, experimental observations are needed to complete the set of continuum equations. For example, v = V(ρ) for cars (§3.3).

• Linearisation of dynamical equations about convenient equilibria leads to approximate solutions which allow us to make useful predictions about the dynamics that occur in the applications.

• The method of characteristics, based upon the chain rule and a graphical approach, leads to exact solutions of the partial differential equations describing nonlinear waves and shocks.
Module 4

The dynamics of momentum

Mass conservation is just one powerful principle in modelling the dynamics of material. Another fundamental principle for mechanical systems is the conservation of momentum. This is explored in this module and applied to the dynamics of gases and blood.
Module contents

4.1 Conservation of momentum . . . . . . . . . . . . . . . 115
4.1.1 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 116
4.2 Dynamics of ideal gases . . . . . . . . . . . . . . . . . . 119
4.2.1 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 120
4.3 Equations of quasi-one-dimensional blood flow . . . . 124
4.3.1 Answers to selected Exercises . . . . . . . . . . . . . . . 124
4.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . 125
4.1 Conservation of momentum

Balls in flight and other bodies obey Newton’s laws of motion, in particular F = ma: in words, the net applied force F causes a body with mass m to move with acceleration a. These rules apply when the body is rigid. But many bodies are not: many materials flex and compress or expand. What rules apply then? We find out in this section that Newton’s laws still apply, in a novel manner.
Main aims: This section largely repeats the arguments for conservation of mass (§3.2), although applied in a more sophisticated fashion. The most important aims are:

• to understand how identifying the physical processes in and on a slice of continuum leads to a partial differential equation;

• to derive the momentum equation

ρ Dv/Dt = ρ(∂v/∂t + v ∂v/∂x) = F + ∂σ/∂x .   (4.1)

Reading 4.A Study Section 3.1 [R,pp47–51].
4.1.1 Exercises

Activity 4.B Do Problems 3.1–2 [R,pp51–2] and the exercises below. Send in to the examiner for feedback at least Exercises 4.1 and 4.3 below.
Ex. 4.1: Consider a body extending over some range of x which is initially at rest and then is accelerated into motion by the gravitational body force F = −gρ. Show that v = −gt (independent of x) satisfies the momentum equation (4.1) and explain why this describes a body falling freely.
Ex. 4.2: A material moves according to the velocity field v = 2tx/(1 + t²) + 1 + t² and has a constant density field ρ. How much momentum is in the interval [0, 1] of the material? As a function of time t, what is the rate at which momentum enters the interval [0, 1] through being carried across the ends x = 0 and x = 1?
Ex. 4.3: A material body has no applied body force, F = 0, but has the internal pattern of stress σ(x) shown below. Sketch the resultant material acceleration. For simplicity assume the body has constant density in x at this particular time.
[Figure: the stress σ versus x for 0 ≤ x ≤ 8; σ ranges between −1 and 1.]
Example 4.4: worked Problem 3.3 In outline [R,p52].

(a) The continuity equation ∂ρ/∂t + ∂(ρv)/∂x = 0 when ρ is constant reduces to just ∂v/∂x = 0, which implies v may not depend upon x and hence depends only upon t.

(b) With stress σ = −p, body force F = −Cv and v independent of x, the momentum equation (4.1) reduces to

• ∂p/∂x = −(Cv + ρ ∂v/∂t), which, since the right-hand side is independent of x, is integrated to

• p = −(Cv + ρ ∂v/∂t)x + D for some integration constant D independent of x,

• and thus observe the pressure is linear in x.

(c) Substituting x = 0 determines D = p₀(t). Substituting x = L then determines the given differential equation for v(t).
(d) The differential equation is a linear, first-order differential equation and so may be solved by multiplying by the integrating factor e^{Ct/ρ}. Treating (p_L − p₀)/L as a constant, obtain the solution

v = (p_L − p₀)/(LC) · (1 − e^{−Ct/ρ}) .

Interpret the solution to see that the flow exponentially quickly approaches the steady, long-term flow-rate of (p_L − p₀)/(LC), representing a balance between the driving pressure drop, p_L − p₀, and the total viscous drag, LC.
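The claimed solution is readily checked against the differential equation it solves, ρ dv/dt + Cv = (p_L − p₀)/L (written with the sign convention of the stated solution); a Python sketch with illustrative constants:

```python
from math import exp

rho, C, L = 1.2, 0.8, 2.0          # illustrative density, drag and length
pL, p0 = 5.0, 1.0                  # illustrative pressures at x = L and x = 0

def v(t):                          # the claimed solution
    return (pL - p0)/(L*C)*(1 - exp(-C*t/rho))

h = 1e-6                           # step for a central difference in time
for t in (0.1, 0.5, 2.0, 10.0):
    dvdt = (v(t + h) - v(t - h))/(2*h)
    # rho dv/dt + C v equals the constant pressure-drop forcing
    assert abs(rho*dvdt + C*v(t) - (pL - p0)/L) < 1e-6

# the flow tends exponentially to the steady rate (pL - p0)/(L C)
assert abs(v(50.0) - (pL - p0)/(L*C)) < 1e-9
```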
4.2 Dynamics of ideal gases

Perhaps the next simplest mechanical continuum is that formed by ideal gases. For example, air is an ideal gas to a very good approximation. Here we show how to use the two conservation equations, one for material and one for momentum, to deduce the nature and propagation of sound. We extend the analysis to a description of a sonic boom such as that generated by a supersonic plane.
Main aims:

• to understand the need to supplement the partial differential equations by an equation of state;

• to see the wave equation arise in the linearised dynamics of the mathematical model

∂²u/∂t² = c² ∂²u/∂x² ;   (4.2)

• to show further use of the basics of the method of characteristics.
Reading 4.C Study Section 3.2 [R,pp52–8]. (Note: in [R,p55], twice g(x − c∗t) should read g(x + c∗t).)
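Any u = f(x − c∗t) + g(x + c∗t) satisfies the wave equation (4.2), as the reading shows; a quick numerical check with particular, illustrative choices of f, g and the wave speed:

```python
from math import sin

c = 1.5                            # wave speed (illustrative)

def u(x, t):                       # d'Alembert form f(x - ct) + g(x + ct)
    return sin(x - c*t) + (x + c*t)**3

h = 1e-4                           # step for second central differences
for (x, t) in ((0.3, 0.7), (1.0, 2.0), (-2.0, 0.5)):
    utt = (u(x, t + h) - 2*u(x, t) + u(x, t - h))/h**2
    uxx = (u(x + h, t) - 2*u(x, t) + u(x - h, t))/h**2
    assert abs(utt - c**2*uxx) < 1e-3   # u_tt = c^2 u_xx holds
```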
4.2.1 Exercises

Activity 4.D Do Problem 3.4 [R,p58].

Example 4.5: worked Problem 3.5 In outline [R,p58]; fill in the details.

(a) Reproduce the argument on [R,pp24–5] for the velocity v instead of ρ, and for k + v instead of c(ρ).

(b) Draw a characteristic diagram for 0 ≤ x ≤ 4.5 and 0 ≤ t ≤ 4 as in Figure 4.1.

• Characteristics emanating from the x-axis (x > 0) have slope k(= 1), as the velocity v = 0 on them all from the initial state. For example, the characteristic emanating from (x, t) = (1, 0) is the line x = 1 + t, and on this line we know that v = 0.
• Characteristics emanating from the t-axis (t > 0) have differing slopes of 1 + v for the prescribed v. For example, the characteristic emanating from (x, t) = (0, 1/2) is the line x = (1 + 1/(2π))(t − 1/2), and on this line v = 1/(2π).
By looking at the value of v on each characteristic at each of the two times t = 2 and t = 4, draw the solution curves for the velocity v as seen in Figure 4.2. Evidently, from the intersection of the characteristics, the first shock needs to form at some time between roughly t = 2 and t = 2.5. (A Matlab program to animate the characteristic solution is in fanreal.m; try it.)
Figure 4.1: characteristic diagram for a fan blowing air into a long pipe [the tx-plane with 0 ≤ x ≤ 4.5 and 0 ≤ t ≤ 4].
Figure 4.2: velocity field predicted at the two times t = 2 and t = 4 by the characteristic solution [two panels of v versus x for 0 ≤ x ≤ 4.5]. Note the multi-valued solution indicating the need for “shocks”.
Exercise 4.6: For an ideal gas with γ = 1 the continuity and momentum equations are

∂ρ/∂t + ∂(ρv)/∂x = 0   and   ρ[∂v/∂t + v ∂v/∂x] = −k² ∂ρ/∂x .

Linearise about the fixed point v = 0 and ρ = ρ∗, and then combine the linearised equations to deduce that sound, density–velocity fluctuations, obeys the wave equation

∂²v̂/∂t² = k² ∂²v̂/∂x² .
4.3 Equations of quasi-one-dimensional blood flow

Main aims:

• to generalise the continuity and momentum equations to situations where the cross-sectional area of a continuum varies in space and time;

• to see how to model the dynamics of blood flowing through an elastic artery by the forced wave equation.

Reading 4.E Study Sections 5.1–2 [R,pp111–123].

Activity 4.F Do Problems 5.1–2 [R,pp123–4]. Send in to the examiner for feedback at least Prob. 5.1.
4.3.1 Answers to selected Exercises

4.2  ρ(t + 1 + t²) ,  −ρ[2t/(1 + t²) + 4t²/(1 + t²)²]

Prob. 3.1 (a) F_{i+1} = −(4 − i)mg   (b) σ = −(L − x)ρg

Prob. 3.2  ∂(ρv)/∂t + ∂(ρv²)/∂x = F + ∂σ/∂x + ru

Prob. 3.4 (a) ρ = [ρ₀^{2/5} − (2/5)gx/k²]^{5/2}   (b) ∂v/∂t + (ρ∗^{1/5} k + 6v/5) ∂v/∂x = 0
4.4 Summary

• The principle of conservation of momentum leads to the momentum equation (§4.1)
  ρ(∂v/∂t + v ∂v/∂x) = F + ∂σ/∂x .
• Typically, experimental observations are needed to complete the set of continuum equations. For example, σ = −p ∝ −ρ^γ for gases (§4.2).
• Linearisation of dynamical equations about convenient equilibria leads to approximate solutions which allow us to make useful predictions about the dynamics that occur in the applications.
• In situations where the continuum varies in cross-sectional area, say A(x, t), the continuity equation becomes (§4.3)
  ∂(Aρ)/∂t + ∂(Aρv)/∂x = 0 ,
  and the momentum equation is
  Aρ[∂v/∂t + v ∂v/∂x] = F₁ + ∂(Aσ)/∂x .
• The material and muscles of an artery suggest a linear model, Hooke's law for arteries, p = p∗ + α(R − R∗) + P(x, t), relating the pressure (−σ) to the varying radius of the artery and including the muscle applied pressure P (§4.3).
Part II

Structure, algebra and approximation of applied functions
Part contents

5 The nature of infinite series 129
5.1 Introduction to summing an infinite series . . . . . . . . . . . 132
5.2 Establishing when a series converges . . . . . . . . . . . . . . 142
5.3 Power series . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
5.4 Taylor's theorem in n-dimensions . . . . . . . . . . . . . . . . 166
5.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183

6 Series solutions of differential equations give special functions 185
6.1 Power series method leads to Legendre polynomials . . . . . . 189
6.2 Frobenius method is needed to describe Bessel functions . . . 195
6.3 Computer algebra for repetitive tasks . . . . . . . . . . . . . . 206
6.4 The orthogonal solutions to second order differential equations 245
6.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249

7 Linear transforms and their eigenvectors on inner product spaces 252
7.1 Inner product spaces . . . . . . . . . . . . . . . . . . . . . . . 255
7.2 The nature of linear transformations . . . . . . . . . . . . . . 269
7.3 Revision of eigenvalues and eigenvectors . . . . . . . . . . . . 284
7.4 Diagonalisation transformation . . . . . . . . . . . . . . . . . 289
7.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 312
Module 5

The nature of infinite series

Quite often we use power series to approximately solve differential equations (see the next module). For example, an exact solution to the differential equation y″ = 6y² is y = 1/(1 + x)². But suppose, using techniques developed in the next module, we knew only the power series approximate solution y = 1 − 2x + 3x² − 4x³ + 5x⁴ − · · · : how can we sensibly ascribe a value to such an infinite sum?

This module will focus on the following question:

How is it possible, quite generally, to add up infinitely many numbers and still obtain a sum which is finite and sensible?
A set of infinitely many numbers will be called a 'sequence' and when the numbers in a sequence are added together they are said to form an 'infinite series'. If an infinite series has a finite sum it is said to 'converge' and if not, to 'diverge'.

Though not couched in these terms, our question has a long history in mathematics, beginning with the work of the Greek philosopher Zeno of Elea in the 5th century B.C. Zeno is noted for having posed four 'paradoxes' which showed that in order to understand fundamental concepts like motion, change, continuity and infinity, one must resolve questions like the one we have before us. In turn, it was essential that these concepts be placed on a firm mathematical foundation to allow the complete development of differential and integral calculus, begun by Newton and Leibnitz in the 17th century.
Module contents

5.1 Introduction to summing an infinite series . . . . . . 132
5.1.1 Zeno's Second Paradox: Achilles and the Tortoise . . . . 133
5.1.2 Case studies: using partial sums . . . . . . . . . . . . . 134
5.1.3 Case study: the harmonic series diverges . . . . . . . . . 139
5.2 Establishing when a series converges . . . . . . . . . . 142
5.2.1 Absolute and conditional convergence . . . . . . . . . . 143
5.2.2 Tests for the convergence of series . . . . . . . . . . . . 145
5.3 Power series . . . . . . . . . . . . . . . . . . . . . . . . . 147
5.3.1 Functions from power series . . . . . . . . . . . . . . . . 151
5.3.2 Taylor and Maclaurin Series . . . . . . . . . . . . . . . . 152
5.3.3 Truncation error for Taylor series . . . . . . . . . . . . . 155
5.3.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 164
5.4 Taylor's theorem in n-dimensions . . . . . . . . . . . . 166
5.4.1 Identify local maxima and minima . . . . . . . . . . . . 170
5.4.2 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 180
5.4.3 Answers to selected Exercises . . . . . . . . . . . . . . . 182
5.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . 183
5.1 Introduction to summing an infinite series

Suppose then that we have an infinite sequence of real numbers and, for simplicity, that all the numbers are positive. Intuitively, it is clear that if all the numbers remain about the same size, or if they progressively increase in size, such as 1, 2, 3, 4, . . . , then when the numbers are added together their sum will grow without limit. On the other hand, if the numbers grow progressively smaller in size, so that when the numbers are added together each successive number contributes less and less to the overall sum, such as 1, 1/2, 1/4, 1/8, . . . , then it might be possible for the sum to remain finite.

Now suppose that we allow the sequence to contain both positive and negative numbers. The negative numbers will tend to cancel out the contributions which the positive numbers make to the sum. If the negative numbers are randomly interspersed among the positive numbers, then the effect that they might have on the sum is difficult to assess. However, in many practical situations, the negative terms alternate with the positive ones and this is easier to handle. These intuitive ideas will be developed more fully below.

Main aims:

• introduce some examples of summing an infinite series;
• show examples of when a sum cannot be found.
5.1.1 Zeno's Second Paradox: Achilles and the Tortoise

Consider just one of Zeno's paradoxes which, in modern units, could be expressed as follows.

Achilles, who runs 10 times faster than a tortoise, set off to chase one 100 metres away. At the same time the tortoise began to crawl away from him. By the time Achilles reached the point where the tortoise started, the tortoise was 10 m away. Achilles continued the chase but, by the time he reached the tortoise's previous position, the tortoise had moved and was now 1 m away. Achilles continued for another metre, but yet again the tortoise had moved further. This apparently continues forever: the tortoise had always moved by the time Achilles reached where it last was. Evidently, Achilles was never able to catch the tortoise.

This conclusion is clearly absurd. We know from experience that tortoises are relatively easy to catch. Zeno was concerned with finding the fault in his logic. Had he the use of a modern number system, much of his problem would have disappeared: the total distance run by Achilles in chasing the tortoise is

100 + 10 + 1 + 0.1 + 0.01 + 0.001 + · · · = 111.11111 . . . metres.

To suggest that Achilles could not cover this distance is to say that he would never be able to run 112 m, which he certainly could. The problem is with the word never. The ancient Greeks apparently thought that it was impossible
to sum an infinite set of numbers and arrive at a finite sum. This is refuted in our number system by the commonplace notion of a recurring decimal, for example

1/3 = 0.333333 . . . = 3/10 + 3/10² + 3/10³ + 3/10⁴ + 3/10⁵ + · · ·

where the right-hand side is the sum of an infinite series and the left-hand side is its clearly finite sum. Thus this infinite series converges to the value 1/3. This is an example of a convergent geometric series.
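A quick numerical cross-check of these two geometric sums, sketched in Python (the unit's own snippets use Matlab; this is just an illustration):

```python
# Achilles' total distance: 100 + 10 + 1 + 0.1 + ... is geometric with
# ratio 1/10, so it sums to 100/(1 - 1/10) = 1000/9 = 111.111... metres.
achilles = sum(100 * 0.1**k for k in range(40))

# The recurring decimal: 3/10 + 3/10^2 + 3/10^3 + ... = 1/3.
third = sum(3 / 10 ** (k + 1) for k in range(40))

print(achilles)  # ~111.1111
print(third)     # ~0.3333
```

Forty terms already agree with the closed forms to machine precision, because the terms shrink by a factor of ten each time.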
Definition 5.1 Given an infinite sequence of numbers that we wish to sum, say z₁, z₂, z₃, . . . , we define the partial sums S_n = ∑_{k=1}^n z_k and say that the infinite series, the infinite sum, ∑_{k=1}^∞ z_k converges to the value lim_{n→∞} S_n if this limit exists.
5.1.2 Case studies: using partial sums

Example 5.1: establishing convergence from partial sums. Here I show that
∑_{k=1}^∞ 1/[k(k + 1)] = 1 .

• Begin by considering the sequence {S_n} of partial sums:
S₁ = 1/(1 × 2) (the first term)
S₂ = 1/(1 × 2) + 1/(2 × 3) (the sum of the first 2 terms)
S₃ = 1/(1 × 2) + 1/(2 × 3) + 1/(3 × 4) (the sum of the first 3 terms)
⋮
S_n = 1/(1 × 2) + 1/(2 × 3) + 1/(3 × 4) + · · · + 1/[n(n + 1)] .

• Finding a sum for this series requires that we find a limit for the sequence {S_n}. To proceed, note that
1/[n(n + 1)] = 1/n − 1/(n + 1) ;
then
S_n = (1 − 1/2) + (1/2 − 1/3) + · · · + (1/(n − 1) − 1/n) + (1/n − 1/(n + 1)) .

• Clearly, all terms cancel except the first and last, a process known as a telescopic sum, and this leaves:
S_n = 1 − 1/(n + 1) .

• It follows that:
lim_{n→∞} S_n = lim_{n→∞} (1 − 1/(n + 1)) = 1 .

Thus ∑_{k=1}^∞ 1/[k(k + 1)] converges and its sum is 1, as is seen in the table below.

      n    nth term z_n       partial sum S_n
      1    0.5                0.5
     10    0.0090909090909    0.909090909090
     50    0.0003921568627    0.980392156862
    100    0.0000990099009    0.990099009900
    200    0.0000248756218    0.995024875621
    500    0.0000039920159    0.998003992015
   1000    0.0000009990009    0.999000999000
 10 000    0.0000000099990    0.999900009999

This result is also displayed graphically using Matlab:
[Figure: partial sums S_n plotted against n (top) and against 1/n (bottom).]

n=50;
k=1:n;
s=cumsum(1./(k.*(k+1)));
subplot(2,1,1)
plot(k,s,'+',k,1+zeros(size(k)),'--')
subplot(2,1,2)
plot(1./k,s,'.',0,1,'o')

The top plot shows the partial sums converging to the limit as n → ∞ and the bottom plot shows the same limit, plotted as a circle in the top left corner, but perhaps more convincingly as 1/n → 0 (equivalent to n → ∞) by plotting S_n against 1/n.
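The same cumulative sums are easy to reproduce in Python (an illustrative translation of the Matlab snippet, not from the study book):

```python
from itertools import accumulate

n = 10_000
terms = [1 / (k * (k + 1)) for k in range(1, n + 1)]
partial = list(accumulate(terms))  # S_1, S_2, ..., S_n

# The telescopic formula predicts S_n = 1 - 1/(n + 1) exactly.
predicted = [1 - 1 / (k + 1) for k in range(1, n + 1)]
print(partial[9], predicted[9])    # S_10: both ~0.9091
print(partial[-1])                 # S_10000, close to the limit 1
```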
Example 5.2: establishing divergence from partial sums. The series
∑_{k=1}^∞ (−1)^{k+1} = 1 − 1 + 1 − 1 + · · ·
is divergent, for the partial sum
S_n = ∑_{k=1}^n (−1)^{k+1} = 1 − 1 + 1 − 1 + · · · + (−1)^{n+1} = 0 if n is even, or 1 if n is odd.
Thus the sequence of partial sums is {1, 0, 1, 0, . . .} which has no limit.
This example provides a good illustration of the absurdities which can arise from supposing that a limit exists when, in fact, it does not.

• Suppose that the above series has a limit, say S; then
S = 1 − 1 + 1 − 1 + 1 − 1 + · · ·
  = 1 − (1 − 1 + 1 − 1 + 1 − · · ·)
  = 1 − S
⇒ 2S = 1 ⇒ S = 1/2 .
• However, it is equally valid (actually equally invalid) to argue
S = 1 − 1 + 1 − 1 + 1 − 1 + · · ·
  = (1 − 1) + (1 − 1) + (1 − 1) + · · ·
  = 0 ,
• or again,
S = 1 − (1 − 1) − (1 − 1) − (1 − 1) − · · ·
  = 1 .
In this context divergence has nothing to do with a differential operator! It means that an infinite series fails to sum to any sensible finite value.

• We cannot sensibly ascribe any particular value to the sum and hence we say that the series is divergent.
5.1.3 Case study: the harmonic series diverges

The infinite series
∑_{k=1}^∞ 1/k = 1 + 1/2 + 1/3 + 1/4 + · · ·
is called the harmonic series. We will show in a moment that the harmonic series diverges, which is important in connection with Kreyszig's caution [K,p735] that the terms in a series getting inexorably smaller, z_k → 0, is not a sufficient condition for the series to converge.

Example 5.3: the harmonic series is divergent. The proof is by contradiction, i.e. assume that the series converges to some value H, and show that this leads to a contradiction.

Let H = 1 + 1/2 + 1/3 + 1/4 + 1/5 + 1/6 + 1/7 + · · · ,
    E = 1/2 + 1/4 + 1/6 + · · · ,
    O = 1 + 1/3 + 1/5 + 1/7 + · · · .

E represents the sum of the even terms and O the sum of the odd terms. Since they are both sub-sets of H, they must converge if H does.
Now observe three facts which form the contradiction.

• Since the harmonic series has simply been partitioned into a series of its even terms and a series of its odd terms, we must have
H = E + O .
• Since for all n, the nth term of O is larger than the nth term of E, it follows that
O > E ,
which means that O contributes more than half of the total of H, so that E must contribute less than half of the total.
• Taking a common factor of 1/2 out of each term of E allows us to rewrite E as
E = (1/2)(1 + 1/2 + 1/3 + 1/4 + 1/5 + 1/6 + · · ·) , or E = (1/2)H ,
which contradicts the previous observation that E must be less than half of H.

In spite of the fact that 1/k → 0 as k → ∞, the harmonic series is divergent, a famous example originally discovered by Nicole d'Oresme in the 14th century. It should be noted, however, that the harmonic series diverges very slowly: after fifteen thousand terms the sum has grown to 10.1931 and after one million terms, to only 14.3927, yet it does diverge!
[Figure: partial sums S_n of the harmonic series against n (top, with ln(2n + 1) dashed) and against 1/n (bottom), produced by]

n=50;
k=1:n;
s=cumsum(1./k);
subplot(2,1,1)
plot(k,s,'+',k,log(2*k+1),'--')
subplot(2,1,2)
plot(1./k,s,'+')
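The slow growth quoted above is easy to confirm; a Python sketch (function name is mine) reproduces the figures 10.1931 and 14.3927:

```python
def harmonic(n):
    # Partial sum H_n = 1 + 1/2 + ... + 1/n, summed smallest-first
    # to limit rounding error.
    return sum(1.0 / k for k in range(n, 0, -1))

print(harmonic(15_000))     # ~10.1931
print(harmonic(1_000_000))  # ~14.3927
# H_n grows like ln(n) plus Euler's constant 0.5772..., so it does
# diverge, but only logarithmically slowly.
```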
5.2 Establishing when a series converges

Main aims:

• introduce the two types of convergence when summing an infinite series: absolute convergence, which is robust, and conditional convergence, which is, in a sense, marginal;
• develop and use three tests for convergence of the sum of a series.

Reading 5.A Study Section 14.1 in Kreyszig [K,pp732–40].

Note:

• Chapter 14 deals with sequences and series of complex numbers, but the same theory applies if the numbers are real.
• Remember the distinction between a sequence and a series: an infinite series is summed to give a sequence of partial sums.
• Cauchy's convergence principle for series also applies to a sequence in the form, paraphrasing that on [K,p735], that

Theorem 5.2 A sequence S_n converges if and only if for every ε > 0 (no matter how small), we can find an N (depending upon ε in general) such that |S_n − S_m| < ε for all n, m > N.

Cauchy's principle is extremely useful, especially in more difficult problems, because we can test rigorously for convergence without actually knowing the value of the limit to which the sequence or series converges! But we will see little of this aspect in this unit.
5.2.1 Absolute and conditional convergence

The notions of absolute convergence and conditional convergence are well illustrated by contrasting the harmonic series, which diverges, with the alternating harmonic series,
∑_{k=1}^∞ (−1)^{k+1}/k = 1 − 1/2 + 1/3 − 1/4 + · · · ,
which converges [K,p736, Example 3], but only just.
[Figure: partial sums S_n of the alternating harmonic series against n (top) and against 1/n (bottom), converging to ln 2, produced by]

n=50;
k=1:n;
s=cumsum((-1).^(k-1)./k);
subplot(2,1,1)
plot(k,s,'+',k,log(2)+zeros(1,n),'--')
subplot(2,1,2)
plot(1./k,s,'+',0,log(2),'o')
Essentially, the alternation of sign produces some degree of cancellation in successive terms which is sufficient to allow the series to converge, whereas the harmonic series, which has terms of the same size but all positive, fails to converge. In this situation the convergence is conditional.
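Numerically, the conditional convergence to ln 2 (the dashed level in the Matlab plot) can be checked with a short Python sketch:

```python
import math
from itertools import accumulate

n = 100_000
terms = [(-1) ** (k + 1) / k for k in range(1, n + 1)]
partial = list(accumulate(terms))

# Successive partial sums bracket the limit ln 2 = 0.69314...
print(partial[-2], partial[-1], math.log(2))
```

The error after n terms is at most the next term, 1/(n + 1), so convergence here is painfully slow: that is the "only just" of conditional convergence.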
On the other hand, an absolutely convergent series such as
∑_{k=1}^∞ (−1)^{k+1}/k²
converges absolutely because the sum of the absolute values of terms,
∑_{k=1}^∞ |(−1)^{k+1}/k²| = ∑_{k=1}^∞ 1/k² ,
converges, though obviously to a different sum (see Using the Comparison test in §§5.2.2). Here its terms z_k → 0 fast enough to ensure convergence even though all terms are positive.
5.2.2 Tests for the convergence of series

The comparison test, ratio test and root test which Kreyszig establishes in Theorems 5–10 of §14.1 are very useful tools in determining whether a given series converges. Notice that they do not tell you what the sum of the series may be; other methods are needed for that. The ratio test is the most important of these.

Often geometric series are useful in applications of the comparison test since their convergence is easily established [K,Theorem 9, p739].
Example 5.4: using the Comparison test. The series ∑_{k=1}^∞ 1/k² is convergent, for

• a sneaky way to write this is
∑_{k=1}^∞ 1/k² = 1 + ∑_{k=2}^∞ 1/k² = 1 + ∑_{k=1}^∞ 1/(k + 1)² .
• Now observe that
1/(k + 1)² < 1/[k(k + 1)] ;
• then ∑_{k=1}^∞ 1/(k + 1)² converges by comparison with ∑_{k=1}^∞ 1/[k(k + 1)], which was shown to converge in the worked example in §§5.1.2.
• Hence the series ∑_{k=1}^∞ 1/k² converges, as displayed below.
[Figure: partial sums S_n of ∑ 1/k² against n (top) and against 1/n (bottom), converging to π²/6, produced by]

n=50;
k=1:n;
s=cumsum(1./k.^2);
subplot(2,1,1)
plot(k,s,'+',k,pi^2/6+zeros(1,n),'--')
subplot(2,1,2)
plot(1./k,s,'+',0,pi^2/6,'o')
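The comparison argument can be probed numerically as well; this Python sketch (illustrative, not from the study book) checks the term-by-term inequality and the limit π²/6 used in the Matlab plot:

```python
import math

n = 2_000
# Term-by-term domination: 1/(k+1)^2 < 1/(k(k+1)) for every k >= 1.
dominated = all(1 / (k + 1) ** 2 < 1 / (k * (k + 1)) for k in range(1, n + 1))
print(dominated)  # True

# Partial sums of 1/k^2 creep up towards pi^2/6 = 1.64493...
s = sum(1 / k**2 for k in range(1, n + 1))
print(s, math.pi**2 / 6)
```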
Activity 5.B Do examples from Problem Set 14.1 [K,p730]. Send in to the examiner for feedback at least Q3, 7, 12 & 13.
5.3 Power series

We are interested in power series such as the "solution" y = 1 − 2x + 3x² − 4x³ + 5x⁴ − · · · of the differential equation (1 + x)²y″ = 6y. This power series and its properties will depend upon x, for example: at x = 0 it is y = 1 − 0 + 0 − 0 + 0 − · · · which trivially converges to y = 1; at x = 1 it clearly diverges as the terms in the series 1 − 2 + 3 − 4 + 5 − · · · increase in magnitude; at x = 1/2 it converges and so we might say y(1/2) ≈ 1 − 1 + 3/4 − 1/2 + 5/16 = 9/16, but what then is the error? how good may we expect the linear approximation y = 1 − 2x to be? This section addresses these questions:

• how does convergence depend upon x in such a power series?
• what sort of error may we expect in any finite truncation of the infinite series?
Main aims:

• to show that within their domain of convergence, power series define well-behaved functions of x (or z);
• conversely, the Taylor or Maclaurin series of a function generally converges to the function in some domain;
• to deduce an expression that usefully estimates the error in using a Taylor series approximation.
A power series is an infinite series with terms that involve a variable; Kreyszig uses a complex variable z, but the theory applies equally to real power series, where we might use x to represent a real variable. Thus a power series like
∑_{n=0}^∞ a_n zⁿ = a₀ + a₁z + a₂z² + · · ·
involves both constant coefficients, a₀, a₁, a₂, . . . , and increasing powers of a complex or real variable z, roughly like an "infinite polynomial". Notice that we start the summation at n = 0 to allow for a constant term a₀, not depending on z, but the convergence or divergence of the resulting series is determined by the value of z, as well as by the coefficients.
Reading 5.C Study [K,pp741–5, §14.2], particularly Radius of convergence.

Example 5.5: Write down the centre and determine the radius of convergence of the power series 1 − 2x + 3x² − 4x³ + 5x⁴ − · · ·.
Solution: Clearly this has centre of expansion x = 0 as it is written in powers of x = (x − 0). To determine its radius of convergence note that the power series is ∑_{n=0}^∞ (n + 1)(−1)ⁿxⁿ; that is, its nth coefficient is a_n = (−1)ⁿ(n + 1). Use the ratio test:
|a_{n+1}x^{n+1} / (a_n xⁿ)| = |(−1)^{n+1}(n + 2)x^{n+1} / ((−1)ⁿ(n + 1)xⁿ)| = |(n + 2)/(n + 1)| |x| → |x| as n → ∞ ,
which is less than 1 if and only if |x| < 1. Thus the radius of convergence is R = 1 and we expect the power series to usefully converge for −1 < x < 1. This analysis holds if x is either real or complex.
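Inside this radius the series in fact sums to 1/(1 + x)², the exact solution quoted at the start of the module; a brief Python check (illustrative only):

```python
def partial_sum(x, n):
    # First n terms of 1 - 2x + 3x^2 - ... = sum of (k+1)(-x)^k.
    return sum((k + 1) * (-x) ** k for k in range(n))

print(partial_sum(0.5, 60), 1 / 1.5**2)  # inside R = 1: both ~0.4444
print(partial_sum(1.5, 60))              # outside R = 1: wildly large
```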
Sometimes a power series involves only even or only odd powers of x − c (or x), in which case the radius of convergence is best determined from the ratio test in terms of (x − c)² (or x²). The following example shows the sort of considerations that could be applied.
Example 5.6: convergence in x². Consider the power series for
sin x = x − x³/6 + x⁵/120 − · · · = ∑_{n odd} ((−1)^{(n−1)/2}/n!) xⁿ ,
and show it converges for all x. A direct application of the ratio test fails because the ratio of consecutive terms is either 0 or ∞, as all the terms in even powers are zero! However, recast the series as
sin x = ∑_{n=0}^∞ ((−1)ⁿ/(2n + 1)!) x^{2n+1} = x ∑_{n=0}^∞ (1/(2n + 1)!) zⁿ
upon letting z = −x² and extracting the common factor of x from the series. Then it is straightforward to show that the series ∑_{n=0}^∞ zⁿ/(2n + 1)! converges for all z from the ratio test:
|a_{n+1}z^{n+1} / (a_n zⁿ)| = |(z^{n+1}/(2n + 3)!) / (zⁿ/(2n + 1)!)| = |z/((2n + 3)(2n + 2))| → 0 as n → ∞ ,
for all z. Since it converges for all z = −x², the original series must correspondingly converge for all x.

Other substitutions may be used to analyse the convergence of power series with other patterns of zero terms.
Activity 5.D Do problems in Problem Set 14.2 [K,p745]. Send in to the examiner for feedback at least Q2 & 4.
5.3.1 Functions from power series

The key point of this subsection is that at every point z for which such a power series converges, we can use its sum to define the value of a function f(z):
f(z) = ∑_{n=0}^∞ a_n zⁿ = a₀ + a₁z + a₂z² + · · ·   (5.1)
Kreyszig shows that such functions f(z), called analytic functions, have nice properties: they are continuous, differentiable and integrable at every point inside their radius of convergence. Also their derivatives and integrals are found exactly as you would hope, by differentiating or integrating the power series term-by-term.

Reading 5.E Study all of §14.3 [K,pp746–8] except for the subsection Power series represent analytic functions, which you need only read.
Exercise 5.7: Suppose a function f(x) is defined by a power series in (x − c) with some nonzero radius of convergence R:
f(x) = ∑_{k=0}^∞ a_k(x − c)ᵏ = a₀ + a₁(x − c) + · · · + a_n(x − c)ⁿ + · · ·  ∀x such that |x − c| < R .
(Recall that ∀ is short for "for all.") By differentiating f repeatedly with respect to x and evaluating each derivative at x = c, show that
f⁽ⁿ⁾(c)/n! = a_n  for n = 0, 1, 2, . . . .
Given that we have finally established convergence of an infinite sum and that we can differentiate a power series, this exercise can now be done. Most importantly, it establishes that the power series representation of any function f(x) about x = c is unique and is its Taylor series.

Activity 5.F Do the above exercise and problems in Problem Set 14.3 [K,pp750–1]. Send in to the examiner for feedback at least Q3 & 4.
5.3.2 Taylor and Maclaurin Series

The English mathematician Brook Taylor (1685–1731) and the Scottish mathematician Colin Maclaurin (1698–1746) pioneered this work for real power series. Taylor presented his results for power series in (x − c) while Maclaurin's name is associated with power series in x.
If a function f can be represented by a power series in (x − c), with radius of convergence R, then
f(x) = f(c) + f′(c)(x − c) + (f″(c)/2!)(x − c)² + · · · + (f⁽ⁿ⁾(c)/n!)(x − c)ⁿ + · · ·
for all x such that |x − c| < R. This series representation is called the Taylor series in (x − c) of the function f.

You have shown in Exercise 5.7 that if f has a power series representation then it must be the Taylor series, i.e. there is only one power series in (x − c) to correspond to a given function f.

When c = 0, the Taylor series gives a power series in x called the Maclaurin series. The Maclaurin series representation of the function f is:
f(x) = f(0) + f′(0)x + (f″(0)/2!)x² + · · · + (f⁽ⁿ⁾(0)/n!)xⁿ + · · ·
for all x such that |x| < R. Note that in the Maclaurin series, all the derivatives of f are evaluated at 0, and the interval of convergence has its centre at 0.

The Taylor series in (x − c) of a function f is usually referred to as the 'Taylor series expansion of f about c', while the Maclaurin series of f is the 'Taylor series expansion of f about 0'.
Module 5. The nature <strong>of</strong> infinite series 154<br />
Example 5.8: finding a Maclaurin series. Assuming that f(x) = e^x can
be represented by a power series in x, we find its Maclaurin series as
follows. First, find f and its derivatives at x = 0:

f(x) = e^x ⇒ f(0) = 1
f'(x) = e^x ⇒ f'(0) = 1
f''(x) = e^x ⇒ f''(0) = 1
⋮

Hence the Maclaurin series is:

f(x) = e^x = 1 + x + x^2/2! + x^3/3! + x^4/4! + · · · = Σ_{k=0}^∞ x^k/k! .
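The convergence of this series is easy to check numerically. The following is a minimal Python sketch (Python rather than the Matlab used elsewhere in this book; the function name is our own) that sums the first few terms and compares them with the exact exponential:

```python
import math

def exp_maclaurin(x, n_terms):
    """Partial sum of the Maclaurin series for e^x: the sum of x^k/k! for k = 0..n_terms-1."""
    return sum(x**k / math.factorial(k) for k in range(n_terms))

# Successive partial sums approach e^x; here at x = 1 they approach e = 2.71828...
for n in (2, 5, 10, 15):
    print(n, exp_maclaurin(1.0, n))
```

Because k! grows so rapidly, fifteen terms already agree with e to better than ten decimal places.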
Reading 5.G Study the part of §14.4 [K,pp754–7] from Power series as Taylor
series to the end of the section inclusive.
Note the pivotal role of the power series properties of uniqueness, differentiability
and integrability.
Exercise 5.9: Find the radius and interval of convergence for the power
series

f(x) = Σ_{n=0}^∞ (x − 2)^(n+1) / ((n + 1) 3^(n+1)) .

Find the sum of the series for f(x), thus writing an expression for f(x)
not involving an infinite series. Hint: consider f'(x).
Activity 5.H Do problems from Problem Set 14.4 [K,pp757–9]. Send in to<br />
the examiner for feedback at least Q2, 10 & 19.<br />
5.3.3 Truncation error for Taylor series<br />
Some <strong>of</strong> the earliest work on power series was done by the Scots mathematician<br />
James Gregory (1638–1675). He developed a power series method for<br />
interpolating table values for functions. The idea <strong>of</strong> using power series to<br />
estimate function values remained a prime motivation for later workers like<br />
Taylor. For example, putting x = 1 in the Maclaurin series for e^x we obtain:

e = exp(1) = 1 + 1 + 1/2! + 1/3! + 1/4! + · · · = Σ_{k=0}^∞ 1/k! = 2.718281828459045235360287 . . . .

Now e is a transcendental number, i.e. it is not the root of any algebraic
equation and its value is an infinite, non-recurring decimal. In fact the only
way of representing the number e exactly is as the sum of an infinite series.
To estimate its value, though, we have to take a partial sum of the series,
and in doing so we make a truncation error. With computers it is now
possible to compute e to hundreds, thousands, or even millions of decimal
places. This is far greater accuracy than Gregory ever dreamed of, but
every expansion involves an error and we should know something about
these errors.
Consider the Taylor series of a function f about c:

f(x) = f(c) + f'(c)(x − c) + f''(c)(x − c)^2/2! + · · · + f^(n)(c)(x − c)^n/n! + · · ·
∀x such that |x − c| < R .    (5.2)
Truncate the series after terms up to order n to form an nth degree polynomial
approximation to f(x):

P_n(x) = f(c) + f'(c)(x − c) + f''(c)(x − c)^2/2! + · · · + f^(n)(c)(x − c)^n/n! ,

where P_n(x) is called the Taylor polynomial of degree n for f at c. (Taylor
polynomials are like the partial sums of a series.) The truncation error made
in such an approximation is:

R_n(x) = f(x) − P_n(x) ,

where R_n(x) is called the remainder term for an nth order approximation.
Example 5.10: Taylor polynomials approximate the function. Consider
the power series 1 − 2x + 3x^2 − 4x^3 + 5x^4 − · · · discussed earlier,
which we claimed is the power series for y = 1/(1 + x)^2. The first few
Taylor polynomials are:

P_0(x) = 1 ,
P_1(x) = 1 − 2x ,
P_2(x) = 1 − 2x + 3x^2 ,
P_3(x) = 1 − 2x + 3x^2 − 4x^3 .
These are plotted below for −0.5 ≤ x ≤ 0.5, with 1/(1 + x)^2 plotted dashed:

[Figure: the Taylor polynomials P_0(x), P_1(x), P_2(x) and P_3(x) together with the dashed curve 1/(1 + x)^2, produced by the Matlab commands]

x=linspace(-0.5,0.5);
p=[ones(size(x))    % rows are the successive terms of the series
-2*x
3*x.^2
-4*x.^3];
p=cumsum(p);        % partial sums down the columns give P0, P1, P2, P3
plot(x',p',x,1./(1+x).^2,'--')
Observe that all Taylor polynomials are accurate sufficiently close to<br />
the centre <strong>of</strong> expansion x = 0. The error, or remainder, away from<br />
x = 0 is given by the distance from a curve to the exact dashed line<br />
and is different for each polynomial.<br />
Example 5.11: A first-order Taylor polynomial for f(x) = e^x about
x = 1. With c = 1,

f(x) = f(c) + f'(c)(x − c) + R_1(x) .

Here f(c) = f'(c) = e^1 = e, so

e^x = e + e(x − 1) + R_1(x) = e x + R_1(x) .
The following theorem shows one way to estimate the remainder R_n(x).

Theorem 5.3 (Lagrange's remainder) Let f be a function which has n + 1
derivatives that are continuous on some interval I containing c. Then, for
every x ∈ I, there exists a number u, between x and c, such that:

f(x) = f(c) + f'(c)(x − c) + · · · + f^(n)(c)(x − c)^n/n! + R_n(x) = P_n(x) + R_n(x)

where Lagrange's remainder is

R_n(x) = f^(n+1)(u)(x − c)^(n+1)/(n + 1)! .    (5.3)
Example 5.12: Lagrange's remainder. Examine the simple example of
the cubic f(x) = 1 + x + x^3. Its Taylor series about x = 0 is
just itself (this is why this example is simple). The linear
Taylor polynomial approximation to f(x) is simply P_1(x) = 1 + x.
By inspection we know that its error is the remainder R_1(x) = x^3.
However, in complicated cases we will not know this, and we have to
see what the theorem can tell us. Here it tells us that there exists a u,
0 ≤ u ≤ x, such that

R_1(x) = f''(u)x^2/2! = (6u/2)x^2 = 3ux^2 .

Here, because we already know R_1(x) = x^3, we identify the correct
u = x/3, which is indeed between 0 and x. In general we will not know
R_1(x) exactly, but because 0 ≤ u ≤ x we can say that the
remainder, the error, satisfies R_1(x) = 3ux^2 ≤ 3x^3. Thus
we can often place a bound on the error in a Taylor polynomial.
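This little example can be checked directly. A Python sketch (the names are our own, chosen for illustration) evaluates the exact remainder for the cubic of Example 5.12 and confirms it lies under the bound 3x^3:

```python
def f(x):
    return 1 + x + x**3          # the cubic of Example 5.12

def P1(x):
    return 1 + x                 # its linear Taylor polynomial about 0

x = 0.4
R1 = f(x) - P1(x)                # exact remainder; equals x**3 for this cubic
bound = 3 * x**3                 # Lagrange-style bound using 0 <= u <= x
print(R1, bound)
```

At x = 0.4 the exact remainder is 0.064 while the bound is 0.192: the bound is not tight, but it is guaranteed.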
Proof:

• Since x is a fixed point in I with x ≠ c, let g be a function of t, defined
as follows:

g(t) = f(x) − f(t) − f'(t)(x − t) − f''(t)(x − t)^2/2! − · · · − f^(n)(t)(x − t)^n/n! − R_n(x)(x − t)^(n+1)/(x − c)^(n+1) .

The reason for defining g in this way is that differentiating with respect
to t has a telescoping effect. For example:

d/dt [−f(t) − f'(t)(x − t)] = −f'(t) + f'(t) − f''(t)(x − t) = −f''(t)(x − t) .

• The net result is that g'(t) simplifies to:

g'(t) = −f^(n+1)(t)(x − t)^n/n! + (n + 1)R_n(x)(x − t)^n/(x − c)^(n+1)

for all t between x and c. Also note that, for fixed x,

g(c) = f(x) − P_n(x) − R_n(x) = 0 ,

and

g(x) = f(x) − f(x) − 0 − · · · − 0 = 0 .

Thus we have g(c) = g(x) = 0 and g is differentiable between x and c.
Moreover, g is continuous throughout I, since f and its derivatives are
continuous. This includes c, x and all points in between.
• Therefore, g satisfies the conditions for Rolle's theorem¹, and it follows
that there is a number u between x and c for which g'(u) = 0. Now
substituting t = u in g'(t) gives:

g'(u) = −f^(n+1)(u)(x − u)^n/n! + (n + 1)R_n(x)(x − u)^n/(x − c)^(n+1) = 0

⇒ R_n(x) = f^(n+1)(u)(x − c)^(n+1)/(n + 1)! .    ♠
Note that when applying this result, we do not expect to be able to find the<br />
exact value <strong>of</strong> u. If we could do that, then making an approximation to f<br />
would not have been necessary. Rather, we try to find bounds for f (n+1) (u)<br />
from which we can estimate how large the remainder R n (x) might become,<br />
as in the worked example below.<br />
Lastly, suppose we approximate a function f by some Taylor polynomial, so
that:

f(x) = P_n(x) + R_n(x) ,

or equivalently,

P_n(x) = f(x) − R_n(x) .

Taking limits as n → ∞, the left-hand side gives the whole Taylor series
for f, and on the right f(x) does not depend on n. Thus a necessary and
sufficient condition for the Taylor series to converge to f is that:

lim_{n→∞} R_n(x) = lim_{n→∞} f^(n+1)(u)(x − c)^(n+1)/(n + 1)! = 0 .

¹ Those unfamiliar with Rolle's theorem should consult either of the following:
– Mizrahi & Sullivan: Calculus & Analytic Geometry (3rd edition); Wadsworth (1990), Chapter 11.
– Larson, Hostetler & Edwards: Calculus (5th edition); Heath (1994), Chapter 8.
Example 5.13: determining the accuracy of an approximation. Use a Taylor
polynomial of degree 5 for sin x about x = 0 to estimate sin(0.1), and
bound the accuracy of the approximation using Lagrange's remainder.

• Start by calculating derivatives:

f(x) = sin x ⇒ f(0) = 0
f'(x) = cos x ⇒ f'(0) = 1
f''(x) = − sin x ⇒ f''(0) = 0
f'''(x) = − cos x ⇒ f'''(0) = −1
f^(4)(x) = sin x ⇒ f^(4)(0) = 0
f^(5)(x) = cos x ⇒ f^(5)(0) = 1
f^(6)(x) = − sin x .

• Now

sin x ≈ P_5(x) = x − x^3/3! + x^5/5!

and

R_5(x) = f^(6)(u)x^6/6! = − sin u · x^6/6! for some number u with 0 ≤ u ≤ 0.1 .

• Using the above to approximate sin(0.1):

sin(0.1) ≈ P_5(0.1) = 0.1 − (0.1)^3/3! + (0.1)^5/5!
= 0.1 − 0.000166667 + 0.000000083
= 0.099833416

and the remainder is given by

R_5(0.1) = − sin u (0.1)^6/6! .

• Since the sine function is increasing on the interval [0, 0.1] we must
have 0 ≤ sin u < 1, so

−0.000000001 ≈ −(0.1)^6/6! < R_5(0.1) = − sin u (0.1)^6/6! ≤ 0

and we conclude that

0.099833416 − 0.000000001 ≤ sin(0.1) ≤ 0.099833416

or

0.099833415 ≤ sin(0.1) ≤ 0.099833416 .
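As a numerical check on the arithmetic in Example 5.13, this Python sketch (an illustration of ours, not part of the original text) evaluates P_5(0.1) and confirms that the true error lies within the Lagrange bound and has the predicted sign:

```python
import math

def P5(x):
    """Degree-5 Taylor polynomial for sin x about x = 0."""
    return x - x**3 / math.factorial(3) + x**5 / math.factorial(5)

estimate = P5(0.1)
error = math.sin(0.1) - estimate       # the true remainder R5(0.1)
bound = 0.1**6 / math.factorial(6)     # Lagrange bound: |R5(0.1)| <= (0.1)^6/6!
print(estimate, error, bound)
```

The true error is about 2 × 10^−11, comfortably inside the bound of about 1.4 × 10^−9, and negative as the analysis predicts.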
Activity 5.I Do problems 5.14–5.16 from Exercises 5.3.4.<br />
5.3.4 Exercises<br />
Ex. 5.14: Bound the error in the Taylor polynomial P_2(x) (about x = 0) as
an approximation to 1/(1 + x)^2 over the interval −1/2 < x < 1/2. What
would the bound be if we were only interested in 0 ≤ x < 1/2?
Ex. 5.15: Use a Taylor polynomial <strong>of</strong> degree 2 about x = 0 for e x to estimate<br />
e 0.1 and bound the accuracy <strong>of</strong> the approximation using Lagrange’s<br />
remainder.<br />
Ex. 5.16: Use a Taylor polynomial <strong>of</strong> degree 3 to estimate f(x) = e 2x at<br />
x = 0.1, and use Lagrange’s remainder theorem to determine an error<br />
bound for your estimate.<br />
Ex. 5.17: Use a Taylor polynomial <strong>of</strong> degree 4 about x = 0 for log(1 + x) to<br />
estimate log(1.2) and bound the accuracy <strong>of</strong> the approximation using<br />
Lagrange’s remainder (note: log denotes the natural logarithm).
Ex. 5.18: Find the Maclaurin series for the function f(x) = arctan x and
determine its radius of convergence. Hint: the Maclaurin series for
1/(1 + x^2) is 1 − x^2 + x^4 − x^6 + · · · .
Ex. 5.19: Consider the function defined by the infinite series

g(x) = Σ_{n=1}^∞ [ (−1)^n/(n 2^n) + 1/2^n ] (x + 1)^n .

Find the region in which this series converges.
5.4 Taylor’s theorem in n-dimensions<br />
It is useful to generalise Taylor’s result to functions <strong>of</strong> several variables. An<br />
outline <strong>of</strong> the three variable case is presented below, from which generalisation<br />
to other cases is straightforward.<br />
Main aims:<br />
• generalise Taylor series to many independent variables;<br />
• use this generalisation to find and characterise maxima and minima <strong>of</strong><br />
functions <strong>of</strong> many variables.<br />
Given a function f(x, y, z) we seek an expansion for f(x + h, y + p, z + q)<br />
at some ‘nearby’ point, w<strong>here</strong> the expansion is written in terms <strong>of</strong> f and its<br />
derivatives and powers <strong>of</strong> h, p and q.<br />
Exercise 5.20: By setting x − c = h in equation (5.2), show that the Taylor
series of a real function f(x), centred at x, can be written

f(x + h) = f(x) + hf'(x) + h^2 f''(x)/2! + · · · + h^n f^(n)(x)/n! + · · ·

assuming, of course, that |h| < R, the radius of convergence of the
power series at x.

The implication of this expansion is that the value of an analytic function
at points x + h 'nearby' to x is entirely determined by the values
of f and its derivatives at the point x, and the separation h. This is
useful, particularly if the radius of convergence about x is not small.
Outline of a Taylor series for a function of three variables: begin by
using the single-variable Taylor expansion derived in Exercise 5.20.

• First, vary x only, holding y and z constant; then

f(x + h, y + p, z + q) = f + h ∂f/∂x + (h^2/2!) ∂^2f/∂x^2 + (h^3/3!) ∂^3f/∂x^3 + · · ·

where f and all derivatives are evaluated at (x, y + p, z + q). (Since
only one variable changes, all derivatives are partial derivatives.)

• Now hold x and z constant in this series and do the expansion for y + p;
f and its derivatives are then evaluated at (x, y, z + q).

• Now hold x and y constant and do the expansion for z + q. Collect together
all terms with the same total order of differentiation and obtain
the following result.
f(x + h, y + p, z + q) =
( f + h ∂f/∂x + p ∂f/∂y + q ∂f/∂z )
+ (1/2!) ( h^2 ∂^2f/∂x^2 + p^2 ∂^2f/∂y^2 + q^2 ∂^2f/∂z^2 + 2hp ∂^2f/∂x∂y + 2hq ∂^2f/∂x∂z + 2pq ∂^2f/∂y∂z )
+ (1/3!) ( h^3 ∂^3f/∂x^3 + 2 similar terms + 3h^2 p ∂^3f/∂x^2∂y + 5 similar terms + 6hpq ∂^3f/∂x∂y∂z )
+ · · ·

where f and all its derivatives are evaluated at (x, y, z). This is expressed
more compactly in terms of the displacement vector H = hi + pj + qk as:

f(x + h, y + p, z + q) = f + (H · ∇)f + (1/2!)(H · ∇)^2 f + (1/3!)(H · ∇)^3 f + · · · + (1/n!)(H · ∇)^n f + · · ·

where

H · ∇ ≡ h ∂/∂x + p ∂/∂y + q ∂/∂z

and (H · ∇)^n f means: apply the operation H · ∇ to f, then to the result,
then to the result of that, and so on, until the operation has been done n
times. (Recall from first-year mathematics that the gradient of f is
∇f = i ∂f/∂x + j ∂f/∂y + k ∂f/∂z.)
Our work on extrema requires only the terms up to second order.
Example 5.21: Find, up to the second-order terms, the multi-variable
Taylor series of f(x, y) = cos x e^{2y} about (x, y) = (0, 0).

Solution: "Up to the second-order terms" includes (H · ∇)^2 f but
excludes all third-derivative terms. Now, using subscripts to denote
partial differentiation:

• f(0, 0) = 1;
• f_x = − sin x e^{2y}, so f_x(0, 0) = 0;
• f_y = 2 cos x e^{2y}, so f_y(0, 0) = 2;
• f_xx = − cos x e^{2y}, so f_xx(0, 0) = −1;
• f_xy = −2 sin x e^{2y}, so f_xy(0, 0) = 0;
• f_yy = 4 cos x e^{2y}, so f_yy(0, 0) = 4.

Hence the second-order truncation of the Taylor series is

f(h, p) ≈ f + (h f_x + p f_y) + (1/2)(h^2 f_xx + 2hp f_xy + p^2 f_yy)
= 1 + 2p − h^2/2 + 2p^2 .

Note: as f(x, y) is the product of a function of x and a function of y,
namely cos x and e^{2y}, this answer is quite sensibly the product of the
two single-variable, second-order Taylor polynomials, namely 1 − x^2/2
and 1 + 2y + 2y^2.
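The quadratic truncation of Example 5.21 can be checked numerically. The following is an illustrative Python sketch of ours (function names are not from the text): near the origin the truncation should differ from f only by terms of third order in (h, p).

```python
import math

def f(x, y):
    return math.cos(x) * math.exp(2 * y)

def T2(h, p):
    """Second-order Taylor truncation of f about (0, 0), from Example 5.21."""
    return 1 + 2*p - h**2 / 2 + 2 * p**2

h, p = 0.05, 0.05
print(f(h, p), T2(h, p))   # agree to about 4e-5: the error is third order
```

Halving the displacement (h, p) should shrink the discrepancy by roughly a factor of eight, as expected for a third-order error.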
Activity 5.J Do Problem 5.23 in Exercises 5.4.2. Send in to the
examiner for feedback at least part (b).
5.4.1 Identify local maxima and minima<br />
The 3D surface plotted in the following graph contains several peaks and
a trough. The highest peak is a global maximum, the trough is a global
minimum, and the two smaller peaks are called local maxima. Collectively,
such points are known as extrema. A local maximum is higher than all points
nearby, but a global maximum is the highest of all points on the surface.
Minima are defined analogously.
[Figure: a surface with several peaks and a trough, produced by the Matlab command surfc(peaks(40)+8).]
The location and study of extrema is frequently important. For example,
suppose the height z of the surface above the xy-plane represents the temperature
of a chemical reaction as quantities x and y of two reactants are
added; it may be essential to know how high or low the temperature can go
in order to properly contain the reaction.

Mathematically, a 3D surface is represented explicitly as z = f(x, y), or
implicitly by F(x, y, z) = C for some constant C. In first-year mathematics
courses we saw that local extrema occur at stationary points where

∂f/∂x = ∂f/∂y = 0 ,
so that all directional derivatives of f vanish at a stationary point; equivalently,
the tangent plane to the surface is horizontal, which means that the
normal to the surface must be in the same direction as the z-axis: that is,
∇F ‖ k. There are stationary points, called saddle points, which
satisfy these conditions but are neither minima nor maxima. In the following
figure the origin (0, 0, 0) is a saddle point. In the plane x = 0, moving
along the dashed line, the origin appears to be a local maximum, but in the
plane y = 0, along the solid line, a local minimum. The behaviour of nearby
points depends on the direction in which (0, 0, 0) is approached, which defines
a saddle point: it is neither a local minimum nor a local maximum.
[Figure: the saddle surface z = 2x^2 − 5y^2 near the origin, produced by the Matlab commands]

x=linspace(-5,5); y=x;
[X,Y]=meshgrid(x,y);
Z=2*X.^2-5*Y.^2;
surfl(X,Y,Z)
Activity 5.K Do Problem 5.24 in the Exercises 5.4.2.
Algebraically, extrema are characterised using Taylor's formula in n dimensions.
For example, in 2-D, suppose (a, b) is a local extremum of f(x, y); then compare
the value of f(a, b) with nearby points f(a + h, b + p), where h, p are
small:

• if all nearby values of f are greater than f(a, b) then (a, b) is a local
minimum;

• if all nearby values of f are less than f(a, b) then (a, b) is a local maximum;

• otherwise (a, b) is a saddle point.

Taylor's theorem gives

f(a + h, b + p) = f(a, b) + h f_x(a, b) + p f_y(a, b)
+ (1/2!) ( h^2 f_xx(a, b) + p^2 f_yy(a, b) + 2hp f_xy(a, b) )
+ higher order terms.

(Subscripts x and y on a function f denote partial derivatives with respect
to the subscript variable.)

Now f_x(a, b) = f_y(a, b) = 0, since (a, b) is an extremum, and terms which
are cubic and higher order in (h, p) are negligible compared to the quadratic
term, so

f(a + h, b + p) − f(a, b) ≈ (1/2) Q(h, p)    (5.4)
where the quadratic terms

Q = f_xx h^2 + 2 f_xy hp + f_yy p^2
  = [h p] [ f_xx  f_xy ; f_yx  f_yy ] [h ; p]
  = h^T H h ,    (5.5)

where all the second-order derivatives are evaluated at (a, b), and where the
vector h = (h, p).
Definition 5.4 In (5.5), Q(h) has been written as the quadratic form Q =
h^T H h:

• h^T H h is called the Hessian² of f at the point (a, b);

• the symmetric matrix H of second derivatives is called the Hessian
matrix;

• such a quadratic form, Q, is said to be positive definite if Q(h) > 0 for
all h ≠ 0;

• and is said to be negative definite if Q(h) < 0 for all h ≠ 0.

From (5.4):

• if Q is positive definite then f(a + h, b + p) − f(a, b) > 0 (at least near
enough to (a, b)) and so (a, b) is a local minimum;

² Ludwig Otto Hesse introduced these in 1844.
• if Q is negative definite then (a, b) is a local maximum;<br />
• otherwise, (a, b) could be a saddle point, but it could also mean that we<br />
need information from the “higher order terms” neglected in forming<br />
the approximation (5.4).<br />
Observe that the Hessian matrix in n-D,

H = [ ∂^2f/∂x_i ∂x_j ]
  = [ ∂^2f/∂x_1^2      ∂^2f/∂x_1∂x_2   · · ·  ∂^2f/∂x_1∂x_n
      ∂^2f/∂x_2∂x_1    ∂^2f/∂x_2^2     · · ·  ∂^2f/∂x_2∂x_n
         ⋮                 ⋮             ⋱        ⋮
      ∂^2f/∂x_n∂x_1    ∂^2f/∂x_n∂x_2   · · ·  ∂^2f/∂x_n^2    ]
(evaluated at a stationary point) is symmetric, and so has real eigenvalues
and orthogonal eigenvectors. Recall from first-year mathematics that we can
thus diagonalise H = P D P^T, where the columns of P are the normalised
eigenvectors of H and where the matrix D is diagonal with the eigenvalues
of H along its diagonal. (See Kreyszig §7.5 [K,pp392–8] for another summary
of diagonalisation.) Make a change of variable so that the axes of the
r = (r, s) coordinate system are aligned along the principal directions of the
quadratic Q. An example is seen in the graph below, where the r and s axes
are chosen to fit the nature of the quadratic (whose contours are shown)
with Hessian matrix

H = [ −8  4 ; 4  −4 ] .
[Figure: contours of the quadratic Q in the (h, p) plane, with the rotated r and s axes aligned along its principal directions.]

The appropriate change of variable is

r = P^T h , equivalently h = P r ,
so that in the new coordinate system the quadratic simplifies to give

Q = h^T H h = r^T P^T H P r = r^T D r .

But D is diagonal with diagonal entries the eigenvalues of H: namely D =
diag(λ_1, . . . , λ_n) in n dimensions. Thus in the r coordinate system the quadratic
is

Q = λ_1 r_1^2 + · · · + λ_n r_n^2 .    (5.6)
From this we readily deduce the shape <strong>of</strong> the quadratic and hence the nature<br />
<strong>of</strong> the stationary point:<br />
• if all eigenvalues <strong>of</strong> H are positive then Q is positive definite, as all<br />
terms in (5.6) are positive, and the stationary point is a local minimum;<br />
• if all eigenvalues are negative then Q is negative definite, as all terms<br />
in (5.6) are negative, and the stationary point is a local maximum;<br />
• if some eigenvalues are positive and some are negative then the stationary<br />
point is a saddle point as we can increase the value <strong>of</strong> Q by<br />
moving in some directions and decrease the value <strong>of</strong> Q by moving in<br />
other directions;<br />
• lastly, if the eigenvalues are all positive or all negative except some that<br />
are precisely zero, then the neglected higher order terms in f need to<br />
be taken into account.
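The diagonalisation behind these statements is easy to verify numerically. The following Python sketch (our own illustration, using the quadratic formula for a symmetric 2 × 2 matrix rather than Matlab's eig) checks, for the Hessian matrix H = [−8 4; 4 −4] of the contour figure above, that h^T H h equals λ_1 r_1^2 + λ_2 r_2^2 after the change of variable r = P^T h:

```python
import math

# Hessian matrix from the contour example: H = [[a, b], [b, c]] = [[-8, 4], [4, -4]].
a, b, c = -8.0, 4.0, -4.0

# Eigenvalues of a symmetric 2x2 matrix [[a, b], [b, c]].
mean, half_gap = (a + c) / 2, math.hypot((a - c) / 2, b)
lam1, lam2 = mean + half_gap, mean - half_gap

def unit(v):
    n = math.hypot(*v)
    return (v[0] / n, v[1] / n)

# Normalised eigenvectors: for eigenvalue lam, the vector (b, lam - a) works when b != 0.
v1, v2 = unit((b, lam1 - a)), unit((b, lam2 - a))

# Change of variable r = P^T h; then Q = lam1*r1^2 + lam2*r2^2 should equal h^T H h.
h = (0.3, -0.2)
r1 = v1[0] * h[0] + v1[1] * h[1]
r2 = v2[0] * h[0] + v2[1] * h[1]
Q_diag = lam1 * r1**2 + lam2 * r2**2
Q_direct = a * h[0]**2 + 2 * b * h[0] * h[1] + c * h[1]**2
print(Q_direct, Q_diag, lam1, lam2)
```

Both eigenvalues come out negative, so Q is negative definite and this stationary point is a local maximum, consistent with the classification above.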
Example 5.22: analyse the behaviour of z = f(x, y) = x^3 + 4xy − 2y^2 + 8
at its stationary points.

Before beginning the analysis, Matlab draws the following surface z =
f(x, y):

[Figure: the surface z = x^3 + 4xy − 2y^2 + 8 over roughly −3 ≤ x ≤ 2, −3 ≤ y ≤ 2.]
Solution:

First find the stationary points:

∂f/∂x = 3x^2 + 4y ,  ∂f/∂y = 4x − 4y ;

setting both of these equal to 0 gives x = y and 3x^2 + 4x = x(3x + 4) = 0.
So the stationary points are (0, 0) and (−4/3, −4/3). Now find the second-order
derivatives:

∂^2f/∂x^2 = 6x ,  ∂^2f/∂y^2 = −4 ,  ∂^2f/∂x∂y = 4 .

Thus:

At (0, 0) the Hessian matrix is

H = [ 0  4 ; 4  −4 ]

and hence the characteristic polynomial is

|λI − H| = λ^2 + 4λ − 16 .

This is an upwards parabola which is −16 at λ = 0, and hence
there must be one zero for negative λ and one for positive λ. Hence
the two eigenvalues have opposite sign, and so (0, 0) is a saddle
point.
At (−4/3, −4/3) the Hessian matrix is

H = [ −8  4 ; 4  −4 ]

and hence the characteristic polynomial is

|λI − H| = λ^2 + 12λ + 16 .

This is an upwards parabola which is +16 at λ = 0, and hence
both zeros must occur for λ of the same sign. Since the slope of the
parabola at λ = 0 is positive, namely 12, both zeros occur
for negative λ. Hence both (all) eigenvalues are negative, and so
(−4/3, −4/3) is a local maximum.
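The eigenvalue arguments in Example 5.22 can be confirmed numerically. Here is a short Python sketch (names our own; the quadratic formula stands in for Matlab's eig) computing both pairs of eigenvalues:

```python
import math

def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] from the characteristic polynomial
    lambda^2 - (a + d)*lambda + (a*d - b*c) = 0."""
    tr, det = a + d, a * d - b * c
    disc = math.sqrt(tr**2 - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2

# Hessian at (0, 0): eigenvalues of opposite sign, so a saddle point.
l1, l2 = eigenvalues_2x2(0, 4, 4, -4)
print(l1, l2)

# Hessian at (-4/3, -4/3): both eigenvalues negative, so a local maximum.
m1, m2 = eigenvalues_2x2(-8, 4, 4, -4)
print(m1, m2)
```

The signs agree with the parabola arguments above: one positive and one negative eigenvalue at (0, 0), two negative eigenvalues at (−4/3, −4/3).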
Activity 5.L Do problems 5.24–5.25 from Exercises 5.4.2. Send in to the<br />
examiner for feedback at least Ex. 5.25(a) and (d).<br />
5.4.2 Exercises<br />
Ex. 5.23: Find up to the second-order terms of the multi-variable Taylor
series of the following functions about the specified points:

(a) f(x, y) = cos x e^{2y} about (π/2, 0);
(b) f(x, y) = (x + y)/(1 + y) about (0, 0);
(c) f(x, y, z) = e^x √(1 + y^2 + z^2) about (0, 2, 2).
Ex. 5.24: If z = f(x, y) = 2x 2 − 5y 2 show that f x (0, 0) = f y (0, 0) = 0 and<br />
hence that (0, 0) is a stationary point.<br />
Re-write the equation for the surface in the form F (x, y, z) = C, for<br />
some constant C, and show that ∇F ‖ k at (0, 0, 0), proving again it<br />
is a stationary point.<br />
Ex. 5.25: Find the stationary points <strong>of</strong> the given functions and then determine<br />
whether they are local maxima, local minima, or saddle points.<br />
(a) f(x, y) = x 2 − y 2 + xy<br />
(b) f(x, y) = x 2 + y 2 − xy<br />
(c) f(x, y) = x 2 − 3xy + 5x − 2y + 6y 2 + 8<br />
(d) f(x, y) = log(x 2 + y 2 + 1).<br />
(e) f(x, y) = x 5 y + xy 5 + xy .<br />
Ex. 5.26: Find the three stationary points <strong>of</strong> f(x, y) = x 2 +y 2 +2 cos(x+y)<br />
and classify the stationary point at the origin.<br />
Ex. 5.27: Analyse the behaviour <strong>of</strong> f(x, y) = x 3 + 6xy + 3y 2 + 5 at its<br />
stationary points.
5.4.3 Answers to selected Exercises

5.18 x − x^3/3 + x^5/5 − · · · , with radius of convergence 1.

5.19 −3 < x < 1.

5.23 (a) f(x, y) ≈ −x − 2xy;
(b) x + y − xy − y^2;
(c) 3 + 3x + (2/3)y + (2/3)z + (3/2)x^2 + (5/54)y^2 + (5/54)z^2 + (2/3)xy + (2/3)xz − (4/27)yz.

5.24 For example, F = z − 2x^2 + 5y^2 = 0, whence ∇F = −4x i + 10y j + k = k
at x = y = z = 0.

5.25 (a) (0, 0), a saddle point;
(b) (0, 0), a local minimum;
(c) (−18/5, −11/15), a local minimum;
(d) (0, 0), a local minimum;
(e) (0, 0), a saddle point.
5.5 Summary<br />
• Tests like the Comparison test, Ratio test and Root test (§§5.2.2) are<br />
useful in determining whether a given infinite series converges or diverges<br />
but they do not establish what its sum may be. For that it may<br />
be necessary to use direct arguments based on partial sums (§§5.1.2),<br />
or to resort to direct numerical evaluation.<br />
• Infinite series that converge absolutely are “robust”, w<strong>here</strong>as series<br />
that converge conditionally rely on delicate cancellation <strong>of</strong> terms in the<br />
series (§§5.2.1).<br />
• Power series are complex/real infinite series with terms involving increasing<br />
powers <strong>of</strong> a complex/real variable z/x, around a given centre,<br />
c (§5.3). Generally, they converge (absolutely) within a disc <strong>of</strong> the<br />
complex plane, or an interval <strong>of</strong> the real line, centred at c. The radius<br />
<strong>of</strong> this disc is called the radius <strong>of</strong> convergence, but convergence is not<br />
guaranteed (and conditional at best) on the edge <strong>of</strong> the disc.<br />
• Power series are used to define functions which are continuous, differentiable<br />
and integrable within their radii <strong>of</strong> convergence (5.3.1). Conversely,<br />
a given analytic function, f(z) can be represented by a power<br />
series expansion about some centre, c, and this expansion is unique,<br />
being the Taylor series <strong>of</strong> f about c, or when c = 0, the Maclaurin<br />
series (§§5.3.2).
• Truncated Taylor series, or Taylor polynomials, are used to compute approximate values for functions (§§5.3.3). The accuracy of these approximations may be estimated with Lagrange’s remainder (5.3).
• Taylor series are generalised to functions of more than one variable (§5.4). This is used, for example, in describing the nature of the stationary points of functions of several variables, where the first-order derivatives vanish (§§5.4.1). Such points will be local minima, local maxima or saddle points depending upon the eigenvalues of the Hessian matrix of second-order derivatives.
Activity 5.M Do representatives of Problems 1–5 and 16–35 from the Chapter 14 Review [K,pp767–8].
Module 6

Series solutions of differential equations give special functions
“Although this may seem a paradox, all exact science is dominated by the idea of approximation” Bertrand Russell
We have seen how linear ordinary differential equations (ode’s) are solved if they have constant coefficients (Module 1). Higher-order ode’s are first represented as linear systems of first-order ode’s, and then the general solution is typically of the form (1.4). The power series method is the standard method for solving linear ordinary differential equations with variable coefficients. It gives solutions in the form of power series, hence the name. Power series are also the paramount method for solving otherwise intractable nonlinear differential equations.
How do variable coefficients arise in differential equations? Perhaps it is best to first explain how constant coefficients arise. Constant coefficients arise because one part of space looks very much like another; thus the mathematical expression of the processes at each point in space is the same, and hence the differential equation modelling the processes is everywhere the same. We saw this in earlier modules on continuum mechanics. Conversely, differential equations with variable coefficients arise when different points in space have different properties. I give two examples:

• look at the waves near a beach. They curve in towards the beach, steepen and break. Let x measure distance from the beach and h(x) denote the depth of the water (small near the shore and larger further away); then the height, y(x), of the waves satisfies a differential equation of the form h(x)y'' + h'(x)y' = ··· with coefficients depending upon the local water depth;
• in finance, the Black-Scholes equation is used to estimate the current value of future transactions (see the course on Advanced Mathematics). Letting s denote the price of a stock, the value v(s) satisfies a differential equation of the form rsv' + ½β²s²v'' = ··· where r is the bank interest rate and β measures how volatile the stock is. The variable coefficients, rs and ½β²s², arise because returns are relative to the investment.
These are two examples of where variable coefficient differential equations arise. This module supplies tools for the analytic solution of such variable coefficient differential equations.

In this module we develop not only the general principles and methods, but also apply them to differential equations that commonly arise in physical problems. In practice all we do is simply try a power series solution and see what solutions we obtain, §6.1. This works except when the coefficient of the highest derivative is zero; in this case we are more inventive, §6.2. The solutions of these important differential equations have special properties that make them widely useful, though perhaps not quite so useful as trigonometric and exponential functions. Called Legendre polynomials and Bessel functions, these are examples of a wide class of special functions. Inspired by these examples we then develop Sturm-Liouville theory, in §6.4, to tell us useful and general properties about the solutions of a wide class of differential equations.

We also introduce a little computer algebra, §6.3, to help with the repetitive analysis of this module and to attack nonlinear ode’s.
Module contents

6.1 Power series method leads to Legendre polynomials
    6.1.1 Introduction to the power series method
    6.1.2 Legendre’s equation and Legendre polynomials
6.2 Frobenius method is needed to describe Bessel functions
    6.2.1 Frobenius extends the method
    6.2.2 Bessel functions are used in circular geometries
6.3 Computer algebra for repetitive tasks
    6.3.1 Introducing reduce
    6.3.2 Introduction to the iterative method
    6.3.3 Iteration is very flexible
    6.3.4 Exercises
    6.3.5 Summary of some reduce commands
6.4 The orthogonal solutions to second order differential equations
    6.4.1 Answers to selected Exercises
6.5 Summary
6.1 Power series method leads to Legendre polynomials

In this first section we introduce the fundamental ideas of the power series method. These ideas are applied to standard differential equations that we could readily solve in other ways. Do not be misled: this is only so that we can compare the results to the known solutions. The power series method is very powerful and is applied to even immensely difficult mathematical problems.

Main aims:

• use the uniqueness of power series representations to derive power series solutions of differential equations;
• see how the method leads to linearly independent power series solutions;
• find the polynomial solutions of Legendre’s equation as an example of the method.

Note: in this module we will generally seek a solution y as a function of the independent variable x.
6.1.1 Introduction to the power series method

To find power series solutions to differential equations we simply substitute a power series and see the logical consequences. In particular, see how neatly we get two linearly independent solutions of a second order ode.

Reading 6.A Study Kreyszig §4.1 [K,pp194–8] and note especially how the examples work.

Activity 6.B Do problems from Problem Set 4.1 [K,p198]; find the general solutions in terms of arbitrary “integration” constants. Verify for a few of these that the power series method yields the Taylor series expansion of the general analytic solution obtained by well known methods. Send in to the examiner for feedback at least Q1 & 7.
Most <strong>of</strong> the theoretical basis for using power series to represent functions was<br />
developed in Module 5.<br />
Reading 6.C Read §4.2 [K,pp198–204], but make sure you review the sections<br />
on Shifting summation indices [K,pp202–3] and Existence <strong>of</strong> power<br />
series solutions [K,pp203–4].<br />
Four important points are the following.
• By the uniqueness of power series coefficients, the zero function must have zero coefficients. Thus when we compute the left-hand side of a differential equation as a power series and the right-hand side is zero, the coefficient of each power on the left-hand side has to be zero. This determines the equations for the power series coefficients.
• Being able to shift summation indices is an important skill to learn in order to quickly develop power series solutions.
• Power series solutions to linear ordinary differential equations exist and converge for some non-zero radius provided that the coefficient functions of the differential equation are well behaved: namely, they can all be expanded in convergent Taylor series and the coefficient of the highest derivative in the ode does not vanish at the expansion point.
• Well-behaved functions are called analytic.
Example 6.1: shifting summation indices Perhaps the easiest way to learn how to shift summation indices is to: write out the first few terms in the sum; then rewrite them as a new sum in the desired form. Usually the aim is to make the exponent of x the variable of summation. For example, consider the second derivative

    y'' = Σ_{m=0}^∞ m(m−1) a_m x^{m−2}
writing out the first 7 terms

    = 0 + 0 + 2·1·a_2 + 3·2·a_3 x + 4·3·a_4 x² + 5·4·a_5 x³ + 6·5·a_6 x⁴ + ···

then rewriting in terms of the exponent of x

    = Σ_{m=0}^∞ (m+2)(m+1) a_{m+2} x^m .

We may use the same summation variable m, or something different if we wish, because m is a parameter to the sum: it has no meaning outside of the sum in which it is used, and thus is allowed to mean different things in different sums.
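The shift can also be checked mechanically. The following Python sketch (our addition, with arbitrary illustrative coefficients) computes the coefficients of y'' both ways and confirms they agree:

```python
# Coefficients a_0, ..., a_6 of a power series y = sum_m a_m x^m
# (arbitrary values, purely for illustration).
a = [3, 1, 4, 1, 5, 9, 2]

# Coefficient of x^k in y'' from the original sum: the term
# m(m-1) a_m x^(m-2) contributes at k = m - 2, i.e. m = k + 2.
original = [m * (m - 1) * a[m] for m in range(2, len(a))]

# Coefficient of x^m in the shifted sum (m+2)(m+1) a_{m+2} x^m.
shifted = [(m + 2) * (m + 1) * a[m + 2] for m in range(len(a) - 2)]

print(original == shifted)  # -> True: the two sums agree term by term
```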
Activity 6.D Do problems from Problem Set 4.2 [K,pp204–5]. Send in to the examiner for feedback at least Q5, 15 & 23.
6.1.2 Legendre’s equation and Legendre polynomials

In many applications of mathematics we often need to solve problems in a spherical domain or on the surface of a sphere. This might be because we study the internal dynamics of a star, the weather in the global atmosphere, the dynamics of a ball, or the deformation of a near spherical drop of water. In all these cases the differential equations describing the material take a similar form because of the spherical symmetry. This form leads to Legendre’s equation,

    (1 − x²)y'' − 2xy' + n(n + 1)y = 0 ,    (6.1)

whose solutions we now explore using the techniques of power series.
When solving problems on a sphere such as the earth, x = sin(latitude), so that x = ±1 corresponds to the North and South poles and x = 0 to the equator. Consequently, in applications we require that the solutions be well behaved (analytic) at x = ±1. See that this is an essential ingredient in the analysis.
We will concentrate on the solutions of Legendre’s equation for integer n. For example:

    n = 1: y = P_1(x) = x satisfies (1 − x²)y'' − 2xy' + 2y = 0 ;
    n = 2: y = P_2(x) = ½(3x² − 1) satisfies (1 − x²)y'' − 2xy' + 6y = 0 .

But what is the other independent solution for each case? And what about other values of n?
(The classic differential equations in spherical geometry will be derived and discussed in the course on Vector calculus and partial differential equations.)
Reading 6.E Study Kreyszig §4.3 [K,pp205–8]. Note that Legendre polynomials arise as solutions when the parameter n to Legendre’s equation is integral, n ≥ 0.
Legendre polynomials and associated Legendre functions are readily computed with Matlab. See below for code to plot Legendre polynomials.
[Figure: the Legendre polynomials P_1(x), P_2(x), P_3(x) and P_4(x) plotted over −1 ≤ x ≤ 1, generated by the following Matlab code.]

x=linspace(-1,1);
pp=[];
for n=1:4
  p=legendre(n,x);
  pp(n,:)=p(1,:);
end
plot(x,pp)
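If Matlab is not to hand, the same polynomials follow from Bonnet’s recurrence (n+1)P_{n+1}(x) = (2n+1)x P_n(x) − n P_{n−1}(x). Here is a rough Python equivalent (our addition; numpy is assumed, and the commented-out plotting lines assume matplotlib):

```python
import numpy as np

x = np.linspace(-1, 1, 201)

# Bonnet's recurrence: (n+1) P_{n+1}(x) = (2n+1) x P_n(x) - n P_{n-1}(x)
p_prev, p = np.ones_like(x), x.copy()   # P_0 and P_1
pp = [p]                                # collect P_1, ..., P_4
for n in range(1, 4):
    p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    pp.append(p)

# Spot check against the explicit formula P_2(x) = (3x^2 - 1)/2.
print(np.allclose(pp[1], (3 * x**2 - 1) / 2))  # -> True

# To reproduce the plot (requires matplotlib):
# import matplotlib.pyplot as plt
# plt.plot(x, np.array(pp).T); plt.xlabel("x"); plt.show()
```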
Activity 6.F Do problems 1–9, 11 & 12 from Problem Set 4.3 [K,pp209–10]. Send in to the examiner for feedback at least Q1, 4 & 8.
6.2 Frobenius method is needed to describe Bessel functions
We are <strong>of</strong>ten interested in mathematically formulating and solving problems<br />
in a circular geometry, for example: the vibrations <strong>of</strong> a drum; the development<br />
<strong>of</strong> blood flow along nearly circular arteries and veins; and propagation <strong>of</strong><br />
light down an optical fibre. In these circumstances we use polar coordinates<br />
(r, θ) to describe the cross-sectional structures in these circular domains.<br />
Then the unknown fields, say u, are expressed as u = f(r) cos nθ w<strong>here</strong> integer<br />
n parametrises the structure around the circular domain, whence we are Such solutions <strong>of</strong><br />
lead to solve ode’s for f(r) <strong>of</strong> the form<br />
f ′′ + 1 r f ′ − n2<br />
r 2 f = 0 .<br />
Not only does such an equation have variable coefficients, it also has badly<br />
behaved coefficients as r → 0, the very centre <strong>of</strong> the domain! In this section<br />
we extend the power series method to cope with these interesting sorts <strong>of</strong><br />
problems.<br />
(Such solutions of partial differential equations are developed in the course mat2102.)
Main aims:

• generalise the power series method to cope with singular differential equations via the indicial equation;
• see how the different cases that can arise lead to the different Bessel function solutions of Bessel’s equation.
6.2.1 Frobenius extends the method

The key to analysing such more general problems, called the Frobenius method, is to seek a power series in a slightly more general form. For a problem expressed as an ode for y(x), all we need do is introduce a prefactor x^r to the power series, where r is some real or complex number to be determined as needed.¹ That is, we seek solutions in the form

    y(x) = x^r Σ_{m=0}^∞ a_m x^m = x^r (a_0 + a_1 x + a_2 x² + a_3 x³ + ···) .    (6.2)
Example 6.2: Find the first few terms in a generalised power series solution to the ode 4x²y'' + x²y' + y = 0 expanded about the centre x = 0.

Solution: Substitute the more general power series form

    y(x) = a_0 x^r + a_1 x^{r+1} + a_2 x^{r+2} + ··· ,
¹ In more tricky problems still we may resort to not only having a prefactor of x^r, but also expanding in non-integral powers of x. Trying y(x) = Σ_{m=0}^∞ a_m x^{r+qm} for some real or complex r and q is very powerful. But we will not explore this.
noting that its derivatives are

    y' = r a_0 x^{r−1} + (r+1) a_1 x^r + (r+2) a_2 x^{r+1} + ···
    y'' = r(r−1) a_0 x^{r−2} + (r+1)r a_1 x^{r−1} + (r+2)(r+1) a_2 x^r + ··· ,

then the ode becomes

    4r(r−1) a_0 x^r + 4(r+1)r a_1 x^{r+1} + 4(r+2)(r+1) a_2 x^{r+2} + ···
        + r a_0 x^{r+1} + (r+1) a_1 x^{r+2} + ···
        + a_0 x^r + a_1 x^{r+1} + a_2 x^{r+2} + ··· = 0 .

As before, the fundamental principle is that the complicated generalised power series on the left-hand side can only be equal to the zero on the right-hand side if all the coefficients of each power of x vanish. Grouping all terms in x^r, x^{r+1} and x^{r+2} we must have:

    [4r(r−1) + 1] a_0 = 0 ,
    [4(r+1)r + 1] a_1 + r a_0 = 0 ,
    and [4(r+2)(r+1) + 1] a_2 + (r+1) a_1 = 0 .
• Now, without loss of generality we may assume that a_0 ≠ 0.² Thus we arrive at the indicial equation for r, that 4r(r−1) + 1 = 0. This is simply a quadratic for r which factors to (2r−1)² = 0, thus r = 1/2 and the prefactor to the power series must be simply √x. The coefficient a_0 is not constrained (other than being non-zero).
² If a_0 = 0 then we are effectively seeking a power series of the form y = x^{r+1}(a_1 + a_2 x + ···), which is not any different in principle.
• The second equation above, from coefficients of x^{r+1}, says that a_1 = −r a_0 / [4(r+1)r + 1]. But we know r = 1/2 and hence this determines a_1 = −a_0/8.
• Similarly, the third equation above, from coefficients of x^{r+2}, says that a_2 = −(r+1) a_1 / [4(r+2)(r+1) + 1]. Hence a_2 = −3a_1/32 = +3a_0/256.

Thus a power series solution to the ode is

    y_1(x) = a_0 √x (1 − (1/8)x + (3/256)x² + ···) ,

where a_0 is an arbitrary constant.
This example leads to two questions: when does the Frobenius method work? And what happened to the second (linearly independent) solution that must exist for this second order ode?

Reading 6.G Study Kreyszig §4.4 [K,pp211–6].

Example 6.3: Find the first few orders in the expansion of a second linearly independent solution of the ode in Example 6.2.
Solution: See that Example 6.2 is an example of Case 2, when the indicial equation has a double root. Hence expect a second linearly independent solution of the form

    y_2(x) = y_1(x) log x + √x (b_1 x + b_2 x² + ···) .

Note the omission of b_0 in this expansion in order to avoid introducing an arbitrary multiple of y_1; we could leave b_0 in, but we would pointlessly reproduce some of the earlier analysis. Differentiating y_2 leads to

    y_2' = y_1' log x + y_1 x^{−1} + (3/2) b_1 x^{1/2} + (5/2) b_2 x^{3/2} + ··· ,
    y_2'' = y_1'' log x + 2 y_1' x^{−1} − y_1 x^{−2} + (3/4) b_1 x^{−1/2} + (15/4) b_2 x^{1/2} + ··· .

Substitute these into the differential equation, grouping the contributions of each term:

    4x²y_2'' = 4x²y_1'' log x + 8x y_1' − 4y_1 + 3 b_1 x^{3/2} + 15 b_2 x^{5/2} + ··· ,
    x²y_2' = x²y_1' log x + x y_1 + (3/2) b_1 x^{5/2} + ··· ,
    y_2 = y_1 log x + b_1 x^{3/2} + b_2 x^{5/2} + ··· ,

and the sum of these three lines must vanish.

• The three terms involving log x immediately cancel because y_1(x) satisfies the ode.
• Also 8x y_1' − 4y_1 + x y_1 (upon setting a_0 = 1 in y_1 for simplicity) becomes just x^{5/2}/16 + ··· ; the x^{1/2} term disappears because the indicial equation has a double root, and the x^{3/2} term disappears by chance.
• Thus grouping all terms in x^{3/2} and setting its coefficient to zero leads to 4b_1 = 0, that is b_1 = 0.
• Grouping all terms in x^{5/2} and setting its coefficient to zero leads to 1/16 + (3/2)b_1 + 16b_2 = 0. Hence b_2 = −1/256.

A second linearly independent solution is thus

    y_2 = y_1(x) log x + √x (−(1/256) x² + ···) .
Note:

• A regular point of a linear ode is any point where all the coefficient functions are analytic, namely they all have Taylor series expansions with a non-zero radius of convergence, and the coefficient of the highest derivative is non-zero.
  If a point is not regular, then it is called a singular point. Singular points for an ode often arise because of a degeneracy of the coordinate system and have nothing to do with the subject of the application of the mathematics. For example, in polar coordinates the point r = 0 is degenerate because all angles θ meet there, but the centre of a circular domain is usually completely undistinguished, just an ordinary point of the domain, in the application.
• Convergent Taylor series centred about a regular point can always be found for the general solution of an ode. At singular points, a more general power series expansion may be needed.
• The Frobenius method straightforwardly applies to higher order differential equations as well.

Activity 6.H Do problems from Problem Set 4.4 [K,pp216–7]. Send in to the examiner for feedback at least Q4 & 7.
6.2.2 Bessel functions are used in circular geometries

Bessel’s equation,

    x² y'' + x y' + (x² − ν²) y = 0 ,    (6.3)

arises in circular or cylindrical geometries (where the variable x would represent the radial distance). For example, y(x) could represent the deflection, as a function of radius, of the membrane of a circular drum; or y(x) could represent the cross-pipe structure in the blood flow along a near circular artery. Indeed the differential equation mentioned in Example 6.2 is a variant of Bessel’s differential equation. We now solve this sort of equation using Frobenius’ method. The solutions for integer ν that we find, Bessel functions of the first kind, are plotted below.

(The letter “ν” is the Greek letter “nu”, corresponding to the English “n”.)
[Figure: the Bessel functions J_0(x), J_1(x), J_2(x), J_3(x) and J_4(x) plotted over 0 ≤ x ≤ 10, generated by the following Matlab code.]

x=linspace(0,10);
j=besselj((0:4)',x);
plot(x,j)
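For those without Matlab, J_n may be computed directly from its power series J_n(x) = Σ_k (−1)^k / (k!(k+n)!) (x/2)^{2k+n}, studied in the following reading. A rough Python equivalent of the plot (our addition; numpy assumed, plotting commented out):

```python
import math
import numpy as np

def bessel_j(n, x, terms=30):
    """Bessel function J_n of integer order n >= 0, summed from its
    power series; accurate to machine precision for modest x."""
    s = np.zeros_like(x)
    for k in range(terms):
        s += (-1) ** k / (math.factorial(k) * math.factorial(k + n)) \
             * (x / 2) ** (2 * k + n)
    return s

x = np.linspace(0, 10, 201)
j = np.array([bessel_j(n, x) for n in range(5)])  # J_0, ..., J_4

print(bessel_j(0, np.array([0.0]))[0])  # -> 1.0, since J_0(0) = 1

# To reproduce the plot (requires matplotlib):
# import matplotlib.pyplot as plt
# plt.plot(x, j.T); plt.xlabel("x"); plt.show()
```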
Reading 6.I Study Kreyszig §4.5 [K,pp218–225].
Positive order Bessel functions are relevant. In applications the Bessel functions of order ν ≥ 0 are the ones of interest. Observe that as x → 0, the Bessel functions J_ν(x) ∼ a_0 x^ν, which tends to zero if the order ν is positive, but goes to infinity if the order ν is negative. In most applications the variable x is the radius r. Thus x → 0 corresponds to approaching the centre of the domain. The general solution to Bessel’s equation is y = c_1 J_ν(x) + c_2 J_{−ν}(x) (for non-integer ν), but in applications we usually cannot tolerate solutions going to infinity, and so we set the arbitrary constant c_2 = 0 in order to eliminate the bad behaviour of Bessel functions of negative order. This just leaves the physically interesting solution to be y = c_1 J_ν(x) for ν ≥ 0.
Variable transforms are useful. Now that we are investigating ode’s with variable coefficients we find a much richer range of possible ode’s. Some of these may be transformed into a well studied ode such as Bessel’s or Legendre’s equations. For example, if we can deduce that the solutions to a strange ode are J_ν(x²) or P_n(√x), then we immediately know lots about the solutions. Thus one useful technique is that of transforming an ode from one form into another, hopefully well known, form.
Example 6.4: transform an ode to Bessel’s equation Consider, as an example, Problem 3 in Problem Set 4.5 of Kreyszig, p226. The task is to transform the ode in y(x) into Bessel’s ode for y(z) where z = x². Then we would be able to say that the solution is known to be y ∝ J_ν(z) for some ν, and hence know the solution to the original ode is y ∝ J_ν(x²).

The challenge is to transform the x-derivatives in the original ode, x²y'' + xy' + (4x⁴ − 1/4)y = 0, into derivatives with respect to z. We do this using the chain rule. Among many equally valid routes, see the logic in the following for both the first and second derivatives.

    dy/dx = dy/dz × dz/dx    by the chain rule
          = (dy/dz) 2x    as z = x²
          = 2√z dy/dz    as x = √z .

Then

    d²y/dx² = d/dx (dy/dx)
            = d/dx (2√z dy/dz)    by the above expression for dy/dx
            = d/dz (2√z dy/dz) × dz/dx    by the chain rule
            = 2√z d/dz (2√z dy/dz)    as z = x²
            = 2 dy/dz + 4z d²y/dz²    by the derivative of a product.

Then substitute these derivatives into the original ode to deduce the equivalent ode

    x² (2 dy/dz + 4z d²y/dz²) + x (2√z dy/dz) + (4x⁴ − 1/4) y = 0 ;

that is, using x = √z,

    4z² d²y/dz² + 4z dy/dz + (4z² − 1/4) y = 0 ;

upon dividing by 4,

    z² d²y/dz² + z dy/dz + (z² − 1/16) y = 0 .

This is Bessel’s ode for y(z) with parameter ν = 1/4. Thus its solutions
are, for example, y ∝ J_{1/4}(z) = J_{1/4}(x²) .
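This conclusion is easy to check numerically. The sketch below (our addition; the helper j_nu is our own series implementation, not a library routine) evaluates y(x) = J_{1/4}(x²) and confirms that the residual of the original ode x²y'' + xy' + (4x⁴ − 1/4)y = 0 is small at a sample point, using central finite differences for the derivatives:

```python
import math

def j_nu(nu, x, terms=25):
    """Bessel J_nu from its power series, for x > 0."""
    return sum((-1) ** k / (math.factorial(k) * math.gamma(k + nu + 1))
               * (x / 2) ** (2 * k + nu) for k in range(terms))

def y(x):
    return j_nu(0.25, x * x)    # the claimed solution y = J_{1/4}(x^2)

x, h = 1.3, 1e-5
yp = (y(x + h) - y(x - h)) / (2 * h)               # approximates y'(x)
ypp = (y(x + h) - 2 * y(x) + y(x - h)) / h ** 2    # approximates y''(x)
residual = x**2 * ypp + x * yp + (4 * x**4 - 0.25) * y(x)
print(abs(residual) < 1e-4)  # -> True: the ode is satisfied
```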
Activity 6.J Do problems from Problem Set 4.5 [K,pp226–7]. Send in to the examiner for feedback at least Q4 & 11.
Exercise 6.5: Consider the differential equation 2y'' − 4xy' + (4x² − 6)y = 0 .

(a) Briefly explain why you would expect it to have power series solutions in the form of the Maclaurin series y = Σ_{n=0}^∞ a_n x^n.
(b) Hence construct the first few terms in a power series, with errors O(x⁴), of the solution with y(0) = 1 and y'(0) = 0 to the differential equation.
6.3 Computer algebra for repetitive tasks

    The whole of the developments and operations of analysis are now capable of being executed by machinery. . . . As soon as an Analytical Engine exists, it will necessarily guide the future course of science. Charles Babbage in Passages from the Life of a Philosopher (London 1864)

    “On two occasions I have been asked [by members of Parliament!], ‘Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?’ I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.” Charles Babbage

Software packages for computer algebra do much incredibly sophisticated analysis. However, mostly we want computers to do the tedious repetitive tasks: those where it is worth investing our time making sure the computer is doing what we want. Developing power series solutions of differential equations is an ideal application.
Main aims:

• see how computer algebra can be usefully employed to do tedious tasks;
• use simple iteration to develop power series solutions of linear and nonlinear differential equations;
• make iteration more flexible by basing it upon the residual of the governing equations.
We will use the free demonstration copies of reduce³ available from:

windows PC: ftp://ftp.maths.bath.ac.uk/pub/algebra, download in binary demored.exe and demored.img;
linux: ftp://ftp.zib.de/pub/reduce/demo/linux;
Macintosh: ftp://ftp.maths.bath.ac.uk/pub/algebra, download and unpack demored.hqx.

Check that you can start and run reduce; it should open up a window saying something like

    REDUCE 3.6, patched to 30 Aug 98...
    1:

³ There is always a limitation in using free, demonstration copies. Here the main restriction on the demonstration version is that “garbage collection” is disabled in reduce. What that means in practice is that only small to medium amounts of computer algebra can be done before having to restart reduce. It is probably best to solve one problem at a time, restarting reduce in between each problem.

(We generally use such a coloured, teletype font for computer instructions and dialogue.)
The “1:” is a prompt for a command: type quit; followed by the return or enter key for reduce to finish. If this works, you can run reduce.

• If you cannot get reduce to execute on your computer system, contact us for help. However, in the meantime you may start your work by using a telnet application to connect over the internet to the computer marlene.zib.de,⁴ login as reducet and with an empty password. A reduce session will start for you; it is a little slow but at least you can make progress with your work.
• A summary of the reduce commands that we will use is given in §§6.3.5.
• A simple introduction to reduce is given in the following Section 6.3.1.
• http://www.zib.de/Symbolik/reduce/Overview/Overview.html is an on-line overview of the capabilities of reduce.
• http://www.uni-koeln.de/cgi-bin/redref/redr_dir.html gives extensive online help on the commands and syntax of reduce.
6.3.1 Introducing reduce<br />
• Start reduce in Unix by typing reduce in a command window. To<br />
exit from reduce type the command quit; followed by the enter<br />
4 Courtesy <strong>of</strong> Konrad-Zuse-Zentrum für Informationstechnik, Berlin
key.

• Note: all reduce statements must be terminated with a semi-colon. Do not forget. They are subsequently executed by pressing the enter key.
• reduce uses exact arithmetic by default: for example, to find 100! in full gory detail type factorial(100); and enter (I will not mention the enter key again unless necessary).
• Identifiers, usually single letters, denote either variables or expressions: in f:=2*x^2+3*x-5; the identifier x is a variable whereas f, after the assignment with :=, contains the above expression; similarly after g:=x^2-x-6; then g contains an algebraic expression.
• Expressions may be
added with f+g;
subtracted with f-g;
multiplied with f*g;
divided with f/g;
exponentiated with f^3;, etc.
• Straightforward equations may be solved (by default set equal to zero): solve(x^2-x-6,x); or through using an expression previously found such as solve(f,x); .
Systems of equations may be solved by giving a list (enclosed in braces) of equations and a list of variables to be determined. For example, solve({x-y=2,a*x+y=0},{x,y}); returns the solution parametrised by a.

• Basic calculus is a snap:
differentiation uses the function df as in df(f,x); to find the first derivative; or df(g,x,x); for the second; or df(sin(x*y),x,y); for a mixed derivative. The product rule for differentiation is verified for the above two functions by df(f*g,x)-df(f,x)*g-f*df(g,x); reducing to zero.
integration is similar, int(f,x); giving the integral of the polynomial in f, without an integration constant, but perhaps more impressive is the almost instant integration of int(x^5*cos(x^2),x); . Note that repeated integration must be done by repeated invocations of int, not by further arguments as for df; extra arguments instead specify a definite integral: for example, int(f,x,0,2); gives the definite integral from 0 to 2.
• One can substitute an expression for a variable in another expression. For example, the composition f(g(x)) is computed by sub(x=g,f); .
• reduce allows you to use many lines for the one command: a command is not terminated until the semi-colon is typed. reduce alerts you to the fact that you are still entering the one command by displaying the
prompt again. Thus if you forget the semi-colon, just type a semi-colon at the new prompt and then the enter key to execute what you had typed on the previous lines.

• If reduce displays an error message along the lines of Declare xxx operator ? then you have probably mistyped something and the best response is to type N then enter.
6.3.2 Introduction to the iterative method

Computers are extremely good at repeating the same thing many times over. We use this aspect to find power series solutions of some simple differential equations, and then of some “horrible” nonlinear differential equations. The ideas are developed by example.

Example 6.6: The solution to y′′ + y = 0, y(0) = 1 and y′(0) = 0 is y = cos x. Find the Maclaurin series solution by iteration, first by hand and secondly using computer algebra.

Solution: Rearrange this ode to y′′ = −y and then formally integrate twice to y = −∫∫ y dx dx. These integrals on the right-hand side are indefinite integrals, so implicit constants of integration, say a + bx, should appear on the right-hand side. But we know that the cosine solution
to y′′ + y = 0 has y(0) = 1 and y′(0) = 0, so surely we should set a = 1 and b = 0 to account for these initial conditions. Thus

y = 1 − ∫∫ y dx dx     (6.4)

where here the integrals are implicitly the definite integral from 0 to x. This rearrangement incorporates the information of the ode and its initial conditions.

In this form we readily find its power series solution by iteration: given an approximation yₙ(x) we find a new approximation by evaluating

yₙ₊₁ = 1 − ∫∫ yₙ dx dx .

First try by hand starting from y₀ = 1:

• y₁ = 1 − ∫∫ 1 dx dx = 1 − x²/2 ;
• y₂ = 1 − ∫∫ (1 − x²/2) dx dx = 1 − x²/2 + x⁴/24 .

See that these are the first few terms in the Maclaurin series for cos x. Now try using reduce to do the algebra:

• first type the three commands on div;, off allfac; and on revpri; (do not forget the semi-colon to logically terminate each command and the return or enter key to get reduce to actually execute the line you have typed)—these commands tell reduce to format its output in a nice way for power series;

(Interestingly, this is Picard iteration, which is also used to prove existence of solutions to ode’s.)
• second, set a variable to the first approximation by typing y:=1; which assigns the value one to the variable y;
• type y:=1-int(int(y,x),x); to assign the first approximation, y₁ = 1 − x²/2, to the variable y—int(y,x) computes an integral with respect to x of whatever is in y; fortunately for us, for polynomial y it computes the integral which is zero at x = 0;
• type y:=1-int(int(y,x),x); again to compute y₂, etc;
• iterative loops are standard in computer languages and computer algebra is no exception, so type
for n:=3:8 do y:=1-int(int(y,x),x);
to compute further iterations. Nothing is printed by the loop, so finally type y; to see the resulting power series for cos x.

The entire dialogue should look like this:
1: on div;

2: off allfac;

3: on revpri;

4: y:=1;

y := 1

5: y:=1-int(int(y,x),x);

          1   2
y := 1 - ---*x
          2

6: y:=1-int(int(y,x),x);

          1   2    1   4
y := 1 - ---*x + ----*x
          2       24

8: for n:=3:8 do y:=1-int(int(y,x),x);

9: y;

     1   2    1   4     1    6      1     8       1      10
1 - ---*x + ----*x - -----*x + -------*x - ---------*x
     2       24       720       40320       3628800

        1       12         1        14          1         16
 + -----------*x  - -------------*x  + ----------------*x
    479001600        87178291200        20922789888000
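The same Picard iteration is easy to mirror outside reduce. Below is a minimal Python sketch, not part of the unit software, using only the standard library; the names ORDER and integrate_series are mine. A truncated power series is just a list of coefficients, entry k being the coefficient of xᵏ.

```python
from fractions import Fraction

ORDER = 10  # keep coefficients of x^0 .. x^9 only

def integrate_series(s):
    # term-by-term integration, constant chosen so the integral is zero at x = 0
    return [Fraction(0)] + [c / (k + 1) for k, c in enumerate(s[:ORDER - 1])]

# iterate  y <- 1 - int(int(y dx) dx)  starting from y0 = 1
y = [Fraction(1)]
for _ in range(8):
    yii = integrate_series(integrate_series(y))
    y = [Fraction(1) - yii[0]] + [-c for c in yii[1:]]

# y now holds the Maclaurin coefficients of cos x: 1, 0, -1/2, 0, 1/24, ...
```

Exact rational arithmetic via Fraction plays the role of reduce’s exact arithmetic, so the coefficients match the dialogue above rather than being floating-point approximations.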
Example 6.7: Find the general Maclaurin series solution to y′′ + y = 0 using computer algebra (reduce).
Solution: In the previous example we built in the specific initial conditions appropriate to y = cos x, namely y(0) = 1 and y′(0) = 0. If we make the integration constants arbitrary, by iterating

y = a + bx − ∫∫ y dx dx ,

then we recover the general solution parametrised by a and b where y(0) = a and y′(0) = b. Let’s do it. Start reduce:

• type factor a,b; to get reduce to group all terms in a and all terms in b;
• set the initial value to something simple satisfying the initial conditions: y:=a+b*x;
• iterate for n:=1:4 do write y:=a+b*x-int(int(y,x),x); using the write command to print each iterate.
(Always remember to start reduce with on div; off allfac; on revpri;.)

The dialogue is:

4: factor a,b;

5: y:=a+b*x;

y := b*x + a

6: for n:=1:4 do write y:=a+b*x-int(int(y,x),x);

            1   3             1   2
y := b*(x - ---*x ) + a*(1 - ---*x )
            6                 2

            1   3     1    5             1   2    1   4
y := b*(x - ---*x + -----*x ) + a*(1 - ---*x + ----*x )
            6        120                2       24

            1   3     1    5      1    7
y := b*(x - ---*x + -----*x - ------*x )
            6        120       5040

            1   2    1   4     1    6
 + a*(1 - ---*x + ----*x - -----*x )
           2       24       720

            1   3     1    5      1    7       1      9
y := b*(x - ---*x + -----*x - ------*x + --------*x )
            6        120       5040       362880

            1   2    1   4     1    6      1     8
 + a*(1 - ---*x + ----*x - -----*x + -------*x )
           2       24       720       40320
See how easily this generates the Maclaurin series for y = a cos x + b sin x.
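For readers who want a check outside reduce, here is a hedged Python sketch of the same iteration carrying the two arbitrary constants: each coefficient of xᵏ is stored as a pair (part multiplying a, part multiplying b). The representation and helper names are my own, not from the unit software.

```python
from fractions import Fraction

ORDER = 10

def integrate_pairs(s):
    # integrate each component term by term, zero at x = 0
    return [(Fraction(0), Fraction(0))] + [
        (ca / (k + 1), cb / (k + 1)) for k, (ca, cb) in enumerate(s[:ORDER - 1])
    ]

# iterate  y <- a + b*x - int(int(y))  from  y0 = a + b*x
y = [(Fraction(1), Fraction(0)), (Fraction(0), Fraction(1))]
for _ in range(4):
    yii = integrate_pairs(integrate_pairs(y))
    y = [(Fraction(1 if k == 0 else 0) - ca, Fraction(1 if k == 1 else 0) - cb)
         for k, (ca, cb) in enumerate(yii)]

# the a-parts form the cos x series and the b-parts the sin x series,
# mirroring the grouping that factor a,b; produced in reduce
```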
Now let’s try something rather hard—in fact almost impossible to solve quantitatively except via power series methods. We now use precisely the same iteration to solve a nonlinear ode!

Example 6.8: Find the Maclaurin series solution to the nonlinear ode y′′ = 6y², y(0) = 1 and y′(0) = −2.

Before solving this as a power series (by my design its exact solution just happens to be y = 1/(1 + x)²), investigate it qualitatively using techniques developed earlier by considering it as a system of first-order differential equations. Introduce z(x) = y′; then the equivalent system is

y′ = z ,  z′ = 6y² .

Hence the evolution in the phase plane is dictated by the arrows shown below, with the particular trajectory starting from the initial condition (1, −2) shown in green:
Module 6. Series solutions <strong>of</strong> differential equations give special functions 218<br />
z=y'<br />
0<br />
-0.2<br />
-0.4<br />
-0.6<br />
-0.8<br />
-1<br />
-1.2<br />
-1.4<br />
-1.6<br />
-1.8<br />
-2<br />
-0.5 0 0.5 1<br />
y<br />
[y,z]=meshgrid(-.5:.1:1,-2:.1:0);<br />
quiver(y,z,z,6*y.^2)<br />
hold on<br />
x=linspace(0,7);<br />
y=1./(1+x).^2;<br />
z=-2./(1+x).^3;<br />
plot(y,z,’g’)<br />
hold <strong>of</strong>f<br />
Solution: Now we find its power series solution! As before, recast the ode in the following form that also incorporates the initial conditions by formally integrating the ode twice:

y = 1 − 2x + 6 ∫∫ y² dx dx ,     (6.5)

where again the repeated x integral is assumed done so that each integral is zero at x = 0. Then iterate, starting from y₀ = 1 − 2x say:

y₁ = 1 − 2x + 6 ∫∫ (1 − 4x + 4x²) dx dx
   = 1 − 2x + 3x² − 4x³ + 2x⁴ ;

y₂ = 1 − 2x + 6 ∫∫ (1 − 2x + 3x² − 4x³ + 2x⁴)² dx dx
   = 1 − 2x + 6 ∫∫ (1 − 4x + 10x² − 20x³ + 29x⁴ − 32x⁵ + 28x⁶ − 16x⁷ + 4x⁸) dx dx
   = 1 − 2x + 3x² − 4x³ + 5x⁴ − 6x⁵ + (29/5)x⁶ − (32/7)x⁷ + 3x⁸ − (4/3)x⁹ + (4/15)x¹⁰ .

This is quickly becoming horrible. But that is just why computers are made. Before rushing in to use reduce, observe that here the quadratic nonlinearity y² is going to generate very high powers of x, most of which we do not want. For example, in y₂ above the terms up to x⁵ are correct, but all the higher powers are as yet wrong. 5 Another

5 The quadratic nonlinearity y² rapidly generates high powers of x in the expressions. However, the iteration plods along only getting one or two orders of x more accurate each iteration.
iteration would generate a 22nd order polynomial for y₃ of which only the first 8 coefficients are correct; the rest are rubbish. In reduce we discard such high order terms in a power series by using, for example, the command let x^10=>0; which tells reduce to discard, set to zero, or otherwise ignore, all terms with a power of x of ten or more. This is just what we want. Thus here the dialogue would be:

5: let x^10=>0;

6: y:=1-2*x;

y := 1 - 2*x

7: for n:=1:5 do write y:=1-2*x+6*int(int(y^2,x),x);

                2      3      4
y := 1 - 2*x + 3*x - 4*x + 2*x

                2      3      4      5    29   6    32   7      8
y := 1 - 2*x + 3*x - 4*x + 5*x - 6*x + ----*x - ----*x + 3*x
                                         5        7
     4   9
 - ---*x
     3

                2      3      4      5      6      7    306   8
y := 1 - 2*x + 3*x - 4*x + 5*x - 6*x + 7*x - 8*x + -----*x
                                                     35
      316   9
 - -----*x
      35

                2      3      4      5      6      7      8       9
y := 1 - 2*x + 3*x - 4*x + 5*x - 6*x + 7*x - 8*x + 9*x - 10*x

                2      3      4      5      6      7      8       9
y := 1 - 2*x + 3*x - 4*x + 5*x - 6*x + 7*x - 8*x + 9*x - 10*x
See how the iteration settles on the correct power series, with all terms of power ten or higher neglected. Check this satisfies the ode by computing the residual df(y,x,x)-6*y^2; (df(y,x) computes the derivative of y with respect to x and df(y,x,x) computes the second derivative); the result is zero except for two terms in x⁸ and x⁹ which would cancel with the second derivative of the absent tenth and eleventh order terms. We thus triumphantly write the solution of this nonlinear ode as

y = 1 − 2x + 3x² − 4x³ + 5x⁴ − 6x⁵ + 7x⁶ − 8x⁷ + 9x⁸ − 10x⁹ + O(x¹⁰) ,

where O(x¹⁰) (read “order of x¹⁰”) tells us that the error in the power series, the neglected terms, is x¹⁰ or higher powers.
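As a cross-check outside reduce, the truncation trick let x^10=>0; can be mimicked in plain Python by multiplying series modulo x¹⁰. This sketch, with my own helper names and not part of the unit software, reproduces the iteration for y′′ = 6y²:

```python
from fractions import Fraction

ORDER = 10  # emulate "let x^10=>0;": discard x^10 and beyond

def integrate(s):
    # term-by-term integration, zero at x = 0
    return [Fraction(0)] + [c / (k + 1) for k, c in enumerate(s[:ORDER - 1])]

def multiply(s, t):
    # product of two truncated series, modulo x^ORDER
    out = [Fraction(0)] * ORDER
    for i, a in enumerate(s):
        for j, b in enumerate(t):
            if i + j < ORDER:
                out[i + j] += a * b
    return out

def pad(s):
    return s + [Fraction(0)] * (ORDER - len(s))

y = pad([Fraction(1), Fraction(-2)])  # y0 = 1 - 2x
for _ in range(6):
    # iterate  y <- 1 - 2x + 6*int(int(y^2))
    yii = pad(integrate(integrate(multiply(y, y))))
    base = pad([Fraction(1), Fraction(-2)])
    y = [base[k] + 6 * yii[k] for k in range(ORDER)]

# y settles on 1, -2, 3, -4, ..., 9, -10: the series of 1/(1+x)^2
```

Just as in the dialogue, the iterates gain one or two correct orders each pass and have settled well before the sixth iteration.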
In the above three examples we have developed the Taylor series about x = 0, the Maclaurin series. To find the Taylor series about any point x = c it is simply a matter of changing the independent variable to, for example, t = x − c and then finding the Maclaurin series in t. We will continue to find only Maclaurin series because that is all we need to also find other power series solutions.

Activity 6.K Do Problems 6.13–6.17 in the Exercises 6.3.4, p239. Send in to the examiner for feedback at least Ex. 6.13 & 6.14.
6.3.3 Iteration is very flexible

So far we have simply rearranged an ode in order to derive an iteration that will generate the desired power series solution. 6 In this subsection we discuss why this strategy works at all, and what extension we need in order to solve a very wide range of differential equations.

The iteration works because integration is basically a smoothing operation. This smoothing tends to reduce errors in a power series. For example, suppose an error is O(x³), so that it is roughly 10⁻³ when x = 0.1, say:

6 What we have done is rather remarkable. In the course on Numerical Computing you will learn about fixed point iteration as a method of solving linear and nonlinear equations. In fact we have done precisely fixed point iteration here. The remarkable difference is that in Numerical Computing you will simply find the one number that satisfies a given equation; here you have found the function, via its power series, that satisfies the given differential equation—a much more difficult task. Nonetheless the strategy of appropriately rearranging the equation and iterating works.
then integrating it twice will lead to an error O(x⁵) in the integral which is much smaller in magnitude, roughly 10⁻⁵ when x = 0.1. Conversely, differentiation magnifies errors: two derivatives of an error O(x³) become an error O(x) which, at roughly 10⁻¹ when x = 0.1, is much larger. To make errors smaller, equivalently to push them to higher powers in x, we generally need to integrate. Thus an integral reformulation of an ode is the basis for a successful iterative solution.

The other question is: how do we know how many iterations should be performed? The answer here is simple: keep iterating until there is no more change to the solution. One consequence of the answer though is that we have to keep track of the change in the approximations. A good way to find the change in an approximation is to solve for it explicitly. But first we have to find an equation for the small change in the approximate solution at each iteration. This leads us to a powerful iterative framework, based upon the residual of the ode, which we develop and explore by example.
Example 6.9: Legendre functions. Use iteration to find the general Maclaurin series solution to Legendre’s equation (6.1), written here as

(1 − x²)y′′ − 2xy′ + ky = 0  for k = n(n + 1) ,

to an error O(x¹⁰) for initial conditions y(0) = 1 and y′(0) = 0.

Solution: Immediately an initial approximation is

y₀ = 1 ,
as this satisfies the initial conditions. The iterative challenge is: given a known approximation yₙ, find an improved solution

yₙ₊₁(x) = yₙ(x) + ŷₙ(x) ,

where ŷₙ is the as yet unknown change in the approximation that we have to find. (ŷ is read as “y-hat”.) Now substitute this form for yₙ₊₁ into the ode and rearrange to put all the known terms on the right-hand side and all the unknown on the left:

−(1 − x²)ŷₙ′′ + 2xŷₙ′ − kŷₙ = (1 − x²)yₙ′′ − 2xyₙ′ + kyₙ .

This looks like a differential equation for the as yet unknown change ŷₙ forced by the known right-hand side, the residual of Legendre’s equation evaluated at the current approximation, Rₙ = (1 − x²)yₙ′′ − 2xyₙ′ + kyₙ. For example, the first residual from y₀ = 1 is R₀ = k. But this ode for the change is far too complicated—indeed if we could solve it exactly then the problem would be over immediately. Instead we seek a simplification to make the ode for ŷₙ tractable while still useful. The general principles of the simplification are that in any terms involving ŷₙ:

• near the point of expansion x = 0, x is much smaller than 1 and x² is smaller still, thus we neglect higher powers of x relative to lower powers—so in this example we replace the (1 − x²) factor by 1 because the x² is negligible in comparison to 1 for the small x near the point of expansion;
• also, though be careful, because differentiation increases errors, as differentiation by x corresponds roughly to lowering the power of x by 1 (equivalently it roughly corresponds to dividing by x), we neglect low order derivatives of ŷₙ (provided they are not also divided by x)—so in this example xŷₙ′ is roughly of the same “size” as ŷₙ because the derivative makes it larger but the multiplication by x cancels this effect, but both of these terms are smaller than ŷₙ′′ which is roughly 1/x² times larger.

After this simplification, the ode for the change then reduces to

−ŷₙ′′ = Rₙ(x) = (1 − x²)yₙ′′ − 2xyₙ′ + kyₙ .

In the first iteration, as R₀ = k, ŷ₀′′ = −k, which upon integrating twice leads to the requisite change ŷ₀ = −kx²/2.

But what about the constants of integration? In this approach the initial approximation satisfies the initial conditions y(0) = 1 and y′(0) = 0. We ensure these are satisfied by all approximations by ensuring all the changes ŷₙ satisfy the corresponding homogeneous initial conditions ŷₙ(0) = ŷₙ′(0) = 0. Thus, for example, the change ŷ₀ above is indeed correct. Hence the next approximation is y₁ = 1 − kx²/2.

We could continue doing this by hand, but the plan is to use computer algebra to do the tediously repetitious iteration.

• The initial approximation is set simply by y:=1;
• We wish to discard any powers generated of O(x¹⁰) so include the declaration let x^10=>0;
• To iterate until the change is negligible use the repeat loop, namely repeat ... until r=0; where we will use r to store the residual and the change.
• The repeat-until construct in reduce, unlike many other computing languages, expects only a single statement between the repeat and the until—we bracket the multiple statements needed inside with a begin ... end.
• Inside the loop:
– compute the residual, r:=(1-x^2)*df(y,x,x)-2*x*df(y,x)+k*y;
– compute the change, r:=-int(int(r,x),x);
– update the approximation, write y:=y+r;

The reduce dialogue might be:

4: y:=1;

y := 1

5: let x^10=>0;

6: repeat begin
6: r:=(1-x^2)*df(y,x,x)-2*x*df(y,x)+k*y;
6: r:=-int(int(r,x),x);
6: write y:=y+r;
6: end until r=0;

          1     2
y := 1 - ---*k*x
          2

          1     2    1     4    1   2  4
y := 1 - ---*k*x - ---*k*x + ----*k *x
          2         4         24

          1     2    1     4    1     6    1   2  4    13   2  6
y := 1 - ---*k*x - ---*k*x - ---*k*x + ----*k *x + -----*k *x
          2         4         6         24          360

      1   3  6
 - -----*k *x
    720

          1     2    1     4    1     6    1     8    1   2  4
y := 1 - ---*k*x - ---*k*x - ---*k*x - ---*k*x + ----*k *x
          2         4         6         8         24

     13   2  6    101   2  8     1   3  6     17    3  8
 + -----*k *x + ------*k *x - -----*k *x - -------*k *x
    360          3360          720          10080

       1    4  8
 + -------*k *x
    40320

          1     2    1     4    1     6    1     8    1   2  4
y := 1 - ---*k*x - ---*k*x - ---*k*x - ---*k*x + ----*k *x
          2         4         6         8         24

     13   2  6    101   2  8     1   3  6     17    3  8
 + -----*k *x + ------*k *x - -----*k *x - -------*k *x
    360          3360          720          10080

       1    4  8
 + -------*k *x
    40320
It is painful having to retype the entire loop every time a typing mistake is made. Instead prepare a file, called say leg.red, containing the reduce commands (including an extra end; at the end):

on div; off allfac; on revpri;
factor x;
y:=1;
let x^10=>0;
repeat begin
r:=(1-x^2)*df(y,x,x)-2*x*df(y,x)+k*y;
r:=-int(int(r,x),x);
write y:=y+r;
end until r=0;
end;
then start reduce and get all these commands executed by typing in "leg.red"; . The output gives the desired Maclaurin series to be

y = 1 − (1/2)k x² + ((1/24)k² − (1/4)k) x⁴ − ((1/720)k³ − (13/360)k² + (1/6)k) x⁶
  + ((1/40320)k⁴ − (17/10080)k³ + (101/3360)k² − (1/8)k) x⁸ + O(x¹⁰) .
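A hedged cross-check in standard-library Python (the helper names are mine, not course software): for k = n(n + 1) with n = 2, that is k = 6, Legendre’s equation has a polynomial solution proportional to the degree-2 Legendre polynomial, and with y(0) = 1, y′(0) = 0 the series should terminate at 1 − 3x². The residual-driven iteration does exactly that:

```python
from fractions import Fraction

ORDER = 10

def diff(s):
    d = [Fraction(k) * c for k, c in enumerate(s)]
    return d[1:] if len(d) > 1 else [Fraction(0)]

def integrate(s):
    return [Fraction(0)] + [c / (k + 1) for k, c in enumerate(s[:ORDER - 1])]

def mul(s, t):
    out = [Fraction(0)] * ORDER
    for i, a in enumerate(s):
        for j, b in enumerate(t):
            if i + j < ORDER:
                out[i + j] += a * b
    return out

def add(s, t):
    n = max(len(s), len(t))
    s = s + [Fraction(0)] * (n - len(s))
    t = t + [Fraction(0)] * (n - len(t))
    return [a + b for a, b in zip(s, t)]

k = 6  # n = 2, so k = n*(n+1)
one_minus_x2 = [Fraction(1), Fraction(0), Fraction(-1)]
y = [Fraction(1)]
for _ in range(12):
    # residual R_n = (1 - x^2) y'' - 2x y' + k y
    r = add(add(mul(one_minus_x2, diff(diff(y))),
                mul([Fraction(0), Fraction(-2)], diff(y))),
            [Fraction(k) * c for c in y])
    change = [-c for c in integrate(integrate(r))]  # yhat = -int(int(R))
    y = add(y, change)
    if all(c == 0 for c in change):
        break

# the iteration terminates with y = 1 - 3x^2, the Legendre polynomial
# of degree 2 normalised so that y(0) = 1
```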
Example 6.10: Find the Maclaurin series solution, to errors O(x¹⁰), of the nonlinear ode y′′ + (1 + x)y′ − 6y² = 0 such that y(0) = 1 and y′(0) = −1.

Solution: Again immediately write down an initial approximation consistent with the initial conditions: namely y₀ = 1 − x. Then, given a known approximation, say yₙ(x), seek an improved approximation yₙ₊₁(x) = yₙ(x) + ŷₙ(x) where ŷₙ(x) is the as yet unknown change. Substitute into the differential equation and rearrange to deduce the following ode for the change:

−ŷₙ′′ − (1 + x)ŷₙ′ + 6ŷₙ² + 12yₙŷₙ = Rₙ = yₙ′′ + (1 + x)yₙ′ − 6yₙ² ,

where here, as always, Rₙ(x) is the known residual evaluated for the current approximation. Now simplify the left-hand side:
• since x is “small” (in the power series expansion) 1 + x ≈ 1 and similarly yₙ ≈ 1 from the initial condition y(0) = 1, so the left-hand side first simplifies to

−ŷₙ′′ − ŷₙ′ + 6ŷₙ² + 12ŷₙ ;

• but also the change ŷₙ must be small (as each ŷₙ is to make a small improvement in the solution) and so ŷₙ² must be much smaller still and should be neglected—for example we typically expect the first change ŷ₀ to be O(x²) whence ŷ₀² = O(x⁴) which is much smaller and negligible—hence the left-hand side simplifies further to

−ŷₙ′′ − ŷₙ′ + 12ŷₙ ;

• lastly, differentiation effectively decreases the order of any term so that the second derivative term dominates the others above and so the ode for the change becomes simply

−ŷₙ′′ = Rₙ = yₙ′′ + (1 + x)yₙ′ − 6yₙ² .

For example, the first iteration starts by computing the residual

R₀ = 0 + (1 + x)(−1) − 6(1 − x)² = −7 + 11x − 6x² .

Then changing sign and integrating twice gives the first change

ŷ₀ = −∫∫ R₀ dx dx = (7/2)x² − (11/6)x³ + (1/2)x⁴ ,

after recalling that we need to satisfy homogeneous initial conditions ŷₙ′(0) = ŷₙ(0) = 0 for the changes in order to ensure the solution satisfies the specified initial conditions. Thus the new approximation is

y₁ = 1 − x + (7/2)x² − (11/6)x³ + (1/2)x⁴ .
Now investigate further with computer algebra. First create a file, say nod.red, with

on div; off allfac; on revpri;
y:=1-x;
let x^10=>0;
repeat begin
r:=df(y,x,x)+(1+x)*df(y,x)-6*y^2;
r:=-int(int(r,x),x);
y:=y+r;
end until r=0;
y:=y;
end;

Second, execute the commands using the in statement to produce the output below:

2: in "nod.red";

on div;

off allfac;
on revpri;

y:=1-x;

y := 1 - x

let x^10=>0;

repeat begin
r:=df(y,x,x)+(1+x)*df(y,x)-6*y^2;
r:=-int(int(r,x),x);
y:=y+r;
end until r=0;

y:=y;

              7   2      3    25   4    257   5    219   6    1433   7
y := 1 - x + ---*x - 3*x + ----*x - -----*x + -----*x - ------*x
              2             6        60        40        252

     6355   8    199277   9
 + ------*x - --------*x
    1008       30240

end;

Thus conclude that the Maclaurin series solution is

y = 1 − x + (7/2)x² − 3x³ + (25/6)x⁴ − (257/60)x⁵ + (219/40)x⁶ − (1433/252)x⁷
  + (6355/1008)x⁸ − (199277/30240)x⁹ + O(x¹⁰) .
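This residual iteration also translates directly into a standard-library Python sketch (helper names are my own, not from the unit software); it reproduces the coefficients that reduce printed:

```python
from fractions import Fraction

ORDER = 10

def diff(s):
    d = [Fraction(k) * c for k, c in enumerate(s)]
    return d[1:] if len(d) > 1 else [Fraction(0)]

def integrate(s):
    return [Fraction(0)] + [c / (k + 1) for k, c in enumerate(s[:ORDER - 1])]

def mul(s, t):
    out = [Fraction(0)] * ORDER
    for i, a in enumerate(s):
        for j, b in enumerate(t):
            if i + j < ORDER:
                out[i + j] += a * b
    return out

def add(s, t):
    n = max(len(s), len(t))
    s = s + [Fraction(0)] * (n - len(s))
    t = t + [Fraction(0)] * (n - len(t))
    return [a + b for a, b in zip(s, t)]

one_plus_x = [Fraction(1), Fraction(1)]
y = [Fraction(1), Fraction(-1)]  # y0 = 1 - x
for _ in range(12):
    # residual R_n = y'' + (1 + x) y' - 6 y^2
    r = add(add(diff(diff(y)), mul(one_plus_x, diff(y))),
            [Fraction(-6) * c for c in mul(y, y)])
    change = [-c for c in integrate(integrate(r))]  # yhat = -int(int(R))
    y = add(y, change)
    if all(c == 0 for c in change):
        break
```

The first pass produces exactly the residual −7 + 11x − 6x² and change (7/2)x² − (11/6)x³ + (1/2)x⁴ computed by hand above; later passes gain roughly one order each until the change vanishes modulo x¹⁰.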
The following are the principles seen in this iterative approach to finding power series solutions to linear and nonlinear ode’s.

• Make an initial approximation consistent with the initial conditions of the ode.
• Seek as simple an ode for successive corrections as possible by substituting yₙ₊₁ = yₙ + ŷₙ into the differential equation, grouping all the known terms into the residual Rₙ, and then neglecting all but the dominant terms involving the change ŷₙ:
– neglect all nonlinear terms in the small change ŷₙ;
– approximate all coefficient factors by the lowest order term in x;
– and, counting each derivative with respect to x as equivalent to a division by x, keep only those terms of lowest order in x.
This process is close kin to the linearisation that we employed in Module 1 and will employ in later modules.
• Iteratively make changes as guided by the residuals until the changes are zero to some order of error in x. This is handily done by computer algebra.

Warning: when testing computer algebra code, do not use the repeat-until loop; while testing use a for-do loop to ensure that you do not get stuck in an infinite loop. Only when you are sure that your code works do you replace the for-do loop with a repeat-until loop.

Applying these principles becomes more involved when we apply them in developing power series about a singular point of an ode. We investigate a couple of examples.
Example 6.11: Bessel function of order 0. Find the power series solution of x²y′′ + xy′ + x²y = 0 that is well-behaved at x = 0, to an error O(x¹⁰)—namely find the low orders of a power series proportional to the Bessel function J₀(x).

Solution: First find and solve the indicial equation by substituting y = xʳ + O(xʳ⁺¹). Here the ode becomes

x²y′′ + xy′ + x²y = r(r − 1)xʳ + rxʳ + xʳ⁺² + O(xʳ⁺¹) = r²xʳ + O(xʳ⁺¹) ,

as the xʳ⁺² term is absorbed into the error term O(xʳ⁺¹). The only way this can be zero for all small x is if r² = 0. This leads, as discussed in Kreyszig [K, §4.4], to the homogeneous solutions of the ode being approximately y ≈ a + b log x. The logarithm is not well-behaved as x → 0, hence we set b = 0 and just seek solutions that tend to a constant as x → 0. Without loss of generality, because we can multiply by a constant later, we choose to find solutions such that y(0) = 1.
Module 6. Series solutions <strong>of</strong> differential equations give special functions 235<br />
Second we make an initial approximation to the solution. After the above discussion of the indicial equation, choose $y_0 = 1$.

Third, given a known approximation $y_n(x)$ seek an improved approximation $y_{n+1}(x) = y_n(x) + \hat y_n(x)$ where $\hat y_n(x)$ is some small change. Substitute this into the ode, neglect $x^2\hat y_n$ because it is two orders of $x$ smaller than either $x^2\hat y_n''$ or $x\hat y_n'$, and deduce that $\hat y_n$ must satisfy
$$-x^2\hat y_n'' - x\hat y_n' = R_n = x^2y_n'' + xy_n' + x^2y_n\,. \qquad (6.6)$$
Solving this for the correction $\hat y_n$ is no longer simply a matter of integrating twice. However, rearranging the form of the ode (6.6) we again express the solution in terms of two integrations. All we need to do is to notice that the left-hand side is identical to $-x(x\hat y_n')'$, whence
$$-x(x\hat y_n')' = R_n
\iff x\hat y_n' = -\int \frac{R_n}{x}\,dx
\iff \hat y_n = -\int \frac{1}{x}\int \frac{R_n}{x}\,dx\,dx\,.$$
Apply this iteration here.
(a) In the first iteration $y_0 = 1$ so the residual $R_0 = x^2$. Thus
$$\hat y_0 = -\int \frac{1}{x}\int \frac{x^2}{x}\,dx\,dx
= -\int \frac{1}{x}\left(\frac{1}{2}x^2 + b\right)dx
= -\frac{1}{4}x^2 - b\log x + a$$
for integration constants $a$ and $b$.
Note the freedom to include $a - b\log x$ in $\hat y_0$, but we cannot tolerate any component in $\log x$, as it behaves badly at $x = 0$, so $b = 0$, and $a$ has to be chosen zero in order to ensure $y_n(0) = 1$. (This argument applies at all iterations.) Hence $y_1 = 1 - x^2/4$.
(b) In the second iteration $R_1 = -x^4/4$. Thus, setting the integration constants to zero as before,
$$\hat y_1 = -\int \frac{1}{x}\int \frac{-x^4/4}{x}\,dx\,dx
= -\int \frac{1}{x}\left(-\frac{x^4}{16}\right)dx
= \frac{x^4}{64}\,.$$
Hence $y_2 = 1 - x^2/4 + x^4/64$.
For a computer algebra program, proceed as in earlier examples but modify the two integrations as in

y:=1;
let x^10=>0;
repeat begin
r:=x^2*df(y,x,x)+x*df(y,x)+x^2*y;
r:=-int(int(r/x,x)/x,x);
write y:=y+r;
end until r=0;

Execute this code and see the solution is
$$y = J_0(x) = 1 - \tfrac{1}{4}x^2 + \tfrac{1}{64}x^4 - \tfrac{1}{2304}x^6 + \tfrac{1}{147456}x^8 + O(x^{10})\,.$$
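If you wish to cross-check the reduce computation independently, the same iteration can be sketched in Python with exact rational arithmetic. This is an illustrative translation, not part of the unit's reduce code: a series is stored as a {power: coefficient} dictionary of Fractions, and trunc plays the role of let x^10=>0.

```python
from fractions import Fraction as F

N = 10  # like "let x^10=>0": discard terms of order x^10 and higher

def trunc(p):
    return {k: c for k, c in p.items() if k < N and c != 0}

def deriv(p):  # d/dx of a series {power: coefficient}
    return {k - 1: k * c for k, c in p.items() if k >= 1}

def shift(p, m):  # multiply the series by x^m (m may be negative)
    return {k + m: c for k, c in p.items()}

def integ(p):  # an antiderivative, with zero constant of integration
    return {k + 1: c / F(k + 1) for k, c in p.items()}

def add(*ps):
    out = {}
    for p in ps:
        for k, c in p.items():
            out[k] = out.get(k, F(0)) + c
    return trunc(out)

y = {0: F(1)}        # initial approximation y0 = 1
for _ in range(20):  # bounded, as the warning above advises
    # residual r = x^2 y'' + x y' + x^2 y
    r = add(shift(deriv(deriv(y)), 2), shift(deriv(y), 1), shift(y, 2))
    if not r:
        break
    # correction = -int( (1/x) int( r/x dx ) dx )
    hat = integ(shift(integ(shift(r, -1)), -1))
    y = add(y, {k: -c for k, c in hat.items()})

print(sorted(y.items()))
```

The loop reproduces the coefficients $1, -\tfrac14, \tfrac1{64}, -\tfrac1{2304}, \tfrac1{147456}$ of $J_0(x)$ above, and stops once the truncated residual vanishes.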
Example 6.12: Bessel functions of order 0. Find the power series expansion about $x = 0$, to errors $O(x^{10})$, of the general solution to Bessel's equation with $\nu = 0$, namely $x^2y'' + xy' + x^2y = 0$.

Solution: The indicial equation shows that in general the dominant component in the solution is $a + b\log x$ for any $a$ and $b$. (See that these were also naturally obtained in the integration constants of the previous example.) Use this as the first approximation $y_0$ and see what ensues. The derivation of the equation for the iterative changes, Eqn (6.6), remains the same.

Including the command factor b,a,log; to improve the appearance of the printing and setting the initial approximation to $y_0 = a + b\log x$, the reduce code is as before, namely
factor b,a,log;
y:=a+b*log(x);
let x^10=>0;
repeat begin
r:=x^2*df(y,x,x)+x*df(y,x)+x^2*y;
r:=-int(int(r/x,x)/x,x);
write y:=y+r;
end until r=0;
Run this code to see the result

y := a*(1 - 1/4*x^2 + 1/64*x^4 - 1/2304*x^6 + 1/147456*x^8)
   + b*(1/4*x^2 - 3/128*x^4 + 11/13824*x^6 - 25/1769472*x^8)
   + log(x)*b*(1 - 1/4*x^2 + 1/64*x^4 - 1/2304*x^6 + 1/147456*x^8)
That is, as Kreyszig assures us for double roots [K, p213], the general solution is of the form $y = ay_1(x) + by_2(x)$ where here
$$y_1 = 1 - \tfrac{1}{4}x^2 + \tfrac{1}{64}x^4 - \tfrac{1}{2304}x^6 + \tfrac{1}{147456}x^8 + O(x^{10})\,,$$
$$y_2 = y_1(x)\log x + \tfrac{1}{4}x^2 - \tfrac{3}{128}x^4 + \tfrac{11}{13824}x^6 - \tfrac{25}{1769472}x^8 + O(x^{10})\,.$$
This framework of using residuals to improve approximate solutions, getting computers to do the tedious algebra, can be adapted to a wide variety of problems. The iteration will improve an approximation provided the changes deduced from the residuals are appropriate, because a simple and sensible approximation to the equation for the changes has been derived. But the ultimate result depends only upon being able to evaluate the residuals correctly and being able to drive them to zero to some level of accuracy.
Activity 6.L Do problems 6.18–6.23 in the Exercises set 6.3.4, p239. Send<br />
in to the examiner for feedback at least 6.18 & 6.20.<br />
6.3.4 Exercises<br />
Ex. 6.13: Modify the iteration <strong>of</strong> Example 6.6 to find the Maclaurin series<br />
solution to the ode y ′′ − 2y = 0 such that y(0) = 1 and y ′ (0) = 0 using<br />
reduce and to errors O(x 10 ).<br />
Ex. 6.14: Similarly use reduce to find the Maclaurin series solution to<br />
errors O(x 15 ) to the ode y ′′ + xy = 0 such that y(0) = a and y ′ (0) = b<br />
(remember to factor b,a;). The Maclaurin series multiplied by a and<br />
b are those <strong>of</strong> two linearly independent solutions to Airy’s equation<br />
mentioned in Kreyszig [K,p198,p958–60].
Ex. 6.15: Use reduce to find the Maclaurin series <strong>of</strong> the solution to y ′ =<br />
cos(x)y such that y(0) = 1 to errors O(x 10 ). Hint: replace cos x<br />
in the code by its Maclaurin series, you may use that factorial(n)<br />
in reduce computes n!. Compare your answer to that <strong>of</strong> the exact<br />
analytic solution obtained by recognising the ode is separable.<br />
Ex. 6.16: Modify the analysis <strong>of</strong> Example 6.8 to use reduce to find the<br />
Maclaurin series solution to errors O(x 10 ) <strong>of</strong> the nonlinear ode y ′′ = 6y 2<br />
such that y(0) = 1 and y ′ (0) = b w<strong>here</strong> b is some arbitrary constant.<br />
Note: because this is a nonlinear ode the solution depends nonlinearly<br />
upon b, in contrast to linear ode’s which would show a linear<br />
dependence only.<br />
Ex. 6.17: Use reduce to find the Maclaurin series solution <strong>of</strong> the nonlinear<br />
ode y ′′ = (1+x)y 3 to errors O(x 10 ) such that y(0) = 2 and y ′ (0) = −3.<br />
Ex. 6.18: Modify the reduce computer algebra <strong>of</strong> Example 6.9 to find the<br />
Maclaurin series <strong>of</strong> the general solution to Legendre’s equation in the<br />
specific case k = 3 to an error O(x 10 ).<br />
Ex. 6.19: Modify the arguments and the reduce computer algebra <strong>of</strong> Example<br />
6.9 to find the Maclaurin series, to an error O(x 10 ), <strong>of</strong> the general<br />
solution to the following three odes:<br />
(a) (x − 2)y ′ = xy ;<br />
(b) (1 − x 2 )y ′ = 2xy ;
(c) y ′′ − 4xy ′ + (4x 2 − 2)y = 0 .<br />
Ex. 6.20: Modify the computer algebra code for Example 6.11 to find the<br />
Maclaurin series, to errors O(x 10 ), <strong>of</strong> the well-behaved solution <strong>of</strong> the<br />
nonlinear ode x 2 y ′′ + x 2 y ′ + xy 3 = 0 such that y(0) = 2.<br />
Ex. 6.21: Use reduce to help you find the power series about x = 0, to<br />
errors O(x 10 ), <strong>of</strong> the well-behaved solutions <strong>of</strong> the ode x 2 y ′′ + x 3 y ′ +<br />
(x 2 − 2)y = 0. Hint: x 2 y ′′ − 2y = (x 4 (y/x 2 ) ′ ) ′ . Then modify your<br />
reduce code to find the power series <strong>of</strong> the one parameter family <strong>of</strong><br />
well-behaved solutions to the nonlinear ode x 2 y ′′ +x 3 y ′ +(x 2 −2)y+y 2 =<br />
0.<br />
Ex. 6.22: Use reduce to help find the power series about x = 0, to errors<br />
O(x 20 ), <strong>of</strong> the well-behaved solutions <strong>of</strong> the ode xy ′′ + 3y ′ + 3x 2 y = 0.<br />
Hint: xy ′′ + 3y = (x 3 y ′ ) ′ /x 2 .<br />
Ex. 6.23: Find the power series expansions about x = 0, to errors O(x 10 ),<br />
for the two parameter general solution to the linear ode x 2 y ′′ −sin(x)y ′ +<br />
y = 0, with the aid <strong>of</strong> computer algebra. Hint: expand sin x in a<br />
Maclaurin series and write x 2 y ′′ − xy ′ in the form x 2−p (x p y ′ ) ′ .<br />
Ex. 6.24: Following is some reduce code to iteratively find a power series<br />
solution to an ode: what is the differential equation it purports to<br />
solve? and its initial conditions? what is the value <strong>of</strong> y after the first<br />
iteration <strong>of</strong> the repeat loop? what is the order <strong>of</strong> error in the computed<br />
power series after the loop terminates?
on div; <strong>of</strong>f allfac; on revpri;<br />
y:=2*x;<br />
let x^20=>0;<br />
repeat begin<br />
r:=(1-x^3)*df(y,x,x)-(y^2-x^2)*df(y,x);<br />
r:=-int(int(r,x),x);<br />
write y:=y+r;<br />
end until r=0;<br />
6.3.5 Summary <strong>of</strong> some reduce commands<br />
“the different branches <strong>of</strong> Arithmetic—Ambition, Distraction, Uglification<br />
and Derision.” the Mock Turtle in Alice in Wonderland<br />
by Lewis Carroll<br />
• reduce instructions must be terminated and separated by a semicolon.<br />
• quit; or bye; terminates reduce execution.<br />
• Use on div;, <strong>of</strong>f allfac; and on revpri; to improve the printing<br />
<strong>of</strong> power series.<br />
• := is the assignment operator.
• The normal arithmetic operators are: +, -, *, / and ^ for addition,<br />
subtraction, multiplication, division and exponentiation respectively.<br />
• write will display the result <strong>of</strong> an expression, although reduce automatically<br />
displays the results <strong>of</strong> each command that is not in a loop.<br />
• int(y,x) will provide an integral <strong>of</strong> the expression in y with respect<br />
to the variable x, provided reduce can actually do the integral.<br />
• df(y,x) returns the derivative of the expression in y with respect to the variable x; df(y,x,z) returns the mixed second derivative of y with respect to x and z, so df(y,x,x), or equivalently df(y,x,2), gives the second derivative with respect to x.
• factorial(n) returns the value <strong>of</strong> n!.<br />
• for n:=2:5 do, for example, will repeat whatever statement follows<br />
for values <strong>of</strong> the variable used, <strong>here</strong> n, over the range specified in the<br />
command, <strong>here</strong> from 2 to 5.<br />
• The let statement does pattern matching and replacement; for example<br />
let x^15=>0; tells reduce to subsequently discard any term<br />
involving x to the power fifteen or more.<br />
• repeat...until... will repeatedly execute a statement until the given<br />
condition is true.<br />
• begin...end is used to group statements into one; end; is also used<br />
to terminate reading in a file <strong>of</strong> reduce commands.
• in "..."; tells reduce to execute the commands contained in the<br />
specified file.
6.4 The orthogonal solutions to second order differential<br />
equations<br />
Power series give us very powerful methods of deriving solutions to specific differential equations. But in order to guide us we need to know more about the structure of solutions to ode's. Sturm-Liouville theory tells us how different solutions of an ode relate to each other, namely that they are orthogonal, and something about their nature. This then allows us to usefully write functions in terms of families of solutions to an ode.
In this section we identify patterns that occur across a wide range <strong>of</strong> ode’s.<br />
This is mathematics at a higher level—it brings together into the framework<br />
<strong>of</strong> Sturm-Liouville theory a variety <strong>of</strong> ode’s and their solutions. The<br />
task <strong>here</strong> is not the solution <strong>of</strong> actual problems, but the appreciation <strong>of</strong> the<br />
synthesis <strong>of</strong> wide ranging phenomena in the solutions <strong>of</strong> ode’s.<br />
Main aims:<br />
• see that Legendre and Bessel equations are examples <strong>of</strong> Sturm-Liouville<br />
equations;<br />
• show that important properties such as reality <strong>of</strong> eigenvalues and orthogonality<br />
<strong>of</strong> eigenfunctions can be deduced from the differential equation.
The simplest examples of functions displaying the properties that we investigate are the trigonometric functions and their harmonics, $\sin nx$ and $\cos nx$ for integer $n$. The properties are derived from their differential equation $y'' + n^2y = 0$.
The family <strong>of</strong> ode’s we consider are those in the form <strong>of</strong> the Sturm-Liouville<br />
equation<br />
[r(x)y ′ ] ′ + [q(x) + λp(x)]y = 0 , (6.7)<br />
w<strong>here</strong> p, q and r are given functions and λ is a constant which is <strong>of</strong>ten a<br />
parameter to the problem. Many second-order ode’s are put into this form.<br />
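For example, Legendre's equation $(1 - x^2)y'' - 2xy' + n(n+1)y = 0$ is already of the form (6.7) with $r(x) = 1 - x^2$, $q(x) = 0$, $p(x) = 1$ and $\lambda = n(n+1)$, because the product rule gives $[(1-x^2)y']' = (1-x^2)y'' - 2xy'$. A quick illustrative check of that identity in Python, applied to one arbitrarily chosen polynomial:

```python
from fractions import Fraction as F

def deriv(p):
    return {k - 1: k * c for k, c in p.items() if k >= 1}

def mul(p, q):
    out = {}
    for i, a in p.items():
        for j, b in q.items():
            out[i + j] = out.get(i + j, 0) + a * b
    return {k: c for k, c in out.items() if c != 0}

def add(p, q):
    out = dict(p)
    for k, c in q.items():
        out[k] = out.get(k, 0) + c
    return {k: c for k, c in out.items() if c != 0}

r = {0: 1, 2: -1}                              # r(x) = 1 - x^2
y = {0: F(3), 1: F(-2), 3: F(5), 7: F(1, 2)}   # an arbitrary test polynomial

lhs = deriv(mul(r, deriv(y)))                  # [(1 - x^2) y']'
rhs = add(mul(r, deriv(deriv(y))),             # (1 - x^2) y''
          mul({1: -2}, deriv(y)))              # - 2x y'
print(lhs == rhs)
```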
Reading 6.M Study Kreyszig §4.7 [K,pp233–8], including the pro<strong>of</strong> <strong>of</strong> Reality<br />
<strong>of</strong> eigenvalues in Appendix 4 [K,pA70].<br />
Recall that orthogonality is just a grand word for being at right angles. These<br />
properties <strong>of</strong> the orthogonality <strong>of</strong> eigenfunctions and the reality <strong>of</strong> eigenvalues<br />
λ are reminiscent <strong>of</strong> the similar properties for eigenvectors and eigenvalues<br />
<strong>of</strong> symmetric matrices. This is no accident and the connection is explored<br />
further in Module 7.<br />
Orthogonality implies oscillations: consider how a family of functions $y_n(x)$ can all be orthogonal to each other. First $y_0(x)$ can be fairly boring, such as the constant $P_0(x)$ or $\cos(0\cdot x)$. Secondly, $y_1(x)$ has to change sign somewhere, as seen in $P_1(x)$ or $\cos(x)$, so that the integral $\int_a^b y_0(x)y_1(x)\,dx$ can be zero by orthogonality. Thirdly, $y_2(x)$ has to be orthogonal to both $y_0(x)$ and $y_1(x)$ so it must oscillate a couple of times, as seen in $P_2(x)$. And so on: as we consider further $y_n(x)$ we find that successive $y_n(x)$ must have more and more oscillations in order to maintain orthogonality. This is seen for example in the families $P_n(x)$ and $\cos(nx)$. It holds very widely: solutions of Sturm-Liouville problems have more oscillations the higher the value of the corresponding eigenvalue (this can be proved, but we will not do so here).
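You can check this orthogonality concretely for the first few Legendre polynomials. The sketch below (illustrative Python, exact rational arithmetic) builds the matrix of inner products $\int_{-1}^{1} P_m(x)P_n(x)\,dx$ for $m, n = 0, 1, 2$; every off-diagonal entry vanishes:

```python
from fractions import Fraction as F

# the first three Legendre polynomials as {power: coefficient}
P = [{0: F(1)},                    # P0(x) = 1
     {1: F(1)},                    # P1(x) = x
     {0: F(-1, 2), 2: F(3, 2)}]    # P2(x) = -1/2 + 3/2 x^2

def mul(p, q):
    out = {}
    for i, a in p.items():
        for j, b in q.items():
            out[i + j] = out.get(i + j, F(0)) + a * b
    return out

def inner(p, q):  # integral over [-1, 1] of p(x) q(x)
    total = F(0)
    for k, c in mul(p, q).items():
        total += c * (F(1) - F(-1) ** (k + 1)) / (k + 1)
    return total

gram = [[inner(P[m], P[n]) for n in range(3)] for m in range(3)]
print(gram)
```

The diagonal entries $2, \tfrac23, \tfrac25$ illustrate the known normalisation $\int_{-1}^1 P_n^2\,dx = 2/(2n+1)$.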
Activity 6.N Do problems from Problem Set 4.7 [K,pp238–9]. Send in to<br />
the examiner for feedback at least Q4, 7 & 15.<br />
6.4.1 Answers to selected Exercises<br />
6.5 (a) Expect a power series solution because the coefficient functions are all well behaved at $x = 0$ and the leading coefficient of $y''$ is not zero.
(b) $y = 1 + \tfrac{3}{2}x^4 + O(x^4)$
6.13 $y = 1 + x^2 + \tfrac{1}{6}x^4 + \tfrac{1}{90}x^6 + \tfrac{1}{2520}x^8 + O(x^{10})$
6.14 $y = b(x - \tfrac{1}{12}x^4 + \tfrac{1}{504}x^7 - \tfrac{1}{45360}x^{10} + \tfrac{1}{7076160}x^{13}) + a(1 - \tfrac{1}{6}x^3 + \tfrac{1}{180}x^6 - \tfrac{1}{12960}x^9 + \tfrac{1}{1710720}x^{12}) + O(x^{15})$
6.15 $y = 1 + x + \tfrac{1}{2}x^2 - \tfrac{1}{8}x^4 - \tfrac{1}{15}x^5 - \tfrac{1}{240}x^6 + \tfrac{1}{90}x^7 + \tfrac{31}{5760}x^8 + \tfrac{1}{5670}x^9 + O(x^{10})$

6.16 $y = 1 + 3x^2 + 3x^4 + 3x^6 + \tfrac{18}{7}x^8 + b(x + 2x^3 + 3x^5 + \tfrac{24}{7}x^7 + \tfrac{25}{7}x^9) + b^2(\tfrac{1}{2}x^4 + x^6 + \tfrac{45}{28}x^8) + b^3(\tfrac{1}{7}x^7 + \tfrac{5}{14}x^9) + O(x^{10})$
6.17 $y = 2 - 3x + 4x^2 - \tfrac{14}{3}x^3 + \tfrac{11}{2}x^4 - \tfrac{25}{4}x^5 + \tfrac{211}{30}x^6 - \tfrac{47}{6}x^7 + \tfrac{2081}{240}x^8 - \tfrac{41243}{4320}x^9 + O(x^{10})$

6.18 $y = a(1 - \tfrac{3}{2}x^2 - \tfrac{3}{8}x^4 - \tfrac{17}{80}x^6 - \tfrac{663}{4480}x^8) + b(x - \tfrac{1}{6}x^3 - \tfrac{3}{40}x^5 - \tfrac{27}{560}x^7 - \tfrac{159}{4480}x^9) + O(x^{10})$

6.20 $y = 2 - 4x + 4x^2 - \tfrac{32}{9}x^3 + \tfrac{26}{9}x^4 - \tfrac{56}{25}x^5 + \tfrac{3404}{2025}x^6 - \tfrac{832}{675}x^7 + \tfrac{1199}{1350}x^8 - \tfrac{4142}{6561}x^9 + O(x^{10})$
6.21 Well behaved solutions are proportional to $y = x^2 - \tfrac{3}{10}x^4 + \tfrac{3}{56}x^6 - \tfrac{1}{144}x^8 + O(x^{10})$. The nonlinear solutions parametrised by $a$, the coefficient of the quadratic term, are $y = a(x^2 - \tfrac{3}{10}x^4 + \tfrac{3}{56}x^6 - \tfrac{1}{144}x^8) + a^2(-\tfrac{1}{10}x^4 + \tfrac{11}{280}x^6 - \tfrac{661}{75600}x^8) + a^3(\tfrac{1}{140}x^6 - \tfrac{11}{3150}x^8) - \tfrac{17}{37800}a^4x^8 + O(x^{10})$.
6.22 Well behaved solutions are proportional to $y = 1 - \tfrac{1}{5}x^3 + \tfrac{1}{80}x^6 - \tfrac{1}{2640}x^9 + \tfrac{1}{147840}x^{12} - \tfrac{1}{12566400}x^{15} + \tfrac{1}{1507968000}x^{18} + O(x^{20})$

6.23 $y = (a + b\log x)(x - \tfrac{1}{24}x^3 + \tfrac{7}{3840}x^5 - \tfrac{89}{1161216}x^7 + \tfrac{6721}{2229534720}x^9) + b(\tfrac{1}{23040}x^5 + \tfrac{11}{11612160}x^7 - \tfrac{5951}{44590694400}x^9) + O(x^{10})$.
6.24 $(1 - x^3)y'' - (y^2 - x^2)y' = 0$, such that $y(0) = 0$ and $y'(0) = 2$. $y^{(1)} = 2x + \tfrac{1}{2}x^4$. The ultimate error is $O(x^{20})$.
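As a check on this answer, the first pass of that repeat loop is easy to reproduce exactly (an illustrative Python sketch, with truncation at $x^{20}$ playing the role of the let statement):

```python
from fractions import Fraction as F

N = 20  # like "let x^20=>0"

def trunc(p):
    return {k: c for k, c in p.items() if k < N and c != 0}

def deriv(p):
    return {k - 1: k * c for k, c in p.items() if k >= 1}

def integ(p):
    return {k + 1: c / F(k + 1) for k, c in p.items()}

def mul(p, q):
    out = {}
    for i, a in p.items():
        for j, b in q.items():
            out[i + j] = out.get(i + j, F(0)) + a * b
    return trunc(out)

def add(p, q, sign=1):
    out = dict(p)
    for k, c in q.items():
        out[k] = out.get(k, F(0)) + sign * c
    return trunc(out)

y = {1: F(2)}                                                  # y := 2*x
r = add(mul(add({0: F(1)}, {3: F(1)}, -1), deriv(deriv(y))),   # (1-x^3)*y''
        mul(add(mul(y, y), {2: F(1)}, -1), deriv(y)), -1)      # - (y^2-x^2)*y'
r = {k: -c for k, c in integ(integ(r)).items()}                # r := -int(int(r,x),x)
y = add(y, r)
print(sorted(y.items()))   # y after the first iteration
```

The residual of $y = 2x$ is $-6x^2$, and two integrations and a sign change give the correction $\tfrac12 x^4$, confirming $y^{(1)} = 2x + \tfrac12 x^4$.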
6.5 Summary<br />
• Power series give a powerful general method for solving linear and<br />
nonlinear ordinary differential equations (ode’s). At a regular point<br />
(§§6.2.1) solutions <strong>of</strong> an ode are developed in the form <strong>of</strong> Taylor or<br />
Maclaurin series (§§6.1.1):<br />
∞∑<br />
y(x) = a m (x − c) m = a 0 + a 1 (x − c) + a 2 (x − c) 2 + · · · .<br />
m=0<br />
Because <strong>of</strong> the uniqueness <strong>of</strong> a power series representation, the constants<br />
a m are determined equating coefficients <strong>of</strong> like powers <strong>of</strong> x − c<br />
(§§6.1.1).<br />
• Legendre polynomials, P n (x), are an example <strong>of</strong> special functions:<br />
– are the only non-singular solutions <strong>of</strong> Legendre’s equation which<br />
is, in Sturm-Liouville form, [(1 − x 2 )y ′ ] ′ + n(n + 1)y = 0 (§§6.1.2);<br />
– and are orthogonal over the interval [−1, 1] with weight function<br />
p(x) = 1 (§§6.4).<br />
• At a singular point (but not “too” singular §§6.2.1) Frobenius asserts<br />
solutions may be developed in the modified power series:<br />
∞<br />
y(x) = (x − c) r ∑<br />
a m (x − c) m<br />
m=0<br />
= a 0 (x − c) r + a 1 (x − c) r+1 + a 2 (x − c) r+2 + · · · .
Module 6. Series solutions <strong>of</strong> differential equations give special functions 250<br />
The exponent r is determined from the indicial equation obtained from<br />
the term <strong>of</strong> lowest order after substituting into the ode.<br />
• In applying Frobenius method (§§6.2.1) to second-order ode’s t<strong>here</strong> are<br />
generally two roots r 1 ≥ r 2 to the indicial equation and consequently<br />
three cases are distinguished (taking c = 0 for simplicity):<br />
– distinct roots not differing by an integer are straightforward—a<br />
basis for the solutions are<br />
(<br />
y 1 (x) = x r 1 a0 + a 1 x + a 2 x 2 + · · ·)<br />
,<br />
(<br />
y 2 (x) = x r 2 b0 + b 1 x + b 2 x 2 + · · ·)<br />
;<br />
– a double root, r 1 = r 2 , when a basis is<br />
(<br />
y 1 (x) = x r 1 a0 + a 1 x + a 2 x 2 + · · ·)<br />
,<br />
(<br />
y 2 (x) = y 1 (x) log x + x r 1 b1 x + b 2 x 2 + · · ·)<br />
;<br />
– roots differing by an integer when a basis is<br />
(<br />
y 1 (x) = x r 1 a0 + a 1 x + a 2 x 2 + · · ·)<br />
,<br />
(<br />
y 2 (x) = ky 1 (x) log x + x r 2 b0 + b 1 x + b 2 x 2 + · · ·)<br />
;<br />
• Bessel functions, J ν (x) and Y ν (x), are special functions and<br />
– are solutions <strong>of</strong> Bessel’s equation (§§6.2.2) x 2 y ′′ +xy ′ +(x 2 −ν 2 )y =<br />
0 or, in Sturm-Liouville form, [xy ′ ] ′ + ( )<br />
x − ν2<br />
x y = 0;
Module 6. Series solutions <strong>of</strong> differential equations give special functions 251<br />
– are orthogonal over intervals with x > 0 in several senses (§§6.4)<br />
• The iterative construction <strong>of</strong> power series solutions is an ideal application<br />
<strong>of</strong> computer algebra (§§6.3.2) for linear and nonlinear problems<br />
provided we discard unwanted high-order terms.<br />
• A good iterative method is (§§6.3.3): given an approximate solution<br />
y(x), to seek small changes ŷ(x) so that y(x) + ŷ(x) is a better approximation.<br />
Such changes are determined from the residual <strong>of</strong> the<br />
governing equations.<br />
• Many ode’s <strong>of</strong> importance may be written in the form <strong>of</strong> the Sturm-<br />
Liouville equation (6.7), [r(x)y ′ ] ′ + [q(x) + λp(x)]y = 0:<br />
– non-zero solutions only exist for particular values <strong>of</strong> λ = λ n , called<br />
the eigenvalues, which are necessarily real;<br />
– the corresponding eigenfunctions are all orthogonal with weight<br />
function p(x) (§§6.4).<br />
Activity 6.O Do problems from Chapter 4 Review [K,pp247–8].
Module 7<br />
Linear transforms and their<br />
eigenvectors on inner product<br />
spaces<br />
Recall the work on differential equations and their orthogonal solutions that<br />
we finished in Module 6. Many <strong>of</strong> the properties we touched upon t<strong>here</strong> are<br />
very similar to some that you have met in linear algebra before. The time<br />
has come to bring these two strands together.<br />
But solutions of differential equations involve the infinite flexibility of functions. We will see that functions act very much like vectors. But on any finite interval there is not just an infinite number of functions, there is an infinite variety of functions. For example, in §7.4.3 we use the infinite number of solutions to Sturm-Liouville problems to describe any other solution function. But "infinity" is a slippery concept, so we are now very careful about how to establish the mathematical basis. First we create a basic structure for space, then the properties of mappings between spaces, and lastly the representation of these mappings by simple matrices of coefficients.
Ultimately the development <strong>of</strong> a common setting allows us to draw simple<br />
vector pictures even when discussing concepts in extremely complicated situations<br />
such as the space <strong>of</strong> all continuous functions.<br />
Sturm-Liouville theory introduced in §6.4 is very close to properties <strong>of</strong> eigenvalues<br />
and eigenvectors <strong>of</strong> matrices. In this module we bring both within<br />
a unified view using the abstract theory <strong>of</strong> inner product spaces. We then<br />
extend the combined view a little further.<br />
Module contents<br />
7.1 Inner product spaces . . . . . . . . . . . . . . . . . . . 255<br />
7.1.1 Vector spaces form the universe . . . . . . . . . . . . . . 255<br />
7.1.2 Inner products give distances and angles . . . . . . . . . 262<br />
7.1.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 266<br />
7.2 The nature <strong>of</strong> linear transformations . . . . . . . . . . 269<br />
7.2.1 The universe <strong>of</strong> linear transformations . . . . . . . . . . 269
7.2.2 Adjoint operators . . . . . . . . . . . . . . . . . . . . . . 273<br />
7.2.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 282<br />
7.3 Revision <strong>of</strong> eigenvalues and eigenvectors . . . . . . . 284<br />
7.3.1 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 287<br />
7.4 Diagonalisation transformation . . . . . . . . . . . . . 289<br />
7.4.1 Adjoint eigenvectors diagonalise operators . . . . . . . . 290<br />
7.4.2 Orthogonal eigenvectors <strong>of</strong> self-adjoint operators . . . . 300<br />
7.4.3 Expansions in orthogonal eigenfunctions . . . . . . . . . 303<br />
7.4.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 308<br />
7.4.5 Answers to selected Exercises . . . . . . . . . . . . . . . 310<br />
7.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . 312
7.1 Inner product spaces<br />
Here we establish the basic abstract structure of the spaces in which the analysis of linear algebra and of differential and integral equations takes place. The abstract concepts are supported by examples met in your earlier mathematics. The approach is to build up the structures and properties that are needed from an axiomatic base.
Main aims:
• to develop vector spaces and their properties from their basic axioms and to understand how functions and $\mathbb{R}^n$ are unified in this framework;
• to see how the definition of inner products leads to a unified view of the useful notions of length, angles and orthogonality;
• to show how familiar relations and inequalities generalise to many situations.
7.1.1 Vector spaces form the universe<br />
The first step is to define the fundamental axioms <strong>of</strong> vector spaces. 1<br />
1 An entertaining and accurate introduction to vector spaces is available at<br />
http://ciips.ee.uwa.edu.au/~gregg/Linalg/node86.html
Reading 7.A Study the first three pages <strong>of</strong> §6.8 in Kreyszig [K,pp358–60].<br />
Note the basic properties <strong>of</strong> vector addition and scalar multiplication on the<br />
vector space: closed, commutativity, associativity, distributivity, and the existence<br />
<strong>of</strong> the zero vector 0, the negative <strong>of</strong> a vector, and the multiplicative<br />
identity 1. As for ordinary vectors t<strong>here</strong> exist the concepts <strong>of</strong> linear combination,<br />
linear independence, dimensionality both finite and infinite, and a<br />
basis.<br />
In our study we will stay within the realm <strong>of</strong> real vector spaces.<br />
Example 7.1: quadratic polynomials Show the set V <strong>of</strong> all quadratic<br />
polynomials (including those with zero coefficients) form a vector space<br />
under the usual operations <strong>of</strong> addition and scalar multiplication, write<br />
down a basis for the vector space and deduce it is <strong>of</strong> dimension 3.<br />
Solution: Denote by, for example, a the quadratic polynomial a 0 +<br />
a 1 x + a 2 x 2 .<br />
• Then “vector” (polynomial) addition a + b = (a 0 + b 0 ) + (a 1 +<br />
b 1 )x + (a 2 + b 2 )x 2 clearly gives another quadratic polynomial in V<br />
and is thus closed under addition.
• By definition and commutativity <strong>of</strong> ordinary addition:<br />
a + b = (a 0 + b 0 ) + (a 1 + b 1 )x + (a 2 + b 2 )x 2<br />
= (b 0 + a 0 ) + (b 1 + a 1 )x + (b 2 + a 2 )x 2<br />
= b + a .<br />
• Similarly for associativity:<br />
(u + v) + w<br />
= [(u 0 + v 0 ) + w 0 ] + [(u 1 + v 1 ) + w 1 ]x + [(u 2 + v 2 ) + w 2 ]x 2<br />
= [u 0 + (v 0 + w 0 )] + [u 1 + (v 1 + w 1 )]x + [u 2 + (v 2 + w 2 )]x 2<br />
= u + (v + w) .<br />
• Clearly the zero vector, 0, is the zero polynomial 0 + 0x + 0x 2 as<br />
a + 0 = (a 0 + 0) + (a 1 + 0)x + (a 2 + 0)x 2 = a.<br />
• Now scalar multiplication defined as ca = (ca 0 ) + (ca 1 )x + (ca 2 )x 2<br />
clearly gives another quadratic and so V is closed under scalar<br />
multiplication.<br />
• By definition and distributivity <strong>of</strong> ordinary multiplication:<br />
c(a + b) = c(a 0 + b 0 ) + c(a 1 + b 1 )x + c(a 2 + b 2 )x 2<br />
= (ca 0 + cb 0 ) + (ca 1 + cb 1 )x + (ca 2 + cb 2 )x 2<br />
= (ca 0 ) + (ca 1 )x + (ca 2 )x 2 + (cb 0 ) + (cb 1 )x + (cb 2 )x 2<br />
= (ca) + (cb) .
• Similarly for<br />
(c + d)a<br />
= (c + d)a 0 + (c + d)a 1 x + (c + d)a 2 x 2<br />
= (ca 0 + da 0 ) + (ca 1 + da 1 )x + (ca 2 + da 2 )x 2<br />
= (ca 0 ) + (ca 1 )x + (ca 2 )x 2 + (da 0 ) + (da 1 )x + (da 2 )x 2<br />
= (ca) + (da) .<br />
• Again by definition and associativity <strong>of</strong> ordinary multiplication<br />
c(da) = c [ (da 0 ) + (da 1 )x + (da 2 )x 2]<br />
= c [ d(a 0 + a 1 x + a 2 x 2 ) ]<br />
= (cd)a .<br />
• Lastly, the number 1 clearly serves as the identity for scalar multiplication.<br />
Thus this system forms a vector space. A basis for the vector space could be simply the powers of $x$ in $\{1, x, x^2\}$, which in fact we used to show the vector space properties. Note that $1$, $x$ and $x^2$ are linearly independent quadratics because one cannot find a linear combination of them that is the zero quadratic, that is, zero for all $x$. Another basis for the vector space could be the first three Legendre polynomials: $P_0(x) = 1$, $P_1(x) = x$ and $P_2(x) = -\tfrac{1}{2} + \tfrac{3}{2}x^2$. Since the number of basis vectors is necessarily three, then so is the dimensionality.
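To see concretely that the two bases span the same space, note that $P_2(x) = -\tfrac12 + \tfrac32 x^2$ gives $x^2 = \tfrac13 P_0 + \tfrac23 P_2$, so $a_0 + a_1x + a_2x^2 = (a_0 + \tfrac13 a_2)P_0 + a_1P_1 + \tfrac23 a_2P_2$. A quick illustrative check in Python with exact rationals:

```python
from fractions import Fraction as F

def add_scaled(p, q, s):  # p + s*q on {power: coefficient} dictionaries
    out = dict(p)
    for k, c in q.items():
        out[k] = out.get(k, F(0)) + s * c
    return {k: c for k, c in out.items() if c != 0}

P0, P1, P2 = {0: F(1)}, {1: F(1)}, {0: F(-1, 2), 2: F(3, 2)}

a0, a1, a2 = F(5), F(-7), F(4)      # the arbitrary quadratic 5 - 7x + 4x^2
quadratic = {0: a0, 1: a1, 2: a2}

# its coefficients in the Legendre basis
c0, c1, c2 = a0 + a2 / 3, a1, 2 * a2 / 3
rebuilt = add_scaled(add_scaled(add_scaled({}, P0, c0), P1, c1), P2, c2)
print(rebuilt == quadratic)
```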
Example 7.2: Show that sets with set union (∪) as the addition operator cannot form a vector space.
Solution: Denote the “vectors”, namely the subsets of some universal set U, by capital letters, A and B.
(a) Clearly A + B = A ∪ B is a set in U, so “vector addition” is closed.
(b) Also clearly, A + B = A ∪ B = B ∪ A = B + A, so “vector addition” satisfies commutativity.
(c) Set union is associative, so (A + B) + C = (A ∪ B) ∪ C = A ∪ (B ∪ C) = A + (B + C) ensures the associativity of this “vector addition.”
(d) Now A + 0 = A ∪ 0 = A can only hold for all sets A if the zero vector is 0 = ∅, the empty set.
(e) But then there is no negative for every set A, as clearly there is generally no set B (which we would like to denote by −A) such that A + B = A ∪ B = ∅.
Because of the failure of this property, we cannot form a vector space.
Definition 7.1 A square integrable function on the interval [a, b] is a function, say f(x), for which ∫_a^b [f(x)]² dx is finite. The set of all square integrable functions on [a, b] is denoted L₂[a, b].
Example 7.3: Argue that L₂[a, b] is a vector space under the usual addition and scalar multiplication of functions.
Solution: Denote the “vectors” by lower case letters such as f, g and h for the functions f(x), g(x) and h(x) respectively. Consider each property in turn.
• Define f + g to be the function with the value f(x) + g(x) for all x ∈ [a, b]. But is it necessarily in L₂[a, b], namely square integrable? Note the following inequality for any numbers a and b (remember this inequality; it comes from the parallelogram equality):
(a + b)² = 2a² + 2b² − (a − b)² ≤ 2a² + 2b² .
Apply this pointwise to functions f and g:
∫_a^b (f + g)² dx ≤ ∫_a^b 2f² + 2g² dx = 2∫_a^b f² dx + 2∫_a^b g² dx ,
and since the right-hand side is a finite upper bound for the non-negative integral on the left, f + g must be in L₂[a, b] and addition is closed.
• Also commutativity, f + g = g + f, follows from pointwise commutativity, f(x) + g(x) = g(x) + f(x).
• Similarly for associativity.
• Clearly the “zero vector” is the zero function, as f(x) + 0 = f(x) for all x.
• The “negative” of f is simply its pointwise negative −f(x); −f is clearly in L₂[a, b] if f is.
• L₂[a, b] is closed under scalar multiplication as ∫_a^b [cf(x)]² dx = c² ∫_a^b f² dx , which is finite for all finite c and square integrable f.
• As above, distributivity and associativity of scalar multiplication follow immediately from pointwise properties.
• Lastly, the identity for scalar multiplication is the function that is 1 for all x ∈ [a, b], as 1·f(x) = f(x).
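The closure inequality used above can also be illustrated numerically. This Python sketch (the midpoint-rule helper and choice of functions are ours, for illustration only) takes f(x) = x^(−1/4), which is square integrable on [0, 1] despite its singularity at 0, and g(x) = 1:

```python
# Numerically illustrate the closure inequality
#   integral of (f+g)^2 <= 2*integral of f^2 + 2*integral of g^2
# for f(x) = x^(-1/4) and g(x) = 1 on [0, 1].
def integrate(h, a, b, n=200000):
    """Midpoint-rule approximation of the integral of h over [a, b]."""
    dx = (b - a) / n
    return sum(h(a + (i + 0.5) * dx) for i in range(n)) * dx

f = lambda x: x**-0.25
g = lambda x: 1.0

lhs = integrate(lambda x: (f(x) + g(x))**2, 0.0, 1.0)
rhs = 2 * integrate(lambda x: f(x)**2, 0.0, 1.0) \
    + 2 * integrate(lambda x: g(x)**2, 0.0, 1.0)
print(lhs, rhs)  # lhs is about 17/3 = 5.67, safely below rhs of about 6
```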
Often we only want to consider subsets of a vector space. For example, when solving a differential equation with boundary conditions we only need to consider those “vectors” in the vector space of functions which satisfy the boundary conditions. The notion of a subspace of a vector space is very useful from time to time. The proof of the following theorem follows directly from the properties of a vector space.
Theorem 7.2 A subset U of a vector space V is a vector space itself if it is closed under vector addition and scalar multiplication. Such a subset is then called a vector subspace.
Example 7.4: The set of vectors lying on any one line through the origin in the plane forms a vector subspace. Clearly, given a fixed line U through the origin: any two vectors lying in the line U add to another vector in U; any scalar multiple of a vector in the line U is also a vector in U. Thus such a line U is closed, is a subset of the vector space of the plane, and therefore is a vector subspace.
Activity 7.B Do Problems 1–12 in Problem Set 6.8 [K, p364], and 7.6–7.10 from Exercises 7.1.3. Send in to the examiner for feedback at least Q1, Q7 and Ex. 7.6.
7.1.2 Inner products give distances and angles
One of our fundamental needs is the notion of distance and angles. For example, only then can we determine the errors in an approximation. A generalisation of the vector dot product to an inner product serves this purpose in any vector space.
Reading 7.C Study the brief subsection on Inner Product Spaces in Kreyszig §6.8 [K, pp361–2].
Note, Kreyszig uses round brackets (parentheses) to denote a general inner product, (u, v), whereas I prefer the angle brackets, 〈u, v〉, as they are less likely to be mistaken for a vector with two components, and will use them throughout this study guide. Inner products occur so extensively in mathematics that one often uses many different types of brackets for different inner products on different vector spaces.
Definition 7.3 An inner product on a real vector space V is a real function 〈u, v〉 for each u and v in V such that the following properties hold:
1. linearity, 〈au + bv, w〉 = a〈u, w〉 + b〈v, w〉 for all real a and b, and all vectors u, v and w in V ;
2. symmetry, 〈u, v〉 = 〈v, u〉 for all vectors u and v in V ;
3. positivity, 〈v, v〉 ≥ 0 for all v, with equality holding only if v = 0.
A vector space with an inner product is called an inner product space.
Example 7.5: For functions f and g in L₂[a, b] determine whether 〈f, g〉 = ∫_a^b fg dx forms an inner product.
Solution: Since fg = ¼[(f + g)² − (f − g)²], thus
∫_a^b fg dx = ¼[ ∫_a^b (f + g)² dx − ∫_a^b (f − g)² dx ] ,
which is always a finite real number as f ± g are in L₂[a, b].
linearity For all c, d and functions f, g and h in L₂[a, b]:
〈cf + dg, h〉 = ∫_a^b (cf + dg)h dx = c∫_a^b fh dx + d∫_a^b gh dx = c〈f, h〉 + d〈g, h〉 .
symmetry Clearly 〈f, g〉 = ∫_a^b fg dx = ∫_a^b gf dx = 〈g, f〉.
positivity Also clearly 〈f, f〉 = ∫_a^b f² dx ≥ 0, as the integrand f² ≥ 0. However, 〈f, f〉 can be 0 without f being precisely zero. For example, consider f(x) = 0 everywhere on [a, b] except for a finite number of points at which it takes some non-zero value; then 〈f, f〉 = ∫_a^b f² dx = 0 but f is not zero. Strictly speaking this 〈 , 〉 is not an inner product on L₂[a, b]. (Infinite dimensional function spaces are tricky.)
However, we can patch the definitions. Refine the definition of square integrable functions so that a “vector” f in L₂[a, b] is the set of all functions which are the same except at some number of isolated points.
Then all the necessary properties of an inner product space follow, including that 〈f, f〉 = 0 only if f is the zero “vector” (the set of functions such that ∫_a^b f² dx = 0).
With an inner product defined, the definition of distance between two vectors follows immediately.
Definition 7.4 For vectors u and v in an inner product space, the length or norm of u is ‖u‖ = √〈u, u〉. (Thus the distance between u and v is ‖u − v‖.) A vector of norm 1 is called a unit vector.
Note especially the consequent Schwarz inequality, also known as the Cauchy-Schwarz inequality, the triangle inequality, and the parallelogram equality. These relations are familiar in 2- and 3-dimensional geometry, and now we know they also hold even for very esoteric vector spaces. It means that schematic diagrams we draw on paper are still relevant to infinite dimensional inner product spaces.
Inner products not only provide the notion of distance, they are also intimately tied up with the notion of angles and hence orthogonality. This underpins the orthogonality we discussed (§6.4) in the infinite number of eigenfunctions of Sturm-Liouville problems.
Definition 7.5 The angle θ between two vectors u and v in an inner product space is determined from
〈u, v〉 = ‖u‖·‖v‖·cos θ , that is, θ = arccos( 〈u, v〉 / (‖u‖·‖v‖) ) . (7.1)
Consequently, two vectors are orthogonal if their inner product 〈u, v〉 = 0 .
Observe how the Cauchy-Schwarz inequality ensures that there is always a well defined angle between any two non-zero vectors of an inner product space. This leads to being able to characterise vectors for which 〈u, v〉 = 0 as being orthogonal, that is at right-angles, and leads to a generalised Pythagoras theorem.
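To make Definition 7.5 concrete, here is a Python sketch (the quadrature helper is our own) computing the angle between the functions f(x) = 1 and g(x) = x under the inner product 〈f, g〉 = ∫_0^1 fg dx:

```python
import math

# Angle between the "vectors" f(x) = 1 and g(x) = x in L2[0, 1],
# using a midpoint-rule approximation of the inner product.
def integrate(h, a, b, n=100000):
    dx = (b - a) / n
    return sum(h(a + (i + 0.5) * dx) for i in range(n)) * dx

def inner(f, g):
    return integrate(lambda x: f(x) * g(x), 0.0, 1.0)

f = lambda x: 1.0
g = lambda x: x

norm_f = math.sqrt(inner(f, f))            # equals 1
norm_g = math.sqrt(inner(g, g))            # equals 1/sqrt(3)
cos_theta = inner(f, g) / (norm_f * norm_g)
theta = math.degrees(math.acos(cos_theta))
print(theta)  # 30 degrees
```

So the constant function and the ramp are 30° apart in this inner product space, a perfectly ordinary angle between two rather abstract “vectors”.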
Activity 7.D Do problems from Exercises 7.11–7.13. Send in to the examiner for feedback at least Ex. 7.13.
7.1.3 Exercises
Ex. 7.6: Show that sets of objects with set intersection, ∩, as the addition operator cannot form a vector space.
Ex. 7.7: Argue that the set of infinite sequences, denoted by ℝ^∞ and composed of elements of the form a = (a₁, a₂, a₃, …), forms a vector space.
Ex. 7.8: Determine whether the following functions are in L₂ on the given interval:
(a) sin x on [0, π];
(b) cos x on (−∞, ∞);
(c) e^(−x) on [0, ∞);
(d) x^(−1/4) on [0, 1];
(e) 1/√x on [0, 4];
(f) x^(−3/4) on [1, ∞).
Ex. 7.9: Let L₂ʷ[a, b] denote the set of functions for which the weighted integral ∫_a^b w(x)[f(x)]² dx is finite for some positive weight function w(x). Argue that L₂ʷ[a, b] is a vector space under the usual addition and scalar multiplication of functions.
Ex. 7.10: Argue that the space C^n[a, b] of all functions with n continuous derivatives on [a, b] forms a vector space under addition and scalar multiplication of functions. (Often the space C⁰[a, b] of continuous functions on [a, b] is written as just C[a, b].)
Ex. 7.11: By considering ‖u + v‖² and using the Cauchy-Schwarz inequality, prove the triangle inequality ‖u + v‖ ≤ ‖u‖ + ‖v‖ for all vectors in an inner product space.
Ex. 7.12: Argue that the subset U of ℝ^∞ for which Σ_{i=1}^∞ aᵢ² is finite forms a subspace with the inner product 〈a, b〉 = Σ_{i=1}^∞ aᵢbᵢ.
Ex. 7.13: Let u = x and v = x², and take the inner product 〈f, g〉 = ∫_0^1 fg dx. What are the norms of u and v? What is the angle between u and v?
7.2 The nature of linear transformations
We need to start considering functions defined on vector spaces. The simplest examples are functions of many variables. But we will have to move to dealing with functions of an infinite number of variables and even functions of functions! In fact you are already intimately familiar with the examples of differentiation and integration: (d/dx) sin x = cos x, (d/dx) e^(x²) = 2x e^(x²) and (d/dx)(2√x) = 1/√x, so differentiation takes a function as an argument, such as sin x, and returns a function as a result, such as cos x. Here we investigate the simplest functions of a vector space, the linear transformations. They are the “straight lines” of vector spaces that in later units will form a basis for understanding quite general transformations.
Main aims:
• show that familiar operations on functions are examples of linear transformations;
• see that the adjoint is the general analogue of the transpose.
7.2.1 The universe of linear transformations
Reading 7.E Study the last part, Linear Transformations, of Kreyszig §6.8 [K, pp362–4].
Definition 7.6 If F : V → W is a function from the vector space V into the vector space W , then F is called a linear transformation if
1. F(u + v) = F(u) + F(v) for all vectors u and v in V , and
2. F(cu) = cF(u) for all vectors u in V and scalars c.
A linear transformation is also called a linear operator.
Example 7.14: Show that the differential operator L = d²/dx² + x d/dx is a linear transformation from C²[a, b] into C⁰[a, b].
Solution: Since L involves at most the second derivative, the range and domain are clearly appropriate.
(a) Observe
L(f + g) = d²/dx²(f + g) + x d/dx(f + g)
= d²f/dx² + d²g/dx² + x df/dx + x dg/dx
= d²f/dx² + x df/dx + d²g/dx² + x dg/dx
= Lf + Lg ,
(b) and
L(cf) = d²/dx²(cf) + x d/dx(cf)
= c d²f/dx² + cx df/dx
= cLf ,
which are the requisite properties for any functions f and g and scalar c.
Example 7.15: Argue that the integral L(f) = ∫_a^b K(x, y)f(y) dy is a linear operator from L₂[a, b] into itself, that is, from and to square integrable functions, provided that K is bounded, |K(x, y)| ≤ k, for x and y in the interval [a, b].
For example, if a = 0, b = 1 and K(x, y) = x − y (the bound is k = 1) then L(x²) = x/3 − 1/4 and L(sin πx) = (2x − 1)/π.
Solution: As with many infinite dimensional vector space problems, the overwhelming difficulty lies in confirming the range of the function L. Thus we first dispense with the straightforward part of showing that it is linear:
(a) L(f + g) = ∫_a^b K(x, y)(f(y) + g(y)) dy
= ∫_a^b K(x, y)f(y) + K(x, y)g(y) dy
= ∫_a^b K(x, y)f(y) dy + ∫_a^b K(x, y)g(y) dy
= L(f) + L(g) ;
(b) L(cf) = ∫_a^b K(x, y)cf(y) dy
= c∫_a^b K(x, y)f(y) dy
= cL(f) .
For f ∈ L₂[a, b] we know ∫_a^b f²(x) dx is finite. We now need to prove that g(x) = ∫_a^b K(x, y)f(y) dy is also in L₂[a, b]. To help, we define the inner product for any f and g, 〈f, g〉 = ∫_a^b fg dx, and use the Cauchy-Schwarz inequality that 〈f, g〉² ≤ ‖f‖²‖g‖². Consider
∫_a^b g² dx = ∫_a^b L(f)L(f) dx
= ∫_a^b [∫_a^b K(x, y)f(y) dy][∫_a^b K(x, z)f(z) dz] dx , as any variable may be used for y in L(f),
= ∫_a^b ∫_a^b [∫_a^b K(x, y)K(x, z) dx] f(y)f(z) dy dz , rearranging,
≤ ∫_a^b ∫_a^b [∫_a^b |K(x, y)|·|K(x, z)| dx] |f(y)|·|f(z)| dy dz , where the inner x-integral is at most (b − a)k²,
≤ k²(b − a) ∫_a^b |f(y)| dy ∫_a^b |f(z)| dz
= k²(b − a) 〈1, |f|〉² , by definition of the inner product,
≤ k²(b − a) ‖f‖²‖1‖² , by Cauchy-Schwarz,
= k²(b − a)² ∫_a^b f²(x) dx , by definition of the norms.
This bound on the integral is finite and thus g = L(f) is necessarily square integrable, that is, in L₂[a, b].
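The two values quoted for the kernel K(x, y) = x − y can be confirmed numerically. This Python sketch (midpoint quadrature helper is our own) checks L(x²) = x/3 − 1/4 and L(sin πx) = (2x − 1)/π at a few sample points:

```python
import math

# Check the worked values of Example 7.15 for K(x, y) = x - y on [0, 1].
def integrate(h, a, b, n=20000):
    dx = (b - a) / n
    return sum(h(a + (i + 0.5) * dx) for i in range(n)) * dx

def L(func, x):
    """Apply the integral operator with kernel K(x, y) = x - y."""
    return integrate(lambda y: (x - y) * func(y), 0.0, 1.0)

for x in (0.0, 0.3, 0.7, 1.0):
    assert abs(L(lambda y: y**2, x) - (x / 3 - 0.25)) < 1e-6
    assert abs(L(lambda y: math.sin(math.pi * y), x)
               - (2 * x - 1) / math.pi) < 1e-6
print("both formulas confirmed")
```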
Activity 7.F Do Exercise 7.23 herein.
7.2.2 Adjoint operators
Recall that the transpose of a matrix often crops up in solving matrix problems. For example, the least-squares solution of an overdetermined system Ax = b is found by solving AᵀAx = Aᵀb. Also, the eigenvalues of a symmetric matrix, one for which Aᵀ = A, are always real. For general linear transforms we define the equivalent notion of an adjoint operator.
Definition 7.7 The adjoint of a linear operator L mapping a subspace V into a subspace U of an inner product space W is the operator L† such that 〈u, Lv〉 = 〈L†u, v〉 for all vectors u ∈ U and v ∈ V .
If L† = L and U = V then L is called self-adjoint.
Example 7.16: A† = Aᵀ. The adjoint of a matrix is its transpose using the usual inner product 〈u, v〉 = uᵀv. For all u and v, consider:
〈u, Av〉 = uᵀAv , by the inner product definition,
= (Aᵀu)ᵀv , by transpose properties,
= 〈Aᵀu, v〉 , by the inner product definition,
and hence the adjoint is Aᵀ. Clearly a symmetric matrix is self-adjoint.
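A concrete numerical check of the identity 〈u, Av〉 = 〈Aᵀu, v〉, with a 2 × 2 matrix and vectors chosen arbitrarily by us, might look like:

```python
# Verify <u, Av> = <A^T u, v> for a concrete 2x2 matrix, illustrating
# that the transpose is the adjoint in the usual inner product.
A = [[1.0, 2.0],
     [3.0, 4.0]]
At = [[A[0][0], A[1][0]],     # transpose of A
      [A[0][1], A[1][1]]]

def matvec(M, x):
    return [M[0][0] * x[0] + M[0][1] * x[1],
            M[1][0] * x[0] + M[1][1] * x[1]]

def dot(x, y):
    return x[0] * y[0] + x[1] * y[1]

u, v = [5.0, -1.0], [2.0, 7.0]
print(dot(u, matvec(A, v)), dot(matvec(At, u), v))  # prints 46.0 46.0
```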
Theorem 7.8 Some straightforward properties of the adjoint follow. For any linear operators L and M:
1. (L†)† = L;
2. (L + M)† = L† + M†;
3. (LM)† = M†L†;
4. the adjoint depends upon the inner product: if the inner product is changed then so does the adjoint.
The proofs of these properties are left as Exercise 7.26.
Example 7.17: The shear transformation of the plane in the horizontal x-direction with parameter k is T(x, y) = (x + ky, y). This has matrix
A = [ 1  k ]
    [ 0  1 ]
so that T(x, y) = A[x, y]ᵀ. Thus from Example 7.16 its adjoint must have matrix
Aᵀ = [ 1  0 ]
     [ k  1 ]
so that T†(x, y) = Aᵀ[x, y]ᵀ = [x, kx + y]ᵀ. Thus T† is the shear transformation in the vertical y-direction with parameter k.
But what is T† if we used a weighted inner product? Say the inner product on the plane was defined as 〈(u, v), (x, y)〉 = 2xu + yv, so that we weight the horizontal direction more than the vertical direction. Then for all u, v, x and y
〈(u, v), T(x, y)〉 = 〈(u, v), (x + ky, y)〉
= 2u(x + ky) + vy
= 2ux + (2ku + v)y
= 〈(u, 2ku + v), (x, y)〉
and so T†(x, y) = (x, 2kx + y), which is again a shear in the vertical but now with parameter 2k. The adjoint depends upon the choice of inner product.
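A small Python check of this weighted adjoint (the value of k and the sample points are arbitrary choices of ours):

```python
# Check <(u,v), T(x,y)> = <T†(u,v), (x,y)> in the weighted inner
# product <(u,v),(x,y)> = 2xu + yv, for the shear T(x,y) = (x+ky, y)
# and its claimed adjoint T†(x,y) = (x, 2kx+y).
k = 0.7

def T(x, y):
    return (x + k * y, y)

def T_adj(x, y):
    return (x, 2 * k * x + y)

def inner(p, q):
    (u, v), (x, y) = p, q
    return 2 * x * u + y * v

for (u, v, x, y) in [(1, 0, 0, 1), (2, -3, 1.5, 4), (0.3, 0.4, -2, 5)]:
    assert abs(inner((u, v), T(x, y)) - inner(T_adj(u, v), (x, y))) < 1e-12
print("adjoint identity holds at all sample points")
```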
Example 7.18: Find the adjoint of the linear operator Lf = ∫_a^b K(x, y)f(y) dy.
Solution: Using the inner product 〈f, g〉 = ∫_a^b f(x)g(x) dx we have
〈f, Lg〉 = ∫_a^b f(x) ∫_a^b K(x, y)g(y) dy dx , by the definitions,
= ∫_a^b [∫_a^b f(x)K(x, y) dx] g(y) dy , swapping the order of integration,
= ∫_a^b [∫_a^b K(y, x)f(y) dy] g(x) dx , swapping the roles of x and y,
= 〈 ∫_a^b K(y, x)f(y) dy , g 〉 ,
and so the adjoint is L†f = ∫_a^b K(y, x)f(y) dy, analogous to the matrix transpose. This is not the same as L, because the arguments of K are interchanged, unless K(y, x) = K(x, y) for all x and y, in which case K is called symmetric and then the operator L is self-adjoint.
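This adjoint formula can be checked numerically on a grid. In the Python sketch below (kernel, test functions and grid size are all our own choices) both L and L† are discretised by the midpoint rule, and 〈f, Lg〉 = 〈L†f, g〉 is confirmed for an asymmetric kernel:

```python
import math

# Confirm <f, Lg> = <L†f, g> for the kernel operator with the
# (asymmetric) kernel K(x, y) = x * exp(-y) on [0, 1], where
# L†f(x) = integral of K(y, x) f(y) dy.
n = 300
h = 1.0 / n
xs = [(i + 0.5) * h for i in range(n)]      # midpoint grid on [0, 1]

K = lambda x, y: x * math.exp(-y)
f = lambda x: x
g = lambda x: x * x

def L(func, x):        # (L func)(x)  = sum over K(x, y) func(y)
    return sum(K(x, y) * func(y) for y in xs) * h

def L_adj(func, x):    # (L† func)(x) = sum over K(y, x) func(y)
    return sum(K(y, x) * func(y) for y in xs) * h

lhs = sum(f(x) * L(g, x) for x in xs) * h       # <f, Lg>
rhs = sum(L_adj(f, x) * g(x) for x in xs) * h   # <L†f, g>
print(lhs, rhs)  # agree to rounding error
```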
Example 7.19: the adjoint of d/dx is almost −d/dx! Consider this differentiation operator over C¹[a, b] with the usual inner product 〈f, g〉 = ∫_a^b f(x)g(x) dx. Now
〈f, dg/dx〉 = ∫_a^b f (dg/dx) dx
= [fg]_a^b − ∫_a^b (df/dx) g dx , by integration by parts,
= f(b)g(b) − f(a)g(a) + 〈−df/dx, g〉 .
The inner product appearing here indeed suggests that (d/dx)† = −d/dx, but the exact identity required by the definition of the adjoint actually does not hold unless we also have f(b)g(b) − f(a)g(a) = 0. (This is a usual difficulty in function spaces.)
• One way to ensure this is to restrict the set of functions on which the adjoint is defined to the subspace U of C¹[a, b] of functions that are zero at a and b; hence f(a) = f(b) = 0, then f(b)g(b) − f(a)g(a) = 0 and hence (d/dx)† = −d/dx on U.
• Another way, more aesthetically pleasing, is to restrict d/dx to the subspace V of functions zero at a and restrict (d/dx)† to the subspace U (redefined) of functions zero at b (or vice-versa); then f(b)g(b) − f(a)g(a) = 0, as g(a) = 0 and f(b) = 0, and hence (d/dx)† = −d/dx.
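The near-adjoint relation is easy to see numerically once the boundary term vanishes. In this Python sketch (our own choice of functions and quadrature) we take f = x(1 − x)² and g = sin πx on [0, 1], both zero at the endpoints, and confirm 〈f, g′〉 = 〈−f′, g〉:

```python
import math

# With f and g vanishing at both endpoints of [0, 1], the boundary
# term [f g] is zero and <f, dg/dx> = <-df/dx, g>.
def integrate(h, a, b, n=100000):
    dx = (b - a) / n
    return sum(h(a + (i + 0.5) * dx) for i in range(n)) * dx

f  = lambda x: x * (1 - x)**2
df = lambda x: (1 - x)**2 - 2 * x * (1 - x)
g  = lambda x: math.sin(math.pi * x)
dg = lambda x: math.pi * math.cos(math.pi * x)

lhs = integrate(lambda x: f(x) * dg(x), 0.0, 1.0)    # <f, g'>
rhs = integrate(lambda x: -df(x) * g(x), 0.0, 1.0)   # <-f', g>
print(lhs, rhs)  # equal, since the boundary term vanishes
```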
The previous example begins to show that initial and/or boundary conditions are an integral part of operators and their adjoints.
Example 7.20: L = d²/dx² is self-adjoint on the subspace V of C²[a, b] of functions that are zero at x = a and b, using the usual inner product.
Solution: Just consider, using dashes for derivatives,
〈f, Lg〉 = ∫_a^b fg″ dx
= [fg′]_a^b − ∫_a^b f′g′ dx , integrating by parts once,
= [fg′ − f′g]_a^b + ∫_a^b f″g dx , integrating by parts again,
= [fg′ − f′g]_a^b + 〈Lf, g〉
= [fg′]_a^b + 〈Lf, g〉 , as g(a) = g(b) = 0 for g ∈ V ,
= 〈Lf, g〉
for f ∈ V , as then f(a) = f(b) = 0. Thus L† = L on V and is self-adjoint.
Example 7.21: Find the adjoint of L = d²/dx² + x d/dx on the subspace V = {g ∈ C²[a, b] | g′(a) = g(b) = 0}, using the usual inner product.
Solution: Consider
〈f, Lg〉 = ∫_a^b f(g″ + xg′) dx
= ∫_a^b fg″ + xfg′ dx
= [fg′ + xfg]_a^b − ∫_a^b f′g′ + (xf)′g dx , integrating by parts,
= f(b)g′(b) − af(a)g(a) − ∫_a^b f′g′ + (xf′ + f)g dx , as g′(a) = g(b) = 0,
= f(b)g′(b) − af(a)g(a) − [f′g]_a^b + ∫_a^b (f″ − xf′ − f)g dx , integrating f′g′ by parts,
= f(b)g′(b) − af(a)g(a) + f′(a)g(a) + 〈f″ − xf′ − f, g〉 , since g(b) = 0.
Thus L† = d²/dx² − x d/dx − 1 provided we restrict it to the subspace U of functions f such that f(b)g′(b) + [−af(a) + f′(a)]g(a) = 0. Now we cannot control g′(b) nor g(a) as these may vary over any values in V . Thus we require that f(b) = 0 and f′(a) = af(a): the adjoint is L† = d²/dx² − x d/dx − 1 over the subspace
U = {f ∈ C²[a, b] | f′(a) − af(a) = f(b) = 0} .
Example 7.22: Sturm-Liouville Show that Sturm-Liouville operators are self-adjoint in the usual inner product on suitable subsets of C²[a, b].
Solution: As seen in §6.4 the Sturm-Liouville equation is
Lg = [r(x)g′]′ + [q(x) + λp(x)]g = 0 ,
for some functions p, q and r. Let’s consider the subspace of functions satisfying the quite general boundary conditions kg(a) + lg′(a) = 0 and
mg(b) + ng′(b) = 0. Then
〈f, Lg〉 = ∫_a^b f{(rg′)′ + (q + λp)g} dx
= ∫_a^b f(rg′)′ + (q + λp)fg dx
= [rfg′ − rf′g]_a^b + ∫_a^b (f′r)′g + (q + λp)fg dx , after integrating f(rg′)′ by parts twice,
= [r(fg′ − f′g)]_a^b + 〈Lf, g〉 .
Thus for L to be self-adjoint we either need r(x) to be zero at the endpoints or, the case we consider here, fg′ − f′g = 0 at the two ends. If l ≠ 0, then, where all functions are evaluated at x = a, g′ = −kg/l and hence
fg′ − f′g = −kfg/l − f′g = −(kf + lf′)g/l = 0
if and only if kf + lf′ = 0, since g could have any value. Similarly for the other end point x = b. If it happens that l = 0 then even easier arguments apply. We have shown that the functions f for the adjoint satisfy the same boundary conditions as for L itself. Thus Sturm-Liouville operators are self-adjoint.
Observe in these examples how a differential operator together with suitable boundary conditions is very naturally complemented by the adjoint and its boundary conditions. If the operator and boundary conditions are used to form a well-posed differential equation, then so can the adjoint and its boundary conditions form a well-posed differential equation. Soon we will see that solutions to an ODE are usefully related to those of its adjoint. The relationship is particularly useful when the operator is self-adjoint.
7.2.3 Exercises
Ex. 7.23: Argue that the set W of linear transforms from a vector space U to a vector space V , L : U → V , is itself a vector space under operator addition, where L + M is the transformation that applied to any vector u ∈ U gives L(u) + M(u), and the operation of scalar multiplication, where cL is the transformation that applied to any vector gives c·L(u). (For example, the set of all linear transformations of the plane, possibly represented as all 2 × 2 matrices, itself forms a vector space.)
Ex. 7.24: Show that the adjoint of a matrix A under the weighted inner product 〈u, v〉 = uᵀBv, for some suitable weight matrix B, is A† = (BAB⁻¹)ᵀ.
Ex. 7.25: Describe the adjoint of the transformation of the plane, T , which is rotation by an angle θ.
Ex. 7.26: Use the definition of the adjoint to prove the first three properties listed in Theorem 7.8.
Ex. 7.27: Find the adjoint of Lf = ∫_a^b K(x, y)f(y) dy under the weighted inner product 〈f, g〉 = ∫_a^b w(x)f(x)g(x) dx.
Ex. 7.28: Find the adjoints of the following differential operators L and the subspaces they operate on:
(a) Lf = df/dx such that 2f(0) + 5f(1) = 0;
(b) Lf = f″ + 3f′ + 4f such that f(0) = 0 and f′(1) = 0;
(c) Lf = f‴ + f′ such that f(0) = 0, f′(0) = 2f″(1), f′(1) = 3f(1).
Use the inner product 〈f, g〉 = ∫_0^1 fg dx.
Ex. 7.29: Find the adjoints of the following differential operators L and the subspaces they operate on:
(a) Lf = df/dx such that f(0) = 3f(1);
(b) Lf = f″ + 2f′ + f such that f(0) = f(1) = 0;
(c) Lf = f‴ such that f(0) = 0, f′(0) = 2f″(1), f(1) = f′(1).
Use the inner product 〈f, g〉 = ∫_0^1 fg dx.
7.3 Revision of eigenvalues and eigenvectors
Reading 7.G Start by recalling familiar properties of eigenvalues and eigenvectors of matrices by revising the material in Kreyszig §7.1–2 [K, pp371–81].
The critical facets:
• a matrix A has a non-zero eigenvector v and eigenvalue λ if Av = λv, that is, if the action of A is simply to stretch or compress v (possibly reversing direction if λ < 0);
• eigenvalues are the solutions of the characteristic equation det(λI − A) = 0; the spectrum of A is the set of eigenvalues of A;
• for any given eigenvector v, any scalar multiple is also an eigenvector spanning the same subspace, and thus we seek linearly independent eigenvectors to avoid duplication;
• counted according to their multiplicity, there are precisely n eigenvalues (possibly complex) of an n × n matrix;
• if the n eigenvalues of an n × n matrix are distinct, then there are precisely n linearly independent eigenvectors; however, if one or more eigenvalues are repeated, then the matrix may have fewer than n linearly independent eigenvectors;
• in Matlab,
– poly(a) returns the coefficients of the characteristic polynomial det(λI − A),
– eig(a) returns a vector of eigenvalues,
– whereas [p,d]=eig(a) returns a diagonal matrix D of eigenvalues and a matrix P whose columns are eigenvectors, so that A = PDP⁻¹.
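If you also use Python, NumPy's numpy.linalg.eig plays the same role as Matlab's eig. This sketch assumes NumPy is available; the sample matrix is our own choice:

```python
import numpy as np

# NumPy analogue of the Matlab commands above: eigenvalues,
# eigenvector matrix P and diagonal D with A = P D P^{-1}.
A = np.array([[0.5, 1.0],
              [1/3, 0.5]])        # a sample 2x2 matrix

evals, P = np.linalg.eig(A)       # like [p,d] = eig(a) in Matlab
D = np.diag(evals)

print(np.sort(evals))             # here 0.5 - 1/sqrt(3) and 0.5 + 1/sqrt(3)
print(np.allclose(A, P @ D @ np.linalg.inv(P)))  # True
```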
Activity 7.H Ensure you can do the problems in Kreyszig Problem Sets 7.1 and 7.2 [K, pp375–6 & pp379–81], and Exercises 7.31 and 7.32. Send in to the examiner for feedback at least Ex. 7.31(a) & 7.32(a).
Many of the above properties hold on function spaces. However, the determinant is not defined, so other methods have to be used to find the eigenvalues and eigenvectors.
Example 7.30: Find the eigenvalues and eigenfunctions corresponding to non-zero eigenvalues of the linear transformation ∫_0^1 (x + y)f(y) dy over the space of continuously differentiable functions.
Solution: We seek non-trivial solutions to
∫_0^1 (x + y)f(y) dy = λf(x) .
• Expanding the integral on the left-hand side, observe

    x ∫₀¹ f(y) dy + ∫₀¹ y f(y) dy = λf(x) .

• As the left-hand side is a linear function of x, then so must be the
right-hand side, and hence so is f(x) (unless the left-hand side
is zero, which then implies λ = 0). Try f(x) = Ax + B in the
equation to deduce

    x(A/2 + B) + (A/3 + B/2) = λAx + λB .

• This has to hold for all x, and so the coefficients on the two sides
must be equal: A/2 + B = λA and A/3 + B/2 = λB, which in matrix
form is the matrix eigenproblem

    [1/2 1; 1/3 1/2] (A, B) = λ(A, B) .

• The characteristic equation for this 2 × 2 matrix is (λ − 1/2)² − 1/3 = 0,
with solutions the non-zero eigenvalues λ = 1/2 ± 1/√3.

• The corresponding eigenvectors of the 2 × 2 problem are clearly
proportional to (±√3, 1), which in terms of the functions in the
function space are simply those proportional to the eigenfunctions
f = ±√3 x + 1.
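A quick Matlab sketch confirms this reduction to the 2 × 2 eigenproblem:

```matlab
% Eigenvalues of the 2x2 matrix from Example 7.30.
M = [1/2 1; 1/3 1/2];
lambda = sort(eig(M));
% Compare with the analytic eigenvalues 1/2 -+ 1/sqrt(3).
err = norm(lambda - [1/2-1/sqrt(3); 1/2+1/sqrt(3)])   % near zero
% Eigenvector component ratios A/B are approximately -+sqrt(3), in some order.
[P,D] = eig(M);
ratios = P(1,:)./P(2,:)
```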
7.3.1 Exercises<br />
Ex. 7.31: Each of the four pictures plotted below shows the effect on vectors
in the plane of a different transformation of the plane obtained by
multiplying by a different 2 × 2 matrix. In each picture there are seven
different coloured dashed vectors terminated by open circles; call them
uᵢ. In each picture the vectors resulting from multiplying by a matrix A
are also plotted, say vᵢ = Auᵢ, and drawn as solid lines terminated by
“*”. Using that the action of a matrix is to just stretch its eigenvectors
by a factor λ (and reverse direction if λ < 0), draw on each picture your
best estimate of the two directions corresponding to the two different
families of eigenvectors of each 2 × 2 transformation. Label them with a
rough estimate of the corresponding eigenvalue.
[Four pictures, (a)–(d): each plots in the plane the seven dashed vectors uᵢ, terminated by open circles, and the corresponding solid vectors vᵢ = Auᵢ, terminated by “*”.]
Ex. 7.32: Find the only non-zero eigenvalues and the corresponding eigenfunctions
of the following linear transformations:

(a) Lf = ∫₀¹ 2(x − y)f(y) dy;
(b) Lf = ∫ₐᵇ exp(x − y)f(y) dy;
(c) Lf = ∫₀^π cos(x + y)f(y) dy.
7.4 Diagonalisation transformation

The orthogonal solutions of a Sturm-Liouville problem give a simple way to
solve inhomogeneous differential equations of the form [r(x)y′]′ + [q(x) +
µp(x)]y = f(x). The trick is to express the right-hand side f(x) as a sum
over the eigenfunctions of the differential operator. We now explore how
this is analogous to the diagonalisation of matrices and proceed to develop
further properties.
Main aims:

• show that eigenvectors (eigenfunctions) of the adjoint operator are used
to find expansions in the eigenvectors (eigenfunctions) of an operator;

• because the adjoint eigenvectors are orthogonal to the eigenvectors, see
that eigenvectors of a self-adjoint operator are orthogonal;

• an eigenfunction expansion solution to the Sturm-Liouville type problem
[r(x)y′]′ + [q(x) + µp(x)]y = f(x) is the linear combination of
eigenfunctions ∑ⱼ yⱼ(x) ⟨f/p, yⱼ⟩/(µ − λⱼ) for any specific value of the
parameter µ.
7.4.1 Adjoint eigenvectors diagonalise operators<br />
Example 7.33: Consider solving the simple linear equation

    Ax = [3 1; 1 3] x = (3, −1) = (1, 1) + 2(1, −1) .

The rearrangement on the very right of this equation is motivated because
I happen to know that (1, ±1) are eigenvectors of the matrix
A; they correspond to eigenvalues 4 and 2. Knowing this, we solve
the linear equation using the “method of undetermined coefficients”
by guessing a solution in the form of a linear combination of the two
eigenvectors:

    x = a(1, 1) + b(1, −1) .

Substituting this into the linear equation, and noting that A(1, 1) is
4(1, 1) and A(1, −1) is 2(1, −1),

    Ax = (3, −1)  becomes  4a(1, 1) + 2b(1, −1) = (1, 1) + 2(1, −1) .

Equating coefficients on both sides shows 4a = 1 and 2b = 2, that is
a = 1/4 and b = 1; hence the solution is

    x = (1/4)(1, 1) + (1, −1) = (5/4, −3/4) .
The process just given here is the same as that we use in the eigenfunction
expansion of solutions of odes in §7.4.3. Look at and compare with Example 7.37.
This process is intimately tied up with the diagonalisation of
matrices and linear operators because the solution of linear equations with a
diagonal operator is near trivial, as shown above.
Reading 7.I Study Kreyszig §7.5 [K,pp392–6] (overlook Theorem 4 and Example<br />
3).<br />
Using diagonalisation, A = PDP⁻¹, the solution of the system Ax = b is
written as x = PD⁻¹P⁻¹b, which is easy to compute as D⁻¹ is simply the diagonal
matrix of reciprocals of the eigenvalues. More explicitly we might write this as
x = Pa = ∑ⱼ vⱼaⱼ, where the “amplitudes” aⱼ of the eigenvectors vⱼ
in the solution are given by aⱼ = (wⱼ · b)/λⱼ, where wⱼ comes from the jth row of P⁻¹.
Implicitly, this general result is used in the introductory example.
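For instance, a sketch of this eigen-decomposition solve in Matlab, reusing the matrix of Example 7.33:

```matlab
% Solve Ax = b via x = P*inv(D)*inv(P)*b.
A = [3 1; 1 3];
b = [3; -1];
[P,D] = eig(A);
a = D \ (P \ b);      % amplitudes a_j = (w_j . b)/lambda_j
x = P*a
err = norm(x - A\b)   % near zero: agrees with a direct solve
```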
Activity 7.J Do problems 1–6 and 10–22 from Problem Set 7.5 [K,pp397–8].
Theorem 7.11, on deriving the eigenfunction expansion of solutions to inhomogeneous
Sturm-Liouville problems, seen in action in Example 7.37, is
equivalent to using a diagonalisation.
We proceed to determine how to “diagonalise” operators, not just matrices.
The general setting is of linear transforms A acting on some vector space
with an inner product ⟨·, ·⟩. The definition of eigenvalues and eigenvectors
proceeds as before. However, there is one new twist that is useful.
Definition 7.9 For a linear transformation A, the eigenvectors of A† are
called the left-eigenvectors or adjoint eigenvectors of A.

They are called left-eigenvectors because, in the case of a matrix A, the defining
equation Aᵀw = λw is, upon transposing, equivalent to wᵀA = λwᵀ,
in which wᵀ appears to the left of A. Three important properties are the
following.
identical spectrum The spectrum of the adjoint A†, the set of eigenvalues,
is the same as that of A. This is easily seen in finite dimensions because
the characteristic polynomials of a matrix and its transpose are the
same.
orthogonal eigenvectors Any left-eigenvector and ordinary eigenvector corresponding
to distinct eigenvalues are orthogonal.

To see this, suppose vᵢ is an eigenvector corresponding to eigenvalue
λᵢ and wⱼ is a left-eigenvector corresponding to eigenvalue λⱼ ≠ λᵢ. Then
consider

    λᵢ⟨wⱼ, vᵢ⟩ = ⟨wⱼ, λᵢvᵢ⟩
               = ⟨wⱼ, Avᵢ⟩     by definition of eigenvector
               = ⟨A†wⱼ, vᵢ⟩    by definition of adjoint
               = ⟨λⱼwⱼ, vᵢ⟩    by definition of left-eigenvector
               = λⱼ⟨wⱼ, vᵢ⟩ .
Since λᵢ ≠ λⱼ, the only way the extreme sides of this equation can be
equal is if the common inner product factor is zero. Thus wⱼ and vᵢ
are orthogonal.
eigen-expansion Thus, if we have a complete set of eigenvectors, and when
we normalise the left-eigenvectors so that the inner product with its
partner eigenvector is ⟨wⱼ, vⱼ⟩ = 1, then any vector may be decomposed
as the linear combination of the eigenvectors u = ∑ᵢ ⟨wᵢ, u⟩ vᵢ.
Since the eigenvectors are complete, there exists some linear combination
u = ∑ᵢ aᵢvᵢ. Take the inner product of this with wⱼ to determine

    ⟨wⱼ, u⟩ = ⟨wⱼ, ∑ᵢ aᵢvᵢ⟩
            = ∑ᵢ aᵢ⟨wⱼ, vᵢ⟩   by linearity of the inner product
            = ∑ᵢ aᵢδᵢⱼ        by the orthonormalisation of wⱼ
            = aⱼ              as all other terms are 0.

(A common convention is to use the Kronecker delta δᵢⱼ, which is 1 if i = j
and 0 otherwise.) Hence the “amplitude” of vᵢ in the linear combination is
aᵢ = ⟨wᵢ, u⟩, as claimed.
For a matrix A one may form the matrices of eigenvectors,

    P = [v₁ | v₂ | · · · | vₙ]  and  Q = [w₁ | w₂ | · · · | wₙ] ,
then, using the usual dot product as the inner product, observe that QᵀP
is the matrix of inner products ⟨wᵢ, vⱼ⟩, which is zero everywhere except on the
diagonal, where each wᵢ has been normalised so the diagonal is 1. Thus Qᵀ = P⁻¹;
that is, the rows of P⁻¹ are the normalised left-eigenvectors of A.
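This fact is easy to check numerically; a sketch using the matrix of Example 7.34 below as a test case:

```matlab
% The rows of inv(P) are the normalised left-eigenvectors of A.
A = [1 -1; -4 1];
[P,D] = eig(A);
Q = inv(P);
err = norm(Q*A - D*Q)    % each row satisfies Q(j,:)*A = lambda_j*Q(j,:)
chk = norm(Q*P - eye(2)) % and the normalisation <w_j, v_j> = 1 holds
```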
Example 7.34: Compute the eigenvalues, eigenvectors and left-eigenvectors
of the matrix appearing in the linear equation

    [1 −1; −4 1] u = (2, 2) ,

and hence solve the linear equation.
Solution:
Call the matrix A; then the characteristic equation is

    det(A − λI) = (λ − 1)² − 4 = 0 ,

which has as solutions the eigenvalues λ₁ = −1 and λ₂ = 3.

• Any eigenvector corresponding to λ₁ = −1 solves

    (A − λ₁I)v₁ = [2 −1; −4 2] v₁ = 0 ,

with all solutions proportional to v₁ = (1, 2).
• Any eigenvector corresponding to λ₂ = 3 solves

    (A − λ₂I)v₂ = [−2 −1; −4 −2] v₂ = 0 ,

with all solutions proportional to v₂ = (1, −2).

• The left-eigenvectors satisfy the transposed equations, so for λ₁ = −1:

    (Aᵀ − λ₁I)w₁ = [2 −4; −1 2] w₁ = 0 ,

with all solutions proportional to w₁ = (2, 1). Observe that w₁ · v₂ = 0,
as assured by theory, and that w₁ · v₁ = 1 upon scaling w₁
to w₁ = (1/2, 1/4).

• Similarly the left-eigenvector corresponding to λ₂ = 3 solves

    (Aᵀ − λ₂I)w₂ = [−2 −4; −1 −2] w₂ = 0 ,

with all solutions proportional to w₂ = (−2, 1). Observe that
w₂ · v₁ = 0, as assured by theory, and that w₂ · v₂ = 1 upon scaling
w₂ to w₂ = (1/2, −1/4).

Thus the inner products of the left-eigenvectors with the given right-hand
side are w₁ · (2, 2) = 3/2 and w₂ · (2, 2) = 1/2, so we know it as
this linear combination of the eigenvectors:

    (2, 2) = (3/2)(1, 2) + (1/2)(1, −2) .
Divide each term in the linear combination by the corresponding eigenvalue
to obtain the solution

    u = −(3/2)(1, 2) + (1/6)(1, −2) = (−4/3, −10/3) .
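As a check, Matlab's direct solve reproduces this answer (a sketch):

```matlab
% Verify the eigenvector-expansion solution of Example 7.34.
A = [1 -1; -4 1];
b = [2; 2];
u = A\b                        % direct solution
err = norm(u - [-4/3; -10/3])  % matches the expansion answer, near zero
```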
Example 7.35: Find the eigenvalues, eigenvectors (eigenfunctions) and
normalised adjoint eigenvectors for the linear operator Ly = −d²y/dx² with
boundary conditions y(0) = 0 and y′(π) = (1/2)y′(0).

Solution: Use the usual inner product on the domain of the differential
equation, namely ⟨f, g⟩ = ∫₀^π f(x)g(x) dx.
• First, solve the eigenproblem −y″ = λy such that y(0) = 0 and
y′(π) = (1/2)y′(0). The ode has constant coefficients so we expect
exponential or trigonometric solutions. Exponential solutions cannot
occur because if they did they would have to be of the form
y(x) = sinh(√(−λ) x) to satisfy y(0) = 0, but the derivative of this,
y′(x) ∝ cosh(√(−λ) x), is monotonic increasing with positive x and
so y′(π) cannot be (1/2)y′(0). Similarly λ = 0 cannot give rise to an
eigenfunction. In the last case, to satisfy y(0) = 0 trigonometric
solutions must be of the form y(x) = sin(√λ x). Then the other
boundary condition gives

    y′(π) = (1/2)y′(0) ⇔ √λ cos(√λ π) = (1/2)√λ
                       ⇔ cos(√λ π) = 1/2
                       ⇔ √λ = 1/3, 5/3, 7/3, 11/3, . . .
                       ⇔ √λⱼ = j − 1/2 + (−1)ʲ/6 ,  j = 1, 2, 3, . . .
                       ⇔ λⱼ = (j − 1/2 + (−1)ʲ/6)² ,  j = 1, 2, 3, . . .

The corresponding eigenfunctions are vⱼ(x) = sin[(j − 1/2 + (−1)ʲ/6)x],
plotted below.
[Plot of the eigenfunctions v₁, . . . , v₅ against x on 0 ≤ x ≤ π.]
• Second, derive and solve the adjoint. Consider

    ⟨w, Lv⟩ = ∫₀^π −wv″ dx
            = [−wv′ + w′v]₀^π + ∫₀^π −w″v dx
            = w′(π)v(π) + [w(0) − (1/2)w(π)]v′(0) + ⟨−w″, v⟩ ,

using v(0) = 0 and v′(π) = (1/2)v′(0), and therefore the adjoint is
L†w = −d²w/dx² with boundary conditions w′(π) = 0 and w(0) = (1/2)w(π).
Observe that although the differential part of the adjoint is the same,
the operator L is not self-adjoint because the boundary conditions for
the adjoint are different to those for L.

(The plot of the eigenfunctions vⱼ above was generated in Matlab by

    x=linspace(0,pi);
    [j,x]=meshgrid(1:5,x);
    k=(j-.5+(-1).^j/6);
    v=sin(k.*x);
    plot(x,v)

)
By a similar argument to the above, the solutions to the adjoint
eigenproblem L†w = λw must be trigonometric. To satisfy w′(π) = 0
the solutions must be of the form w = cos[√λ(π − x)]. Then
the other boundary condition gives

    w(0) = (1/2)w(π) ⇔ cos(√λ π) = 1/2
                     ⇔ √λ = 1/3, 5/3, 7/3, 11/3, . . .

as for L. The spectrum for L and its adjoint must be the same.
The left-eigenfunctions are then found to be wⱼ(x) ∝ cos[(j − 1/2 +
(−1)ʲ/6)(π − x)]. To normalise, observe

    ⟨cos[(j − 1/2 + (−1)ʲ/6)(π − x)], vⱼ(x)⟩ = (π/2) sin[(j − 1/2 + (−1)ʲ/6)π] = (π√3/4)(−1)ʲ⁻¹ .

Thus choose

    wⱼ(x) = (4(−1)ʲ⁻¹/(π√3)) cos[(j − 1/2 + (−1)ʲ/6)(π − x)] ,

plotted below. A little algebra also confirms that the wⱼ(x) are orthogonal
to the eigenfunctions vᵢ(x) for i ≠ j.
[Plot of the normalised adjoint eigenfunctions w₁, . . . , w₅ against x on 0 ≤ x ≤ π, generated, following on from the earlier Matlab code, by

    w=cos(k.*(pi-x)) ...
        .*(-1).^(j-1)*4/(pi*sqrt(3));
    plot(x,w)

]
Activity 7.K Do Exercises 7.39–7.40 in §7.4.4. Send in to the examiner for<br />
feedback at least Ex. 7.39(a) & 7.40.<br />
7.4.2 Orthogonal eigenvectors of self-adjoint operators

One immediate consequence of the work in the previous subsection concerns
a self-adjoint transform. Clearly if a transformation is self-adjoint then the
left-eigenfunctions, or left-eigenvectors of a symmetric matrix, are identical
to the ordinary eigenvectors. This is because they satisfy precisely the
same equations. This and other rather special properties hold for self-adjoint
transformations (symmetric matrices).
Reading 7.L Study the properties of orthogonal and symmetric matrices
in Kreyszig §7.3 [K,pp381–4].
The theorems apply not just to symmetric matrices but also to self-adjoint
operators. As an example, consider the proof of the reality of the eigenvalues.
Theorem 7.10 The eigenvalues of a self-adjoint (real) linear transformation
are all real.
Proof: Let L be a self-adjoint linear transformation on some inner product
space with eigenvalue λ and corresponding eigenvector v: thus Lv = λv.
(This proof is better when set in a complex vector space, but here we
compromise by allowing complex eigenvalues and eigenvectors without
actually giving a proper setting for them.)

• Take the complex conjugate of this equation to deduce L̄v̄ = λ̄v̄, where
the overbar denotes complex conjugation. Since L is real, L̄ = L.
Thus λ̄ and v̄ must be an eigenvalue and eigenvector of L also.
• Now consider ⟨v̄, Lv⟩. On the one hand,

    ⟨v̄, Lv⟩ = ⟨v̄, λv⟩   as v is an eigenvector of L
            = λ⟨v̄, v⟩ .

On the other hand,

    ⟨v̄, Lv⟩ = ⟨Lv̄, v⟩   as L is self-adjoint
            = ⟨λ̄v̄, v⟩   as v̄ is an eigenvector of L
            = λ̄⟨v̄, v⟩ .

• Hence λ⟨v̄, v⟩ = λ̄⟨v̄, v⟩, equivalently

    (λ − λ̄)⟨v̄, v⟩ = 0 .

• Under all useful definitions of an inner product ⟨v̄, v⟩ ≠ 0, and indeed
it is real and positive. Thus the only way the previous equation can be
satisfied is if λ = λ̄. That is, the eigenvalue must be real. ♠
This proof directly echoes that for the reality of the eigenvalues of the
Sturm-Liouville problem. The only difference is the generalisation in the
Sturm-Liouville problem to differential equations of the form Ly = λp(x)y for some
“weight function” p(x). The general theory can of course be extended to
cover such more general cases, but the details are more involved so here we
do not do so.
Example 7.36: Observe that the linear transformation in Exercise 7.32(c)
is self-adjoint, because K(x, y) = K(y, x), and you will have found
the eigenvalues are real and the particular eigenvectors you found were
orthogonal.

However, the eigenvalues of the linear transformation in Exercise 7.32(a)
are complex valued, and this is allowed because it is not self-adjoint.
That an n × n symmetric matrix is (orthogonally) diagonalisable, because
it always has n (orthogonal) eigenvectors, is mirrored by the claim of
completeness made for eigenfunctions of the Sturm-Liouville problem: that a
matrix has n eigenvectors means we can write any vector in ℝⁿ in terms of
the eigenvectors; that the eigenfunctions are complete means that we can
represent any function on the domain as a linear combination of the eigenfunctions.
Activity 7.M Do problems 1–6 from Problem Set 7.3 in Kreyszig [K,p384],
and Exercises 7.42–7.44 herein.
7.4.3 Expansions in orthogonal eigenfunctions<br />
Having seen that we can obtain sets of eigenfunctions as solutions of differential
equations, we now show that these can be used to produce a new
representation of almost arbitrarily complicated functions. This is advantageous
in many circumstances.
For an introductory example, use the Legendre polynomials to solve the ode

    (1 − x²)y″ − 2xy′ + y = 1 + x + x² ,

such that y(x) is well-behaved at x = ±1. First, rewrite the right-hand side
in terms of Legendre polynomials [K,p208]:

    (4/3)P₀(x) + P₁(x) + (2/3)P₂(x) = 4/3 + x + (2/3)·(1/2)(3x² − 1) = 1 + x + x² .

Second, try a solution in the form y = aP₀(x) + bP₁(x) + cP₂(x) for some
constants a, b and c to be determined. Because Legendre polynomials satisfy
(1 − x²)Pₙ″ − 2xPₙ′ = −n(n + 1)Pₙ, the left-hand side of the ode simplifies
immensely to just aP₀ − bP₁ − 5cP₂. Lastly, equate coefficients of the Legendre
polynomials on the two sides of the equation to deduce a = 4/3, b = −1 and
c = −2/15. Hence the solution is

    y = (4/3)P₀(x) − P₁(x) − (2/15)P₂(x) = 21/15 − x − (1/5)x² .

Of course here this could have been obtained more straightforwardly by simply
guessing this polynomial form. But the approach introduced here is much
more general as seen below.
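A numerical spot-check of this solution in Matlab (a sketch):

```matlab
% Check y = 21/15 - x - x^2/5 satisfies (1-x^2)y'' - 2x y' + y = 1 + x + x^2.
x = linspace(-1,1,9);
y   = 21/15 - x - x.^2/5;
yp  = -1 - 2*x/5;            % y'
ypp = -(2/5)*ones(size(x));  % y''
err = max(abs((1-x.^2).*ypp - 2*x.*yp + y - (1 + x + x.^2)))  % zero to round-off
```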
Reading 7.N Study Kreyszig §4.8 [K,pp240–6].
Activity 7.O Do problems from Problem Set 4.8 [K,pp246–7]. Send in to<br />
the examiner for feedback at least Q3 & 7.<br />
Now generalise the earlier example.<br />
Theorem 7.11 The solution to the ode [r(x)y′]′ + [q(x) + µp(x)]y = f(x),
subject to some homogeneous boundary conditions, for some constant µ may
be written

    y = ∑ₘ [⟨f/p, yₘ⟩/(µ − λₘ)] yₘ(x) ,

provided µ ≠ λₘ, where λₘ are eigenvalues and yₘ(x) are the orthonormal
eigenfunctions of the associated Sturm-Liouville problem.
Proof: Try a solution in the form y(x) = ∑ₘ aₘyₘ(x). Because [ry′ₘ]′ +
qyₘ = −λₘpyₘ, the ode becomes

    ∑ₘ aₘ(µ − λₘ)p(x)yₘ(x) = f(x) .

Multiply by yₙ(x) for any n and integrate over the domain, say [a, b], to
deduce

    ∑ₘ aₘ(µ − λₘ) ∫ₐᵇ p(x)yₘ(x)yₙ(x) dx = ∫ₐᵇ f(x)yₙ(x) dx .

The right-hand side is identical to the inner product, ⟨f/p, yₙ⟩, of f/p and yₙ.
Because the eigenfunctions are orthonormal with weight p(x), the integral on
the left-hand side is 1 if m = n and 0 otherwise. Thus the equation simplifies
to aₙ(µ − λₙ) = ⟨f/p, yₙ⟩, from which we deduce aₙ = ⟨f/p, yₙ⟩/(µ − λₙ) for
all n, provided µ ≠ λₙ. ♠
Example 7.37: Use eigenfunction expansion to solve the ode y″ + 2y = 1
such that y(0) = y(π) = 0.

Solution: First find the eigenfunctions of the associated problem: y″ +
λy = 0 such that y(0) = y(π) = 0. Fortunately this is well known to
us: the eigenvalues are λₙ = n² and the complete set of orthonormal
eigenfunctions is yₙ = √(2/π) sin nx.

Second, write the right-hand side, here f(x) = 1 for 0 < x < π, in
terms of the eigenfunctions. From Example 1 [K,pp241–2] we know
that we can do this as

    f(x) = 1 = (4/π)(sin x + (1/3) sin 3x + (1/5) sin 5x + · · ·)  for 0 < x < π .

Lastly, substitution shows that a solution expressed as a sum of the
eigenfunctions just involves dividing each term appearing above by the
corresponding 2 − λₙ:

    y(x) = (4/π)(sin x − (1/21) sin 3x − (1/115) sin 5x − · · ·) .
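As a rough numerical check, compare partial sums of this expansion, with each coefficient divided by 2 − λₙ as Theorem 7.11 prescribes, against the elementary exact solution (a sketch; the constant B follows from the boundary condition y(π) = 0):

```matlab
% Partial sum of the eigenfunction expansion of y'' + 2y = 1, y(0)=y(pi)=0.
x = linspace(0,pi,201);
y = zeros(size(x));
for n = 1:2:99   % only odd n contribute
  y = y + (4/pi)*sin(n*x)/(n*(2 - n^2));
end
% Exact solution by elementary methods, with B fixed by y(pi) = 0.
B = -(1 - cos(sqrt(2)*pi))/(2*sin(sqrt(2)*pi));
yexact = (1 - cos(sqrt(2)*x))/2 + B*sin(sqrt(2)*x);
err = max(abs(y - yexact))   % small, and shrinks as more terms are kept
```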
Example 7.38: In Example 7.35 we computed the eigenvalues, eigenvectors (eigenfunctions)
and normalised adjoint eigenvectors for the linear operator
Ly = −d²y/dx² with boundary conditions y(0) = 0 and y′(π) = (1/2)y′(0).
Write down the formal eigenfunction expansion of the solution to the
problem −y″ = h(x) with the same boundary conditions.

Solution: We may expand any function in terms of the eigenfunctions
as h(x) = ∑ⱼ₌₁^∞ ⟨wⱼ, h⟩ vⱼ(x). Then the formal solution to −y″ = h(x)
is

    y = ∑ⱼ₌₁^∞ (⟨wⱼ, h⟩/λⱼ) vⱼ(x) .
You should wonder what occurs if the possibility of dividing by a zero µ − λₘ
eventuates in the formal solution of Theorem 7.11. Just as for the solution
of linear algebraic equations, such division by zero indicates that the Sturm-Liouville
differential operator is “singular”. Thus if for any m it happens
that µ − λₘ = 0, then either the ode is inconsistent, indicated by the inner
product ⟨f/p, yₘ⟩ ≠ 0, or the ode is consistent, when ⟨f/p, yₘ⟩ = 0, and the
solution can include an arbitrary multiple of the corresponding eigenfunction;
that is, Ayₘ(x) could be added for any constant A.
7.4.4 Exercises<br />
Ex. 7.39: Find the eigenvalues, eigenvectors and left-eigenvectors of the following
matrices, and then verify the orthogonality between eigenvectors
and left-eigenvectors:

(a) [0 1; −6 5]

(b) [11 −6; 18 −10]

(c) [0 1 3; 1 6 9; −1 −5 −8]
Ex. 7.40: Deduce the adjoint eigenfunctions that correspond to the non-zero
eigenvalues of the linear operators in Exercise 7.32 (b) and (c).
Where appropriate, verify the orthogonality among the eigenfunctions
and these adjoint eigenfunctions.
Ex. 7.41: Prove that if L is a self-adjoint linear transformation with respect to some inner
product, then eigenvectors from different eigenspaces are orthogonal.
Ex. 7.42: Show that the differential operator Ly = d²/dx²[r(x) d²y/dx²], such that
y(0) = y′(0) = y(L) = y′(L) = 0, is self-adjoint and hence deduce it has
real eigenvalues and a complete set of eigenfunctions. For your interest
note that the ode Ly = h(x) with these boundary conditions describes
the deflection under a distributed load h(x) of a beam of varying shape,
encoded by r(x), with clamped ends.
Ex. 7.43: Show that the eigenfunctions of the Sturm-Liouville system

    −y″ = λy ,  y(0) = 0 ,  y(1) − 2y′(1) = 0 ,

are sin(√λ x), where the eigenvalues are the positive solutions to tan √λ =
2√λ. By sketching a graph show that λⱼ ≈ (2j − 1)²π²/4 for large j.
Why do we not need to worry about the possibility of complex eigenvalues?
Ex. 7.44: Similarly find the eigenfunctions of the Sturm-Liouville system

    −y″ = λy ,  y(0) = y′(0) ,  y(π) = 0 ,

and approximate values for the eigenvalues.
Ex. 7.45: A linear operator L is defined by

    Lf = f″ + 4f′ + 3f ,  where f(0) = f′(1) = 0 ,

on a vector space with inner product ⟨f, g⟩ = ∫₀¹ fg dx.

(a) Find the adjoint of L.

(b) Show that fₙ(x) = e⁻²ˣ sin(ωₙx) are eigenfunctions of L provided
ωₙ = 2 tan(ωₙ). (For example, ω₁ = 4.2748, ω₂ = 7.5965, etc.)
What are the corresponding eigenvalues?
7.4.5 Answers to selected Exercises<br />
7.8 (a) yes; (b) no; (c) yes; (d) yes; (e) no; (f) yes.

7.25 The adjoint is rotation by −θ.

7.27 L†f = ∫ₐᵇ w(y)w(x)⁻¹K(y, x)f(y) dy

7.28 (a) L†f = −df/dx such that f(0) + 2f(1) = 0; (b) L†f = f″ − 3f′ + 4f such
that f′(1) = 0 and f(0) = 0; (c) L†f = −f‴ − f′ such that f(0) = 0,
f(1) + 2f′(0) = 0 and 3f″(1) − f′(1) + 3f(1) = 0.

7.29 (a) L†f = −df/dx such that f(1) = 3f(0); (b) L†f = f″ − 2f′ + f
such that f(−1) = f(1) = 0; (c) L†f = −f‴ such that f(0) =
f(1) + f′(0) = 0 and f″(1) = f′(1).

7.31 (a) λ = 1, λ = 2; (b) λ = 1, λ = 3; (c) λ = −1.5, λ = 1; (d)
λ = −0.8, λ = 1.8.

7.32 (a) λ = ±i/√3 with eigenfunctions f(x) = 2x + (−1 ± i/√3); (b)
λ = b − a and the eigenfunctions are f(x) = eˣ; (c) λ = ±π/2
corresponding to eigenfunctions cos x and sin x.

7.39 (a) λ = 2 and 3, v₁ = (1, 2), v₂ = (1, 3), w₁ = (3, −1) and w₂ =
(−2, 1); (b) λ = 2 and −1, v₁ = (2, 3), v₂ = (1, 2), w₁ = (2, −1)
and w₂ = (−3, 2); (c) λ = 1, −1 and −2, v₁ = (−1, 2, −1), v₂ =
(2, 1, −1), v₃ = (−1, −1, 1), w₁ = (0, 1, 1), w₂ = (1, 2, 3) and w₃ =
(1, 3, 5).
7.40 (b) w(x) = e⁻ˣ; (c) w₁(x) = (2/π) cos x and w₂(x) = (2/π) sin x.
7.5 Summary<br />
• A vector space with the operations of vector addition and scalar multiplication
is the foundation for the study of transformations in both
finite and infinite dimensions (§7.1.1). The ten axioms of a vector
space are: closure and associativity of vector addition and scalar multiplication;
commutativity of vector addition; distributivity of scalar
multiplication over vector addition and of scalar multiplication over
scalar addition; the existence of a zero vector and a negative; and the
identity of scalar multiplication by 1.
• A subset of a vector space is a subspace if it is closed under vector
addition and scalar multiplication.

• The dimension of a vector space is the maximum number of linearly
independent basis vectors.
• An inner product imbues a vector space with distances, lengths and
angles (§7.1.2). The three axioms of an inner product (Defn. 7.3) are
linearity, symmetry and positivity. Inequalities, familiar from plane
geometry, follow in general.

• Distances and lengths are given by the norm ‖u‖ = √⟨u, u⟩ (Defn. 7.4).
The angle θ between two vectors is given by cos θ = ⟨u, v⟩/(‖u‖ ‖v‖);
they are orthogonal if the inner product is zero (§7.1.2).
• Matrix multiplication, differential and integral operators are examples
of linear transformations (§7.2.1).

• A linear transform (operator) is neatly complemented by its adjoint
(§7.2.2) defined by 〈u, Lv〉 = 〈L†u, v〉 (Defn. 7.7). A self-adjoint
transformation, such as the Sturm-Liouville operator, generalises the
concept of a symmetric matrix.

• The spectrum of a linear transformation and its adjoint are the same,
and the adjoint or left-eigenvectors are orthogonal to the ordinary
eigenvectors (§7.4.1). This allows them to be used to extract the
component of each eigenvector or eigenfunction in any given vector or
function.

• Thus the eigenvectors of a self-adjoint linear transformation (symmetric
matrix) are necessarily orthogonal and complete (§7.4.2). Solutions of
Sturm-Liouville systems provide an important example of this property.

• The solution to the ode [r(x)y′]′ + [q(x) + µp(x)]y = f(x), subject to
some homogeneous boundary conditions, for some constant µ may be written

    y = ∑_m 〈f/p, y_m〉/(µ − λ_m) · y_m(x),

provided µ ≠ λ_m, where λ_m are the eigenvalues and y_m(x) are the
orthonormal eigenfunctions of the associated Sturm-Liouville problem.
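The same expansion solves the finite-dimensional analogue: for a symmetric matrix A with orthonormal eigenvectors v_m and eigenvalues λ_m, the solution of (µI − A)y = f is y = ∑_m 〈f, v_m〉/(µ − λ_m) v_m. A minimal sketch follows; the particular 2×2 matrix, f and µ are illustrative choices, not taken from the text.

```python
import math

# Symmetric matrix A = [[2, 1], [1, 2]] has eigenvalues 3 and 1 with
# orthonormal eigenvectors (1, 1)/sqrt(2) and (1, -1)/sqrt(2).
lams = [3.0, 1.0]
s = 1 / math.sqrt(2)
vecs = [[s, s], [s, -s]]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def solve_by_expansion(f, mu):
    """y = sum_m <f, v_m>/(mu - lam_m) v_m solves (mu I - A) y = f,
    provided mu is not an eigenvalue of A."""
    y = [0.0, 0.0]
    for lam, v in zip(lams, vecs):
        c = dot(f, v) / (mu - lam)
        y = [yi + c * vi for yi, vi in zip(y, v)]
    return y

y = solve_by_expansion([1.0, 0.0], mu=5.0)
# Direct check: (5I - A) y = [[3, -1], [-1, 3]] y should recover f.
r = [3 * y[0] - y[1], -y[0] + 3 * y[1]]
# y is approximately [0.375, 0.125] and r approximately [1.0, 0.0]
```

The division by µ − λ_m shows why µ must avoid the spectrum: at µ = λ_m the corresponding coefficient blows up, just as for the Sturm-Liouville case above.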
Index

:=, 242
array, 66, 69
article, 53
author, 54
begin-end, 243
bye, 242
caption, 77
df, 221, 243
displaymath, 63
documentclass, 53
document, 53
end, 243
eqnarray, 68, 69
equation, 63
factorial, 243
factor, 215
for, 213, 243
graphicx, 76
includegraphics, 77
int, 213, 243
in, 244
left, 64
let, 243
maketitle, 54
mbox, 66
nonumber, 69
paragraph, 58
quad, 66
quit, 242
repeat-until, 226, 243
right, 64
section, 56
title, 54
until, 243
usepackage, 76
write, 215, 243
absolute convergence, 143
accents, 73
Achilles, 133
adjoint, 274, 274–283, 292, 298, 299, 313
adjoint eigenfunction, 308
adjoint eigenvectors, 292, 296, 307
age structured populations, 108
Airy’s equation, 239
alternating harmonic series, 143
ampersand, 59
analytic, 191, 193
analytic function, 151, 167
angle, 262, 265, 266, 266, 282
array, 67
associativity, 256–261, 312
autonomous, 35
Babbage, 206
basis, 256, 258
Bessel function, 202, 203, 234, 250
Bessel functions of the first kind, 201
Bessel’s equation, 201, 202, 237, 250
brace, 59, 60, 64
bracket, 64
bulleted list, 61
car traffic, 93, 94, 100, 105
caret, 59, 60
case statement, 67
Cauchy’s convergence, 142
Cauchy-Schwarz inequality, 265, 266, 267, 272
characteristic diagram, 102, 120
characteristic equation, 284
circular geometry, 195
closed, 256, 257, 259–262, 312
coefficients, 148, 191, 195, 197, 198, 220, 304
command definition, 74, 75
commutativity, 256, 257, 259, 260, 312
comparison test, 145
complete, 293, 303, 306, 308
complex conjugate, 301
computer algebra, 206, 207, 211, 214, 231, 233, 234, 240–242
conditional convergence, 143
conservation of mass, 96
conservation of momentum, 125
continuity equation, 96–98, 108, 109, 117, 125
converges, 134
critical point, 17
critical points, 23
definition, 46
definition, command, 74
degenerate case, 18, 19
delimiter, 64
diagonalisation, 291
dimension, 256, 312
dimensionality, 256, 258
directional derivatives, 172
displayed equations, 47
displayed mathematics, 63
distance, 262, 265, 265
distributivity, 256, 257, 261, 312
dollar, 59
dot product, 262
dots, 68
eigenfunction, 246, 251, 285, 287–289, 296, 297, 299, 303, 305–309, 313
eigenfunction expansion, 291, 306, 307
eigenspace, 308
eigenvalue, 37, 246, 251, 284–288, 290–292, 294, 296, 301–303, 305–309, 313
eigenvector, 37, 246, 284, 285, 287, 290–296, 301, 303, 307, 308
elastic artery, 124
elementary functions, 71
ellipsis, 48
empty set, 259
equation of state, 119
equilibrium, 106
Euler, 86
Eulerian description, 86
even powers, 149, 150
example, 46
exponential, 31, 296
extrema, 168, 170, 171, 173
figure, 75, 76
fixed point, 17–19, 23–26, 28, 31, 35–37, 106
float, 76
font, 53, 60
fraction, 61
Frobenius method, 196, 198, 201, 250
geometric series, 145
global maximum, 170
global minimum, 170
harmonic series, 139, 140, 143, 144
hash, 59
Hessian, 174
Hessian matrix, 174, 175, 179, 180
higher dimensions, 18, 19
html, 50
ideal gas, 119
identity, 256, 258, 261, 312
in-line mathematics, 59, 61, 73
indicial equation, 197, 199, 234, 235, 237, 250
infinite series, 134
infinite sum, 134
initial conditions, 212, 215, 223–225, 229, 231
inner product, 262, 263, 263, 265, 268, 272, 274–280, 282, 283, 305, 312
inner product space, 263, 265–267, 274
install LaTeX, 51
isoclines, 24, 24, 31
iteration, 211, 212, 217, 220–223, 225, 230, 235, 236, 239
iterative construction, 251
Jacobi, 37
Jacobian, 35, 37, 37, 38
Lagrange’s remainder, 159, 162
Lagrangian particle paths, 87
LaTeX, 49
LaTeX, document, 53
LaTeX, install, 51
LaTeX, web access, 52
least-squares solution, 273
left-eigenvectors, 292, 293–295, 301, 308
Legendre polynomials, 194, 249, 258, 304
Legendre’s equation, 193, 223, 224, 240, 249
length, 265
line breaks, 56
linear combination, 256, 258, 289, 293, 295
linear independence, 256
linear operator, 270, 271, 274, 276, 291, 296, 307, 308
linear transform, 270, 282, 285, 288, 291, 301, 303, 308
linear transformation, 269, 270, 313
linearisation, 24, 24, 28, 100, 112, 119, 125, 233
linearity, 263, 312
linearly independent, 258
list environment, 61
local maximum, 170, 172, 173, 175, 177, 180
local minimum, 172, 173, 177, 182
logarithm, 234
Maclaurin series, 153, 211, 212, 214, 216, 217, 221–223, 229, 232, 239–241
mass on a spring, 7
material derivative, 86, 87, 91, 92
mathematical functions, 48, 71
mathematics environment, 60, 66, 68
method of characteristics, 100, 101, 119
method of undetermined coefficients, 290
momentum equation, 115, 117, 125
multiplicity, 284
negative, 256, 259, 261, 312
negative definite, 174, 177
negative space, 65, 66
nonlinear differential equations, 23, 207
nonlinear ode, 217, 217, 221, 229, 233, 240, 241
norm, 265, 265, 312
normal space, 65
notation, 46
numbered list, 62
odd powers, 149
orbit, 11
order of, 221, 230, 233
orthogonal, 245, 246, 249, 251, 265, 266, 266, 289, 292, 293, 299, 303, 308, 312
paragraphs, 56
parallelogram equality, 265
parentheses, 64
partial derivative command, 74
partial sums, 134, 134, 135, 137, 138, 156
percent, 59
phase plane, 11, 11, 14, 24, 217
phase portrait, 11, 23
population model, 108
positive definite, 174, 174, 177
positivity, 263, 312
postscript, 76
power series, 147, 148, 155, 190, 191, 193, 196–198, 201, 211–213, 217, 218, 220–222, 230, 233, 234, 237, 241, 242, 249, 251
power series method, 189, 195, 216
preamble, 76
proof, 46
proof by contradiction, 139
punctuate, 47, 64
Pythagoras theorem, 266
quad space, 65
quadratic form, 174, 174
quadratic polynomials, 256
radius of convergence, 148, 151, 153, 166
range, 270, 271
ratio test, 145, 145, 149, 150
redefine, 74
reduce, 207, 242
regular point, 200, 201, 249
relations, 63
residual, 221, 224, 224, 226, 229, 230, 233, 235, 239, 251
Rolle’s theorem, 161
root test, 145
rubber band, 87
saddle point, 27, 28, 31, 172, 172, 173, 175, 177, 179, 181
scalar multiplication, 256, 256–258, 260, 261, 267, 282, 312
Schwarz inequality, 265
sections, 56
self-adjoint, 274, 274, 277–282, 298, 300, 301, 303, 308, 313
sequence, 130, 132, 134, 135, 138, 142
set union, 259
shear transformation, 275
shift summation indices, 191
singular, 249, 307
singular point, 200, 234, 249
slosh, 59
sonic boom, 119
sound, 119
space, 65
special functions, 249, 250
spectrum, 284, 292, 299
square integrable, 259, 259–261, 264, 271, 273
stability, 17, 17, 19, 37
stable, 17, 19, 21
stationary points, 171, 172, 178, 179, 181
structure, 195, 245
Sturm-Liouville, 302, 303
Sturm-Liouville equation, 246, 251, 280
Sturm-Liouville operators, 280, 281
Sturm-Liouville problem, 265, 289, 291, 302, 305, 313
Sturm-Liouville theory, 245
subscript, 59, 60, 72
subspace, 261, 268, 274, 278–280, 283, 284, 312
superscript, 59, 60, 72
symbols, 47, 60, 79
symmetric matrix, 274, 301, 303
symmetry, 263, 312
tabular format, 66
tangent plane, 172
Taylor polynomial, 156, 157–159, 161, 162
Taylor series, 152, 153, 153, 156, 162, 169, 180, 191, 200, 201, 221
Taylor’s series, multivariable, 166
Taylor’s theorem, 166, 173
telescopic sum, 135
telnet, 208
testing, 234
thin space, 65, 66, 71
tilde, 59
trajectory, 11, 11, 15
triangle inequality, 265, 267
truncation error, 156, 156
underscore, 59, 60
uniqueness, 154, 191, 249
unit vector, 265
unstable, 17, 19–21, 28, 42
vector addition, 256, 259, 261, 312
vector space, 255, 256, 258–263, 265–267, 269, 271, 282, 291, 312
vector subspace, 261, 261, 262
wave equation, 119, 124
wave speed, 101
Zeno of Elea, 130
Zeno’s Second Paradox, 133
zero vector, 256, 257, 259, 260, 312