Chapter 3 (cont'd): Newton-Raphson, Secant, Fixed-Point Iteration
Newton-Raphson Method

It is important to remember that for Newton-Raphson it is necessary to have a good initial guess, otherwise the method may not converge.

Basic idea: Guess $x_1$. Draw the tangent to $f(x)$ at $x_1$ and use its intersection with the x-axis, $x_2$, as the second guess. Repeat until convergence.
The gradient of the tangent line is simply $f'(x_n)$ and so it has equation
$$y - f(x_n) = f'(x_n)(x - x_n).$$
To find $x_{n+1}$, consider this at $y = 0$, i.e.,
$$-f(x_n) = f'(x_n)(x_{n+1} - x_n),$$
giving
$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}.$$
Example: $f(x) = x e^x - 1$ in the interval $[0, 1]$. Try first guess $x_0 = 1$; since $f'(x) = (1 + x)e^x$,
$$x_{n+1} = x_n - \frac{x_n e^{x_n} - 1}{(1 + x_n)e^{x_n}}.$$

n    x_n        f(x_n)     f'(x_n)    x_{n+1}    e_{n+1} = x_{n+1} - x_n
0    1.0        1.71828    5.43656    0.68394    -0.31606
1    0.68394    0.35534    3.33701    0.57745    -0.10649
2    0.57745    0.028721   2.81021    0.56723    -0.01021
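To make the recipe concrete, here is a minimal Python sketch of the iteration applied to this example (the function names, tolerance, and iteration cap are illustrative choices, not part of the notes):

```python
import math

def newton_raphson(f, fprime, x0, tol=1e-6, max_iter=50):
    """Newton-Raphson iteration: x_{n+1} = x_n - f(x_n)/f'(x_n)."""
    x = x0
    for n in range(max_iter):
        fx = f(x)
        x_new = x - fx / fprime(x)
        print(f"n={n}  x_n={x:.5f}  f(x_n)={fx:.5f}  x_{{n+1}}={x_new:.5f}")
        if abs(x_new - x) < tol:   # stopping criterion on the step size
            return x_new
        x = x_new
    raise RuntimeError("Newton-Raphson did not converge")

# Example from the notes: f(x) = x e^x - 1, first guess x0 = 1
root = newton_raphson(lambda x: x * math.exp(x) - 1,
                      lambda x: (1 + x) * math.exp(x),
                      x0=1.0)
print("root ≈", root)  # ≈ 0.567143, reproducing the table above
```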
Caution: If the initial guess is not accurate enough the algorithm may not converge to a root. This is especially true when $f'(x)$ becomes small somewhere (so-called 'shallow gradient' functions).
Convergence of the Newton-Raphson method:

Define $e_n = x_n - r$, where $r$ is the root. Substituting into the general iterate
$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}$$
gives
$$e_{n+1} = e_n - \frac{f(e_n + r)}{f'(e_n + r)} \approx e_n - \frac{f(r) + e_n f'(r) + \frac{e_n^2}{2} f''(r)}{f'(r) + e_n f''(r) + \frac{e_n^2}{2} f'''(r)} \qquad \text{(Taylor)}.$$
We know that $f(r) = 0$ as $r$ is a root. Next use the expansion
$$\frac{1}{1 + x} = 1 - x + x^2 - \ldots, \qquad |x| < 1,$$
[Figure: Illustrating the Newton-Raphson algorithm ($f(x)$ against $x$, showing $r$, $x_2$, $x_1$).]

[Figure: Illustrating the Newton-Raphson algorithm failing due to a point where $f'(x) \approx 0$.]

[Figure: Illustrating the secant algorithm ($f(x)$ against $x$, showing $r$, $x_3$, $x_2$, $x_1$).]
to give
$$e_{n+1} = e_n - e_n\left(1 + \frac{e_n}{2}\frac{f''(r)}{f'(r)}\right)\left(1 + e_n\frac{f''(r)}{f'(r)} + \frac{e_n^2}{2}\frac{f'''(r)}{f'(r)}\right)^{-1}$$
$$= e_n - e_n\left(1 + \frac{e_n}{2}\frac{f''(r)}{f'(r)}\right)\left(1 - e_n\frac{f''(r)}{f'(r)} - \frac{e_n^2}{2}\frac{f'''(r)}{f'(r)} + e_n^2\frac{f''(r)^2}{f'(r)^2} + O(e_n^3)\right)$$
$$= \frac{e_n^2}{2}\frac{f''(r)}{f'(r)} + O(e_n^3).$$

To leading order then
$$e_{n+1} = \frac{e_n^2}{2}\frac{f''(r)}{f'(r)}$$
as $e_n \to 0$, provided that $f'(r) \neq 0$. Since $e_{n+1}/e_n^2$ is $O(1)$, the Newton-Raphson method is said to converge QUADRATICALLY ($p = 2$).
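As a quick check of this rate against the example above (an added illustration, taking $r \approx 0.567143$): for $f(x) = x e^x - 1$ we have
$$\frac{f''(r)}{2 f'(r)} = \frac{(2 + r)e^r}{2(1 + r)e^r} = \frac{2 + r}{2(1 + r)} \approx 0.819,$$
and the tabulated iterates give $e_3/e_2^2 \approx (0.56723 - 0.567143)/(0.57745 - 0.567143)^2 \approx 0.82$, consistent with quadratic convergence.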
Advantage: quadratic convergence.

Disadvantages: convergence is not assured, and the method can be computationally expensive since the derivative $f'(x)$ has to be calculated at each step.

The NR method is often used together with one of the bracketing methods discussed above. By using the bisection method (for example) we can get a good initial guess and then obtain fast convergence with NR, as in the sketch below.
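A minimal Python sketch of this hybrid strategy (the function name, the number of bisection steps, and the tolerance are illustrative assumptions):

```python
def bisect_then_newton(f, fprime, a, b, bisect_steps=5, tol=1e-10):
    """Refine a bracket [a, b] with a few bisection steps, then polish
    the midpoint with Newton-Raphson. Assumes f(a) and f(b) have
    opposite signs so the bracket contains a root."""
    if f(a) * f(b) > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(bisect_steps):          # cheap, guaranteed refinement
        m = 0.5 * (a + b)
        if f(a) * f(m) <= 0:
            b = m
        else:
            a = m
    x = 0.5 * (a + b)                      # good initial guess for NR
    for _ in range(50):                    # fast quadratic convergence
        step = f(x) / fprime(x)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("did not converge")
```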
In order to avoid derivatives but still have a fast rate of convergence (compared with linear), a commonly used alternative is...
The Secant Method

(As seen in the shooting algorithm.)

In the secant method the new iterate $x_{n+1}$ is found as the intersection of the x-axis and the straight line through $f(x_n)$ and $f(x_{n-1})$. The equation of the line through $f(x_n)$ and $f(x_{n-1})$ is given by
$$\frac{y - f(x_n)}{f(x_{n-1}) - f(x_n)} = \frac{x - x_n}{x_{n-1} - x_n}.$$
Imposing $y = 0$ at $x = x_{n+1}$ we obtain
$$x_{n+1} = x_n - (x_n - x_{n-1})\frac{f(x_n)}{f(x_n) - f(x_{n-1})}. \qquad (*)$$
• Notice (*) is equivalent to Newton-Raphson if we approximate
$$f'(x_n) \approx \frac{f(x_n) - f(x_{n-1})}{x_n - x_{n-1}}.$$
• It is better to implement (*) than
$$x_{n+1} = \frac{x_{n-1} f(x_n) - x_n f(x_{n-1})}{f(x_n) - f(x_{n-1})}$$
in order to reduce loss-of-significance errors (see the sketch below).
• A downside of the method, compared to Newton-Raphson, is that we need TWO initial guesses. BUT it is no longer necessary to compute derivatives.
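A minimal Python sketch of the secant method, implementing form (*) directly (function names and tolerance are illustrative choices):

```python
import math

def secant(f, x0, x1, tol=1e-10, max_iter=50):
    """Secant iteration using form (*): no derivatives needed,
    but two initial guesses are required."""
    for _ in range(max_iter):
        f0, f1 = f(x0), f(x1)
        # (*): x_{n+1} = x_n - (x_n - x_{n-1}) f(x_n) / (f(x_n) - f(x_{n-1}))
        x2 = x1 - (x1 - x0) * f1 / (f1 - f0)
        if abs(x2 - x1) < tol:
            return x2
        x0, x1 = x1, x2
    raise RuntimeError("secant method did not converge")

# Same example as before: f(x) = x e^x - 1 with guesses in [0, 1]
print(secant(lambda x: x * math.exp(x) - 1, 0.0, 1.0))  # ≈ 0.567143
```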
How about convergence? With $e_n = x_n - r$ as before, substituting into the (algebraically equivalent) form of (*) gives
$$e_{n+1} + r = \frac{(e_{n-1} + r) f(e_n + r) - (e_n + r) f(e_{n-1} + r)}{f(e_n + r) - f(e_{n-1} + r)} = r + \frac{e_{n-1} f(e_n + r) - e_n f(e_{n-1} + r)}{f(e_n + r) - f(e_{n-1} + r)}.$$
Taylor expanding,
$$e_{n+1} = \frac{e_{n-1}\left(f(r) + e_n f'(r) + \frac{e_n^2}{2} f''(r)\right) - e_n\left(f(r) + e_{n-1} f'(r) + \frac{e_{n-1}^2}{2} f''(r)\right)}{f(r) + e_n f'(r) + \frac{e_n^2}{2} f''(r) - f(r) - e_{n-1} f'(r) - \frac{e_{n-1}^2}{2} f''(r)}.$$
Now $f(r) = 0$ since $r$ is a root, so
$$e_{n+1} = \frac{e_{n-1} e_n \frac{f''(r)}{2}(e_n - e_{n-1})}{(e_n - e_{n-1}) f'(r) + \frac{f''(r)}{2}(e_n^2 - e_{n-1}^2)} = \frac{e_{n-1} e_n}{2} \frac{f''(r)}{f'(r)} \left(1 + \frac{1}{2}(e_n + e_{n-1}) \frac{f''(r)}{f'(r)}\right)^{-1} \approx \frac{e_{n-1} e_n}{2} \frac{f''(r)}{f'(r)} \left(1 - \frac{1}{2}(e_n + e_{n-1}) \frac{f''(r)}{f'(r)}\right).$$
So to leading order
$$e_{n+1} \approx \frac{1}{2} \frac{f''(r)}{f'(r)}\, e_{n-1} e_n = \Gamma e_{n-1} e_n \qquad \text{as } n \to \infty.$$
To determine the rate of convergence we need to find $p$ such that
$$e_{n+1} \sim e_n^p \qquad \text{as } n \to \infty,$$
knowing that
$$\frac{e_{n+1}}{e_n e_{n-1}} = \Gamma_n, \qquad \text{with } \Gamma_n \to \Gamma \text{ as } n \to \infty.$$
We look for a solution of the form
$$e_{n+1} = A e_n^p, \qquad \text{so that} \qquad e_n = A e_{n-1}^p.$$
Rearranging gives
$$e_{n-1} = \left(\frac{e_n}{A}\right)^{1/p},$$
which allows us to solve for $A$ and $p$ using
$$e_{n+1} = \Gamma e_n e_{n-1} = \frac{\Gamma}{A^{1/p}}\, e_n^{1 + 1/p} = A e_n^p.$$
Equating the exponents of $e_n$ gives
$$p = 1 + \frac{1}{p} \;\Rightarrow\; p^2 - p - 1 = 0, \qquad \text{and} \qquad \Gamma = A^{1 + 1/p} = A^p.$$
The rate of convergence $p$ is therefore given by the positive root
$$p = \frac{1 + \sqrt{5}}{2} \approx 1.618\ldots,$$
and we have
$$\frac{e_{n+1}}{e_n^p} \to \left(\frac{1}{2} \frac{f''(r)}{f'(r)}\right)^{1/p}.$$
So the secant method converges almost quadratically, with $p \approx 1.618$, and is therefore considerably faster than the linear interval-refinement methods examined earlier. Although slower to converge than Newton-Raphson, it has the advantage of being much easier to implement.
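This rate can be checked numerically: if $e_{n+1} \approx A e_n^p$, the exponent can be estimated as $p \approx \ln(e_{n+1}/e_n)/\ln(e_n/e_{n-1})$. A short Python sketch (an added illustration; the test function and starting guesses follow the earlier example):

```python
import math

f = lambda x: x * math.exp(x) - 1
r = 0.5671432904097838           # accurate root of x e^x = 1

# generate secant iterates by hand so we can inspect the errors
xs = [0.0, 1.0]
for _ in range(6):
    x0, x1 = xs[-2], xs[-1]
    xs.append(x1 - (x1 - x0) * f(x1) / (f(x1) - f(x0)))

e = [abs(x - r) for x in xs]
for n in range(2, len(e) - 1):
    p_est = math.log(e[n + 1] / e[n]) / math.log(e[n] / e[n - 1])
    print(f"n={n}  p ≈ {p_est:.3f}")   # estimates approach 1.618
```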
Fixed Point Iteration

The final part of this section considers fixed point iteration.

Consider Newton-Raphson. This can be written as
$$x_{n+1} = G(x_n), \qquad \text{where } G(x_n) = x_n - \frac{f(x_n)}{f'(x_n)}.$$
We can think of this as a quite general algorithm, called fixed point iteration. It can be applied to the problem $f(x) = 0$ any time you can write $f(x)$ as $f(x) = x - G(x)$, so that
$$f(x) = 0 \quad \text{is equivalent to} \quad x = G(x).$$
Example: $f(x) = x^2 - x - 2$

(1) $G(x) = x^2 - 2$
(2) $G(x) = \sqrt{x + 2}$
(3) $G(x) = 1 + 2/x$
(4) $G(x) = x - (x^2 - x - 2)/m$

$G(x)$ is called the ITERATION FUNCTION. The root can be looked for iteratively as
$$x_{n+1} = G(x_n),$$
giving

(1) $x_{n+1} = x_n^2 - 2$
(2) $x_{n+1} = \sqrt{x_n + 2}$
(3) $x_{n+1} = 1 + 2/x_n$
(4) $x_{n+1} = x_n - (x_n^2 - x_n - 2)/m$
NOTE: not all of these iterations will converge to the root! For the fixed point iteration to be useful we have to check:

1. Given a starting point $x_0$, it is possible to calculate $x_1, x_2, \ldots$ iteratively (consider e.g. $G(x) = -\sqrt{x}$, which fails as soon as an iterate goes negative).

2. The sequence $x_1, x_2, \ldots$ converges to some point $r$.

3. $r$ is a FIXED POINT of $G(x)$, i.e. $G(r) = r$.
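As a quick numerical illustration (an added sketch; the choice $m = 4$ in scheme (4) and the starting guess $x_0 = 1.5$ are my own), iterating each $G$ shows that schemes (2)-(4) settle on the root $x = 2$ of $x^2 - x - 2 = (x - 2)(x + 1)$, while scheme (1) wanders without converging:

```python
import math

schemes = {
    "(1) x^2 - 2":          lambda x: x**2 - 2,
    "(2) sqrt(x + 2)":      lambda x: math.sqrt(x + 2),
    "(3) 1 + 2/x":          lambda x: 1 + 2 / x,
    "(4) x - f(x)/m, m=4":  lambda x: x - (x**2 - x - 2) / 4,
}

for name, G in schemes.items():
    x = 1.5                          # same starting guess for each scheme
    for _ in range(30):
        x = G(x)                     # fixed point iteration x_{n+1} = G(x_n)
    print(f"{name:20s} -> {x:.6f}")  # (2)-(4) print 2.000000; (1) does not settle
```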
[Figure: Illustrating the fixed point iteration ($f(x)$ against $x$, and $G(x)$ against $x$, each showing the root $r$).]
Point 1 is satisfied if $G : I \to I$ maps the interval $I = [a, b]$ into itself. Then $x_0 \in I$ implies that
$$x_1 = G(x_0) \in I,$$
and therefore that all subsequent $x_{n+1} = G(x_n) \in I$.

Considering now point 3: it is easy to show that if $G$ maps $[a, b]$ into itself then $\exists$ a fixed point such that $G(r) = r$. In fact, if $G(a) = a$ or $G(b) = b$ then $a$ or $b$ is a fixed point. If $G(a) \neq a$ and $G(b) \neq b$ then $G(a) > a$ and $G(b) < b$, since $G$ maps $I \to I$. So $h(x) = G(x) - x$ is continuous on $[a, b]$ (assuming $G$ is continuous) and satisfies $h(a) > 0$ and $h(b) < 0$. By the intermediate value theorem, therefore, $\exists$ a point $r$ such that
$$h(r) = 0.$$
Therefore $r$ is also a fixed point of $G(x)$: that is, $G(r) = r$.
Point 2 deals with the issue of convergence to a root. Consider this graphically (see the figure below): cases (a) and (b) converge, while cases (c) and (d) diverge. The convergence of the iteration depends on the slope of $G(x)$ near $r$: if $|G'(x)| > 1$ there, we have divergence. In fact, examining the error,
$$e_{n+1} + r = G(e_n + r) \approx G(r) + e_n G'(r),$$
and since $G(r) = r$,
$$e_{n+1} \approx e_n G'(r).$$
If $|G'(r)| < 1$ then the fixed point iteration CONVERGES LINEARLY. If in addition $G'(r) > 0$ the convergence is monotonic, otherwise it is oscillatory.
[Figure: Illustrating four examples of fixed point iteration, plotting $G(x)$ against $x$ with iterates $x_0, x_1, x_2$ around the fixed point $r$. (a) and (b) converge, whereas (c) and (d) diverge.]
Stopping Criteria

When you implement one of these methods remember to include a stopping criterion to end the iteration, especially when the method is not guaranteed to converge. You can choose among:

1. A bound on the absolute value of the error,
$$|x_{n+1} - x_n| < \delta \qquad (\text{e.g. } \delta < 0.5 \times 10^{-6} \text{ for 6 dec. pl.}).$$

2. A bound on the relative value of the error,
$$\frac{|x_{n+1} - x_n|}{|x_{n+1}|} < \delta.$$
This is equivalent to a percentage error, but be careful when $x_n \sim 0$.

3. A check whether $f(x_n) \sim 0$, i.e. ensure that
$$|f(x_n)| < \delta.$$

4. A limit on the number of steps in the iteration. (Always use this together with another condition when there is a chance the method will not converge, to avoid an infinite loop.) A sketch combining these criteria appears below.
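A minimal Python sketch combining criteria 2 and 4 (the names, the tolerance, and the small guard against division by a near-zero iterate are illustrative choices):

```python
import math

def fixed_point(G, x0, delta=0.5e-6, max_iter=100):
    """Fixed point iteration x_{n+1} = G(x_n) with two stopping criteria:
    a relative-error bound (criterion 2) and an iteration cap (criterion 4)."""
    x = x0
    for _ in range(max_iter):
        x_new = G(x)
        # criterion 2: relative error, with a floor to guard against x_new ~ 0
        if abs(x_new - x) < delta * max(abs(x_new), 1e-12):
            return x_new
        x = x_new
    # criterion 4: give up rather than loop forever
    raise RuntimeError(f"no convergence within {max_iter} iterations")

# e.g. scheme (2) from above converges to the root x = 2
print(fixed_point(lambda x: math.sqrt(x + 2), x0=1.5))
```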