Numerical Methods Contents - SAM
[Figure: machine numbers on the real axis near 0. Labels: 0, B^{e_min−1}; spacing B^{e_min−m}, spacing B^{e_min−m+1}, spacing B^{e_min−m+2}; gap partly filled with non-normalized numbers.]
Example 2.4.2 (IEEE standard 754 for machine numbers). → [30], → link
Code 2.4.6: input errors and roundoff errors
>> format long;
>> a = 4/3; b = a-1; c = 3*b; e = 1-c
e = 2.220446049250313e-16
>> a = 1012/113; b = a-9; c = 113*b; e = 5+c
e = 6.750155989720952e-14
>> a = 83810206/6789; b = a-12345; c = 6789*b; e = c-1
e = -1.607986632734537e-09
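The same three computations can be replayed in any environment that uses IEEE double precision. The following is a minimal sketch in Python rather than MATLAB (a substitution made for illustration; the function name `residuals` is hypothetical). In exact arithmetic all three results would be 0.

```python
def residuals():
    """Repeat the three computations of Code 2.4.6 in double precision."""
    # 4/3 is not a machine number, so a already carries an input (rounding) error.
    a = 4 / 3
    b = a - 1
    c = 3 * b
    e1 = 1 - c

    a = 1012 / 113
    b = a - 9
    c = 113 * b
    e2 = 5 + c

    a = 83810206 / 6789
    b = a - 12345
    c = 6789 * b
    e3 = c - 1
    return e1, e2, e3


if __name__ == "__main__":
    for e in residuals():
        print(repr(e))
```

The subtraction `a - 9` (respectively `a - 12345`) cancels the leading digits of `a`, so the tiny rounding error made when storing the quotient becomes visible in the final result.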
No surprise: for modern computers B = 2 (binary system)
single precision: m = 24*, E ∈ {−125, ..., 128} ➣ 4 bytes
double precision: m = 53*, E ∈ {−1021, ..., 1024} ➣ 8 bytes
*: including bit indicating sign
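These parameters can be queried on a given machine. The sketch below uses Python instead of MATLAB (an assumption made for illustration) and relies on the fact that Python's `float` is an IEEE double on common platforms; `sys.float_info` and `struct.calcsize` are standard-library facilities.

```python
import struct
import sys

info = sys.float_info  # parameters of Python's float type

print("base B           :", info.radix)
print("mantissa digits m:", info.mant_dig)   # binary digits of precision
print("exponent range E :", info.min_exp, "...", info.max_exp)
print("bytes per single :", struct.calcsize("<f"))
print("bytes per double :", struct.calcsize("<d"))
```

On an IEEE-conforming platform the reported exponent range coincides with the set {−1021, ..., 1024} stated above for double precision.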
Remark 2.4.3 (Special cases in IEEE standard).
E = e_max, M ≠ 0 ≙ NaN = Not a number → exception
E = e_max, M = 0 ≙ Inf = Infinity → overflow
E = 0 ≙ Non-normalized numbers → underflow
E = 0, M = 0 ≙ number 0
>> x = exp(1000), y = 3/x, z = x*sin(pi), w = x*log(1)
x = Inf
y = 0
z = Inf
w = NaN
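The special cases can be inspected at the bit level. The following sketch is in Python, not MATLAB (an assumption for illustration), and the helper `fields` is hypothetical. It splits a double into its stored exponent field and mantissa field: "E = e_max" in the table corresponds to an exponent field of all ones (2047), "E = 0" to an exponent field of 0. Since Python's `math.exp(1000)` raises an exception instead of returning Inf, the overflow is produced by a multiplication.

```python
import math
import struct


def fields(x):
    """Split a double into (sign, exponent field, mantissa field) as integers."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    return bits >> 63, (bits >> 52) & 0x7FF, bits & ((1 << 52) - 1)


x = 1e308 * 10               # overflow: the product exceeds realmax and becomes Inf
y = 3 / x                    # 3/Inf = 0
z = x * math.sin(math.pi)    # sin(pi) evaluates to a tiny positive number, so z = Inf
w = x * math.log(1)          # Inf * 0 has no meaningful value: NaN

for name, value in [("x", x), ("y", y), ("z", z), ("w", w), ("tiny", 5e-324)]:
    sign, exponent, mantissa = fields(value)
    print(name, repr(value), "exponent field =", exponent, "mantissa field =", mantissa)
```

The last entry, `5e-324`, is the smallest positive non-normalized double: exponent field 0, mantissa field 1.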
✸
△
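Notation: floating point realization of $\star \in \{+, -, \cdot, /\}$: $\tilde{\star}$

Correct rounding:
\[ \operatorname{rd}(x) = \operatorname*{arg\,min}_{\tilde{x} \in \mathbb{M}} |x - \tilde{x}| \]
(if non-unique, round to the larger (in modulus) $\tilde{x} \in \mathbb{M}$: "rounding up")

For any reasonable $\mathbb{M}$: small relative rounding error
\[ \exists\, \mathrm{eps} \ll 1: \quad \frac{|\operatorname{rd}(x) - x|}{|x|} \le \mathrm{eps} \quad \forall x \in \mathbb{R} . \tag{2.4.1} \]

Realization of $\tilde{+}, \tilde{-}, \tilde{\cdot}, \tilde{/}$:
\[ \star \in \{+, -, \cdot, /\}: \quad x \,\tilde{\star}\, y := \operatorname{rd}(x \star y) \tag{2.4.2} \]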
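The bound (2.4.1) can be checked exactly for individual numbers by using rational arithmetic. This is a sketch in Python (not MATLAB, an assumption for illustration); the name `rel_rounding_error` and the constant `U` are hypothetical. Python's conversion to `float` rounds to the nearest double but breaks ties towards an even mantissa, not away from zero as in the rule above; this does not affect the size of the error.

```python
from fractions import Fraction

U = Fraction(1, 2**53)  # bound on the relative error of rounding to the nearest double


def rel_rounding_error(x):
    """Exact relative error |rd(x) - x| / |x| of rounding the rational x to a double."""
    rd_x = Fraction(float(x))  # float(x) rounds to nearest; Fraction(...) converts exactly
    return abs(rd_x - x) / abs(x)


for x in [Fraction(1, 3), Fraction(1, 10), Fraction(2, 3), Fraction(22, 7)]:
    err = rel_rounding_error(x)
    print(x, float(err), err <= U)

# (2.4.2) for one pair of machine numbers: the computed sum is the rounded exact sum.
x, y = 0.1, 0.2
print(x + y == float(Fraction(x) + Fraction(y)))
```

Numbers that are themselves doubles, such as 1/2, are reproduced without any error.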
✸
Example 2.4.4 (Characteristic parameters of IEEE floating point numbers (double precision)).
☞ MATLAB uses double precision by default
>> format hex; realmin, format long; realmin
ans = 0010000000000000
ans = 2.225073858507201e-308
>> format hex; realmax, format long; realmax
ans = 7fefffffffffffff
ans = 1.797693134862316e+308
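The same bit patterns can be obtained outside MATLAB. The sketch below is in Python (an assumption for illustration); the helper `to_hex` is hypothetical and mimics the output of `format hex` by printing the 64 bits of a double as 16 hexadecimal digits.

```python
import struct
import sys


def to_hex(x):
    """Return the 64-bit IEEE pattern of a double as 16 hex digits."""
    return struct.pack(">d", x).hex()


realmin = sys.float_info.min  # smallest positive normalized double
realmax = sys.float_info.max  # largest finite double

print(to_hex(realmin), repr(realmin))
print(to_hex(realmax), repr(realmax))
```

In the pattern of `realmin` the exponent field is 1 and the mantissa field is 0; in the pattern of `realmax` the exponent field is one below the all-ones value reserved for Inf and NaN, and every mantissa bit is set.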
Assumption 2.4.2 (“Axiom” of roundoff analysis).
There is a small positive number eps, the machine precision, such that for the elementary arithmetic operations $\star \in \{+, -, \cdot, /\}$ and “hard-wired” functions* $f \in \{\exp, \sin, \cos, \log, \ldots\}$ there holds
\[ x \,\tilde{\star}\, y = (x \star y)(1 + \delta) , \quad \tilde{f}(x) = f(x)(1 + \delta) \quad \forall x, y \in \mathbb{M} , \]
with $|\delta| < \mathrm{eps}$.
*: this is an ideal, which may not be accomplished even by modern CPUs.
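For the four arithmetic operations the assumption can be tested empirically. The sketch below is written in Python (not MATLAB, an assumption for illustration) with a hypothetical helper `delta`. It computes δ exactly in rational arithmetic for random machine numbers and takes eps to be `sys.float_info.epsilon` = 2^-52. It makes no statement about the "hard-wired" functions.

```python
import operator
import random
import sys
from fractions import Fraction

EPS = sys.float_info.epsilon  # 2**-52, taken here as the machine precision eps


def delta(op, x, y):
    """Exact delta such that the floating point result equals (x op y) * (1 + delta)."""
    exact = op(Fraction(x), Fraction(y))   # rational arithmetic, no rounding
    computed = Fraction(op(x, y))          # floating point result, converted exactly
    return (computed - exact) / exact


random.seed(0)
worst = Fraction(0)
for _ in range(1000):
    x, y = random.uniform(0.5, 2.0), random.uniform(0.5, 2.0)
    for op in (operator.add, operator.sub, operator.mul, operator.truediv):
        if op is operator.sub and x == y:
            continue  # exact result is 0, so a relative error is undefined
        worst = max(worst, abs(delta(op, x, y)))

print("largest |delta| observed:", float(worst))
print("eps                     :", EPS)
```

The operands are restricted to [0.5, 2) so that neither overflow nor underflow occurs; outside the normalized range the relative bound no longer applies.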
Example 2.4.5 (Input errors and roundoff errors). → Code 2.4.6
✸
Relative roundoff errors of elementary steps in a program are bounded by the machine precision!
Example 2.4.7 (Machine precision for MATLAB). (CPU Intel Pentium)
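A common experiment for determining the machine precision is to halve a trial value until adding it to 1 no longer changes the result. The sketch below is in Python (an assumption for illustration, not necessarily the computation this example uses); the function name `machine_eps` is hypothetical. The value it finds is the spacing between 1 and the next larger double, which is also what MATLAB's built-in `eps` returns.

```python
import sys


def machine_eps():
    """Halve eps until adding eps/2 to 1.0 no longer changes the result."""
    eps = 1.0
    while 1.0 + eps / 2 > 1.0:
        eps /= 2
    return eps


print(machine_eps())           # found by experiment
print(sys.float_info.epsilon)  # value reported by the Python runtime
```

With the default rounding mode the sum 1 + 2^-53 lies exactly halfway between two doubles and is rounded back to 1, so the loop stops at 2^-52.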