21.06.2014 Views

Numerical Methods Contents - SAM


[Figure: distribution of machine numbers on the real axis near 0. Labels: 0; $B^{e_{\min}-1}$; spacing $B^{e_{\min}-m}$; spacing $B^{e_{\min}-m+1}$; spacing $B^{e_{\min}-m+2}$; gap partly filled with non-normalized numbers.]

Code 2.4.6: input errors and roundoff errors

>> format long;
>> a = 4/3; b = a-1; c = 3*b; e = 1-c
e = 2.220446049250313e-16
>> a = 1012/113; b = a-9; c = 113*b; e = 5+c
e = 6.750155989720952e-14
>> a = 83810206/6789; b = a-12345; c = 6789*b; e = c-1
e = -1.607986632734537e-09

Example 2.4.2 (IEEE standard 754 for machine numbers). → [30], → link

No surprise: for modern computers B = 2 (binary system)

single precision: m = 24∗, E ∈ {−125,...,128} ➣ 4 bytes
double precision: m = 53∗, E ∈ {−1021,...,1024} ➣ 8 bytes

∗: including bit indicating sign
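The exponent ranges quoted above can be checked from within MATLAB. The following is a minimal sketch, assuming the normalization x = f · 2^E with 0.5 ≤ f < 1, which is the convention used by the two-output form of `log2`; the variable names are chosen here for illustration only.

```matlab
% Recover the exponent range E and the mantissa length m from the
% smallest normalized and the largest finite machine numbers.
[f_min_d, E_min_d] = log2(realmin('double'));   % realmin = f * 2^E
[f_max_d, E_max_d] = log2(realmax('double'));
[f_min_s, E_min_s] = log2(realmin('single'));
[f_max_s, E_max_s] = log2(realmax('single'));

fprintf('double: E in {%d,...,%d}\n', E_min_d, E_max_d);
fprintf('single: E in {%d,...,%d}\n', E_min_s, E_max_s);

% The largest mantissa consists of m binary ones, i.e. f = 1 - 2^(-m).
m_double = -log2(1 - f_max_d);
m_single = -log2(1 - double(f_max_s));
fprintf('mantissa digits: double m = %d, single m = %d\n', m_double, m_single);
```

With this normalization the smallest normalized number has mantissa 0.5, so its exponent is the lower end of the range, and the largest finite number has the upper end.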

Remark 2.4.3 (Special cases in IEEE standard).

$E = e_{\max}$, M ≠ 0 ≙ NaN = Not a number → exception
$E = e_{\max}$, M = 0 ≙ Inf = Infinity → overflow
E = 0 ≙ Non-normalized numbers → underflow
E = 0, M = 0 ≙ number 0

>> x = exp(1000), y = 3/x, z = x*sin(pi), w = x*log(1)
x = Inf
y = 0
z = Inf
w = NaN
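The special cases can be probed directly. The sketch below is illustrative only (the variable names are not from the notes); it shows overflow to Inf, an undefined operation producing NaN, and the gradual underflow through non-normalized numbers below `realmin`.

```matlab
x = exp(1000);              % too large for double precision: overflow to Inf
w = x * 0;                  % Inf * 0 is undefined: NaN
tiny = realmin / 2;         % below realmin: a non-normalized number
smallest = realmin * eps;   % 2^(-1074), the smallest positive non-normalized number

disp(isinf(x));             % 1
disp(isnan(w));             % 1
disp(w == w);               % 0: NaN compares unequal even to itself
disp(tiny > 0);             % 1: underflow is gradual, the result is not yet zero
disp(smallest / 2 == 0);    % 1: below the smallest non-normalized number the result is 0
```

Because NaN is unequal to itself, tests for it should use `isnan` rather than `==`.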

Notation: floating point realization of $\star \in \{+, -, \cdot, /\}$: $\tilde{\star}$

Correct rounding:

$$\operatorname{rd}(x) = \operatorname*{arg\,min}_{\tilde{x} \in \mathbb{M}} |x - \tilde{x}|$$

(if non-unique, round to larger (in modulus) $\tilde{x} \in \mathbb{M}$: “rounding up”)

For any reasonable $\mathbb{M}$: small relative rounding error

$$\exists\, \mathrm{eps} \ll 1: \quad \frac{|\operatorname{rd}(x) - x|}{|x|} \le \mathrm{eps} \quad \forall x \in \mathbb{R}. \tag{2.4.1}$$

Realization of $\tilde{+}, \tilde{-}, \tilde{\cdot}, \tilde{/}$:

$$\star \in \{+, -, \cdot, /\}: \quad x \,\tilde{\star}\, y := \operatorname{rd}(x \star y) \tag{2.4.2}$$
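The bound (2.4.1) can be observed for a number that is not a machine number. The sketch below is an illustration under stated assumptions: it uses the exact integer 2^54 + 1, stored as `uint64`, as x and the conversion `double(x)` as rd(x). MATLAB's built-in `eps` is the spacing of doubles at 1, so rounding to the nearest machine number gives a relative error of at most `eps/2`.

```matlab
x  = bitshift(uint64(1), 54) + 1;    % exact integer 2^54 + 1, not representable as a double
xd = double(x);                      % rd(x): the nearest double, here 2^54
abs_err = double(x - uint64(xd));    % exact integer difference |x - rd(x)|
rel_err = abs_err / xd;              % dividing by rd(x) instead of x slightly overestimates

fprintf('rd(x) = %.0f, absolute error = %d, relative error = %.3e\n', ...
        xd, abs_err, rel_err);
fprintf('bound eps/2 = %.3e\n', eps/2);
```

Integer arithmetic is used for x so that the unrounded value and the rounding error are both available exactly.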

Example 2.4.4 (Characteristic parameters of IEEE floating point numbers (double precision)).

☞ MATLAB uses double precision by default

>> format hex; realmin, format long; realmin
ans = 0010000000000000
ans = 2.225073858507201e-308
>> format hex; realmax, format long; realmax
ans = 7fefffffffffffff
ans = 1.797693134862316e+308
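The hexadecimal output can be split into its bit fields. The following sketch assumes the IEEE 754 double layout of 1 sign bit, 11 exponent bits and 52 stored mantissa bits, which is not spelled out in the notes; the variable names are illustrative.

```matlab
bits_min = typecast(realmin, 'uint64');   % raw 64-bit pattern of realmin
bits_max = typecast(realmax, 'uint64');   % raw 64-bit pattern of realmax
mask = uint64(2^52 - 1);                  % selects the lower 52 bits

% Both numbers are positive (sign bit 0), so a right shift by 52 bits
% leaves exactly the 11-bit exponent field.
expo_min = bitshift(bits_min, -52);
expo_max = bitshift(bits_max, -52);
frac_min = bitand(bits_min, mask);        % stored mantissa bits
frac_max = bitand(bits_max, mask);

fprintf('realmin: %s  exponent field = %d  mantissa bits all zero: %d\n', ...
        num2hex(realmin), expo_min, frac_min == 0);
fprintf('realmax: %s  exponent field = %d  mantissa bits all one:  %d\n', ...
        num2hex(realmax), expo_max, frac_max == mask);
```

The smallest normalized number carries the smallest nonzero exponent field with an empty mantissa field, and the largest finite number carries the largest exponent field below the all-ones pattern with a full mantissa field.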

Assumption 2.4.2 (“Axiom” of roundoff analysis).

There is a small positive number eps, the machine precision, such that for the elementary arithmetic operations $\star \in \{+, -, \cdot, /\}$ and “hard-wired” functions∗ $f \in \{\exp, \sin, \cos, \log, \ldots\}$ there holds

$$x \,\tilde{\star}\, y = (x \star y)(1 + \delta), \qquad \tilde{f}(x) = f(x)(1 + \delta) \qquad \forall x, y \in \mathbb{M},$$

with $|\delta| < \mathrm{eps}$.

∗: this is an ideal, which may not be accomplished even by modern CPUs.
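As a worked illustration of how the assumption is typically used, consider adding three machine numbers x, y, z from left to right. This is a sketch: the symbols $\delta_1, \delta_2$ are introduced here for the perturbations of the two additions, and overflow is assumed not to occur.

```latex
\begin{align*}
(x \,\tilde{+}\, y) \,\tilde{+}\, z
  &= \bigl((x+y)(1+\delta_1) + z\bigr)(1+\delta_2),
     \qquad |\delta_1|, |\delta_2| < \mathrm{eps}, \\
  &= (x+y+z) + (x+y)\,\delta_1 + (x+y+z)\,\delta_2 + (x+y)\,\delta_1\delta_2 .
\end{align*}
% Relative error of the computed sum, for x+y+z \neq 0:
\[
\frac{\bigl|(x \,\tilde{+}\, y) \,\tilde{+}\, z - (x+y+z)\bigr|}{|x+y+z|}
  \le \mathrm{eps} + \frac{|x+y|}{|x+y+z|}\,\mathrm{eps}\,(1+\mathrm{eps}) .
\]
```

The factor $|x+y| / |x+y+z|$ is large when $x+y$ and $z$ nearly cancel, so a result can carry a relative error far above eps even though every single operation satisfies the assumption.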

Relative roundoff errors of elementary steps in a program are bounded by the machine precision!

Example 2.4.5 (Input errors and roundoff errors). → Code 2.4.6
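The first computation of Code 2.4.6 can be traced step by step. The sketch below repeats it with comments; the claim that only the first step introduces an error is an analysis added here, based on the binary expansion of 4/3, and is not taken from the notes.

```matlab
a = 4/3;      % input error: 4/3 has no finite binary expansion and is rounded
b = a - 1;    % exact: the difference is again a machine number
c = 3*b;      % exact: the product equals 1 - eps, a machine number
e = 1 - c;    % exact: what remains is the input error of a, scaled by 3

fprintf('e   = %.15e\n', e);
fprintf('eps = %.15e\n', eps);
```

In exact arithmetic e would be 0. Here the result coincides with MATLAB's `eps`, which shows an input error of relative size below eps becoming the entire result after cancellation.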


Example 2.4.7 (Machine precision for MATLAB). (CPU Intel Pentium)
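A common way to determine the machine precision experimentally is to halve a number until adding it to 1 no longer changes the result. The loop below is a minimal sketch of this idea, assuming that all operations are carried out in plain double precision, and it compares the outcome with MATLAB's built-in `eps`; the variable name `e` is illustrative.

```matlab
e = 1;
while 1 + e/2 > 1     % stop as soon as 1 + e/2 is rounded back to 1
    e = e/2;
end

fprintf('computed: %.15e\n', e);
fprintf('built-in: %.15e\n', eps);

% eps(x) gives the spacing of machine numbers at x; it grows with |x|.
fprintf('spacing at 1: %.3e, spacing at 1024: %.3e\n', eps(1), eps(1024));
```

The value found this way is the distance from 1 to the next larger machine number. With rounding to the nearest machine number, the relative rounding error of a single operation is at most half of it.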
