COPYRIGHT 2008, PRINCETON UNIVERSITY PRESS

purposes, let us consider how the computer may store the floating-point number

$$a = 11223344556677889900 = 1.12233445566778899 \times 10^{19}. \tag{2.4}$$

Because the exponent is stored separately and is a small number, we can assume that it will be stored in full precision. In contrast, some of the digits of the mantissa may be truncated. In double precision the mantissa of $a$ will be stored in two words, the most significant part representing the decimal 1.12233, and the least significant part 44556677. The digits beyond 7 are lost. As we see below, when we perform calculations with words of fixed length, it is inevitable that errors will be introduced (at least) into the least significant parts of the words.

2.1.1 Model for Disaster: Subtractive Cancellation

A calculation employing numbers that are stored only approximately on the computer can be expected to yield only an approximate answer. To demonstrate the effect of this type of uncertainty, we model the computer representation $x_c$ of the exact number $x$ as

$$x_c \simeq x(1 + \epsilon_x). \tag{2.5}$$

Here $\epsilon_x$ is the relative error in $x_c$, which we expect to be of a similar magnitude to the machine precision $\epsilon_m$. If we apply this notation to the simple subtraction $a = b - c$, we obtain

$$a = b - c \;\Rightarrow\; a_c \simeq b_c - c_c \simeq b(1 + \epsilon_b) - c(1 + \epsilon_c) \;\Rightarrow\; \frac{a_c}{a} \simeq 1 + \epsilon_b \frac{b}{a} - \frac{c}{a}\,\epsilon_c. \tag{2.6}$$

We see from (2.6) that the resulting error in $a$ is essentially a weighted average of the errors in $b$ and $c$, with no assurance that the last two terms will cancel. Of special importance here is to observe that the error in the answer $a_c$ increases when we subtract two nearly equal numbers ($b \simeq c$) because then we are subtracting off the most significant parts of both numbers and leaving the error-prone least-significant parts:

$$\frac{a_c}{a} \overset{\text{def}}{=} 1 + \epsilon_a \simeq 1 + \frac{b}{a}(\epsilon_b - \epsilon_c) \simeq 1 + \frac{b}{a}\max(|\epsilon_b|, |\epsilon_c|). \tag{2.7}$$

This shows that even if the relative errors in $b$ and $c$ may cancel somewhat, they are multiplied by the large number $b/a$, which can significantly magnify the error. Because we cannot assume any sign for the errors, we must assume the worst [the "max" in (2.7)].
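The error model in (2.5) through (2.7) can be checked numerically. The following is a minimal Python sketch, not taken from the text: the values of b and c (1/3 + 10^-10 and 1/3) and the helper name `rel_err` are illustrative assumptions, and exact rational arithmetic from the standard `fractions` module stands in for the "exact" numbers so that the relative errors of the stored doubles can be measured.

```python
from fractions import Fraction
import sys


def rel_err(stored: float, exact: Fraction) -> Fraction:
    """Signed relative error eps in the model stored = exact * (1 + eps).

    Hypothetical helper for this sketch; the arithmetic is exact because
    Fraction(stored) converts a double to a rational without rounding.
    """
    return (Fraction(stored) - exact) / exact


# Illustrative exact values (assumptions): b and c agree to about 10 digits.
b_exact = Fraction(1, 3) + Fraction(1, 10**10)
c_exact = Fraction(1, 3)
a_exact = b_exact - c_exact          # exactly 10**-10

# Computer representations: each is rounded to the nearest double.
b_c = float(b_exact)
c_c = float(c_exact)
a_c = b_c - c_c                      # the subtraction under study

eps_b = rel_err(b_c, b_exact)
eps_c = rel_err(c_c, c_exact)
eps_a = rel_err(a_c, a_exact)

# Error predicted by (2.6): eps_a = eps_b * b/a - eps_c * c/a
predicted = eps_b * b_exact / a_exact - eps_c * c_exact / a_exact

# The magnifying factor b/a that appears in (2.7)
amplification = b_exact / a_exact

print(f"machine epsilon  = {sys.float_info.epsilon:.3e}")
print(f"|eps_b|          = {float(abs(eps_b)):.3e}")
print(f"|eps_c|          = {float(abs(eps_c)):.3e}")
print(f"b/a              = {float(amplification):.3e}")
print(f"|eps_a| measured = {float(abs(eps_a)):.3e}")
print(f"|eps_a| from 2.6 = {float(abs(predicted)):.3e}")
```

With these assumed inputs the factor b/a is about 3.3 × 10^9, so relative errors in b and c that are no larger than the machine precision become a far larger relative error in a, even though the subtraction itself contributes essentially no additional rounding.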
