DRAFT IEEE Standard for Binary Floating-Point Arithmetic - Sonic.net


DRAFT IEEE Standard for Floating-Point Arithmetic – 2003 August 12 10:20

3.0. Background and Terminology

Floating-point arithmetic is a systematic approximation of real arithmetic. Floating-point arithmetic can represent only a finite subset of the infinite set of real numbers. Additionally, many of the axioms of real arithmetic, such as associativity of addition, do not hold for floating-point arithmetic. The mathematical structure underpinning the arithmetic in this standard is the extended reals, that is, the set of real numbers together with positive and negative infinity.
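The failure of associativity can be observed directly. The following is a minimal sketch, assuming Python floats are IEEE 754 binary64 values (true on common platforms, but not something this draft specifies); the operand values are chosen only for illustration.

```python
from fractions import Fraction

a, b, c = 0.1, 0.2, 0.3

# Floating-point addition rounds after every step, so grouping matters.
left = (a + b) + c
right = a + (b + c)
print(left == right)  # False on IEEE 754 binary64

# Exact rational arithmetic on the same three stored values is associative.
fa, fb, fc = Fraction(a), Fraction(b), Fraction(c)
print((fa + fb) + fc == fa + (fb + fc))  # True

# The extended reals add the two infinities to the real line.
inf = float("inf")
print(inf + 1.0 == inf, -inf < a < inf)  # True True
```

The two groupings differ only in where rounding happens, which is why the exact `Fraction` sums agree while the floating-point sums do not.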

For a given format, the process of rounding (section 4) maps an element of the extended reals to a representable numerical value included in that format. A representable numerical value can be mapped to one or more floating-point values of a format. The set of floating-point values a numerical value maps to is called the numerical value’s cohort. The elements of a cohort are distinct representations of the same numerical value. For example, in a binary floating-point format, the numerical value zero has the cohort {-0, +0}. The floating-point values of a format consist of:

• tuples (s, e, m); the numerical value of a tuple is $(-1)^s \times b^e \times b^{1-p} \times m$
• +infinity, -infinity
• NaN
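The tuple formula above can be evaluated exactly with rational arithmetic. This is a sketch under stated assumptions: `tuple_value` is a hypothetical helper, b is taken to be the radix and p the precision in digits, and the defaults b = 2, p = 24 are illustrative values not fixed by this section. The last lines use Python floats, assumed to be binary64, to show the two members of zero's cohort.

```python
import math
from fractions import Fraction


def tuple_value(s, e, m, b=2, p=24):
    """Exact value (-1)^s * b^e * b^(1-p) * m of the tuple (s, e, m).

    b (radix) and p (precision in digits) are illustrative assumptions.
    """
    return (-1) ** s * Fraction(b) ** e * Fraction(b) ** (1 - p) * m


one = tuple_value(0, 0, 2 ** 23)              # 2^0 * 2^-23 * 2^23
minus_three = tuple_value(1, 1, 3 * 2 ** 22)  # -(2^1 * 2^-23 * 3 * 2^22)
print(one, minus_three)  # 1 -3

# Both sign choices with m = 0 have the numerical value zero.
pos_zero, neg_zero = tuple_value(0, 0, 0), tuple_value(1, 0, 0)
print(pos_zero == neg_zero == 0)  # True

# Python floats: the zeros compare equal but remain distinguishable.
print(0.0 == -0.0)  # True
print(math.copysign(1.0, 0.0), math.copysign(1.0, -0.0))  # 1.0 -1.0
```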

For nonzero values, binary formats have constraints on the relation between e and m which cause each numerical value representable in that format to map to a unique floating-point value in that format; in other words, nonzero numerical values have a unique representation in a binary format. Decimal formats do not have the same constraints; a nonzero numerical value’s cohort can have multiple elements. For example, if m is a multiple of 10 and e is not emax, (s, e, m) and (s, e + 1, m / 10) are two representations for the same numerical value.
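The decimal example can be checked with a short sketch. The helpers `decimal_tuple_value` and `next_cohort_member` are hypothetical, and the values p = 7 and emax = 96 are illustrative assumptions rather than parameters taken from this section. Python's `decimal` module serves only as a stand-in for a decimal format; its internal (sign, digits, exponent) triple is scaled differently from the draft's tuple but shows the same effect.

```python
from decimal import Decimal
from fractions import Fraction


def decimal_tuple_value(s, e, m, p=7):
    """Exact value (-1)^s * 10^e * 10^(1-p) * m; p = 7 is an assumption."""
    return (-1) ** s * Fraction(10) ** e * Fraction(10) ** (1 - p) * m


def next_cohort_member(s, e, m, emax=96):
    """Return (s, e + 1, m / 10) when the rule in the text applies, else None."""
    if m % 10 == 0 and e != emax:
        return (s, e + 1, m // 10)
    return None


first = (0, 0, 1000000)
second = next_cohort_member(*first)
print(second)  # (0, 1, 100000)
print(decimal_tuple_value(*first) == decimal_tuple_value(*second))  # True

# The same idea in Python's decimal module: equal values, distinct encodings.
x, y = Decimal("1.0"), Decimal("1")
print(x == y, x.as_tuple() == y.as_tuple())  # True False
```

Applying `next_cohort_member` repeatedly walks through a cohort until m stops being a multiple of 10 or e reaches emax.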

With one exception, the numerical value of the result of a floating-point arithmetic operation is only a function of the numerical values of the operands (see section 5). In other words, the representation of the operands may only influence the representation of the result; the result has the same cohort independent of the operands’ representations. The exception to this rule is division by zero, in which case the sign of the zero influences which infinity is returned (see section 7.2); positive and negative infinity are not in the same cohort. Which representation is used for a result provides some information about the history of the computation; the decimal-specific operations (section 5.11) can be used to distinguish among the different representations.
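Both the rule and its exception can be illustrated with Python's `decimal` module. This is only a sketch: the module is a stand-in and not an implementation of this draft, and disabling the `DivisionByZero` trap is a choice made here so that the division returns an infinity instead of raising an exception.

```python
from decimal import Decimal, DivisionByZero, localcontext

# Same numerical operands, different cohort members.
r1 = Decimal("1.0") + Decimal("2")
r2 = Decimal("1") + Decimal("2.00")
print(r1 == r2)          # True: the results share one numerical value
print(str(r1), str(r2))  # 3.0 3.00: only the representation differs

# The exception: the sign of a zero divisor selects the infinity.
with localcontext() as ctx:
    ctx.traps[DivisionByZero] = False  # return a result instead of raising
    pos = Decimal(1) / Decimal("0")
    neg = Decimal(1) / Decimal("-0")

print(Decimal("0") == Decimal("-0"))  # True: both zeros are in one cohort
print(pos, neg)                       # Infinity -Infinity
```

The differing trailing zeros in `r1` and `r2` are the kind of history that the text says a representation can carry.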

Copyright © 2003 by the Institute of Electrical and Electronics Engineers, Inc. This document is an unapproved draft of a proposed IEEE-SA Standard - USE AT YOUR OWN RISK. See statement on page 1.

Page 16
