DRAFT IEEE Standard for Binary Floating-Point Arithmetic - Sonic.net
DRAFT IEEE Standard for Floating-Point Arithmetic – 2003 August 12 10:20<br />
3. (Previous draft) Formats<br />
This standard defines four floating-point formats in two groups, basic and<br />
extended, each having two widths, single and double. The standard levels of<br />
implementation are distinguished by the combinations of formats supported.<br />
3.1. Sets of Values<br />
This section concerns only the numerical values representable within a format, not<br />
the encodings. The only values representable in a chosen format are those<br />
specified by way of the following three integer parameters:<br />
• $p$ = the number of significant bits (precision)
• $E_{\max}$ = the maximum exponent
• $E_{\min}$ = the minimum exponent
Each format's parameters are given in Table 1. Within each format only the<br />
following entities shall be provided:<br />
• Numbers of the form $(-1)^s \, 2^E \, (b_0 . b_1 b_2 \ldots b_{p-1})$, where
  • $s$ = 0 or 1
  • $E$ = any integer between $E_{\min}$ and $E_{\max}$, inclusive
  • $b_i$ = 0 or 1
• Two infinities, +∞ and –∞
• At least one signaling NaN<br />
• At least one quiet NaN<br />
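As an illustration of how the three parameters determine the finite values of a format, the following minimal Python sketch enumerates them for a toy format. The function name `representable_values` and the toy parameters (p = 3, E_min = -1, E_max = 1) are assumptions chosen for illustration and do not appear in the standard. Infinities and NaNs are not modeled, and because exact rationals carry no signed zero, +0 and -0 collapse into a single entry here.

```python
from fractions import Fraction


def representable_values(p, e_min, e_max):
    """Return the sorted finite values (-1)^s * 2^E * (b_0 . b_1 ... b_{p-1}).

    Hypothetical helper for illustration only. The significand bits
    b_0 b_1 ... b_{p-1} are encoded as one integer m in [0, 2^p), so the
    significand value is m / 2^(p-1).
    """
    values = set()
    for e in range(e_min, e_max + 1):
        for m in range(2 ** p):
            magnitude = Fraction(m, 2 ** (p - 1)) * Fraction(2) ** e
            values.add(magnitude)    # s = 0
            values.add(-magnitude)   # s = 1 (for zero this is the same entry)
    return sorted(values)


# Toy format, assumed for illustration: 3 significant bits, exponents -1..1.
toy = representable_values(p=3, e_min=-1, e_max=1)
smallest_positive = min(v for v in toy if v > 0)
largest = max(toy)
```

Using a set makes the redundancy in the enumeration visible: many (E, significand) pairs map to the same value, and only the distinct values remain.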
The foregoing description enumerates some values redundantly, for example,<br />
$2^0(1.0) = 2^1(0.1) = 2^2(0.01) = \ldots$. However, the encodings of such nonzero<br />
values may be redundant only in extended formats (3.3). The nonzero values of<br />
the form $\pm 2^{E_{\min}} (0 . b_1 b_2 \ldots b_{p-1})$ are called denormalized. Reserved exponents<br />
may be used to encode NaNs, ±∞, ±0, and denormalized numbers. For any<br />
variable that has the value zero, the sign bit s provides an extra bit of information.<br />
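The three points in this paragraph (redundant descriptions of one value, denormalized numbers, and the sign of zero) can be observed from Python. This is a minimal sketch that assumes Python's `float` is an IEEE double on the host platform and that denormalized results are not flushed to zero. Both hold on common platforms but are assumptions here, and all variable names are illustrative.

```python
import math
import sys

# Redundant descriptions of one value, significands written in binary:
# 2^0 (1.0) = 2^1 (0.1) = 2^2 (0.01). math.ldexp(x, e) computes x * 2**e.
one_a = math.ldexp(1.0, 0)
one_b = math.ldexp(0.5, 1)
one_c = math.ldexp(0.25, 2)

# Smallest positive normalized value: 2^E_min with significand 1.0.
smallest_normal = sys.float_info.min
# Dividing by 4 gives 2^E_min * 0.01 (binary), a denormalized number.
denormal = smallest_normal / 4
is_denormal = 0.0 < denormal < smallest_normal

# The two zeros compare equal, yet the sign bit is still observable.
pos_zero, neg_zero = 0.0, -0.0
zeros_equal = pos_zero == neg_zero
sign_of_neg_zero = math.copysign(1.0, neg_zero)
sign_of_pos_zero = math.copysign(1.0, pos_zero)
```

`math.copysign` is used because the comparison `0.0 == -0.0` is true, so equality alone cannot recover the extra bit of information carried by the sign of zero.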
Copyright © 2003 by the Institute of Electrical and Electronics Engineers, Inc. This document is an unapproved<br />
draft of a proposed IEEE-SA Standard - USE AT YOUR OWN RISK. See statement on page 1.<br />