DRAFT IEEE Standard for Binary Floating-Point Arithmetic - Sonic.net

DRAFT IEEE Standard for Floating-Point Arithmetic – 2003 August 12 10:20

Rationale: I am not sure if I have ever used case (class(..., when I was concerned about performance; in that case I typically use a sequence like isnormal – finite – iszero; isinf, issignaling, in order to get to the common case first.

Simple one-operand logical predicates have a higher chance of hardware implementation, programmer usage in performance-oriented code, and correct compiler optimization.
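
The ordering described in the rationale above can be sketched as a chain of one-operand predicates that tests the expected common case first. This is only an illustration in Python for binary64 values: the names is_normal and classify are hypothetical, and because Python has no portable test for signaling NaNs, the issignaling step is left out and every NaN falls into a single class.

```python
import math
import sys


def is_normal(x: float) -> bool:
    # Normal: finite, nonzero, and at least as large in magnitude as the
    # smallest normal binary64 value. NaN compares false and is rejected.
    return math.isfinite(x) and abs(x) >= sys.float_info.min


def classify(x: float) -> str:
    # Hypothetical helper: cheap predicates, common case first.
    if is_normal(x):
        return "normal"
    if math.isfinite(x):
        # Finite but not normal: either a zero or a subnormal.
        return "zero" if x == 0.0 else "subnormal"
    if math.isinf(x):
        return "infinite"
    # Only NaNs remain; quiet and signaling are not distinguished here.
    return "nan"
```

When most operands are normal numbers, a call such as classify(1.5) returns after the first test, which is the point of putting isnormal at the head of the sequence.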

Classification predicates (Section 5.9) and functions (Appendix A) are much more efficient if hardware implementations of floating-point registers classify their contents as they are loaded or computed, and record the classification into extra tag bits for each register.
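
As a rough software analogy for the tag-bit idea, the sketch below classifies a binary64 value once, from its bit pattern, at the moment it is "loaded", and keeps the result next to the bits. The names tag_from_bits and load are hypothetical. Treating a set leading fraction bit as the mark of a quiet NaN is an assumption that matches most current hardware but is not taken from this draft.

```python
import struct

FRACTION_BITS = 52
EXPONENT_MASK = 0x7FF


def tag_from_bits(bits: int) -> str:
    # Split the 64-bit pattern into the biased exponent and fraction fields.
    exponent = (bits >> FRACTION_BITS) & EXPONENT_MASK
    fraction = bits & ((1 << FRACTION_BITS) - 1)
    if exponent == 0:
        return "zero" if fraction == 0 else "subnormal"
    if exponent == EXPONENT_MASK:
        if fraction == 0:
            return "infinite"
        # Assumption: leading fraction bit set means quiet NaN.
        return "quiet_nan" if fraction >> (FRACTION_BITS - 1) else "signaling_nan"
    return "normal"


def load(x: float):
    # Hypothetical "register load": return the raw bits together with
    # the tag, so later predicates only need to inspect the tag.
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    return bits, tag_from_bits(bits)
```

With the tag recorded at load time, a predicate such as "is this register infinite?" reduces to a comparison against the stored tag rather than a fresh decode of the exponent and fraction.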

§BI Standardize quiet functions.

Rationale: these functions are usually implemented without exceptions.

§FMA Fused multiply-add.

Rationale: Standardize a best practice for fused multiply-add operations, which are now available in several instruction sets, often implemented slightly differently. Note that 0 × ∞ + qNaN does not signal invalid, because a NaN is an operand, but 0 × ∞ + ∞ does signal invalid, and generates qNaN rather than ∞, because no operand is NaN, and even if all three operands have positive sign bits, the product is still undefined. In general, an operation with a floating-point destination can generate an invalid exception if no operand is a quiet NaN, and generates no invalid exception if one or more operands is a quiet NaN and none is a signaling NaN.
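
The rule in the rationale above can be written out as a small predicate. The sketch below does not perform a fused operation; it only predicts, for operands a, b, c of a × b + c, whether invalid would be signaled under the rule as stated in this draft. The name fma_invalid and the signaling_nan_operand flag are hypothetical; the flag is needed because Python floats do not reliably carry the quiet/signaling distinction.

```python
import math


def fma_invalid(a: float, b: float, c: float,
                signaling_nan_operand: bool = False) -> bool:
    # Hypothetical predicate for a*b + c under the draft's stated rule.
    # A signaling NaN operand always signals invalid.
    if signaling_nan_operand:
        return True
    # A quiet NaN operand (with no signaling NaN) suppresses invalid.
    if math.isnan(a) or math.isnan(b) or math.isnan(c):
        return False
    # 0 x infinity is undefined whatever the addend is.
    if (a == 0.0 and math.isinf(b)) or (math.isinf(a) and b == 0.0):
        return True
    # An infinite product added to an infinity of the opposite sign
    # is a magnitude subtraction of infinities.
    if (math.isinf(a) or math.isinf(b)) and math.isinf(c):
        product_negative = math.copysign(1.0, a) * math.copysign(1.0, b) < 0
        return product_negative != (c < 0)
    return False
```

For the two cases named in the text, fma_invalid(0.0, math.inf, math.nan) is False and fma_invalid(0.0, math.inf, math.inf) is True.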

12 November 2001

§Q Quad: Add 128-bit quadruple precision with 112 fraction bits and implicit integer bit.

Rationale: match existing hardware and software implementations, and discourage undesirable alternatives such as double-double.
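
The field widths implied by the §Q entry can be checked with a little arithmetic. The sketch below assumes one sign bit, so the 112 fraction bits leave 15 exponent bits, and assumes the bias follows the same 2^(w-1) - 1 convention as the single and double formats. The constant names and split_quad are hypothetical.

```python
TOTAL_BITS = 128
SIGN_BITS = 1          # assumption: one sign bit, as in single and double
FRACTION_BITS = 112    # stated in the draft

# Whatever is left over is the exponent field.
EXPONENT_BITS = TOTAL_BITS - SIGN_BITS - FRACTION_BITS
# Assumption: bias follows the usual 2**(w - 1) - 1 convention.
BIAS = (1 << (EXPONENT_BITS - 1)) - 1
# The implicit integer bit adds one bit of precision to the fraction.
PRECISION = FRACTION_BITS + 1


def split_quad(bits: int):
    # Split a 128-bit pattern into (sign, biased exponent, fraction).
    sign = bits >> (TOTAL_BITS - 1)
    exponent = (bits >> FRACTION_BITS) & ((1 << EXPONENT_BITS) - 1)
    fraction = bits & ((1 << FRACTION_BITS) - 1)
    return sign, exponent, fraction
```

Under these assumptions the exponent field is 15 bits wide, the bias is 16383, and the significand carries 113 bits of precision.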

§1 New running footer for IEEE drafts substituted for published 754 copyright notice.

Rationale: comply with IEEE rules.

11 April 2001

§4 SCOPE and PURPOSE defined.

Rationale: previous intent was not universally understood.

OPEN ISSUE: too narrow a purpose? Is uniqueness specified and achievable?

OPEN ISSUE: merge the new PURPOSE: with the Foreword; merge the new and existing SCOPE: to remove redundancy.

Copyright © 2003 by the Institute of Electrical and Electronics Engineers, Inc. This document is an unapproved draft of a proposed IEEE-SA Standard - USE AT YOUR OWN RISK. See statement on page 1.
