DRAFT IEEE Standard for Binary Floating-Point Arithmetic - Sonic.net
DRAFT IEEE Standard for Binary Floating-Point Arithmetic - Sonic.net
DRAFT IEEE Standard for Binary Floating-Point Arithmetic - Sonic.net
Create successful ePaper yourself
Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.
<strong>DRAFT</strong> <strong>IEEE</strong> <strong>Standard</strong> <strong>for</strong> <strong>Floating</strong>-<strong>Point</strong> <strong>Arithmetic</strong> – 2003 August 12 10:20<br />
returned.<br />
[This is analagous to the rescale operation in Java’s<br />
BigDecimal. The name <strong>for</strong> this functionality may be<br />
changed/improved. MFC suggests “quantize” since the<br />
value may not have been explicitly quantized be<strong>for</strong>e. There<br />
was discussion of having requantize’s second argument be a<br />
decimal number instead of an integer; in that case, requantize<br />
would try to return a value numberically equal to x but having<br />
the second argument’s exponent.]<br />
5.12 Decimal Exponent Calculation<br />
As discussed in section 3.3, decimal arithmetic involves not<br />
only computing the proper numerical result but also selecting<br />
the proper member of that value’s cohort. If the result is a<br />
finite value, the required element of the cohort is selected based<br />
on the preferred exponent <strong>for</strong> a result of that operation. The<br />
preferred exponent of an operation is a function of the<br />
exponents of the inputs, as listed in table 5.<br />
Operation with decimal result<br />
addition<br />
subtraction<br />
multiplication<br />
division<br />
fused multiply add, (a× b)+c<br />
Preferred Exponent of Result<br />
min(getexponent(addend), getexponent(augend))<br />
min(getexponent(minuend), getexponent(subtrahend))<br />
getexponent(multiplier) + getexponent(multiplicand)<br />
getexponent(dividend) – getexponent(divisor)<br />
min(getexponent(a) + getexponent(b), getexponent(c))<br />
square root floor(getexponent(radicand) / 2)<br />
conversion between decimal <strong>for</strong>mats getexponent(operand) (?)<br />
conversion from a binary <strong>for</strong>mat to a<br />
decimal <strong>for</strong>mat<br />
round to integer value in same <strong>for</strong>mat max(getexponent(operand), 0)<br />
copy, negate, abs<br />
copysign(x, y)<br />
(?)<br />
getexponent(operand)<br />
getexponent (x)<br />
Copyright © 2003 by the Institute of Electrical and Electronics Engineers, Inc. This document is an unapproved<br />
draft of a proposed <strong>IEEE</strong>-SA <strong>Standard</strong> - USE AT YOUR OWN RISK. See statement on page 1.<br />
Page 53