

266 Chapter 6 IMPLEMENTATION OF DISCRETE-TIME FILTERS

±0, depending on the sign bit (called the soft zero). Thus 0 has two representations.

• If E = 255 and M ≠ 0, then the representation is interpreted as not-a-number (abbreviated as NaN). MATLAB assigns a variable NaN when this happens, e.g., 0/0.

• If E = 255 and M = 0, then the representation is interpreted as ±∞. MATLAB assigns a variable inf when this happens, e.g., 1/0.
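The special cases listed above can be checked with a short sketch. It is written in Python rather than MATLAB and assumes the 32-bit single-precision layout (1 sign bit, 8-bit exponent code E, 23-bit fraction field M). The helper names `classify_single` and `bits_to_float` are hypothetical and introduced only for illustration.

```python
import math
import struct


def classify_single(bits: int) -> str:
    """Classify a 32-bit IEEE-754 pattern from its sign, E, and M fields."""
    sign = (bits >> 31) & 0x1
    e = (bits >> 23) & 0xFF      # 8-bit exponent code E
    m = bits & 0x7FFFFF          # 23-bit fraction field M
    if e == 255:
        if m != 0:
            return "NaN"         # E = 255, M != 0
        return "-inf" if sign else "+inf"   # E = 255, M = 0
    if e == 0 and m == 0:
        return "-0" if sign else "+0"       # the two representations of zero
    return "finite nonzero"


def bits_to_float(bits: int) -> float:
    """Reinterpret the same 32-bit pattern as a float, as a cross-check."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


for pattern in (0x7FC00000, 0x7F800000, 0xFF800000, 0x80000000, 0x00000000):
    print(f"{pattern:#010x}: {classify_single(pattern)}, {bits_to_float(pattern)}")
```

The cross-check through `struct` relies on the platform's own IEEE-754 decoding, so the two columns printed for each pattern should agree.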

□ EXAMPLE 6.19 Consider the bit pattern given in Example 6.17. Assuming IEEE-754 format, determine its decimal equivalent.

Solution<br />

The sign bit is 0 and the exponent code is 131, which means that the exponent is 131 − 127 = 4. The significand is $1 + 2^{-1} + 2^{-2} = 1.75$. Hence the bit pattern represents

$$\hat{x} = +\left(1 + 2^{-1} + 2^{-2}\right)\left(2^{4}\right) = 2^{4} + 2^{3} + 2^{2} = 28$$

which is different from the number in Example 6.17.

□
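The arithmetic in this solution can be reproduced with a minimal Python sketch. The bit pattern of Example 6.17 is not repeated here, so the 32-bit word below is assembled from the fields stated in the solution (sign 0, exponent code 131, fraction bits 11 followed by zeros); that assembly, the single-precision layout, and the variable names are assumptions made for illustration.

```python
import struct

# Fields as stated in the solution (assumed 32-bit single-precision layout).
sign = 0
exp_code = 131
frac_bits = "11"   # leading fraction bits; the remaining 21 bits are taken as 0

# Decode by hand: remove the bias of 127 and restore the hidden leading 1.
exponent = exp_code - 127
significand = 1.0 + sum(int(b) * 2.0 ** -(i + 1) for i, b in enumerate(frac_bits))
value = (-1) ** sign * significand * 2.0 ** exponent

# Cross-check: pack the same fields into a 32-bit word and let struct decode it.
word = (sign << 31) | (exp_code << 23) | int(frac_bits.ljust(23, "0"), 2)
decoded = struct.unpack(">f", struct.pack(">I", word))[0]

print(exponent, significand, value, decoded)
```

Both the hand decoding and the `struct` cross-check should give the value 28 obtained above.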

MATLAB employs the 64-bit double-precision IEEE-754 format for all its number representations and the 80-bit temporary format for its internal computations. Hence all calculations that we perform in MATLAB are in fact floating-point computations. Simulating a different floating-point format in MATLAB would be much more complicated and would not add any more insight to our understanding than the native format. Hence we will not consider a MATLAB simulation of floating-point arithmetic as we did for fixed-point.

6.7 THE PROCESS OF QUANTIZATION AND ERROR CHARACTERIZATIONS

From the discussion of number representations in the previous section, it should be clear that a general infinite-precision real number must be assigned to one of the finitely many representable numbers, given a specific structure for the finite-length register (that is, the arithmetic as well as the format). In practice, there are usually two different operations by which this assignment to a nearby representable number, or level, is made: the truncation operation and the rounding operation. These operations affect the accuracy as well as the general characteristics of digital filters and DSP operations.
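The difference between the two operations can be illustrated with a small Python sketch. It assumes a fractional fixed-point grid with B bits after the binary point, so that adjacent levels are 2^{-B} apart. Truncation is modeled here as rounding toward minus infinity (the two's-complement behavior) and rounding as round-half-up; other formats truncate differently. The function names are hypothetical.

```python
import math


def quantize_truncate(x: float, B: int) -> float:
    """Truncate x to B fractional bits (assumed: floor, as in two's complement)."""
    step = 2.0 ** -B
    return math.floor(x / step) * step


def quantize_round(x: float, B: int) -> float:
    """Round x to the nearest level with B fractional bits (ties go upward)."""
    step = 2.0 ** -B
    return math.floor(x / step + 0.5) * step


B = 3                      # assumed word length: levels are 0.125 apart
for x in (0.7, -0.7):
    xt = quantize_truncate(x, B)
    xr = quantize_round(x, B)
    print(f"x={x:+.3f}  truncated={xt:+.3f} (error {xt - x:+.3f})  "
          f"rounded={xr:+.3f} (error {xr - x:+.3f})")
```

Under these assumptions the truncation error is never positive and is smaller than one step in magnitude, while the rounding error is at most half a step in magnitude.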

We assume, without loss of generality, that there are B + 1 bits in the fixed-point (fractional) arithmetic or in the mantissa of floating-point

Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).

Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
