

new filter Ĥ(z) and its output ŷ(n) are as close as possible to the original filter H(z) and the original output y(n).

Because quantization is a nonlinear operation, an overall analysis that accounts for all three effects described above at once is very difficult and tedious. We therefore study each effect separately, as though it were the only one acting at the time. This makes the analysis easier and the results more interpretable.

We begin by discussing number representation in a computer (more accurately, in a central processing unit, or CPU). This leads to the process of number quantization and the resulting error characterization. We then analyze the effects of filter coefficient quantization on digital filter frequency responses. The effects of multiplication and addition quantization (collectively known as arithmetic round-off errors) on filter output are discussed in Chapter 10.

6.6 REPRESENTATION OF NUMBERS

In computers, numbers (real-valued or complex-valued, integers or fractions) are represented using binary digits (bits), which take the value of either 0 or 1. The finite word-length arithmetic needed to process these numbers is implemented using one of two approaches, chosen according to the ease of implementation and the accuracy and dynamic range needed in processing. Fixed-point arithmetic is easy to implement but has only a fixed dynamic range and accuracy, so it cannot accommodate very large and very small numbers equally well. Floating-point arithmetic, on the other hand, has a wide dynamic range and a variable accuracy (relative to the magnitude of a number) but is more complicated to implement and analyze.
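The contrast between the two arithmetics can be made concrete with a small numerical sketch. The Python below is only an illustration and is not part of the original text: the helper names `fixed_point_quantize` and `float_spacing`, the choice of 8 fractional bits, and the use of IEEE 754 double precision to stand in for floating-point arithmetic are all assumptions made for the example.

```python
import math

def fixed_point_quantize(x, frac_bits):
    """Round x to the nearest multiple of 2**-frac_bits (hypothetical helper)."""
    scale = 1 << frac_bits
    return round(x * scale) / scale

def float_spacing(x):
    """Gap between x and the next larger double, assuming IEEE 754 binary64
    and a normal (non-subnormal), nonzero x (hypothetical helper)."""
    _, exponent = math.frexp(x)           # x = m * 2**exponent, 0.5 <= |m| < 1
    return math.ldexp(1.0, exponent - 53)

FRAC_BITS = 8                              # assumed word length, for illustration
step = 2.0 ** -FRAC_BITS                   # fixed-point resolution, same everywhere

for x in (1000.3, 1.3, 0.001):
    xq = fixed_point_quantize(x, FRAC_BITS)
    print(f"x={x:<8} fixed-point: {xq:<14} abs err={abs(x - xq):.2e}  "
          f"float spacing={float_spacing(x):.2e}")
```

In this sketch the fixed-point error is bounded by half the step size regardless of magnitude, so a value such as 0.001 collapses to zero, whereas the floating-point spacing shrinks and grows with the magnitude of the number.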

Since a computer can operate only on binary variables (e.g., a 1 or a 0), positive numbers can be represented straightforwardly using binary numbers. The problem is how to represent negative numbers. Three different formats are used in each of these arithmetics: sign-magnitude format, one’s-complement format, and two’s-complement format. In discussing and analyzing these representations, we will mostly consider a binary number system containing bits. However, the discussion and analysis are also valid for any radix numbering system, for example, the hexadecimal, octal, or decimal system.
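The three formats are only named at this point in the text. As a hedged preview, the sketch below encodes a signed integer as a B-bit string under the commonly used definitions of each format: a sign bit followed by the magnitude, the bitwise complement of the magnitude pattern, and the complement plus one (equivalently, reduction modulo 2^B). The function names and the 4-bit word length are assumptions for illustration; the text's own definitions of these formats take precedence over this sketch.

```python
def sign_magnitude(x, B):
    """Sign bit (1 = negative) followed by the (B-1)-bit magnitude."""
    assert abs(x) < 2 ** (B - 1), "magnitude does not fit in B-1 bits"
    sign = "1" if x < 0 else "0"
    return sign + format(abs(x), f"0{B - 1}b")

def ones_complement(x, B):
    """Negative numbers: flip every bit of the B-bit pattern of |x|."""
    assert abs(x) < 2 ** (B - 1), "magnitude does not fit in B-1 bits"
    pattern = abs(x) if x >= 0 else (2 ** B - 1) - abs(x)
    return format(pattern, f"0{B}b")

def twos_complement(x, B):
    """Negative numbers: 2**B - |x|, i.e., one's complement plus one."""
    assert -(2 ** (B - 1)) <= x < 2 ** (B - 1), "value out of range for B bits"
    return format(x % (2 ** B), f"0{B}b")

B = 4  # assumed word length, for illustration only
for x in (5, -5):
    print(x, sign_magnitude(x, B), ones_complement(x, B), twos_complement(x, B))
# 5 0101 0101 0101
# -5 1101 1010 1011
```

All three formats agree for positive numbers and differ only in how the negative value is encoded.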

In the following discussion, we begin with fixed-point signed integer arithmetic. A B-bit binary representation of an integer x is given by¹

$$x \equiv b_{B-1}\, b_{B-2} \cdots b_0 = b_{B-1} \times 2^{B-1} + b_{B-2} \times 2^{B-2} + \cdots + b_0 \times 2^{0} \tag{6.28}$$
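Equation (6.28) can be checked numerically. The following minimal sketch, with a hypothetical helper name `bits_to_integer`, evaluates the weighted sum for a bit string written most significant bit first. It implements only the positional sum in (6.28) and does not treat negative numbers.

```python
def bits_to_integer(bits):
    """Evaluate b_{B-1}*2**(B-1) + ... + b_0*2**0 for a string of '0'/'1'
    characters written most significant bit first (hypothetical helper)."""
    B = len(bits)
    total = 0
    for position, char in enumerate(bits):
        b = int(char)                      # bit value, expected to be 0 or 1
        if b not in (0, 1):
            raise ValueError("bits must contain only 0 and 1")
        weight = 2 ** (B - 1 - position)   # leftmost bit has weight 2**(B-1)
        total += b * weight
    return total

print(bits_to_integer("1011"))   # 8 + 0 + 2 + 1 = 11
```

For example, with B = 4 the pattern 1011 gives 1×2³ + 0×2² + 1×2¹ + 1×2⁰ = 11.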

¹ Here the letter b is used to represent a binary bit. It is also used for filter coefficients {b_k}. Its use in the text should be clear from the context.

Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
