15.01.2013 Views

U. Glaeser


FIGURE 42.54 von Neumann architecture and Harvard/modified Harvard architecture.

FIGURE 42.55 Finite impulse response filter.

Expansion of this equation results in the following pseudocode statements:

y(0) = c(0)x(0) + c(1)x(-1) + c(2)x(-2) + … + c(N - 1)x(1 - N);
y(1) = c(0)x(1) + c(1)x(0) + c(2)x(-1) + … + c(N - 1)x(2 - N);
y(2) = c(0)x(2) + c(1)x(1) + c(2)x(0) + … + c(N - 1)x(3 - N);
…
y(n) = c(0)x(n) + c(1)x(n - 1) + c(2)x(n - 2) + … + c(N - 1)x(n - (N - 1));
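The statements above can be sketched directly in Python. This is only an illustration, not the handbook's code: the function name `fir_direct` and the sample values are made up, and input samples with a negative index, such as x(-1), are assumed to be zero.

```python
def fir_direct(c, x):
    """Direct-form FIR: y[n] = c[0]*x[n] + c[1]*x[n-1] + ... + c[N-1]*x[n-(N-1)].

    Assumption: x[m] is taken as 0 for m < 0 (no history before the first sample).
    """
    num_taps = len(c)
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k in range(num_taps):
            if n - k >= 0:  # skip terms that would need x at a negative index
                acc += c[k] * x[n - k]
        y.append(acc)
    return y


# Hypothetical 3-tap filter and 4 input samples.
coeffs = [0.5, 0.25, 0.25]
samples = [1.0, 2.0, 3.0, 4.0]
outputs = fir_direct(coeffs, samples)
```

Each pass of the inner loop corresponds to one tap, that is, one multiply and one accumulate.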

When this equation is executed in software or assembly code, the output samples y(n) are computed in sequence. Implementing it on a von Neumann architecture requires the following operations. Assume that the von Neumann machine has a multiply-accumulate instruction (which is not necessarily the case). Assume also that pipelining allows the multiply-accumulate to execute in parallel with the read or write operations. Then one tap needs four cycles:

1. Read multiply-accumulate instruction.
2. Read data value from memory.
3. Read coefficient from memory.
4. Write data value to the next location in the delay line (because all values must be shifted by one location before the computation of the next output sample can start).

Thus, even if the von Neumann architecture includes a single-cycle multiply-accumulate unit, it takes four cycles to compute one tap.
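As a rough illustration of this cycle count, the sketch below computes one output sample over a shifting delay line and charges four cycles per tap, as in the list above. It is a simplified model under the stated assumptions, not a simulation of any real processor: the names `fir_step` and `CYCLES_PER_TAP` are hypothetical, and the oldest tap is charged four cycles even though its value is simply discarded.

```python
CYCLES_PER_TAP = 4  # instruction fetch, data read, coefficient read, data write-back


def fir_step(c, delay, sample):
    """Compute one output sample and shift the delay line in place.

    delay[k] holds x(n - k) once the new sample is stored in delay[0].
    Returns (output sample, modeled cycle count).
    """
    num_taps = len(c)
    delay[0] = sample
    acc = 0.0
    cycles = 0
    # Work from the oldest tap to the newest so each value can be copied
    # one location down the line after it has been used.
    for k in range(num_taps - 1, -1, -1):
        acc += c[k] * delay[k]        # steps 1 to 3: fetch instruction, data, coefficient
        if k + 1 < num_taps:
            delay[k + 1] = delay[k]   # step 4: write data value to the next location
        cycles += CYCLES_PER_TAP
    return acc, cycles


# Hypothetical 3-tap filter fed with four samples.
coeffs = [0.5, 0.25, 0.25]
line = [0.0] * len(coeffs)
results = [fir_step(coeffs, line, s) for s in [1.0, 2.0, 3.0, 4.0]]

# Modeled cost of one output sample for a 50-tap filter, the tap count shown in Figure 42.55.
cycles_50_taps = 50 * CYCLES_PER_TAP
```

Under this model the cost per output sample grows linearly with the number of taps, so a 50-tap filter needs 200 cycles per output sample.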

Implementing the same FIR filter on a Harvard architecture reduces the number of cycles to three, because it allows the instruction to be fetched in parallel with one of the data items. This was

© 2002 by CRC Press LLC

[Figure 42.54 labels. von Neumann: instruction processing unit, 16 x 16 mpy, ALU, memory, address bus, data bus. Harvard/modified Harvard: instruction processing unit, 16 x 16 mpy, ALU, program memory, data memory, two address buses (Address Bus, Address Bus 2), two data buses (Data Bus, Data Bus 2).]

[Figure 42.55 labels: input x(n); delayed samples x(n-1) … x(n-(N-1)) produced by Z^-1 delay elements (50 taps); coefficient multipliers c(0) … c(N-1); adders; output y(n).]
