26.12.2012 Views

ZTE Communications

ZTE Communications

ZTE Communications

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

Slice Registers<br />

No. of Resources<br />

600<br />

500<br />

400<br />

300<br />

200<br />

100<br />

14<br />

12<br />

10<br />

8<br />

6<br />

4<br />

2<br />

0<br />

0<br />

: LPA Block RAM<br />

: LPA Dist. RAM<br />

: RA<br />

: RA (DSP48E-Systolic)<br />

Case 1 Case 2 Case 3 Case 4 Case 5<br />

(a)<br />

: LPA Block RAM<br />

: LPA Dist. RAM<br />

: RA<br />

: RA (DSP48E-Systolic)<br />

The MAC architecture for LPA and RA has DSP48E<br />

slice-based multipliers and CLB-based adder trees. All the<br />

cases based on this MAC architecture uses 12 DSP48E<br />

slices. However, the RA with DSP48E systolic-array-based<br />

MAC uses two extra DSP48E slices for the final accumulation<br />

process. Similarly, the filter coefficient banks (which are<br />

based on block RAMs) in both LPA distributed RAM and RAs<br />

uses three dedicated RAMB18x2s [11] resources. However,<br />

the LPA block RAM uses three extra resources of RAMB18x2s<br />

for its block RAM based register bank. Furthermore, the LPAs<br />

use one RAM18x2SDPs [11] for the input FIFO (which is zero<br />

for the RAs because they do not use input FIFO).<br />

Fig. 12 shows the number of LUTs versus (operating clock)<br />

frequency for the four design solutions for the five embedded<br />

resampling cases for the polyphase filter. The RAs for case 1,<br />

case 3, and case 5 require less area and lower processing<br />

clocks than their LPA counterparts. Similarly, the RAs for case<br />

2 and case 4 require more area but use lower processing<br />

clocks than their LPA counterparts. This larger area is due to<br />

the pre-stored indices for addressing data register and filter<br />

coefficient banks, and also due to the use of multiplexers.<br />

Thus, RA is the preferred choice for reduced operating clock<br />

rates, and also for reduced area resources with the exception<br />

of cases 2 and 4.<br />

5 Power Analysis<br />

Here we analyze power consumption, focusing on dynamic<br />

power (CMOS technology is assumed) for the LPA and RA of<br />

Hardware Architecture of Polyphase Filter Banks Performing Embedded Resampling for Software-Defined Radio Front-Ends<br />

Mehmood Awan, Yannick Le Moullec, Peter Koch, and Fred Harris<br />

DSP48Es RAMB18×2s RAMB18×2SDPs<br />

0 30 60 90 120 150<br />

Frequency (MHz)<br />

(c) (d)<br />

DSP48E: dedicated computational resource on Virtex-5 FPGA<br />

LPA: load-process architecture<br />

RA: runtime architecture<br />

▲Figure 11. Resource usage for the five embedded resampling cases under LPA and RA: (a)<br />

slice registers, (b) slice LUTs, (c) dedicated resources, and (d) required operating clock rates.<br />

Slice LUTS<br />

600<br />

500<br />

400<br />

300<br />

200<br />

100<br />

0<br />

Case 5<br />

Case 4<br />

Case 3<br />

Case 2<br />

Case 1<br />

: LPA Block RAM<br />

: LPA Dist. RAM<br />

: RA<br />

: RA (DSP48E-Systolic)<br />

Case 1 Case 2 Case 3<br />

(b)<br />

Case 4 Case 5<br />

: LPA Block RAM<br />

: LPA Dist. RAM<br />

: RA<br />

: RA (DSP48E-Systolic)<br />

180<br />

210<br />

the polyphase filter with five different<br />

resampling factors. The demand for high<br />

clock rates in the LPA is often difficult to<br />

satisfy, and the architectures are power<br />

hungry because the dynamic power is<br />

proportional to the toggle frequency [12]:<br />

where n is the number of nodes being<br />

toggled, C is the load capacitance per node,<br />

V is the supply voltage, and f is the toggle<br />

frequency. C and V are device parameters,<br />

whereas n and f are design parameters. By<br />

keeping almost the same n and lowering f,<br />

power consumption decreases. As reduced<br />

clock operation has the same time constraint<br />

as high clock operation, energy consumption<br />

is reduced as well.<br />

Fig. 13 shows dynamic power<br />

consumption for LPA and RA in the five<br />

embedded resampling cases. Xilinx XPA [13]<br />

tool is used. The input vector is the same for<br />

all the cases and designs, and the activity<br />

rates are calculated using ModelSim [14]<br />

post-route simulations with a run length of<br />

500 μs. The thermal settings for the power<br />

simulations are 25 degrees Celsius—an<br />

ambient temperature, and zero airflow.<br />

Fig. 13 shows that the LPA with block RAM-based register<br />

bank consumes more dynamic power than LPA with<br />

distributed-RAM based register bank and also more dynamic<br />

power than RAs. The maximally decimated system (case 1)<br />

based on RAs consumes 64% less dynamic power than its<br />

LPA counterpart when block RAM-based register banks are<br />

used, and it consumes 49% less dynamic power than its LPA<br />

counterpart when distributed RAM-based register banks are<br />

used. The under-decimated system (case 2) based on RAs<br />

consumes 48% less dynamic power than its LPA counterpart<br />

when block RAM-based register banks are used, and it<br />

Area (LUTs)<br />

700<br />

600<br />

500<br />

400<br />

300<br />

200<br />

100<br />

0<br />

0<br />

▲❋<br />

❋<br />

◆▲<br />

Runtime<br />

Architectures<br />

30<br />

■<br />

■<br />

60<br />

90 120<br />

Frequency (MHz)<br />

LUT: lookup table<br />

Load-Process<br />

Architectures<br />

▲Figure 12. Area versus (operating clock) frequency for the four<br />

design solutions for the five embedded resampling cases.<br />

■<br />

◆<br />

150<br />

◆<br />

■<br />

▲<br />

×<br />

❋<br />

×<br />

◆<br />

×<br />

❋▲<br />

❋<br />

▲<br />

R esearch Papers<br />

P dynamic = nCV 2 f, (1)<br />

■<br />

180<br />

: Case 1<br />

: Case 2<br />

: Case 3<br />

: Case 4<br />

: Case 5<br />

×<br />

210<br />

March 2012 Vol.10 No.1 <strong>ZTE</strong> COMMUNICATIONS 61

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!