15.01.2013 Views

U. Glaeser

U. Glaeser

U. Glaeser

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

FPGAs by providing on-chip memory for multiple array configurations. The on-chip cache exploits high,<br />

local on-chip bandwidth to perform quick reconfiguration.<br />

In addition, several systems with coarse-grain granularity exist, such as RaPiD [9], Raw [10], and Pleiades<br />

[11]. RaPiD is a configurable architecture that allows the user to construct custom application-specific<br />

architectures in a run-time configurable way. The system is a linear array of uncommitted functional<br />

units, which contain datapath registers, three ALUs, an integer multiplier, and some local memory. The<br />

RaPiD architecture targets applications that can be mapped to deep pipelines formed from the repeated<br />

functional units.<br />

The Reconfigurable Architecture Workstation (Raw) is a set of replicated tiles, where each tile contains<br />

a simple RISC-like processor, small amount of bit-level configurable logic, and some memory for instructions<br />

and data.<br />

The CS2112 reconfigurable communications processor (RCP) from Chameleon Systems, Inc. contains<br />

reconfigurable fabric organized in slices, each of which can be independently reconfigured. The<br />

CS2112 includes four slices consisting of three tiles. Each tile comprises seven 32-bit datapath units, two<br />

16 × 24-bit single-cycle multipliers, four local store memories, and a control logic unit. The RCP uses a<br />

background configuration plane to perform quick reconfiguration. This reconfigurable fabric is combined<br />

with a 32-bit embedded processor subsystem.<br />

The Pleiades architecture is a processor approach that combines an on-chip microprocessor with an<br />

array of heterogeneous programmable computational units of different granularities, connected by a<br />

reconfigurable interconnect network. The programmable units are MACs, ALUs, and an embedded FPGA.<br />

@<br />

Xilinx has recently introduced the Virtex-II devices from the new Xilinx Platform FPGAs. The<br />

Virtex-II architecture includes new features such as up to 192 dedicated high-speed multipliers. Designers<br />

@<br />

can use Virtex-II devices to implement critical DSP elements of emerging broadband systems. This is<br />

somewhat a similar effort in the same direction that we are heading. The Virtex-II device is providing<br />

the dedicated high-performance multipliers for DSP applications like the VPBs on the SPS, which are<br />

intended to improve performance for a set of applications. SPS differs from a Virtex-II device in the<br />

following:<br />

• The architecture can contain blocks of various complexities. Depending on the requirements of<br />

the applications fixed blocks can be as complex as an FFT block or as simple as an adder or multiplier.<br />

Examples of VPBs will be provided in the next section.<br />

• The generation of an SPS instance is automated. Given a set of target applications an architecture<br />

generation tool determines the number and types of VPBs to be placed on the chip. Although the<br />

Virtex-II device is still general purpose, an instance of an SPS will be more context-defined<br />

according to a given set of applications.<br />

36.3 Strategically Programmable System<br />

Recently, reconfigurable fabric was integrated into SoCs forming hybrid (re)configurable systems. Hybrid<br />

(re)configurable systems contain some kind of computational unit, e.g., ALUs, intellectual property units<br />

(IPs) or even traditional general-purpose processors, embedded into a reconfigurable fabric (see Fig. 36.2).<br />

One type of hybrid reconfigurable architecture embeds reconfigurable cores as a coprocessor to a<br />

general-purpose microprocessor, e.g., Garp [6] and Chimaera [7]. Another direction of new architectures<br />

considers integration of highly optimized hard cores and hardwired blocks with reconfigurable fabric.<br />

The main goal here is to utilize the optimized blocks to improve the system performance. Such programmable<br />

devices are targeted for a specific context—a<br />

class of similar applications, such as DSP, data<br />

communications (Xilinx Platform Series), or networking (Lucent’s ORCA®). The embedded fixed blocks<br />

are tailored for the critical operations common to the application class. In essence, the programmable<br />

logic is supported with the high-density high-performance cores. The cores can be applied at various<br />

levels, such as the functional block level, e.g., fast Fourier transform (FFT) units, or at the level of basic<br />

arithmetic operations (multipliers).<br />

© 2002 by CRC Press LLC

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!