High Performance Reconfigurable Computing


Anthony Agresta & Jeremy Coon

Intro: High Performance Reconfigurable Computing

Reconfigurable computing uses a high-speed reprogrammable “fabric,” which is (re)programmed as needed to solve tasks.

Takes advantage of data-level parallelism to boost performance

Reprogramming a single FPGA as needed can be cheaper than having several ASICs on a board

Computing System Element Choices

[Figure: computing system elements, including network processors and graphics processors, arranged along a programmability axis.]


Hardware customization/reconfigurability, how?

Change both the functionality of hardware cells (elements) and their spatial connectivity to match the requirements of the computation/application on the fly (at runtime).

Reconfigurable Computing

Also known as Custom Computing Machines (CCMs)

Utilize hardware devices customized to match computation

Using: FPGAs (fine grain) or micro-coded arrays of simple processors (coarse grain)



[Figure axis labels: specialization, development cost/time, and performance per chip area per watt (computational efficiency), plus shorter useful life cycle.]

#3 lec # 9

Fall 2010


What is Reconfigurable Computing (RC)?

• Utilize reconfigurable hardware devices (spatially programmed connections of hardware processing elements) tailored to the application:

• Customize hardware to match the computations needed/present in a particular application by changing hardware functionality on the fly (at runtime).

Reconfigurable Computing Goal: Use reconfigurable hardware devices to build systems with advantages over conventional computing solutions in terms of:

- Flexibility (vs. ASICs)
- Performance and power, i.e. computational efficiency (vs. processors)
- Time-to-market (vs. ASICs)
- Life cycle cost (vs. ASICs)

“Hardware” customized to the specifics of the problem.

Direct map of problem-specific dataflow and control.

Circuits “adapted” as problem requirements change.

Still spatial computing, but neither the functionality nor the connectivity of hardware elements is fixed



Intro: von Neumann Architecture

Single in-order execution

Implementations have faster clock speeds than reconfigurable computers

Much less parallelism due to the limitations of the architecture

Intro: High Performance Reconfigurable Computing

Processor is “rewired” as needed to perform a task in a massively parallel fashion

Slower clock speed than general-purpose processors (GPPs)

Much more gets done per cycle due to parallelism
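The clock-speed versus work-per-cycle trade-off can be sketched with simple arithmetic: peak throughput is roughly the clock rate times the operations completed per cycle. The clock rates and per-cycle operation counts below are hypothetical values picked only for illustration, not measurements of any real GPP or FPGA, and `ops_per_second` is an illustrative helper name.

```python
def ops_per_second(clock_hz: float, ops_per_cycle: int) -> float:
    """Peak operations per second = clock rate x operations completed per cycle."""
    return clock_hz * ops_per_cycle


# Hypothetical numbers, chosen only to illustrate the trade-off.
gpp = ops_per_second(clock_hz=3.0e9, ops_per_cycle=4)       # fast clock, few ops per cycle
fpga = ops_per_second(clock_hz=0.25e9, ops_per_cycle=512)   # slow clock, many parallel units

speedup = fpga / gpp

print(f"GPP:  {gpp:.2e} ops/s")
print(f"FPGA: {fpga:.2e} ops/s")
print(f"FPGA/GPP ratio: {speedup:.1f}x")
```

Under these assumed numbers the slower-clocked device still comes out ahead, because the per-cycle parallelism more than offsets the clock deficit. The ratio flips as soon as the application cannot keep the parallel units busy.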


Spatial vs. Temporal Computing

Space vs. Time Trade-off

Spatial (using hardware): defined by fixed functionality and connectivity of hardware elements

Temporal (using software/program): processor running programs written using a pre-defined fixed set of instructions (ISA)
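A minimal sketch of the space vs. time trade-off, using a dot product as the example computation. The temporal version reuses one multiply-accumulate unit over many steps; the spatial version assumes one multiplier per element feeding an adder tree, so its depth grows with log2(n). The function names and the step/stage counting are illustrative assumptions, not taken from the slides.

```python
from typing import List, Tuple


def temporal_dot(a: List[int], b: List[int]) -> Tuple[int, int]:
    """One multiply-accumulate unit reused over time: one step per element."""
    acc, steps = 0, 0
    for x, y in zip(a, b):
        acc += x * y
        steps += 1
    return acc, steps


def spatial_dot(a: List[int], b: List[int]) -> Tuple[int, int]:
    """One multiplier per element plus an adder tree: returns (result, stages)."""
    level = [x * y for x, y in zip(a, b)]  # all multipliers work side by side
    if not level:
        return 0, 0
    stages = 1  # the multiplier stage
    while len(level) > 1:
        if len(level) % 2:
            level.append(0)  # pad so every adder has two inputs
        level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
        stages += 1  # one more layer of adders
    return level[0], stages


a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [2] * 8
temporal_result = temporal_dot(a, b)
spatial_result = spatial_dot(a, b)
print("temporal (result, steps): ", temporal_result)
print("spatial  (result, stages):", spatial_result)
```

Both versions compute the same value. The spatial one spends more hardware (eight multipliers and seven adders in this case) to finish in fewer sequential stages.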



Approaches for HPRC

• Pure FPGA approach:

○ An entire system is built around an FPGA, which is programmed as needed to solve a problem.


Approaches for HPRC

• “Hybrid-core” approach

○ An FPGA is used alongside a general-purpose processor, often in the form of an FPGA expansion board or coprocessor installed into a normal computer.

○ The GPP reprograms the FPGA to do the massively parallel work best suited to it.


The FPGA, or Field Programmable Gate Array, lies at the heart of most reconfigurable computing designs

An FPGA’s function is determined long after it is manufactured

Programmed using a variant of C, or an HDL (often VHDL or Verilog)

Fine-grain Reconfigurable Hardware Devices: FPGAs

Conventional FPGA Tile

[Figure: conventional FPGA tile. A K-input LUT (typically K = 4) with an optional output flip-flop forms the logic block, also called a configurable logic block (CLB). Area annotations: ~75% and ~25% of FPGA area.]
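To make the K-LUT idea concrete, here is a small software model of a K-input lookup table, assuming the usual view of a LUT as a truth table with 2^K configuration bits. The `LUT` class and its `configure` and `evaluate` methods are hypothetical names used for illustration; real FPGAs are configured through vendor bitstreams, not through an API like this.

```python
from typing import Callable, List


class LUT:
    """A k-input lookup table: 2**k configuration bits, one per input combination."""

    def __init__(self, k: int = 4) -> None:
        self.k = k
        self.bits: List[int] = [0] * (1 << k)

    def configure(self, func: Callable[..., int]) -> None:
        """'Program' the LUT by filling its truth table from a Boolean function."""
        for index in range(1 << self.k):
            inputs = [(index >> i) & 1 for i in range(self.k)]
            self.bits[index] = func(*inputs) & 1

    def evaluate(self, *inputs: int) -> int:
        """Look up the stored output bit for the given input bits."""
        if len(inputs) != self.k:
            raise ValueError(f"expected {self.k} inputs, got {len(inputs)}")
        index = sum((bit & 1) << i for i, bit in enumerate(inputs))
        return self.bits[index]


lut = LUT(k=4)

# Configure the same LUT as a 4-input AND ...
lut.configure(lambda a, b, c, d: a & b & c & d)
and_all_ones = lut.evaluate(1, 1, 1, 1)
and_one_zero = lut.evaluate(1, 1, 0, 1)

# ... then reconfigure it as a 4-input XOR (parity) without changing the "hardware".
lut.configure(lambda a, b, c, d: a ^ b ^ c ^ d)
xor_three_ones = lut.evaluate(1, 0, 1, 1)

print(and_all_ones, and_one_zero, xor_three_ones)
```

The same 16 configuration bits implement any 4-input Boolean function, which is what lets the logic block change function at runtime.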



FPGA: Pros and Cons

Pros:

• FPGAs offer the reconfigurable hardware needed for HPRC

• Entire FPGA can be used each cycle

• Power consumption

• Reduced time to market / startup cost compared to an ASIC

Cons:

• Lower clock speed than a GPP

• Harder to program than a GPP

Applications of FPGAs and HPRC

Programmable firmware for consumer


Multi-band phones

Upgradeable firmware for consumer electronics (game consoles, etc.)

Applications of FPGAs and HPRC

Embedded systems, “systems-on-a-chip”

Hardware cryptography


Image processing

Sample Configurable Computing Application: Prototype Video Communications System

Uses a single FPGA to perform four functions that typically require separate chips.

A memory chip stores the four circuit configurations and loads them sequentially into the FPGA as needed.

Initially, the FPGA's circuits are configured to acquire digitized video data.

The chip is then rapidly reconfigured to transform the video information into a compressed form and reconfigured again to prepare it for transmission.

Finally, the FPGA circuits are reconfigured to modulate and transmit the video.

At the receiver, the four configurations are applied in reverse order to demodulate the data, uncompress the image and then send it to a digital-to-analog converter so it can be displayed on a television screen.
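The transmitter side of this scheme can be sketched as a toy simulation: one device, a configuration memory holding four configurations, and a loop that loads each one in turn. Everything here is a stand-in assumption. The `SingleFPGA` class, the stage functions, and the toy algorithms (run-length encoding, a checksum byte, bit inversion as "modulation") are hypothetical and only illustrate the load-then-run sequencing, not the prototype's actual circuits.

```python
from typing import Callable, Dict, List, Optional

Stage = Callable[[List[int]], List[int]]


def acquire(raw: List[int]) -> List[int]:
    """Stand-in for capturing digitized video samples."""
    return list(raw)


def compress(samples: List[int]) -> List[int]:
    """Toy run-length encoding: [value, count, value, count, ...]."""
    out: List[int] = []
    for s in samples:
        if out and out[-2] == s:
            out[-1] += 1
        else:
            out += [s, 1]
    return out


def prepare(packet: List[int]) -> List[int]:
    """Toy framing step: append a one-byte checksum."""
    return packet + [sum(packet) % 256]


def modulate(packet: List[int]) -> List[int]:
    """Toy 'modulation': invert the bits of each byte."""
    return [v ^ 0xFF for v in packet]


class SingleFPGA:
    """One device whose active circuit is swapped from a configuration memory."""

    def __init__(self, config_memory: Dict[str, Stage]) -> None:
        self.config_memory = config_memory
        self.active: Optional[Stage] = None
        self.reconfigurations = 0

    def load(self, name: str) -> None:
        self.active = self.config_memory[name]
        self.reconfigurations += 1

    def run(self, data: List[int]) -> List[int]:
        if self.active is None:
            raise RuntimeError("no configuration loaded")
        return self.active(data)


fpga = SingleFPGA({"acquire": acquire, "compress": compress,
                   "prepare": prepare, "modulate": modulate})

data = [5, 5, 5, 7, 7, 2]
for name in ["acquire", "compress", "prepare", "modulate"]:
    fpga.load(name)        # reconfigure the single device
    data = fpga.run(data)  # then use it for this step

print(data, fpga.reconfigurations)
```

A receiver would use the same structure with the inverse stages loaded in the opposite order.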




In the future, mainstream computers might contain a programmable FPGA

This could allow applications to “re-wire” a part of your computer in order to perform required number crunching faster than with a traditional CPU


Modern processors have hit a physical clock-speed barrier

The best way to continue performance gains as dictated by Moore’s Law is increased parallelism


FPGAs and RC offer a good method to take advantage of data-level parallelism and increased transistor count.
