Coarse Grain Reconfigurable Architectures - Hartenstein


Reiner Hartenstein, TU Kaiserslautern, Germany

http://hartenstein.de/RH-bio.pdf

1

reiner@hartenstein.de

8 May 2009

Rethinking Moore’s Law

Reiner Hartenstein

RAW 2009, The 16th Reconfigurable Architectures Workshop, May 25-26, 2009, Rome, Italy

IEEE International Parallel & Distributed Processing Symposium

Preface

Arthur Schopenhauer: "Approximately every 30 years, we declare the scientific, literary and artistic spirit of the age bankrupt. In time, the accumulation of errors collapses under the absurdity of its own weight."

http://hartenstein.de

© 2009,

reiner@hartenstein.de


Outline (1)

• Power Consumption of Computers

• around Moore’s Law

• A CPU-centric Flat World

Reconfigurable Computing: the Silver Bullet

• Displacing the CPU from the Center

• Programming beyond Software

• Conclusions

Power Consumption of Computers

Controversial estimates: is it really an important issue?

Energy consumption of all computers world-wide, visible and embedded: will it be affordable in the future?

What is the reason for their high power consumption?

Methods to slash the electricity bill of computing


Computers everywhere

... Ecosystem: just one example


keynote, RAW 2009, The 16th Reconfigurable Architectures Workshop, in conjunction with IPDPS,

International Conference on Parallel and Distributed Processing, May 25-29, 2009, Rome, Italy


75 MW


... Supercomputers ...

more ...


Sun Microsystems’ MD S20

16 x 24 = 384

truck

187.5 kW

Server Farms

"Google causes 2% of world-wide electricity consumption" [TIME online; Google denies]

( > 1 million servers)

banks of the Columbia River**

each 6500 m2

Quincy

48 MW*

10 American football fields

*) power for 40 000 homes (8 x Quincy)

**) [Randy Katz: IEEE Spectrum, Feb. 2009]

Power Consumption of Computers

Controversial estimates: Mills, Fettweis, Times online, LBNL

Power consumption has become an industry-wide issue for computing [Horst Simon et al., LBNL, Berkeley]

Driven by economic factors, industry will switch to more energy-efficient solutions in computing. Incremental improvements are on track, but "we may ultimately need revolutionary new solutions" [Horst Simon, LBNL, Berkeley]

8 racks, water-cooled

25% of Amsterdam’s electricity: server farms

Dallas

Boardman

Fettweis: x30 until 2020

Incremental Improvements

550 Watt

Conferences on Low Power Design

www.islped.org www.patmos-conf.org and on EDA:

www.dac.com www.date-conference.com www.aspdac.com

Source: MediaMarkt advertisement, 2009

Outline (1 b)

• Power Consumption of Computers

• around Moore’s Law

Reasons

• A CPU-centric Flat World

Reconfigurable Computing: the Silver Bullet

• Displacing the CPU from the Center

• Programming beyond Software

• Conclusions


The term "Software" stands for extremely memory-cycle-hungry instruction streams, coming with multiple levels of overhead: the von Neumann Syndrome

The von Neumann Syndrome

[coined 2006 by C.V. "RAM" Ramamoorthy]

von Neumann single-core CPU: overhead

instruction fetch

state address computation

data address computation

data meet PU + other overhead

I/O to/from off-chip RAM

Massive Overhead Phenomena

[Figure: von Neumann machine driven by instruction streams]

Early critics:

Dijkstra, 1968: Go To Statement Considered Harmful

Koch et al., 1975: The Universal Bus Considered Harmful

Backus, 1978: Can Programming Be Liberated from the von Neumann Style?

Arvind et al., 1983: A Critique of Multiprocessing von Neumann Style

speed-up factor of 20 just by a reconfigurable data address generator, i.e. by partial software to configware migration

1986, E.I.S. project: 94% of time for address computation

total speed-up: x 15000

PISA DRC accelerator [ICCAD 1984]

Nathan Myhrvold's Law

(also attributed to Bill Gates)**

**) referred to as featuritis, bloat, etc.

Nathan’s Law

1. Software is a gas. It expands to fill the container it is in.

2. Software grows until it becomes limited by Moore’s [& Kryder’s]* Law.

3. Software growth makes Moore’s [& Kryder’s]* Law possible through the demand it creates.

[Figure: hard disk capacity [MB], log scale 10^0 to 10^6, January 1980 to January 2010]

*) Kryder’s Law is based on the giant magnetoresistance effect (spintronics); Albert Fert / Peter Grünberg, 2007 Nobel Prize

The Memory Wall

Massive Energy Consumption

The von Neumann Syndrome is the reason for the Software Crisis and the massive energy consumption of computers

Patterson’s Law: Processor-Memory Performance Gap (grows 50% / year), >1000

[Figure: performance (log scale, 1 to 1000) of CPU vs. DRAM, 1980 to 2008]
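As a rough illustration of what "grows 50% / year" implies, the following minimal sketch compounds the gap year over year. It assumes a constant annual growth rate, which is a simplification of the chart; the function names `gap_after` and `years_to_reach` are hypothetical helpers and not part of the talk.

```python
import math

def gap_after(years, annual_growth=0.5):
    """Processor-memory gap factor after `years` of constant compound growth."""
    return (1.0 + annual_growth) ** years

def years_to_reach(gap, annual_growth=0.5):
    """Years of constant compound growth until the gap factor reaches `gap`."""
    return math.log(gap) / math.log(1.0 + annual_growth)

# Assumption: the gap starts at factor 1 and grows by 50% every year.
print(f"gap after 10 years: x{gap_after(10):.1f}")
print(f"years until gap exceeds 1000: {years_to_reach(1000):.1f}")
```

Under this assumption a gap factor above 1000 is reached after roughly 17 years, which shows how quickly a compounding gap of this kind gets out of hand.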



tall thin man

coherence


Outline (2)

• Power Consumption of Computers

• around Moore’s Law

• A CPU-centric Flat World

Reconfigurable Computing: the Silver Bullet

• Displacing the CPU from the Center

• Programming beyond Software

• Conclusions

Gordon Moore’s Law

technology: "we master the design with the left hand"

[Figure: log-scale axis 10^1 to 10^6 over the years 1960 to 1968]


Conventional Design Flow

[Figure: design levels (application, RT, gate, switching, circuit, layout) with submit / reject handoffs between adjacent levels; technology on site; axis: breadth of specialization]

[Table: design activity (application level, machine level, RT level, gate level, switching level, circuit, component level) by phase # 1 to 7; entries H, SE, C]

H = Hardware Designer (using COTS)

SE = Software Engineering

C = COTS


technology: "we master the design with the left hand"

Design Sciences

missing designer population

Mead/Conway VLSI design revolution

Carver Mead: "design should be a separate discipline!"

[Figure: log-scale axis 10^1 to 10^6 over the years 1960 to 1990]

Avoiding Specialization Overload

[Figure: conventional flow through the application, RT, gate, switching, circuit and layout levels with submit / reject handoffs; technology on site; axis: breadth of specialization]

Clean-up & intuitive models

Fixing the Education Dilemma

The new M-&-C organization: [Figure: application; axis: breadth of specialization]

The textbook [1980] (bestseller!): Carver Mead, Lynn Conway


VLSI Design Education Spreading Rapidly

1980 - 1983 world-wide

incubator of workstation and EDA industry etc.

The most effective project in the history of modern computer science

Carver Mead, Lynn Conway

Outline (3)

• Power Consumption of Computers

• around Moore’s Law

• A CPU-centric Flat World

Reconfigurable Computing: the Silver Bullet

• Displacing the CPU from the Center

• Programming beyond Software

• A Highly Promising Scenario

• Conclusions


The Single Core Approach

Free ride on Moore’s Law

[Figure: log-scale axis 10^3 to 10^9 over the years 1970 to 2008]

Dead Supercomputer Society [Gordon Bell, keynote at ISCA 2000]

couldn’t compete with the free ride on Moore’s Law

•ACRI
•Alliant
•American Supercomputer
•Ametek
•Applied Dynamics
•Astronautics
•BBN
•CDC
•Convex
•Cray Computer
•Cray Research
•Culler-Harris
•Culler Scientific
•Cydrome
•Dana/Ardent/Stellar/Stardent
•DAPP
•Denelcor
•Elexsi
•ETA Systems
•Evans and Sutherland Computer
•Floating Point Systems
•Galaxy YH-1
•Goodyear Aerospace MPP
•Gould NPL
•Guiltech
•ICL
•Intel Scientific Computers
•International Parallel Machines
•Kendall Square Research
•Key Computer Laboratories
•MasPar
•Meiko
•Multiflow
•Myrias
•Numerix
•Prisma
•Tera
•Thinking Machines
•Saxpy
•Scientific Computer Systems (SCS)
•Soviet Supercomputers
•Supertek
•Supercomputer Systems
•Suprenum
•Vitesse Electronics

The single core sequential mind set is the winner

Machine Model of the Mainframe Era

Machine model | Resources: property | Resources: programming source | Sequencer: property | Sequencer: programming source | State register
CPU | hardwired | - | programmable | Software (instruction streams) | program counter

CPU-centric flat world (Aristotelian model)

typical programmer qualification: sequential-only mind set, von Neumann only

Bill Gates at a summit meeting of US state governors: "... cannot hire such people"


Computer Machine Model of the PC Era

Machine model | Resources: property | Resources: programming source | Sequencer: property | Sequencer: programming source | State register
ASIC accelerator | hardwired | - | hardwired | - |
CPU | hardwired | - | programmable | Software (instruction streams) | program counter

ASIC: Application-Specific Integrated Circuit & other accelerators, e.g. display processor

the tail is wagging the dog

The End of Moore’s Law

stop in 2005

the 20 nm wall

from growth industry to replacement business

[Figure: log-scale axis 10^3 to 10^9 over the years 1970 to 2008]


From Single-core to Multicore Crisis

Intel’s vision: MultiCore, pre-announced: 16, 32, 80, ..

Industry is facing a disruptive turning point

Growing # of cores: the multicore programming crisis

"I would be panicked if I were in industry" [John Hennessy]

forcing a historic transition to a parallel programming model yet to be invented [David Callahan]

Multicore von Neumann: arrays of massive overhead phenomena

overhead proportionate to the number of processors:
instruction fetch
state address computation
data address computation
data meet PU + other overhead
I/O to/from off-chip RAM

overhead disproportionate to the number of processors:
Inter PU communication
message passing overhead
transactional memory overhead
multithreading overhead etc.

"a terrifying number of processes running in parallel, create sequential-processing bottlenecks and losses in data locality" [David Callahan, 2008]

[Figure: single-core vs. many-core von Neumann machine; an array of CPUs, each driven by its own instruction stream]

Massive Energy Consumption

The von Neumann Syndrome is the reason for the Software Crisis and the massive energy consumption of computers

Patterson’s Law: Processor-Memory Performance Gap (grows 50% / year)

Tear down this wall by Reconfigurable Computing!

[Figure: performance (log scale, 1 to 1000) of CPU vs. DRAM, 1980 to 2000]


Still a Growth Industry

2 more decades after the end of Moore’s Law

[Figure: relative performance (log scale, 10^3 to 10^13) vs. year, 1970 to 2030]

Outline (4)

• Power Consumption of Computers

• around Moore’s Law

• A CPU-centric Flat World

Reconfigurable Computing: the Silver Bullet

• Displacing the CPU from the Center

• Programming beyond Software

• A Highly Promising Scenario

• Conclusions


Speed-up factors obtained by Software to Configware migration

Energy saving factors: ~10% of speedup

[Figure: speed-up factor (log scale, 10^0 to 10^6) by application area: Image processing, Pattern matching, Multimedia; DSP and wireless; Bioinformatics; Astrophysics. Applications shown: real-time face detection, pattern recognition, video-rate stereo vision, BLAST, Reed-Solomon decoding (2400), SPIHT wavelet-based image compression (457), FFT, protein identification, MAC, crypto, Viterbi decoding, Smith-Waterman pattern matching, CT imaging, molecular dynamics simulation, DES breaking (28500), GRAPE (20). Other data labels: 6000, 52, 40, 730, 900, 1000, 400, 288, 88, 1000, 100, 3000]

Demonstrating the intensive Impact

[T. El-Ghazawi et al.: IEEE COMPUTER, Feb. 2008]

SGI Altix 4700 with RC 100 RASC compared to Beowulf cluster

Application | Speed-up factor | Savings: Power | Savings: Cost | Savings: Size
DNA and Protein sequencing | 8723 | 779 | 22 | 253
DES breaking | 28514 | 3439 | 96 | 1116
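The rule of thumb stated earlier in the talk, that energy saving factors are roughly 10% of the speed-up, can be checked against the two rows reported here. The sketch below only divides the reported power saving factor by the reported speed-up factor; the helper name `power_to_speedup_ratio` is hypothetical, and no data beyond the two rows is assumed.

```python
# (application, speed-up factor, power saving factor) as reported on this slide
results = [
    ("DNA and Protein sequencing", 8723, 779),
    ("DES breaking", 28514, 3439),
]

def power_to_speedup_ratio(speedup, power_saving):
    """Power saving factor expressed as a fraction of the speed-up factor."""
    return power_saving / speedup

ratios = {name: power_to_speedup_ratio(s, p) for name, s, p in results}
for name, ratio in ratios.items():
    print(f"{name}: power saving is {ratio:.1%} of the speed-up")
```

Both ratios land near one tenth (about 9% and about 12%), which is in line with the "~10% of speedup" estimate.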

when Hackers use FPGAs …

encryption on von Neumann unaffordable

The RC Paradox

By orders of magnitude more performance with worse technology: the Reconfigurable Computing Paradox, caused by the von Neumann syndrome

Much less equipment needed, i.e. instead of a hangar full of racks, e.g. just one or half a rack without air conditioning

Much less memory needed, i.e. it mostly fits in RAM on board of the processor chip, i.e. orders of magnitude more bandwidth


Outline (5)

• Power Consumption of Computers

• around Moore’s Law

• A CPU-centric Flat World

Reconfigurable Computing: the Silver Bullet

• Displacing the CPU from the Center

• Programming beyond Software

• Conclusions

Our Contemporary Computer Machine Model

Machine model | Resources: property | Resources: programming source | Sequencer: property | Sequencer: programming source | State register
ASIC accelerator | hardwired | - | hardwired | - |
CPU | hardwired | - | programmable | Software (instruction streams) | program counter
RCU accelerator | programmable | Configware (configuration code) | programmable | Flowware (data streams) | data counters

Now accelerators are programmable

RCU needs 2 program sources


Now accelerators are programmable: the hardware/software chasm should be turned into configware/software interfacing

The CPU-centric flat world is Aristotelian: we need a new model


A Heliocentric CS Model

Program Engineering (PE): the Generalization of Software Engineering, a Dual Dichotomy Approach

CE: Configware Engineering

SE: Software Engineering

PE: Program Engineering

Dual-rail education: Dichotomy to overcome the software/configware chasm & the software/hardware chasm



David Parnas


Outline (6)

• Power Consumption of Computers

• around Moore’s Law

• A CPU-centric Flat World

Reconfigurable Computing: the Silver Bullet

• Displacing the CPU from the Center

• Programming beyond Software

• Conclusions

Software vs Flowware and Configware

Programming source for instruction-stream-based computing (von Neumann etc.): Software

The programming source for data-stream-based computing operations (datastream machine paradigm): Flowware

Programming sources for Reconfigurable Computing (morphware): Flowware and Configware

Sources for Embedded Systems: Flowware, Configware and Software


Compilation: Software vs. Configware

Software Engineering: source program -> software compiler -> software code

Configware Engineering: source "program" -> configware compiler, comprising a mapper with placement & routing (yielding configware code) and a data scheduler (yielding flowware code)

Our Contemporary Computer Machine Model

Machine model | Resources: property | Resources: programming source | Sequencer: property | Sequencer: programming source | State register
ASIC accelerator | hardwired | - | hardwired | - |
CPU | hardwired | - | programmable | Software (instruction streams) | program counter in CPU
RCU accelerator | programmable | Configware (configuration code) | programmable | Flowware (data streams) | data counters in RAM

twin Paradigm Dichotomy

data counters of reconfigurable address generators (GAG) in data memory blocks (asM)

asM: auto-sequencing memory (avoids address computation overhead)


Dichotomy of Procedural Language Paradigms

language category | Von Neumann Languages | Anti Machine Languages (Flowware)
both | deterministic procedural sequencing: traceable, checkpointable | deterministic procedural sequencing: traceable, checkpointable
operation sequence driven by | read next instruction, goto (instr. addr.), jump (to instr. addr.), instr. loop, loop nesting, no parallel loops, escapes, instruction stream branching | read next data item, goto (data addr.), jump (to data addr.), data loop, loop nesting, parallel loops, escapes, data stream branching
state register | program counter | data counter(s)
address computation | massive memory cycle overhead | overhead avoided
instruction fetch | memory cycle overhead | overhead avoided
parallel memory bank access | interleaving only | no restrictions
language features | control flow + data manipulation | data streams only (no data manipulation)
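To make the role of a data counter more concrete, here is a minimal software sketch of a data-stream address sequence. It only emulates the idea in Python: a counter is stepped by fixed strides to produce the addresses of a 2-D scan, and the data items are delivered in that order. The names `scan_addresses` and `stream`, their parameters, and the toy memory are assumptions made for illustration; they are not the interface of the reconfigurable address generators mentioned in the talk.

```python
def scan_addresses(base, rows, cols, row_stride, col_stride=1):
    """Yield the address sequence of a 2-D scan by stepping a data counter
    with fixed strides instead of recomputing each address from loop indices."""
    row_start = base
    for _ in range(rows):
        addr = row_start          # data counter for the current row
        for _ in range(cols):
            yield addr
            addr += col_stride    # step the data counter
        row_start += row_stride   # jump to the start of the next row

def stream(memory, addresses):
    """Deliver data items in the order dictated by the address sequence."""
    for addr in addresses:
        yield memory[addr]

memory = list(range(100, 120))    # toy data memory, hypothetical contents
addrs = list(scan_addresses(base=2, rows=2, cols=3, row_stride=8))
items = list(stream(memory, addrs))
print(addrs)
print(items)
```

The point of the sketch is the separation of concerns: the address sequence is fixed by a few parameters up front, so the consumer of the stream performs no address arithmetic of its own.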

Time to Space Mapping

why twin dichotomy

Machine model | Resources: property | Resources: programming source | Sequencer: property | Sequencer: programming source | State register
ASIC accelerator | hardwired | - | hardwired | - |
CPU | hardwired | - | programmable | Software (instruction streams) | program counter
RCU accelerator | programmable | Configware (configuration code) | programmable | Flowware (data streams) | data counters

loop turns into pipeline

Relativity Dichotomy

"The biggest payoff will come from Putting Old ideas into Practice and teaching people how to apply them properly."


The Complete Contemporary Computer System Model

Machine model | Resources: property | Resources: programming source | Sequencer: property | Sequencer: programming source | State register
ASIC accelerator | hardwired | - | hardwired | - |
CPU | hardwired | - | programmable | Software (instruction streams) | program counter
RCU accelerator | programmable | Configware (configuration code) | programmable | Flowware (data streams) | data counters
FpCU accelerator (Hardwired Datastream Machine) | hardwired | - | programmable | Flowware (data streams) | data counters

Hardwired Datastream Machine Model

Machine model | Resources: property | Resources: programming source | Sequencer: property | Sequencer: programming source | State register
FpCU accelerator | hardwired | - | programmable | Flowware (data streams) | data counters

FpCU: Flowware-programmable Computing Unit

Examples:

BEE project (UC Berkeley)

traditional systolic array

hardwired super systolic array

Flowware -> flowware compiler -> flowware code

example: systolic array with 6 pipelines and 12 data counters

[Figure: systolic array surrounded by ASM blocks]


Further Growth of Dead Computer Society

wasting energy

programmer productivity problems

it’s not the silver bullet!

[Figure: relative performance (log scale, 10^3 to 10^13) vs. year, 1970 to 2030]

Two more decades of Growth

creates jobs

[Figure: relative performance (log scale, 10^3 to 10^13) vs. year, 1970 to 2030]

Outline (7)

• Power Consumption of Computers

• around Moore’s Law

• A CPU-centric Flat World

Reconfigurable Computing: the Silver Bullet

• Displacing the CPU from the Center

• Programming beyond Software

• Conclusions

Arthur Schopenhauer: "Approximately every 30 years, we declare the scientific, literary and artistic spirit of the age bankrupt. In time, the accumulation of errors collapses under the absurdity of its own weight."

RH: "Mesmerized by the Gordon Moore Curve, we in CS slowed down our learning curve. Finally, after 60 years, we are witnessing the spirit from the Mainframe Age collapsing under the von Neumann Syndrome."

Instead of contributing to the absurdity, awareness is needed, and a change of direction in our basic mind set, from the Aristotelian to the Copernican model


Conclusions 1(2)

A von-Neumann-only strategy can never be the solution

We need a massive Software to Configware Migration

Established technologies are available and we can still use standard software and their tools

Configware skills and basic hardware knowledge are essential qualifications for programmers.

We urgently need a fundamental CS Education and Research Revolution for dual-rail thinking

Conclusions 2(2)

Currently the scenario is comparable to the VLSI design crisis around 1980.

Missing reply to the End of Moore’s Law

The best time to begin is right now

The likelihood of success stems from the urgent need to cope with a massive threat

We need "une Levée en Masse"


thank you for your patience


backup for discussion:

END


We need to POIIP (Put Old Ideas Into Practice) for: Software to Hardware Migration and Software to Configware Migration

2 simple key rules of thumb:

a) loop turns into pipeline

b) decision box turns into demultiplexer

[Figure: loop body mapped onto a pipeline of rDPUs]
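The two rules of thumb can be illustrated with a small software analogy. In this sketch each statement of a loop body becomes its own stage that data items flow through, and a decision becomes a router that sends each item to one of two outputs. It is only a Python emulation of the idea under these assumptions, not configware for rDPUs; `stage`, `pipeline`, `demultiplex` and `loop_version` are hypothetical names.

```python
def stage(fn, upstream):
    """One pipeline stage: applies fn to every item flowing through it."""
    for item in upstream:
        yield fn(item)

def pipeline(data, *stage_functions):
    """Rule a): lay out the statements of a loop body as successive stages."""
    flow = iter(data)
    for fn in stage_functions:
        flow = stage(fn, flow)
    return flow

def demultiplex(data, predicate):
    """Rule b): a decision box routes each item to one of two outputs."""
    taken, not_taken = [], []
    for item in data:
        (taken if predicate(item) else not_taken).append(item)
    return taken, not_taken

def loop_version(data):
    """The same computation written as a conventional sequential loop."""
    out = []
    for x in data:
        y = x + 1
        z = y * 2
        out.append(z)
    return out

data = [1, 2, 3, 4]
piped = list(pipeline(data, lambda x: x + 1, lambda y: y * 2))
hit, miss = demultiplex(piped, lambda v: v % 4 == 0)
print(piped, hit, miss)
```

Both versions compute the same values; the difference is that the pipelined form fixes the operations in space and lets the data move, which is the time-to-space mapping the rules describe.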

Astronomic dimensions

The predominance of tremendously inefficient software-driven von-Neumann-type (vN) computers, employing a fetch-decode-execute cycle to run programs, is the cause of this massive waste of energy.

Under its top-heavy demand, software packages often reach astronomic dimensions (up to 200 million lines of code), and running such packages requires large, power-hungry, slow extra memory microchips ("The Memory Wall").
