
Embedded Software for SoC - Grupo de Mecatrônica EESC/USP


Chapter 23

gies for the typical internal junction temperatures in a chip [2]. We assumed a 20 msec resynchronization latency (a conservative estimate) to bring a turned-off processor (and its caches) into the full operational mode. We also assumed a simple synchronization mechanism where each processor updates a bit in a globally shared and protected location. After a processor updates its bit, it waits for the other processors to update their bits. When all processors have updated their bits, all processors continue with their parallel execution. Our energy calculation also includes the energy consumed by the processors during synchronization.
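The bit-per-processor synchronization described above can be sketched in software as follows. This is only an illustrative model under stated assumptions, not the mechanism used in the experiments: threads stand in for processors, a condition variable stands in for the protected shared location, and the names `BitBarrier` and `run_demo` as well as the generation counter (added so the barrier can be reused) are hypothetical.

```python
import threading


class BitBarrier:
    """Hypothetical model: one bit per processor in a shared, protected word."""

    def __init__(self, n_procs):
        self.full_mask = (1 << n_procs) - 1   # all bits set = everyone arrived
        self.bits = 0                         # the globally shared location
        self.generation = 0                   # assumption: allows barrier reuse
        self.cond = threading.Condition()     # protects bits and generation

    def wait(self, proc_id):
        with self.cond:
            gen = self.generation
            self.bits |= 1 << proc_id         # processor updates its own bit
            if self.bits == self.full_mask:
                # Last arrival: clear the bits and release all waiters.
                self.bits = 0
                self.generation += 1
                self.cond.notify_all()
            else:
                # Wait until the remaining processors have set their bits.
                while gen == self.generation:
                    self.cond.wait()


def run_demo(n_procs=4):
    """Run n_procs threads through one barrier and return the event log."""
    barrier = BitBarrier(n_procs)
    log, log_lock = [], threading.Lock()

    def worker(pid):
        with log_lock:
            log.append(("before", pid))
        barrier.wait(pid)
        with log_lock:
            log.append(("after", pid))

    threads = [threading.Thread(target=worker, args=(p,)) for p in range(n_procs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

In the log returned by `run_demo`, every "before" entry precedes every "after" entry, since no thread leaves the barrier until all bits are set.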

For each processor, both data and instruction caches are 8 KB, 2-way set-associative with 32-byte blocks and an access latency of 1 cycle. The on-chip shared memory is assumed to be 1 MB with a 12-cycle access latency. All our energy values are obtained using the 0.1-micron process technology. The cache hit and miss statistics are collected at runtime using the performance counters in a Sun SPARC machine, where all our simulations have been performed.
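As a quick check on the cache parameters above, the geometry implied by an 8 KB, 2-way set-associative cache with 32-byte blocks can be worked out as below. The function name `cache_geometry` is hypothetical, and the 32-bit address width is an assumption, since the text does not state it.

```python
def cache_geometry(size_bytes, ways, block_bytes, addr_bits=32):
    """Derive set count and address-field widths for a set-associative cache.

    addr_bits=32 is an assumption; sizes are assumed to be powers of two.
    """
    n_blocks = size_bytes // block_bytes          # total cache blocks
    n_sets = n_blocks // ways                     # blocks are grouped into sets
    offset_bits = block_bytes.bit_length() - 1    # log2(block size)
    index_bits = n_sets.bit_length() - 1          # log2(number of sets)
    tag_bits = addr_bits - index_bits - offset_bits
    return {
        "blocks": n_blocks,
        "sets": n_sets,
        "offset_bits": offset_bits,
        "index_bits": index_bits,
        "tag_bits": tag_bits,
    }


# Parameters taken from the text: 8 KB, 2-way, 32-byte blocks.
geometry = cache_geometry(8 * 1024, ways=2, block_bytes=32)
print(geometry)
```

With these parameters the cache holds 256 blocks in 128 sets, so an address splits into a 5-bit block offset and a 7-bit set index, with the remaining bits forming the tag.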

We used a set of eight benchmarks to quantify the benefits due to our runtime parallelization strategy. Img is an image convolution application. Cholesky is a Cholesky decomposition program. Atr is a network address translation application. SP computes the all-nodes shortest paths on a given graph. Encr has two modules. The first module generates a cryptographically secure digital signature for each outgoing packet in a network architecture. The second one checks the authenticity of a digital signature attached to an incoming message. Hyper simulates the communication activity in a distributed-memory parallel architecture. Wood implements a color-based visual surface inspection method. Finally, Usonic is a feature-based object estimation algorithm. The important characteristics of these benchmarks are given in Table 23-2. The last two columns give the execution cycles and the energy consumptions (in the processor core, caches, and the shared memory) when our applications are executed using a single processor.

Table 23-2. Benchmark codes used in the experiments and their important characteristics.

Benchmark    Input       Cycles       Energy
Img          263.1 KB    16526121     18.51 mJ
Cholesky     526.0 KB    42233738     53.34 mJ
Atr          382.4 KB    19689554     24.88 mJ
SP           688.4 KB    268932537    91.11 mJ
Encr         443.6 KB    28334189     72.55 mJ
Hyper        451.1 KB    25754072     66.80 mJ
Wood         727.6 KB    362069719    90.04 mJ
Usonic       292.9 KB    9517606      22.93 mJ
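To compare the single-processor figures across benchmarks, one can divide each energy value by its cycle count. The sketch below does this for the values in Table 23-2. Energy per cycle is a derived quantity that the text does not report, the name `nj_per_cycle` is hypothetical, and the result is only as reliable as the extracted table values.

```python
# (cycles, energy in mJ) per benchmark, copied from Table 23-2.
TABLE_23_2 = {
    "Img": (16526121, 18.51),
    "Cholesky": (42233738, 53.34),
    "Atr": (19689554, 24.88),
    "SP": (268932537, 91.11),
    "Encr": (28334189, 72.55),
    "Hyper": (25754072, 66.80),
    "Wood": (362069719, 90.04),
    "Usonic": (9517606, 22.93),
}


def nj_per_cycle(cycles, energy_mj):
    """Average energy per cycle in nanojoules (1 mJ = 1e6 nJ)."""
    return energy_mj * 1e6 / cycles


per_cycle = {name: nj_per_cycle(c, e) for name, (c, e) in TABLE_23_2.items()}

for name, value in per_cycle.items():
    print(f"{name:<10}{value:6.2f} nJ/cycle")
```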
