01.12.2012 Views

Architecture of Computing Systems (Lecture Notes in Computer ...


206 R. Plyaskin and A. Herkersdorf

4.1 Accuracy of Generated Traces

To evaluate the accuracy of the proposed trace generation method and to estimate the possible gain in simulation performance compared to the cycle-accurate simulator, we took a set of five benchmarks. Each benchmark represents a particular application domain. In the jpeg benchmark, the application encodes a bitmap image into the JPEG format [3]. Additionally, we selected the bitcount, FFT, stringsearch, and sha benchmarks from the MiBench suite [5], which represent the Automotive, Telecommunications, Office, and Security application categories, respectively. The experiments were conducted on a PC with a 2 GHz Intel Core 2 Duo processor and 1 GB of RAM. During the tests we used the following input data for the applications:

– Jpeg: a test bitmap image of size 104×72 pixels;
– FFT: polynomial function with one random sinusoid and 512 samples;
– Bitcount: 50000 iterations;
– Sha: the small ASCII text file provided with the benchmark;
– Stringsearch: the large string provided with the benchmark.

The cross-compiled C code of each application was executed on a CoMET virtual platform. The SoC architecture modeled in the tool consisted of a PowerPC e200z6 CPU with a 32 kB write-back cache, a generic memory model with an access latency of 1 cycle, and a generic bus model with a request latency of 1 cycle. The resulting log was then processed by the trace generator, which produced a trace file for each benchmark.
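The platform parameters listed above can be collected in a single record, so that the same values can be checked against a second simulator configuration. The following is a minimal sketch only: the class `SocConfig`, its field names, and the helper `mismatches` are hypothetical and do not correspond to any CoMET or trace simulator API; only the parameter values come from the text.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class SocConfig:
    """Hypothetical record of the modeled SoC parameters (names are illustrative)."""
    cpu: str
    cache_size_bytes: int
    cache_policy: str
    memory_latency_cycles: int
    bus_request_latency_cycles: int


# Values as stated in the text for the CoMET virtual platform.
COMET_PLATFORM = SocConfig(
    cpu="PowerPC e200z6",
    cache_size_bytes=32 * 1024,      # 32 kB cache
    cache_policy="write-back",
    memory_latency_cycles=1,         # generic memory model
    bus_request_latency_cycles=1,    # generic bus model
)


def mismatches(reference, candidate):
    """Return the names of the fields whose values differ between two configs."""
    return [
        f.name
        for f in fields(reference)
        if getattr(reference, f.name) != getattr(candidate, f.name)
    ]


# A second configuration copied from the reference should report no differences.
ts_platform = replace(COMET_PLATFORM)
print(mismatches(COMET_PLATFORM, ts_platform))
```

Keeping the parameters in one immutable record makes it easy to spot a configuration that deviates from the reference platform, for example a memory latency of 2 cycles instead of 1.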

In the next step, we executed the generated traces in the trace simulator (TS), which contained the same SoC components as the CoMET virtual platform. The parameters of the abstracted modules, e.g. the memory and bus latencies as well as the cache size, were set to match those of the CoMET modules. The clock frequencies of the components in both CoMET and TS were set to 100 MHz. The estimated execution time for each benchmark is given in Table 1. Although the trace simulator has no notion of instructions, its simulation performance in MIPS was calculated as the number of instructions of the corresponding application divided by the trace simulation time. We refer to the simulation time as the real time elapsed from the start to the end of

Table 1. Comparison of VaST CoMET and trace-based simulations

| Benchmark | Number of instructions | Estimated cycles, VaST | Estimated cycles, TS | Error, % | Simulation performance, MIPS, VaST | Simulation performance, MIPS, TS | Speedup | VaST perf. with trace gen., MIPS |
|---|---|---|---|---|---|---|---|---|
| stringsearch | 1.68M | 2.86M | 2.93M | 2.18 | 7.01 | 11.97 | 1.71 | 0.94 |
| jpeg | 3.01M | 5.18M | 5.16M | -0.37 | 6.52 | 25.07 | 3.84 | 0.98 |
| sha | 10.81M | 12.8M | 12.45M | -2.77 | 54.60 | 72.07 | 1.32 | 1.18 |
| FFT | 21.68M | 27.23M | 25.96M | -4.67 | 28.64 | 60.23 | 2.10 | 1.00 |
| bitcount | 34.18M | 43.16M | 42.71M | -1.04 | 90.17 | 126.57 | 1.40 | 1.40 |
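The derived columns of Table 1 can be recomputed from the raw ones, as in the sketch below. It rests on several assumptions: the error is taken as (TS − VaST) / VaST, which is inferred from the signs in the table and not stated explicitly; the speedup is taken as the ratio of TS to VaST performance; and all function names are illustrative. Because the table lists rounded cycle counts, the recomputed error can differ from the printed column by a few tenths of a percentage point.

```python
# Rows of Table 1:
# (benchmark, instructions [M], VaST cycles [M], TS cycles [M], VaST MIPS, TS MIPS)
TABLE_1 = [
    ("stringsearch", 1.68, 2.86, 2.93, 7.01, 11.97),
    ("jpeg", 3.01, 5.18, 5.16, 6.52, 25.07),
    ("sha", 10.81, 12.8, 12.45, 54.60, 72.07),
    ("FFT", 21.68, 27.23, 25.96, 28.64, 60.23),
    ("bitcount", 34.18, 43.16, 42.71, 90.17, 126.57),
]

CLOCK_HZ = 100e6  # 100 MHz, the clock frequency set in both CoMET and TS


def cycle_error_percent(vast_cycles, ts_cycles):
    """Relative deviation of the TS cycle estimate from the VaST reference (assumed formula)."""
    return 100.0 * (ts_cycles - vast_cycles) / vast_cycles


def speedup(vast_mips, ts_mips):
    """Ratio of trace simulation performance to VaST performance (assumed formula)."""
    return ts_mips / vast_mips


def simulation_seconds(instructions_m, mips):
    """Invert the MIPS definition: wall-clock time = instructions / (MIPS * 1e6)."""
    return instructions_m / mips


def target_time_ms(cycles_m, clock_hz=CLOCK_HZ):
    """Estimated execution time on the modeled target, in milliseconds."""
    return cycles_m * 1e6 / clock_hz * 1e3


for name, instr, vast_cyc, ts_cyc, vast_mips, ts_mips in TABLE_1:
    print(
        f"{name:12s} error={cycle_error_percent(vast_cyc, ts_cyc):6.2f}% "
        f"speedup={speedup(vast_mips, ts_mips):5.2f} "
        f"TS wall-clock={simulation_seconds(instr, ts_mips):6.3f}s "
        f"target time={target_time_ms(ts_cyc):7.1f}ms"
    )
```

For stringsearch, for instance, this gives a speedup of about 1.71, which matches the table, and an estimated target execution time of about 29.3 ms for 2.93M cycles at 100 MHz.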
