29.01.2015 Views

Embedded Software for SoC - Grupo de Mecatrônica EESC/USP

Embedded Software for SoC - Grupo de Mecatrônica EESC/USP

Embedded Software for SoC - Grupo de Mecatrônica EESC/USP

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

Efficient Power/Per<strong>for</strong>mance Analysis 293<br />

of instructions almost always results in the same scheduled trace of instructions<br />

insi<strong>de</strong> of the given hotspot.<br />

Hotspots create regular patterns of execution profiles during the course of<br />

execution of a benchmark. Due to these execution patterns‚ it is not essential<br />

to do a cycle accurate <strong>de</strong>tailed simulation <strong>for</strong> the entire duration of each of<br />

the hotspots. Thus‚ one can use the IPC and EPC values of a sampling window<br />

to predict the future behavior of the program. We exploit this feature of the<br />

program execution behavior in or<strong>de</strong>r to accelerate micro-architectural<br />

simulation. In Section 3‚ we <strong>de</strong>scribe how metrics of interest can be obtained<br />

via sampling insi<strong>de</strong> <strong>de</strong>tected hotspots.<br />

2.2. Hotspot <strong>de</strong>tection<br />

Hotspots are mainly characterized by the behavior of the branches insi<strong>de</strong> the<br />

hotspot. A hotspot can be <strong>de</strong>tected by keeping track of the branches being<br />

encountered during the execution of the program. To keep track of branches‚<br />

we use a cache-like data structure called the Branch Behavior Buffer (BBB)<br />

[13]. Each branch has an entry in the BBB‚ consisting of an Execution Counter<br />

and a one-bit Candidate Flag (CF). The execution counter is incremented each<br />

time the branch is taken‚ and once the counter exceeds a certain threshold<br />

(512‚ in our case)‚ the branch in question is marked as a candidate branch by<br />

setting the CF bit <strong>for</strong> that branch. The simulator also maintains a saturating<br />

counter called the hotspot <strong>de</strong>tection counter (HDC)‚ which keeps track of<br />

candidate branches. Initially‚ the counter is set to a maximum value (64 K in<br />

our case); each time a candidate branch is taken‚ the counter is <strong>de</strong>cremented<br />

by a value D‚ and each time a non-candidate branch is taken‚ it is incremented<br />

by a value I. When the HDC <strong>de</strong>crements down to zero‚ we are in a hotspot.<br />

For our implementation we chose D as 2 and I as 1‚ such that exiting the<br />

hotspot is twice as slow as entering it (this is done to prevent false exits from<br />

a hotspot). The <strong>de</strong>tails of the hotspot <strong>de</strong>tection scheme can be found in [13]‚<br />

[16].<br />

3. HYBRID SIMULATION<br />

As <strong>de</strong>scribed previously‚ common programs exhibit high temporal locality in<br />

various sections of their co<strong>de</strong> and behave in a predictable manner insi<strong>de</strong><br />

hotspots. Our strategy is to employ a two-level hybrid simulation paradigm‚<br />

which can per<strong>for</strong>m architectural simulation at two different levels of abstraction<br />

and with different speed/accuracy characteristics.<br />

A low-level simulation environment‚ which can per<strong>for</strong>m cycle accurate<br />

simulation and provi<strong>de</strong> accurate metrics associated with the program<br />

execution.<br />

A high level simulation environment‚ which can per<strong>for</strong>m correct functional<br />

simulation without providing cycle accurate metrics.

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!