
3.5 Phase Complexity Surfaces

Having discussed both the phase count surface and the phase predictability surface, we can now combine the two to form a so-called phase complexity surface. A phase complexity surface shows phase count versus phase predictability across different time scales. It is easily derived from the phase count and predictability surfaces by factoring out the θ threshold. In other words, each point on the phase complexity surface corresponds to a particular θ threshold, which determines the phase count and predictability at a given time scale. The motivation for the phase complexity surface is to provide an easy-to-grasp, intuitive view of a program's phase behavior in a single graph.
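To make this construction concrete, the sketch below (a minimal illustration, not the paper's implementation) derives the complexity-surface points from a phase count surface and a phase predictability surface by factoring out θ. The data layout is an assumption: each surface is stored as a mapping from (interval size, θ) pairs to values, and the numbers in the usage example are invented purely for illustration.

```python
# Minimal sketch (hypothetical data layout): derive phase complexity surface
# points from a phase count surface and a phase predictability surface by
# factoring out the theta threshold. Each surface is assumed to be a dict
# mapping (interval_size, theta) -> value.

def complexity_surface(count_surface, predictability_surface):
    """For every (interval size, theta) pair present in both input surfaces,
    emit one point (interval size, phase count, predictability)."""
    points = []
    for (scale, theta), count in count_surface.items():
        pred = predictability_surface.get((scale, theta))
        if pred is not None:
            points.append((scale, count, pred))
    return points

# Usage example with made-up numbers: two interval sizes, two theta thresholds.
counts = {(1_000_000, 0.1): 12, (1_000_000, 0.3): 5,
          (10_000_000, 0.1): 7, (10_000_000, 0.3): 3}
preds = {(1_000_000, 0.1): 0.82, (1_000_000, 0.3): 0.91,
         (10_000_000, 0.1): 0.88, (10_000_000, 0.3): 0.95}
for scale, count, pred in complexity_surface(counts, preds):
    print(f"interval size {scale:>10}: {count:2d} phases, predictability {pred:.2f}")
```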

3.6 Discussion

Time complexity. The time complexity for computing phase complexity surfaces is linear, since each of the four steps has linear-time complexity. The first step computes the BBVs at the smallest interval granularity of interest. This requires a functional simulation or instrumentation run of the complete benchmark execution; the overhead is limited, though. The second step computes BBVs at larger interval granularities by aggregating the BBVs from the previous step. This step is linear in the number of smallest-granularity intervals. The third step applies threshold clustering at all interval granularities. As mentioned earlier, the basic approach to threshold clustering is an iterative process; our approach, however, makes a single linear scan over the BBVs. Once the phase IDs have been determined through the clustering step, the fourth step determines phase predictability by predicting next phase IDs; again, this has linear-time complexity.
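To illustrate why steps two through four stay linear, the sketch below gives one possible realization; the concrete choices are assumptions rather than the paper's exact method: BBVs are dictionaries from basic-block IDs to execution counts, distance is the Manhattan distance between normalized BBVs, clustering assigns each interval to the first existing cluster representative within θ, and predictability is estimated with a simple last-phase predictor.

```python
# Minimal sketch of the linear-time steps. Assumed details (not prescribed by
# the text above): dict-based BBVs, Manhattan distance on normalized BBVs,
# first-fit assignment to cluster representatives, last-phase prediction.

def aggregate(bbvs, factor):
    """Step 2: build BBVs at a coarser granularity by summing groups of
    'factor' consecutive smallest-granularity BBVs."""
    coarse = []
    for i in range(0, len(bbvs), factor):
        merged = {}
        for bbv in bbvs[i:i + factor]:
            for block, count in bbv.items():
                merged[block] = merged.get(block, 0) + count
        coarse.append(merged)
    return coarse

def normalize(bbv):
    total = sum(bbv.values()) or 1
    return {block: count / total for block, count in bbv.items()}

def manhattan(a, b):
    return sum(abs(a.get(k, 0.0) - b.get(k, 0.0)) for k in set(a) | set(b))

def threshold_cluster(bbvs, theta):
    """Step 3: a single pass over the intervals; each BBV joins the first
    cluster whose representative lies within theta, else it opens a new cluster."""
    reps, phase_ids = [], []
    for bbv in map(normalize, bbvs):
        for pid, rep in enumerate(reps):
            if manhattan(bbv, rep) <= theta:
                phase_ids.append(pid)
                break
        else:
            phase_ids.append(len(reps))
            reps.append(bbv)
    return phase_ids

def predictability(phase_ids):
    """Step 4: fraction of intervals whose phase ID is guessed correctly by a
    last-phase predictor (predict that the next phase equals the current one)."""
    hits = sum(1 for prev, cur in zip(phase_ids, phase_ids[1:]) if prev == cur)
    return hits / max(len(phase_ids) - 1, 1)
```

Each function makes a single pass over its input, so the work per interval is bounded by the number of clusters and distinct basic blocks; this is what keeps the end-to-end analysis cheap enough to run over complete benchmark executions.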

Applications. Phase complexity surfaces have a number of potential applications. One is to select representative benchmarks for performance analysis based on their inherent program phase behavior. A set of benchmarks that covers diverse phase behaviors can capture a representative picture of the benchmark suite's phase behavior; this will be illustrated further in Section 6. Second, phase complexity surfaces are also useful for determining an appropriate interval size for optimization. For example, energy consumption can be reduced by downscaling hardware resources on a per-phase basis [1,6,7,27]. An important criterion for achieving good energy savings at a limited performance penalty is to limit the number of phases (in order to limit the run-time training cost of finding a good per-phase hardware setting) and to achieve good phase predictability (in order to limit the number of phase mispredictions, which may be costly in terms of missed energy-saving opportunities and/or performance penalty).

4 Experimental Setup

We use all of the SPEC CPU2000 integer and floating-point benchmarks with their reference inputs, and we run all benchmarks to completion. We use the SimpleScalar Tool Set [2] to collect BBVs on an interval basis.
