Architecture of Computing Systems (Lecture Notes in Computer ...


Compiler-Directed Performance Model Construction for Parallel Programs

Martin Schindewolf 1, David Kramer 1, and Marcelo Cintra 2

1 Institute of Computer Science & Engineering
Karlsruhe Institute of Technology (KIT)
Haid-und-Neu-Straße 7
76131 Karlsruhe, Germany
{schindewolf,kramer}@kit.edu

2 School of Informatics
University of Edinburgh
10 Crichton Street
Edinburgh EH8 9AB, United Kingdom
mc@inf.ed.ac.uk

Abstract. During the last decade, performance prediction for industrial and scientific workloads on massively parallel high-performance computing systems has been and still is an active research area. Due to the complexity of applications, the approach to deriving an analytical performance model from current workloads becomes increasingly challenging: automatically generated models often suffer from inaccurate performance prediction; manually constructed analytical models show better prediction, but are very labor-intensive. Our approach aims at closing the gap between compiler-supported automatic model construction and the manual analytical modeling of workloads. Commonly, performance-counter values are used to validate the model, so that prediction errors can be determined and quantified. Instead of manually instrumenting the executable for accessing performance counters, we modified the GCC compiler to insert calls to run-time system functions. Added compiler options enable the user to control the instrumentation process. Subsequently, the instrumentation focuses on frequently executed code parts. Similar to established frameworks, a run-time system is used to track the application behavior: traces are generated at run-time, enabling the construction of architecture independent models (using quadratic programming) and, thus, the prediction of larger workloads. In this paper, we introduce our framework and demonstrate its applicability to benchmarks as well as real world numerical workloads. The experiments reveal an average error rate of 9% for the prediction of larger workloads.
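The abstract describes building models from run-time traces and using them to predict larger workloads. As a rough illustration of that idea only, the sketch below fits a linear model to a few traced measurements by ordinary least squares, which is a simple unconstrained quadratic program, and then extrapolates to a larger input size. The paper's actual model form and quadratic-programming formulation are not given in this excerpt; the function names, the linear model, and the sample numbers are all assumptions made for illustration.

```python
def fit_linear_model(sizes, counts):
    """Least-squares fit of counts ~ a + b * size (hypothetical model form).

    Minimizing the sum of squared residuals is an unconstrained quadratic
    program; for one predictor it has this closed-form solution.
    """
    n = len(sizes)
    mean_x = sum(sizes) / n
    mean_y = sum(counts) / n
    sxx = sum((x - mean_x) ** 2 for x in sizes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, counts))
    b = sxy / sxx              # slope: cost per unit of workload size
    a = mean_y - b * mean_x    # intercept: size-independent cost
    return a, b


def predict(model, size):
    """Extrapolate the fitted model to a (possibly larger) workload size."""
    a, b = model
    return a + b * size


def relative_error(predicted, measured):
    """Prediction error relative to the measured value."""
    return abs(predicted - measured) / measured


if __name__ == "__main__":
    # Synthetic, made-up values standing in for traced counter readings
    # at small workload sizes; they are not results from the paper.
    small_sizes = [100, 200, 400]
    small_counts = [1050, 2050, 4050]

    model = fit_linear_model(small_sizes, small_counts)
    print("fitted (a, b):", model)
    print("predicted count at size 1600:", predict(model, 1600))
```

Comparing such a prediction against a counter value measured at the larger size, for example with `relative_error`, is one way a prediction error like the one reported in the abstract could be quantified.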

1 Introduction

For many years, research in computer system architecture has aimed at modeling the performance of computer architectures and workloads. Performance modeling is widely applicable during a computing life-cycle: design, integration,

C. Müller-Schloer, W. Karl, and S. Yehia (Eds.): ARCS 2010, LNCS 5974, pp. 187–198, 2010.<br />

© Springer-Verlag Berlin Heidelberg 2010
