The approach is based on the manual instrumentation of the application with MA API calls. The models for floating-point and load/store operations are created at run time of the application. The MA library starts and stops the performance counters (using PAPI), which are used to validate the model. The MPI communication patterns are profiled. Furthermore, incremental refinement of the model is supported by adding more variables to the input parameters. The MA library functions generate trace files, and these traces are used to validate the model. The CG and SP benchmarks of the NPB suite were evaluated and show a maximum error rate of less than 30% (typically less than 10%) for floating-point operations.
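A minimal C sketch of this instrumentation pattern is given below; the ma_region_begin/ma_region_end wrappers are hypothetical stand-ins for the MA API calls (which are not reproduced in the text), while the PAPI calls are taken from the standard PAPI low-level interface.

    /* Sketch: manually instrumenting a kernel and reading hardware counters
     * that can later be used to validate a performance model. The ma_*
     * wrappers are hypothetical stand-ins for MA API calls. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <papi.h>

    static int eventset = PAPI_NULL;

    /* Hypothetical MA-style wrapper: start counting for a code region. */
    static void ma_region_begin(void)
    {
        if (PAPI_start(eventset) != PAPI_OK) {
            fprintf(stderr, "PAPI_start failed\n");
            exit(EXIT_FAILURE);
        }
    }

    /* Hypothetical MA-style wrapper: stop counting and emit the values,
     * e.g. into a trace file used to validate the model. */
    static void ma_region_end(const char *name)
    {
        long long counts[2];
        if (PAPI_stop(eventset, counts) != PAPI_OK) {
            fprintf(stderr, "PAPI_stop failed\n");
            exit(EXIT_FAILURE);
        }
        printf("%s: fp_ops=%lld load_stores=%lld\n", name, counts[0], counts[1]);
    }

    int main(void)
    {
        if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT)
            return EXIT_FAILURE;
        PAPI_create_eventset(&eventset);
        PAPI_add_event(eventset, PAPI_FP_OPS);  /* floating-point operations */
        PAPI_add_event(eventset, PAPI_LST_INS); /* load/store instructions   */

        double a[1024], s = 0.0;
        for (int i = 0; i < 1024; i++) a[i] = (double)i;

        ma_region_begin();
        for (int i = 0; i < 1024; i++)   /* instrumented kernel */
            s += a[i] * a[i];
        ma_region_end("dot_kernel");

        printf("result: %f\n", s);
        return 0;
    }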

Ipek et al. propose a neural-network-based approach to the performance prediction of parallel applications [6]. By training neural networks on performance data, this approach benefits from automated model construction, modeling full system complexity without the need to add architectural details to the model to obtain precise predictions.
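As an illustration of the prediction step only, the sketch below evaluates a small feed-forward network that maps application input parameters to a runtime estimate; the network shape, weights, and inputs are placeholders and would, in the approach of [6], result from training on measured performance data.

    /* Sketch: predicting runtime with a small trained feed-forward network.
     * Weights and inputs are placeholders, not trained values. */
    #include <math.h>
    #include <stdio.h>

    #define N_IN  3   /* e.g. problem size, process count, ... (illustrative) */
    #define N_HID 4

    static double sigmoid(double x) { return 1.0 / (1.0 + exp(-x)); }

    /* Forward pass of a one-hidden-layer network: inputs -> hidden -> runtime. */
    static double predict_runtime(const double in[N_IN],
                                  const double w1[N_HID][N_IN],
                                  const double b1[N_HID],
                                  const double w2[N_HID],
                                  double b2)
    {
        double out = b2;
        for (int h = 0; h < N_HID; h++) {
            double act = b1[h];
            for (int i = 0; i < N_IN; i++)
                act += w1[h][i] * in[i];
            out += w2[h] * sigmoid(act);
        }
        return out;
    }

    int main(void)
    {
        /* Placeholder weights; a real model would be fitted to observed runs. */
        double w1[N_HID][N_IN] = {{0.1, 0.2, 0.3}, {0.2, 0.1, 0.0},
                                  {0.0, 0.3, 0.1}, {0.1, 0.1, 0.2}};
        double b1[N_HID] = {0.0, 0.1, -0.1, 0.0};
        double w2[N_HID] = {1.5, -0.5, 0.8, 0.3};
        double b2 = 0.2;

        double params[N_IN] = {0.5, 0.25, 0.75}; /* normalized input parameters */
        printf("predicted runtime: %f\n",
               predict_runtime(params, w1, b1, w2, b2));
        return 0;
    }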

Marin describes an architecture-independent model construction process [7]. The object code of an application is analyzed in two ways. First, a static analysis is performed, identifying the loop nests and the instruction mixes in basic blocks and delivering a control flow graph. Second, the binary is instrumented to measure the basic block counts, the communication volume and frequency, and the memory utilization at runtime. The post-processing tool set generates an architecture-neutral model, which serves as input for the scheduler and is merged with an architecture description, leading to an overall performance model. The scheduler maps the specific instructions to generic classes, assuring architecture independence. The resulting models are capable of predicting floating-point, load, and store operations. This approach has been extended for cross-platform prediction [8]. The effects of restricting the execution time on cross-platform performance prediction are studied by Yang et al. [9].
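The core bookkeeping behind such a model can be pictured as combining the statically determined instruction mix of each basic block with its measured execution count; the sketch below uses illustrative block data and generic classes (floating point, load, store) and is not Marin's actual tool chain.

    /* Sketch: deriving architecture-neutral operation counts per generic
     * instruction class from static per-block mixes and measured block
     * execution counts. All block data are illustrative placeholders. */
    #include <stdio.h>

    enum { FP_OP, LOAD_OP, STORE_OP, N_CLASSES };

    struct basic_block {
        long long exec_count;   /* measured at runtime via instrumentation */
        int mix[N_CLASSES];     /* static instruction mix of the block     */
    };

    int main(void)
    {
        const char *class_name[N_CLASSES] = {"floating point", "load", "store"};

        /* Two illustrative basic blocks of an application. */
        struct basic_block blocks[] = {
            { 100000, {2, 3, 1} },   /* e.g. inner loop body */
            {    500, {0, 1, 1} },   /* e.g. loop prologue   */
        };
        int n_blocks = sizeof(blocks) / sizeof(blocks[0]);

        long long total[N_CLASSES] = {0};
        for (int b = 0; b < n_blocks; b++)
            for (int c = 0; c < N_CLASSES; c++)
                total[c] += blocks[b].exec_count * blocks[b].mix[c];

        for (int c = 0; c < N_CLASSES; c++)
            printf("%s operations: %lld\n", class_name[c], total[c]);
        return 0;
    }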

The convolution method was developed at the Performance Modeling and Characterization Lab at the San Diego Supercomputer Center [10]. The performance prediction of a parallel application is obtained from single-processor performance and network utilization.
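In a strongly simplified form, such a prediction can be pictured as combining a single-processor compute-time estimate with a network-model estimate of communication time; the numbers and the purely additive combination below are illustrative assumptions, not the actual convolution method of [10].

    /* Sketch: combining a single-processor compute-time estimate with a
     * simple latency/bandwidth network model. All values are illustrative. */
    #include <stdio.h>

    int main(void)
    {
        /* Single-processor characterization (illustrative values). */
        double ops_per_rank     = 2.0e9;   /* operations executed per process */
        double achieved_ops_sec = 1.0e9;   /* measured single-processor rate  */

        /* Network characterization (illustrative values). */
        double msg_count = 1000.0;         /* messages per process            */
        double msg_bytes = 64.0e3;         /* average message size in bytes   */
        double latency_s = 5.0e-6;         /* per-message latency in seconds  */
        double bandwidth = 1.0e9;          /* bytes per second                */

        double compute_time = ops_per_rank / achieved_ops_sec;
        double comm_time    = msg_count * (latency_s + msg_bytes / bandwidth);

        printf("predicted parallel runtime: %.3f s\n", compute_time + comm_time);
        return 0;
    }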

Lee et al. propose to apply statistical methods to combine single-processor and contention models to predict multiprocessor behavior [11,12].

The Performance Oriented End-to-end Modeling System (POEMS) is an environment for end-to-end performance modeling of parallel systems [13]. System components are modeled at different layers of abstraction: applications, runtime, operating system, and hardware. Different modeling paradigms, such as simulation, analysis, and direct measurement, are employed to model the components. Sweep3D is used to evaluate and predict parallel architectures.

TAU is a set of tools addressing the instrumentation, measurement, and analysis of parallel applications [14]. However, their approach does not involve a compiler but instead relies on the combination of a preprocessor and binary instrumentation or dynamic instrumentation to produce an instrumented binary. Binary instrumentation has the advantage of working even in the absence of the source code.
