
Lecture Notes in Computer Science 4917

Aggressive Function Inlining: Preventing Loop Blockings

Fig. 8. Amount of L1 Icache fetches reduced on CINT2000 (lower is better)

[Bar chart. Y-axis: fraction of L1 I-cache fetches, 0.40 to 1.00. Series: Code Reordering, Aggressive Inlining. Benchmarks: gzip, vpr, gcc, mcf, crafty, parser, eon, gap, vortex, bzip2, twolf, Avg.]

Fig. 9. Amount of branch target mispredictions reduced on CINT2000 (lower is better)

[Bar chart. Y-axis: fraction of branch target mispredictions, 0.00 to 2.00. Series: Code Reordering, Aggressive Inlining. Benchmarks: gzip, vpr, gcc, mcf, crafty, parser, eon, gap, vortex, bzip2, twolf, Avg.]

of the IBM PCIX Cryptographic Coprocessor. We tested the csulcca (Common Support Utility Linux Common Cryptographic Architecture) application and obtained a 2% improvement over code reordering due to the use of inlining.

5 Related Work<br />

Methods for selective inlining have been studied and implemented over the last 10 years. These works should be distinguished from work on code reordering, in which inlining is only a pre-stage to code reordering. The goal of code reordering is to rearrange code segments to minimize Icache misses. As explained in Section 1, this differs from the goal of selecting “safe” yet aggressive inlinings. Scheifler [12] proposed to inline functions based on: a) function size, b) number of calls versus function size, and c) dominant calls. Scheifler showed that computing optimal inlining is at least NP-hard. Ball [4] proposed to inline functions based on their utility for constant propagation and other optimizations.
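To make the flavor of size-based and call-count-based selection concrete, the following is a minimal sketch of a greedy selective-inlining heuristic. It is not Scheifler's algorithm and not the method of this paper: the names `CallSite` and `select_inlinings`, the parameters `max_callee_size` and `growth_budget`, and the profile counts are all hypothetical. It models only criterion (a), as a size cutoff, and criterion (b), as a ratio of executed calls to callee size. Dominant calls and recursion are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CallSite:
    caller: str
    callee: str
    exec_count: int  # hypothetical profile data: how often this call executes


def select_inlinings(sizes, call_sites, max_callee_size=20, growth_budget=64):
    """Greedily pick call sites to inline under a total code-growth budget.

    sizes maps a function name to its (positive) size in instructions.
    Illustrative only: recursion and dominant calls are ignored.
    """
    # Criterion (a): discard callees whose body exceeds the size cutoff.
    candidates = [cs for cs in call_sites if sizes[cs.callee] <= max_callee_size]
    # Criterion (b): prefer many executed calls per instruction of growth.
    ranked = sorted(
        candidates,
        key=lambda cs: cs.exec_count / sizes[cs.callee],
        reverse=True,
    )
    chosen = []
    remaining = growth_budget
    for cs in ranked:
        size = sizes[cs.callee]
        if size <= remaining:  # inlining copies the callee body into the caller
            chosen.append(cs)
            remaining -= size
    return chosen


sizes = {"f": 4, "g": 40, "h": 10}
sites = [
    CallSite("main", "f", 1000),
    CallSite("main", "g", 5000),
    CallSite("loop", "h", 200),
    CallSite("main", "h", 10),
]
picked = select_inlinings(sizes, sites, max_callee_size=20, growth_budget=15)
```

A greedy ranking like this is only a heuristic. Since computing an optimal inlining is at least NP-hard, practical selectors settle for approximations of this kind.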
