
Lecture Notes in Computer Science 4917



M. Cornero et al.

Fig. 4. Impact on performance of the CLI-based compilation (config a vs. b). Panels: x86 and SH-4.

introduction of SH-4 nops to ensure proper alignment of basic blocks and functions, required to achieve high performance on this architecture.

The worst case comes from the benchmark video, where CLI is roughly 74% larger than x86 or SH-4: two thirds of the CLI code is made of initializers of arrays of bitfields, for which we emit very naive code, one bitfield at a time. A smarter code emission (which we have planned, but not yet implemented) will combine bitfields to generate the values in-place, getting rid of most initializers.
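The idea of combining bitfields can be illustrated with a minimal sketch. It is not the planned code emitter: the function `pack_bitfields`, the sample field widths, and the layout (first field in the least significant bits) are all assumptions made for illustration, and the real layout depends on the target ABI. The point is only that several per-bitfield stores collapse into one precomputed word.

```python
def pack_bitfields(fields):
    """Combine (value, width) bitfields into a single integer.

    Assumption: the first field occupies the least significant bits.
    Returns the packed word and the total number of bits used.
    """
    word = 0
    offset = 0
    for value, width in fields:
        mask = (1 << width) - 1
        if value & mask != value:
            raise ValueError(f"value {value} does not fit in {width} bits")
        word |= value << offset  # place the field at its bit offset
        offset += width
    return word, offset


# Hypothetical array element with three bitfields of 3, 5 and 8 bits.
element = [(5, 3), (17, 5), (200, 8)]
word, bits = pack_bitfields(element)
print(hex(word), bits)
```

With such a precomputed word, an initializer needs one store per element instead of one store per bitfield.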

Excluding the pathological cases (video for both architectures and mpeg2enc for SH-4), the SH-4 (resp. x86) code is 19% (resp. 2%) larger than CLI.

There are other opportunities for improvement: in some cases, we have to generate data segments for both little-endian and big-endian architectures. It is likely that, at deployment time, the endianness is known.² In this case, the useless data definition could be dropped. Another reduction can come from the fact that CLI retains all the source code function and type names in the metadata. In the absence of reflection, which is the case for the C language, those names can be changed to much shorter ones. Using only lower case and upper case letters, digits and underscore, one can encode (2 × 26 + 10 + 1)² = 63² = 3969 names on two characters, drastically reducing the size of the string pool.
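The renaming scheme described above can be sketched as follows. This is only an illustration under stated assumptions: `ALPHABET`, `short_names` and `rename` are hypothetical names, the sample symbols are invented, and the sketch does not touch real CLI metadata; it only shows how source-level names could be mapped to two-character ones.

```python
import itertools
import string

# Lower case letters, upper case letters, digits and underscore:
# 26 + 26 + 10 + 1 = 63 characters.
ALPHABET = string.ascii_lowercase + string.ascii_uppercase + string.digits + "_"


def short_names(length=2):
    """Yield every name of the given length over ALPHABET, in a fixed order."""
    for chars in itertools.product(ALPHABET, repeat=length):
        yield "".join(chars)


def rename(symbols):
    """Map each symbol to a distinct two-character name (hypothetical helper)."""
    symbols = list(symbols)
    pool = list(short_names())
    if len(symbols) > len(pool):
        raise ValueError("more symbols than available two-character names")
    return dict(zip(symbols, pool))


# Invented function names standing in for entries of the string pool.
mapping = rename(["main", "decode_frame", "quantize_block"])
print(mapping)
```

A program with more symbols than the pool holds would need longer names for the remainder; the sketch simply raises an error in that case.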

Our experiments confirm a previous result [6] that CLI is quite compact, similar to x86 and roughly 20% smaller (taking into account the preceding remarks) than SH-4, both known for having dense instruction sets.

On the performance side, consider Figure 4, which represents the performance of the binaries generated by configuration b (through CLI, at -O2) with respect to a (the classical flow, also at -O2). It measures the impact on performance of using the intermediate representation. The code generated through CLI in configuration b is, on average, barely slower than a. In other words, using the -O2 optimization level in all cases causes a 1.5% performance degradation on x86 and 0.6% on SH-4. The worst degradation is also contained, with -18% for crypto on x86 and -17% for ks on SH-4.

² Some platforms are made of processors of both endiannesses. It could be advantageous to migrate the code from one to another and thus to keep both definitions.
