13.10.2014 Views

OPTIMIZING THE JAVA VIRTUAL MACHINE INSTRUCTION SET BY ...


7.6.2 Overall Performance Across All Benchmarks

The previous section considered each of the benchmarks individually. The results presented in this section instead consider a single set of multicodes that offers a performance benefit for all of the benchmarks under consideration.

Each of the benchmarks considered in this study executes a different number of bytecodes. While _201_compress executes more than 950 million bytecodes, _213_javac executes fewer than 75 million, a difference of approximately 875 million bytecodes. Consequently, it was necessary to normalize the number of bytecodes executed by each benchmark in order to develop a single set of multicodes for all of the applications. Had this normalization not been performed, the multicodes selected would have been heavily skewed in favour of the _201_compress benchmark because it executes far more bytecodes than any of the other benchmarks under consideration.

To accomplish this, the compressed multicode block files for each benchmark were processed and merged into a single file. As each compressed multicode block file was read, the total number of multicode blocks executed by the benchmark was recorded. The count for each multicode block was then expressed as a percentage of the total number of blocks executed by that benchmark, and the results for the six benchmarks were merged together. As a result of this normalization, each benchmark contributed equally to the newly created compressed multicode block file representing the total set of blocks executed by all of the benchmarks.
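The normalization described above can be illustrated with a minimal sketch. It assumes each benchmark's profile is available as a mapping from a multicode block to its execution count; the function name `merge_normalized`, the in-memory representation, and the toy profiles are hypothetical and are not taken from the study's tooling or file format.

```python
from collections import defaultdict


def merge_normalized(block_counts_by_benchmark):
    """Merge per-benchmark block counts after rescaling each to percentages.

    block_counts_by_benchmark: dict mapping a benchmark name to a dict of
    {block: execution count}. This layout is an assumption for illustration.
    """
    merged = defaultdict(float)
    for benchmark, counts in block_counts_by_benchmark.items():
        total = sum(counts.values())
        if total == 0:
            continue  # a benchmark with no recorded blocks contributes nothing
        for block, count in counts.items():
            # Express the count as a percentage of this benchmark's total,
            # so every benchmark adds exactly 100 units to the merged profile.
            merged[block] += 100.0 * count / total
    return dict(merged)


# Hypothetical toy profiles; block contents and counts are invented.
profiles = {
    "large": {("aload_0", "getfield"): 900, ("iload", "iadd"): 100},
    "small": {("iload", "iadd"): 8, ("aload_0", "areturn"): 2},
}
merged = merge_normalized(profiles)
```

With raw counts, the block that dominates the large profile would outweigh everything in the small one; after rescaling, the block shared by both profiles carries as much weight as the large profile's most frequent block.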

Once the new file was created, it was used to determine the 50 best multicodes. The performance results presented in the previous section revealed that, on average, Transfer Reduction Scoring with a maximum multicode length of 25 bytecodes offered the best performance. Consequently, this combination of scoring technique and maximum multicode length was employed when determining the set of multicodes to test across all of the benchmarks. The specific multicodes utilized are listed in Table A.7, located in Appendix 11.

Figure 7.22 shows the performance results achieved when a single set of multicodes was identified for all six benchmarks. The performance achieved for each benchmark individually is shown using a broken line, while the average performance across all benchmarks is shown using a solid line. Interestingly, this graph reveals that the best performance is achieved when 30 multicode substitutions are performed. This result indicates that, beyond this point, the cost associated with introducing additional multicodes begins to outweigh the benefit. This behaviour is not observed when multicodes are determined for a single benchmark because each multicode identified
