OPTIMIZING THE JAVA VIRTUAL MACHINE INSTRUCTION SET BY ...

7.9 Conclusion

This chapter has introduced an optimization strategy known as multicode substitution. It uses profiling to identify bytecode sequences that are executed with great frequency. Once those sequences are identified, a new multicode is introduced for each one. Each multicode provides functionality identical to the bytecode sequence it replaces but uses only a single opcode.
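
The profiling step can be pictured with a minimal sketch. It assumes the profile is available as a simple list of executed opcode names, and it scores each candidate by a rough estimate of the dispatches it would save (occurrences times length minus one). The class name `SequenceProfiler`, the trace format, and the scoring rule are illustrative assumptions, not the implementation described in this chapter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: finds frequently executed opcode sequences in a trace.
class SequenceProfiler {

    // Counts every contiguous opcode sequence of length 2..maxLen in the trace.
    static Map<List<String>, Integer> countSequences(List<String> trace, int maxLen) {
        Map<List<String>, Integer> counts = new HashMap<>();
        for (int start = 0; start < trace.size(); start++) {
            for (int len = 2; len <= maxLen && start + len <= trace.size(); len++) {
                List<String> seq = new ArrayList<>(trace.subList(start, start + len));
                counts.merge(seq, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Assumed score: each occurrence of a length-n sequence saves n - 1 dispatches.
    static List<String> bestCandidate(Map<List<String>, Integer> counts) {
        List<String> best = null;
        long bestScore = -1;
        for (Map.Entry<List<String>, Integer> e : counts.entrySet()) {
            long score = (long) e.getValue() * (e.getKey().size() - 1);
            if (score > bestScore) {
                bestScore = score;
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> trace = Arrays.asList(
                "aload_0", "getfield", "iload_1", "iadd",
                "aload_0", "getfield", "iload_1", "iadd", "ireturn");
        Map<List<String>, Integer> counts = countSequences(trace, 4);
        List<String> best = bestCandidate(counts);
        System.out.println(best + " occurs " + counts.get(best) + " times");
    }
}
```

The count includes overlapping occurrences, so the score is only an upper bound on the dispatches actually removed once a sequence is replaced.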

The process of fetching and decoding bytecodes is a large source of overhead within a Java interpreter. A variety of bytecode dispatch techniques can be employed, including a simple loop and switch statement or more advanced strategies such as threading. Multicode substitution reduces the number of fetch and decode operations performed regardless of the dispatch strategy employed, resulting in improved application performance.
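
To make the dispatch saving concrete, the following is a minimal sketch of a loop-and-switch interpreter for a toy stack machine. The opcode set, the class name `TinyInterpreter`, and the multicode `LOAD_LOAD_ADD_STORE` are all hypothetical and chosen only for illustration; they are not real JVM opcodes or the interpreter studied here. The interpreter counts one dispatch per fetch and decode, so the same computation can be compared with and without the multicode.

```java
// Toy loop-and-switch interpreter; all opcodes here are illustrative assumptions.
class TinyInterpreter {
    static final int LOAD = 0, ADD = 1, STORE = 2, HALT = 3;
    static final int LOAD_LOAD_ADD_STORE = 4; // hypothetical multicode

    // Executes code against locals and returns the number of dispatches performed.
    static int run(int[] code, int[] locals) {
        int[] stack = new int[16];
        int sp = 0, pc = 0, dispatches = 0;
        while (true) {
            int opcode = code[pc++]; // fetch
            dispatches++;
            switch (opcode) {        // decode
                case LOAD:
                    stack[sp++] = locals[code[pc++]];
                    break;
                case ADD: {
                    int b = stack[--sp];
                    int a = stack[--sp];
                    stack[sp++] = a + b;
                    break;
                }
                case STORE:
                    locals[code[pc++]] = stack[--sp];
                    break;
                case LOAD_LOAD_ADD_STORE: {
                    // Bodies of LOAD, LOAD, ADD, STORE run back to back, one dispatch.
                    stack[sp++] = locals[code[pc++]];
                    stack[sp++] = locals[code[pc++]];
                    int b = stack[--sp];
                    int a = stack[--sp];
                    stack[sp++] = a + b;
                    locals[code[pc++]] = stack[--sp];
                    break;
                }
                case HALT:
                    return dispatches;
                default:
                    throw new IllegalStateException("unknown opcode " + opcode);
            }
        }
    }

    public static void main(String[] args) {
        int[] original = {LOAD, 0, LOAD, 1, ADD, STORE, 2, HALT};
        int[] substituted = {LOAD_LOAD_ADD_STORE, 0, 1, 2, HALT};
        System.out.println("original dispatches: " + run(original, new int[]{3, 4, 0}));
        System.out.println("multicode dispatches: " + run(substituted, new int[]{3, 4, 0}));
    }
}
```

In this toy program the original sequence needs five dispatches (including `HALT`) and the substituted one needs two, while both leave the same value in the destination local.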

Performance results were generated for six of the benchmarks in the SPEC JVM98 benchmark suite. The performance benefit varied. In the worst case, only minor gains were achieved, with application run time reduced to approximately 96 percent of the original run time. Other benchmarks showed much better performance. Selecting bytecode sequences for multicode replacement using only the number of transfers removed resulted in application run times that were 76 percent of the original run time for _201_compress and 73 percent of the original run time for _222_mpegaudio.
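
For readers who prefer speedup factors, the relative run times above can be converted by taking reciprocals. This is only a restatement of the reported percentages, rounded to two decimal places; the symbols $T_{\text{original}}$ and $T_{\text{multicode}}$ are introduced here for illustration.

```latex
\[
\text{speedup} = \frac{T_{\text{original}}}{T_{\text{multicode}}}, \qquad
\frac{1}{0.96} \approx 1.04, \quad
\frac{1}{0.76} \approx 1.32, \quad
\frac{1}{0.73} \approx 1.37
\]
```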

When multicode substitution is performed, new optimization opportunities become available that could not previously be exploited because of the intervening transfers of control. Some of these optimizations are performed automatically, while others require knowledge about the behaviour of the virtual machine that is not available to the compiler. Using a timing-based multicode selection strategy that included the impact of these optimizations in the selection process resulted in application run times that were less than 70 percent of the original run time for the _222_mpegaudio benchmark.
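
One kind of opportunity that opens up once the intervening transfers are gone can be sketched as follows. The example assumes a hypothetical multicode that loads two locals, adds them, and stores the result; the class and method names are illustrative only. The first method is the plain concatenation of the four bytecode bodies, and the second is a hand-simplified equivalent in which the operand stack traffic has been eliminated. Whether a given compiler performs this simplification automatically is not claimed here.

```java
// Illustrative only: two equivalent bodies for a hypothetical load-load-add-store multicode.
class MulticodeBody {

    // Straight concatenation of the four bodies: operands travel via the operand stack.
    static void concatenated(int[] locals, int[] stack, int x, int y, int z) {
        int sp = 0;
        stack[sp++] = locals[x];   // load x
        stack[sp++] = locals[y];   // load y
        int b = stack[--sp];       // add: pop both operands
        int a = stack[--sp];
        stack[sp++] = a + b;       //      push the sum
        locals[z] = stack[--sp];   // store z
    }

    // Simplified body: the pushes and pops cancel, leaving a single assignment.
    static void simplified(int[] locals, int x, int y, int z) {
        locals[z] = locals[x] + locals[y];
    }

    public static void main(String[] args) {
        int[] first = {3, 4, 0};
        int[] second = {3, 4, 0};
        concatenated(first, new int[4], 0, 1, 2);
        simplified(second, 0, 1, 2);
        System.out.println(first[2] + " " + second[2]);
    }
}
```

The simplification is only valid because no other bytecode can observe the operand stack between the four original operations, which is exactly what grouping them under one opcode guarantees.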
