13.10.2014 Views

OPTIMIZING THE JAVA VIRTUAL MACHINE INSTRUCTION SET BY ...


number of bytecodes that will be impacted by performing the substitution. However, this algorithm can use any scoring system that assigns a non-negative integer number to each candidate sequence, with better sequences receiving a higher number than poorer sequences. The overall effectiveness of the multicode identification algorithm is based on the accuracy of the scoring function. Section 7.3.5 describes one of many possible alternative scoring functions that could be employed.
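As a minimal sketch of how a pluggable scoring function might drive selection, the Python below assumes that the default score is the sequence length multiplied by its occurrence count, which is one reading of "number of bytecodes that will be impacted". The names impacted_bytecodes and best_candidate, the dictionary layout, and the restriction to sequences of at least two bytecodes are illustrative assumptions, not taken from the original figures.

```python
def impacted_bytecodes(seq, count):
    """Assumed default score: sequence length times occurrence count."""
    return len(seq) * count


def best_candidate(counts, score=impacted_bytecodes):
    """Return (sequence, score) for the highest-scoring candidate.

    counts maps a tuple of opcode names to a non-negative occurrence count.
    Any score function returning a non-negative integer can be supplied.
    Returns (None, -1) when there is no candidate of length two or more.
    """
    best, best_score = None, -1
    for seq, count in counts.items():
        if len(seq) < 2:  # assumption: a multicode replaces two or more bytecodes
            continue
        s = score(seq, count)
        if s > best_score:  # ties keep the candidate seen first
            best, best_score = seq, s
    return best, best_score


if __name__ == "__main__":
    counts = {
        ("aload_0", "getfield"): 10,
        ("aload_0", "getfield", "iadd"): 8,
        ("iadd",): 100,
    }
    print(best_candidate(counts))
    # A different scoring function changes which sequence is preferred.
    print(best_candidate(counts, score=lambda seq, count: count))
```

Because the selection loop only compares scores, replacing the scoring function does not require any change to the rest of the sketch.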

As was observed in the previous section, once a multicode has been identified, the set of candidate sequences must be recomputed before the next multicode can be identified. However, this recomputation can be done incrementally and quite efficiently. In particular, a set of update operations is performed that removes occurrences of the target candidate sequence from the candidate sequence data structure. In addition, all occurrences of sequences that either partially or completely overlap with this sequence must also be removed from future consideration. Most importantly, the counts of remaining candidate sequences must be adjusted to reflect the removal of the target candidate sequence. The algorithm used to accomplish these tasks is described in Figure 7.5.
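The layout of the candidate sequence data structure is not specified in this passage. One simple possibility, assumed here purely for illustration, is a map from every contiguous bytecode sequence to the number of times it occurs. The function count_candidates, the blocks mapping, and the max_length parameter are hypothetical names introduced for this sketch.

```python
from collections import Counter


def count_candidates(blocks, max_length=None):
    """Count every contiguous subsequence of every block.

    blocks maps a tuple of opcode names to the number of times that block
    was executed (an assumed input format). Each subsequence is counted
    once per position at which it occurs, weighted by the execution count.
    max_length optionally caps the candidate length; None means no cap.
    """
    counts = Counter()
    for block, executions in blocks.items():
        limit = len(block) if max_length is None else max_length
        for i in range(len(block)):
            for j in range(i + 1, min(len(block), i + limit) + 1):
                counts[block[i:j]] += executions
    return counts


if __name__ == "__main__":
    blocks = {("iload", "iconst", "iadd"): 10, ("iconst", "iadd"): 5}
    for seq, n in count_candidates(blocks).items():
        print(seq, n)
```

Under this representation, a count already includes occurrences that lie inside longer sequences, which is why selecting one sequence forces the counts of its neighbours to be adjusted.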

The algorithm identifies all of the sequences which contain at least one occurrence of candidate sequence BestSeq, presumably identified previously using the algorithm presented in Figure 7.4. Each sequence containing BestSeq is processed, starting with the longest sequence and proceeding to shorter and shorter sequences. When a sequence contains BestSeq, the count for that sequence, e, and the counts for all subsequences of e, denoted by e′, are reduced by the count for e. However, while this successfully removes all of the bytecodes used to represent occurrences of BestSeq, it also reduces the counts for some subsequences of e that are not impacted by the selection of BestSeq.
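The reduction just described, together with the prefix and suffix correction that the text turns to next, can be sketched in Python as follows. This is not a transcription of Figure 7.5. It assumes that counts is a Counter keyed by tuples of opcode names, that every contiguous subsequence of a counted sequence is itself counted once per position, and that best_seq is non-empty. The names subsequences, first_index and remove_best_seq are hypothetical.

```python
from collections import Counter


def subsequences(seq):
    """Yield every contiguous subsequence of seq, once per starting position."""
    for i in range(len(seq)):
        for j in range(i + 1, len(seq) + 1):
            yield seq[i:j]


def first_index(seq, target):
    """Return the index of the first occurrence of target in seq, or -1."""
    for i in range(len(seq) - len(target) + 1):
        if seq[i:i + len(target)] == target:
            return i
    return -1


def remove_best_seq(counts, best_seq):
    """Update counts in place after best_seq has been chosen as a multicode."""
    containing = [s for s in counts if first_index(s, best_seq) >= 0]
    containing.sort(key=len, reverse=True)  # process the longest sequences first
    for e in containing:
        c = counts[e]  # read at processing time; earlier steps may have changed it
        if c == 0:
            continue
        # Reduce e and every subsequence of e by the count for e.
        for sub in subsequences(e):
            counts[sub] -= c
        # Add the count back to everything before and after the first occurrence.
        start = first_index(e, best_seq)
        prefix, suffix = e[:start], e[start + len(best_seq):]
        for part in (prefix, suffix):
            for sub in subsequences(part):
                counts[sub] += c
    return counts


if __name__ == "__main__":
    block = ("iload", "iconst", "iadd", "istore")
    counts = Counter()
    for sub in subsequences(block):
        counts[sub] += 10  # assume the block was executed 10 times
    remove_best_seq(counts, ("iconst", "iadd"))
    print({seq: n for seq, n in counts.items() if n})
```

Processing the longest sequences first matters in this sketch: a suffix that still contains BestSeq has its count restored temporarily and is then handled when the shorter sequence is reached.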

In order to correct the counts for those sequences which should not have changed, prefix and suffix sequences are determined which represent those bytecodes that occur before and after the first occurrence of BestSeq in e, respectively. The count that was associated with e is added back to each of the subsequences of the prefix and suffix, resulting in a net change of zero in their counts. As a result, this algorithm successfully updates the data structures generated during the determination of BestSeq, removing all occurrences of that sequence and its subsequences without impacting the counts associated with any other bytecode sequences. When e contains two or more occurrences of BestSeq, the additional occurrences will reside in the suffix sequence. The count for this sequence will be incremented just like any other suffix sequence. This is not a problem because the occurrence of BestSeq contained within the suffix
