13.10.2014 Views

OPTIMIZING THE JAVA VIRTUAL MACHINE INSTRUCTION SET BY ...

OPTIMIZING THE JAVA VIRTUAL MACHINE INSTRUCTION SET BY ...

OPTIMIZING THE JAVA VIRTUAL MACHINE INSTRUCTION SET BY ...

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

200<br />

9.2 Related String Problems<br />

The most contributory substring problem is related to two other well known string<br />

problems: the Greatest (or Longest) Common Substring Problem and the All Maximal<br />

Repeats Problem (AMRP). However it is distinct from each of these.<br />

While the Greatest Common Substring Problem identifies the longest substring<br />

common across all strings in a set, the Most Contributory Substring Problem identifies<br />

the string that contributes the greatest number of characters to the set of strings. One<br />

important distinction between these two problems is that there is no guarantee that<br />

the most contributory substring will occur in every string in the set while the greatest<br />

common substring problem guarantees that the substring identified is a substring<br />

of every string in the set. The greatest common substring problem also ignores<br />

multiple occurrences of the same substring within each string in the set while the<br />

Most Contributory Substring Problem counts all occurrences of the substring.<br />

A variation on the Greatest Common Substring Problem has also been presented<br />

that identifies the longest common substring to k strings in a set [56]. This problem<br />

is also distinct from the Most Contributory Substring Problem because it fails to<br />

consider the value of multiple occurrences of a substring within one string in the<br />

input set.<br />

The All Maximal Repeats Problem is concerned with finding a substring that<br />

occurs twice within a single larger string such that the characters to the immediate<br />

left and right of the first occurrence of the string are distinct from the character to<br />

the immediate left and right of the second occurrence. This is distinct from the Most<br />

Contributory Substring Problem which is intended to be applied to a set of strings.<br />

The AMRP is also concerned only with finding a substring that occurs twice while<br />

the most contributory substring may occur many times.<br />

9.3 Suffix Trees<br />

A substring w of a string s is defined to be a sequence of characters w 0 ...w n such that<br />

s i = w 0 , s i+1 = w 1 , ...s i+n = w n for some integer i. Equivalently, a substring w of a<br />

string s can be defined as a string for which there exist strings (possibly empty) p<br />

and q such that s = pwq. A suffix, w of a string s is defined to be a non empty string<br />

meeting the constraint s = pw for some (possibly empty) string p.<br />

Let L represent the set of m strings under consideration. A specific string within<br />

the set will be denoted as L i such that 0 ≤ i < m. Let Σ 0 ...Σ m−1 represent the

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!