30.08.2014 Views

url - Universität zu Lübeck



6.3. INDEX SELECTION PROBLEM APPLIED TO KEYX INDEXES

Definition 22 (Index Candidates)

The index candidates are defined as a function $ican : P \to \mathcal{P}(D)$ that returns the set of all possible index declarations for a given path expression $p$. The following definition combines and permutes the key nodes:

$$ican(p) = \{([k_1, k_2, \ldots, k_m], value(p)) \mid k_j \in key(p) \wedge 1 \le j \le m \wedge 1 \le m \le |key(p)|\}$$

In total we have to consider $\sum_{n=0}^{m} \frac{m!}{(m-n)!} - 1$ different possible indexes, where $m$ in the sum stands for the number of key nodes $|key(p)|$. As most of these indexes are dropped during ISP calculation, we call them index candidates of the path expression $p$. Index candidates are virtual and not materialized in the database.
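As a quick check of the count, the sum can be evaluated by hand for a path expression with two key nodes. The reading of the $n = 0$ term as the empty key list, which the trailing $-1$ removes, is an interpretation rather than something the definition states explicitly.

```latex
\[
\sum_{n=0}^{2} \frac{2!}{(2-n)!} - 1
  = \frac{2!}{2!} + \frac{2!}{1!} + \frac{2!}{0!} - 1
  = 1 + 2 + 2 - 1
  = 4
\]
```

The two middle terms count the single-key and the two-key orderings respectively.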

Example 16 A query $o$ with the path expression

$p_5$ = /dblp/article[author = "X" and title = "Y"]

has the following key and value nodes:

$key(p_5)$ = {/dblp/article/author, /dblp/article/title}

$value(p_5)$ = /dblp/article

All index candidates are listed below. As the order of key nodes matters, the first two index candidates of $ican(p_5)$ are not equivalent.

$ican(p_5) = \{i^1_{p_5}, i^2_{p_5}, i^3_{p_5}, i^4_{p_5}\}$ with

$i^1_{p_5}$ = ([/dblp/article/author, /dblp/article/title], /dblp/article)

$i^2_{p_5}$ = ([/dblp/article/title, /dblp/article/author], /dblp/article)

$i^3_{p_5}$ = ([/dblp/article/author], /dblp/article)

$i^4_{p_5}$ = ([/dblp/article/title], /dblp/article)
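The four candidates listed above can be reproduced mechanically. The following is a minimal sketch, not the thesis's implementation: the function name `ican` is borrowed from Definition 22, its Python signature is hypothetical, and it assumes that each key node appears at most once per candidate, as in the example.

```python
from itertools import permutations


def ican(key_nodes, value_node):
    """Enumerate every ordered, non-empty selection of distinct key nodes.

    Assumption: a key node occurs at most once per candidate.
    """
    candidates = []
    # Longest key lists first, so multi-key candidates precede single-key ones.
    for length in range(len(key_nodes), 0, -1):
        for keys in permutations(key_nodes, length):
            candidates.append((list(keys), value_node))
    return candidates


key_p5 = ["/dblp/article/author", "/dblp/article/title"]
value_p5 = "/dblp/article"

ican_p5 = ican(key_p5, value_p5)
for number, (keys, value) in enumerate(ican_p5, start=1):
    print(f"i{number}: ({keys}, {value})")
```

Because `permutations` treats order as significant, the author-then-title and title-then-author candidates come out as two separate entries.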

The two multi-key indexes $i^1_{p_5}$ and $i^2_{p_5}$ are the most suitable indexes for $p_5$, as they cover both key nodes. In contrast, the two single-key indexes $i^3_{p_5}$ and $i^4_{p_5}$ require additional processing of the referenced nodes, but are still more efficient than evaluating the plain path expression without an index. Note that the number of index candidates grows factorially, and thus faster than exponentially, with the number of key nodes of a path expression, which dramatically increases the cost of solving the ISP: a path expression with 4 key nodes already leads to 64 index candidates. Heuristics that decrease the computational expense have to start at this point by reducing the number of index candidates.
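To get a feel for this growth, the count from Definition 22 can be evaluated for small numbers of key nodes. This is a small sketch; the function name `candidate_count` is hypothetical and does not come from the source.

```python
from math import factorial


def candidate_count(m):
    """Evaluate sum_{n=0}^{m} m!/(m-n)! - 1 for m key nodes."""
    return sum(factorial(m) // factorial(m - n) for n in range(m + 1)) - 1


# Print the number of index candidates for 1 to 6 key nodes.
for m in range(1, 7):
    print(m, candidate_count(m))
```

For two key nodes the function returns 4 and for four key nodes 64, which agrees with the figures in the text.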

To consider the whole workload, we need to take into account the index candidates of all database operations $o \in W$. This is done by forming the union of the index candidates of all operations, which yields the total index candidate set.
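The union step can be sketched as follows. This is an illustration under assumptions rather than the thesis's definition. Each operation is reduced to the key and value nodes of its path expression, the second workload entry is invented for the example, and `ican` is a hypothetical helper that returns hashable candidates so that duplicates collapse in a Python set.

```python
from itertools import permutations


def ican(key_nodes, value_node):
    """Index candidates of one path expression as hashable (keys, value) pairs."""
    return {
        (keys, value_node)
        for length in range(1, len(key_nodes) + 1)
        for keys in permutations(key_nodes, length)
    }


# Hypothetical workload W: (key nodes, value node) per database operation.
workload = [
    (("/dblp/article/author", "/dblp/article/title"), "/dblp/article"),
    (("/dblp/article/author",), "/dblp/article"),  # invented second operation
]

# Union of the candidate sets of all operations.
total_candidates = set()
for key_nodes, value_node in workload:
    total_candidates |= ican(key_nodes, value_node)

print(len(total_candidates))
```

The single-key candidate of the second operation already occurs among the candidates of the first one, so the union holds four candidates, not five.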
