url - Universität zu Lübeck
Chapter 5<br />
The Key-Oriented XML Index<br />
KeyX<br />
In this chapter we introduce a new approach for indexing XML data, both formally and<br />
by example. Our approach, called KeyX, is motivated by the selective index<br />
structures used in the relational world. A relational index is defined on<br />
a specific table and one or more of its columns. Only queries that operate on these<br />
columns can be accelerated by the index. A relational index is therefore selective<br />
with respect to specific queries.<br />
Like relational indexes, KeyX is based on keys - the values of elements and attributes<br />
which are accessed by a specific path expression. The path expression<br />
can be part of an XQuery or XUpdate operation.<br />
For a set of frequent queries¹, the relevant keys are extracted from the original<br />
XML data and stored in a search structure optimized for efficient key retrieval.<br />
Such search structures include hash tables, tries, binary search trees, B+ trees<br />
for disk-resident indexes, or any other data structure capable of storing<br />
and retrieving keys.<br />
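The key extraction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the formal KeyX definition: the sample document, the function name `build_key_index`, and the choice of a dictionary (a hash table) as the search structure are all hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample data; not taken from the original text.
SAMPLE = """
<library>
  <book id="b1"><title>XML Indexing</title><year>2004</year></book>
  <book id="b2"><title>Query Processing</title><year>2002</year></book>
  <book id="b3"><title>XML Indexing</title><year>2006</year></book>
</library>
"""

def build_key_index(root, target_path, key_path):
    """Map each key value to the target elements that carry it.

    target_path selects the elements a query should return;
    key_path selects, relative to each target, the node whose text is the key.
    """
    index = defaultdict(list)  # hash table: key value -> list of elements
    for target in root.findall(target_path):
        for key_node in target.findall(key_path):
            index[(key_node.text or "").strip()].append(target)
    return index

root = ET.fromstring(SAMPLE)
title_index = build_key_index(root, "./book", "./title")

# One hash lookup replaces a scan over all <book> elements.
hits = [book.get("id") for book in title_index["XML Indexing"]]
```

A dictionary supports only exact-match lookups; a sorted structure such as a B+ tree would be the natural choice if range predicates on the key had to be supported as well.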
An index is defined by the 'shape' of the path expression to be optimized. After<br />
the index has been materialized, further queries with a matching shape are answered<br />
by the index, with logarithmic instead of linear complexity. For real databases<br />
several megabytes in size, a suitable set of indexes yields a speedup<br />
of many orders of magnitude.<br />
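To make the complexity argument concrete, the following sketch contrasts a linear scan with a lookup in a sorted key array, which stands in here for a search tree. The shape notation `/library/book[year = ?]`, the class `SortedKeyIndex`, and the records are assumptions made for illustration; the formal notion of shape is the one introduced in this chapter, not this string.

```python
import bisect

# Hypothetical (key, node id) pairs, as if extracted from /library/book/year.
records = [(2004, "b1"), (2002, "b2"), (2006, "b3"), (2004, "b4")]

class SortedKeyIndex:
    """Sorted-array stand-in for a search tree; lookups use binary search."""

    def __init__(self, shape, pairs):
        self.shape = shape                 # the query shape this index serves
        self.entries = sorted(pairs)       # materialize once, sorted by key
        self.keys = [k for k, _ in self.entries]

    def lookup(self, key):
        # Two binary searches locate the run of equal keys: O(log n) steps.
        lo = bisect.bisect_left(self.keys, key)
        hi = bisect.bisect_right(self.keys, key)
        return [node for _, node in self.entries[lo:hi]]

def linear_scan(pairs, key):
    # Without an index every record has to be inspected: O(n) steps.
    return [node for k, node in pairs if k == key]

SHAPE = "/library/book[year = ?]"  # assumed notation: path with the constant removed
index = SortedKeyIndex(SHAPE, records)

def answer(query_shape, constant):
    """Use the index only when the query has the shape it was built for."""
    if query_shape == index.shape:
        return index.lookup(constant)
    return None  # no matching index; an engine would fall back to scanning the data
```

Here `answer(SHAPE, 2004)` and `linear_scan(records, 2004)` return the same nodes, but the indexed variant inspects only a logarithmic number of keys, while a query of a different shape is not served by this index at all.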
KeyX can also be used to accelerate specific navigational queries. In contrast to<br />
structural summaries like Strong DataGuides and APEX, our indexing approach<br />
is defined for a set of frequent navigational queries¹. A selective structure index<br />
consumes less space and can be tuned with respect to update costs.<br />
In the following we introduce KeyX formally and by example. We demonstrate the quality<br />
of this approach with performance measurements.<br />
¹ Frequent queries can be defined by a database administrator or by tools that analyze the workload
of the database.