url - Universität zu Lübeck
Chapter 4<br />
Introduction to Recent<br />
Approaches in XML Indexing<br />
In this chapter we classify and describe recent approaches to indexing XML and<br />
semistructured data. Some approaches were published before XML gained its<br />
current importance and operate on semistructured data in general. We transferred<br />
these approaches to XML.<br />
The basic idea of an index for semistructured data and XML is to accelerate the<br />
execution of path expressions, for instance XPath. The more complex XQuery<br />
expressions benefit from an index, too, because XQuery relies on the execution of<br />
XPath expressions for addressing the nodes of the sequences.<br />
All indexing approaches have in common that they try to avoid the linear inspection<br />
of XML nodes when performing node tests or checking predicates. Without an index,<br />
evaluating the XPath expression //item[name='MP3 Player'] proceeds as follows. First,<br />
every element is tested for the label item. Second, for each item<br />
element all children are checked for the label name. Third, for each<br />
name element the corresponding text value is compared with the given string. For<br />
larger databases this evaluation method leads to unacceptable processing times.<br />
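
The three steps above can be sketched as a linear scan. This is only an illustration of index-free evaluation, not code from any of the surveyed approaches: the sample document, the function name naive_scan, and the inspected counter are assumptions introduced here, and Python's standard xml.etree.ElementTree module stands in for a real XML database.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, made up for this illustration.
DOC = """
<site>
  <item><name>MP3 Player</name><price>49</price></item>
  <item><name>Radio</name></item>
  <category><name>MP3 Player</name></category>
</site>
"""

def naive_scan(root, label, child_label, value):
    """Evaluate //label[child_label='value'] by inspecting every node."""
    hits = []
    inspected = 0  # number of label tests performed
    for elem in root.iter():             # step 1: node test on every element
        inspected += 1
        if elem.tag != label:
            continue
        for child in elem:               # step 2: test the label of each child
            inspected += 1
            if child.tag != child_label:
                continue
            if child.text == value:      # step 3: compare the text value
                hits.append(elem)
                break
    return hits, inspected

root = ET.fromstring(DOC)
hits, inspected = naive_scan(root, "item", "name", "MP3 Player")
print(len(hits), "match(es) after", inspected, "label tests")
```

The number of label tests grows with the total number of elements in the document, regardless of how few of them match. That cost is what an index is meant to avoid.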
Although all indexing approaches have the same goal, their methodology, the<br />
internal data structures, and the query processing vary significantly. For this<br />
reason we establish some criteria in order to classify and compare the related<br />
work on XML indexing.<br />
Some approaches index only the structure of the XML data and disregard the<br />
values of elements and attributes. These approaches are called structural indexes or<br />
pure-path indexes. Other indexes cover only the values of elements<br />
and attributes without recording the paths leading to these values; these<br />
approaches are called value indexes. Advanced approaches cover both structure<br />
and values leading to an acceleration of more general and realistic path expres-