url - Universität zu Lübeck

CHAPTER 5. THE KEY-ORIENTED XML INDEX KEYX

5.4. QUERY PROCESSING

In particular, there are only 41 different values for the year element of the books. The selectivity of 1.0 for the path expression /dblp/article/year indicates that all values are the same. The selectivities of the other elements that have values in this DBLP fragment can be seen in the figure.

The values r and sl are used to estimate the number of elements that correspond to a path expression to be evaluated. This number is the product of all r values on the path to the selected element. If the path expression contains a key comparison, the result is additionally multiplied by the corresponding selectivity sl.

Example 14 The path expression /dblp/article/author leads to 1 · 535 · 1.69 ≈ 904 selected elements. The same path expression with a value comparison (/dblp/article/author[. = 'x']) leads to 904 · 0.621 ≈ 561 selected elements. This number is relatively high because the 535 articles in the selected DBLP fragment are written by only 342 different authors. The similar path expression /dblp/article/title[. = 'x'] leads to only 1 · 535 · 1 · 0.002 = 1.07 hits. Therefore, querying an article by its title is more than 300 times faster than querying it by its authors.

If the query contains a wildcard (*) or the descendant axis (//), multiple extents must be considered. Their r values are summed to obtain the final result. With a key in the path expression, the r values of the affected extents have to be weighted by the selectivity sl before they are summed.

Example 15 The path expression /dblp/*/url affects the three url extents of books, articles and inproceedings. We calculate their cardinalities independently and sum them afterwards. The result of this path expression therefore has the expected cardinality 1 · 540 · 0.18 + 1 · 535 · 0.04 + 1 · 26565 · 1.0 ≈ 26684.
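The arithmetic of Examples 14 and 15 can be retraced with a small Python sketch. It is only an illustration under stated assumptions, not the KeyX implementation: the dictionary layout, the function names, and the element name book are hypothetical. It reads r as the average number of elements per parent element, and it assigns the counts 540 and 26565 to the book and inproceedings extents by following the order in which Example 15 lists them. Only the child axis and the wildcard are handled; the descendant axis is left out.

```python
from math import prod

# Hypothetical in-memory statistic DataGuide: path -> (r, sl).
# r  : read here as the average number of elements per parent element
# sl : selectivity of a value comparison on the element (None if unknown)
# The numbers are the ones quoted in Examples 14 and 15.
DATAGUIDE = {
    "/dblp": (1, None),
    "/dblp/article": (535, None),
    "/dblp/article/author": (1.69, 0.621),
    "/dblp/article/title": (1, 0.002),
    "/dblp/article/url": (0.04, None),
    "/dblp/book": (540, None),          # element name "book" is an assumption
    "/dblp/book/url": (0.18, None),
    "/dblp/inproceedings": (26565, None),
    "/dblp/inproceedings/url": (1.0, None),
}


def prefixes(path):
    """Return all prefixes of a path, e.g. /a/b -> ['/a', '/a/b']."""
    steps = path.strip("/").split("/")
    return ["/" + "/".join(steps[:i]) for i in range(1, len(steps) + 1)]


def matches(pattern, path):
    """Child-axis matching where '*' stands for exactly one step."""
    p_steps = pattern.strip("/").split("/")
    steps = path.strip("/").split("/")
    return len(p_steps) == len(steps) and all(
        p == "*" or p == s for p, s in zip(p_steps, steps)
    )


def estimate(pattern, value_comparison=False):
    """Estimate the cardinality of a path expression.

    For every matching extent, multiply the r values along its path and,
    if a value comparison is present, weight the product by sl. The
    per-extent results are summed afterwards.
    """
    total = 0.0
    for path, (_, sl) in DATAGUIDE.items():
        if not matches(pattern, path):
            continue
        card = prod(DATAGUIDE[p][0] for p in prefixes(path))
        if value_comparison:
            if sl is None:
                raise ValueError(f"no selectivity recorded for {path}")
            card *= sl
        total += card
    return total


print(round(estimate("/dblp/article/author")))                         # 904
print(round(estimate("/dblp/article/author", value_comparison=True)))  # 561
print(round(estimate("/dblp/*/url")))                                  # 26684
```

Weighting each extent by its own sl before the summation mirrors the rule stated above for path expressions that combine a key with a wildcard.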
The statistic DataGuide is a relatively simple approach that is, in most cases, sufficient and efficient for estimating the cardinality of the elements selected by a path expression. The estimated value can be used in cost models for indexes and conventional XPath evaluation.

The approach assumes statistical independence between the elements in the XML data. If elements are statistically dependent, for instance because they are mutually exclusive, the statistic DataGuide loses precision. For instance, an element X has two children a and b. Half of the X elements have exactly one a child and the other half have exactly one b child, so no X element has both an a and a b child. The statistic DataGuide would assign r = 0.5 to both the a and the b extent. The path expression //X[a and b] would then lead to an estimated cardinality of |X| · 0.5 · 0.5 = |X|/4, indicating that a quarter of the X elements have both an a and a b child, although the true cardinality is 0.
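The following Python sketch replays this counterexample on synthetic data. The sample of 100 X elements and all variable names are made up for illustration; the point is only to contrast the estimate under the independence assumption with the true count.

```python
# Synthetic sample: 100 X elements, each modelled as the set of its child
# names. Even positions get exactly one a child, odd positions exactly one b.
x_elements = [{"a"} if i % 2 == 0 else {"b"} for i in range(100)]
n = len(x_elements)

# r values as a statistic DataGuide would record them:
# average number of a (or b) children per X element.
r_a = sum("a" in children for children in x_elements) / n
r_b = sum("b" in children for children in x_elements) / n

# Estimate for //X[a and b] under the independence assumption.
estimated = n * r_a * r_b

# True cardinality: X elements that really have both children.
actual = sum(("a" in c and "b" in c) for c in x_elements)

print(r_a, r_b)    # 0.5 0.5
print(estimated)   # 25.0, a quarter of the X elements
print(actual)      # 0
```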
For this reason the statistic DataGuide is an early approach that needs further refinement. An important open question is how the statistics can be updated when the underlying XML data is modified. However, the problem of rating different indexes with additional statistical information occurs only if multiple indexes are able to execute the query. Therefore, a large number of database applications will probably perform well even without any statistical ranking of indexes.

5.4.5 Algorithm for the Query Execution

The query execution is organized in three phases: the selection of indexes, the key retrieval and the optional postprocessing. Figure 5.8 illustrates these phases. In the following we describe the phases with pseudo code.

Figure 5.8: The three phases of the KeyX query execution

1. Phase 1: Index Selection
The first phase is responsible for determining an index j that matches the query q. First, the algorithm extracts the path expressions for the keys, qualifiers and return values of the index declaration and the query. If all path expressions are equal, an optimal index is found and returned. If no such index exists, the algorithm tries to find indexes that can be used due to the containment relationship. If only one such index is found, it is returned immediately. If multiple indexes are found, the algorithm requests advice from the statistic DataGuide.
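The control flow of Phase 1 might be sketched in Python as follows. This is not the pseudo code of the thesis: the Declaration type, the select_index function, and the two callables contains and estimate_cost are hypothetical stand-ins. In particular, the sketch assumes that the advice from the statistic DataGuide amounts to picking the candidate with the lowest estimated cost, and that no result is returned when no index matches; neither detail is specified in the text above.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Declaration:
    """Path expressions of an index declaration or of a query (hypothetical)."""
    keys: Tuple[str, ...]
    qualifiers: Tuple[str, ...]
    returns: Tuple[str, ...]


def select_index(
    query: Declaration,
    indexes: Sequence[Declaration],
    contains: Callable[[Declaration, Declaration], bool],
    estimate_cost: Callable[[Declaration], float],
) -> Optional[Declaration]:
    """Return an index usable for the query, or None if there is none."""
    # Step 1: an index whose path expressions all equal those of the
    # query is optimal and is returned directly.
    for index in indexes:
        if index == query:
            return index

    # Step 2: otherwise collect the indexes that are usable because of
    # the containment relationship (the test itself is passed in).
    candidates = [index for index in indexes if contains(index, query)]
    if not candidates:
        return None  # assumption: the caller falls back to another plan
    if len(candidates) == 1:
        return candidates[0]

    # Step 3: several candidates remain, so rank them with statistics.
    # Assumption: the cheapest candidate according to estimate_cost wins.
    return min(candidates, key=estimate_cost)
```

Passing the containment test and the cost estimate in as functions keeps the sketch independent of how KeyX actually decides containment or consults its statistics.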