url - Universität zu Lübeck

7.2. INTERSECTION OF TWO PATH EXPRESSIONS

7.2.2 Automaton for Mod(p)

The general idea of the intersection algorithm is to build two finite automata $A$ and $A'$, with $A$ accepting $Mod(p)$ and $A'$ accepting $Mod(p')$, where $p, p' \in P_{labs}$ are two absolute linear path expressions. Having $A$ and $A'$, we build the product automaton $B$. The emptiness of the intersection of $p$ and $p'$ is a property of $B$.

Unfortunately, finite automata are defined over a finite alphabet. In contrast, path expressions operate on XML data with an infinite alphabet, because the node labels are not limited. In section 3.2.1 we showed that a linear path expression $p \in P_{labs}$ can be transformed into a regular expression $r \in REG_{\Sigma,\alpha}$ with $\Sigma = \Sigma(p)$ and $\alpha \notin \Sigma$ an arbitrary new symbol.

Lemma 3 For any regular expression $r \in REG_{\Sigma,\alpha}$, respectively its language $L_r$, one can construct a finite automaton $A$ that decides whether an input string is a word of $L_r$ or not.

The proof of the lemma is omitted here because it is basic knowledge in the theory of finite automata. It can be found, for instance, in [52].

When reading XML data as input for a finite automaton we face the following problem: the automaton expects a string of symbols in a defined order. In tree-like XML data a node may have several children, so the next symbol (element label) is not defined unambiguously. Therefore, we define a function $path_{leaf} : T \to \mathcal{P}(string)$ that extracts all paths (sequences of nodes) from the root node to each leaf element node in the XML data. Text nodes are ignored, as they are not affected by linear path expressions. The paths are returned as strings built from the labels of the contained nodes. The function is defined as follows:

Definition 28 (Function $path_{leaf}$)

\[ path_{leaf}(t) = path(t.root) \]

\[ path(n) = \begin{cases} n.label & : n.children = \emptyset \\ n.label + \text{";"} + \{path(c) \mid c \in n.children\} & : \text{otherwise} \end{cases} \]

with $t \in T$ and $n \in N$; $+$ denotes the concatenation of strings, with $a + \text{";"} + \{b, c, d\} = \{a;b,\ a;c,\ a;d\}$. The semicolon is a delimiter used to distinguish different element labels (e.g. $a;b \neq ab$).
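Definition 28 can be read as a short recursive procedure. The following is a minimal Python sketch, not the thesis's implementation: the `Node` class is a hypothetical stand-in for an XML element node (text nodes are simply left out), and the set of child paths is read as the union of the children's path sets, which is an interpretation of the notation above. The sample tree is an assumed structure chosen only to be consistent with the labels a, b, c, d.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical element node: a label plus its element children."""
    label: str
    children: list[Node] = field(default_factory=list)


def path(n: Node) -> set[str]:
    """Return all label paths from n down to each leaf below n."""
    if not n.children:
        # Leaf case: the only path is the node's own label.
        return {n.label}
    # Otherwise prefix every path of every child with "label;".
    return {n.label + ";" + p for c in n.children for p in path(c)}


def path_leaf(root: Node) -> set[str]:
    """path_leaf(t) = path(t.root)."""
    return path(root)


# Assumed example tree: a has children b and c, and c has child d.
t = Node("a", [Node("b"), Node("c", [Node("d")])])
print(sorted(path_leaf(t)))  # ['a;b', 'a;c;d']
```

Returning a set rather than a list means that two leaves reached by identical label sequences collapse into one path, which matches the set-valued signature of the function.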
CHAPTER 7. THE XML INDEX UPDATE PROBLEM

Example 22 The XML data $t_1$ with

$t_1$ = [XML listing, lines 1 to 8, with a text node on line 3; markup not preserved]

has the following paths to leaf element nodes: $path_{leaf}(t_1) = \{a;b,\ a;c;d\}$.

Lemma 4 For every $n \in t.nodes$, $n.label$ appears at least once in a path in $path_{leaf}(t)$.

PROOF (by contradiction) Assume that $n$ is a node of $t$ that does not appear in any path in $path_{leaf}(t)$. Because $path_{leaf}(t)$ contains the paths to all leaf nodes, $n$ cannot be a leaf node. Therefore, $n$ must have at least one child node $c$. Either $c$ or one descendant of $c$ is a leaf node, because $t$ is finite and a tree without cycles. Call this leaf $l$. Because each node in $t$ other than the root has exactly one parent, $n$ lies on the path from $l$ to $t.root$ and, in reverse, on the path from $t.root$ to $l$. This path is in $path_{leaf}(t)$, which contradicts the assumption.

Now we are able to extract all paths from the root node to leaf nodes as strings of symbols in $\Sigma_t = \{n.label \mid n \in t.nodes\}$. But because $\Sigma_t$ may contain symbols that are not in $\Sigma(p)$, the automaton defined by the regular expression $r \in REG_{\Sigma,\alpha}$ has no transitions for symbols $s \in \Sigma_t \setminus \Sigma(p)$. With a further function $rename : \mathcal{P}(string) \to \mathcal{P}(string)$ we change all symbols $s \in \Sigma_t \setminus \Sigma(p)$ of the strings in $path_{leaf}(t)$ to $\alpha$.

Definition 29 (Renaming function)

\[ rename(S, \Sigma, \alpha) = \{rename(s) \mid s \in S\} \]

\[ rename(s) = rename_s(s_1) + rename_s(s_2) + \dots + rename_s(s_n) \]

\[ rename_s(s) = \begin{cases} s & : s \in \Sigma \\ \alpha & : \text{otherwise} \end{cases} \]

with $S$ a set of strings and $s$ one string consisting of the sequence of $n$ symbols $s_1, s_2, \dots, s_n$.

Technically, the function $rename$ is a homomorphism that substitutes a particular string for each symbol. The resulting strings have an alphabet restricted to $\Sigma(p) \cup \{\alpha\}$ and can be processed by a finite automaton.

Next, we show that XML data $t$ is in $Mod(p)$ if and only if at least one path of $t$ is a word of $L_r$. Formally this means:
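As a side illustration of Definition 29, the renaming step can be sketched in Python. This is a minimal sketch under stated assumptions rather than the thesis's code: strings are assumed to be label sequences delimited by ";" as in Definition 28, the function names `rename_symbol` and `rename_string` are hypothetical, and "#" is used as a stand-in for the new symbol α because it cannot occur as an XML element name.

```python
def rename_symbol(symbol: str, sigma: set[str], alpha: str) -> str:
    """rename_s: keep a symbol that is in sigma, map any other symbol to alpha."""
    return symbol if symbol in sigma else alpha


def rename_string(s: str, sigma: set[str], alpha: str) -> str:
    """Apply rename_symbol to each ';'-delimited symbol and rejoin the result."""
    return ";".join(rename_symbol(x, sigma, alpha) for x in s.split(";"))


def rename(paths: set[str], sigma: set[str], alpha: str) -> set[str]:
    """rename(S, Sigma, alpha): rename every string in the set S."""
    return {rename_string(s, sigma, alpha) for s in paths}


# Assumed inputs: the two paths of Example 22 and a hypothetical
# alphabet Sigma(p) = {a, d}; "#" stands in for alpha.
paths = {"a;b", "a;c;d"}
sigma = {"a", "d"}
print(sorted(rename(paths, sigma, "#")))  # ['a;#', 'a;#;d']
```

After this step every string uses only symbols from the finite set sigma plus the stand-in for α, so an automaton built for that alphabet has a transition defined for each symbol it reads.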