Applying OLAP Pre-Aggregation Techniques to ... - Jacobs University

5. Pre-Aggregation Support Beyond Basic Aggregate Operations

approach, but it is considered negligible since approximations are good enough for many applications. In our approach, when two or more pre-aggregates qualify for computing a given scaling operation, we pick the pre-aggregate whose scale vector is closest to the one defined in the scaling operation.

Example 5.1 – Assume the queries listed in Table 5.1 have been pre-aggregated, and suppose we want to compute the following query: q = scale(ras01, (4.0, 4.0, 4.0), bi). From the list of available pre-aggregates, the query can be answered using either p2 or p3. Of these two pre-aggregates, p3 has the scale vector closest to that of q. Thus, q′ = scale(p3, (0.87, 0.87, 0.87), bi). Note that q′ is the scaling operation rewritten in terms of the pre-aggregate. ✷

Table 5.1. Sample Pre-Aggregates.

Raster Object ID   Raster Name   Scale Vector      Resampling Method
p1                 ras01         (2.0, 2.0, 2.0)   nn
p2                 ras01         (3.0, 3.0, 3.0)   bi
p3                 ras01         (3.5, 3.5, 3.5)   bi
p4                 ras01         (6.0, 6.0, 6.0)   bi

The REWRITEOPERATION procedure returns, for a query q, a query q′ that has been rewritten in terms of a pre-aggregate identified by p_id. The input of the algorithm is the scaling operation q and a set of pre-aggregates P. The algorithm first looks for a perfect match between q and one of the elements in P (FULLMATCH). To this end, it verifies that the matching conditions listed in Section 5.2.2 are all satisfied. If a perfect match is found, it returns the identifier of the matched pre-aggregate. Otherwise, the algorithm verifies the partial-match conditions (PARTIALMATCH) for all pre-aggregates in P, and adds every qualified pre-aggregate to the set S. In the case of a partial match, the algorithm finds the pre-aggregate in S whose scale vector is closest to the one defined in q. REWRITEQUERY then rewrites the original query as a function of the selected pre-aggregate, and adjusts the values of the scale vector to perform the complementary scaling operation.
The algorithm makes use of the following auxiliary functions.

• FULLMATCH(q, P). Verifies that all full-match conditions are satisfied. If no match is found, it returns 0; otherwise it returns the id of the matching pre-aggregate.
• PARTIALMATCH(q, P). Verifies that all partial-match conditions are satisfied. Each qualified pre-aggregate of P is added to the set S.
• CLOSESTSCALEVECTOR(q, S). Compares the scale vector of q with those of the elements of S, and returns the identifier (p_id) of the pre-aggregate whose scale vector is closest to that defined for q.
• REWRITEQUERY(q, p_id). Rewrites query q in terms of the selected pre-aggregate and adjusts the scale vector values accordingly.
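The procedure and its four auxiliary functions can be sketched in Python as follows. This is a minimal illustration, not the thesis implementation. The matching conditions of Section 5.2.2 are not reproduced in this excerpt, so the sketch assumes them: a full match requires the same raster, resampling method, and scale vector; a partial match requires the same raster and resampling method and a scale vector that does not exceed the query's in any dimension. The Euclidean distance used for closeness, the component-wise ratio used for the adjusted scale vector, the use of `None` instead of 0 for "no match", and all class and field names are likewise assumptions made for the example.

```python
from dataclasses import dataclass
from math import dist
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class PreAggregate:
    pid: str                     # identifier, e.g. "p3"
    raster: str                  # name of the raster it was computed from
    scale: Tuple[float, ...]     # scale vector used to build it
    method: str                  # resampling method, e.g. "bi" or "nn"


@dataclass(frozen=True)
class ScaleQuery:
    source: str                  # raster name or pre-aggregate id
    scale: Tuple[float, ...]
    method: str


def full_match(q: ScaleQuery, P: List[PreAggregate]) -> Optional[str]:
    # Assumed condition: same raster, same method, identical scale vector.
    for p in P:
        if (p.raster, p.method, p.scale) == (q.source, q.method, q.scale):
            return p.pid
    return None


def partial_match(q: ScaleQuery, P: List[PreAggregate]) -> List[PreAggregate]:
    # Assumed condition: same raster and method, and the pre-aggregate's
    # scale vector does not exceed the query's in any dimension.
    return [p for p in P
            if p.raster == q.source and p.method == q.method
            and all(ps <= qs for ps, qs in zip(p.scale, q.scale))]


def closest_scale_vector(q: ScaleQuery, S: List[PreAggregate]) -> Optional[str]:
    # Closeness measured as Euclidean distance (an assumption).
    if not S:
        return None
    return min(S, key=lambda p: dist(p.scale, q.scale)).pid


def rewrite_query(q: ScaleQuery, pid: Optional[str],
                  P: List[PreAggregate]) -> ScaleQuery:
    # With no usable pre-aggregate, fall back to the original query.
    if pid is None:
        return q
    p = next(x for x in P if x.pid == pid)
    # Complementary scaling: component-wise ratio, as in Example 5.1.
    adjusted = tuple(ps / qs for ps, qs in zip(p.scale, q.scale))
    return ScaleQuery(pid, adjusted, q.method)


def rewrite_operation(q: ScaleQuery, P: List[PreAggregate]) -> ScaleQuery:
    pid = full_match(q, P)
    if pid is None:
        S = partial_match(q, P)
        pid = closest_scale_vector(q, S)
    return rewrite_query(q, pid, P)


# Pre-aggregates of Table 5.1 and the query of Example 5.1.
TABLE_5_1 = [
    PreAggregate("p1", "ras01", (2.0, 2.0, 2.0), "nn"),
    PreAggregate("p2", "ras01", (3.0, 3.0, 3.0), "bi"),
    PreAggregate("p3", "ras01", (3.5, 3.5, 3.5), "bi"),
    PreAggregate("p4", "ras01", (6.0, 6.0, 6.0), "bi"),
]
q = ScaleQuery("ras01", (4.0, 4.0, 4.0), "bi")
rewritten = rewrite_operation(q, TABLE_5_1)
print(rewritten)
```

Under these assumptions the sketch selects p3 for the query of Example 5.1, with p2 as the only other partial match, and computes an adjusted scale factor of 3.5 / 4.0 = 0.875 per dimension, which the example reports as 0.87.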
Algorithm 4 REWRITEOPERATION
Require: A query q, and a set of pre-aggregates P
initialize S = {}, p_id = 0
p_id = fullMatch(q, P)
if (p_id == 0) then
    S = partialMatch(q, P)
    p_id = closestScaleVector(q, S)
end if
q′ = rewriteQuery(q, p_id)
return q′

5.5 Experimental Results

Experiments were conducted to evaluate the effectiveness of the pre-aggregation selection and rewriting algorithms in supporting scaling operations. They were run on a machine with a 3.00 GHz Intel Pentium 4 processor and 512 MB of physical memory, running SuSE Linux 9.1. The query workload consisted of scaling operations with different scale vectors. Different data distributions of the query workload were also considered.

Despite the growing popularity of Web mapping services for GIS raster information processing, very few studies have been undertaken that report on user behavior while using those services. One of the primary reasons for the lack of research in this area may be the limited availability of the datasets outside of specialized research groups. Moreover, while query patterns related to scaling operations on 2D datasets are difficult to find, no empirical workload distributions at all were found for datasets of higher dimensionality. We therefore resorted to using a set of artificial distributions that cover many practical situations in GIS and remote-sensing imaging.

Most pre-aggregation algorithms in OLAP and image pyramids assume a uniform distribution of the values given for the scale vector in the query workload, so we considered the same type of distribution for our experiments. Furthermore, we also considered a Poisson distribution of the scale vector values. The rationale is that such a distribution covers situations where the dataset is scaled down by factors that typically fall within a narrow range of scale vectors.
For example, very large objects may need to be scaled down by large scale vectors so they can be efficiently transferred back and forth via Web services [77]. We also considered applications where the dataset is always scaled down by the same scale vector; we refer to such an access pattern as a peak distribution. Finally, we investigated a step distribution, which covers cases where scaling operations can be grouped within specific ranges of scale vectors.

Our experiments were performed on datasets generated from three real-life raster objects:

• Dataset R1. Consists of a 2D raster object with spatial domain [0 : 15359, 0 : 10239]. The dataset contains 600 tiles, each with a spatial domain of [0 : 511, 0 : 511]. The raster object comprises a total of 157 million cells.
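The figures given for Dataset R1 can be cross-checked with a few lines of arithmetic. The sketch below assumes inclusive domain bounds and tiles of 512 × 512 cells, which is the tile size that the stated count of 600 tiles implies; the helper name `extent` is introduced only for this example.

```python
from math import prod


def extent(domain):
    """Number of cells per axis for inclusive bounds [(lo, hi), ...]."""
    return tuple(hi - lo + 1 for lo, hi in domain)


r1_domain = [(0, 15359), (0, 10239)]   # spatial domain of Dataset R1
tile_edge = 512                        # assumed: 512 cells per tile axis

cells = prod(extent(r1_domain))
# Ceiling division, so that partially filled border tiles would be counted.
tiles = prod(-(-n // tile_edge) for n in extent(r1_domain))

print(extent(r1_domain), cells, tiles)
```

The domain spans 15360 × 10240 cells, that is 157,286,400 cells in total, and divides evenly into 30 × 20 = 600 tiles, in line with the numbers reported for R1.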