Applying OLAP Pre-Aggregation Techniques to ... - Jacobs University

2. Background and Related Work

HOLAP

The intermediate architecture type, HOLAP, combines the advantages offered by ROLAP and MOLAP. It takes advantage of the standardization level and the ability to manage large amounts of data from ROLAP implementations, and the query speed typical of MOLAP systems. For summary-type information, HOLAP leverages cube technology, and for drilling down into details it uses the ROLAP model. In a HOLAP architecture, the largest amount of data should be stored in an RDBMS to avoid the problems caused by sparsity, and a multidimensional system should store only the information users most frequently need to access [68]. If that information is not enough to answer a query, the system transparently accesses the data managed by the relational system.

2.2.4 OLAP Pre-Aggregation

OLAP systems require fast, interactive multidimensional analysis of aggregates. To fulfill this requirement, database systems frequently pre-compute aggregate views on some subset of dimensions and their corresponding hierarchies. Virtually all OLAP products resort to some degree of pre-computation of these aggregates, a process known as pre-aggregation. OLAP pre-aggregation techniques have been shown to speed up aggregate queries by several orders of magnitude in business applications [31, 41]. Full pre-aggregation of all possible combinations of aggregate queries, however, is not considered feasible because it often exceeds the available storage limit and incurs a high maintenance cost. Therefore, modern OLAP systems adopt a partial pre-aggregation approach in which only a subset of aggregates is materialized, so that these can be re-used to compute other aggregates efficiently.

Pre-aggregation techniques consist of three inter-related processes: view selection, query rewriting, and view maintenance. A view is a derived relation defined in terms of base relations.
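The re-use idea behind partial pre-aggregation can be illustrated with a minimal Python sketch. It is not taken from the thesis: the `SALES` relation, its columns, and the helper names `aggregate` and `roll_up` are hypothetical. The sketch materializes a single SUM aggregate grouped by (product, region) and then answers a coarser query, grouped by region only, from that stored aggregate instead of the base relation. This works because SUM is distributive; an aggregate such as AVG would need to store sums and counts separately.

```python
from collections import defaultdict

# Hypothetical base relation: (product, region, month, amount).
SALES = [
    ("tea", "north", "jan", 10),
    ("tea", "north", "feb", 5),
    ("tea", "south", "jan", 7),
    ("coffee", "north", "jan", 3),
    ("coffee", "south", "feb", 8),
]


def aggregate(rows, key_positions, value_position):
    """Group rows by the given column positions and sum the value column."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[i] for i in key_positions)
        totals[key] += row[value_position]
    return dict(totals)


def roll_up(view, keep_positions):
    """Derive a coarser SUM aggregate from a finer materialized one."""
    totals = defaultdict(int)
    for key, total in view.items():
        totals[tuple(key[i] for i in keep_positions)] += total
    return dict(totals)


# Materialize one aggregate: SUM(amount) grouped by (product, region).
view_product_region = aggregate(SALES, key_positions=(0, 1), value_position=3)

# Answer "SUM(amount) grouped by region" from the materialized view.
# Position 1 of the view key (product, region) is the region.
by_region_from_view = roll_up(view_product_region, keep_positions=(1,))

# The same query evaluated directly on the base relation, for comparison.
by_region_from_base = aggregate(SALES, key_positions=(1,), value_position=3)
```

Both evaluation paths produce the same result, but the roll-up scans only the four groups of the materialized view rather than all five base tuples; on realistic data volumes that difference is what makes partial pre-aggregation pay off.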
Views can be materialized by storing the tuples of a view in a database, as was first investigated in the 1980s [36]. Like a cache, a materialized view provides fast access to its data. However, like a cache, it can become stale whenever its underlying base relations are updated. The process of updating a materialized view in response to changes to its base data is called view maintenance [12].

View Selection

Gupta et al. [13] proposed a framework that shows how to use materialized views to help answer aggregate queries. The framework provides a set of query rewriting rules that determine which materialized aggregate views can be employed to answer a given aggregate query. An algorithm uses these rules to transform a query tree into an equivalent tree in which some or all base relations are replaced by materialized views. A query optimizer can then choose the most efficient tree and provide the best query response time. Harinarayan et al. [92] investigated how to select views for materialization under storage space constraints so that the average query cost is minimal.

To meet changing user needs, several dynamic pre-aggregation approaches have been proposed. In principle, views may be either selected on demand or pre-selected using some prediction strategy. For applications where storage space is a constraint, replacement algorithms identify those views that can be replaced with new selections [60]. Kotidis et al. [97] introduced a dynamic view selection approach built on Multidimensional Range Queries (MRQ), known as slice queries in OLAP, which uses an on-demand fetching strategy. Within this approach, the level of detail or granularity is a compromise between the materialization of many small, highly specific queries and the materialization of a few large queries, followed by answering incoming queries at each stage using the materialized queries. This approach, however, does not take user access patterns into account before making selections. The first work to consider user access information when evaluating potential queries to be materialized is presented in [26], where the author introduced PROMISE, an approach that predicts the structure and value of the next query based on the current query.

Yao et al. [99] proposed a different approach for the materialization of dynamic views. A set of batch queries was rewritten using certain canonical queries so that the total cost of execution could be reduced by using intermediate results to answer queries appearing later in the batch. This approach requires all queries to be known precisely beforehand. Although it might work well in a particular database scenario, it might not be useful in dynamic OLAP, where it is extremely difficult to predict the exact nature of future queries.

View Maintenance

In most cases it is wasteful to maintain a view by recomputing it from scratch. Materialized views are therefore maintained using an incremental approach [11]: only the changes to be propagated to the materialized view are computed, using the changes to the source relations [1, 33, 89].
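The contrast between recomputation and incremental maintenance can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not an algorithm from the cited works: the view is assumed to be a SUM aggregate per group, the function names `apply_delta` and `recompute` are hypothetical, and a per-group tuple count is stored alongside each sum so that deletions can remove a group once its last tuple disappears.

```python
def apply_delta(view, inserted=(), deleted=()):
    """Propagate base-relation changes into a materialized SUM view in place.

    `view` maps a group key to a two-element list [sum, count].
    `inserted` and `deleted` are iterables of (group_key, amount) tuples.
    """
    for key, amount in inserted:
        entry = view.setdefault(key, [0, 0])
        entry[0] += amount
        entry[1] += 1
    for key, amount in deleted:
        entry = view[key]
        entry[0] -= amount
        entry[1] -= 1
        if entry[1] == 0:
            # No base tuples remain for this group, so drop it from the view.
            del view[key]
    return view


def recompute(rows):
    """Rebuild the view from scratch; used here only as a reference result."""
    return apply_delta({}, inserted=rows)


# Hypothetical base relation of (region, amount) tuples.
base = [("north", 10), ("south", 7)]
view = recompute(base)

# The base relation changes: one insertion and one deletion.
inserted = [("north", 5)]
deleted = [("south", 7)]
apply_delta(view, inserted=inserted, deleted=deleted)

# Reference: the updated base relation, recomputed in full.
new_base = [("north", 10), ("north", 5)]
```

The incremental path touches only the two changed tuples, whereas `recompute(new_base)` scans the whole base relation; both yield the same view contents.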
At present, view maintenance has been investigated along four dimensions [11]:

• Information Dimension: Focuses on accessing the information required for view maintenance, such as base relations and the materialized view.
• Modification Dimension: Focuses on the kinds of modifications, e.g., insertions and deletions, that a view maintenance algorithm can handle.
• Language Dimension: Addresses the problems related to the language of the views supported by the view maintenance algorithm. That is, what is the language of the views that can be maintained by the algorithm? How are views expressed? Does the algorithm allow duplicates?
• Instance Dimension: Considers whether the algorithm applies to all instances of the database or only to a specific set of instances.

View maintenance cost is the sum of the costs of propagating each base relation change to the affected materialized views. The sum can be weighted, where each weight indicates the frequency of propagations of the changes of the associated source