Applying OLAP Pre-Aggregation Techniques to ... - Jacobs University
Chapter 4
Answering Basic Aggregate Queries Using Pre-Aggregated Data
As discussed in previous chapters, aggregation is an important mechanism that allows users to extract general characterizations from very large repositories of data. In this chapter, we study the effect of selecting a set of aggregate queries, computing their results, and using those results to answer subsequent query requests. In particular, we study the effect of pre-aggregation on computing aggregate queries in the field of GIS and remote-sensing imaging applications.
We introduce a pre-aggregation framework that distinguishes among different types of pre-aggregates for computing a query. We show that in most cases several pre-aggregates may qualify for answering an aggregate query, and we address the problem of selecting the best pre-aggregate in terms of execution time. To this end, we introduce a model that measures the cost of using qualified pre-aggregates for the computation of a query. We then present an algorithm that selects the best pre-aggregate for computing a query. We measure the performance of our algorithms in an array database management system (RasDaMan) and show that they perform substantially better than straightforward methods.
4.1 Framework<br />
Most major database management systems allow the user to store query results through a process known as view materialization. The query optimizer may then automatically use the materialized data to speed up the evaluation of a new query. Queries that benefit from using materialized data are those that involve the summarization of large amounts of data. They are known as aggregate queries because their query statements include one or more aggregate functions. The ANSI SQL:2008 standard defines a wide variety of aggregate functions, including COUNT, SUM, AVG, MAX, MIN, EVERY, ANY, SOME, VAR_POP, VAR_SAMP, STDDEV_POP, STDDEV_SAMP, ARRAY_AGG, REGR_COUNT, COVAR_POP, COVAR_SAMP, CORR, REGR_R2, REGR_SLOPE, and REGR_INTERCEPT [20].
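As a minimal illustration of answering an aggregate query from materialized data, the following Python sketch uses the standard sqlite3 module. It is not the setup used in this chapter: the table names (readings, region_summary), the columns, and the sample values are all hypothetical. SQLite has no materialized views, so the sketch emulates one by storing the aggregate result in an ordinary table. The summary keeps COUNT and SUM per group, from which AVG can be derived without rescanning the base data.

```python
import sqlite3

# Hypothetical schema and data, chosen only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (region TEXT, day INTEGER, value REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [
        ("north", 1, 2.0),
        ("north", 2, 4.0),
        ("south", 1, 10.0),
        ("south", 2, 20.0),
    ],
)

# Emulate view materialization: compute the aggregate once and store it.
# (Assumption: a plain table stands in for a materialized view.)
conn.execute(
    """
    CREATE TABLE region_summary AS
    SELECT region, COUNT(*) AS n, SUM(value) AS total
    FROM readings
    GROUP BY region
    """
)


def avg_from_base(region):
    """Answer the aggregate query by scanning the base table."""
    row = conn.execute(
        "SELECT AVG(value) FROM readings WHERE region = ?", (region,)
    ).fetchone()
    return row[0]


def avg_from_summary(region):
    """Answer the same query from the stored pre-aggregate (one row read)."""
    n, total = conn.execute(
        "SELECT n, total FROM region_summary WHERE region = ?", (region,)
    ).fetchone()
    return total / n
```

Both functions return the same average for a given region, but avg_from_summary reads a single stored row rather than every matching base row. The stored summary is only valid while the base table is unchanged; a real system would have to refresh it after updates.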