11.03.2014 Views

Applying OLAP Pre-Aggregation Techniques to ... - Jacobs University


Chapter 4

Answering Basic Aggregate Queries Using Pre-Aggregated Data

As discussed in previous chapters, aggregation is an important mechanism that allows users to extract general characterizations from very large repositories of data. In this chapter, we study the effect of selecting a set of aggregate queries, computing their results, and using those results to answer subsequent query requests. In particular, we study the effect of pre-aggregation on the computation of aggregate queries in GIS and remote-sensing imaging applications.

We introduce a pre-aggregation framework that distinguishes among different types of pre-aggregates for computing a query. We show that in most cases several pre-aggregates may qualify for answering an aggregate query, and we address the problem of selecting the best pre-aggregate in terms of execution time. To this end, we introduce a model that measures the cost of using qualified pre-aggregates for the computation of a query. We then present an algorithm that selects the best pre-aggregate for computing a query. We measure the performance of our algorithms in an array database management system (RasDaMan) and show that they perform much better than straightforward methods.
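To make the selection problem concrete, the following is a minimal sketch under simplifying assumptions, not the cost model or algorithm developed in this chapter. It assumes a one-dimensional array, a SUM query over an inclusive index interval, and a hypothetical cost measure that counts cells read: one cell for the stored pre-aggregate result plus the base cells of the query interval that the pre-aggregate does not cover. The names `PreAggregate`, `cost`, and `select_best` are illustrative only.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class PreAggregate:
    """Hypothetical stored SUM over the inclusive index interval [lo, hi]."""
    name: str
    lo: int
    hi: int


def cells(lo: int, hi: int) -> int:
    """Number of cells in the inclusive interval [lo, hi]."""
    return max(0, hi - lo + 1)


def cost(q_lo: int, q_hi: int, p: PreAggregate) -> Optional[int]:
    """Assumed cost of answering SUM over [q_lo, q_hi] using p.

    In this sketch p qualifies only if its interval lies inside the query
    interval. The cost is one read for the stored result plus the base
    cells outside p that still have to be scanned.
    """
    if p.lo < q_lo or p.hi > q_hi:
        return None  # p does not qualify for this query
    return 1 + cells(q_lo, q_hi) - cells(p.lo, p.hi)


def select_best(
    q_lo: int, q_hi: int, candidates: Iterable[PreAggregate]
) -> Tuple[Optional[PreAggregate], int]:
    """Return the cheapest qualifying pre-aggregate and its cost.

    The baseline is a full scan of the base data; None is returned as the
    pre-aggregate when no candidate beats that baseline.
    """
    best, best_cost = None, cells(q_lo, q_hi)
    for p in candidates:
        c = cost(q_lo, q_hi, p)
        if c is not None and c < best_cost:
            best, best_cost = p, c
    return best, best_cost


candidates = [
    PreAggregate("A", 0, 49),
    PreAggregate("B", 10, 89),
    PreAggregate("C", 50, 120),  # extends past the query, so it does not qualify
]
best, best_cost = select_best(0, 99, candidates)
```

For the query interval [0, 99], candidate B covers 80 of the 100 cells, so under the assumed measure it is preferred over A and over a full scan. A real array DBMS would also have to account for multi-dimensional domains, tiling, and other aggregate functions.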

4.1 Framework

Most major database management systems allow the user to store query results through a process known as view materialization. The query optimizer may then automatically use the materialized data to speed up the evaluation of a new query. Queries that benefit from using materialized data are those that involve the summarization of large amounts of data. They are known as aggregate queries because their query statements include one or more aggregate functions. The ANSI SQL:2008 standard defines a wide variety of aggregate functions including: COUNT, SUM, AVG, MAX, MIN, EVERY, ANY, SOME, VAR_POP, VAR_SAMP, STDDEV_POP, STDDEV_SAMP, ARRAY_AGG, REGR_COUNT, COVAR_POP, COVAR_SAMP, CORR, REGR_R2, REGR_SLOPE, and REGR_INTERCEPT [20].
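The idea of answering an aggregate query from materialized data can be illustrated with a small sketch using Python's standard `sqlite3` module. SQLite has no materialized views, so an ordinary table created with `CREATE TABLE ... AS SELECT` stands in for one here, and the query rewrite is done by hand rather than by an optimizer. The table names `readings` and `region_summary` and the sample rows are assumptions made for illustration.

```python
import sqlite3


def build_demo() -> sqlite3.Connection:
    """Create an in-memory database with base data and a stored summary."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (region TEXT, day INTEGER, value REAL)")
    rows = [
        ("north", 1, 2.0), ("north", 2, 4.0),
        ("south", 1, 1.0), ("south", 2, 3.0), ("south", 3, 5.0),
    ]
    conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", rows)
    # Stand-in for a materialized view: per-region SUM and COUNT.
    conn.execute(
        "CREATE TABLE region_summary AS "
        "SELECT region, SUM(value) AS total, COUNT(*) AS n "
        "FROM readings GROUP BY region"
    )
    return conn


def avg_from_base(conn: sqlite3.Connection) -> float:
    """Global average computed by scanning every base row."""
    return conn.execute("SELECT AVG(value) FROM readings").fetchone()[0]


def avg_from_summary(conn: sqlite3.Connection) -> float:
    """Same average derived from the stored per-region SUM and COUNT."""
    total, n = conn.execute(
        "SELECT SUM(total), SUM(n) FROM region_summary"
    ).fetchone()
    return total / n


conn = build_demo()
base_avg = avg_from_base(conn)
summary_avg = avg_from_summary(conn)
```

Both functions return the same value, but the second reads one row per region instead of one row per reading. AVG is derived from the stored SUM and COUNT rather than from stored averages, because per-region averages cannot be combined without the group sizes.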

