11.03.2014 Views

Applying OLAP Pre-Aggregation Techniques to ... - Jacobs University



2. Background and Related Work

HOLAP

The intermediate architecture type, HOLAP, combines the advantages of ROLAP and MOLAP. It takes from ROLAP implementations their level of standardization and their ability to manage large amounts of data, and from MOLAP systems their typical query speed. For summary-type information, HOLAP leverages cube technology; for drilling down into details, it uses the ROLAP model. In a HOLAP architecture, the bulk of the data should be stored in an RDBMS to avoid the problems caused by sparsity, and a multidimensional system should store only the information users most frequently need to access [68]. If that information is not sufficient to answer a query, the system accesses the data managed by the relational system in a transparent manner.
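The routing behaviour described above can be illustrated with a minimal sketch. It is not taken from any particular HOLAP product: the class name `HolapStore`, the list of `(region, sales)` tuples standing in for the RDBMS table, and the dictionary standing in for the multidimensional store are all assumptions made for illustration.

```python
class HolapStore:
    """Toy HOLAP store: a small cube for hot summaries, a detail table for the rest."""

    def __init__(self, detail_rows, hot_regions):
        # detail_rows stands in for the relational table: (region, sales) tuples.
        self.detail_rows = list(detail_rows)
        # The cube stands in for the multidimensional store and holds
        # pre-computed totals only for the frequently requested regions.
        self.cube = {region: self._scan(region) for region in hot_regions}

    def _scan(self, region):
        # Relational path: aggregate directly over the detail rows.
        return sum(sales for r, sales in self.detail_rows if r == region)

    def total(self, region):
        # Answer from the cube when possible, otherwise fall back to the
        # relational data; the caller uses the same method in both cases.
        if region in self.cube:
            return self.cube[region], "cube"
        return self._scan(region), "relational"


store = HolapStore(
    detail_rows=[("north", 10), ("north", 5), ("south", 7)],
    hot_regions=["north"],
)
north_total = store.total("north")
south_total = store.total("south")
```

The second element of each result is returned only to make the chosen path visible in this sketch; a real system would hide it from the user.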

2.2.4 OLAP Pre-Aggregation

OLAP systems require fast interactive multidimensional data analysis of aggregates. To fulfill this requirement, database systems frequently pre-compute aggregate views on some subset of dimensions and their corresponding hierarchies. Virtually all OLAP products resort to some degree of pre-computation of these aggregates, a process known as pre-aggregation. OLAP pre-aggregation techniques have been shown to speed up aggregate queries by several orders of magnitude in business applications [31, 41]. Full pre-aggregation of all possible combinations of aggregate queries, however, is not considered feasible because it often exceeds the available storage limit and incurs a high maintenance cost. Therefore, modern OLAP systems adopt a partial pre-aggregation approach in which only a subset of the aggregates is materialized, so that these can be reused to compute other aggregates efficiently.
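As a small illustration of how one materialized aggregate can serve other aggregates, the sketch below materializes sales by (product, region) and then rolls that result up to sales by region without touching the fact table again. The fact table `FACTS`, its column layout, and the helper `aggregate` are hypothetical, and the reuse shown is valid because SUM is a distributive aggregate.

```python
from collections import defaultdict

# Hypothetical fact table: (product, region, month, sales).
FACTS = [
    ("tea", "north", "jan", 10),
    ("tea", "north", "feb", 5),
    ("tea", "south", "jan", 7),
    ("coffee", "north", "jan", 3),
    ("coffee", "south", "feb", 8),
]


def aggregate(rows, key_indexes, measure_index):
    """Sum the measure column, grouped by the columns in key_indexes."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[i] for i in key_indexes)
        totals[key] += row[measure_index]
    return dict(totals)


# Materialize one aggregate: sales by (product, region).
by_product_region = aggregate(FACTS, key_indexes=(0, 1), measure_index=3)

# Reuse it: flatten the materialized aggregate into rows and roll up to region.
materialized_rows = [(p, r, s) for (p, r), s in by_product_region.items()]
by_region = aggregate(materialized_rows, key_indexes=(1,), measure_index=2)

# The same answer computed directly from the fact table, for comparison.
by_region_direct = aggregate(FACTS, key_indexes=(1,), measure_index=3)
```

The roll-up reads four materialized rows instead of five fact rows here; on realistic data the materialized aggregate is typically far smaller than the fact table, which is where the speed-up comes from.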

Pre-aggregation techniques consist of three inter-related processes: view selection, query rewriting, and view maintenance. A view is a derived relation defined in terms of base relations. Views can be materialized by storing the tuples of a view in a database, as was first investigated in the 1980s [36]. Like a cache, a materialized view provides fast access to its data. However, like a cache, a materialized view may become stale whenever its underlying base relations are updated. The process of updating a materialized view in response to changes to its base data is called view maintenance [12].
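A minimal sketch of view maintenance for a single SUM ... GROUP BY view follows. It applies each base-data change to the stored totals rather than recomputing the view, which is one common strategy and not necessarily the one described in [12]. The class `SumView` and its method names are assumptions made for illustration.

```python
class SumView:
    """Materialized SUM(value) GROUP BY key, maintained incrementally."""

    def __init__(self, base_rows):
        self.totals = {}
        # Row counts per group let the view drop a group once it is empty.
        self.counts = {}
        for key, value in base_rows:
            self.apply_insert(key, value)

    def apply_insert(self, key, value):
        # A new base row adds its value to the group's total.
        self.totals[key] = self.totals.get(key, 0) + value
        self.counts[key] = self.counts.get(key, 0) + 1

    def apply_delete(self, key, value):
        # A deleted base row subtracts its value; this sketch assumes the
        # row really was present in the base data.
        self.totals[key] -= value
        self.counts[key] -= 1
        if self.counts[key] == 0:
            del self.totals[key]
            del self.counts[key]


view = SumView([("north", 10), ("north", 5), ("south", 7)])
view.apply_insert("south", 3)   # base relation gains a row
view.apply_delete("north", 5)   # base relation loses a row
```

Keeping a count next to each sum is a design choice of this sketch: a total of zero does not by itself show whether a group still has rows.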

View Selection

Gupta et al. [13] proposed a framework that shows how to use materialized views to help answer aggregate queries. The framework provides a set of query rewriting rules to determine what materialized aggregate views can be employed to answer aggregate queries. An algorithm uses these rules to transform a query tree into an equivalent tree with some or all base relations replaced by materialized views. Thus, a query optimizer can choose the most efficient tree and provide the best query response time.
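The idea of substituting a materialized view for base relations can be sketched with a deliberately simplified stand-in for the rewriting rules of [13]. In this sketch a view is considered usable when its grouping columns include all grouping columns of the query, which holds only for distributive aggregates such as SUM, and the cheapest usable source is the one with the fewest rows. The function `choose_source`, the view names, and the row counts are hypothetical.

```python
def choose_source(query_dims, views, base_row_count):
    """Return the name of the cheapest relation that can answer the query.

    views maps a view name to (grouping columns, row count).
    The base relation is always usable and is named "base" here.
    """
    best_name, best_rows = "base", base_row_count
    for name, (view_dims, row_count) in views.items():
        # Usable only if the view keeps every column the query groups by.
        usable = set(query_dims) <= set(view_dims)
        if usable and row_count < best_rows:
            best_name, best_rows = name, row_count
    return best_name


views = {
    "by_product_region": (("product", "region"), 50),
    "by_region": (("region",), 5),
}
source_for_region = choose_source(("region",), views, base_row_count=1000)
source_for_product = choose_source(("product",), views, base_row_count=1000)
source_for_month = choose_source(("month",), views, base_row_count=1000)
```

A query grouped by month falls back to the base relation because neither view retains the month column.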

Harinarayan et al. [92] investigated the issue of how to select views for materialization under storage space constraints so the average query cost is minimal.
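A greedy heuristic is one way to approach this selection problem. The sketch below is a simplified variant under stated assumptions and should not be read as the exact algorithm of [92]: a query on a set of dimensions costs as many rows as the smallest materialized view containing those dimensions, the top view (all dimensions) is always materialized, and the storage constraint is modelled as a row budget for the additional views. The view sizes and the budget are hypothetical.

```python
def query_cost(query, materialized, sizes):
    """Rows read to answer the query from the smallest usable materialized view."""
    return min(sizes[v] for v in materialized if query <= v)


def greedy_select(sizes, top, budget):
    """Repeatedly add the view with the largest total cost reduction that still fits."""
    materialized = {top}
    used = 0
    while True:
        best_view, best_benefit = None, 0
        for view in sizes:
            if view in materialized or used + sizes[view] > budget:
                continue
            # Benefit: total cost saved over all queries this view can answer.
            benefit = sum(
                max(0, query_cost(q, materialized, sizes) - sizes[view])
                for q in sizes
                if q <= view
            )
            if benefit > best_benefit:
                best_view, best_benefit = view, benefit
        if best_view is None:
            return materialized
        materialized.add(best_view)
        used += sizes[best_view]


def dims(*names):
    return frozenset(names)


# Hypothetical row counts for every group-by over three dimensions.
SIZES = {
    dims("product", "region", "month"): 100,
    dims("product", "region"): 50,
    dims("product", "month"): 80,
    dims("region", "month"): 90,
    dims("product"): 10,
    dims("region"): 5,
    dims("month"): 12,
    dims(): 1,
}
TOP = dims("product", "region", "month")

selected = greedy_select(SIZES, TOP, budget=60)
```

With these sizes the heuristic first picks the (product, region) view, which fits the budget and helps four of the eight possible group-by queries, and then fills the remaining space with smaller views.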

To meet changing user needs several dynamic pre-aggregation approaches have
