Applying OLAP Pre-Aggregation Techniques to ... - Jacobs University
2.1.7 Summary<br />
Array database theory is gradually entering its consolidation phase. The notion<br />
of arrays as functions mapping points of some hypercube-shaped domain to values<br />
of some range set is commonly accepted. Two main modeling paradigms are used:<br />
calculus and algebra. Multidimensional data models embed arrays into the relational<br />
world, either by providing conceptual stubs like Array Algebra, or by adding relational<br />
capabilities explicitly such as AQL and RAM. Notably, aggregate query processing<br />
plays a critical role given the large volumes of the arrays. Our study shows<br />
that pre-aggregation techniques focus only on 2D datasets, and that support is limited<br />
to one particular operation: scaling. We identify the pyramid approach as the<br />
most popular method for speeding up scaling operations on 2D datasets, despite its<br />
known limitations such as hard-wired interpolation and lack of support for<br />
higher-dimensional datasets. Advances in graphics hardware are enabling quicker and more<br />
accurate visualization and navigation capabilities for raster imagery. However, little<br />
work has been reported on how array database technology is progressively exploiting<br />
these hardware advances. A critical gap with respect to pre-aggregation is the lack of<br />
support for aggregate operations other than 2D scaling.<br />
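The pyramid approach mentioned above can be illustrated with a minimal sketch. It assumes a 2D raster held as a list of rows and a fixed 2x2 averaging filter; the function names `downsample_2x` and `build_pyramid` are hypothetical and do not come from any particular array database. The fixed averaging step is an instance of the hard-wired interpolation noted as a limitation.

```python
def downsample_2x(image):
    """Halve a 2D grid by averaging each non-overlapping 2x2 block.

    Assumption: a trailing odd row or column is simply dropped.
    """
    rows, cols = len(image), len(image[0])
    return [
        [
            (image[r][c] + image[r][c + 1]
             + image[r + 1][c] + image[r + 1][c + 1]) / 4.0
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]


def build_pyramid(image):
    """Pre-compute all coarser levels: level 0 is the original raster."""
    levels = [image]
    # Stop once either side of the coarsest level is a single cell.
    while len(levels[-1]) >= 2 and len(levels[-1][0]) >= 2:
        levels.append(downsample_2x(levels[-1]))
    return levels


base = [
    [1, 3, 5, 7],
    [1, 3, 5, 7],
    [2, 2, 6, 6],
    [2, 2, 6, 6],
]
pyramid = build_pyramid(base)
```

A scaling query by a factor of 2 or 4 can then be answered by reading `pyramid[1]` or `pyramid[2]` directly instead of re-aggregating the base raster. Scale factors that fall between levels would still need resampling from the nearest stored level.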
2.2 On-Line Analytical Processing (OLAP)<br />
Data warehousing/OLAP is an application domain where complex multidimensional<br />
aggregates on large databases have been studied intensively. Typically, a data<br />
warehouse collects business data from one or multiple sources so that the desired financial,<br />
marketing, and business analyses can be performed. These kinds of analyses<br />
can detect trends and anomalies, make projections, and support business decisions<br />
[41]. When such analysis predominantly involves aggregate queries, it is called<br />
on-line analytical processing, or OLAP [38, 39]. To understand the mechanism of<br />
pre-computation, the following subsections review different approaches to structuring<br />
multidimensional data, storage mechanisms, and operations in OLAP.<br />
2.2.1 OLAP Data Model<br />
The multidimensional OLAP model begins with the observation that the factors<br />
that influence decision-making processes are related to enterprise-specific facts, such<br />
as sales, shipments, hospital admissions, surgeries, and so on [68]. Instances of a<br />
fact correspond to events that occur. For example, every sale or shipment<br />
carried out is an event. Each fact is described by the values of a set of relevant<br />
measures providing quantitative descriptions of events; for example, sales receipts, amounts<br />
shipped, hospital admission costs, and surgery times are all measures.<br />
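As a small illustration of the fact/event/measure terminology, the sketch below models a hypothetical "sale" fact whose instances are events, each carrying two measures. The class name `SaleEvent`, its fields, and the sample values are assumptions made for this example only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SaleEvent:
    """One instance (event) of the hypothetical 'sale' fact."""
    receipt_cents: int   # measure: sales receipt, in cents to avoid float rounding
    units_shipped: int   # measure: amount shipped


# Each list element is one event; its field values are the measures.
events = [
    SaleEvent(receipt_cents=1999, units_shipped=2),
    SaleEvent(receipt_cents=500, units_shipped=1),
    SaleEvent(receipt_cents=3001, units_shipped=4),
]

# Aggregate queries summarize measures over many events.
total_receipt_cents = sum(e.receipt_cents for e in events)
total_units_shipped = sum(e.units_shipped for e in events)
```

Measures are the quantities that aggregate queries sum, average, or count. Here both totals are plain sums over all events.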
In OLAP, information is viewed conceptually as cubes that consist of descriptive<br />
categories (dimensions) and quantitative values (measures) [26, 81, 69, 83]. In the scientific<br />
literature, measures are at times called variables, metrics, properties, attributes,<br />
or indicators. Figure 2.7 illustrates a 3D OLAP data cube where business events