11.03.2014 Views

Applying OLAP Pre-Aggregation Techniques to ... - Jacobs University


2. Background and Related Work

on the output raster. Other resampling methods, such as bilinear and cubic interpolation, consider a subset of input cells to calculate each cell value in the output raster. Fig. 2.5 shows three common options for interpolating output cell values. Note that the bold outline (center image) indicates the current target cell for which a value is being interpolated.

Figure 2.5. Nearest Neighbor, Bilinear and Cubic Interpolation Methods: (a) portion of original raster; (b) portion of output raster; (c) input cells used by common resampling methods.
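To make the difference between the resampling methods concrete, the following is a minimal sketch of nearest-neighbor and bilinear lookup on a small raster. It assumes that cell centers sit at integer (column, row) coordinates and that out-of-range coordinates are clamped to the raster edge; the function names and the sample grid are hypothetical and not taken from the source.

```python
import math

def nearest_neighbor(raster, x, y):
    """Return the value of the input cell whose center is closest to (x, y)."""
    rows, cols = len(raster), len(raster[0])
    r = min(max(int(math.floor(y + 0.5)), 0), rows - 1)
    c = min(max(int(math.floor(x + 0.5)), 0), cols - 1)
    return raster[r][c]

def bilinear(raster, x, y):
    """Return the distance-weighted average of the 2x2 cells surrounding (x, y)."""
    rows, cols = len(raster), len(raster[0])
    # Clamp the sample position to the area spanned by the cell centers.
    x = min(max(x, 0.0), cols - 1.0)
    y = min(max(y, 0.0), rows - 1.0)
    c0, r0 = int(math.floor(x)), int(math.floor(y))
    c1, r1 = min(c0 + 1, cols - 1), min(r0 + 1, rows - 1)
    fx, fy = x - c0, y - r0
    # Interpolate along x on the two rows, then along y between them.
    top = raster[r0][c0] * (1 - fx) + raster[r0][c1] * fx
    bottom = raster[r1][c0] * (1 - fx) + raster[r1][c1] * fx
    return top * (1 - fy) + bottom * fy

grid = [[10.0, 20.0],
        [30.0, 40.0]]

nn_value = nearest_neighbor(grid, 0.4, 0.4)  # copies an existing cell value
bl_value = bilinear(grid, 0.5, 0.5)          # blends the four surrounding cells
```

Nearest-neighbor lookup only ever returns a value that already exists in the input, whereas bilinear lookup can produce new intermediate values. Cubic interpolation follows the same pattern with a larger neighborhood and is omitted here for brevity.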

A characteristic of the pyramid approach is that it increases the size of a raster dataset by approximately 33 percent, because the additional reduced-resolution representations are stored in the system together with the original dataset. This is offset, however, by the faster response time obtained in return. The choice of resampling method for constructing the pyramid is influenced by the data characteristics and the type of analysis performed on the data. For example, the visual appearance of remote sensing imagery is best using nearest-neighbor resampling, whereas scientific interpretation may require cubic interpolation. Rasters representing categorical data, e.g., land use data, do not allow interpolation since it is important that the original data values remain unchanged; hence only nearest-neighbor resampling can be applied [64].
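The figure of roughly 33 percent can be checked with a short calculation. The sketch below assumes that each pyramid level halves the resolution along both axes, so each level holds about a quarter of the cells of the level beneath it (1/4 + 1/16 + ... approaches 1/3). The halving factor and the function name are assumptions made for illustration; the source does not state them.

```python
def pyramid_overhead(rows, cols, factor=2):
    """Ratio of cells stored in reduced-resolution levels to cells in the base raster.

    Assumes each level shrinks both dimensions by `factor` (rounded up)
    until a single cell remains.
    """
    base = rows * cols
    extra = 0
    while rows > 1 or cols > 1:
        rows = max(1, -(-rows // factor))  # ceiling division
        cols = max(1, -(-cols // factor))
        extra += rows * cols
    return extra / base

overhead = pyramid_overhead(1024, 1024)  # close to one third
```

Under these assumptions the overhead stays just below one third regardless of how large the base raster is, which matches the approximate figure quoted above.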

Categorical data should not be interpolated because intermediate values cannot be derived with meaningful results. For example, soil type data cannot be interpolated since a soil type 14 and a soil type 15 cannot sensibly be averaged to derive a soil type 14.5. Creating pyramids for different resampling methods is not efficient due to the additional resources required for storage and maintenance. Thus, the hard-wired resampling approach imposes significant flexibility limitations on users when analytic objectives diverge.
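The soil type example can be illustrated with a small sketch that reduces a categorical raster by a factor of two in two ways: by averaging each block, and by keeping one existing cell per block. Keeping the top-left cell is an arbitrary stand-in for nearest-neighbor selection, and all names and sample values are hypothetical.

```python
def downsample(raster, factor, reducer):
    """Reduce a raster by applying `reducer` to each factor x factor block."""
    rows, cols = len(raster), len(raster[0])
    out = []
    for r in range(0, rows - rows % factor, factor):
        out_row = []
        for c in range(0, cols - cols % factor, factor):
            block = [raster[r + i][c + j]
                     for i in range(factor) for j in range(factor)]
            out_row.append(reducer(block))
        out.append(out_row)
    return out

def mean(block):
    """Average of the block; only meaningful for continuous data."""
    return sum(block) / len(block)

def pick_first(block):
    """Keep one original cell (top-left), so no new category is invented."""
    return block[0]

# Soil type codes: categories, not quantities.
soil = [[14, 15, 15, 15],
        [15, 14, 15, 15]]

averaged = downsample(soil, 2, mean)       # contains 14.5, which is not a soil type
sampled = downsample(soil, 2, pick_first)  # contains only original codes
```

Averaging yields the code 14.5 for the first block, which corresponds to no soil type, while the selection-based reduction returns only codes present in the input.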

Fast retrieval of raster image datasets has also been investigated in distributed database systems. Kitamoto [14] proposed a caching mechanism that allows two-dimensional satellite imagery to be cached at minimum resolution to provide a coarse view of the images in distributed satellite image databases. The cache management problem is treated as the knapsack problem [14], where the relevance and size of the data are considered to determine whether the data will be cached. Additionally, access patterns influence the relevance of the data. The frequency of requests for a
