Applying OLAP Pre-Aggregation Techniques to ... - Jacobs University

2. Background and Related Work

on the output raster. Other resampling methods, such as bilinear and cubic interpolation, consider a subset of input cells to calculate each cell value in the output raster. Fig. 2.5 shows three common options for interpolating output cell values. Note that the bold outline (center image) indicates the current target cell for which a value is being interpolated.

Figure 2.5. Nearest Neighbor, Bilinear and Cubic Interpolation Methods: (a) portion of original raster; (b) portion of output raster; (c) input cells used by common resampling methods.

A characteristic of the pyramid approach is that it increases the size of a raster dataset by approximately 33 percent. This is because the additional reduced-resolution representations are stored in the system together with the original dataset. This storage overhead is offset, however, by the faster response times obtained in return.

The choice of resampling method for constructing the pyramid is influenced by the characteristics of the data and the type of analysis performed on it. For example, visual appearance of remote sensing imagery is best using nearest-neighbor resampling, whereas scientific interpretation may require cubic interpolation. Rasters representing categorical data (e.g., land use data) do not allow interpolation, since it is important that the original data values remain unchanged; hence only nearest-neighbor resampling can be applied [64]. Categorical data should not be interpolated because intermediate values cannot be given a meaningful interpretation. For example, soil type data cannot be interpolated, since a soil type 14 and a soil type 15 cannot sensibly be averaged to derive a soil type 14.5.

Creating pyramids for different resampling methods is not efficient due to the additional resources required for storage and maintenance. Thus, a pyramid hard-wired to a single resampling method poses significant flexibility limitations for users whose analytic objectives diverge.
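The storage overhead and the categorical-data restriction described above can be illustrated with a small sketch. It is not the implementation of any particular GIS: it assumes each pyramid level halves the resolution in both dimensions (so the levels add roughly 1/4 + 1/16 + ... ≈ 1/3 of the original cell count), it approximates nearest-neighbor resampling by keeping the top-left cell of every 2x2 block, and the function names are hypothetical.

```python
def downsample_nearest(raster):
    """Halve resolution by keeping the top-left cell of each 2x2 block.

    A simplified stand-in for nearest-neighbor resampling: every output
    value is an unmodified input value.
    """
    return [row[::2] for row in raster[::2]]


def downsample_mean(raster):
    """Halve resolution by averaging each 2x2 block (continuous data only)."""
    out = []
    for r in range(0, len(raster) - 1, 2):
        out_row = []
        for c in range(0, len(raster[r]) - 1, 2):
            block = (raster[r][c], raster[r][c + 1],
                     raster[r + 1][c], raster[r + 1][c + 1])
            out_row.append(sum(block) / 4)
        out.append(out_row)
    return out


def build_pyramid(raster, downsample):
    """Return the reduced-resolution levels (original excluded) down to 1x1."""
    levels = []
    current = raster
    while len(current) > 1 and len(current[0]) > 1:
        current = downsample(current)
        levels.append(current)
    return levels


def cell_count(raster):
    return sum(len(row) for row in raster)


# Storage overhead of a pyramid over a 256x256 raster (assumed size).
base = [[0] * 256 for _ in range(256)]
levels = build_pyramid(base, downsample_nearest)
overhead = sum(cell_count(level) for level in levels) / cell_count(base)

# Categorical data: averaging invents a class code that does not exist.
soil = [[14, 15],
        [15, 14]]
averaged = downsample_mean(soil)      # [[14.5]], not a valid soil type
preserved = downsample_nearest(soil)  # [[14]], an original class value
```

For the 256x256 raster the eight reduced levels hold 21845 cells against 65536 in the original, an overhead of about 0.333. Supporting a second resampling method would require a second set of levels, which is the storage and maintenance cost noted above.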
Fast retrieval of raster image datasets has also been investigated in distributed database systems. Kitamoto [14] proposed a caching mechanism that allows two-dimensional satellite imagery to be cached at minimum resolution to provide a coarse view of the images in distributed satellite image databases. The cache management problem is treated as a knapsack problem [14], where the relevance and size of the data are considered to determine whether the data will be cached. Additionally, access patterns influence the relevance of the data. The frequency of requests for a given image and its resulting popularity rank are included in the strategy for caching selection. Prediction of user access patterns is not considered, however.

More recently, methods exploiting the capabilities of modern graphics hardware have been applied to the organization and processing of large amounts of satellite imagery. For example, Boettger et al. presented a method based on the concepts of perspective and complex logarithm [90] for visualization and navigation of satellite and aerial imagery [50]. Datasets are decomposed into tiles of different sizes and levels of resolution according to a pre-defined area of interest. The tiles closer to the center of interest have higher resolution, whereas low-resolution tiles are created for parts further away. The resulting tiles are indexed and cached in the memory of the graphics hardware, enabling quick access to the area of interest at the best available resolution. When the center of interest changes, tiles not yet available in graphics memory are loaded. Based on the assumption that the graphics memory offers more space than needed, the cache contains not only the tiles that conform to the area of interest, but also those that will presumably be needed in the future.

2.1.6 Pre-Aggregation Beyond 2D

Geographic phenomena can be examined at different granularities. This includes different spatial perspectives and temporal views. Earth remote sensing imagery can be treated as time-series data to study and track changes over time. For example, a user looking at changes in vegetation patterns over a certain region during the past 10 years can see their effect on the regional maps over that time period.

Fig. 2.6 shows various instances of scaling operations on 3D image time-series. Figure 2.6(a) shows the original dataset, which consists of two spatial dimensions (dim 1, dim 2) and one temporal dimension (dim 3).
Figure 2.6(b) shows the original dataset scaled down along the two spatial dimensions. Figure 2.6(c) shows a scaling operation along the time dimension of the original dataset. Figure 2.6(d) shows the original dataset scaled down in both the spatial and temporal dimensions. Shifts in temporal detail have been studied in various application domains [18, 22, 43]. At the time of this writing, there is little support for zooming with respect to time in GIS technology: the focus has been on studying such alterations with respect to the geometric (vector) properties of objects [54, 58, 59].

Datasets in environmental observation and climate modeling are often defined over a 4D spatio-temporal space of the form (x,y,z,t), possibly extended with topological relationships. Scaling operations are also critical for these kinds of applications due to the size and dimensionality of the data. Extremely large volumes of data are generated during climate simulations. Although only part of the data might be needed for a specific analysis, huge data volumes are moved. This is particularly true for time-series data analysis. At the time of this writing, however, 4D scaling operations are not supported for GIS and remote-sensing imaging applications.
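The three scaling variants of Fig. 2.6 can be sketched on a small synthetic 3D array. This is only an illustration under stated assumptions: the array is indexed as cube[dim 1][dim 2][dim 3] with dim 3 as time, scaling is done by averaging non-overlapping blocks (the text does not prescribe a resampling method), and `scale_down` and `shape` are hypothetical helper names.

```python
def scale_down(cube, f1, f2, f3):
    """Average non-overlapping blocks of size f1 x f2 x f3.

    cube is a nested list indexed as cube[dim1][dim2][dim3], where dim1 and
    dim2 are spatial and dim3 is temporal. Trailing cells that do not fill
    a complete block are dropped.
    """
    n1, n2, n3 = len(cube), len(cube[0]), len(cube[0][0])
    out = []
    for i0 in range(0, n1 - f1 + 1, f1):
        plane = []
        for j0 in range(0, n2 - f2 + 1, f2):
            line = []
            for k0 in range(0, n3 - f3 + 1, f3):
                block = [cube[i][j][k]
                         for i in range(i0, i0 + f1)
                         for j in range(j0, j0 + f2)
                         for k in range(k0, k0 + f3)]
                line.append(sum(block) / len(block))
            plane.append(line)
        out.append(plane)
    return out


def shape(cube):
    return (len(cube), len(cube[0]), len(cube[0][0]))


# Synthetic 4 x 4 x 6 time series: 4x4 cells observed at 6 time steps.
cube = [[[float(i + j + k) for k in range(6)]
         for j in range(4)]
        for i in range(4)]

spatial = scale_down(cube, 2, 2, 1)   # as in Fig. 2.6(b): space only
temporal = scale_down(cube, 1, 1, 3)  # as in Fig. 2.6(c): time only
both = scale_down(cube, 2, 2, 3)      # as in Fig. 2.6(d): space and time
```

Here `shape(spatial)` is (2, 2, 6), `shape(temporal)` is (4, 4, 2) and `shape(both)` is (2, 2, 2). The same block-averaging idea extends to a fourth loop for (x,y,z,t) data, at the cost of touching every cell of the original array for each request.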
