
Research Article A Hybrid Spatio-Temporal Data Model and Structure


R Sengupta and C Yan

Further, the notion of time, and consequently temporal change, plays an important role in spatial decision-making (Armstrong 1988). The need to incorporate time in GIS is even more apparent when considering the integration of geographical and environmental models with GIS, a key research area in GIScience (Goodchild et al. 1992, Nyerges 1993). The goal of ongoing GIScience research relating to spatio-temporal data models and structures, therefore, has been to develop methods to effectively store, retrieve, manipulate, analyze and display spatio-temporal data (Peuquet and Duan 1995). Such a GIS has been referred to alternately as Temporal or Spatiotemporal GIS (TGIS) (Renolen 2000).

For the purpose of recording spatio-temporal data, a distinction has been made between spatial objects that change their location (Frihida et al. 2002) as opposed to spatial locations that change their values over time (Marceau et al. 2001). This is analogous to the “fields” versus “objects” debate of the 1980s (Couclelis 1992). For example, the change in location of a police patrol car over time can be considered to be an object view of the world, while the observation of changes at fixed, regularly defined coordinate locations as they transition from one land use to another (e.g. agriculture to a new housing subdivision) can be considered to be a field-based view. Further, the change itself can be recorded either as a continuous timeline, at discrete time intervals where the occurrence of a transition between states is marked by “events”, or as a combination of the two (Hornsby and Egenhofer 2000, Renolen 2000, Wang and Cheng 2001).

Because of these differences in the way spatio-temporal data is represented, a different data structure or model is often necessary to optimally store and retrieve the different types of information. For example, recording the location of a spatial object that is moving continuously is better represented using an object-oriented data model, where the object can encapsulate information about its timeline within itself (Raper and Livingstone 1995, Wachowicz 1999, Frihida et al. 2002). For a temporal GIS with a “field-view” of the real world, where the information collected pertains to changes in value for a location over time, several data models and data structures have been proposed in the literature (Peuquet and Duan 1995, Theodoridis et al. 1996, Tzouramanis et al. 2000). Further, Peuquet (1994) proposed the combination of the object-based and location-based approaches with a time-based representation to develop a “Triad framework”. This framework allows users to query spatio-temporal data using object-based, field-based, and time-based representations.

This research operationalizes Peuquet and Duan’s (1995) Event-based Spatio-Temporal Data Model (ESTDM) using some of the elements of Overlapping R-trees (Guttman 1984, Tzouramanis et al. 2000) to create a Hybrid Spatio-Temporal Data Model and Structure (HST-DMS). HST-DMS, therefore, utilizes an event-oriented, field-based view, and is designed to reduce retrieval time and storage space requirements when recording changes in values for specific coordinate locations over time.

2 Conceptual Organization and Implementation Details of Spatio-temporal Data Models

The “snapshot” data model discussed by Armstrong (1988) and the “space-time composite” model of Langran and Chrisman (1988) were the initial attempts to conceptually organize spatio-temporal data. In the snapshot model, temporal information was simply encoded with each spatial layer to identify it as a static view of the world at any given time. The space-time composite improved upon the snapshot model by

© Blackwell Publishing Ltd. 2004


Figure 3 Data Model used in HST-DMS

associated with each time as a list of “events”, where each element of the list refers to spatial changes in relation to a previous state (Figure 1). Figure 3 shows the structure of the new data model with the help of a sample 16-cell raster dataset designed to represent traditional land use change. As shown in this figure, parts of the dataset change with time, while some parts never change throughout the entire time period represented by t_0 to t_3. Further, only a small number of cells actually change between each time step. So, it is unnecessary to repeatedly store the unchanged part at each time step.

In ESTDM, this fact is well recognized, and the element associated with time t_i only stores a record of the cells that change between t_{i−1} and t_i. Therefore, it is very space efficient. However, ESTDM is only efficient for specific types of temporal queries. In order to retrieve a snapshot view of the spatial pattern at any given time t_i, the search must begin with the base map at time t_0 and proceed by traversing the event list and retrieving every single change associated with each element, until the element representing time t_i is traversed. Therefore, the search can become very time-consuming.

HST-DMS attempts to circumvent this problem by storing the change between time steps as two separate groups: one records the elements that have changed since the previous time step (called “change map at t_i”), and the other records elements that have changed sometime in a past time step (called “complement map at t_i”). Further, the “base map” only stores those elements that never change through the entire time period represented by the event list. Conversely, a “starting complement map” records those elements that will change sometime during the time period represented by the event list. Note that the base map and starting complement map are associated with the starting time t_0. A change map at t_i and complement map at t_i are associated with the time node t_i (where i > 0).

Obviously, the new data model requires significant processing when a new time step is added to existing information. In addition to creating a new change map and complement map for the added time step, a new base map and starting complement map will also have to be created. Further, storing a complement map for each time step adds storage requirements that ESTDM does not have.
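The decomposition described above can be illustrated with a minimal Python sketch. It is not the authors' C++ prototype: the function names `build_maps` and `snapshot` and the dictionary-based storage are hypothetical, and the sketch assumes that the complement map at t_i holds every non-base cell that did not change at t_i (with its value at t_i), so that base map, change map and complement map together cover the whole raster.

```python
from typing import Dict, List, Tuple

Cell = Tuple[int, int]      # (row, column)
Grid = List[List[int]]      # one raster snapshot


def build_maps(snapshots: List[Grid]):
    """Split equally sized raster snapshots into HST-DMS-style maps (illustrative)."""
    rows, cols = len(snapshots[0]), len(snapshots[0][0])
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    first = snapshots[0]

    # Base map: cells whose value is the same in every snapshot.
    base: Dict[Cell, int] = {
        (r, c): first[r][c]
        for (r, c) in cells
        if all(s[r][c] == first[r][c] for s in snapshots)
    }
    # Starting complement map: value at t_0 of every cell that changes at some time.
    start_complement: Dict[Cell, int] = {
        (r, c): first[r][c] for (r, c) in cells if (r, c) not in base
    }

    change_maps: List[Dict[Cell, int]] = []
    complement_maps: List[Dict[Cell, int]] = []
    for i in range(1, len(snapshots)):
        prev, cur = snapshots[i - 1], snapshots[i]
        # Change map at t_i: cells whose value differs from t_(i-1).
        change = {
            (r, c): cur[r][c]
            for (r, c) in start_complement
            if cur[r][c] != prev[r][c]
        }
        # Complement map at t_i (assumption): remaining non-base cells, valued at t_i.
        complement = {
            (r, c): cur[r][c] for (r, c) in start_complement if (r, c) not in change
        }
        change_maps.append(change)
        complement_maps.append(complement)
    return base, start_complement, change_maps, complement_maps


def snapshot(i, shape, base, start_complement, change_maps, complement_maps) -> Grid:
    """Rebuild the raster at time step i from three maps, without walking earlier steps."""
    rows, cols = shape
    grid = [[None] * cols for _ in range(rows)]
    if i == 0:
        parts = [base, start_complement]
    else:
        parts = [base, change_maps[i - 1], complement_maps[i - 1]]
    for part in parts:
        for (r, c), value in part.items():
            grid[r][c] = value
    return grid
```

Under these assumptions, `snapshot` touches only the maps attached to one time node, whereas an ESTDM-style reconstruction would replay every change list from t_0 up to t_i. Note that `build_maps` must be rerun whenever a snapshot is added, which mirrors the reprocessing cost noted in the text.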
However, HST-DMS has some advantages over ESTDM during querying that offset the additional processing time imposed by the creation of complement maps. For example, in order to create a snapshot of time t_i, the required data can be assembled by retrieving the base map, the change map at t_i, and the complement map at t_i (where i > 0). The additional space requirements imposed by the creation of complement maps are offset to some extent by the use of overlapping data structures discussed in the following section.

3.2 Implementation using Overlapping R-Trees

The three basic commonly used data structures for spatial data are arrays, linked lists and trees, all of which have different advantages and disadvantages with respect to storage requirements and search efficiency. Two-dimensional arrays consisting of rows and columns of data elements are structurally similar to a raster data model and are therefore the means for implementing such a model. Linked lists are more complicated structures, where individual data elements are connected to each other through the use of “pointers”. Retrieval of information from linked lists requires the traversal of the list from a starting to an ending element by following the pointers. As the list grows, however, the linear traversal of elements can take significant time. Therefore, tree structures have been created in order to organize information in “leaf nodes” located under different “branch nodes” and connected via pointers. Trees have a hierarchical structure, with the main tree composed of several sub-trees.

To test the feasibility of HST-DMS in storing spatio-temporal information, a prototype was developed and implemented using the C++ programming language. Because of their individual characteristics and attendant advantages and disadvantages, the different data structures described above were used to implement different components of the HST-DMS data model (Figure 4). A doubly linked list was used to store the event list that keeps track of all the spatial changes between two temporal states (Figure 4). The complement maps were stored using an “overlapping R-tree” (Guttman 1984,


Table 1 Contents of a record for the dynamic array representing change map at time t_1

Position        Attribute   Time   Pointers
Row   Column                       Link1 (L1)       Link2 (L2)
0     2         2           1      (0, 2) in t_0    (0, 2) in t_2
0     3         2           1      (0, 3) in t_0    (0, 3) in t_2

4 Results: Storage and Search Efficiency

4.1 Testing the ability of HST-DMS to efficiently store real-world data

To test the ability of the prototype to efficiently store real-world data, snapshots of several raster layers marking the expansion of the areal extent of the city of Carbondale, Illinois, extracted from a sequence of aerial photographs taken at decadal intervals from the 1940s to 1990s (Figure 5), were entered into HST-DMS. The experimental dataset was derived from six Digital Orthophoto Quarter Quadrangle (DOQQ) images of Carbondale for 1938, 1952, 1959, 1970, 1980 and 1993, rectified to be in the same coordinate system using historical landmarks. Note from the images that the growth of the urban areas between time periods is not systematic or regular, and varies between different time intervals.

Several transformations were performed using the ArcView GIS software before the data could be stored in HST-DMS. First, the urban extents were visually identified from each image and digitized as vector polygons using on-screen digitizing techniques. Next, the vector themes were converted to grids (consisting of 30 m by 30 m cells) and the grid cells assigned attributes for two classes: 1 representing urban areas, and 0 representing non-urban areas. Finally, the grid was exported to the ASCII text file format, where the first six lines record the description for the image and, from the seventh line onwards, the lines represent the attribute values of individual cells. In total, the data for each year had 149 rows and 200 columns (29,800 cells).

The ASCII files were then read by a program to populate HST-DMS. First, the values of the cells for all the images were stored in a simple three-dimensional array (time, row, column), from which the data was transferred into HST-DMS. All the spatio-temporal search operations were performed utilizing the new structure.

Within HST-DMS, it was found that 20,287 of the 29,800 cells (or 68% of the total data) never change their values throughout the recorded time period (i.e. 1938–93), and can therefore be stored in the base map only once. The remaining 9,513 cells are stored in the change map and complement map for each time period. Interestingly, even though 32% of the total cells change at some point during the period spanning 1938 to 1993, less than 15% of the total cells changed between any two given snapshots. Because of this, the use of an overlapping R-tree to store the complement map resulted in even greater savings in storage space than initially expected. A total of 16,255 data nodes and 18,529 branch nodes were required in the overlapping R-tree, whereas a similar non-overlapping structure developed for comparison purposes resulted in 38,842 data nodes and 42,105 branch nodes. An analysis of the R-tree structure indicated that the size of each data node was 25 bytes and the size of each


Figure 5 Expansion of the city boundaries of Carbondale, Illinois, USA
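The data preparation described in Section 4.1 (an ASCII grid with six description lines followed by rows of cell values, then a count of the cells that never change) might be sketched in Python as follows. This is only an illustration, not the program used by the authors: `parse_ascii_grid` and `never_changing_cells` are hypothetical names, the header is assumed to consist of simple "key value" pairs, and the sample grid is a made-up toy raster rather than the Carbondale data.

```python
from typing import Dict, List, Tuple

Grid = List[List[int]]


def parse_ascii_grid(text: str) -> Tuple[Dict[str, float], Grid]:
    """Parse an ASCII grid: six header lines, then one line per raster row."""
    lines = text.strip().splitlines()
    header: Dict[str, float] = {}
    for line in lines[:6]:
        # Assumption: each header line is "key value", e.g. "ncols 200".
        key, value = line.split()
        header[key.lower()] = float(value)
    grid = [[int(v) for v in line.split()] for line in lines[6:]]
    return header, grid


def never_changing_cells(grids: List[Grid]) -> int:
    """Count cells whose value is identical in every snapshot (base-map candidates)."""
    first = grids[0]
    return sum(
        1
        for r in range(len(first))
        for c in range(len(first[0]))
        if all(g[r][c] == first[r][c] for g in grids)
    )


# Toy input (hypothetical values): a 2 x 3 grid of urban (1) / non-urban (0) cells.
SAMPLE = """ncols 3
nrows 2
xllcorner 0
yllcorner 0
cellsize 30
NODATA_value -9999
0 0 1
0 1 1
"""

header, grid_a = parse_ascii_grid(SAMPLE)
grid_b = [[0, 1, 1], [0, 1, 1]]   # a later toy snapshot in which one cell became urban
stable = never_changing_cells([grid_a, grid_b])
total = int(header["nrows"] * header["ncols"])
share = stable / total            # fraction of cells that would go into the base map
```

Applied to the figures reported in the text, the same ratio is 20,287 / 29,800, or roughly 68%, which leaves 9,513 cells for the change and complement maps.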


Table 3 Algorithm used to retrieve data in response to a spatial key query

Operation                                                    Steps
if Basemap[i][j] != -1                                       1
    Retrieve the attribute
else {
    Access the time node                                     5
    if (i, j) is in Changed Map of t_i
        Retrieve the attribute
    else
        Retrieve the attribute from Complement Map of t_i    1
}

For a spatial key query, the maximum number of accesses required by HST-DMS to respond to the query is 7 (1 + 5 + 1). The rationale for this time complexity results from the algorithm used to extract data from HST-DMS, as shown in Table 3. A doubling of the time periods will simply double the number of temporal pointers and increase the number of accesses to 13 (1 + 11 + 1), giving a time complexity of O(N). On the other hand, the 3D array requires access to the snapshot representing the time period t_i from which the value of a cell can be extracted (i.e. worst-case complexity of 1). Doubling the intervening time periods does not change the number of accesses, giving a time complexity of O(1). ESTDM requires that the base map, and all subsequent cell value modifications until the desired time period t_i, be accessed in order to reconstruct the current value of the location. If all the cells in the base map change their value at every transition, then n accesses are required for each time period, giving a worst-case complexity of 6n for the six intervening time periods. A doubling of the time periods can increase the worst-case scenario to 12n, giving a time complexity of O(N).

The number of accesses for the time-range query is 2n for both HST-DMS and the 3D array. The reason is the creation of two snapshots for time nodes t_i and t_j respectively, from which the cell values that change between two time periods can be identified. A doubling of the intervening time period does not change access requirements, giving a time complexity of O(1) for both structures. For ESTDM, the results are similar to performing a snapshot query because a static representation of the landscape at time t_i and t_j is required. However, because both the snapshots are generated using a single traversal of the time nodes starting from the base map, and each cell can change its value at every time step, the worst-case time complexity is 6n. A doubling of the time steps leads to a doubling of the number of accesses to 12n and a time complexity of O(N). Because a key time-range query can be performed upon the results obtained from the time-range query, the time complexity for these two queries for the three data structures is identical.

A spatial time-range query is useful to retrieve information about changes to certain locations over a period of time (e.g. as a parcel of land transitions from forest to pasture to urban subdivision). For the worst case, the number of accesses required is 6n in HST-DMS. First, an initial snapshot must be generated (i.e. n), followed by the creation of a snapshot for every time period. The doubling of the time periods leads to a doubling of the number of required accesses as well (i.e. 12n), leading to a time complexity of O(N).
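The spatial key lookup of Table 3 can be sketched in Python as below. This is a hedged illustration rather than the prototype's code: the function name `spatial_key_query`, the representation of the event list as a Python list of (change map, complement map) pairs, and the way accesses are counted (one for the base map test, one per time node visited, one for the final map lookup) are all assumptions chosen to reproduce the 1 + 5 + 1 count given in the text.

```python
from typing import Dict, List, Tuple

Cell = Tuple[int, int]
NO_VALUE = -1  # Table 3 sentinel: the base map holds -1 where the cell is stored elsewhere


def spatial_key_query(
    i: int,
    j: int,
    t: int,
    base_map: List[List[int]],
    time_nodes: List[Tuple[Dict[Cell, int], Dict[Cell, int]]],
) -> Tuple[int, int]:
    """Return (attribute, accesses) for cell (i, j) at time step t, with t >= 1.

    time_nodes[k] holds the (change map, complement map) pair for t_(k+1).
    """
    if t < 1 or t > len(time_nodes):
        raise ValueError("t must identify an existing time node (t >= 1)")

    accesses = 1  # test the base map entry
    if base_map[i][j] != NO_VALUE:
        return base_map[i][j], accesses

    # Walk the event list from t_1 to t_t, as a linked-list traversal would.
    node = time_nodes[0]
    for step in range(t):
        node = time_nodes[step]
        accesses += 1

    change_map, complement_map = node
    accesses += 1  # one lookup in the change map or the complement map
    if (i, j) in change_map:
        return change_map[(i, j)], accesses
    return complement_map[(i, j)], accesses
```

With five time nodes after t_0, as in the six-snapshot Carbondale series, a query for the last time step on a cell outside the base map costs 1 + 5 + 1 = 7 accesses under this counting, and eleven time nodes give 1 + 11 + 1 = 13, so the cost grows linearly with the number of time nodes.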


For this spatial time-range query, a similar number of accesses and the same time complexity are required by both the 3D array and ESTDM (i.e. with ESTDM, the query requires starting from a base map to build up snapshots for all intervening time periods).

Therefore, the above results indicate that in most cases, HST-DMS has significantly better worst-case time complexities than ESTDM. This can be attributed to the fact that HST-DMS can recreate a snapshot for a given time period t_i without having to traverse all the time nodes starting from the base map. This can be very useful in a worst-case scenario, where every cell has a value which may be different from its value for a preceding time period. On the other hand, a 3D array is as efficient as HST-DMS for all queries, and has better performance for a spatial key query.

However, if average-case scenarios are considered, HST-DMS may have a better time complexity than 3D arrays for some types of spatio-temporal range queries. For example, if only half the cells changed their value between two adjacent time periods, then this information would be stored in a change map that is effectively n/2 in size. A complete search of this map would require only n/2 accesses, while retrieval of similar information from a 3D array would require 2n accesses, a four-fold increase.

5 Discussion and Conclusions

HST-DMS has two advantages: implicit spatial and explicit temporal topology inherited from ESTDM, and an efficient spatio-temporal storage and query mechanism inherited from the use of base, change and complement maps to implement the data model. Additional temporal topology (as shown in Table 1) can be used to track the temporal history of individual cells, and to answer complicated spatio-temporal queries related to the temporal history of these cells (e.g. when did a cell, or a group of cells, convert from agriculture to urban use). Further, HST-DMS is raster-based, making it suitable for integration with a range of environmental modeling software such as SLEUTH, SPARKS, AGNPS and MODFLOW, as well as for storing land use information which is often extracted from tessellated satellite imagery. In comparing the three raster-based spatio-temporal data structures, a 3D array is as efficient as (or more efficient than) HST-DMS for worst-case data retrieval scenarios (Table 2), but HST-DMS has better average-case time complexities for specific cases, and is a definite improvement over ESTDM when considering worst-case time complexities.

One of the disadvantages of the proposed data model, however, is that the database established by the new data structure is static. Unlike ESTDM, addition or deletion of information requires that the constituent components (i.e. base map, change map, complement map) be reconstructed to accommodate spatio-temporal change. The goal of this research, however, was to create a data structure that allowed efficient searching of very large databases while maintaining implicit temporal topology, a goal that HST-DMS is capable of meeting. Whether a dynamic database representing HST-DMS is practicable is under consideration, and a subject for future research. Another issue not covered by the creation of this data model is “transaction time”, the time when the data is entered into the database. The data model is limited to “valid time”, or the time when the event actually occurred in the real world. Development of a bi-temporal HST-DMS, consisting of both valid time and transaction time, is also a possible future development.

In conclusion, HST-DMS combines two elements of research (i.e. Peuquet and Duan’s (1995) Event-based Spatio-Temporal Data Model (ESTDM) and the Overlapping R-tree (Guttman 1984, Tzouramanis et al. 2000)) with a focus on improved data storage and search efficiency for very large spatio-temporal databases. As the availability of spatio-temporal information grows exponentially, the management of such databases is likely to form a core requirement for a number of environmental modeling applications. Finally, the utility of HST-DMS in managing spatio-temporal data is demonstrated with the help of a prototype used to store data about the expansion of the city boundaries of Carbondale, Illinois, and by providing some theoretical estimates of the search efficiency of the proposed model/structure.

References

Armstrong M P 1988 Temporality in spatial databases. In Proceedings of GIS/LIS ’88 (Volume 2). Bethesda, MD, American Congress on Surveying and Mapping: 880–9

Couclelis H 1992 People manipulate objects (but cultivate fields): Beyond the raster-vector debate in GIS. In Frank A U, Campari I, and Formentini U (eds) Theories and Methods of Spatial-Temporal Reasoning in Geographic Space. Berlin, Springer-Verlag Lecture Notes in Computer Science No. 639: 65–7

Finkel R A and Bentley J L 1974 Quadtrees: A data structure for retrieval on composite keys. Acta Informatica 4: 1–9

Frihida A, Marceau D J, and Theriault M 2002 Spatio-temporal object-oriented data model for disaggregate travel behavior. Transactions in GIS 6: 277–94

Goodchild M F, Haining R, and Wise S 1992 Integrating GIS and spatial data analysis: Problems and possibilities. International Journal of Geographical Information Systems 6: 407–23

Guttman A 1984 R-trees: A dynamic index structure for spatial searching. In Proceedings of the ACM SIGMOD Conference. New York, Association for Computing Machinery: 47–57

Hornsby K and Egenhofer M J 2000 Identity-based change: A foundation for spatio-temporal knowledge representation. International Journal of Geographical Information Science 14: 207–24

Langran G and Chrisman N R 1988 A framework for temporal geographic information. Cartographica 25: 1–13

Marceau D J, Guindon L, Bruel M, and Marios C 2001 Building temporal topology in a GIS database to study the land-use changes in a rural-urban environment. Professional Geographer 53: 546–58

Mason D C, O’Conaill M A, and Bell S B M 1994 Handling four-dimensional geo-referenced data in environmental GIS. International Journal of Geographical Information Systems 8: 191–215

Morton G M 1966 A Computer Oriented Geodetic Data Base and a New Technique in File Sequencing. Ottawa, IBM

Nascimento M A and Silva J R 1998 Towards Historical R-trees. In Proceedings of the ACM Symposium on Applied Computing (ACM-SAC’98), Atlanta, Georgia. New York, Association for Computing Machinery: 235–40

Nyerges T L 1993 Understanding the scope of GIS: Its relationship to environmental modeling. In Goodchild M F, Parks B O, and Steyaert L T (eds) Environmental Modeling with GIS. New York, Oxford University Press: 75–84

Peuquet D J 1994 It’s about time: A conceptual framework for the representation of temporal dynamics in Geographic Information Systems. Annals of the Association of American Geographers 84: 441–61

Peuquet D J and Duan N 1995 An event-based spatiotemporal data model (ESTDM) for temporal analysis of geographical data. International Journal of Geographical Information Systems 9: 7–23

Raper J and Livingstone D 1995 Development of a geomorphological spatial model using object-oriented design. International Journal of Geographical Information Systems 9: 359–83

Renolen A 2000 Modelling the real world: Conceptual modelling in spatiotemporal information system design. Transactions in GIS 4: 23–42


Samet H 1989 The Design and Analysis of Spatial Data Structures. Reading, MA, Addison-Wesley

Theodoridis T, Vazirgiannis M, and Sellis T 1996 Spatio-Temporal Indexing for Large Multimedia Applications. In Proceedings of the Third IEEE Conference on Multimedia Computing and Systems (ICMCS ’96). New York, Institute of Electrical and Electronics Engineers: 441–8

Tsotras V J, Jensen C S, and Snodgrass R T 1997 A Notation for Spatiotemporal Queries. WWW document, http://www.cs.auc.dk/research/DP/tdb/TimeCenter/TimeCenterPublications/TR-10.pdf

Tzouramanis T, Vassilakopoulos M, and Manolopoulos Y 2000 Overlapping linear quadtrees and spatio-temporal query processing. The Computer Journal 43: 325–43

Yuan M 1999 Use of a three-domain representation to enhance GIS support for complex spatiotemporal queries. Transactions in GIS 3: 137–59

Wachowicz M 1999 Object-oriented Design for Temporal GIS. London, Taylor and Francis

Wang D and Cheng T 2001 A spatio-temporal data model for activity-based transport demand modelling. International Journal of Geographical Information Science 15: 561–85

Worboys M F 1992 A model for spatio-temporal information. In Bresnahan P, Corwin E, and Cowen D (eds) Proceedings of the Fifth International Symposium on Spatial Data Handling (Volume 2). San Jose, CA, American Congress on Surveying and Mapping: 602–11

Worboys M F 1994 A unified model for spatial and temporal information. The Computer Journal 37: 26–34

Xu X, Han J, and Lu W 1990 RT-tree: An Improved R-tree Index Structure for Spatiotemporal Databases. In Proceedings of the Fourth International Symposium on Spatial Data Handling, Zurich, Switzerland: 1040–9
