Tutorial on Spatial and Spatio-Temporal Data Mining Part II ... - UFSC


Tutorial on Spatial and Spatio-Temporal Data Mining
Part II – Trajectory Knowledge Discovery

Vania Bogorny
Universidade Federal de Santa Catarina
www.inf.ufsc.br/~vania
vania@inf.ufsc.br

Outline
• The wireless explosion
• Moving Object Data and Mobility Data Analysis
• Trajectory Patterns
• Geometric Trajectory Pattern Mining Methods
• Semantic Trajectory Pattern Mining Methods
• Trajectory Data Mining Tools
[Figures: a geometric and a semantic pattern over trajectories T1 to T4. Legend: H = Hotel, R = Restaurant, C = Cinema]


The Wireless Explosion (Fosca Giannotti 2007 – www.geopkdd.eu)
Have you ever felt that you are being tracked?

The Wireless Explosion
• The world is becoming more and more mobile, with easy access to smartphones, GPS, etc.
• Satellite services, sensors, and wireless technologies are rapidly improving
• Large volumes of spatio-temporal data are being generated


The Wireless Explosion (Fosca Giannotti 2007 – www.geopkdd.eu)
• Mobile devices leave behind digital traces that are collected as trajectories, describing the movement of their users
• Mobile devices generate a new type of data, called "Trajectories of Moving Objects"

Mobility Data Analysis


Mobility Data Analysis
Several analyses may be performed over trajectories:
• How do people move around the town during the day, during the week, etc.?
• Are there typical movement behaviours? In a certain area at a certain time?
• How have people's movement habits in this area changed over the last decade, year, month, or day?
• Are there relations between the movements of two areas?
• Are there periodic movements?

Mobility Data Analysis: Applications
Trajectory data analysis may be useful in several application domains.
Vehicle Monitoring
• Transportation companies monitor their trucks
• Insurance companies use GPS devices to monitor insured vehicles and reduce insurance prices
Traffic Analysis
• Alert people about traffic jams, accidents, etc.
• Identify or predict low-traffic regions in a city


Mobility Data Analysis
Animal Migration / Behaviour Analysis
• What are the trajectories of a given migratory bird?
• Where do birds stop? For how long?
• What is the migration pattern of a certain species?
Fishing Analysis and Control
• Are boats really fishing in allowed areas?
• Can we classify vessel trajectories?

Mobility Data Analysis
Weather prediction and movement analysis
• Hurricane tracking


Trajectory Data (Giannotti 2007 – www.geopkdd.eu)
Spatio-temporal data, represented as a set of points located in space and time:
$T = (x_1, y_1, t_1), \ldots, (x_n, y_n, t_n)$, meaning that the position in space at time $t_i$ was $(x_i, y_i)$

Tid   position (x, y)           time (t)
1     48.890018   2.246100      08:25
1     48.890018   2.246100      08:26
...   ...                       ...
1     48.890020   2.246102      08:40
1     48.888880   2.248208      08:41
1     48.885732   2.255031      08:42
...   ...                       ...
1     48.858434   2.336105      09:04
1     48.853611   2.349190      09:05
...   ...                       ...
2     ...                       ...


Trajectories: Overall Characteristics (Andrienko 2008)
1. Geometric shape
2. Length (travelled distance)
3. Duration (in time)
4. Speed
   – Mean and maximal speed
   – Acceleration, deceleration
5. Direction
   – Periods of straight, curvilinear, circular movement
More...

Relationships
Many types of relations may be of interest, depending on the problem:
• Similarity or difference of the overall characteristics of the trajectories (e.g. shapes, travelled distances, durations, dynamics of speed and directions)
• Spatial and temporal relations:
   – co-location in space (i.e. the trajectories consist of the same positions or have some positions in common);
   – co-existence in time (i.e. the trajectories are collected during the same time period or the periods overlap);
   – co-incidence in space and time (i.e. the same positions are attained at the same time);
   – distances in space and in time.
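The overall characteristics listed above (length, duration, mean and maximal speed) can be computed directly from the raw samples. The following is a minimal sketch, not taken from the cited work: it assumes samples are (latitude, longitude, time in seconds) tuples, uses the haversine formula with a mean Earth radius as an approximation, and the names `haversine_m` and `summarize` are hypothetical.

```python
import math
from typing import Dict, List, Tuple

# Assumed sample layout: (latitude in degrees, longitude in degrees, time in seconds)
Point = Tuple[float, float, float]

def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Approximate great-circle distance in metres between two positions."""
    r = 6_371_000.0  # mean Earth radius in metres (approximation)
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def summarize(traj: List[Point]) -> Dict[str, float]:
    """Length (m), duration (s), mean and maximal speed (m/s) of one trajectory."""
    length = 0.0
    max_speed = 0.0
    for (la1, lo1, t1), (la2, lo2, t2) in zip(traj, traj[1:]):
        d = haversine_m(la1, lo1, la2, lo2)
        length += d
        if t2 > t1:  # skip duplicate or out-of-order timestamps
            max_speed = max(max_speed, d / (t2 - t1))
    duration = traj[-1][2] - traj[0][2] if traj else 0.0
    mean_speed = length / duration if duration > 0 else 0.0
    return {
        "length_m": length,
        "duration_s": duration,
        "mean_speed": mean_speed,
        "max_speed": max_speed,
    }

if __name__ == "__main__":
    # Two samples from the table above, with times converted to seconds since 08:41
    print(summarize([(48.888880, 2.248208, 0.0), (48.885732, 2.255031, 60.0)]))
```

Shape and direction are left out of the sketch because they need a choice of representation (for example, a sequence of headings) that the slides do not specify.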


Trajectory Patterns

Mining Trajectories: Clustering (Fosca Giannotti 2007 – www.geopkdd.eu)
• Group together similar trajectories
• For each group, produce a summary
[Figure omitted; its legend marks a grid cell]


Mining Trajectories: Frequent Patterns (Fosca Giannotti 2007 – www.geopkdd.eu)
• Frequently followed paths
[Figure omitted; its legend marks a grid cell]

Mining Trajectories: Classification Models (Fosca Giannotti 2007 – www.geopkdd.eu)
• Extract behaviour rules from history
• Use them to predict the behaviour of future users
[Figure omitted: branches labeled 20%, 7%, 5%, 60%, and 8%, plus one unknown (?) branch; its legend marks a grid cell]


Trajectory Data Mining Methods

Spatio-Temporal Data Mining Methods
Two approaches:
• Geometry-based spatio-temporal data mining
   – Density-based clustering methods
   – Focus on physical similarity
   – Consider only geometrical properties of trajectories (space and time)
• Semantic-based spatio-temporal data mining
   – Also deals with sparse data
   – Patterns are computed based on the semantics of the data
   – Trajectories are pre-processed to enrich the data


Geometry-based Trajectory Data Mining Methods

General Geometric Trajectory Patterns


Relative Motion Patterns (Laube 2004)
Proposes 5 kinds of trajectory patterns based on movement, direction, and location: convergence, encounter, flock, leadership, and recurrence.
• Convergence: at least m entities pass through the same circular region of radius r, not necessarily at the same time (e.g. people moving to a train station)
[Figure: trajectories T1 to T5 illustrating convergence]

Relative Motion Patterns (Laube 2004)
• Flock: at least m entities are within a region of radius r and move in the same direction during a time interval >= s (e.g. traffic jam)
• Leadership: at least m entities are within a circular region of radius r, they move in the same direction, and at least one of the entities has been heading in that direction for at least t time steps (e.g. bird migration, traffic accident)
• Encounter: at least m entities will be concurrently inside the same circular region of radius r, assuming they keep moving with the same speed and direction (e.g. a traffic jam at some moment if cars keep moving in the same direction)
[Figure: trajectories T1 to T3 illustrating leadership, encounter, and flock]
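The convergence definition above can be checked for one candidate region with a few lines of code. This is only a sketch under strong assumptions: coordinates are planar, the circle centre is supplied by the caller rather than searched for, and only recorded sample points are tested (a trajectory that crosses the circle between two samples is missed). The function names are hypothetical and do not come from the cited work.

```python
import math
from typing import Dict, List, Tuple

Sample = Tuple[float, float, float]  # assumed layout: (x, y, t) in planar coordinates

def passes_through(traj: List[Sample], center: Tuple[float, float], r: float) -> bool:
    """True if at least one sample of the trajectory lies within distance r of center."""
    cx, cy = center
    return any(math.hypot(x - cx, y - cy) <= r for x, y, _t in traj)

def is_convergence(trajs: Dict[str, List[Sample]],
                   center: Tuple[float, float], r: float, m: int) -> bool:
    """Slide definition: at least m entities pass through the circle, at any time."""
    hits = [tid for tid, traj in trajs.items() if passes_through(traj, center, r)]
    return len(hits) >= m

if __name__ == "__main__":
    trajs = {
        "T1": [(0.0, 0.0, 0.0), (5.0, 5.0, 10.0)],
        "T2": [(9.0, 9.0, 3.0), (5.5, 5.0, 20.0)],
        "T3": [(20.0, 20.0, 0.0), (30.0, 30.0, 10.0)],
    }
    print(is_convergence(trajs, center=(5.0, 5.0), r=1.0, m=2))
```

Timestamps are ignored on purpose, since the definition says the entities need not be in the region at the same time; a flock check would additionally compare times and headings.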


Relative Motion Patterns (Laube 2004)
• Recurrence: at least m entities visit a circular region at least k times
[Figure: recurrence]

Extensions of the work proposed by [Laube 2004, 2005]
Gudmundsson (2006)
• Computes the longest-duration flock patterns
• The longest pattern is the flock with the longest duration that has at least a minimal number of trajectories
Gudmundsson (2007)
• Proposes approximate algorithms for computing the leadership, encounter, convergence, and flock patterns
• Focuses on performance issues


Frequent Trajectory Patterns

Frequent Mobile Group Patterns (Hwang, 2005)
• A group pattern is a set of trajectories that stay close to each other (at a distance less than a given minDist) for a minimal amount of time (minTime)
• Direction is not considered
• Frequent groups are computed with the Apriori algorithm
• Group pattern parameters: time, distance, and minsup
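The closeness condition behind a group pattern can be illustrated for a single pair of trajectories. The sketch below is an assumption-laden simplification: both objects are sampled at shared integer timestamps, coordinates are planar, and the Apriori step that grows pairs into larger frequent groups (with minsup) is not shown. The names `longest_close_run` and `is_group` are hypothetical.

```python
import math
from typing import Dict, Tuple

Track = Dict[int, Tuple[float, float]]  # assumed layout: timestamp -> (x, y)

def longest_close_run(a: Track, b: Track, min_dist: float) -> int:
    """Longest time span over which a and b stay closer than min_dist
    at consecutive shared timestamps."""
    best = 0
    start = None  # timestamp at which the current close run began
    for t in sorted(set(a) & set(b)):
        if math.dist(a[t], b[t]) < min_dist:
            if start is None:
                start = t
            best = max(best, t - start)
        else:
            start = None
    return best

def is_group(a: Track, b: Track, min_dist: float, min_time: int) -> bool:
    """True if the pair stays within min_dist for at least min_time."""
    return longest_close_run(a, b, min_dist) >= min_time

if __name__ == "__main__":
    a = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (2.0, 0.0), 3: (3.0, 0.0)}
    b = {0: (0.0, 1.0), 1: (1.0, 1.0), 2: (2.0, 5.0), 3: (3.0, 1.0)}
    print(is_group(a, b, min_dist=2.0, min_time=1))
```

As in the slide, direction plays no role here: only distance and duration are tested.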


Co-Location Patterns (Cao 2006)
• Co-location episodes in spatio-temporal data
• Trajectories are spatially close in a time window and move together
[Figure: time windows w1 and w2]

TraClus (Han, 2007)
• Clustering algorithm (TraClus: Trajectory Clustering)
• Groups sub-trajectories
• Density-based
• Partition-and-group method:
   1) each trajectory is partitioned into a set of line segments (sub-trajectories) with length L defined by the user
   2) similar (close) segments are grouped
• Similarity is based on a distance function
• Interesting approach for hurricane trajectories
• Main drawback: clustering is based on spatial distance; time is not considered


Trajectory Sequential Patterns

Frequent Sequential Patterns (Cao, 2005)
Three main steps:
1. Transform each trajectory into a line with several segments
   – A distance tolerance measure is defined (similar to a buffer)
   – All trajectory points inside this distance are summarized in one segment
2. Group similar segments
   – Similarity is based on the angle and the spatial length of the segment
   – Segments with the same angle and length have their distance checked against a given distance threshold d
   – From the resulting groups, a mean segment is created
   – From this segment, a region (buffer) is created
3. Compute frequent sequences of regions, considering a minSup threshold


T-Patterns (Giannotti, 2007)
Sequential trajectory pattern mining
• Considers both space and time
• The objective is to describe frequent movements, considering the regions of interest visited during movements and the duration of those movements
Steps:
1. Compute or find regions of interest, based on dense spatial regions (time is not considered)
2. Select trajectories that intersect two or more regions in sequence, annotating the travel time from one region to another
3. Compute sequences of regions visited in the same time intervals

T-Patterns (Giannotti, 2007)
• Fix a set of pre-defined regions (A, B, C)
• Map each (x, y) of the trajectory to its region
• Sample pattern: $A \xrightarrow{20\ \text{min}} B$
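Step 2 and the "map each (x, y) to its region" idea can be sketched as follows. This is an illustration only, not the T-pattern algorithm itself: regions are assumed to be axis-aligned rectangles given in advance, times are in minutes, travel time is measured from the last sample inside one region to the first sample inside the next, and the frequency counting of step 3 is omitted. `region_of` and `transitions` are hypothetical names.

```python
from typing import Dict, List, Optional, Tuple

Rect = Tuple[float, float, float, float]  # assumed region shape: (xmin, ymin, xmax, ymax)
Sample = Tuple[float, float, float]       # assumed layout: (x, y, t in minutes)

def region_of(x: float, y: float, regions: Dict[str, Rect]) -> Optional[str]:
    """Name of the first region containing (x, y), or None if outside all regions."""
    for name, (x0, y0, x1, y1) in regions.items():
        if x0 <= x <= x1 and y0 <= y <= y1:
            return name
    return None

def transitions(traj: List[Sample], regions: Dict[str, Rect]) -> List[Tuple[str, float, str]]:
    """(from_region, travel_time, to_region) for each change of region along the trajectory."""
    out = []
    last_name, last_t = None, None
    for x, y, t in traj:
        name = region_of(x, y, regions)
        if name is None:
            continue  # samples outside every region only contribute travel time
        if last_name is not None and name != last_name:
            out.append((last_name, t - last_t, name))
        last_name, last_t = name, t
    return out

if __name__ == "__main__":
    regions = {"A": (0.0, 0.0, 1.0, 1.0), "B": (5.0, 5.0, 6.0, 6.0)}
    traj = [(0.5, 0.5, 0.0), (0.6, 0.6, 5.0), (3.0, 3.0, 15.0), (5.5, 5.5, 25.0)]
    print(transitions(traj, regions))
```

Each returned triple corresponds to one annotated transition of the kind shown in the sample pattern.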


T-Patterns (Giannotti, 2007)
• Detect significant regions through spatial clustering: around(x_1, y_1), around(x_2, y_2)
• Map each (x, y) of the trajectory to its region
• Sample pattern: $\text{around}(x_1, y_1) \xrightarrow{20\ \text{min}} \text{around}(x_2, y_2)$

Trajectory Classification
The idea is to classify types of trajectories


TraClass Algorithm (Lee 2008)
A two-step algorithm:
• First: region-based clustering
• Second: trajectory-based clustering
Main problem: time is not considered

TraClass Algorithm (Lee 2008)
Classifies subtrajectories instead of whole trajectories
Examples:
• Red trajectories move from Port A to Container Port and then to Port B
• Blue trajectories move from Port A to Refinery and then to Port B
Classifying whole trajectories would label all of them as moving from Port A to Port B


TraClass Algorithm (Lee 2008)
First: region-based clustering
• Trajectories are cut into segments (at fast changes of direction)
• Segments are then clustered by distance with DBSCAN
• One representative trajectory is generated for each cluster and labeled with a class
[Figure: (1) a set of trajectories TR1 to TR5; (2) Partition: a set of line segments; (3) Group: a cluster and its representative trajectory]

TraClass Algorithm (Lee 2008)
First: discover regions that have trajectories mostly of one class, regardless of their movement patterns


TraClass Algorithm (Lee 2008)
Second: trajectory-based clustering
• Extracts clusters of common movement patterns in non-homogeneous areas
• Grouping is based on the same class

Trajectory Outlier Detection


Trajectory Outlier Detection
• The objective is to find trajectories whose behaviour differs from that of other trajectories
• For instance:
   – A fishing vessel whose behaviour differs from that of other fishing vessels in the same area
   – A hurricane that may change behaviour in certain parts of its trajectory
   – Cars or pedestrians with suspicious behaviour

TraOD - Trajectory Outlier Detection (Lee 2008)
• Partition trajectories into subtrajectories
• Compare subtrajectories based on:
   – distance and length
• If a subtrajectory is not close to other trajectories for a minimal length
   – It is an outlier


TraOD - Trajectory Outlier Detection (Lee 2008)
• Example:
   – Looking at the whole trajectory, TR3 is not detected as an outlier, since its overall behaviour is similar to that of neighbouring trajectories
• Looking at the subtrajectories, TR3 can be an outlier
[Figure: trajectories TR1 to TR5, with an outlying sub-trajectory of TR3]

TraOD - Trajectory Outlier Detection (Lee 2008)
Two phases: partitioning and detection
[Figure: (1) Partition: a set of trajectories TR1 to TR5 becomes a set of trajectory partitions; (2) Detect: the outlying trajectory partitions mark TR3 as an outlier]


TraOD - Trajectory Outlier Detection (Lee 2008)
• Once trajectories are partitioned, trajectory outliers are detected based on both distance and density
• A trajectory is an outlier if it contains a sufficient amount of outlying t-partitions
[Figure: a t-partition L_i of TR_i is compared with the other trajectories, each either close or not close to it. L_i is an outlying t-partition in the case labeled ≤ 1‒p, and is not an outlying t-partition in the case labeled > 1‒p]

TraOD - Trajectory Outlier Detection (Lee 2008)
13 Outliers from Hurricane Data


Summary
These data mining approaches deal with Trajectory Samples

Tid   geometry                  timest
1     48.890018   2.246100      08:25
1     48.890018   2.246100      08:26
...   ...                       ...
1     48.890020   2.246102      08:40
1     48.888880   2.248208      08:41
1     48.885732   2.255031      08:42
...   ...                       ...
1     48.858434   2.336105      09:04
1     48.853611   2.349190      09:05
...   ...                       ...
1     48.853610   2.349205      09:40
1     48.860515   2.349018      09:41
...   ...                       ...
1     48.861112   2.334167      10:00
1     48.861531   2.336018      10:01
1     48.861530   2.336020      10:02
...   ...                       ...
2     ...                       ...

References
Laube, P. and Imfeld, S. (2002). Analyzing relative motion within groups of trackable moving point objects. In Egenhofer, M. J. and Mark, D. M., editors, GIScience, volume 2478 of Lecture Notes in Computer Science, pages 132–144. Springer.
Laube, P., Imfeld, S., and Weibel, R. (2005a). Discovering relative motion patterns in groups of moving point objects. International Journal of Geographical Information Science, 19(6):639–668.
Laube, P., van Kreveld, M., and Imfeld, S. (2005b). Finding REMO: Detecting Relative Motion Patterns in Geospatial Lifelines. Springer.
Lee, J.-G., Han, J., and Whang, K.-Y. (2007). Trajectory clustering: a partition-and-group framework. In Chan, C. Y., Ooi, B. C., and Zhou, A., editors, SIGMOD Conference, pages 593–604. ACM.
Li, Y., Han, J., and Yang, J. (2004). Clustering moving objects. In KDD '04: Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 617–622, New York, NY, USA. ACM Press.
Nanni, M. and Pedreschi, D. (2006). Time-focused clustering of trajectories of moving objects. Journal of Intelligent Information Systems, 27(3):267–289.


References
Verhein, F. and Chawla, S. (2006). Mining spatio-temporal association rules, sources, sinks, stationary regions and thoroughfares in object mobility databases. In Lee, M.-L., Tan, K.-L., and Wuwongse, V., editors, DASFAA, volume 3882 of Lecture Notes in Computer Science, pages 187–201. Springer.
Gudmundsson, J. and van Kreveld, M. J. (2006). Computing longest duration flocks in trajectory data. In [de By and Nittel 2006], pages 35–42.
Gudmundsson, J., van Kreveld, M. J., and Speckmann, B. (2007). Efficient detection of patterns in 2d trajectories of moving points. GeoInformatica, 11(2):195–215.
Hwang, S.-Y., Liu, Y.-H., Chiu, J.-K., and Lim, E.-P. (2005). Mining mobile group patterns: A trajectory-based approach. In Ho, T. B., Cheung, D. W.-L., and Liu, H., editors, PAKDD, volume 3518 of Lecture Notes in Computer Science, pages 713–718. Springer.
Cao, H., Mamoulis, N., and Cheung, D. W. (2006). Discovery of collocation episodes in spatiotemporal data. In ICDM, pages 823–827. IEEE Computer Society.
Zhenhui Li, Jae-Gil Lee, Xiaolei Li, Jiawei Han: Incremental Clustering for Trajectories. DASFAA (2) 2010: 32-46

More...
Huiping Cao, Nikos Mamoulis, David W. Cheung: Discovery of Periodic Patterns in Spatiotemporal Sequences. IEEE Trans. Knowl. Data Eng. 19(4): 453-467 (2007)
Panos Kalnis, Nikos Mamoulis, Spiridon Bakiras: On Discovering Moving Clusters in Spatiotemporal Data. SSTD, 364-381 (2005)
Florian Verhein, Sanjay Chawla: Mining spatio-temporal patterns in object mobility databases. Data Min. Knowl. Discov. 16(1): 5-38 (2008)
Florian Verhein, Sanjay Chawla: Mining Spatio-temporal Association Rules, Sources, Sinks, Stationary Regions and Thoroughfares in Object Mobility Databases. DASFAA, 187-201 (2006)
Cao, H., Mamoulis, N., and Cheung, D. W. (2005). Mining frequent spatio-temporal sequential patterns. In ICDM '05: Proceedings of the Fifth IEEE International Conference on Data Mining, pages 82–89, Washington, DC, USA. IEEE Computer Society.
Jae-Gil Lee, Jiawei Han, Xiaolei Li, and Hector Gonzalez, "TraClass: Trajectory Classification Using Hierarchical Region-Based and Trajectory-Based Clustering", Proc. 2008 Int. Conf. on Very Large Data Base (VLDB'08), Auckland, New Zealand, Aug. 2008.
Jae-Gil Lee, Jiawei Han, and Xiaolei Li, "Trajectory Outlier Detection: A Partition-and-Detect Framework", Proc. 2008 Int. Conf. on Data Engineering (ICDE'08), Cancun, Mexico, April 2008.


Semantic-based Spatio-temporal Data Mining Methods

Semantic Trajectory Data Mining
• The main idea is to enrich trajectories with domain semantic information in preprocessing steps
• This task can be done using data mining
• Data mining is then applied as a second step
• Mining is performed on semantically rich trajectories


Geometric Patterns vs. Semantic Patterns (Bogorny 2008)
[Figure: a geometric pattern and a semantic trajectory pattern over trajectories T1 to T4. Legend: H = Hotel, R = Restaurant, TP = Touristic Place; CC also appears as a label]
Semantic trajectory pattern:
(a) Hotel to Restaurant, passing by CC
(b) go to Cinema, passing by CC

Geometric Patterns vs. Semantic Patterns (Bogorny 2008)
There is very little or no semantics in most DM approaches for trajectories
Consequence:
• Patterns are purely geometrical
• Patterns are difficult to interpret from the user's point of view
• Semantic patterns, which can be independent of spatial location, are not discovered


DJ-Cluster (Zhou 2007)
• DJ-Cluster is a variation of DBSCAN
• Focuses on performance issues
• Objective: find interesting places of individual trajectories
• Clusters are computed from a SET of trajectories of the same object
• Time is not considered

A Conceptual View on Trajectories (Spaccapietra 2008)
A trajectory is a spatio-temporal thing (an object) that
• has generic features (generic: application independent)
• has semantic features (semantic: application dependent)
A trajectory is more than a moving object


Semantic Trajectories - Motivation
Trajectory Samples (x, y, t)
Geographic Data
Geographic Data + Trajectory Data = Semantic Trajectories

The Model of Stops and Moves (Spaccapietra 2008)
STOPS
• Important parts of trajectories
• Where the moving object has stayed for a minimal amount of time
• Stops are application dependent
   – Tourism application: hotels, touristic places, airport
   – Traffic management application: traffic lights, roundabouts, big events
MOVES
• The parts that are not stops


Semantic Trajectories
• A semantic trajectory is a set of stops and moves
• Stops have a place, a start time, and an end time
• Moves are characterized by two consecutive stops

Methods for Adding Semantics to Trajectories

Pre-processing Single Trajectories


Methods to Compute Stops and Moves
1) IB-SMoT (INTERSECTION-based)
   Interesting for applications like tourism and urban planning
2) CB-SMoT (SPEED-based clustering)
   Interesting for applications where speed is important, like traffic management
3) DB-SMoT (DIRECTION-based clustering)
   Interesting for applications where direction variation is important, like fishing activities

IB-SMoT (Alvares 2007a)
A candidate stop $C$ is a tuple $(R_C, \Delta_C)$, where
• $R_C$ is the geometry of the candidate stop (spatial feature type)
• $\Delta_C$ is the minimal time duration
E.g. [Hotel - 3 hours]
An application $A$ is a finite set $A = \{C_1 = (R_{C_1}, \Delta_{C_1}), \ldots, C_N = (R_{C_N}, \Delta_{C_N})\}$ of candidate stops with non-overlapping geometries $R_{C_1}, \ldots, R_{C_N}$
E.g. [Hotel - 3 hours, Museum - 1 hour]


IB-SMoT (Alvares 2007a)
A stop of a trajectory T is a place that is important for the application
A move of T with respect to an application is:
• a maximal contiguous subtrajectory of T
   – between the starting point of T and the first stop of T; OR
   – between two consecutive stops of T; OR
   – between the last stop of T and the ending point of T;
• or the trajectory T itself, if T has no stops.
[Figure: a trajectory with stops S1, S2, and S3]

IB-SMoT (Alvares 2007a)
Input: candidate stops   // Application
       trajectories      // trajectory samples
Output: semantic rich trajectories
Method:
   For each trajectory
      Check if it intersects a candidate stop for a minimal amount of time
[Figure: example stops Jurere (09-12), IbisH. (13-14), FloripaS (16-17)]
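The intersection test in the method above can be sketched in a few lines. This is a minimal illustration, not the published IB-SMoT implementation: candidate-stop geometries are assumed to be non-overlapping axis-aligned rectangles, times are in minutes, and the names `CandidateStop`, `Stop`, and `find_stops` are hypothetical. Everything between two consecutive stops would be a move.

```python
from dataclasses import dataclass
from typing import List, Tuple

Sample = Tuple[float, float, float]  # assumed layout: (x, y, t in minutes)

@dataclass(frozen=True)
class CandidateStop:
    name: str
    rect: Tuple[float, float, float, float]  # stands in for the geometry R_C
    min_duration: float                      # the minimal time duration Delta_C

@dataclass(frozen=True)
class Stop:
    name: str
    start: float
    end: float

def _inside(x: float, y: float, rect: Tuple[float, float, float, float]) -> bool:
    x0, y0, x1, y1 = rect
    return x0 <= x <= x1 and y0 <= y <= y1

def find_stops(traj: List[Sample], application: List[CandidateStop]) -> List[Stop]:
    """Report a stop whenever consecutive samples stay inside one candidate
    geometry for at least that candidate's minimal duration."""
    stops = []
    i = 0
    while i < len(traj):
        x, y, _t = traj[i]
        cand = next((c for c in application if _inside(x, y, c.rect)), None)
        if cand is None:
            i += 1
            continue
        j = i  # extend the run while the following samples remain inside
        while j + 1 < len(traj) and _inside(traj[j + 1][0], traj[j + 1][1], cand.rect):
            j += 1
        start, end = traj[i][2], traj[j][2]
        if end - start >= cand.min_duration:
            stops.append(Stop(cand.name, start, end))
        i = j + 1
    return stops

if __name__ == "__main__":
    app = [CandidateStop("Hotel", (0, 0, 1, 1), 180), CandidateStop("Museum", (10, 10, 11, 11), 60)]
    traj = [(0.5, 0.5, 0), (0.5, 0.6, 200), (5, 5, 230), (10.5, 10.5, 260), (10.6, 10.5, 290)]
    print(find_stops(traj, app))
```

In the usage example the visit to the museum lasts only 30 minutes, so it does not qualify as a stop under the [Museum - 1 hour] setting.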


CB-SMoT: Speed-based clustering (Palma 2008)
• Clusters single trajectories based on speed variation: low speed → important place

CB-SMoT: Speed-based clustering (Palma 2008)
Input: trajectory samples
       speed variation
       minTime
Output: stops and moves
Step 1: find clusters
Step 2: add semantics to each cluster
   2.1: if the cluster intersects α during Δt_α → stop α
   2.2: if there is no intersection during Δt → unknown stop
[Figure: example stops Jurere (09-12), IbisH. (13-14), FloripaS (16-17), and an unknown stop]
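Step 1 can be approximated with a plain speed threshold. The sketch below is a deliberate simplification and should not be read as the clustering procedure of the cited paper: it assumes planar coordinates, an absolute speed limit `max_speed` in place of the speed-variation parameter, and it simply reports maximal runs of slow movement lasting at least `min_time`. Step 2 (labelling clusters as known or unknown stops) would reuse an intersection test against the candidate stops.

```python
import math
from typing import List, Tuple

Sample = Tuple[float, float, float]  # assumed layout: (x, y, t)

def slow_clusters(traj: List[Sample], max_speed: float, min_time: float) -> List[Tuple[float, float]]:
    """(start_t, end_t) of maximal runs of consecutive samples whose
    segment speed is at most max_speed and whose span is at least min_time."""
    clusters = []
    run_start = None  # index of the first sample of the current slow run
    for i in range(1, len(traj) + 1):
        slow = False
        if i < len(traj):
            (x0, y0, t0), (x1, y1, t1) = traj[i - 1], traj[i]
            dt = t1 - t0
            slow = dt > 0 and math.hypot(x1 - x0, y1 - y0) / dt <= max_speed
        if slow:
            if run_start is None:
                run_start = i - 1
        elif run_start is not None:
            # the run ended at sample i - 1; keep it only if it lasted long enough
            start_t, end_t = traj[run_start][2], traj[i - 1][2]
            if end_t - start_t >= min_time:
                clusters.append((start_t, end_t))
            run_start = None
    return clusters

if __name__ == "__main__":
    traj = [(0, 0, 0), (100, 0, 10), (101, 0, 20), (102, 0, 30), (103, 0, 40), (203, 0, 50)]
    print(slow_clusters(traj, max_speed=1.0, min_time=20))
```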


CB-SMoT: Speed-based clustering (Palma 2008)
Unknown Stops (CB-SMoT)
[Figure: trajectories T1 and T2, with labels "same unknown stop" and "another unknown stop"]

CB-SMoT: Speed-based clustering (Palma 2008)
Can find clusters inside buildings
[Figure: sample points p1, p6, p7, and p11, with t6 = 10:10AM and t7 = 10:32AM]


DB-SMoT: Direction-based Clustering (Manso 2010)
Input: trajectories      // trajectory samples
       minDirVariation   // minimal direction variation
       minTime           // minimum time
       maxTolerance
Output: semantic rich trajectories
Method:
   For each trajectory
      Find clusters with direction variation higher than minDirVariation
      for a minimal amount of time

Examples of semantic trajectory patterns
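The core of the method above, finding stretches where the heading keeps changing, can be sketched as follows. This is a simplified illustration rather than the published DB-SMoT procedure: coordinates are assumed planar, the direction variation at a sample is taken as the turning angle between its incoming and outgoing segments, and the maxTolerance parameter is not modelled at all. The helper names are hypothetical.

```python
import math
from typing import List, Tuple

Sample = Tuple[float, float, float]  # assumed layout: (x, y, t)

def heading_deg(p: Sample, q: Sample) -> float:
    """Heading of the segment p -> q in degrees, in [0, 360)."""
    return math.degrees(math.atan2(q[1] - p[1], q[0] - p[0])) % 360.0

def turn_deg(h1: float, h2: float) -> float:
    """Smallest absolute angle between two headings, in [0, 180]."""
    d = abs(h2 - h1) % 360.0
    return min(d, 360.0 - d)

def direction_clusters(traj: List[Sample], min_dir_variation: float,
                       min_time: float) -> List[Tuple[float, float]]:
    """(start_t, end_t) of maximal runs of consecutive samples whose turning
    angle exceeds min_dir_variation and whose span is at least min_time."""
    clusters = []
    run: List[int] = []  # indices of consecutive high-variation samples
    for i in range(1, len(traj)):
        high = False
        if i < len(traj) - 1:  # the last sample has no outgoing segment
            h_in = heading_deg(traj[i - 1], traj[i])
            h_out = heading_deg(traj[i], traj[i + 1])
            high = turn_deg(h_in, h_out) > min_dir_variation
        if high:
            run.append(i)
        elif run:
            start_t, end_t = traj[run[0]][2], traj[run[-1]][2]
            if end_t - start_t >= min_time:
                clusters.append((start_t, end_t))
            run = []
    return clusters

if __name__ == "__main__":
    zigzag = [(0, 0, 0), (1, 0, 1), (2, 0, 2), (2, 1, 3), (3, 1, 4),
              (3, 2, 5), (4, 2, 6), (5, 2, 7), (6, 2, 8)]
    print(direction_clusters(zigzag, min_dir_variation=45.0, min_time=3))
```

A tolerance for a few low-variation samples inside a cluster, as the maxTolerance input suggests, would need an extra counter in the loop.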


Semantic Rich Trajectories (Transportation Application)
[Figures: IB-SMoT and CB-SMoT]

Sequential Patterns (Transportation Application)


Fishing Domain
[Figure: DB-SMoT method]

Multiple-granularity semantic trajectory pattern mining


STOPS at Multiple Granularities (Bogorny 2009)
Stop at Ibis Hotel from 6:04PM to 7:42PM, September 16, 2010
• Space: IbisHotel or Hotel or Accommodation
• Time: Afternoon or Thursday or 6:00PM – 8:00PM or RUSH-HOUR

ITEMS: the building blocks for semantic pattern discovery
• An item is generated either from a stop or from a move
• An item is a set of complex information (space + time) that can be defined in many formats/types and at different granularities


Building an ITEM for Data Mining (Bogorny 2009)
Formats/types for an item:
NameOnly: the name of the stop/move
• STOPS: name of the spatial feature instance
   – IbisHotel
• MOVES: names of the two stops that define the move
   – SydneyAirport – IbisHotel
NameStart: the name of the stop/move + start time
• IbisHotel [morning] -- stop
• LouvreMuseum [weekend] -- stop
• IbisHotel-SydneyAirport [10:00AM-11:00AM] -- move

10/11/2010 GIScience 2010 – A conceptual data model for trajectory data mining
Vania Bogorny, Universidade Federal de Santa Catarina, Brazil, www.inf.ufsc.br/~vania

Building an ITEM for Data Mining (Bogorny 2009)
NameEnd: the name of a stop/move + end time
• IbisHotel [morning] -- stop
• IbisHotel-SydneyAirport [10:00AM-11:00AM] -- move
NameStartEnd: the name of a stop/move + start time + end time
• IbisHotel [08:00AM-11:00AM][1:00pm-6:00pm] -- stop
• LouvreMuseum [morning][afternoon] -- stop
• SydneyAirport – IbisHotel [10:00AM-11:00PM] [10:00AM-6:00PM]
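The four item formats can be illustrated with a small helper that assembles the item string from a stop or move name and already-generalized time labels. This is a hedged sketch: the function names, the bracket layout of the output, and the hour boundaries in `part_of_day` are assumptions made for illustration and are not defined in the slides.

```python
from typing import Optional

def part_of_day(hour: int) -> str:
    """Map an hour (0-23) to a coarse time label.
    The boundaries are an assumption; the slides do not define them."""
    if 6 <= hour < 12:
        return "morning"
    if 12 <= hour < 18:
        return "afternoon"
    return "night"

def build_item(name: str, fmt: str,
               start: Optional[str] = None, end: Optional[str] = None) -> str:
    """Build an item string in one of the four formats named in the slides."""
    if fmt == "NameOnly":
        return name
    if fmt == "NameStart":
        return f"{name} [{start}]"
    if fmt == "NameEnd":
        return f"{name} [{end}]"
    if fmt == "NameStartEnd":
        return f"{name} [{start}][{end}]"
    raise ValueError(f"unknown item format: {fmt}")

if __name__ == "__main__":
    # A stop that starts at 09:00 and ends at 14:00, at part-of-day granularity
    print(build_item("IbisHotel", "NameStartEnd", part_of_day(9), part_of_day(14)))
    # A move is named after the two stops that delimit it
    print(build_item("IbisHotel-SydneyAirport", "NameStart", "10:00AM-11:00AM"))
```

Changing the granularity only changes the labels passed in (for example a weekday name instead of a part of the day), which is what lets the same stop produce items at several levels.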


Multiple-Granularity Semantic Trajectory DMQL (Bogorny 2009)
ST-DMQL is an approach that
• semantically enriches trajectories with domain information
• automatically transforms this semantic information into different space and time granularities
• extracts frequent patterns, association rules, and sequential patterns from semantic trajectories

Multiple Level Semantic Sequential Patterns
Large Sequences of Length 2 (ITEM = SPACE + Start_Time)
(41803_street_5, 41803_street_5)   Support: 7
(41803_street_4, 41803_street_4)   Support: 9
(41803_street_4, 66655_street_4)   Support: 5
(41803_street_2, 41803_street_2)   Support: 6
(41803_street_8, 41803_street_8)   Support: 5
(41803_street_3, 0_unknown_3)      Support: 5
Each item is composed of the gid, the spatial feature type (stop name), and the time; time unit = month


Multiple Level Semantic Sequential Patterns
Large Sequences of Length 2 (ITEM = SPACE + Start_Time)
(41803_street_tuesday, 41803_street_tuesday)       Support: 9
(41803_street_tuesday, 66655_street_tuesday)       Support: 5
(41803_street_monday, 66655_street_monday)         Support: 5
(41803_street_monday, 41803_street_monday)         Support: 11
(41803_street_monday, 0_unknown_monday)            Support: 5
(41803_street_thursday, 41803_street_thursday)     Support: 13
(41803_street_thursday, 0_unknown_thursday)        Support: 6
(41803_street_wednesday, 41803_street_wednesday)   Support: 7
Each item is composed of the gid, the spatial feature type (stop name), and the time; time unit = day of the week

WEKA-STPM
The previous semantic pattern mining approaches are implemented in WEKA


Weka-STDM
[Figures omitted]


Current Works on Weka-STPM
• Trajectory visualization
• Trajectory cleaning
• New methods for trajectory pre-processing

Trajectory Behaviour Patterns
Recent works have emerged on mining behaviour patterns from trajectories


Athena (Baglioni 2009)
Semantic-rich movement analysis
• Which are the home-work trajectories? And what are their common behaviours?
• To answer these questions, we need to define what a home-work trajectory (or pattern) is
• The concept of the home-work trajectory can be encoded in a formal framework to automatically infer which trajectories are home-work

Athena (Baglioni 2009)
• Supports the post-processing / deductive phase of the KDD process
• Based on ontologies to represent domain knowledge and to infer the semantic types of the patterns/trajectories
• Semantic classification of patterns/trajectories into domain concepts, based on their semantic characteristics


Athena (Baglioni 2009)
1. Example
[Figure: ontology with the concepts Stop, Move, SemanticTrajectory, and CommuterTrajectory]
Commuter trajectory ≡ a trajectory frequently starting outside the city, stopping inside the city for a long time, and going back outside the city

Athena (Baglioni 2009)
   SELECT t.id, t.object
   FROM Milano_tr
   WHERE 'Commuter' in SEMANTIC(t.object)
Given the trajectories, we query the system to identify the ones whose type is commuter, i.e. those satisfying the ontology definition


Pattern Interpretation (Rebeca 2010)
• This work focuses on post-processing, trying to interpret the patterns
• It considers the movement context essential to correctly interpret and understand the patterns
• CONTEXT = geography + thematic attributes

Pattern Interpretation (Rebeca 2010)
1. Mining movement patterns (stops)
2. Semantic enrichment: annotates patterns with information obtained from trajectory types defined in ONTOLOGIES (e.g. the trajectory of a commuter)
3. Use the enriched representation to automatically classify patterns


Works Summarized in this Tutorial

Geometric Pattern Mining Methods (mining is on sample points):
Laube 2004, 2005; Hwang 2005; Gudmundsson 2006, 2007; Giannotti 2007; Lee 2007; Cao 2006, 2007; Lee 2007, 2008a, 2008b; Li 2010

Semantic Pattern Mining Methods (generate semantic trajectories using DM; mining is on semantic trajectories):
Alvares 2007; Zhou 2007; Palma 2008; Bogorny 2009; Bogorny 2010; Manso 2010; Alvares 2010

Behaviour Pattern Mining and Interpretation Methods:
Giannotti 2009; Baglioni 2009; Rebeca 2010

References

Bogorny, V.; Bart Kuijpers; Luis Otávio Alvares. ST-DMQL: A Semantic Trajectory Data Mining Query Language. International Journal of Geographical Information Science 23(10): 1245-1276 (2009).

Palma, A. T.; Bogorny, V.; Kuijpers, B.; Alvares, L. O. A Clustering-based Approach for Discovering Interesting Places in Trajectories. In: 23rd Annual Symposium on Applied Computing (ACM-SAC'08), Fortaleza, Ceara, Brazil, 16-20 March (2008), pp. 863-868.

Spaccapietra, S.; Parent, C.; Damiani, M. L.; de Macedo, J. A.; Porto, F.; Vangenot, C. (2008). A conceptual view on trajectories. Data and Knowledge Engineering, 65(1): 126-146.

Alvares, L. O.; Bogorny, V.; Kuijpers, B.; de Macedo, J. A. F.; Moelans, B.; Vaisman, A. (2007b). A model for enriching trajectories with semantic geographical information. In ACM-GIS, pages 162-169, New York, NY, USA. ACM Press.

Rebecca Ong; Monica Wachowicz; Mirco Nanni; Chiara Renso. From Pattern Discovery to Pattern Interpretation in Movement Data. IEEE SADM 2010.


Manso, J. A.; Times, V. C.; Oliveira, G.; Alvares, L. O.; Bogorny, V. DB-SMoT: A Direction-Based Spatio-Temporal Clustering Method. In: IEEE International Conference on Intelligent Systems (IS), 2010, London. Proceedings of the IEEE International Conference on Intelligent Systems, 2010, pp. 114-119.

Alvares, Luis O.; Palma, Andrey; Oliveira, G.; Bogorny, V. Weka-STPM: from trajectory samples to semantic trajectories. In: Workshop de Software Livre (WSL), 2010, Porto Alegre.

Zhou, C.; Nupur Bhatnagar; Shashi Shekhar; Loren G. Terveen. Mining Personally Important Places from GPS Tracks. ICDE Workshops 2007: 517-526.

Bogorny, V.; Carlos Alberto Heuser; Luis Otávio Alvares. A Conceptual Data Model for Trajectory Data Mining. GIScience 2010: 1-15.

Miriam Baglioni; José Antônio Fernandes de Macêdo; Chiara Renso; Roberto Trasarti; Monica Wachowicz. Towards Semantic Interpretation of Movement Behavior. AGILE Conf. 2009: 271-288.

Fosca Giannotti; Mirco Nanni; Dino Pedreschi; Chiara Renso; Roberto Trasarti. Mining Mobility Behavior from Trajectory Data. CSE (4) 2009: 948-951.

Summary, Challenges and Open Issues in Spatio-Temporal Data Mining


Challenges and Open Issues in Spatio-Temporal Data Mining

Trajectory Clustering
- Most works are density-based clustering methods
- Most are adapted spatial or non-spatial clustering algorithms
- They consider either time or space; only a few consider both dimensions

Trajectory Similarity
- The focus is on objective similarity measures: shape, direction, closeness
- Needs: semantic similarity, i.e. similarity at a higher abstraction level
- Examples:
  - groups of trajectories going shopping together
  - groups of trajectories going to the university together twice a week
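To make the contrast between objective and semantic similarity concrete, the sketch below compares two trajectories in both ways. It is an assumption-laden illustration, not a measure proposed in the tutorial: closeness is taken as the mean distance between time-aligned sample points, and semantic similarity as the Jaccard overlap of the sets of activities annotated on each trajectory.

```python
import math
from typing import List, Set, Tuple

Point = Tuple[float, float]


def mean_point_distance(a: List[Point], b: List[Point]) -> float:
    """Objective closeness: mean distance between time-aligned sample points."""
    if len(a) != len(b) or not a:
        raise ValueError("trajectories must be non-empty and equally sampled")
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)


def semantic_similarity(a: Set[str], b: Set[str]) -> float:
    """Semantic similarity: Jaccard overlap of annotated activities."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Two trajectories that are far apart in space but share the same purpose.
t1_points = [(0.0, 0.0), (1.0, 0.0)]
t2_points = [(0.0, 3.0), (1.0, 4.0)]
t1_activities = {"home", "shopping"}
t2_activities = {"home", "shopping"}

geometric = mean_point_distance(t1_points, t2_points)
semantic = semantic_similarity(t1_activities, t2_activities)
```

Here the two trajectories never come close to each other geometrically, yet they are identical at the semantic level, which is the kind of relationship that purely geometric measures cannot express.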


Need for data mining methods using:
- Metadata
- Domain knowledge
- Semantics
- Ontologies

For:
- Trajectory data pre-processing
- Pattern pruning
- Improving the quality of the patterns
- Pattern interpretation

More needs
- There is a need for collaboration between data miners and domain experts (environmental experts, transportation managers, meteorologists, etc.) to evaluate data mining methods and the discovered patterns
- Post-processing: almost no spatial or spatio-temporal data mining methods evaluate the patterns and their interestingness


Thank you!
vania@inf.ufsc.br
