Exploring the Spatial Distribution of Bird Habitat with ... - IEEE Xplore

3.2 Hierarchical Cluster and Spatial-Tree building based on DBSCAN

Although we have found suitable values for the parameters Eps and MinPts, this brings about other problems. If a user defines the neighborhood of a point by specifying a particular radius and looks for core points that have a pre-defined number of points within that radius, then either the tight cluster on the map will be picked up as one cluster and the rest will be marked as noise, or else every point will belong to a single cluster. We can therefore assume that a cluster area found with a small Eps tends to be a child node of a cluster area found with a larger Eps.

Also, the MCP range commonly includes areas never visited by the animal. In order to get the area actually used within bar-headed goose home ranges, core areas were calculated. The core area is usually defined as the area receiving concentrated use by individuals at each wetland. For example, the fostering place would be a core region and the foraging area would lie outside the core region. To find the core area within the breeding MCP range or other large ranges, we have to run the clustering approach again on this area with a much smaller Eps. Meanwhile, there are hundreds of core areas scattered on the map, so an efficient and easy way to manage these results is needed. Motivated by this requirement, we developed a Hierarchical DBSCAN clustering method that reflects the relationship between different clusters: the algorithm builds the Spatial-Tree from top (one large cluster) to bottom (several clusters) and assigns an index code to every cluster as the Spatial-Tree grows.

The problem we hope to solve is how to cluster hierarchically. Hierarchical methods can be either agglomerative or divisive. Agglomerative methods (e.g. AGNES) start with as many clusters as objects and build a tree by successively joining the two nearest clusters. Divisive methods (e.g. DIANA), on the other hand, start with one big cluster and successively split the largest cluster, beginning with its most dissimilar object.
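The agglomerative strategy described above can be sketched as follows. This is a minimal illustration and not AGNES itself: it assumes one-dimensional points and a single-linkage distance (the gap between the closest members of two clusters), and the names `single_link` and `agglomerate` are hypothetical.

```python
from itertools import combinations


def single_link(a, b):
    """Distance between two clusters: gap between their closest members."""
    return min(abs(x - y) for x in a for y in b)


def agglomerate(points, target_clusters):
    """Join the two nearest clusters until only target_clusters remain."""
    clusters = [[p] for p in points]        # start with one cluster per object
    merges = []                             # merge history, i.e. the tree bottom-up
    while len(clusters) > target_clusters:
        # Pick the pair of clusters with the smallest single-linkage distance.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: single_link(clusters[ij[0]], clusters[ij[1]]))
        merges.append((clusters[i], clusters[j]))
        clusters[i] = clusters[i] + clusters[j]   # i < j, so deleting j keeps i valid
        del clusters[j]
    return clusters, merges


clusters, merges = agglomerate([1, 2, 10, 11, 30], target_clusters=2)
print(clusters)
```

A divisive method runs in the opposite direction: it starts from the single cluster holding all five values and splits it step by step.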
AGNES and DIANA are fully described in chapters 5 and 6 of [13].

Hierarchical cluster approach:

DIANA selects the cluster with the largest diameter and divides it into two new clusters, repeating until every cluster contains only one object. Our approach differs: it follows breadth-first search (BFS), the graph search algorithm that begins at the root node and explores all neighboring nodes, to build the Spatial-Tree from top to bottom. At first, the single cluster that contains the whole data set is selected, and DBSCAN with the predefined parameters splits it into sub-clusters. The resulting clusters are pushed into a queue, and each data set taken from the queue is then clustered by DBSCAN again. The splitting phase stops when the tree reaches the predefined number of levels. Whenever a cluster is split into smaller ones, each of them is indexed as well. For reasons of space we omit a detailed description of the hierarchy algorithm, but the following points should be noted:

A. Two pointers, first and last, are initialized to identify the tree level in the queue.
B. If a parent data node yields fewer than two clusters, its clustering result is not pushed into the queue.
C. Members at the same level of the tree share the same predefined clustering parameters Eps and MinPts.
D. A garbage collector stores the points that become noise at the next tree level, because the parameters change as the tree level changes.
E. As the Spatial-Tree grows, every cluster node is encoded by joining its parent's cluster id with its own cluster number.

Consider Fig 3: the depth of the Spatial-Tree is four, and every cluster area is encoded.
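The procedure and points A to E above can be sketched as follows. This is a minimal, unoptimized sketch under stated assumptions, not the authors' implementation: it uses planar Euclidean distance rather than a geographic distance, the function names `dbscan`, `build_spatial_tree` and `square` are hypothetical, the root is assumed to carry the code "0", the depth is stored with each queue entry instead of using two pointers (point A), and point B is read as "a node that yields fewer than two clusters stays a leaf".

```python
from collections import deque
from math import dist


def dbscan(points, eps, min_pts):
    """Plain O(n^2) DBSCAN; returns (clusters, noise) as lists of points."""
    n = len(points)
    # Eps-neighbourhood of every point (a point counts as its own neighbour).
    near = [[j for j in range(n) if dist(points[i], points[j]) <= eps]
            for i in range(n)]
    label = [None] * n                      # None = unvisited, -1 = noise
    k = 0
    for i in range(n):
        if label[i] is not None:
            continue
        if len(near[i]) < min_pts:          # not a core point (may become border later)
            label[i] = -1
            continue
        label[i] = k
        frontier = deque(near[i])
        while frontier:                     # grow cluster k from its core points
            j = frontier.popleft()
            if label[j] == -1:
                label[j] = k                # noise reached from a core point: border point
            elif label[j] is None:
                label[j] = k
                if len(near[j]) >= min_pts:
                    frontier.extend(near[j])
        k += 1
    clusters = [[points[i] for i in range(n) if label[i] == c] for c in range(k)]
    noise = [points[i] for i in range(n) if label[i] == -1]
    return clusters, noise


def build_spatial_tree(points, levels):
    """levels[d] = (eps, min_pts) used to split every node at depth d (point C)."""
    tree = {"0": points}                    # node code -> member points
    garbage = {}                            # node code -> points left as noise (point D)
    queue = deque([("0", points, 0)])       # BFS queue of (code, members, depth)
    while queue:
        code, members, depth = queue.popleft()
        if depth == len(levels):            # requested number of levels reached
            continue
        eps, min_pts = levels[depth]
        clusters, noise = dbscan(members, eps, min_pts)
        if len(clusters) < 2:               # point B: node is not split further
            continue
        garbage[code] = noise
        for number, cluster in enumerate(clusters):
            child = f"{code}/{number}"      # point E: parent code + own number
            tree[child] = cluster
            queue.append((child, cluster, depth + 1))
    return tree, garbage


def square(x, y):
    """Four points on a unit square, used only to build toy data."""
    return [(x, y), (x, y + 1), (x + 1, y), (x + 1, y + 1)]


# Two distant groups, each made of two tight squares, plus one stray point.
points = square(0, 0) + square(5, 0) + [(50, 50)] + square(100, 100) + square(105, 100)
tree, garbage = build_spatial_tree(points, levels=[(5, 3), (1.5, 3)])
for code in sorted(tree):
    print(code, len(tree[code]))
```

In this toy data the large Eps separates the two distant groups at the first level and the small Eps separates the squares inside each group at the second level, so every small-Eps cluster ends up as a child of a large-Eps cluster.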
In Fig 3, for example, the leftmost leaf node of the tree is encoded as 0/0/1/0, so we know that its parent node is 0/0/1 and its grandparent is 0/0.

Fig 3: Spatial-Tree

The hierarchical clustering procedure provides the main framework for discovering the habitat. The next step is to define the object similarity and the clustering parameters for every tree level.

Similarity based on the geographical distance

As previously stated, only a dissimilarity matrix is needed to perform a divisive cluster analysis, so we concentrate our efforts on computing this matrix efficiently. The location points in our paper are geographic coordinates (latitude and longitude), so the geographical distance formula [7] was used to calculate the great-circle distance. In addition, the bird migration route overlaps broadly with the East Asian Flyway of Anatidae and the East Asian-Australasian Flyway of shorebirds [9]. We set the radius of the sphere $r$ to 6378140 meters in distance formula 1:

$d = r\,\Delta\sigma$ (1)

where $\Delta\sigma$ is the central angle and $d$ is the geographical distance.

Determining the parameters of the hierarchy cluster

We employed the following procedure to compute approximate clustering parameters Eps and MinPts for the hierarchical clustering method.

1. In general, we are only interested in locations where birds tend to stay for a longer time, rather than locations they fly over occasionally. We therefore tie MinPts to the time a bird spends moving within a certain area, and compute the number of MinPts by formula 2:

$MinPts \approx T_{sum} / T_p$ (2)

where $T_p$ is the time interval at which location signals are received. In our paper $T_{sum}$ is set to two days, because a stopover event is defined as the use of a migration area for more than two days after departure from either the breeding area or the wintering area [9]. The average $T_p$ for each bird in our data sets was two hours. The resulting MinPts is then 25.

2. To obtain the parameter Eps, we developed a simple but effective heuristic based on the following observation. The wintering and breeding areas usually cover a region of at least ten thousand square kilometers. We can therefore assume that Eps should be at most ten thousand meters; if Eps were much larger, the cluster area would be overestimated and would not give the results we hope for. In addition, to obtain smaller regions, we reduce Eps by factors of 10, from 10,000 m through 1,000 m and 100 m down to 10 m.

3. The duration of a wild bird's stay in its habitat is naturally determined. Seen from another angle, the temporal distribution of bird locations also reflects the spatial distribution produced by our clustering approach. We therefore use the recorded migration times [9] and our own observations to adjust the parameter Eps several times until the desired result is obtained.

3.3 Minimum Convex Polygon Home Range

To determine the precise range of the habitat region, kernel density estimation has become prevalent in home range analysis [3]. In our work, the MCP (Minimum Convex Polygon) home range is more appropriate for home range estimation [10]. We first compute the convex polygon that encloses the points of the area. There are two algorithms that compute the convex hull of a set of n points: Graham's scan runs in O(n lg n) time, and Jarvis's march runs in O(nh) time, where h is the number of vertices of the convex hull. Both use a rotational sweep, processing vertices in the order of the polar angles they form with a reference vertex. First, the point with the maximum or minimum latitude is computed, because such a point is guaranteed to lie on the convex hull. We then use Graham's scan to obtain the MCP; once the points are sorted, the scan itself takes O(n) time. A more technical description can be found in [11].

The home range was computed from the convex hull using spherical geometry. A spherical polygon is a closed geometric figure on the surface of a sphere formed by arcs of great circles; it is a generalization of the spherical triangle [12]. If $\theta$ is the sum of the radian angles of a spherical polygon with $n$ sides on a sphere of radius $R$, then its area is:

$S = [\theta - (n-2)\pi] R^2$

4 Discovery and analysis of cluster area

In this section, we describe the results of discovering habitat in the 2007 and 2008 survey data sets using our hierarchical clustering approach. Because of space constraints, we mostly focus on the 2007 survey, although some results from the 2008 survey are also presented. As part of the discussion, we also analyze site fidelity and the implications for other research.

4.1 System framework of HAMap

We are developing HAMap (Habitat Area Map), a web GIS system based on Google Maps, which greatly improves a user's Web visualization experience. Fig 4 shows the system architecture of the Cluster Visual Map. Google Maps is a popular web-based GIS developed by Google. The main tasks of HAMap are as follows:

• Our web server downloads the latest data from the mailbox and saves them to the database automatically.
• When a web browser submits parameters through our .jsp page (JavaServer Pages), the host analyzes the input parameters and passes them to the cluster model.
• The cluster model first runs the clustering algorithm on the data sets from the database and then responds to the user with a document stream of type .xml, which details each cluster result, such as the location information, the area of the convex hull and so on.
• The XML stream is sent to the Google server, and the browser displays the cluster information on Google Maps.

Fig 4: The system architecture of Habitat Area Map

4.2 Spatial-Tree for bar-headed goose

The migration routes of different years do not overlap with each other. We therefore divide the data sets into two parts: the 2007 survey and the 2008 survey.

The 2007 Spatial-Tree was built from top to bottom on the annual migration data from 2007-03 to 2008-03. In Fig 5, the left side is the tree and the right side is the spatial distribution associated with the selected node. The convex home ranges are drawn as polygons of different colors, and a description is shown when the user clicks the marker with a given index.

With a Spatial-Tree depth of 3, there are 6 clusters in level 1, with Eps = 35000 meters and MinPts = 25, and the average habitat area is 29045.38 km². In the map, we can clearly find the breeding area at Qinghai Lake with index 3 and the post-breeding area at Zhalin-Eling Lake with index 4. The largest is the wintering area in the Tibet river valley in China, with 9254 km², and it is interesting to find that one bird (No. BH07_67693) moved to the cluster with index 1 in Mongolia rather than staying at Qinghai Lake for breeding. The movement of bar-headed geese depicted in this study conforms to the Central Asian Flyway [8]. Eight birds migrated to the Tibet river valley and stayed there from 2007-10-21 to 2008-02-25, a total of 127 days, to overwinter rather than flying on to north-eastern India and Bangladesh.

Fig 5: Overview of 2007 bar-headed goose Spatial Tree

The Eps and MinPts for the second level of the Spatial-Tree are 3000 meters and 25. As a result, there are 34 cluster nodes in the second level of the Spatial-Tree, with an average size of 282.96 km². The Qinghai Lake breeding area is one of the cluster areas in the first level, and there are six sub-cluster areas within its MCP area in Fig 6. Fourteen birds stayed in this area for 231 days. The average size of the core home range is 232.9636 km², and it mainly covers the salt lakes, illustrating their importance. These salt lakes are integral to the hydrology that defines the landscape and vegetation characteristics of the region.

Fig 6: Core area of Qinghai lake breeding area

In the third level of the Spatial-Tree, 146 nodes are computed with Eps = 300 meters and MinPts = 25. We again focus on the cluster results at Qinghai Lake. The core area colored red and indexed as 2 in Fig 7 precisely covers a small island named San Kuai Si within Qinghai Lake, a famous breeding place. Seven bar-headed geese stayed there from 2007-04-10 to 2007-06-24, for 74 days. This illustrates that our clustering method can detect the core range area precisely.

Fig 7: San kuai Si island core area within Qinghai Lake

We can also discover the main movements of the fourteen bar-headed geese in 2007 in Fig 5. They depart from the breeding place at Qinghai Lake, arrive at the post-breeding area at Zhalin-Eling Lake or the Huangheyuan wetland, and stay there for about two months before heading south. They then pass through cluster 2 and cluster 5, which serve as stopover areas, and finally arrive at the wintering area. Mean fall migration time was about 7 days.

The 2008 Spatial-Tree, shown in Fig 8, was built on the location data from the 2008 survey during the period from 2008-03-01 to 2008-11-01. The size and distribution of the post-breeding and breeding areas approximately conform to the cluster results from the 2007 survey. This suggests that bar-headed geese do not move randomly during migration. In the end, seven birds arrived at the wintering place in the Tibet river valley. The fall migration distance is about 1600 km.

Fig 8: Overview of 2008 bar-headed goose Spatial Tree

4.3 Discussion and Implications for habitat conservation and avian influenza

There are two migration seasons for bar-headed geese: spring migration and fall migration. Biologists have long wanted to know whether these birds fly back to their original habitat the next year; this is the so-called site fidelity. Site fidelity also reflects, to some extent, the quality of our clustering results. Three birds from the 2007 survey were still active even on Nov 1. We chose one of them (NO.BH07_6795) and drew its location data from 2008-04-01 to 2008-11-01 on the habitat we discovered for 2007 in Fig 9. The location points, shown in green, cover most of the MCP we discovered for 2007. This not only indicates that this bird tended to return to its habitat the next year, but also supports the results of our research to some degree.

Fig 9: Site-Fidelity of Spring Migration in 2007

Our cluster-based approach to discovering bar-headed goose habitat approximately depicts the geographical distribution of this wild waterbird. The cluster results for 2007 and 2008 match closely, which shows that certain habitats, such as Qinghai Lake, DaLing Lake and the Tibet river valley, are of vital importance to the species. The wide area of the MCP shows that a broad network is needed to cover the different core regions. The cluster results displayed in the GIS pave the way for constructing a systematic nature reserve in the future. In addition, scientists could carry out further research, such as virus, plant and environmental quality surveys, to discover how highly pathogenic avian influenza disperses within the MCP of wild bird species.

5 Conclusion and Future Work

Satellite tracking has been used successfully to determine the migration routes and stopover sites of a number of birds. Such information allows the development of plans for protecting breeding and stopover sites. However, the lack of an efficient data analysis approach is an obstacle to doing this work automatically and precisely. In this paper we propose clustering of the location data as a supplement, and an alternative, to visual observation and counting of location plots for finding the habitat of wild bird species. Meanwhile, in order to discover the core range, a new clustering strategy was proposed: a hierarchical clustering based on BFS. Our initial analysis of the clustering results suggests that this strategy can effectively manage the different cluster areas and discover the spatial relationships between them. It provides effective assistance for biological researchers in discovering new habitat.

In the future, we plan to extend our current work to address several unresolved issues. Specifically, we want to find out whether there are frequent and interesting visiting sequences for different wild birds. One possible solution is to extract candidate sequences using a well-known sequence mining approach. We have also begun to compare the spatial distribution of cluster areas between different years, hoping to discover trends in habitat change. Finally, we intend to extend our analyses to other species at Qinghai Lake to identify habitat shared by different species.

Acknowledgements

This work was supported in part by Chinese Academy of Sciences (No. INF105-SDB) and the Special Project of Informatization of Chinese Academy of Sciences in "the Eleventh Five-Year Plan", No. INFO-115-D02. The authors would like to thank the researchers in Western Ecological Research Center, Patuxent Wildlife Research Center and United States Geological Survey for sharing the data sets with us.

References

[1] Liu, J., Xiao, H., Lei, F., Zhu, Q., Qin, K., Zhang, X. W., Zhang, X. L., Zhao, D., Wang, G. & other authors (2005). Highly pathogenic H5N1 influenza virus infection in migratory birds. Science 309, 1206.
[2] Li, Z. W. D. and Mundkur, T. (2007). Numbers and distribution of waterbirds and wetlands in the Asia-Pacific region: results of the Asian Waterbird Census 2002–2004. Kuala Lumpur, Malaysia: Wetlands International.
[3] Worton, B. J. (1989). Kernel methods for estimating the utilization distribution in home-range studies. Ecology 70, 164-168.
[4] Ester, M., Kriegel, H., Sander, J. and Xu, X. (1996). A density-based algorithm for discovering clusters in large spatial databases with noise. Proc. 2nd Int. Conf. on Knowledge Discovery and Data Mining, Portland, OR, pp. 226-231.
[5] Ng, R. T. and Han, J. (1994). Efficient and Effective Clustering Methods for Spatial Data Mining. Proc. 20th Int. Conf. on Very Large Data Bases, Santiago, Chile, pp. 144-155.
[6] Ertöz, L., Steinbach, M. and Kumar, V. (2001). Finding topics in collections of documents: A shared nearest neighbor approach. In Proceedings of Text Mine'01, First SIAM International Conference on Data Mining, Chicago, IL, USA.
[7] Great-circle distance. http://mathworld.wolfram.com/GreatCircle.html
[8] Miyabayashi, Y. and Mundkur, T. (1999). Atlas of key sites for Anatidae in the East Asian Flyway. Tokyo: Japan, and Kuala Lumpur, Malaysia: Wetlands International - Asia Pacific. Available at http://www.jawgp.org/anet/aaa1999/aaaendx.htm. Accessed 11 March 2008.
[9] Muzaffar, S. B. and Takekawa, J. Y. (2008). Seasonal movements and migration of Pallas's Gulls Larus ichthyaetus from Qinghai Lake, China. Forktail 24: 100-107.
[10] Hooge, P. N. and Eichenlaub, B. (2000). Animal movement extension to ArcView ver. 2.0 and later. USGS Alaska Biological Science Center, Anchorage, Alaska.
[11] Guo Hao Shan, Li JunYu, Lin QiYing. Introduction to ACM International Collegiate Programming Contest, Second Edition, 100-102 (in Chinese).
[12] Weisstein, Eric W. "Spherical Polygon." From MathWorld--A Wolfram Web Resource. http://mathworld.wolfram.com/SphericalPolygon.html
[13] Kaufman, L. and Rousseeuw, P. J. (1990). Finding Groups in Data: An Introduction to Cluster Analysis. John Wiley and Sons, Inc., New York.
