Fast Robust Large-scale Mapping from Video and Internet Photo ...

1. Introduction

The fully automatic modeling of large-scale environments has long been a research goal in photogrammetry and computer vision. Detailed 3D models automatically acquired from the real world have many uses, including civil and military planning, mapping, virtual tourism, games, and movies. In this paper we present a system approaching the fully automatic modeling of large-scale environments, either from video or from photo collections.

Our system has been designed for efficiency and scalability. GPU implementations of SIFT feature matching, KLT feature tracking, gist feature extraction and multi-view stereo allow our system to rapidly process large amounts of video and photographs. For large photo collections, iconic image clustering allows the dataset to be broken into small, closely related parts. The parts are then processed and merged, allowing tens of thousands of photos to be registered. Similarly, for videos, loop detection finds intersections in the camera path, which reduces drift for long sequences and allows multiple videos to be registered.

Recently, mapping systems such as Microsoft Bing Maps and Google Earth have started to use 3D models of cities in their visualizations. Currently these systems still require a human in the loop to deliver models of reasonable quality. Nevertheless, they already achieve impressive results, modeling large areas with regular updates. However, these models have very low complexity and do not provide enough detail for ground-level viewing. Furthermore, their availability is restricted to only a small number of cities across the globe. In contrast, our system can automatically produce 3D models from ground-level images with ground-level detail. The efficiency and scalability of our system open up the possibility of reconstructing 3D models of all the world's cities quickly and cheaply.

Figure 1: Left: an overview model of Chapel Hill reconstructed from 1.3 million video frames on a single PC in 11.5 hours. Right: a model of the Statue of Liberty. The reconstruction algorithm registered 9025 cameras out of 47238 images downloaded from the Internet.

In this paper we review the details of our system and present 3D models of large areas reconstructed from thousands to millions of images. These models are produced with state-of-the-art computational efficiency, yet are competitive with the state of the art in quality.

2. Related Work

In recent years there has been considerable progress in large-scale reconstruction from video for urban environments and aerial data [1, 2, 3], as well as significant progress on reconstruction from Internet ...
