29.01.2014 Views

Fast Robust Large-scale Mapping from Video and Internet Photo ...

Fast Robust Large-scale Mapping from Video and Internet Photo ...

Fast Robust Large-scale Mapping from Video and Internet Photo ...

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

For every frame of video, a depthmap is generated using a real-time GPUbased<br />

multi-view planesweep stereo algorithm. The planesweep algorithm,<br />

comprised primarily of image warping operations, is highly efficient on the<br />

GPU [62]. Because it is multi-view, it is quite robust <strong>and</strong> can efficiently<br />

h<strong>and</strong>le occlusions, a major problem in stereo. The stereo depthmaps are<br />

then fused using a visibility-based depthmap fusion method, which also runs<br />

on the GPU. The fusion method removes outliers <strong>from</strong> the depthmaps, <strong>and</strong><br />

also combines depth estimates to enhance their accuracy. Because of the redundancy<br />

in video (each surface is imaged multiple times), fused depthmaps<br />

only need to be produced for a subset of the video frames. This reduces<br />

processing time as well as the 3D model size as discussed in Section 6.<br />

The planesweep stereo algorithm [36] computes a depthmap by testing<br />

a family of plane hypotheses, <strong>and</strong> for each depthmap pixel recording the<br />

distance to the plane with the best photoconsistency score. One view is<br />

designated as reference, <strong>and</strong> a depthmap is computed for that view. The<br />

remaining views are designated matching views. For each plane, all matching<br />

views are back-projected onto the plane, <strong>and</strong> then projected into the<br />

reference view. This mapping, known as a plane homography [84], can be<br />

performed very efficiently on the GPU. Once the matching views are warped,<br />

the photoconsistency score is computed. We use a sum of absolute differences<br />

(SAD) photoconsistency measure. To h<strong>and</strong>le occlusions, the matching views<br />

are divided into a left <strong>and</strong> right subset, <strong>and</strong> the best matching score of the two<br />

subsets is kept [85]. Before the difference is computed, the image intensities<br />

are multiplied by the relative exposure (gain) computed during KLT tracking<br />

(Section 4.1.1). The stereo algorithm can produce a 512 × 384 depthmap<br />

30

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!