18.07.2014 Views

PHD Thesis - Institute for Computer Graphics and Vision - Graz ...


7.1. Map building

Sub-map component   Description
T_L^W               rigid transformation into the global coordinate system
K                   camera calibration matrix
I                   plane index image
L                   landmark image patches
C                   plane covariances
Π                   3D planes
D                   landmark SIFT descriptors
A                   landmark normalization transformations
P_L                 camera matrix of the local coordinate system

Table 7.1: Components of the piece-wise planar sub-map.

A is a set of size n holding a transformation for each landmark in L. Each entry of A is an affine transformation matrix of size 3 × 3. D is a set of size n of feature vectors, one per landmark. Each entry of D is a SIFT feature vector of length 128 describing the corresponding landmark; it is computed from the normalized image patch in L. I and Π together represent the 3D coordinates of a landmark. Π is a set of size p of 3D plane descriptions, where p is the number of planes detected in the sub-map. Each plane is described by a 6-vector (a normal vector and one 3D point on the plane) given in the local coordinate system. Every landmark lies on one of these planes in 3D space. The corresponding mapping is stored in I, an index image that records which pixel in image space corresponds to which plane in 3D. The map also contains uncertainties for the 3D planes: the set C of size p contains one covariance matrix for each 3D plane. K and P_L define the local coordinate system. P_L is the camera matrix that connects the 3D planes to the image coordinates, and K is the corresponding 3 × 3 camera calibration matrix. T_L^W represents a rigid transformation into the global coordinate system. It is a 4 × 4 similarity transformation matrix (rotation, translation, scale) that transforms 3D points from the local into the global coordinate system. A sub-map defined in this way contains all information necessary for global localization. In the following, the computation of the various entries is described.
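As an illustration of how these components fit together, the following minimal Python sketch stores a sub-map and recovers the global 3D position of a pixel by intersecting its viewing ray with the plane indexed in I. It is not the implementation used in this work. The class and field names, the array layouts, the 6 × 6 size of the plane covariances, and the use of NumPy are all assumptions made for the example; the text only fixes the sizes of A, D, Π, K and the transformation into the global coordinate system.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class SubMap:
    """Illustrative container for the components of Table 7.1.

    Field names and array layouts are assumptions made for this sketch.
    """
    T_L_W: np.ndarray        # 4x4 similarity transform, local -> global
    K: np.ndarray            # 3x3 camera calibration matrix
    P_L: np.ndarray          # 3x4 camera matrix of the local coordinate system
    index_image: np.ndarray  # I: H x W plane indices (row = y, column = x)
    planes: np.ndarray       # Pi: p x 6, each row is (normal, point on plane)
    plane_cov: np.ndarray    # C: p covariance matrices (6x6 assumed here)
    patches: np.ndarray      # L: n normalized landmark image patches
    descriptors: np.ndarray  # D: n x 128 SIFT descriptors
    affine: np.ndarray       # A: n x 3 x 3 normalization transformations

    def check_shapes(self) -> None:
        """Raise ValueError if the sizes stated in the text are violated."""
        n, p = len(self.patches), len(self.planes)
        checks = [
            (self.T_L_W.shape, (4, 4)),
            (self.K.shape, (3, 3)),
            (self.P_L.shape, (3, 4)),
            (self.planes.shape, (p, 6)),
            (self.descriptors.shape, (n, 128)),
            (self.affine.shape, (n, 3, 3)),
        ]
        for got, want in checks:
            if got != want:
                raise ValueError(f"expected shape {want}, got {got}")
        if len(self.plane_cov) != p:
            raise ValueError("need one covariance matrix per plane")


def landmark_point_local(m: SubMap, x: int, y: int) -> np.ndarray:
    """3D point (local frame) seen at pixel (x, y), via its indexed plane."""
    k = int(m.index_image[y, x])              # plane index stored for the pixel
    normal, point = m.planes[k, :3], m.planes[k, 3:]
    M, p4 = m.P_L[:, :3], m.P_L[:, 3]
    centre = -np.linalg.solve(M, p4)          # camera centre of P_L
    ray = np.linalg.solve(M, np.array([x, y, 1.0]))  # viewing ray direction
    t = normal @ (point - centre) / (normal @ ray)   # ray-plane intersection
    return centre + t * ray


def to_global(m: SubMap, X_local: np.ndarray) -> np.ndarray:
    """Map a local 3D point into the global coordinate system."""
    Xh = m.T_L_W @ np.append(X_local, 1.0)    # homogeneous coordinates
    return Xh[:3] / Xh[3]
```

The ray-plane intersection assumes a finite camera (the left 3 × 3 block of P_L is invertible) and a viewing ray that is not parallel to the plane. The covariances in C are stored here but not propagated.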

7.1.3 Structure computation

We will now describe how to extract the 3D map structure from two images. The sub-map identification step provides a short-baseline stereo image pair I, I′. The goal of the structure computation is to identify planes in the scene and to compute a 3D reconstruction of these planes. The reconstruction contains only planes; non-planar image parts are discarded. The result is a piece-wise planar 3D reconstruction of the scene. The scene is segmented into planar regions with the method described in Chapter 6. A prerequisite for this method is that the camera poses are known. Thus, in a first step, the camera positions for the images I, I′ have to be computed. DoG interest points are detected in both images I, I′, and SIFT descriptors D, D′ are computed for every detected interest point. Corresponding points are identified by nearest-neighbor search in feature space. As distance measure the Euclidean
