PHD Thesis - Institute for Computer Graphics and Vision - Graz ...
7.1. Map building
Sub-map component    Description
T_L^W                rigid transformation into the global coordinate system
K                    camera calibration matrix
I                    plane index image
L                    landmark image patches
C                    plane covariances
Π                    3D planes
D                    landmark SIFT descriptors
A                    landmark normalization transformations
P_L                  camera matrix of the local coordinate system

Table 7.1: Components of the piece-wise planar sub-map.
A is a set of size n holding one transformation for each landmark in L. Each entry of A is a
3 × 3 affine transformation matrix. D is a set of n feature vectors, one for each landmark.
Each entry of D is a SIFT feature vector of length 128 that describes the corresponding
landmark. The SIFT feature vectors are computed from the normalized image patches in L.
I and Π together represent the 3D coordinates of a landmark.
Π is a set of p 3D plane descriptions, where p is the number of planes detected in the
sub-map. Each plane is described by a 6-vector (a normal vector and one 3D point on the
plane) expressed in the local coordinate system.
Each landmark lies on one of these planes in 3D space. The corresponding mapping is stored in
I, an index image that records which pixel in image space corresponds to which plane in 3D.
The map also contains uncertainties for the 3D planes: the set C of size p contains one
covariance matrix for each 3D plane. K and P_L define the local coordinate system. P_L is the
camera matrix which connects the 3D planes to the image coordinates. K is the corresponding
3 × 3 camera calibration matrix. T_L^W represents a rigid transformation into the global
coordinate system. It is a 4 × 4 similarity transformation matrix (rotation, translation,
scale) which transforms 3D points from the local into the global coordinate system. A sub-map
defined in this way contains all information necessary for global localization. In the
following, the computation of the various entries is described.
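To illustrate how the components of Table 7.1 fit together, the following is a minimal sketch in Python with NumPy. It is not the implementation used in this work. The names SubMap, landmark_to_local_3d and local_to_global are hypothetical, and so are the array layouts, the 6 × 6 covariance size and the convention that a plane index of -1 marks a pixel without a plane. The sketch back-projects a pixel through P_L, intersects the viewing ray with the plane selected by I, and maps the resulting point into the global coordinate system with T_L^W.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class SubMap:
    """Container mirroring Table 7.1 (field names and shapes are assumptions)."""
    T_WL: np.ndarray  # (4, 4) similarity transform, local -> global
    K: np.ndarray     # (3, 3) camera calibration matrix
    I: np.ndarray     # (H, W) plane index per pixel, -1 = no plane (assumed)
    L: np.ndarray     # (n, h, w) normalized landmark image patches
    C: np.ndarray     # (p, 6, 6) plane covariances (size assumed)
    Pi: np.ndarray    # (p, 6) planes as [normal (3), point on plane (3)]
    D: np.ndarray     # (n, 128) SIFT descriptors
    A: np.ndarray     # (n, 3, 3) affine normalization transformations
    P_L: np.ndarray   # (3, 4) camera matrix of the local coordinate system


def landmark_to_local_3d(sm, u, v):
    """Intersect the viewing ray of pixel (u, v) with its plane (local frame)."""
    k = int(sm.I[v, u])  # row index is v, column index is u
    if k < 0:
        return None  # pixel does not belong to any detected plane
    normal, point = sm.Pi[k, :3], sm.Pi[k, 3:]
    M, p4 = sm.P_L[:, :3], sm.P_L[:, 3]
    center = -np.linalg.solve(M, p4)  # camera center in the local frame
    direction = np.linalg.solve(M, np.array([u, v, 1.0]))  # ray direction
    denom = normal @ direction
    if abs(denom) < 1e-12:
        return None  # ray is parallel to the plane
    s = (normal @ (point - center)) / denom
    return center + s * direction


def local_to_global(sm, X):
    """Apply the 4 x 4 similarity transform to a local 3D point."""
    Xh = sm.T_WL @ np.append(X, 1.0)
    return Xh[:3] / Xh[3]
```

Storing each plane as a normal and a point follows the 6-vector parameterization described above. The ray-plane intersection itself is a standard construction and is only assumed here to be how I and Π combine into a 3D landmark position.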
7.1.3 Structure computation
We now describe how to extract the 3D map structure from two images. The sub-map
identification step provides a short-baseline stereo image pair I, I′. The goal of the
structure computation is to identify planes in the scene and to compute a 3D reconstruction
of these planes. The reconstruction contains only planes; non-planar image parts are
discarded. This results in a piece-wise planar 3D reconstruction of the scene. The scene is
segmented into planar regions with the method described in Chapter 6. A prerequisite for
this method is that the camera poses are known. Thus, in a first step, the camera poses for
the images I, I′ have to be computed. DoG interest points are detected in both images I, I′
and SIFT descriptors D, D′ are computed for every detected interest point. Corresponding points
are identified by nearest neighbor search in feature space. As distance measure the Euclidean