PHD Thesis - Institute for Computer Graphics and Vision - Graz ...
7.1. Map building
distance is used. Two features correspond if
\[ \frac{d_{01}}{d_{02}} < t \tag{7.6} \]
where $d_{01}$ is the Euclidean distance between the query feature and its nearest feature point from
D′, $d_{02}$ is the Euclidean distance to the second-closest feature, and t is a distance threshold.
Good results can be achieved with t = 0.8. This distance measure was suggested by Lowe [67]
for SIFT feature matching. The feature correspondences established in this way can now be used for
estimating the camera poses. As already mentioned, we assume calibrated cameras, i.e. the
calibration matrix K is known. Thus we can estimate the essential matrix, which encodes the
relative pose of the two viewpoints. Essential matrix estimation is performed using the
5-point algorithm of Nister [83] within a standard RANSAC scheme [28]. The essential matrix E
can be decomposed into two camera matrices P, P′, where P is the canonical camera matrix and
P′ defines the second camera position in the local canonical coordinate frame (see equations 7.7
and 7.8).
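The correspondence criterion of equation 7.6 can be illustrated with a minimal sketch. It assumes descriptors are plain sequences of floats and uses a brute-force nearest-neighbour search. The function name `ratio_test_matches` and its parameters are hypothetical and not taken from the thesis implementation.

```python
import math

def ratio_test_matches(query_desc, train_desc, t=0.8):
    """Return (query_index, train_index) pairs that satisfy d01 / d02 < t."""
    matches = []
    if len(train_desc) < 2:
        # The test needs both a nearest and a second-nearest neighbour.
        return matches
    for qi, q in enumerate(query_desc):
        # Euclidean distance from the query descriptor to every candidate.
        dists = sorted((math.dist(q, d), ti) for ti, d in enumerate(train_desc))
        (d01, best), (d02, _) = dists[0], dists[1]
        # d01 < t * d02 is the ratio test written without a division,
        # so a zero second distance cannot raise an error.
        if d01 < t * d02:
            matches.append((qi, best))
    return matches

# Hypothetical 2D "descriptors", for illustration only.
query = [(0.0, 0.0), (5.0, 5.0)]
train = [(0.1, 0.0), (4.0, 4.0), (10.0, 10.0), (5.0, 6.0), (6.0, 5.0)]
matches = ratio_test_matches(query, train)
```

In this toy example the first query has one clearly closest candidate and is accepted, while the second query has two equally close candidates and is rejected as ambiguous. A real system would typically replace the brute-force search with an approximate nearest-neighbour index.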
\[ P = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix} \tag{7.7} \]
\[ P' = [R \mid t] \tag{7.8} \]
R is a 3 × 3 rotation matrix and t is a translation 3-vector defining the baseline of the stereo
case. P, P′ are input parameters for the subsequent plane segmentation and reconstruction step.
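The relation between the camera pair of equations 7.7 and 7.8 and the essential matrix can be checked numerically. The sketch below is not the 5-point estimator used in the thesis. It goes in the opposite direction, assuming R and t are already known and that image points are given in normalized coordinates (the calibration matrix K has been removed). It then uses the standard identity E = [t]× R and verifies the epipolar constraint x′ᵀ E x = 0. All function names and numeric values are hypothetical.

```python
import math

def skew(t):
    """Cross-product matrix [t]_x, so that skew(t) applied to v equals t x v."""
    tx, ty, tz = t
    return [[0.0, -tz, ty],
            [tz, 0.0, -tx],
            [-ty, tx, 0.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def essential_from_pose(R, t):
    """E = [t]_x R for the camera pair P = [I|0], P' = [R|t]."""
    return matmul(skew(t), R)

def project_pair(X, R, t):
    """Homogeneous normalized image points of the 3D point X in both cameras."""
    x1 = list(X)                                       # P X with P = [I|0]
    x2 = [r + ti for r, ti in zip(matvec(R, X), t)]    # P' X with P' = [R|t]
    return x1, x2

def epipolar_residual(E, x1, x2):
    """x2^T E x1, which is zero for a true correspondence."""
    return sum(a * b for a, b in zip(x2, matvec(E, x1)))

# Hypothetical pose: small rotation about the y axis plus a sideways baseline.
angle = 0.1
c, s = math.cos(angle), math.sin(angle)
R = [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]
t = [1.0, 0.0, 0.2]
E = essential_from_pose(R, t)

x1, x2 = project_pair((0.3, -0.2, 4.0), R, t)
residual_true = epipolar_residual(E, x1, x2)

# Pairing x1 with the projection of a different 3D point violates the constraint.
_, x2_wrong = project_pair((0.3, 1.0, 4.0), R, t)
residual_wrong = epipolar_residual(E, x1, x2_wrong)
```

Within a RANSAC scheme, a residual of this kind (or a geometric variant of it) is what separates inlier correspondences from outliers for each candidate E.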
The algorithm also requires initial guesses for small planar regions in the images I, I′, together
with the corresponding inter-image homographies, as input parameters. For this purpose, MSER regions
are detected in I and I′. Region matching is performed (as described in Section 6.1), which returns
point correspondences within each interest region and the corresponding homography transform. These
constitute the initial guesses for the plane segmentation algorithm. Plane segmentation and
reconstruction are then performed, yielding the following map components:
• An index image I.<br />
• A set of detected and reconstructed planes Π. Each plane is represented by a 6-vector
giving the full 3D parameters in the local coordinate frame.
• Covariances for each plane giving an uncertainty measure for the reconstruction accuracy.
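One way to picture these three components is as a small container type. The sketch below is an illustration only. The class and field names are hypothetical, the index image is assumed to store one plane identifier per pixel (with -1 for pixels that belong to no plane), and the covariance is assumed to be a 6 × 6 matrix matching the 6-vector. None of these details are specified in the text above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class PlaneEstimate:
    params: List[float]             # 6-vector of 3D plane parameters
    covariance: List[List[float]]   # assumed 6x6 uncertainty of params

    def __post_init__(self):
        if len(self.params) != 6:
            raise ValueError("plane parameters must be a 6-vector")
        if len(self.covariance) != 6 or any(len(r) != 6 for r in self.covariance):
            raise ValueError("covariance is assumed to be 6x6")

@dataclass
class StructureResult:
    index_image: List[List[int]]    # assumed: plane id per pixel, -1 = no plane
    planes: Dict[int, PlaneEstimate] = field(default_factory=dict)

    def pixels_of(self, plane_id: int) -> List[Tuple[int, int]]:
        """(row, column) positions of all pixels assigned to plane_id."""
        return [(r, c) for r, row in enumerate(self.index_image)
                for c, v in enumerate(row) if v == plane_id]

# Toy example with a 2x3 index image and a single plane.
zero_cov = [[0.0] * 6 for _ in range(6)]
result = StructureResult(
    index_image=[[0, 0, -1], [-1, 0, -1]],
    planes={0: PlaneEstimate([0.0, 0.0, 1.0, 0.0, 0.0, 5.0], zero_cov)},
)
plane_pixels = result.pixels_of(0)
```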
The structure computation is completed by a post-processing step. Planes that extend
behind the camera planes are removed. This consistency check deletes incorrectly reconstructed
image parts, resulting in higher robustness. The different steps of structure reconstruction
are illustrated in Figure 7.3.
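The consistency check can be sketched as a depth test. The exact criterion is not spelled out here, so the following is one plausible reading. It assumes each reconstructed plane is available as a set of 3D sample points in the first camera frame and that the second camera is P′ = [R|t]. A plane is kept only if all of its points have positive depth in both views. The function names are hypothetical.

```python
def depth_in_second_camera(X, R, t):
    """z coordinate of R X + t, i.e. the depth of X as seen from P' = [R|t]."""
    return sum(r * x for r, x in zip(R[2], X)) + t[2]

def in_front_of_both(X, R, t, eps=1e-9):
    """True if X has positive depth in the canonical camera and in P'."""
    return X[2] > eps and depth_in_second_camera(X, R, t) > eps

def remove_planes_behind_cameras(planes, R, t):
    """planes: dict mapping a plane id to a list of 3D sample points."""
    return {pid: pts for pid, pts in planes.items()
            if pts and all(in_front_of_both(X, R, t) for X in pts)}

# Toy example: identity rotation, second camera moved 1 unit forward along z.
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [1.0, 0.0, -1.0]
planes = {
    "a": [(0.0, 0.0, 2.0), (1.0, 0.0, 3.0)],   # in front of both cameras
    "b": [(0.0, 0.0, 2.0), (0.0, 0.0, -1.0)],  # partly behind the first camera
    "c": [(0.0, 0.0, 0.5)],                    # behind the second camera only
}
kept = remove_planes_behind_cameras(planes, R, t)
```

Rejecting a plane as soon as a single sample point fails is a deliberately strict choice. A more tolerant variant could remove only the offending image parts.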
7.1.4 Landmark extraction
Up to now, the sub-map is still missing an essential component: the landmarks. Closing this gap
is the goal of the next step. Landmark appearance has to be connected to 3D information and
incorporated into the sub-map. In the following we describe an approach using MSER interest
regions as landmarks. However, the definition of the sub-map is general enough to allow the use
of any other kind of interest regions. The necessary steps are outlined below: