PHD Thesis - Institute for Computer Graphics and Vision - Graz ...
representation. A possible classification of the different paradigms according to [65] might be: topological, metric and appearance-based.
Topological maps: Topological maps represent the environment as a connected graph. The nodes of the graph are possible places, e.g. rooms. Connected nodes therefore represent places which lie close to each other and are mutually reachable by the robot. Navigation and path planning in such a map can be difficult: the robot only learns which places need to be traversed to reach its goal, but in the absence of metric information neither direction nor distance can be given.
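The idea above can be sketched in a few lines. The places and connections below are purely illustrative, not taken from the thesis; the point is that planning yields only a sequence of places, with no directions or distances:

```python
from collections import deque

# Hypothetical topological map: nodes are places, edges connect
# directly reachable places. All names are made up for illustration.
topo_map = {
    "corridor": ["office", "kitchen", "lab"],
    "office": ["corridor"],
    "kitchen": ["corridor"],
    "lab": ["corridor", "storage"],
    "storage": ["lab"],
}

def plan_path(graph, start, goal):
    """Breadth-first search over the place graph.

    Returns the sequence of places to traverse -- but no direction
    or distance information, since none is stored in the map."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph[path[-1]]:
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None

print(plan_path(topo_map, "office", "storage"))
# ['office', 'corridor', 'lab', 'storage']
```

The result tells the robot *which* rooms to pass through, but nothing about how far apart they are or in which direction to drive — exactly the limitation described above.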
Metric maps: In a metric map the individual map elements are spatially organized, i.e. the position of a map element (landmark) is known in a common world coordinate frame. Metric maps differ widely in the landmarks they use. One possible metric world representation partitions the known world into discrete cells on a grid [81]. For each cell it is stored whether the position is occupied by an object (e.g. wall, table, etc.) or whether it is free. Such a map is often called an occupancy grid and basically represents the 2D floor plan of an environment. As the size of each grid cell is known, metric information is available and allows distance computations and metric path planning. Another possibility is to represent the world by geometric features positioned in 3D [96]. Such features can be 3D points, 3D lines, etc. Localization in such a world can be done by triangulation. The main difficulty, however, is to find the correspondences between the features in the map and the features detected in the current sensor readings.
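A minimal occupancy-grid sketch illustrates why the known cell size makes metric computations possible. The grid layout and cell size below are assumptions chosen for illustration, not values from the thesis:

```python
import math

# Toy occupancy grid: 0 = free cell, 1 = occupied (e.g. a wall).
# The grid is a 2D floor plan; the cell size provides the metric scale.
CELL_SIZE = 0.25  # metres per cell (assumed value)

grid = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 1, 0, 1],  # an obstacle in the middle of the room
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
]

def is_free(row, col):
    """True if the cell is not occupied by an object."""
    return grid[row][col] == 0

def metric_distance(cell_a, cell_b):
    """Euclidean distance between two cell centres in metres --
    possible only because the grid resolution is known."""
    dr = (cell_a[0] - cell_b[0]) * CELL_SIZE
    dc = (cell_a[1] - cell_b[1]) * CELL_SIZE
    return math.hypot(dr, dc)

print(is_free(2, 2))                    # False: cell holds an obstacle
print(metric_distance((1, 1), (3, 3)))  # ~0.707 metres
```

Because each cell maps to a fixed physical size, a path through free cells can be converted directly into a travelled distance — the basis of the metric path planning mentioned above.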
Appearance-based maps: In such an approach the world is represented by raw sensor data. The map is simply a collection of all previously acquired sensor readings [51]. Guided navigation and localization are difficult, but the main problem is the scalability of the approach: simply storing all the sensor readings consumes a lot of memory and poses a big problem for large-scale maps.
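The principle can be sketched as nearest-neighbour matching against stored raw readings. The toy one-dimensional "scans" and place labels below are invented for illustration; real systems compare images or full range scans, which is exactly where the memory problem arises:

```python
# Appearance-based map sketch: the map is just a list of raw sensor
# readings paired with place labels. Localization finds the stored
# reading most similar to the current one -- no metric pose results,
# only "which stored view looks alike". All data are made up.

def scan_distance(a, b):
    # Sum of squared differences between two readings.
    return sum((x - y) ** 2 for x, y in zip(a, b))

stored_readings = [
    ("place_A", [0.9, 0.8, 0.1, 0.2]),
    ("place_B", [0.1, 0.2, 0.9, 0.8]),
    ("place_C", [0.5, 0.5, 0.5, 0.5]),
]

def localize(current_scan):
    """Return the label of the most similar stored reading."""
    best = min(stored_readings,
               key=lambda entry: scan_distance(entry[1], current_scan))
    return best[0]

print(localize([0.85, 0.75, 0.15, 0.2]))  # place_A
```

Note that the map grows linearly with every reading ever taken, which makes the scalability issue described above concrete.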
Beyond this classification, combinations of the different approaches have also been proposed in the literature. In [103], metric grid maps are connected by a topological layer on top to generate the world representation.
1.2 Why vision?
Most of the maps described in the previous section do not depend on a specific kind of sensor. In fact, research is done with a variety of different sensors. The prominent sensors for robot localization are wheel encoders (odometry), inertial sensors, sonar, infrared, laser range finders and, of course, vision sensors. Each sensor type has different advantages and disadvantages; a list may be found in [65]. Wheel encoders and inertial sensors provide direct information about the path of the robot. Sonar, infrared sensors and laser range finders are ranging devices. They provide the robot with more or less (depending on the type of sensor) accurate distances to objects in its vicinity. However, they only provide distance information. Compared to these sensors, a vision sensor seems to be the most powerful one. A vision sensor can provide odometry information, as described in [84]. It can also act as a ranging device, either in a stereo setup (demonstrated in [80]) or with a structure-from-motion approach [106]. In addition, a vision sensor allows recording the appearance of the world surrounding the robot. The visual appearance of landmarks can thus be associated with range information. A vision sensor