PHD Thesis - Institute for Computer Graphics and Vision - Graz ...
representation. A possible classification of the different paradigms according to [65] might be: topological, metric and appearance-based.
Topological maps: Topological maps represent the environment as a connected graph. The nodes of the graph are possible places, e.g. rooms. Connected nodes therefore represent places which lie close to each other and are mutually reachable by the robot. Navigation and path planning in such a map can be difficult: the robot only learns which places need to be traversed to reach its goal, but in the absence of metric information neither direction nor distance can be given.
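The idea above can be sketched in a few lines. The places and connections below are purely illustrative, not taken from the thesis; the point is that planning yields only a sequence of places, with no directions or distances:

```python
from collections import deque

# Hypothetical topological map: nodes are places, edges connect
# directly reachable places. All names are made up for illustration.
topo_map = {
    "corridor": ["office", "kitchen", "lab"],
    "office": ["corridor"],
    "kitchen": ["corridor"],
    "lab": ["corridor", "storage"],
    "storage": ["lab"],
}

def plan_path(graph, start, goal):
    """Breadth-first search over the place graph.

    Returns the sequence of places to traverse -- but no direction
    or distance information, since none is stored in the map."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph[path[-1]]:
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None

print(plan_path(topo_map, "office", "storage"))
# ['office', 'corridor', 'lab', 'storage']
```

The result tells the robot *which* rooms to pass through, but nothing about how far apart they are or in which direction to drive — exactly the limitation described above.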
Metric maps: In a metric map the individual map elements are spatially organized, i.e. the position of a map element (landmark) is known in a common world coordinate frame. Metric maps differ widely in the landmarks they use. One possible metric world representation partitions the known world into discrete cells on a grid [81]. For each cell it is stored whether the position is occupied by an object (e.g. wall, table, etc.) or whether it is free. Such a map is often called an occupancy grid and basically represents the 2D floor plan of an environment. As the size of each grid cell is known, metric information is available and allows distance computations and metric path planning. Another possibility is to represent the world by geometric features positioned in 3D [96]. Such features can be 3D points, 3D lines, etc. Localization in such a world can be done by triangulation. The main difficulty, however, is to find the correspondences between the features in the map and the features detected in the current sensor readings.
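A minimal occupancy-grid sketch illustrates why the known cell size makes metric computations possible. The grid layout and cell size below are assumptions chosen for illustration, not values from the thesis:

```python
import math

# Toy occupancy grid: 0 = free cell, 1 = occupied (e.g. a wall).
# The grid is a 2D floor plan; the cell size provides the metric scale.
CELL_SIZE = 0.25  # metres per cell (assumed value)

grid = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 1, 0, 1],  # an obstacle in the middle of the room
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
]

def is_free(row, col):
    """True if the cell is not occupied by an object."""
    return grid[row][col] == 0

def metric_distance(cell_a, cell_b):
    """Euclidean distance between two cell centres in metres --
    possible only because the grid resolution is known."""
    dr = (cell_a[0] - cell_b[0]) * CELL_SIZE
    dc = (cell_a[1] - cell_b[1]) * CELL_SIZE
    return math.hypot(dr, dc)

print(is_free(2, 2))                    # False: cell holds an obstacle
print(metric_distance((1, 1), (3, 3)))  # ~0.707 metres
```

Because each cell maps to a fixed physical size, a path through free cells can be converted directly into a travelled distance — the basis of the metric path planning mentioned above.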
Appearance-based maps: In such an approach the world is represented by raw sensor data. The map is simply a collection of all previously acquired sensor readings [51]. Guided navigation and localization are difficult, but the main problem is the scalability of the approach: simply storing all the sensor readings consumes a lot of memory and poses a big problem for large-scale maps.
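The principle can be sketched as nearest-neighbour matching against stored raw readings. The toy one-dimensional "scans" and place labels below are invented for illustration; real systems compare images or full range scans, which is exactly where the memory problem arises:

```python
# Appearance-based map sketch: the map is just a list of raw sensor
# readings paired with place labels. Localization finds the stored
# reading most similar to the current one -- no metric pose results,
# only "which stored view looks alike". All data are made up.

def scan_distance(a, b):
    # Sum of squared differences between two readings.
    return sum((x - y) ** 2 for x, y in zip(a, b))

stored_readings = [
    ("place_A", [0.9, 0.8, 0.1, 0.2]),
    ("place_B", [0.1, 0.2, 0.9, 0.8]),
    ("place_C", [0.5, 0.5, 0.5, 0.5]),
]

def localize(current_scan):
    """Return the label of the most similar stored reading."""
    best = min(stored_readings,
               key=lambda entry: scan_distance(entry[1], current_scan))
    return best[0]

print(localize([0.85, 0.75, 0.15, 0.2]))  # place_A
```

Note that the map grows linearly with every reading ever taken, which makes the scalability issue described above concrete.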
Beyond this classification, combinations of the different approaches have also been proposed in the literature. In [103], metric grid maps are connected by a topological layer on top to generate the world representation.
1.2 Why vision?
Most of the maps described in the previous section do not depend on a specific kind of sensor. In fact, research is done with a variety of different sensors. The prominent sensors for robot localization are wheel encoders (odometry), inertial sensors, sonar, infrared, laser range finders and, of course, vision sensors. Each sensor type has different advantages and disadvantages; a list may be found in [65]. Wheel encoders and inertial sensors provide direct information about the path of the robot. Sonar, infrared sensors and laser range finders are ranging devices. They provide the robot with more or less (depending on the type of sensor) accurate distances to objects in its vicinity. However, they only provide distance information. Compared to these sensors, a vision sensor seems to be the most powerful one. A vision sensor can provide odometry information, as described in [84]. It can also act as a ranging device, either in a stereo setup (demonstrated in [80]) or with a structure-from-motion approach [106]. In addition, a vision sensor allows recording the appearance of the world surrounding the robot. The visual appearance of landmarks can thus be associated with range information. A vision sensor