

resentation. Following [65], the different paradigms can be classified as topological, metric, and appearance-based.

Topological maps: Topological maps represent the environment as a connected graph. The nodes of the graph are possible places, e.g. rooms. Connected nodes therefore represent places which are located close to each other and are reachable for the robot. Navigation and path planning in such a map can be difficult: the robot only learns which places need to be traversed to reach its goal, but in the absence of metric information no direction or distance information can be given.
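Path planning over such a topological map reduces to graph search. A minimal sketch, assuming a hypothetical floor layout with rooms as nodes and traversable connections as edges:

```python
from collections import deque

def shortest_place_sequence(adjacency, start, goal):
    """Breadth-first search over a topological map: nodes are places,
    edges connect places the robot can travel between directly."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbor in adjacency[path[-1]]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None  # goal not reachable from start

# Hypothetical floor layout: rooms connected by doorways.
floor = {
    "hallway": ["office", "kitchen"],
    "office": ["hallway"],
    "kitchen": ["hallway", "storage"],
    "storage": ["kitchen"],
}
print(shortest_place_sequence(floor, "office", "storage"))
# ['office', 'hallway', 'kitchen', 'storage']
```

Note that the returned path is only a sequence of places to traverse; exactly as described above, it carries no direction or distance information.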

Metric maps: In a metric map the individual map elements are spatially organized, i.e. the position of a map element (landmark) is known in a common world coordinate frame. Metric maps can differ widely in the landmarks used. One possible metric world representation partitions the known world by a grid into discrete cells [81]. For each cell it is stored whether that position is occupied by an object (e.g. wall, tables, etc.) or free. Such a map is often called an occupancy grid and basically represents the 2D floor plan of an environment. As the size of each grid cell is known, metric information is available and allows distance computations and metric path planning. Another possibility is to represent the world by geometric features which are positioned in 3D [96]. Such features can be 3D points, 3D lines, etc. Localization in such a world can be done by triangulation. The main difficulty, however, is to find the correspondences between the features in the map and the features detected in the current sensor readings.
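The occupancy-grid representation can be illustrated in a few lines; the cell size, grid dimensions, and wall placement below are assumed values for the example:

```python
import numpy as np

CELL_SIZE = 0.05  # assumed resolution: 5 cm per grid cell

# 0 = free, 1 = occupied; a tiny floor plan with one wall segment.
grid = np.zeros((8, 8), dtype=np.uint8)
grid[3, 2:6] = 1  # row 3, columns 2..5 are occupied by a wall

def is_free(grid, x, y):
    """Map a metric position (in metres) to a cell and check occupancy."""
    row, col = int(y / CELL_SIZE), int(x / CELL_SIZE)
    return grid[row, col] == 0

print(is_free(grid, 0.12, 0.17))  # falls into occupied cell (3, 2) -> False
print(is_free(grid, 0.33, 0.08))  # falls into free cell (1, 6) -> True
```

Because the cell size is fixed and known, metric distances between any two cells follow directly from their indices, which is what enables metric path planning on such a map.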

Appearance-based maps: In such an approach the world is represented by raw sensor data. The map is simply a collection of all previously acquired sensor readings [51]. Guided navigation and localization are difficult, but the main problem is the scalability of the approach: simply storing all sensor readings consumes a lot of memory and poses a big problem for large-scale maps.
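A minimal sketch of appearance-based localization, assuming each stored reading is summarized as a fixed-length descriptor vector: the current reading is matched against every stored one, which also makes the scalability problem visible, since both memory and lookup cost grow linearly with the number of stored readings.

```python
import numpy as np

def localize_by_appearance(current, stored_readings):
    """Appearance-based localization: return the index of the stored
    sensor reading most similar to the current one (nearest neighbour
    in descriptor space)."""
    distances = [np.linalg.norm(current - r) for r in stored_readings]
    return int(np.argmin(distances))

# Hypothetical map: one 16-dimensional descriptor per visited place.
rng = np.random.default_rng(0)
stored = [rng.random(16) for _ in range(100)]

# A noisy re-observation of place 42 should match back to place 42.
query = stored[42] + 0.01 * rng.random(16)
print(localize_by_appearance(query, stored))  # 42
```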

In addition to this classification, combinations of the different approaches have also been proposed in the literature. In [103] metric grid maps are connected by a topological approach on top to generate the world representation.

1.2 Why vision?

Most of the maps described in the previous section do not depend on a specific kind of sensor. In fact, research is done with a variety of different sensors. The prominent sensors for robot localization are wheel encoders (odometry), inertial sensors, sonar, infrared, laser range finders and of course vision sensors. Each sensor type has different advantages and disadvantages; a list may be found in [65]. Wheel encoders and inertial sensors provide direct information about the path of the robot. Sonar, infrared sensors and laser range finders are ranging devices. They provide the robot with more or less (depending on the type of sensor) accurate distances to objects in its vicinity. However, they only provide distance information. Compared to these sensors, a vision sensor seems to be the most powerful one. A vision sensor can provide odometry information as described in [84]. It can also act as a ranging device, either in a stereo setup (demonstrated in [80]) or with a structure-from-motion approach [106]. In addition, a vision sensor allows recording the appearance of the world surrounding the robot. The visual appearance of landmarks can then be associated with range information. A vision sensor
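For the stereo setup mentioned above, range follows from the standard relation between disparity and depth, Z = f · B / d; the focal length, baseline, and disparity below are assumed example values:

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth of a point seen in a rectified stereo pair: Z = f * B / d,
    with focal length f in pixels, baseline B in metres, disparity d in
    pixels."""
    return focal_px * baseline_m / disparity_px

# Assumed camera: 700 px focal length, 12 cm baseline, 20 px disparity.
print(depth_from_disparity(700.0, 0.12, 20.0))  # 4.2 (metres)
```

The relation also shows why nearby objects are measured more accurately than distant ones: depth is inversely proportional to disparity, so a one-pixel disparity error matters far more at small disparities.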
