PhD Thesis Semi-Supervised Ensemble Methods for Computer Vision
Chapter 1. Introduction
Figure 1.1: The eyes are the most important sense organs for a frog.
1982] and states the scientific field in which the thesis at hand is placed.
Why is it worth doing research on computer vision? The answer to this question can
be split into two parts. First, from a researcher’s perspective, there is the human
pioneering spirit: we simply want to find out whether we can do it. Second, there
exists a huge number of potential applications. Some of these are optical character
recognition (OCR), web-based image search, industrial machine inspection, biomedical
imaging, surveillance, 3D model building and photogrammetry, automotive safety, biometrics,
and robotics. See also David Lowe’s website of industrial vision applications¹.
Overall, computer vision already comprises a multi-billion dollar market with expected
steady growth.
Vision is hard. The human visual system rapidly and effortlessly recognizes a large
number of diverse objects despite large variations in object position, pose, lighting,
and background clutter. Additionally, we can easily segment an object, analyze its shape,
and track it. Building computational systems that are able to achieve the same performance
is extremely hard. One of the reasons for this difficulty is that “reverse engineering
the brain” in order to emulate it on machines is very complicated. Although cognitive science
has made considerable progress in understanding how the brain solves visual tasks
(see above), we are still far from fully understanding how human perception works.
Additionally, unlike humans, computers are provided with digital data, and from a machine’s
perspective an image is nothing more than a matrix of numbers. The size and the quality
of these matrices may vary a lot. Hence, the way computer vision is done today can
also be interpreted as searching for useful information in matrices, and researchers still
argue whether this is the right path to follow.
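To make the "matrix of numbers" view concrete, the following minimal Python sketch treats a tiny grayscale image as a nested list and searches it for a bright region. The 4x4 pixel values, the threshold of 128, and the helper names `bright_pixels` and `bounding_box` are assumptions chosen purely for illustration; they are not taken from the thesis.

```python
# Hypothetical 4x4 grayscale image: each entry is a pixel intensity in [0, 255].
image = [
    [10,  12,  11, 10],
    [11, 200, 210, 12],
    [10, 205, 198, 11],
    [12,  11,  10, 10],
]


def bright_pixels(img, threshold=128):
    """Return (row, col) positions whose intensity exceeds the threshold."""
    return [
        (r, c)
        for r, row in enumerate(img)
        for c, value in enumerate(row)
        if value > threshold
    ]


def bounding_box(points):
    """Smallest axis-aligned box (top, left, bottom, right) around the points.

    Returns None when there are no points at all.
    """
    if not points:
        return None
    rows = [r for r, _ in points]
    cols = [c for _, c in points]
    return min(rows), min(cols), max(rows), max(cols)


# "Searching for useful information in a matrix": locate the bright blob.
blob = bright_pixels(image)
box = bounding_box(blob)
print(blob)
print(box)
```

Even in this toy case the "object" is found only because the assumed threshold happens to suit the data. A global change in lighting would shift all intensities and could make the same rule fail, which hints at why the search is so much harder on real images.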
Computer vision is an inverse discipline; that is, we have to find a solution to a problem
for which we are provided with an insufficient amount of information. Inverse problems
are typically ill-posed, i.e., there does not exist a unique solution [Hadamard,
¹ http://www.cs.ubc.ca/spider/lowe/vision.html (01.04.2010)