
PhD Thesis Semi-Supervised Ensemble Methods for Computer Vision


Chapter 3. Overview of Semi-Supervised Learning

3.8 Computer Vision Applications

In computer vision, probably the most frequently applied semi-supervised learning algorithm is co-training. For example, Levin et al. [Levin et al., 2003] used co-training to train a car detector. They start with a small number of hand-labeled samples and generate additional labeled examples by co-training two boosted off-line classifiers, one trained on gray-value images and the other on background-subtracted images. Moreover, Javed et al. [Javed et al., 2005] applied an arbitrary number of classifiers and extended the method to on-line learning. In particular, they first generate a seed model by off-line boosting, which is later improved by on-line boosting. If multiple disjoint views exist, co-training can also be applied for tracking, e.g., [Tang et al., 2007, Yu et al., 2008, Liu et al., 2009]. There also exist several approaches based on deep neural networks that improve visual recognition performance using unlabeled data, e.g., [Yu et al., 2008, Mobahi and Collobert, 2009]. Recently, Fergus et al. [Fergus et al., 2009] presented a semi-supervised framework that is able to learn object classifiers from 80 million images. In particular, they propose a graph-based method that scales linearly with the number of samples and thus allows for large-scale usage. Guillaumin et al. [Guillaumin and Schmid, 2010] proposed a multimodal SSL approach for image categorization, where the main idea is to exploit, in addition to the visual information, other sources of information such as the text surrounding images on web pages. Socher and Fei-Fei [Socher and Fei-Fei, 2010] applied semi-supervised learning to image segmentation.
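The generic co-training loop underlying the approaches above can be sketched as follows. This is a minimal illustrative sketch, not any of the cited detectors' actual implementation: the two "views" are single scalar features, and the nearest-centroid base learner and the margin-based confidence measure are placeholder choices standing in for boosted classifiers.

```python
# Minimal co-training sketch: two classifiers, each trained on its own
# feature view, iteratively label the unlabeled samples they are most
# confident about and add them to the shared labeled set.

def train_centroid(X, y):
    """Per-class mean of 1-D features: a stand-in for a real base learner."""
    means = {}
    for label in set(y):
        vals = [x for x, t in zip(X, y) if t == label]
        means[label] = sum(vals) / len(vals)
    return means

def predict_conf(model, x):
    """Return (label, confidence); confidence = margin between the two
    nearest class centroids."""
    d = sorted((abs(x - m), label) for label, m in model.items())
    label = d[0][1]
    margin = d[1][0] - d[0][0] if len(d) > 1 else d[0][0]
    return label, margin

def co_train(view1, view2, labels, u1, u2, rounds=5, per_round=2):
    """view1/view2: labeled features per view; u1/u2: unlabeled features."""
    X1, X2, y = list(view1), list(view2), list(labels)
    U = list(range(len(u1)))          # indices of still-unlabeled samples
    for _ in range(rounds):
        if not U:
            break
        m1 = train_centroid(X1, y)
        m2 = train_centroid(X2, y)
        # score each unlabeled sample by the more confident of the two views
        scored = []
        for i in U:
            l1, c1 = predict_conf(m1, u1[i])
            l2, c2 = predict_conf(m2, u2[i])
            scored.append((max(c1, c2), i, l1 if c1 >= c2 else l2))
        scored.sort(reverse=True)
        # move the most confidently labeled samples into the labeled set
        for _, i, label in scored[:per_round]:
            X1.append(u1[i]); X2.append(u2[i]); y.append(label)
            U.remove(i)
    return train_centroid(X1, y), train_centroid(X2, y)
```

As in [Levin et al., 2003], only a few hand-labeled seeds are needed; the unlabeled pool is absorbed over several rounds, each classifier effectively teaching the other through the samples it labels confidently.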

3.9 SSL from Weakly Related Data

As has been shown above, there exists a large number of methods and algorithms for the semi-supervised learning problem. These approaches often differ only in the assumptions they impose on the unlabeled data (e.g., the manifold assumption or the large margin assumption) and in the supervised learning method they build on, such as SVMs or boosting. Yet, one assumption that most of them have in common is that all samples are drawn i.i.d. from the underlying data distribution P(X, Y), i.e., that the samples are independent and identically distributed. However, in practice, unlabeled data does not necessarily come from the same distribution as the labeled data.
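The i.i.d. assumption can at least be sanity-checked before running an SSL method. A minimal sketch (the helper below is hypothetical, not from the thesis) compares the first moment of a single feature across the labeled and unlabeled sets; a large standardized difference hints that the unlabeled pool was drawn from a different distribution:

```python
import statistics

def distribution_shift_score(labeled, unlabeled):
    """Crude check of the i.i.d. assumption: standardized difference of
    the sample means of one feature (a z-like statistic). Values near 0
    are consistent with both samples sharing one distribution; large
    values suggest the unlabeled pool comes from a different one."""
    m_l = statistics.mean(labeled)
    m_u = statistics.mean(unlabeled)
    # pooled population std. dev.; guard against a degenerate zero spread
    s = statistics.pstdev(labeled + unlabeled) or 1.0
    return abs(m_l - m_u) / s
```

In practice one would compare full feature distributions (e.g., per-dimension statistics or a two-sample test) rather than a single mean, but even this crude score separates a matched unlabeled pool from a clearly shifted one.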

Another problem that occurs in practice but is ignored by most approaches is that, although unlabeled data is usually easy to obtain, unlabeled data containing a sufficient amount of target-class samples is not. For instance, consider the problem of training a visual object detector for alpacas (see Figure 3.3a). For this task, it is difficult
