
Discussion   As we have seen, by training a prior classifier F^P from labeled samples provided a priori, it is possible to include unlabeled data in the on-line boosting framework using pseudo-labels and pseudo-importances. Note that if some labeled data is given in advance, it can be used to train the prior (or even to update the prior as it arrives over time). However, the incorporation of prior knowledge is not limited to a prior classifier; in principle, any source of prior knowledge can be used. Our on-line semi-supervised boosting algorithm for feature selection is sketched in Algorithm 5.1. Compared to the original on-line boosting algorithm [Grabner and Bischof, 2006], only a few lines of code have to be changed in order to incorporate unlabeled data.
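
To illustrate where these few lines enter, the following minimal Python sketch shows the on-line update for a single sample. The selector interface (update, predict, error) is hypothetical, and the pseudo-label and pseudo-importance are assumed here to be the sign and a bounded magnitude of the prior's confidence, which is only one possible choice, not necessarily the one used in Algorithm 5.1:

    # Minimal sketch of one on-line update with an optional prior classifier.
    # All names below are illustrative; this is not the code of Algorithm 5.1.
    import math

    def online_semi_boost_update(selectors, x, y=None, prior=None):
        """Process one sample x on-line; y is None for unlabeled samples."""
        if y is None:
            conf = prior(x)                  # confidence F^P(x) of the prior
            y = 1 if conf >= 0 else -1       # pseudo-label
            lam = abs(math.tanh(conf))       # pseudo-importance (assumed bounded form)
        else:
            lam = 1.0                        # labeled samples start with importance 1
        for sel in selectors:
            sel.update(x, y, lam)            # update weak hypotheses, select best feature
            # importance re-weighting as in on-line boosting [Grabner and Bischof, 2006]
            if sel.predict(x) == y:
                lam *= 1.0 / (2.0 * (1.0 - sel.error))
            else:
                lam *= 1.0 / (2.0 * sel.error)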

5.2 On Robustness of On-line Boosting

Boosting is currently one of the best performing and therefore one of the most widely applied classification methods in machine learning. This also holds for the on-line variant, which is frequently used in practice. However, boosting has been shown, both theoretically and experimentally, to be sensitive to label noise. For off-line boosting, this issue was recognized relatively early, and several more robust methods [Maclin and Opitz, 1997, Mason et al., 1999, Dietterich, 1998, Friedman et al., 2000, Freund, 2000, Domingo and Watanabe, 2000, Long and Servedio, 2008b, Masnadi-Shirazi and Vasconcelos, 2008] have been proposed.

Although robustness is highly important for on-line boosting, because it is frequently applied in autonomous learning problems, this issue has, in contrast to the off-line case, not been studied so far. Thus, in the following, we study the robustness of on-line boosting for feature selection with respect to label noise. In particular, we follow the work of Long and Servedio [Long and Servedio, 2008b], who showed that the loss function strongly influences not only the learning behavior but also the robustness. Especially convex loss functions, as typically used in boosting, are highly susceptible to random noise. Hence, to increase the robustness, the goal is to find more suitable, less noise-sensitive loss functions.
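
To make this concrete, consider the weight that a loss l assigns to a sample as a function of its margin m = y F(x), i.e., w(m) = -l'(m); the two standard losses below serve only as an illustration and are not necessarily the losses studied later:

    l_exp(m) = exp(-m)             with weight  w_exp(m) = exp(-m),
    l_log(m) = log(1 + exp(-m))    with weight  w_log(m) = 1 / (1 + exp(m)) <= 1.

For a wrongly labeled sample the margin tends to become increasingly negative, so under the exponential loss its weight grows without bound and a few noisy samples can dominate the weight distribution, whereas the bounded logistic weight saturates at 1. Losses whose weight stays bounded (or even vanishes) for large negative margins therefore limit the influence a single noisy sample can have.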

For that purpose, we first introduce a generic boosting algorithm, which we call Online GradientBoost, into which arbitrary loss functions can be plugged easily. In fact, this method extends the GradientBoost algorithm of Friedman [Friedman, 2001] and is similar to the AnyBoost algorithm of Mason et al. [Mason et al., 1999]. Based on this algorithm, we develop different on-line boosting methods using the loss functions proposed for robust off-line boosting algorithms.
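
As a rough sketch of the pluggable-loss idea, under simplified assumptions, only the weight function w(m) = -l'(m) of the chosen loss is needed in the on-line update; the selector interface and the loss list below are illustrative and not the exact formulation of Online GradientBoost given later:

    # Sketch: on-line boosting update with a pluggable loss, given only its
    # weight function w(m) = -loss'(m), where m = y * F(x) is the margin.
    import math

    WEIGHT_FUNCTIONS = {
        'exponential': lambda m: math.exp(-m),               # from loss exp(-m)
        'logistic':    lambda m: 1.0 / (1.0 + math.exp(m)),  # from loss log(1 + exp(-m))
    }

    def online_gradient_boost_update(selectors, x, y, weight_fn):
        """Process one labeled sample (x, y), with y in {-1, +1}."""
        F = 0.0                      # ensemble score accumulated so far
        lam = weight_fn(0.0)         # initial weight at zero margin
        for sel in selectors:
            sel.update(x, y, lam)    # train weak hypotheses with current weight
            F += sel.alpha * sel.predict(x)
            lam = weight_fn(y * F)   # re-weight from the margin of the partial ensemble

    # Usage (hypothetical): plug in a less noise-sensitive loss via its weight function
    # online_gradient_boost_update(selectors, x, y, WEIGHT_FUNCTIONS['logistic'])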

We speak of label noise if a sample was assigned a wrong label during the labeling process of the data. AdaBoost is highly susceptible to such noise. This sensitivity stems from the fact that AdaBoost increases the weight of a mis-classified sample in each iteration.
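
This mechanism can be made explicit with a standard form of AdaBoost's (unnormalized) weight update, which is textbook material rather than a result of this thesis: in round t each sample weight is updated as

    w_i <- w_i * exp(-alpha_t * y_i * h_t(x_i)),   with alpha_t > 0,

so a mis-classified sample (y_i h_t(x_i) = -1) has its weight multiplied by exp(alpha_t). A noisy sample that keeps being mis-classified therefore reaches, after T rounds and up to normalization, a weight proportional to exp(sum_{t=1}^{T} alpha_t), i.e., its influence grows exponentially and increasingly dominates the training of subsequent weak hypotheses.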
