
Discussion   As we have seen, by training a prior classifier F^P from labeled samples provided a priori, it is possible to include unlabeled data in the on-line boosting framework using pseudo-labels and pseudo-importances. Note that if some labeled data is given in advance, it can be used to train the prior (or even to update the prior as it arrives over time). However, the incorporation of prior knowledge is not limited to a prior classifier; in principle, any source of prior knowledge can be used. Our on-line semi-supervised boosting algorithm for feature selection is sketched in Algorithm 5.1. Compared to the original on-line boosting algorithm [Grabner and Bischof, 2006], only a few lines of code have to be changed in order to incorporate unlabeled data.
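
To illustrate where these few lines enter, the following minimal Python sketch shows the on-line update for a single sample. The selector interface (update, predict, error) is hypothetical, and the pseudo-label and pseudo-importance are assumed here to be the sign and a bounded magnitude of the prior's confidence, which is only one possible choice, not necessarily the one used in Algorithm 5.1:

    # Minimal sketch of one on-line update with an optional prior classifier.
    # All names below are illustrative; this is not the code of Algorithm 5.1.
    import math

    def online_semi_boost_update(selectors, x, y=None, prior=None):
        """Process one sample x on-line; y is None for unlabeled samples."""
        if y is None:
            conf = prior(x)                  # confidence F^P(x) of the prior
            y = 1 if conf >= 0 else -1       # pseudo-label
            lam = abs(math.tanh(conf))       # pseudo-importance (assumed bounded form)
        else:
            lam = 1.0                        # labeled samples start with importance 1
        for sel in selectors:
            sel.update(x, y, lam)            # update weak hypotheses, select best feature
            # importance re-weighting as in on-line boosting [Grabner and Bischof, 2006]
            if sel.predict(x) == y:
                lam *= 1.0 / (2.0 * (1.0 - sel.error))
            else:
                lam *= 1.0 / (2.0 * sel.error)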

5.2 On Robustness of On-line Boosting

Boosting is currently one of the best performing and therefore one of the most widely applied classification methods in machine learning. This also holds for the on-line variant, which is frequently used in practice. However, boosting has been shown, both theoretically and experimentally, to be sensitive to label noise. For off-line boosting, this issue was recognized relatively early, and several more robust methods [Maclin and Opitz, 1997, Mason et al., 1999, Dietterich, 1998, Friedman et al., 2000, Freund, 2000, Domingo and Watanabe, 2000, Long and Servedio, 2008b, Masnadi-Shirazi and Vasconcelos, 2008] have been proposed.

Although robustness is highly important for on-line boosting, because it is frequently applied in autonomous learning problems, this issue has, in contrast to the off-line case, not been studied so far. Thus, in the following, we study the robustness of on-line boosting for feature selection with respect to label noise. In particular, we follow the work of Long and Servedio [Long and Servedio, 2008b], who showed that the loss function strongly influences not only the learning behavior but also the robustness. Especially convex loss functions, as typically used in boosting, are highly susceptible to random noise. Hence, to increase the robustness, the goal is to find more suitable, less noise-sensitive loss functions.
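
To make this concrete, consider the weight that a loss l assigns to a sample as a function of its margin m = y F(x), i.e., w(m) = -l'(m); the two standard losses below serve only as an illustration and are not necessarily the losses studied later:

    l_exp(m) = exp(-m)             with weight  w_exp(m) = exp(-m),
    l_log(m) = log(1 + exp(-m))    with weight  w_log(m) = 1 / (1 + exp(m)) <= 1.

For a wrongly labeled sample the margin tends to become increasingly negative, so under the exponential loss its weight grows without bound and a few noisy samples can dominate the weight distribution, whereas the bounded logistic weight saturates at 1. Losses whose weight stays bounded (or even vanishes) for large negative margins therefore limit the influence a single noisy sample can have.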

For that purpose, we first introduce a generic boosting algorithm, which we call Online GradientBoost, into which arbitrary loss functions can be plugged easily. In fact, this method extends the GradientBoost algorithm of Friedman [Friedman, 2001] and is similar to the AnyBoost algorithm of Mason et al. [Mason et al., 1999]. Based on this algorithm, we develop different on-line boosting methods using the loss functions proposed for robust off-line boosting algorithms.
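
As a rough sketch of the pluggable-loss idea, under simplified assumptions, only the weight function w(m) = -l'(m) of the chosen loss is needed in the on-line update; the selector interface and the loss list below are illustrative and not the exact formulation of Online GradientBoost given later:

    # Sketch: on-line boosting update with a pluggable loss, given only its
    # weight function w(m) = -loss'(m), where m = y * F(x) is the margin.
    import math

    WEIGHT_FUNCTIONS = {
        'exponential': lambda m: math.exp(-m),               # from loss exp(-m)
        'logistic':    lambda m: 1.0 / (1.0 + math.exp(m)),  # from loss log(1 + exp(-m))
    }

    def online_gradient_boost_update(selectors, x, y, weight_fn):
        """Process one labeled sample (x, y), with y in {-1, +1}."""
        F = 0.0                      # ensemble score accumulated so far
        lam = weight_fn(0.0)         # initial weight at zero margin
        for sel in selectors:
            sel.update(x, y, lam)    # train weak hypotheses with current weight
            F += sel.alpha * sel.predict(x)
            lam = weight_fn(y * F)   # re-weight from the margin of the partial ensemble

    # Usage (hypothetical): plug in a less noise-sensitive loss via its weight function
    # online_gradient_boost_update(selectors, x, y, WEIGHT_FUNCTIONS['logistic'])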

We speak of label noise if a sample was assigned a wrong label during the labeling process of the data. AdaBoost is highly susceptible to such noise. This sensitivity stems from the fact that AdaBoost increases the weight of a mis-classified sample in each iteration.
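
This mechanism can be made explicit with a standard form of AdaBoost's (unnormalized) weight update, which is textbook material rather than a result of this thesis: in round t each sample weight is updated as

    w_i <- w_i * exp(-alpha_t * y_i * h_t(x_i)),   with alpha_t > 0,

so a mis-classified sample (y_i h_t(x_i) = -1) has its weight multiplied by exp(alpha_t). A noisy sample that keeps being mis-classified therefore reaches, after T rounds and up to normalization, a weight proportional to exp(sum_{t=1}^{T} alpha_t), i.e., its influence grows exponentially and increasingly dominates the training of subsequent weak hypotheses.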
