
KL-Exponential [Saffari et al., 2008]:
    w_x^U ← |y_p(x) cosh(F(x)) − sinh(F(x))| e^{−y_p(x)F(x)}
    ŷ_x ← sign(y_p(x) cosh(F(x)) − sinh(F(x)))

KL (logit):
    w_x^U ← |y_p(x) − tanh(F(x))|
    ŷ_x ← sign(y_p(x) − tanh(F(x)))

Table 5.3: Comparison of the update rules for the unlabeled-sample weight w_x^U and the pseudo-label ŷ_x under different loss functions, i.e., the exponential and the logit loss.
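To make the table concrete, the following sketch (a direct transcription of the two columns; the function names and the NumPy vectorization are ours) computes the unlabeled-sample weight w_x^U and the pseudo-label ŷ_x for both losses, given the prior response y_p(x) and the current ensemble response F(x):

```python
import numpy as np

def kl_exponential_update(y_p, F):
    """Update rule for the KL-exponential loss (left column of Table 5.3).
    y_p: prior response y_p(x); F: ensemble response F(x); scalars or arrays."""
    m = y_p * np.cosh(F) - np.sinh(F)   # shared margin-like term
    w = np.abs(m) * np.exp(-y_p * F)    # unlabeled weight w_x^U
    return w, np.sign(m)                # (w_x^U, pseudo-label)

def kl_logit_update(y_p, F):
    """Update rule for the KL (logit) loss (right column of Table 5.3)."""
    m = y_p - np.tanh(F)
    return np.abs(m), np.sign(m)        # (w_x^U, pseudo-label)
```

Note that in each rule the weight and the pseudo-label share the same inner term, so both can be computed in a single pass.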

5.4 Machine Learning Experiments

In this set of experiments, we evaluated on-line SemiBoost and on-line SERBoost on standard semi-supervised benchmark data sets taken from [Chapelle et al., 2006]¹. The data consists of both artificial and real-life data sets. Furthermore, on some data sets the cluster assumption holds, while on others the manifold assumption holds. A summary of the data sets is presented in Table 5.4. We compared our methods to supervised off-line variants of nearest neighbor (1-NN), SVM, and AdaBoost. We used LaRank SVM [Bordes et al., 2007], as well as on-line GradientBoost with logistic loss (OLogitBoost) [Leistner et al., 2009a] and standard on-line boosting (OAdaBoost) [Grabner and Bischof, 2006], both with LaRank as weak learners. For the semi-supervised comparison we took the off-line versions of SERBoost [Saffari et al., 2008], ManifoldBoost [Loeff et al., 2008], and TSVM [Joachims, 1999]. For both on-line and off-line SERBoost we used k-means clustering as the prior. For SemiBoost we used the Euclidean distance to calculate S_ij, with σ set using 5-fold cross-validation. For all boosting methods we used 50 weak learners. For gradient-based methods we set the shrinkage factor to ν = 0.1. The importance λ of the unlabeled data was set to 0.1. We present the results in Table 5.5.
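For illustration, here is a minimal sketch of how the pairwise similarities S_ij could be computed from the Euclidean distances. We assume the common RBF form S_ij = exp(−‖x_i − x_j‖² / σ²); the exact normalization is our assumption, since the text only specifies the Euclidean distance, and σ itself would be selected by 5-fold cross-validation as described above.

```python
import numpy as np

def similarity_matrix(X, sigma):
    """Pairwise similarities S_ij, assuming the RBF form
    S_ij = exp(-||x_i - x_j||^2 / sigma^2) over Euclidean distances.
    X: (n, d) array of samples; sigma: bandwidth (chosen here by
    5-fold cross-validation, as in the experiments)."""
    sq_norms = np.sum(X ** 2, axis=1)
    # Squared Euclidean distances via ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    np.maximum(sq_dists, 0.0, out=sq_dists)  # clip small negative round-off
    return np.exp(-sq_dists / sigma ** 2)

# Example usage: S = similarity_matrix(X, sigma=1.0)
```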

As can be seen, both on-line SemiBoost and on-line SERBoost are competitive SSL methods, and both are able to match the results of their off-line counterparts. As expected, SemiBoost has slight advantages on manifold-like data sets, while SERBoost performs

very well on cluster-based data sets. Interestingly, OSemiBoost is able to match the performance of OSERBoost on g241c, a cluster-based set, with l = 10, and OSERBoost is able to outperform OSemiBoost on Digit1, a manifold-based set, with l = 10.

5.5 Summary and Conclusion

In this chapter, we introduced a novel on-line semi-supervised boosting algorithm based on SemiBoost; the algorithm is thus called on-line SemiBoost. We further illustrated that one of the major drawbacks of current boosting algorithms, both off-line and on-line,

¹ http://www.kyb.tuebingen.mpg.de/ssl-book
