PhD Thesis: Semi-Supervised Ensemble Methods for Computer Vision

Chapter 5. On-line Semi-Supervised Boosting

as an outlier and its weight is thus decreased. LogitBoost can be considered as lying between these two extremes: if a sample is misclassified with high confidence, the logit loss neither increases nor decreases the weight but keeps it constant at one.

Figure 5.4: Weight update functions for different loss functions.
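As an illustration of these behaviors, the sketch below evaluates the weight-update functions w(m) = −ℓ′(m) induced by the four losses as a function of the margin m = yF(x). The exact scalings of the losses are assumptions based on their common definitions in the literature; the thesis presents the curves only graphically.

```python
import numpy as np

# Weight-update functions w(m) = -dl/dm for common boosting losses,
# as functions of the margin m = y * F(x). The exact loss scalings
# below are assumptions; the text only shows the curves (Figure 5.4).

def weight_exponential(m):
    # Exponential loss l(m) = exp(-m): the weight grows without bound
    # for badly misclassified samples (m -> -inf), as in AdaBoost/RealBoost.
    return np.exp(-m)

def weight_logit(m):
    # Logit loss l(m) = log(1 + exp(-m)): the weight saturates at one
    # for badly misclassified samples, i.e., it stays constant.
    return 1.0 / (1.0 + np.exp(m))

def weight_doom2(m):
    # Doom II loss l(m) = 1 - tanh(m): the weight peaks around m = 0
    # and decays towards 0 for large |m|.
    return 1.0 / np.cosh(m) ** 2

def weight_savage(m):
    # Savage loss l(m) = 1 / (1 + exp(2m))**2: non-convex, so the weight
    # of a confidently misclassified sample decays towards 0 (outlier).
    return 4.0 * np.exp(2.0 * m) / (1.0 + np.exp(2.0 * m)) ** 3

margins = np.linspace(-4.0, 4.0, 9)
for w in (weight_exponential, weight_logit, weight_doom2, weight_savage):
    print(w.__name__, np.round(w(margins), 3))
```

Running this shows the three regimes described above: the exponential weight explodes for negative margins, the logit weight saturates at one, and the Doom II and Savage weights vanish for strongly misclassified samples.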

5.2.3 Comparative Study

In this section, we conduct a comparative study of the proposed on-line GradientBoost on machine learning data in order to examine the influence of the different loss functions. To this end, we chose 8 benchmark datasets from the UCI and LIBSVM repositories, which are listed in Table 6.1. We compare the performance of on-line AdaBoost and on-line GradientBoost using the exponential, logit, Doom II, and Savage losses. Note that with the exponential loss we obtain an on-line formulation of the RealBoost algorithm of Friedman et al. [Friedman et al., 2000]. For these experiments, we randomly introduce label noise into the training set and train the on-line classifiers for 5 epochs. We repeat each experiment 3 times and report the average test errors.
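A minimal sketch of this evaluation protocol is given below. The classifier interface (partial_fit / predict), the dataset handling, and the noise level are hypothetical placeholders, since the text does not specify them.

```python
import numpy as np

# Sketch of the evaluation protocol described above. The classifier
# interface (partial_fit / predict) and the noise level are assumed,
# not taken from the thesis.

def inject_label_noise(y, noise_rate, rng):
    """Flip the labels (+1/-1) of a random fraction of the training set."""
    y_noisy = y.copy()
    flip = rng.random(len(y)) < noise_rate
    y_noisy[flip] = -y_noisy[flip]
    return y_noisy

def run_trial(clf, X_train, y_train, X_test, y_test,
              noise_rate=0.1, n_epochs=5, seed=0):
    rng = np.random.default_rng(seed)
    y_noisy = inject_label_noise(y_train, noise_rate, rng)
    for _ in range(n_epochs):                  # 5 epochs, as in the text
        order = rng.permutation(len(X_train))  # on-line: one sample at a time
        for i in order:
            clf.partial_fit(X_train[i], y_noisy[i])
    return np.mean(clf.predict(X_test) != y_test)  # test error

# Average over 3 independent repetitions, as in the text:
# errors = [run_trial(make_classifier(), ..., seed=s) for s in range(3)]
# print(np.mean(errors))
```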

We also repeat these experiments for two different kinds of weak learners: (i) decision stumps, which assume the feature responses to be Gaussian distributed, where the means µ+, µ− and the standard deviations σ+, σ− are estimated with the help of a Kalman filter [Grabner and Bischof, 2006], and (ii) fixed-binned on-line histograms. Some variants of on-line GradientBoost, e.g., RealBoost, need confidence-rated weak learners.
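The following sketch illustrates such a confidence-rated Gaussian decision stump with on-line parameter updates. A simple exponential moving average is used here as a stand-in for the Kalman-filter update of [Grabner and Bischof, 2006]; the learning rate, the initialization, and the tanh squashing of the output are assumptions.

```python
import numpy as np

# Minimal sketch of a confidence-rated Gaussian decision stump with
# on-line parameter updates. An exponential moving average stands in
# for the Kalman filter of [Grabner and Bischof, 2006]; learning rate
# and initial values are assumptions.

class OnlineGaussianStump:
    def __init__(self, lr=0.05):
        self.lr = lr
        self.mu = {+1: 0.0, -1: 0.0}    # per-class mean of the feature response
        self.var = {+1: 1.0, -1: 1.0}   # per-class variance

    def partial_fit(self, f, y):
        # Update the Gaussian of class y (+1 or -1) with feature response f.
        d = f - self.mu[y]
        self.mu[y] += self.lr * d
        self.var[y] += self.lr * (d * d - self.var[y])

    def confidence(self, f):
        # Confidence-rated output in [-1, 1]: half the log-likelihood
        # ratio of the two class Gaussians, squashed by tanh.
        def log_gauss(x, mu, var):
            return -0.5 * (np.log(2 * np.pi * var) + (x - mu) ** 2 / var)
        llr = (log_gauss(f, self.mu[+1], self.var[+1])
               - log_gauss(f, self.mu[-1], self.var[-1]))
        return np.tanh(0.5 * llr)
```

The signed confidence in [−1, 1] is what confidence-rated variants such as RealBoost consume; a plain decision stump would instead return only the sign.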
