PhD Thesis Semi-Supervised Ensemble Methods for Computer Vision
8.3. MILForests
label. This makes MILForests different from most previous MIL algorithms, which only yield binary classifiers and have to handle a multi-class problem by a sequence of binary ones.

One obvious way to design RFs capable of solving MIL tasks is to adopt MIL versions of single decision trees [Blockeel et al., 2005]. However, strategies developed for common decision trees are hard to apply to RFs due to the random split nature of their trees. For example, improper regularization of the trees of a RF at the node level can decrease the diversity $\bar{\rho}$ among the trees and thus increase the overall generalization error (see Eq. (2.16)). Additionally, the method proposed in [Blockeel et al., 2005] is based on simple heuristics and needs a complicated inter-node communication channel. Thus, in order to perform multiple instance learning with random forests, one has to find an optimization strategy that preserves the diversity among the trees. In fact, this is a similar condition as for SSL with random forests. Hence, following this condition and the arguments stated in the previous Section 8.2, it makes sense to use a similar optimization strategy as for our semi-supervised random forests introduced in Chapter 6.
Therefore, we formulate multiple instance learning as an optimization procedure in which the labels of the instances become the optimization variables. The algorithm tries to uncover the true labels of the instances in an iterative manner. Given such labels, one can train a supervised classifier, which can then be used to classify both instances and bags.
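To make the alternating nature of this procedure concrete, the following is a minimal sketch of such a label-optimization loop. The class-prototype classifier, the function names, and the fixed iteration count are illustrative assumptions, not the actual MILForests training procedure (which trains a random forest):

```python
import numpy as np

class ProtoClassifier:
    """Tiny class-prototype stand-in for the forest (softmax over negative
    squared distances). Used only to illustrate the alternating scheme."""
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.mu_ = np.stack([X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def predict_proba(self, X):
        d = ((X[:, None, :] - self.mu_[None]) ** 2).sum(-1)
        e = np.exp(-d)
        return e / e.sum(axis=1, keepdims=True)

def mil_label_optimization(X, bag_ids, bag_labels, n_iters=5):
    y = bag_labels[bag_ids].copy()          # init: instances inherit their bag label
    for _ in range(n_iters):
        clf = ProtoClassifier().fit(X, y)
        probs = clf.predict_proba(X)
        y = probs.argmax(axis=1)            # re-estimate the hidden instance labels
        for i, yi in enumerate(bag_labels): # repair bags that lost their label:
            idx = np.flatnonzero(bag_ids == i)
            if not np.any(y[idx] == yi):    # at least one instance per bag must
                y[idx[probs[idx, yi].argmax()]] = yi  # carry the bag label
    return clf, y

# toy data: bag 0 is negative; bag 1 is positive but contains two
# negative-looking instances and one clearly positive one (x = 5.0)
X = np.array([[0.0], [0.2], [0.1], [0.1], [5.0], [0.3]])
bag_ids = np.array([0, 0, 0, 1, 1, 1])
bag_labels = np.array([0, 1])
_, y = mil_label_optimization(X, bag_ids, bag_labels)
```

On this toy data the two ambiguous instances of the positive bag flip to the negative class, while the clearly positive instance keeps the bag label, which is the intended behavior of the iterative label uncovering.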
Let $B_i$, $i = 1, \ldots, n$ denote the $i$-th bag in the training set with label $y_i$. Each bag consists of $n_i$ instances $\{x_i^1, \ldots, x_i^{n_i}\}$. We write the objective function to optimize as

$$(\{y_i^j\}^*, F^*) = \arg\min_{\{y_i^j\},\, F(\cdot)} \sum_{i=1}^{n} \sum_{j=1}^{n_i} l\big(F_{y_i^j}(x_i^j)\big) \qquad (8.1)$$

$$\text{s.t.} \quad \forall i: \; \sum_{j=1}^{n_i} I\Big(y_i = \arg\max_{k \in \mathcal{Y}} F_k(x_i^j)\Big) \geq 1.$$
The objective in this optimization procedure is to minimize a loss function $l(\cdot)$ that is defined over the entire set of instances, subject to the condition that at least one instance in each bag has to be from the target class. Note that $I(\cdot)$ is an indicator function and $F_k(x)$ is the confidence of the classifier for the $k$-th class, i.e., $F_k(x) = p(k|x) - \frac{1}{K}$.
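As a small illustration (all names are hypothetical), the constraint of Eq. (8.1) can be checked for a set of instance posteriors as follows; note that subtracting the constant $1/K$ leaves the $\arg\max$ unchanged:

```python
import numpy as np

def bag_constraint_satisfied(probs, bag_ids, bag_labels):
    """Evaluate the constraint of Eq. (8.1): each bag must contain at least
    one instance whose predicted class equals the bag label.

    probs:      (n_instances, K) posteriors p(k|x) per instance
    bag_ids:    (n_instances,) bag index of each instance
    bag_labels: (n_bags,) bag labels y_i
    """
    K = probs.shape[1]
    F = probs - 1.0 / K                 # F_k(x) = p(k|x) - 1/K; shifting by a
    pred = F.argmax(axis=1)             # constant does not change the arg max
    return np.array([np.any(pred[bag_ids == i] == y)   # sum of indicators >= 1
                     for i, y in enumerate(bag_labels)])

# two bags with K = 2: bag 0 contains a positively classified instance,
# bag 1 does not, so only bag 0 satisfies the constraint
probs = np.array([[0.9, 0.1], [0.3, 0.7], [0.6, 0.4], [0.7, 0.3]])
bag_ids = np.array([0, 0, 1, 1])
bag_labels = np.array([1, 1])
ok = bag_constraint_satisfied(probs, bag_ids, bag_labels)
print(ok)  # [ True False]
```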
Often, the loss function depends on the classification margin of an instance. In the case of random forests, the margin can be written as [Breiman, 2001]

$$m(x, y) = p(y|x) - \max_{\substack{k \in \mathcal{Y} \\ k \neq y}} p(k|x) = F_y(x) - \max_{\substack{k \in \mathcal{Y} \\ k \neq y}} F_k(x). \qquad (8.2)$$
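A minimal sketch of computing the margin of Eq. (8.2) from the instance posteriors (the function name and inputs are assumptions):

```python
import numpy as np

def rf_margin(probs, y):
    """Margin of Eq. (8.2): m(x, y) = p(y|x) - max_{k != y} p(k|x).
    `probs` holds the forest's posteriors p(k|x) for one instance x."""
    probs = np.asarray(probs, dtype=float)
    others = probs.copy()
    others[y] = -np.inf                 # exclude the true class from the max
    return probs[y] - others.max()

m_correct = rf_margin([0.6, 0.3, 0.1], y=0)   # positive: x classified correctly
m_wrong = rf_margin([0.2, 0.5, 0.3], y=0)     # negative: x is misclassified
```

Since every $F_k(x)$ differs from $p(k|x)$ only by the constant $\frac{1}{K}$, computing the margin from the posteriors or from the confidences $F_k$ gives the same value.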
Note that for a correct classification $m(x, y) > 0$ should hold. Overall, it can easily be seen that Eq. (8.1) is a non-convex optimization problem because a random forest has