Chapter 8. Multiple Instance Learning with Random Forests

to be trained and simultaneously a suitable set of labels $y_i^j$ has to be found. Because the labels $y_i^j$ take integer values, this problem is a type of integer program and is usually difficult to solve. In order to solve this non-convex optimization problem without losing too much of the training speed of random forests, we need a fast optimization procedure. As we have seen in the previous chapter for SSL, deterministic annealing (DA) is an ideal candidate for such a task, and we will hence also use it here to solve our MIL-constrained learning task.
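To get a sense of the combinatorial size of this integer program, assume (for illustration) the standard MIL constraint that every positive bag must contain at least one positive instance while all instances in negative bags are negative. A positive bag with $n_i$ instances then admits $2^{n_i} - 1$ valid labelings, so the number of joint labelings over all positive bags is

\[
\prod_{i \,:\, \text{bag } i \text{ positive}} \big(2^{n_i} - 1\big),
\]

which grows exponentially with the bag sizes and makes exhaustive search infeasible.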

8.3.1 Optimization

In order to optimize our MIL objective function (Eq. (8.1)), we propose the following iterative strategy: In the first iteration, we train a naïve RF that ignores the MIL constraint and assigns each instance the label of its bag. After this first iteration, we treat the instance labels in positive bags as random binary variables. These random variables are defined over a space of probability distributions $\mathcal{P}$. We then search for a distribution $\hat{p} \in \mathcal{P}$ for each bag that solves our optimization problem in Eq. (8.1). Based on $\hat{p}$, each tree randomly selects the instance labels it uses for training. Hence, by optimizing $\hat{p}$ we try to identify the real but hidden labels of the instances.
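The following sketch illustrates this per-tree label sampling step. The helper name train_forest_from_p_hat, the use of scikit-learn decision trees as base learners, and the encoding of $\hat{p}$ as a per-instance probability of the positive class are illustrative assumptions, not the thesis implementation.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def train_forest_from_p_hat(X, bag_ids, bag_labels, p_hat, n_trees=50, seed=0):
    """Train one forest in which every tree samples its own instance labels.

    p_hat : (n_inst,) array, current probability that instance i is positive.
    Instances in negative bags always keep label 0, since the label
    ambiguity only concerns instances inside positive bags.
    """
    rng = np.random.default_rng(seed)
    in_pos_bag = np.array([bag_labels[b] == 1 for b in bag_ids])
    forest = []
    for t in range(n_trees):
        y_t = np.zeros(len(X), dtype=int)
        # Each tree draws its own binary labels from p_hat (Bernoulli sampling).
        y_t[in_pos_bag] = rng.random(in_pos_bag.sum()) < p_hat[in_pos_bag]
        tree = DecisionTreeClassifier(max_features="sqrt", random_state=seed + t)
        forest.append(tree.fit(X, y_t))
    return forest
```

Setting p_hat to 1 for all instances of positive bags reproduces the naïve first iteration, in which each instance simply inherits its bag label.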

We reformulate the objective function given in Eq. (8.1) so that it is suitable for DA optimization:

\[
L_{\mathrm{DA}}(F, \hat{p}) \;=\; \sum_{i=1}^{n} \sum_{j=1}^{n_i} \sum_{k=1}^{K} \hat{p}(k \mid x_i^j)\, \ell\big(F_k(x_i^j)\big) \;+\; T \sum_{i=1}^{n} H(\hat{p}_i), \qquad (8.3)
\]

where $T$ is the temperature parameter and

\[
H(\hat{p}_i) \;=\; -\sum_{j=1}^{n_i} \sum_{k=1}^{K} \hat{p}(k \mid x_i^j) \log \hat{p}(k \mid x_i^j) \qquad (8.4)
\]

is the entropy of the predicted label distribution inside a bag. The parameter $T$ controls the trade-off between the original objective function and the entropy term. If $T$ is high, the entropy dominates the loss function and the problem becomes easier to solve due to its convexity. If $T = 0$, the original loss of Eq. (8.1) dominates. Hence, DA first solves the easy, entropy-dominated problem and then, by continuously decreasing $T$ from high values to zero, gradually solves the original optimization problem, i.e., it finds the real but hidden instance labels $y$ while simultaneously training an instance-level classifier.
