
Therefore, at each step of boosting, we compute the pseudo-labels and weights of the unlabeled samples and use them to find the best weak learner and its corresponding weight, in a manner similar to the AdaBoost algorithm. Note that if no unlabeled data is used, the algorithm reduces to standard boosting. The detailed steps of the algorithm are summarized in Algorithm 4.1.

Algorithm 4.1 SemiBoost
Require: labeled training data $(x, y) \in \mathcal{X}_L$ and unlabeled data $x' \in \mathcal{X}_U$
Require: similarity measure $s(x, x')$
Require: weak learners $f_i$
Require: weight parameters $\lambda_u$, $\lambda_l$
Require: maximum number of iterations $T$
1: for $t = 1, 2, \ldots, T$ do
2:   Compute $p_i$ and $q_i$ for every given sample
3:   $\hat{y}_x = \operatorname{sign}(p_x - q_x)$
4:   $w_x = |p_x - q_x|$
5:   Train weak classifier $f_t(x)$
6:   Compute $\alpha_t$ using Equation (4.13)
7:   $F(x) \leftarrow F(x) + \alpha_t f_t(x)$
8: end for
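To make the loop concrete, the following is a minimal Python sketch of Algorithm 4.1, not the implementation used in this thesis: the confidences $p_i$, $q_i$ and the weight $\alpha_t$ follow the closed forms of the original SemiBoost formulation of Mallapragada et al. (Equation (4.13) is assumed to be the corresponding $\alpha_t$ expression), only the unlabeled weight $\lambda_u$ appears in this simplified form, the weak learner is a scikit-learn decision stump, and names such as `semiboost_fit` are illustrative.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def semiboost_fit(X_l, y_l, X_u, S, T=20, lam_u=1.0, eps=1e-12):
    """Sketch of Algorithm 4.1; y_l must contain labels in {-1, +1}.
    S is an (n_u, n_l + n_u) similarity matrix, labeled columns first."""
    n_l, n_u = len(X_l), len(X_u)
    S_l, S_u = S[:, :n_l], S[:, n_l:]
    F_u = np.zeros(n_u)            # ensemble response F(x) on the unlabeled set
    ensemble = []

    for t in range(T):
        # Step 2: confidences p_i, q_i for each unlabeled sample (SemiBoost
        # form: agreement with positive/negative labeled neighbours plus a
        # smoothness term over the other unlabeled neighbours).
        p = np.exp(-2 * F_u) * (S_l @ (y_l == 1)) \
            + 0.5 * lam_u * (S_u * np.exp(F_u[None, :] - F_u[:, None])).sum(axis=1)
        q = np.exp(2 * F_u) * (S_l @ (y_l == -1)) \
            + 0.5 * lam_u * (S_u * np.exp(F_u[:, None] - F_u[None, :])).sum(axis=1)

        # Steps 3-4: pseudo-labels and per-sample weights.
        y_hat = np.where(p >= q, 1.0, -1.0)
        w = np.abs(p - q)

        # Step 5: weak classifier on labeled plus pseudo-labeled data.
        X_t = np.vstack([X_l, X_u])
        y_t = np.concatenate([y_l, y_hat])
        w_t = np.concatenate([np.ones(n_l), w])
        f_t = DecisionTreeClassifier(max_depth=1).fit(X_t, y_t, sample_weight=w_t)
        h_u = f_t.predict(X_u)

        # Step 6: ensemble weight alpha_t (the role Equation (4.13) plays in
        # the text; the SemiBoost paper's closed form is used here).
        num = p[h_u == 1].sum() + q[h_u == -1].sum()
        den = p[h_u == -1].sum() + q[h_u == 1].sum()
        alpha = 0.25 * np.log((num + eps) / (den + eps))

        # Step 7: F(x) <- F(x) + alpha_t * f_t(x).
        F_u += alpha * h_u
        ensemble.append((alpha, f_t))
    return ensemble

def semiboost_predict(ensemble, X):
    F = sum(alpha * f.predict(X) for alpha, f in ensemble)
    return np.sign(F)
```

At each round the pseudo-labels and weights are recomputed from the current ensemble response and the similarity graph, so unlabeled samples on which the graph and the ensemble agree most confidently exert the largest influence on the next weak learner.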

4.2.2 Learning Visual Similarities

Being a manifold-based approach, SemiBoost has the power to exploit both labeled and unlabeled samples if a similarity measure $s(x, x')$ is given. The similarity can be obtained from a distance measure $d(x, x')$, e.g., by using a radial basis function

$$s(x, x') = e^{-\frac{d(x, x')^2}{\sigma^2}}, \qquad (4.14)$$

where $\sigma^2$ is the scale parameter [Zhu et al., 2003]. The crucial point is how to measure the distance $d(x, x')$ between points. Popular measurements are, for example, the Euclidean distance $\|x - x'\|$ or the Chi-Square distance $\frac{(x - x')^2}{2(x + x')}$.
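For concreteness, here is a small sketch of how Equation (4.14) could be evaluated for either distance; the function names and the `eps` guard are illustrative additions, and the Chi-Square distance is written with its usual sum over histogram bins.

```python
import numpy as np

def rbf_similarity(d, sigma2):
    """Equation (4.14): map a distance d(x, x') to a similarity in (0, 1]."""
    return np.exp(-d ** 2 / sigma2)

def euclidean(x, xp):
    return np.linalg.norm(x - xp)

def chi_square(x, xp, eps=1e-12):
    """Chi-Square distance, summed over the bins of two nonnegative histograms."""
    return np.sum((x - xp) ** 2 / (2 * (x + xp) + eps))

# Example: similarity between two feature histograms under either distance.
x, xp = np.array([0.2, 0.5, 0.3]), np.array([0.1, 0.6, 0.3])
s_euc = rbf_similarity(euclidean(x, xp), sigma2=1.0)
s_chi = rbf_similarity(chi_square(x, xp), sigma2=1.0)
```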

As already mentioned above, a more powerful and flexible approach is to learn the distance function from labeled data, which is also known as metric learning. The advantage of discriminative learning of distance functions is that the metric can much better support task-specific classification. We use boosting to learn pair-wise distance functions,

similar to the method proposed by Hertz et al. [Hertz et al., 2004]. A distance function maps pairs of data points to positive real numbers, usually (but not necessarily) satisfying the formal requirements of a metric.
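The sketch below is a much-simplified stand-in for this idea rather than Hertz et al.'s actual method: it trains a standard AdaBoost classifier on a symmetric pair encoding $|x - x'|$ (an assumed encoding) and turns the predicted "same-class" probability into a distance; all names are illustrative.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

def make_pairs(X, y):
    """Product space: encode each pair of points as |x_i - x_j|,
    labeled +1 if the two points share a class and -1 otherwise."""
    pairs, labels = [], []
    for i in range(len(X)):
        for j in range(i + 1, len(X)):
            pairs.append(np.abs(X[i] - X[j]))
            labels.append(1 if y[i] == y[j] else -1)
    return np.array(pairs), np.array(labels)

def learn_distance(X_l, y_l, n_estimators=50):
    """Boost a same/different classifier on labeled pairs and use its
    'same-class' probability as an inverse distance."""
    P, z = make_pairs(X_l, y_l)
    clf = AdaBoostClassifier(n_estimators=n_estimators).fit(P, z)

    def d(x, xp):
        # Column 1 of predict_proba is the +1 ("same class") probability,
        # since the classes are sorted as [-1, +1].
        p_same = clf.predict_proba(np.abs(x - xp)[None, :])[0, 1]
        return 1.0 - p_same
    return d
```

A distance learned this way can then be plugged into Equation (4.14) to obtain the similarity $s(x, x')$ that SemiBoost consumes.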
