
The Mean Shift algorithm proposed by Comaniciu and Meer [4] is a non-parametric kernel-based density estimation technique. In general, the Mean Shift concept enables clustering of a set of data points into C different clusters without prior knowledge of the number of clusters. It is based on the non-parametric density estimate at the d-dimensional feature vector \vec{x} in the feature space, which can be obtained with a kernel K(\vec{x}) and a window size h by

f(\vec{x}) = \frac{1}{n h^d} \sum_{i=1}^{n} K\left( \left\| \frac{\vec{x} - \vec{x}_i}{h} \right\|^2 \right), \quad (2)

where \vec{x}_i are the n feature vectors of the data set. Calculating the gradient of the density estimator shows that the so-called Mean Shift vector defined by

m(\vec{x}) = \frac{\sum_{i=1}^{n} \vec{x}_i \, K(\vec{x} - \vec{x}_i)}{\sum_{i=1}^{n} K(\vec{x} - \vec{x}_i)} - \vec{x}, \quad (3)

points in the direction of the maximum increase in the density. The main part of the algorithm is to move the window iteratively by

\vec{x}_{t+1} = \vec{x}_t + m(\vec{x}_t). \quad (4)

The shift is guaranteed to converge to a point where the gradient of the underlying density function is zero. These points are the detected cluster centers of the distribution and represent an estimated mean value for each cluster.

We are not interested in the Mean Shift clusters themselves. We run the Mean Shift procedure only to find the stationary points of the density estimate, which are the modes of the distribution. These modes serve as initialization of the EM algorithm, which finds the maximum likelihood parameters of the GMM.
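To make the procedure concrete, the following is a minimal numpy sketch of the mode seeking defined by Equations (2)-(4). It assumes a Gaussian kernel; the function name `mean_shift_modes`, the convergence tolerance, and the mode-merging threshold are illustrative choices, not the paper's exact procedure.

```python
import numpy as np

def mean_shift_modes(X, h, n_iter=100, tol=1e-5):
    """Seek the density modes of the samples X (n x d) with the
    Mean Shift iteration of Eqs. (2)-(4)."""
    modes = X.copy().astype(np.float64)
    for j in range(len(modes)):
        x = modes[j]
        for _ in range(n_iter):
            # Gaussian kernel weights K(||(x - x_i) / h||^2), cf. Eq. (2)
            w = np.exp(-0.5 * np.sum(((x - X) / h) ** 2, axis=1))
            # Mean Shift vector: weighted mean minus current point, Eq. (3)
            m = (w[:, None] * X).sum(axis=0) / w.sum() - x
            x = x + m                        # window update, Eq. (4)
            if np.linalg.norm(m) < tol:      # gradient ~ zero: a mode
                break
        modes[j] = x
    # merge trajectories that converged to (numerically) the same mode
    centers = []
    for x in modes:
        if not any(np.linalg.norm(x - c) < h for c in centers):
            centers.append(x)
    return np.array(centers)
```

The number of distinct modes found this way yields C, and the modes themselves can be passed to EM as the initial component means.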

2.2. Definition of ordering relationship<br />

The next step is to order the pixels of the input color image. Therefore, for every pixel a unique distance value β is calculated between a single Gaussian distribution fitted to the Luv values within an x × y window around the pixel and the Gaussian Mixture Model (GMM) of the ROI, as obtained by the step described in Section 2.1. While the single Gaussian distributions have to be recalculated for all of the windows located on every pixel, the GMM of the ROI stays the same during the entire computation.

The comparison between the GMM and the single Gaussian distributions is based on calculating Bhattacharyya distances [1]. The Bhattacharyya distance β compares two d-dimensional Gaussian distributions N_1 = \{\vec{\mu}_1, \vec{\Sigma}_1\} and N_2 = \{\vec{\mu}_2, \vec{\Sigma}_2\} by

\beta(N_1, N_2) = \frac{1}{8} (\vec{\mu}_2 - \vec{\mu}_1)^t \left[ \frac{\vec{\Sigma}_1 + \vec{\Sigma}_2}{2} \right]^{-1} (\vec{\mu}_2 - \vec{\mu}_1) + \frac{1}{2} \ln \frac{\left| \frac{\vec{\Sigma}_1 + \vec{\Sigma}_2}{2} \right|}{\sqrt{|\vec{\Sigma}_1| \, |\vec{\Sigma}_2|}}. \quad (5)

The Bhattacharyya distance allows a quantitative statement about the discriminability of two Gaussian distributions under the Bayes decision rule. Therefore, for our analysis in the three-dimensional CIE Luv feature space, the Bhattacharyya distance quantifies whether two color distributions are likely to be equal or not.
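Equation (5) translates directly into a few lines of numpy; the following is a sketch, with `bhattacharyya` as an illustrative name:

```python
import numpy as np

def bhattacharyya(mu1, cov1, mu2, cov2):
    """Bhattacharyya distance between two d-dimensional Gaussians, Eq. (5)."""
    cov = 0.5 * (cov1 + cov2)          # averaged covariance (Sigma_1 + Sigma_2) / 2
    diff = mu2 - mu1
    mean_term = 0.125 * diff @ np.linalg.inv(cov) @ diff
    cov_term = 0.5 * np.log(np.linalg.det(cov)
                            / np.sqrt(np.linalg.det(cov1) * np.linalg.det(cov2)))
    return mean_term + cov_term
```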

To be able to compare a single Gaussian distribution to a GMM, C different Bhattacharyya distance values to every component of the GMM have to be calculated. Then, the final distance value β is computed by

\beta = \sum_{c=1}^{C} \omega_c \, \beta(N_c, N_w), \quad (6)

where \omega_c is the c-th GMM weight, N_c = \{\vec{\mu}_c, \vec{\Sigma}_c\} denotes the c-th component of the GMM, and N_w = \{\vec{\mu}_w, \vec{\Sigma}_w\} is the single Gaussian fitted to the window pixels.
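Building on the `bhattacharyya` helper sketched above, Equation (6) then reduces to a weighted sum over the GMM components (again, all names are illustrative):

```python
def gmm_distance(weights, mus, covs, mu_w, cov_w):
    """Weighted sum of component-wise Bhattacharyya distances, Eq. (6)."""
    return sum(w_c * bhattacharyya(mu_c, cov_c, mu_w, cov_w)
               for w_c, mu_c, cov_c in zip(weights, mus, covs))
```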

To be able to calculate a Bhattacharyya distance for every pixel, the window has to be moved over the entire image. For every window location, the mean and the covariance matrix of the corresponding Luv values within the window have to be computed, which is a time-consuming process. Therefore, we introduce an adaptation of the Summed-Area-Table (SAT) approach to efficiently calculate the Bhattacharyya distances. The SAT idea was originally proposed for texture mapping and brought back to the computer vision community by Viola and Jones [20] as the integral image.

In general, the integral image Int(r, c) is defined for a grayscale input image I(x, y) by

Int(r, c) = \sum_{x \le r, \, y \le c} I(x, y), \quad (7)

i.e., as the sum of all pixel values inside the rectangle spanned by the upper left image corner and (r, c).
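As a sketch of Equation (7): a Summed-Area-Table can be built with two cumulative sums, after which the sum over any rectangle costs four table lookups. The helper names are illustrative:

```python
import numpy as np

def integral_image(I):
    """Summed-Area-Table Int(r, c) of a grayscale image I, Eq. (7)."""
    return I.astype(np.float64).cumsum(axis=0).cumsum(axis=1)

def window_sum(Int, r0, c0, r1, c1):
    """Sum of I over the inclusive rectangle [r0..r1] x [c0..c1]
    from four table lookups."""
    s = Int[r1, c1]
    if r0 > 0:
        s -= Int[r0 - 1, c1]
    if c0 > 0:
        s -= Int[r1, c0 - 1]
    if r0 > 0 and c0 > 0:
        s += Int[r0 - 1, c0 - 1]
    return s
```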

Tuzel et al. [18] proposed an efficient method to calculate covariances based on the integral image concept. They build d integral images P_i for the sum of the values of each dimension and d(d + 1)/2 images Q_{ij} for each product between the values of any two dimensions, where d is the number of dimensions of the feature space.

Thus, the integral images P_i, with i = 1 \ldots d, are defined by

P_i(r, c) = \sum_{x \le r, \, y \le c} I_i(x, y), \quad (8)
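A compact numpy sketch of this idea, assuming a feature image F of shape (H, W, d): for brevity it stores the full symmetric d × d set of product tables Q_{ij} rather than only the d(d + 1)/2 distinct ones, and the window-covariance formula it uses is the standard one from Tuzel et al.; all helper names are illustrative.

```python
import numpy as np

def covariance_tables(F):
    """Integral images for a feature image F of shape (H, W, d):
    P_i per channel, Eq. (8), and Q_ij per channel product."""
    F = F.astype(np.float64)
    P = F.cumsum(axis=0).cumsum(axis=1)           # d sum tables
    Q = F[:, :, :, None] * F[:, :, None, :]       # channel products
    Q = Q.cumsum(axis=0).cumsum(axis=1)           # d x d product tables
    return P, Q

def window_mean_cov(P, Q, r0, c0, r1, c1):
    """Mean and covariance of the features in the inclusive window
    [r0..r1] x [c0..c1], in constant time per window."""
    def rect(T):                                  # four-lookup rectangle sum
        s = T[r1, c1].copy()
        if r0 > 0:
            s = s - T[r0 - 1, c1]
        if c0 > 0:
            s = s - T[r1, c0 - 1]
        if r0 > 0 and c0 > 0:
            s = s + T[r0 - 1, c0 - 1]
        return s
    n = (r1 - r0 + 1) * (c1 - c0 + 1)
    p, q = rect(P), rect(Q)
    mu = p / n
    cov = (q - np.outer(p, p) / n) / (n - 1)      # unbiased window covariance
    return mu, cov
```

The mean and covariance recovered per window are exactly the \vec{\mu}_w and \vec{\Sigma}_w that enter the Bhattacharyya comparison of Equation (6), so every pixel's β value can be computed in constant time after the tables are built once.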
