Real-time feature extraction from video stream data for stream ...

3. Machine Learning

Formally, the task of a cluster analysis algorithm can be specified as a multi-objective
optimization problem. On the one hand, the within-cluster distance $W(\zeta)$ must be minimized,
whilst on the other hand the between-cluster distance $B(\zeta)$ must be maximized.

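\[
W(\zeta) = \frac{1}{2} \sum_{i=1}^{k} \left( \sum_{\vec{x}^{(a)} \in X \,\mid\, \zeta(\vec{x}^{(a)}) = C_i} \left( \sum_{\vec{x}^{(b)} \in X \,\mid\, \zeta(\vec{x}^{(b)}) = C_i} d\left(\vec{x}^{(a)}, \vec{x}^{(b)}\right) \right) \right)
\]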

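\[
B(\zeta) = \frac{1}{2} \sum_{i=1}^{k} \left( \sum_{\vec{x}^{(a)} \in X \,\mid\, \zeta(\vec{x}^{(a)}) = C_i} \left( \sum_{\vec{x}^{(b)} \in X \,\mid\, \zeta(\vec{x}^{(b)}) \neq C_i} d\left(\vec{x}^{(a)}, \vec{x}^{(b)}\right) \right) \right)
\]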

Hence we can define the term clustering as follows:

Definition 14 (Cluster Analysis) Clustering, or Cluster Analysis, is the machine
learning task of inferring a mapping $\zeta : X \to C$ that maps unlabeled data
$X = \{\vec{x}^{(1)}, \dots, \vec{x}^{(N)}\}$ to $k$ clusters $C = \{C_1, C_2, \dots, C_k\}$ in such a way that each example
$\vec{x}^{(i)}$ is assigned to exactly one cluster, under the condition that $W(\zeta)$ is minimized
whilst $B(\zeta)$ is maximized.
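The two objectives can be evaluated directly for any given hard assignment. The following is a minimal sketch, not taken from the original text: the function name `within_between`, the representation of the mapping as a list of cluster indices, and the choice of Euclidean distance for d are all assumptions made for illustration.

```python
from math import dist  # Euclidean distance (Python 3.8+); an assumed choice for d


def within_between(points, labels, d=dist):
    """Return (W, B) for a hard cluster assignment.

    points: list of coordinate tuples (the data X).
    labels: labels[a] is the index of the cluster assigned to points[a],
            a hypothetical stand-in for the mapping zeta.
    """
    w_sum = 0.0
    b_sum = 0.0
    # Iterate over all ordered pairs (a, b), as in the double sums of W and B.
    for a, x_a in enumerate(points):
        for b, x_b in enumerate(points):
            if labels[a] == labels[b]:
                w_sum += d(x_a, x_b)  # both examples lie in the same cluster
            else:
                b_sum += d(x_a, x_b)  # the examples lie in different clusters
    # Every unordered pair was visited twice, hence the factor 1/2.
    return 0.5 * w_sum, 0.5 * b_sum


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (5, 0), (5, 1)]
    print(within_between(data, [0, 0, 1, 1]))  # left pair vs. right pair
    print(within_between(data, [0, 1, 0, 1]))  # bottom row vs. top row
```

Comparing the two assignments in the usage example shows the trade-off: grouping nearby points gives a smaller W and a larger B than grouping distant ones. Note that the sketch is quadratic in the number of examples, so it is only suitable for small data sets.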

There are various algorithms that perform cluster analysis, and they differ considerably. Some of them
allow objects to belong to more than one cluster at the same time (no strict partitioning);
others do not group the objects into concrete clusters but rather evaluate probabilities
for each pair of examples and clusters, stating how likely it is that the example belongs
to the cluster. As the topic is very broad, I do not take these approaches into account,
but focus on simple clustering algorithms like k-Means ([MacQueen, 1967]).

k-Means

k-Means is an algorithm for cluster analysis. It partitions the incoming unlabeled data
$X = \{\vec{x}^{(1)}, \dots, \vec{x}^{(N)}\}$ into $k$ clusters $C = \{C_1, C_2, \dots, C_k\}$. Each of the incoming examples
$\vec{x}^{(i)}$ is assigned to exactly one cluster $C_j$. The clusters are represented by their
centroids. A centroid is calculated by averaging the values of the corresponding features
over all points in the cluster. Usually the centroid is an imaginary point that is not
included in the data $X$.

The k-Means algorithm works iteratively. Initially, k centroids are chosen randomly. Then
each example is assigned to the cluster it is closest to, by calculating the distance of the
example to all centroids. Once all examples are assigned to a cluster, the centroids of
all clusters are updated. This procedure is repeated until none of the centroids
moves any longer. The resulting centroids are then assumed to be good centroids
for clustering the data.
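The iteration described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the implementation used in this work: the text does not say how the initial centroids are drawn, so the sketch samples them from the data. The `max_iter` safeguard, the handling of empty clusters, the Euclidean distance, and the names `k_means` and `centroid` are likewise assumptions.

```python
import random
from math import dist  # Euclidean distance (Python 3.8+); an assumed choice


def centroid(cluster):
    """Average each feature over all points of a non-empty cluster."""
    n = len(cluster)
    return tuple(sum(values) / n for values in zip(*cluster))


def k_means(points, k, seed=0, max_iter=100):
    """Return (centroids, labels) for a list of coordinate tuples."""
    rng = random.Random(seed)
    # Assumption: the initial centroids are k distinct examples drawn from the data.
    centroids = rng.sample(points, k)
    labels = []
    for _ in range(max_iter):  # assumed safeguard against very long runs
        # Assignment step: each example goes to the cluster of its closest centroid.
        clusters = [[] for _ in range(k)]
        labels = []
        for x in points:
            j = min(range(k), key=lambda c: dist(x, centroids[c]))
            clusters[j].append(x)
            labels.append(j)
        # Update step: recompute every centroid; an empty cluster keeps its old one.
        new_centroids = [
            centroid(members) if members else centroids[j]
            for j, members in enumerate(clusters)
        ]
        if new_centroids == centroids:  # no centroid moved, so stop
            break
        centroids = new_centroids
    return centroids, labels


if __name__ == "__main__":
    data = [(0, 0), (1, 0), (10, 0), (11, 0)]
    print(k_means(data, k=2))
```

The result depends on the random initialization, and the procedure may stop in a local optimum rather than the best possible partition, which is why the resulting centroids are only assumed to be good.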
