K-means clustering algorithm

How to set k in k-means clustering 

For K=1, 2, 3, …, run the k-means clustering algorithm. 

After the k-means algorithm has converged, we have cluster assignments for each sample as well as the locations of the cluster centers.
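The loop over K can be sketched as follows. This is a minimal illustration of Lloyd's algorithm in plain Python, not a reference implementation: the function names `kmeans` and `squared_distance`, the random initialization from the samples, the `max_iter` and `seed` parameters, and the toy data are all assumptions made for the example.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def kmeans(samples, k, max_iter=100, seed=0):
    """Lloyd's algorithm: returns (centers, labels) once assignments stop changing."""
    rng = random.Random(seed)
    # Assumption: initialize the centers with k distinct samples chosen at random.
    centers = [list(s) for s in rng.sample(samples, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each sample goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(x, centers[j]))
            for x in samples
        ]
        if new_labels == labels:
            break  # converged: no assignment changed
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [x for x, lab in zip(samples, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels

# Toy 2-D data (made up for illustration): two well-separated groups.
samples = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
           (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]

# Run k-means for K = 1, 2, 3 and keep each converged result.
results = {k: kmeans(samples, k) for k in (1, 2, 3)}
```

Each entry of `results` holds the cluster centers and the per-sample cluster assignments for one value of K, which is all the later steps need.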

Compute the distortion of the clustering result,

$$d_K = \frac{1}{N} \sum_{j=1}^{K} \sum_{x^{(i)} \in \text{Cluster } j} \left\| x^{(i)} - m_j \right\|^2,$$

as the mean squared distance of a sample from its corresponding cluster center. Here $N$ is the number of samples and $m_j$ is the center of cluster $j$.

$d_K$ decreases as $K$ increases.

Let the transformed distortion be $d_K^{-p/2}$, where $p$ is the dimension of the data samples.

The jump value of transformed distortion is

$$J_K = d_K^{-p/2} - d_{K-1}^{-p/2}.$$

(Assume $d_0^{-p/2} = 0$ when computing $J_1$.)
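The distortion and jump computations can be sketched in a few lines of plain Python. The function names `distortion` and `jump_values` and the four-point data set are hypothetical, chosen only so the arithmetic is easy to follow by hand; the sketch also assumes every distortion is strictly positive, since a zero distortion cannot be raised to a negative power.

```python
def distortion(samples, centers, labels):
    """d_K: mean squared distance of each sample from its assigned cluster center."""
    total = 0.0
    for x, j in zip(samples, labels):
        total += sum((xi - mi) ** 2 for xi, mi in zip(x, centers[j]))
    return total / len(samples)

def jump_values(distortions, p):
    """J_K for K = 1..len(distortions), given d_1, d_2, ... and data dimension p.

    Assumes every distortion is > 0.
    """
    transformed = [d ** (-p / 2) for d in distortions]
    # The transformed distortion for K = 0 is taken to be 0 when computing J_1.
    previous = [0.0] + transformed[:-1]
    return [t - t_prev for t, t_prev in zip(transformed, previous)]

# Made-up 2-D samples (p = 2) with hand-picked, already converged clusterings.
samples = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0), (12.0, 0.0)]

# K = 1: a single center at the overall mean.
d1 = distortion(samples, [(6.0, 0.0)], [0, 0, 0, 0])
# K = 2: one center per pair of neighbouring points.
d2 = distortion(samples, [(1.0, 0.0), (11.0, 0.0)], [0, 0, 1, 1])

jumps = jump_values([d1, d2], p=2)
```

With these numbers, d_1 = (36 + 16 + 16 + 36) / 4 = 26 and d_2 = 1, so with p = 2 the transformed distortions are 1/26 and 1, and the jump from K = 1 to K = 2 is much larger than the jump from K = 0 to K = 1.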

The peak of the jump values corresponds to the K that provides the best description of the original samples.
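Picking the peak amounts to an argmax over the jump values. The helper name `best_k` and the numbers below are hypothetical and purely illustrative; they are not results from any real data set.

```python
def best_k(jumps):
    """Return the K (1-based) whose jump value J_K is largest.

    `jumps` is the list [J_1, J_2, ...]; ties go to the smallest K.
    """
    return max(range(len(jumps)), key=lambda i: jumps[i]) + 1

# Illustrative, made-up jump values J_1..J_5.
example_jumps = [0.02, 0.05, 0.40, 0.08, 0.03]
chosen_k = best_k(example_jumps)  # the peak is at J_3, so K = 3
```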
