26.09.2015 Views

K-means clustering algorithm

K-means clustering algorithm - ISCAS 2007


How to set k in k-means clustering

For K = 1, 2, 3, …, run the k-means clustering algorithm.

After the k-means algorithm has converged, we have cluster assignments for each sample as well as the locations of the cluster centers.

Let the distortion of the clustering result be $d_K$. Compute $d_K$ as the mean squared distance of a sample from its corresponding cluster center:

$$d_K = \frac{1}{N} \sum_{j=1}^{K} \sum_{x^{(i)} \in \text{Cluster } j} \left\| x^{(i)} - m_j \right\|^2$$

where $N$ is the number of samples and $m_j$ is the center of cluster $j$.

$d_K$ decreases as K increases.

Let the transformed distortion be $d_K^{-p/2}$, where $p$ is the dimension of the data samples.
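As a rough illustration of the distortion and its transformation, the sketch below assumes the samples, the cluster assignments, and the cluster centers are already available as NumPy arrays. The function names `distortion` and `transformed_distortion` are hypothetical and chosen only for this example.

```python
import numpy as np

def distortion(X, labels, centers):
    """Mean squared distance of each sample from its assigned cluster center.

    X       : (N, p) array of samples
    labels  : (N,) array, labels[i] is the cluster index of sample i
    centers : (K, p) array of cluster centers
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    centers = np.asarray(centers, dtype=float)
    # Difference between every sample and the center of its own cluster.
    diffs = X - centers[labels]
    # Squared Euclidean distance per sample, averaged over all N samples.
    return float((diffs ** 2).sum(axis=1).mean())

def transformed_distortion(d_k, p):
    """Transformed distortion d_K ** (-p / 2), with p the sample dimension.

    Assumes d_k > 0; a distortion of exactly zero has no finite transform.
    """
    return d_k ** (-p / 2.0)
```

Note that the transformed distortion grows as the distortion shrinks, which is why it increases with K even though the distortion itself decreases.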

The jump value of the transformed distortion is

$$J_K = d_K^{-p/2} - d_{K-1}^{-p/2}$$

(Assume $d_0^{-p/2} = 0$ when computing $J_1$.)

The peak of the jump values corresponds to the K that provides the best description of the original samples.
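The selection step can be sketched as follows, assuming the distortions for K = 1, 2, 3, … have already been computed from separate k-means runs and are all positive. The helper names `jump_values` and `best_k` are hypothetical, and the numbers in the usage lines are made up purely to show the mechanics.

```python
import numpy as np

def jump_values(distortions, p):
    """Jump values J_1, J_2, ... of the transformed distortion.

    distortions : sequence where distortions[K - 1] is the distortion for K clusters
    p           : dimension of the data samples
    """
    d = np.asarray(distortions, dtype=float)
    transformed = d ** (-p / 2.0)
    # Prepend 0 as the transformed distortion for K = 0, so that J_1 is defined.
    transformed = np.concatenate(([0.0], transformed))
    # J_K is the difference between consecutive transformed distortions.
    return np.diff(transformed)

def best_k(distortions, p):
    """K at which the jump value peaks (1-based)."""
    jumps = jump_values(distortions, p)
    return int(np.argmax(jumps)) + 1

# Illustrative, invented distortions for K = 1..5 on 2-dimensional data.
example_distortions = [10.0, 4.0, 0.5, 0.4, 0.35]
example_jumps = jump_values(example_distortions, p=2)
example_k = best_k(example_distortions, p=2)
```

With these invented values the largest jump occurs at K = 3, where the distortion drops sharply from 4.0 to 0.5. Real data may show several comparable peaks, in which case the choice of K deserves a closer look.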
