30.01.2013 Views

Actes - Société Francophone de Classification


SFC 2009

Moreover, the mixture $m_h$ is the one that minimizes the JS divergence within the $h$-th cluster.

The quality of the final partition can be evaluated using the index obtained as the ratio between $d_{JS}^{B}$ and the total JS divergence, analogously to the approach proposed by Chavent et al. [CHA 03].

5. Computation procedure

According to (4), the JS divergence among the objects belonging to the $h$-th cluster has the following expression:

$$d_{JS}^{(h)} = H(m_h) - \sum_{i \in C_h} \pi_i^{(h)} H(f_i) \tag{10}$$

and, after a few simple steps, we obtain the following expression for $d_{JS}^{(h)}$:

$$d_{JS}^{(h)} = H(m_h) - \sum_{i \in C_h} \pi_i^{(h)} \left[ \sum_{j=1}^{p} H(f_i^{j}) + H(c_i) \right] \tag{11}$$
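The step from the first expression to the second can be made explicit. The following is a sketch, assuming that each joint density $f_i$ admits, by Sklar's theorem [SKL 59], a copula density $c_i$ with marginal densities $f_i^j$; the symbol $F_i^j$ for the marginal distribution functions is notation introduced here for illustration only.

```latex
\begin{align*}
f_i(x_1,\dots,x_p) &= c_i\bigl(F_i^1(x_1),\dots,F_i^p(x_p)\bigr)\prod_{j=1}^{p} f_i^j(x_j),\\
H(f_i) &= -\int f_i(x)\,\log f_i(x)\,dx\\
       &= -\sum_{j=1}^{p}\int f_i(x)\,\log f_i^j(x_j)\,dx
          -\int f_i(x)\,\log c_i\bigl(F_i^1(x_1),\dots,F_i^p(x_p)\bigr)\,dx\\
       &= \sum_{j=1}^{p} H(f_i^j) + H(c_i),
\qquad H(c_i) = -\int_{[0,1]^p} c_i(u)\,\log c_i(u)\,du .
\end{align*}
```

Substituting this decomposition for each $H(f_i)$ in the cluster divergence separates the marginal entropies from the copula entropy, which is what allows the three kinds of entropy to be computed independently.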

We can then easily compute the JS dissimilarities among the objects in a cluster by computing the copula entropy, the marginal entropies and the mixture entropy separately. These quantities can be obtained by numerical integration based on adaptive methods. Subsequently, the quantity $d_{JS}^{W}$ can be computed.
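As a minimal sketch of this kind of computation, the following Python code evaluates entropies with a hand-written adaptive Simpson quadrature and combines them into a JS divergence of the form "mixture entropy minus weighted component entropies". It is restricted to univariate Gaussian densities with equal weights, so no copula term appears. The function names (`adaptive_simpson`, `entropy`, `js_divergence`, `gaussian_pdf`), the integration bounds and the choice of adaptive Simpson as the adaptive method are all assumptions made for illustration, not the authors' implementation.

```python
import math


def adaptive_simpson(f, a, b, tol=1e-9, max_depth=50):
    """Integrate f over [a, b] with recursive adaptive Simpson quadrature."""

    def simpson(fa, fm, fb, a, b):
        return (b - a) / 6.0 * (fa + 4.0 * fm + fb)

    def recurse(a, b, fa, fm, fb, whole, tol, depth):
        m = 0.5 * (a + b)
        flm, frm = f(0.5 * (a + m)), f(0.5 * (m + b))
        left = simpson(fa, flm, fm, a, m)
        right = simpson(fm, frm, fb, m, b)
        delta = left + right - whole
        # Accept the refined estimate once the two levels agree closely enough.
        if depth <= 0 or abs(delta) <= 15.0 * tol:
            return left + right + delta / 15.0
        return (recurse(a, m, fa, flm, fm, left, tol / 2.0, depth - 1)
                + recurse(m, b, fm, frm, fb, right, tol / 2.0, depth - 1))

    fa, fm, fb = f(a), f(0.5 * (a + b)), f(b)
    return recurse(a, b, fa, fm, fb, simpson(fa, fm, fb, a, b), tol, max_depth)


def gaussian_pdf(mu, sigma):
    """Return the density of a univariate normal distribution (illustrative helper)."""
    def pdf(x):
        z = (x - mu) / sigma
        return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
    return pdf


def entropy(pdf, lo, hi):
    """Differential entropy -integral of p log p over [lo, hi] (support assumed truncated)."""
    def integrand(x):
        p = pdf(x)
        return -p * math.log(p) if p > 0.0 else 0.0
    return adaptive_simpson(integrand, lo, hi)


def js_divergence(pdfs, weights, lo, hi):
    """Entropy of the weighted mixture minus the weighted sum of component entropies."""
    def mixture(x):
        return sum(w * f(x) for w, f in zip(weights, pdfs))
    components = sum(w * entropy(f, lo, hi) for w, f in zip(weights, pdfs))
    return entropy(mixture, lo, hi) - components


# Hypothetical cluster of two objects, each described by a univariate density.
cluster = [gaussian_pdf(0.0, 1.0), gaussian_pdf(3.0, 1.0)]
divergence = js_divergence(cluster, [0.5, 0.5], -12.0, 15.0)
print(divergence)
```

In the multivariate setting described in the text, each component entropy would instead be assembled from the marginal entropies and the copula entropy, each obtained with the same kind of adaptive integration.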

The proposed clustering algorithm allows us to find simultaneously the best partition of the symbolic objects, according to the chosen criterion, and a suitable model for describing the dependence within the observations.

6. Bibliography

[BOC 00] BOCK H.H., DIDAY E., Analysis of Symbolic Data: Exploratory Methods for Extracting Statistical Information from Complex Data, Studies in Classification, Data Analysis and Knowledge Organization, Springer Verlag, 2000.

[CHA 03] CHAVENT M., DE CARVALHO F.A.T., LECHEVALLIER Y., VERDE R., Trois nouvelles méthodes de classification automatique des données symboliques de type intervalle, Revue de Statistique Appliquée, vol. 4, 2003, p. 5-29.

[PAP 91] PAPOULIS A., Probability, Random Variables and Stochastic Processes, McGraw-Hill, 1991.

[SHA 71] SHANNON C.E., WEAVER W., La teoria matematica delle comunicazioni, Etas Kompass, 1971.

[SKL 59] SKLAR A., Fonctions de répartition à n dimensions et leurs marges, Publications de l’Institut de Statistique de l’Université de Paris, vol. 8, 1959, p. 229-231.

[VER 08] VERDE R., IRPINO A., Comparing Histogram Data Using Mahalanobis-Wasserstein Distance, in Compstat 2008: Proceedings in Computational Statistics, Heidelberg, Physica-Verlag Springer, 2008.

