Actes - Société Francophone de Classification
SFC 2009
Moreover, the mixture $m_h$ is the one that minimizes the JS divergence within the $h$-th cluster.
The quality of the final partition can be evaluated using the index given by the ratio between the between-cluster divergence $d_{JS}^{B}$ and the total JS divergence, analogously to the approach proposed by Chavent et al. [CHA 03].
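As a minimal sketch of such an index, the ratio could be computed as shown below. The sketch rests on assumptions that the text above does not fix: the objects are discrete histograms on a common support, all objects carry equal weight, and the between-cluster divergence is taken to be the JS divergence among the cluster mixtures, weighted by cluster size. All function names are illustrative.

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution p."""
    return -sum(x * math.log(x) for x in p if x > 0)

def mixture(dists, weights):
    """Weighted mixture of discrete distributions on a common support."""
    return [sum(w * d[k] for d, w in zip(dists, weights))
            for k in range(len(dists[0]))]

def js_divergence(dists, weights):
    """Generalized JS divergence: H(mixture) - sum_i w_i * H(f_i)."""
    m = mixture(dists, weights)
    return entropy(m) - sum(w * entropy(d) for d, w in zip(dists, weights))

def quality_index(dists, labels):
    """Between-cluster JS divergence over total JS divergence.

    Assumption: equal object weights, so each cluster weight is its
    relative size and each cluster mixture is a plain average.
    """
    n = len(dists)
    total = js_divergence(dists, [1.0 / n] * n)
    mixtures, cluster_weights = [], []
    for h in sorted(set(labels)):
        members = [d for d, lab in zip(dists, labels) if lab == h]
        mixtures.append(mixture(members, [1.0 / len(members)] * len(members)))
        cluster_weights.append(len(members) / n)
    between = js_divergence(mixtures, cluster_weights)
    return between / total

# Four toy histograms on three bins (illustrative data only)
objects = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.1, 0.1, 0.8],
    [0.1, 0.2, 0.7],
]
good = quality_index(objects, [0, 0, 1, 1])  # similar objects grouped together
bad = quality_index(objects, [0, 1, 0, 1])   # dissimilar objects grouped together
```

Under these assumptions the index lies between 0 and 1, and the partition that groups similar histograms together (`good`) scores higher than the one that mixes them (`bad`).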
5. Computation procedure
According to (4), the JS divergence among the objects belonging to the $h$-th cluster has the following expression:

$$d_{JS}^{(h)} = H(m_h) - \sum_{i \in C_h} \pi_i^{(h)} H(f_i) \tag{10}$$
and, after a few simple steps, we obtain the following expression for $d_{JS}^{(h)}$:

$$d_{JS}^{(h)} = H(m_h) - \sum_{i \in C_h} \pi_i^{(h)} \left[ \sum_{j=1}^{p} H(f_i^{j}) + H(c_i) \right] \tag{11}$$
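The intermediate steps can be sketched as follows, assuming that each density $f_i$ is written through Sklar's theorem [SKL 59] as the product of its copula density $c_i$ and its $p$ marginal densities $f_i^{j}$. The symbols $F_i^{j}$ (marginal distribution functions) and $u$ (a point of the unit cube) are notation introduced here for illustration.

```latex
% Sklar's theorem: joint density = copula density times the marginal densities
f_i(x_1,\dots,x_p) = c_i\bigl(F_i^{1}(x_1),\dots,F_i^{p}(x_p)\bigr)\,\prod_{j=1}^{p} f_i^{j}(x_j)

% Entropy of f_i: the logarithm of the product splits into a sum
H(f_i) = -\int f_i \log f_i \, dx
       = -\int f_i \log c_i\bigl(F_i^{1}(x_1),\dots,F_i^{p}(x_p)\bigr)\, dx
         - \sum_{j=1}^{p} \int f_i \log f_i^{j}(x_j)\, dx

% First term: substitute u_j = F_i^{j}(x_j); second term: integrate out the other coordinates
H(f_i) = \underbrace{-\int_{[0,1]^p} c_i(u)\,\log c_i(u)\, du}_{H(c_i)}
         + \sum_{j=1}^{p} \underbrace{-\int f_i^{j}(x_j)\,\log f_i^{j}(x_j)\, dx_j}_{H(f_i^{j})}
```

Replacing $H(f_i)$ in the JS divergence of the $h$-th cluster by this sum gives the expression in terms of marginal and copula entropies.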
We can then easily compute the JS dissimilarities among the objects in a cluster by computing the copula entropy, the marginal entropies and the mixture entropy separately. These quantities can be obtained by numerical integration based on adaptive methods. Subsequently, the quantity $d_{JS}^{W}$ can be computed.
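The numerical step can be illustrated with a minimal sketch. It assumes one-dimensional Gaussian densities, equal mixture weights and a recursive adaptive Simpson rule. The text does not say which adaptive method is used, so this is only one possible choice, and all names are illustrative.

```python
import math

def adaptive_simpson(f, a, b, tol=1e-8, max_depth=40):
    """Integrate f on [a, b] with recursive adaptive Simpson quadrature."""
    def simpson(fa, fm, fb, a, b):
        return (b - a) / 6.0 * (fa + 4.0 * fm + fb)

    def recurse(a, b, fa, fm, fb, whole, tol, depth):
        m = 0.5 * (a + b)
        flm, frm = f(0.5 * (a + m)), f(0.5 * (m + b))
        left = simpson(fa, flm, fm, a, m)
        right = simpson(fm, frm, fb, m, b)
        err = left + right - whole
        # Accept when the two-panel and one-panel estimates agree,
        # adding the usual Richardson correction err / 15.
        if depth <= 0 or abs(err) <= 15.0 * tol:
            return left + right + err / 15.0
        return (recurse(a, m, fa, flm, fm, left, tol / 2.0, depth - 1)
                + recurse(m, b, fm, frm, fb, right, tol / 2.0, depth - 1))

    fa, fm, fb = f(a), f(0.5 * (a + b)), f(b)
    return recurse(a, b, fa, fm, fb, simpson(fa, fm, fb, a, b), tol, max_depth)

def differential_entropy(pdf, a, b):
    """Approximate -integral of pdf * log(pdf) over [a, b]."""
    def integrand(x):
        p = pdf(x)
        return -p * math.log(p) if p > 0.0 else 0.0
    return adaptive_simpson(integrand, a, b)

def normal_pdf(mu, sigma):
    """Return the density function of a Gaussian N(mu, sigma^2)."""
    def pdf(x):
        z = (x - mu) / sigma
        return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
    return pdf

# Two illustrative densities and their equal-weight mixture
f1, f2 = normal_pdf(-1.0, 1.0), normal_pdf(2.0, 0.5)

def mix(x):
    return 0.5 * f1(x) + 0.5 * f2(x)

lo, hi = -12.0, 12.0  # truncated support; the tails beyond it are negligible here
h1 = differential_entropy(f1, lo, hi)
h2 = differential_entropy(f2, lo, hi)
hm = differential_entropy(mix, lo, hi)
d_js = hm - 0.5 * h1 - 0.5 * h2  # mixture entropy minus weighted entropies
```

For Gaussian densities the entropies `h1` and `h2` can be checked against the closed form $\tfrac{1}{2}\log(2\pi e \sigma^2)$. The mixture entropy has no closed form, which is where numerical integration is needed.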
The proposed clustering algorithm allows us to find simultaneously the best partition of the symbolic objects, according to the chosen criterion, and a suitable model for describing the dependence within the observations.
6. Bibliography
[BOC 00] BOCK H.H., DIDAY E., Analysis of Symbolic Data: Exploratory Methods for Extracting Statistical Information from Complex Data, Studies in Classification, Data Analysis and Knowledge Organization, Springer Verlag, 2000.
[CHA 03] CHAVENT M., DE CARVALHO F.A.T., LECHEVALLIER Y., VERDE R., Trois nouvelles méthodes de classification automatique des données symboliques de type intervalle, Revue de Statistique Appliquée, vol. 4, 2003, p. 5-29.
[PAP 91] PAPOULIS A., Probability, Random Variables and Stochastic Processes, McGraw-Hill, 1991.
[SHA 71] SHANNON C.E., WEAVER W., La teoria matematica delle comunicazioni, Etas Kompass, 1971.
[SKL 59] SKLAR A., Fonctions de répartition à n dimensions et leurs marges, Publications de l’Institut de Statistique de l’Université de Paris, vol. 8, 1959, p. 229-231.
[VER 08] VERDE R., IRPINO A., Comparing Histogram Data Using Mahalanobis-Wasserstein Distance, Compstat 2008: Proceedings in Computational Statistics, Heidelberg, Physica-Verlag Springer, 2008.