13.07.2015 Views

master thesis.pdf - Atrium - University of Guelph


graphical diagnostics and proposed a hierarchical method to merge Gaussian components based on the dip test for unimodality (Hartigan and Hartigan, 1985). Li (2005) proposed a method to fit a multilayer mixture (a mixture of Gaussian mixtures). It assumes that the number of clusters k is known in advance, and then applies k-means clustering to the s component means, where s is the number of Gaussian mixture components estimated using BIC. However, the methods suggested by Wang and Raftery (2002) and Li (2005) share the same drawback: both are based on the means of the clusters and do not take the clusters' shapes into account. Moreover, the assumption that the number of clusters is known a priori may be questionable in many real-world applications.

Baudry et al. (2008) proposed fitting a Gaussian mixture model to the data, selecting the total number of Gaussian mixture components using BIC, and then combining the components hierarchically according to an entropy criterion. Herein, we propose a new method of merging the Gaussian mixture components. In this work (cf. Chapter 3), we use the adjusted Rand index (ARI; Rand, 1971; Hubert and Arabie, 1985) as our merging criterion, whereas Baudry et al. (2008) use an entropy criterion, and Wang and Raftery (2002) and Li (2005) use the distance between the means of the clusters. The idea is to create an automated procedure that merges a G-component Gaussian mixture density model into a multi-layer Gaussian mixture model with the desired number of clusters (fewer than G). Multi-layer Gaussian mixture models are an extension of Gaussian mixture models in the sense that the density of each cluster is either a single Gaussian density or a mixture of Gaussian densities. To illustrate this idea, consider the two examples shown in Figure 1 and Figure 2.
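
As background for the merging criterion named above, the following is a minimal sketch of how the adjusted Rand index between two partitions can be computed from their contingency table, using the Hubert and Arabie (1985) form. It is not the merging procedure of Chapter 3. The function names `adjusted_rand_index` and `merge_components`, the toy labels, and the component-to-cluster mapping are all hypothetical and serve only to illustrate the idea that several mixture components can be relabelled as one cluster.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """ARI between two partitions of the same observations."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    n = len(labels_a)

    # Contingency counts n_ij and the marginal counts a_i and b_j.
    joint = Counter(zip(labels_a, labels_b))
    rows = Counter(labels_a)
    cols = Counter(labels_b)

    # Number of pairs of observations falling in the same cell / row / column.
    sum_joint = sum(comb(c, 2) for c in joint.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two observations: partitions trivially agree

    expected = sum_rows * sum_cols / total_pairs  # expected index under chance
    max_index = 0.5 * (sum_rows + sum_cols)       # largest attainable index
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both partitions are one group
    return (sum_joint - expected) / (max_index - expected)


def merge_components(component_labels, mapping):
    """Relabel each observation's mixture component as its merged cluster."""
    return [mapping[g] for g in component_labels]


# Hypothetical toy data: G = 3 components, with components 1 and 2
# merged into a single cluster "B" (an assumed mapping, for illustration).
components = [0, 0, 1, 1, 2, 2]
clusters = merge_components(components, {0: "A", 1: "B", 2: "B"})
reference = ["A", "A", "B", "B", "B", "B"]

ari_before = adjusted_rand_index(components, reference)
ari_after = adjusted_rand_index(clusters, reference)
```

In this toy case the unmerged three-component labelling agrees only partially with the two-group reference, while the merged labelling reproduces it exactly, so the index rises to 1 after merging.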
