Dimension Reduction for Model-based Clustering via Mixtures of ...
tMMDR variables which provide no clustering information but require parameter estimation. Thus, the next step in the process of model-based clustering is to detect and remove these unnecessary variables.

Scrucca (2010) used the subset selection method of Raftery and Dean (2006) to prune the subset of GMMDR variables. We will also use this approach to select the most appropriate tMMDR variables.

Let s be a subset of q features from the original tMMDR variables Z, with dim(s) = q and q ≤ d. Let s' = {s \ i} ⊂ s be the set of dimension q − 1 obtained by excluding the i-th feature from s. The comparison of the two subsets can be viewed as a model comparison problem and addressed by using the BIC difference, which is given in Raftery and Dean (2006) as

$$\mathrm{BIC}_{\mathrm{diff}}(Z_{i \in s}) = \mathrm{BIC}_{\mathrm{clust}}(Z_s) - \mathrm{BIC}_{\mathrm{not\,clust}}(Z_s), \tag{3.9}$$

where $\mathrm{BIC}_{\mathrm{clust}}(Z_s)$ is the BIC value for the best model fitted using the features in s and $\mathrm{BIC}_{\mathrm{not\,clust}}(Z_s)$ is the BIC value for no clustering. We can write the latter as

$$\mathrm{BIC}_{\mathrm{not\,clust}}(Z_s) = \mathrm{BIC}_{\mathrm{clust}}(Z_{s'}) + \mathrm{BIC}_{\mathrm{reg}}(Z_i \mid Z_{s'}),$$

where $\mathrm{BIC}_{\mathrm{clust}}(Z_{s'})$ is the BIC value for the best model fitted using the features in s' and $\mathrm{BIC}_{\mathrm{reg}}(Z_i \mid Z_{s'})$ is the BIC value for the regression of the i-th feature on the remaining (q − 1) features in s'. Since the tMMDR variables are orthogonal, the formula for $\mathrm{BIC}_{\mathrm{reg}}$ (Raftery and Dean, 2006) reduces to

$$\mathrm{BIC}_{\mathrm{reg}}(Z_i \mid Z_{s'}) = -n \log(2\pi) - n \log\!\left(\frac{\mathrm{RSS}}{n}\right) - n - (q + 1)\log(n),$$

where RSS is the residual sum of squares in the regression of $Z_i$ on $Z_{s'}$.

Now, the space of all possible subsets s of size q, with q = 1, . . . , d, has $2^d - 1$ elements, so an exhaustive search would be infeasible.
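To make the comparison concrete, the following is a minimal sketch of how the closed-form $\mathrm{BIC}_{\mathrm{reg}}$ and the BIC difference of (3.9) could be computed. The function names (bic_reg, bic_diff), the NumPy-based least-squares fit, and the synthetic inputs are illustrative assumptions, not part of the original method's implementation; the clustering BIC values are taken as given (in practice they come from fitting the mixture models on s and s').

```python
import numpy as np

def bic_reg(Z_i, Z_rest):
    """Closed-form BIC for the regression of feature Z_i on the (q - 1)
    features in Z_rest, as in Raftery and Dean (2006) for orthogonal
    predictors: -n log(2*pi) - n log(RSS/n) - n - (q + 1) log(n)."""
    n = len(Z_i)
    q = Z_rest.shape[1] + 1  # size of the subset s containing feature i
    # Least-squares fit of Z_i on Z_rest plus an intercept.
    X = np.column_stack([np.ones(n), Z_rest])
    beta, *_ = np.linalg.lstsq(X, Z_i, rcond=None)
    rss = np.sum((Z_i - X @ beta) ** 2)
    return -n * np.log(2 * np.pi) - n * np.log(rss / n) - n - (q + 1) * np.log(n)

def bic_diff(bic_clust_s, bic_clust_s_prime, Z_i, Z_rest):
    """BIC difference of eq. (3.9): the clustering BIC on s minus the
    'no clustering' BIC, the latter decomposed as
    BIC_clust(Z_s') + BIC_reg(Z_i | Z_s')."""
    return bic_clust_s - (bic_clust_s_prime + bic_reg(Z_i, Z_rest))
```

A positive value of bic_diff would indicate that feature i carries clustering information beyond what s' already provides, so it should be retained in the subset.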
To bypass this issue, Raftery and Dean (2006) proposed a greedy search algorithm which finds a local optimum in the model space. The algorithm comprises the following steps:
