14.03.2014 Views

Modeling and Multivariate Methods - SAS


470 Clustering Data Chapter 18

K-Means Clustering

K-Means Control Panel

As an example of KMeans clustering, use the Cytometry.jmp sample data table. Add the variables CD3 and CD8 as Y, Columns variables. Select the KMeans option. Click OK. The Control Panel appears, as shown in Figure 18.5.

Figure 18.5 Iterative Clustering Control Panel<br />

The Iterative Clustering red-triangle menu has the Save Transformed option. This saves the Johnson transformed variables to the data table. This option is available only if the Johnson Transform option is selected on the launch dialog (Figure 18.4).

The Control Panel has these options:

Declutter is used to locate outliers in the multivariate sense. Plots are produced giving the distances between each point and that point's nearest neighbor, second nearest neighbor, and so on up to the kth nearest neighbor. You are prompted to enter k. Beneath the plots are options to create a scatterplot matrix, save the distances to the data table, or omit rows that you have excluded from the clustering procedure. If an outlier is identified, you might want to exclude the row from the clustering process.

Method is used to choose the clustering method. The available methods are:

KMeans Clustering is described in this section.

Normal Mixtures is described in “Normal Mixtures” on page 473.

Robust Normal Mixtures is described in “Normal Mixtures” on page 473.

Self Organizing Map is described in “Self Organizing Maps” on page 477.

Number of Clusters is the number of clusters to form.

Optional range of clusters is an upper bound for the number of clusters to form. If a number is entered here, the platform creates a separate analysis for every integer between Number of Clusters and this upper bound.
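
The distances that Declutter plots can be illustrated outside JMP with a short Python sketch. This is not JMP's implementation: the Euclidean metric, the function name knn_distances, and the sample rows are all assumptions made for illustration, and the source does not say how the variables are scaled before distances are computed.

```python
import math

def knn_distances(points, k):
    """For each point, return the sorted distances to its k nearest neighbors."""
    result = []
    for i, p in enumerate(points):
        # Euclidean distance (an assumption) from p to every other row.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[:k])
    return result

# Hypothetical two-column data: a tight group plus one isolated row.
rows = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (2.0, 2.0), (10.0, 10.0)]
distances = knn_distances(rows, k=2)

# A row whose nearest-neighbor distance is much larger than the others
# is a candidate multivariate outlier.
nearest = [d[0] for d in distances]
candidate = max(range(len(rows)), key=lambda i: nearest[i])
print(candidate, distances[candidate])
```

Here the isolated row at index 4 has by far the largest nearest-neighbor distance, which is the kind of pattern the Declutter plots are meant to reveal before you decide whether to exclude a row.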
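
The effect of the optional range can be sketched in Python as one k-means run per candidate number of clusters. This is a minimal illustration under stated assumptions, not the platform's algorithm: it uses plain Lloyd iterations, seeds the centers with the first k rows so the run is deterministic, treats the range as inclusive at both ends, and summarizes each run with a within-cluster sum of squares. The names kmeans, within_ss, number_of_clusters, and range_upper_bound are hypothetical, as are the sample rows.

```python
import math

def kmeans(points, k, iterations=100):
    """Plain Lloyd's algorithm; returns (centers, labels)."""
    # Assumption: seed with the first k rows so the result is reproducible.
    centers = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each row joins its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Update step: move each center to the mean of its member rows.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[c])  # leave an empty cluster's center in place
        if new_centers == centers:
            break  # converged: no center moved
        centers = new_centers
    return centers, labels

def within_ss(points, centers, labels):
    """Sum of squared distances from each row to its assigned center."""
    return sum(math.dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels))

# Hypothetical data with two well-separated groups.
rows = [(1.0, 1.0), (8.0, 8.0), (1.0, 2.0), (8.0, 9.0), (2.0, 1.0), (9.0, 8.0)]
number_of_clusters = 2   # lower end of the range
range_upper_bound = 4    # the optional upper bound

# One separate analysis per integer in the range (inclusive, by assumption).
analyses = {}
for k in range(number_of_clusters, range_upper_bound + 1):
    centers, labels = kmeans(rows, k)
    analyses[k] = (labels, within_ss(rows, centers, labels))

for k, (labels, ss) in analyses.items():
    print(k, labels, round(ss, 3))
```

Comparing the runs side by side is the point of entering a range: the within-cluster sum of squares shrinks as k grows, so the useful question is where the improvement levels off.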
