14.03.2014 Views

Modeling and Multivariate Methods - SAS

Chapter 18: Clustering Data

K-Means Clustering

K-Means Platform Options

These options are accessed from the red-triangle menus and apply to the K-Means, Normal Mixtures, Robust Normal Mixtures, and Self-Organizing Map methods.

Biplot shows a plot of the points and clusters in the first two principal components of the data. Circles are drawn around the cluster centers, and the size of each circle is proportional to the count of observations in the cluster. The shaded area is the 90% density contour around the mean, so it indicates where 90% of the observations in that cluster would fall. Below the plot is an option to save the cluster colors to the data table.
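The 90% density contour mentioned above can be illustrated with a minimal sketch. It assumes that each cluster is treated as a bivariate normal distribution in the two plotted principal components, which the text does not state explicitly. Under that assumption, the squared Mahalanobis distance from the cluster mean follows a chi-square distribution with 2 degrees of freedom, whose quantile has a closed form. The function names are hypothetical and are not part of the software.

```python
import math

def contour_threshold(coverage):
    """Squared Mahalanobis distance that encloses `coverage` of a bivariate normal."""
    # With two dimensions the squared distance is chi-square with 2 df,
    # and its quantile is -2 * ln(1 - coverage).
    return -2.0 * math.log(1.0 - coverage)

def inside_contour(point, mean, cov, coverage=0.90):
    """Return True if a 2-D point lies inside the density contour (assumed normal)."""
    dx = point[0] - mean[0]
    dy = point[1] - mean[1]
    (a, b), (c, d) = cov
    det = a * d - b * c
    # Squared Mahalanobis distance, using the explicit inverse of the 2x2 covariance.
    dist2 = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return dist2 <= contour_threshold(coverage)
```

For a cluster with identity covariance, a point two units from the mean falls inside the 90% contour and a point three units away falls outside it.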

Biplot Options contains options for controlling the Biplot.<br />

Show Biplot Rays allows you to show or hide the biplot rays.<br />

Biplot Ray Position allows you to position the biplot ray display. This is possible because biplot rays signify only the directions of the original variables in canonical space, and there is no special significance to where they are placed in the graph.

Mark Clusters assigns markers to the rows of the data table corresponding to the clusters.<br />

Biplot 3D shows a three-dimensional biplot of the data. Three variables are needed to use this option.

Parallel Coord Plots creates a parallel coordinate plot for each cluster. For details about the plots, see Basic Analysis and Graphing. The plot report has options for showing and hiding the data and means.

Scatterplot Matrix creates a scatterplot matrix using all the variables.

Save Colors to Table colors each row with a color corresponding to the cluster it is in.

Save Clusters creates a new column with the cluster number that each row is assigned to. For Normal Mixtures, this is the cluster that is most likely.

Save Cluster Formula creates a new column with a formula to evaluate which cluster the row belongs to.
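As a rough illustration of what such a formula evaluates for K-Means, the sketch below assigns a row to the cluster whose center is nearest. It assumes plain Euclidean distance on unscaled variables; the formula actually saved by the software may scale the variables or differ in other details. The function name `assign_cluster` is hypothetical.

```python
def assign_cluster(row, centers):
    """Return the 1-based number of the cluster center nearest to `row`."""
    best_number, best_distance = None, float("inf")
    for number, center in enumerate(centers, start=1):
        # Squared Euclidean distance is enough for comparing centers.
        distance = sum((x - c) ** 2 for x, c in zip(row, center))
        if distance < best_distance:
            best_number, best_distance = number, distance
    return best_number
```

Because the rule depends only on the saved centers, it can also score rows that were not used to fit the clusters.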

Save Mixture Probabilities creates a column for each cluster and saves the probability that an observation belongs to that cluster in the column. This is available for Normal Mixtures and Robust Normal Mixtures clustering only.
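The membership probabilities of a normal mixture can be sketched as follows. This is a minimal example that assumes independent variables within each cluster (a diagonal covariance), which is a simplification and not necessarily what the software fits. The names `normal_pdf` and `mixture_probabilities` are hypothetical.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def mixture_probabilities(row, weights, means, sds):
    """Probability that `row` belongs to each cluster (assumes diagonal covariance)."""
    joint = []
    for weight, cluster_means, cluster_sds in zip(weights, means, sds):
        # Mixing probability times the product of per-variable densities.
        density = weight
        for x, m, s in zip(row, cluster_means, cluster_sds):
            density *= normal_pdf(x, m, s)
        joint.append(density)
    total = sum(joint)
    # Normalize so the probabilities sum to 1 across clusters.
    return [j / total for j in joint]
```

The cluster with the largest probability in the returned list is the most likely cluster for that row. For rows far from every cluster the densities can underflow to zero, so a production version would work with log densities.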

Save Mixture Formulas creates columns with mixture probabilities, but stores their formulas in the columns and needs additional columns to hold intermediate results for the formulas. Use this feature if you want to score probabilities for excluded data or for data that you add to the table. This is available for Normal Mixtures and Robust Normal Mixtures clustering only.

Save Density Formula saves the density formula in the data table. This is available for Normal Mixtures clustering only.

Simulate Clusters creates a new data table containing simulated clusters using the mixing probabilities, means, and standard deviations.
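One way to picture this simulation is the sketch below: each simulated row first draws a cluster according to the mixing probabilities, then draws each variable from a normal distribution with that cluster's mean and standard deviation. It assumes the variables are independent within a cluster, since only means and standard deviations are mentioned; the software's own procedure may differ. The function name, its parameters, and the `seed` argument are hypothetical.

```python
import random

def simulate_clusters(n_rows, weights, means, sds, seed=0):
    """Return a list of (cluster_number, row) pairs drawn from a normal mixture."""
    rng = random.Random(seed)  # seeded so the simulated table is reproducible
    cluster_numbers = list(range(1, len(weights) + 1))
    table = []
    for _ in range(n_rows):
        # Pick a cluster with probability given by the mixing probabilities.
        k = rng.choices(cluster_numbers, weights=weights)[0]
        # Draw each variable independently from that cluster's normal distribution.
        row = [rng.gauss(m, s) for m, s in zip(means[k - 1], sds[k - 1])]
        table.append((k, row))
    return table
```

For example, `simulate_clusters(100, [0.3, 0.7], [(0.0, 0.0), (5.0, 5.0)], [(1.0, 1.0), (0.5, 0.5)])` returns 100 rows, about 70% of them expected to come from the second cluster.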

Remove removes the clustering report.
