10.01.2015 Views

Package 'fpc' - open source solution for an Internet free, intelligent ...


fpc-package

Cluster validity indexes and estimation of the number of clusters

cluster.stats Computes several cluster validity statistics from a clustering and a dissimilarity matrix, including the Calinski-Harabasz index, the adjusted Rand index and other statistics explained in Gordon (1999), as well as several characterising measures such as average between-cluster and within-cluster dissimilarity and separation. See also calinhara and dudahart2 for specific indexes.
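As a minimal sketch of how cluster.stats might be called, the example below clusters synthetic data with kmeans and reads a few of the returned statistics. The data, the seed and the variable names (x, truth, km, stats) are assumptions made for illustration only.

```r
library(fpc)

set.seed(1)
# Synthetic data (illustrative assumption): two separated groups in 2 dimensions
x <- rbind(matrix(rnorm(100, mean = 0), ncol = 2),
           matrix(rnorm(100, mean = 5), ncol = 2))
truth <- rep(1:2, each = 50)   # known grouping, used as the alternative clustering

km <- kmeans(x, centers = 2, nstart = 10)

# cluster.stats takes a dissimilarity object and an integer clustering vector
stats <- cluster.stats(dist(x), km$cluster, alt.clustering = truth)

stats$ch               # Calinski-Harabasz index
stats$corrected.rand   # adjusted Rand index between km$cluster and truth
stats$average.between  # average between-cluster dissimilarity
stats$average.within   # average within-cluster dissimilarity
```

The adjusted Rand index is only computed when an alternative clustering is supplied, which is why truth is passed here.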

prediction.strength Estimates the number of clusters by computing the prediction strength of a clustering of a dataset into different numbers of components for various clustering methods, see Tibshirani and Walther (2005). This is more flexible than the method in the original paper because it can use point classification schemes that work better with clustering methods other than k-means.
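The following sketch shows one way prediction.strength could be run with the k-means wrapper kmeansCBI and centroid-based point classification. The synthetic data and the small values of Gmax and M are assumptions chosen to keep the run short, not recommended settings.

```r
library(fpc)

set.seed(2)
# Synthetic data (illustrative assumption): two separated groups in 2 dimensions
x <- rbind(matrix(rnorm(100, mean = 0), ncol = 2),
           matrix(rnorm(100, mean = 5), ncol = 2))

# Try 2 to 5 clusters; M is the number of random splits of the data
ps <- prediction.strength(x, Gmin = 2, Gmax = 5, M = 10,
                          clustermethod = kmeansCBI,
                          classification = "centroid")

ps$mean.pred  # mean prediction strength for each number of clusters
ps$optimalk   # largest number of clusters whose mean strength exceeds the cutoff
```

Other clustering wrappers can be passed as clustermethod, in which case a classification scheme suited to that method would normally be chosen instead of "centroid".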

nselectboot Estimates the number of clusters by bootstrap stability selection, see Fang and Wang (2012). This is quite flexible regarding clustering methods and point classification schemes and also allows for dissimilarity data.

Cluster visualisation and validation

clucols Sets of colours and symbols useful for cluster plotting.

clusterboot Cluster-wise stability assessment of a clustering. Clusterings are performed on resampled data to see, for every cluster of the original dataset, how well it is reproduced. See Hennig (2007) for details.
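A minimal sketch of a clusterboot call, assuming nonparametric bootstrap resampling and k-means with two clusters via kmeansCBI. The data, the seed and the small number of resamples B are illustrative assumptions.

```r
library(fpc)

set.seed(3)
# Synthetic data (illustrative assumption): two separated groups in 2 dimensions
x <- rbind(matrix(rnorm(100, mean = 0), ncol = 2),
           matrix(rnorm(100, mean = 5), ncol = 2))

# B bootstrap resamples; k is passed on to kmeansCBI
cb <- clusterboot(x, B = 20, bootmethod = "boot",
                  clustermethod = kmeansCBI, k = 2,
                  seed = 3, count = FALSE)

cb$bootmean  # mean Jaccard similarity per original cluster across resamples
```

Values of bootmean close to 1 suggest that a cluster is reproduced well in the resampled data.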

cluster.varstats Extracts variable-wise information for every cluster in order to help with cluster interpretation.

plotcluster Visualisation of a clustering or grouping in data by various linear projection methods that optimise the separation between clusters, or between a single cluster and the rest of the data according to Hennig (2004), including classical methods such as discriminant coordinates. This calls the function discrproj, which is a bit more flexible but doesn’t produce a plot itself.
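The relationship between plotcluster and discrproj can be sketched as follows, assuming discriminant coordinates (method "dc") and a k-means clustering of synthetic 4-dimensional data. The data and variable names are illustrative assumptions.

```r
library(fpc)

set.seed(4)
# Synthetic data (illustrative assumption): three groups in 4 dimensions
x <- rbind(matrix(rnorm(200, mean = 0), ncol = 4),
           matrix(rnorm(200, mean = 3), ncol = 4),
           matrix(rnorm(200, mean = 6), ncol = 4))
cl <- kmeans(x, centers = 3, nstart = 10)$cluster

# Draws the first two projection coordinates, labelled by cluster
plotcluster(x, cl, method = "dc")

# Same projection without a plot; proj holds the projected data
dp <- discrproj(x, cl, method = "dc")
head(dp$proj[, 1:2])
```

Using discrproj directly is convenient when the projected coordinates are needed for a custom plot or further analysis.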

ridgeline.diagnosis Plots and diagnostics for assessing modality of Gaussian mixtures, see Ray and Lindsay (2005).

weightplots Plots to diagnose component separation in Gaussian mixtures, see Hennig (2010).

localshape Local shape matrix, which can be used for finding clusters in connection with function ics in package ICS, see Hennig’s discussion and rejoinder of Tyler et al. (2009).

Useful wrapper functions for clustering methods

kmeansCBI This and other "CBI" functions (see the kmeansCBI help page) are unified wrappers for various clustering methods in R. They may be useful because they do in one step what would otherwise take several steps in R (for example, fitting a Gaussian mixture with noise component in package mclust).

kmeansruns This calls kmeans for the k-means clustering method and includes estimation of the number of clusters and finding an optimal solution from several starting points.
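A minimal sketch of kmeansruns, assuming the Calinski-Harabasz criterion for choosing the number of clusters. The data, the range of k and the number of runs are illustrative assumptions.

```r
library(fpc)

set.seed(5)
# Synthetic data (illustrative assumption): two separated groups in 2 dimensions
x <- rbind(matrix(rnorm(100, mean = 0), ncol = 2),
           matrix(rnorm(100, mean = 5), ncol = 2))

# For each k in krange, kmeans is started 'runs' times and the best solution kept
kr <- kmeansruns(x, krange = 2:5, criterion = "ch", runs = 10)

kr$bestk          # estimated number of clusters
kr$crit           # criterion value for each number of clusters tried
table(kr$cluster) # cluster sizes of the selected solution
```

The result otherwise behaves like the output of kmeans, so components such as centers are also available.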

pamk This calls pam and clara for the partitioning around medoids clustering method (Kaufman and Rousseeuw, 1990) and includes two different ways of estimating the number of clusters.
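A minimal sketch of pamk, assuming the average silhouette width criterion ("asw") and pam rather than clara as the underlying method. The data and the range of k are illustrative assumptions.

```r
library(fpc)

set.seed(6)
# Synthetic data (illustrative assumption): two separated groups in 2 dimensions
x <- rbind(matrix(rnorm(100, mean = 0), ncol = 2),
           matrix(rnorm(100, mean = 5), ncol = 2))

# usepam = TRUE uses pam; usepam = FALSE would use clara (suited to larger data)
pk <- pamk(x, krange = 2:5, criterion = "asw", usepam = TRUE)

pk$nc                        # estimated number of clusters
pk$crit                      # criterion values, indexed by number of clusters
table(pk$pamobject$clustering)  # cluster sizes from the fitted pam object
```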
