10.01.2015 Views

Package 'fpc' - open source solution for an Internet free, intelligent ...


clusterboot

more. Between 0.6 and 0.75, clusters may be considered as indicating patterns in the data, but which points exactly should belong to these clusters is highly doubtful. Clusters with average Jaccard values below 0.6 should not be trusted. "Highly stable" clusters should yield average Jaccard similarities of 0.85 and above. All of this refers to the bootstrap; for the other resampling schemes it depends on the tuning constants, though their default values should grant similar interpretations in most cases.
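
The interpretation bands above can be written down as a small lookup. The following is a minimal Python sketch, not part of the fpc package (which is written in R); the function names `jaccard` and `interpret_mean_jaccard` are hypothetical. The text shown here does not label the range from 0.75 up to 0.85, so the sketch reports it as "stable" purely as an assumption, and the handling of values exactly on a boundary is an assumption as well.

```python
def jaccard(a, b):
    """Jaccard similarity |A n B| / |A u B| of two sets of point indices."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        raise ValueError("Jaccard similarity is undefined for two empty sets")
    return len(a & b) / len(union)


def interpret_mean_jaccard(value):
    """Map a cluster's mean bootstrap Jaccard similarity to a verbal label."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("a Jaccard similarity must lie in [0, 1]")
    if value < 0.6:
        return "should not be trusted"
    if value < 0.75:
        return "pattern in the data, membership doubtful"
    if value < 0.85:
        # Assumption: this band is not labeled in the text shown here.
        return "stable"
    return "highly stable"


if __name__ == "__main__":
    for v in (0.45, 0.68, 0.80, 0.92):
        print(v, "->", interpret_mean_jaccard(v))
```

The labels apply to bootstrap resampling; for other resampling schemes the same cut-offs are only a rough guide, as noted above.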

While B=100 is recommended, smaller run numbers could give quite informative results as well, if computation times become too high.
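
To make the role of B concrete, here is a minimal Python sketch of a bootstrap stability loop. It is not the clusterboot implementation: it assumes a toy one-dimensional clusterer that splits the sorted data at its largest gap, it simplifies the treatment of duplicated points by clustering only the distinct resampled points, and the names `largest_gap_clusters`, `bootstrap_stability` and the data are hypothetical.

```python
import random


def jaccard(a, b):
    """Jaccard similarity of two index sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def largest_gap_clusters(values, indices):
    """Toy clusterer: split the given points into two sets at the largest gap."""
    order = sorted(indices, key=lambda i: values[i])
    if len(order) < 2:
        return [set(order)]
    gaps = [values[order[k + 1]] - values[order[k]] for k in range(len(order) - 1)]
    cut = gaps.index(max(gaps)) + 1
    return [set(order[:cut]), set(order[cut:])]


def bootstrap_stability(values, B=100, seed=0):
    """Mean over B resamples of the best Jaccard match for each original cluster."""
    rng = random.Random(seed)
    n = len(values)
    original = largest_gap_clusters(values, range(n))
    totals = [0.0] * len(original)
    for _ in range(B):
        # Draw n indices with replacement; keep the distinct points only.
        drawn = {rng.randrange(n) for _ in range(n)}
        found = largest_gap_clusters(values, drawn)
        for c, cluster in enumerate(original):
            restricted = cluster & drawn  # original cluster within the resample
            totals[c] += max(jaccard(restricted, f) for f in found)
    return [t / B for t in totals]


if __name__ == "__main__":
    data = [0.0, 0.1, 0.2, 0.3, 10.0, 10.1, 10.2, 10.3]
    for runs in (20, 100):
        print(runs, bootstrap_stability(data, B=runs))
```

The cost grows linearly in B, since each run reclusters one resample, which is why a smaller B is a reasonable trade-off when a single clustering run is expensive.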

Note that the stability of a cluster is assessed, but stability is not the only important validity criterion: clusters obtained by very inflexible clustering methods may be stable but not valid, as discussed in Hennig (2007). See plotcluster for graphical cluster validation.

Information about interface functions for clustering methods:

The following interface functions are currently implemented in the present package. Note that almost all of these functions require the specification of some control parameters, so if you use one of them, look up their common help page kmeansCBI first:

kmeansCBI an interface to the function kmeans for k-means clustering. This assumes a cases*variables matrix as input.

hclustCBI an interface to the function hclust for agglomerative hierarchical clustering with optional noise component. This function produces a partition and assumes a cases*variables matrix as input.

hclusttreeCBI an interface to the function hclust for agglomerative hierarchical clustering. This function produces a tree (not only a partition; therefore the number of clusters can be huge!) and assumes a cases*variables matrix as input.

disthclustCBI an interface to the function hclust for agglomerative hierarchical clustering with optional noise component. This function produces a partition and assumes a dissimilarity matrix as input.

noisemclustCBI an interface to the function mclustBIC for normal mixture model based clustering. This assumes a cases*variables matrix as input. Warning: mclustBIC sometimes has problems with multiple points. It is recommended to use this only together with multipleboot=FALSE.

distnoisemclustCBI an interface to the function mclustBIC for normal mixture model based clustering. This assumes a dissimilarity matrix as input and generates a data matrix by multidimensional scaling first. Warning: mclustBIC sometimes has problems with multiple points. It is recommended to use this only together with multipleboot=FALSE.

claraCBI an interface to the functions pam and clara for partitioning around medoids. This can be used with cases*variables as well as dissimilarity matrices as input.

pamkCBI an interface to the function pamk for partitioning around medoids. The number of clusters is estimated by the average silhouette width. This can be used with cases*variables as well as dissimilarity matrices as input.

trimkmeansCBI an interface to the function trimkmeans for trimmed k-means clustering. This assumes a cases*variables matrix as input.

tclustCBI an interface to the function tclust for trimmed Gaussian clustering. This assumes a cases*variables matrix as input.

disttrimkmeansCBI an interface to the function trimkmeans for trimmed k-means clustering. This assumes a dissimilarity matrix as input and generates a data matrix by multidimensional scaling first.
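
The common idea behind these interface functions is that each one wraps a different clustering method behind a single calling convention, with method-specific control parameters passed through, so that the resampling code does not need to know which method it is running. A minimal Python sketch of that idea follows. It does not reproduce the return format used by fpc; `ClusterResult`, `threshold_cbi`, `as_result`, `run_method` and their fields are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class ClusterResult:
    partition: List[int]            # cluster number (1-based) for each case
    clusterlist: List[List[bool]]   # one membership mask per cluster
    method: str                     # name of the wrapped method


def as_result(partition: Sequence[int], method: str) -> ClusterResult:
    """Build the uniform result object from a plain partition vector."""
    labels = sorted(set(partition))
    masks = [[p == lab for p in partition] for lab in labels]
    return ClusterResult(list(partition), masks, method)


def threshold_cbi(data: Sequence[float], cut: float = 0.0) -> ClusterResult:
    """Toy 'clustering method': cases below `cut` form cluster 1, the rest cluster 2."""
    partition = [1 if x < cut else 2 for x in data]
    return as_result(partition, "threshold")


def run_method(method: Callable[..., ClusterResult], data, **control) -> ClusterResult:
    """Caller side: any interface function is invoked the same way."""
    return method(data, **control)


if __name__ == "__main__":
    res = run_method(threshold_cbi, [-2.0, -1.0, 3.0, 4.0], cut=0.0)
    print(res.partition)    # cluster number per case
    print(res.clusterlist)  # membership mask per cluster
```

Keeping the control parameters as keyword arguments mirrors the note above that almost every interface function needs some of them specified.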
