Package 'fpc' - open source solution for an Internet free, intelligent ...


clusterboot

Arguments

data
    something that can be coerced into a matrix. The data matrix - either an n*p-data matrix (or data frame) or an n*n-dissimilarity matrix (or dist-object).

B
    integer. Number of resampling runs for each scheme, see bootmethod.

distances
    logical. If TRUE, the data is interpreted as dissimilarity matrix. If data is a dist-object, distances=TRUE automatically, otherwise distances=FALSE by default. This means that you have to set it to TRUE manually if data is a dissimilarity matrix.

bootmethod
    vector of strings, defining the methods used for resampling. Possible methods:

    "boot": nonparametric bootstrap (precise behaviour is controlled by parameters bscompare and multipleboot).
    "subset": selecting random subsets from the dataset. Size determined by subtuning.
    "noise": replacing a certain percentage of the points by random noise, see noisetuning.
    "jitter": add random noise to all points, see jittertuning. (This didn't perform well in Hennig (2007), but you may want to get your own experience.)
    "bojit": nonparametric bootstrap first, and then adding noise to the points, see jittertuning.

    Important: only the methods "boot" and "subset" work with dissimilarity data!

    The results in Hennig (2007) indicate that "boot" is generally informative and often quite similar to "subset" and "bojit", while "noise" sometimes provides different information. Therefore the default (for distances=FALSE) is to use "boot" and "noise". However, some clustering methods may have problems with multiple points, which can be solved by using "bojit" or "subset" instead of "boot" or by multipleboot=FALSE below.

bscompare
    logical. If TRUE, multiple points in the bootstrap sample are taken into account to compute the Jaccard similarity to the original clusters (which are represented by their "bootstrap versions", i.e., the points of the original cluster which also occur in the bootstrap sample). If a point was drawn more than once, it is in the "bootstrap version" of the original cluster more than once, too, if bscompare=TRUE. Otherwise (default) multiple points are ignored for the computation of the Jaccard similarities. If multipleboot=FALSE, it doesn't make a difference.

multipleboot
    logical. If FALSE, all points drawn more than once in the bootstrap draw are only used once in the bootstrap samples.

jittertuning
    positive numeric. Tuning for the "jitter"-method. The noise distribution for jittering is a normal distribution with zero mean. The covariance matrix has the same eigenvectors as that of the original data set, but the standard deviation along the principal directions is determined by the jittertuning-quantile of the distances between neighboring points projected along these directions.

noisetuning
    A vector of two positive numerics. Tuning for the "noise"-method. The first component determines the probability that a point is replaced by noise. Noise is generated by a uniform distribution on a hyperrectangle along the principal directions of the original data set, ranging from -noisetuning[2] to noisetuning[2] times the standard deviation of the data set along the respective direction. Note that only points not replaced by noise are considered for the computation of Jaccard similarities.

subtuning
    integer. Size of subsets for "subset".

clustermethod
    an interface function (the function name, not a string containing the name, has to be provided!). This defines the clustering method. See the "Details"-section for a list of available interface functions and guidelines how to write your own ones.

noisemethod
    logical. If TRUE, the last cluster is regarded as "noise component", which means that for computing the Jaccard similarity, it is not treated as a cluster. The noise component of the original clustering is only compared with the noise component of the clustering of the resampled data. This means that in the clusterboot output (and plot), if points were assigned to the noise component, the last cluster number refers to it, and its Jaccard similarity values refer to comparisons with estimated noise components in resampled datasets only. (Some cluster methods such as trimmed k-means and mclustBIC produce such noise components.)

count
    logical. If TRUE, the resampling runs are counted on the screen.

showplots
    logical. If TRUE, a plot of the first two dimensions of the resampled data set (or the classical MDS solution for dissimilarity data) is shown for every resampling run. The last plot shows the original data set.

dissolution
    numeric between 0 and 1. If the Jaccard similarity between the resampling version of the original cluster and the most similar cluster on the resampled data is smaller or equal to this value, the cluster is considered as "dissolved". Numbers of dissolved clusters are recorded.

recover
    numeric between 0 and 1. If the Jaccard similarity between the resampling version of the original cluster and the most similar cluster on the resampled data is larger than this value, the cluster is considered as "successfully recovered". Numbers of recovered clusters are recorded.

seed
    integer. Seed for random generator (fed into set.seed) to make results reproducible. If NULL, results depend on chance.

...
    additional parameters for the clustermethods called by clusterboot. No effect in print.clboot and plot.clboot.

x
    object of class clboot.

statistics
    specifies in print.clboot, which of the three clusterwise Jaccard similarity statistics "mean", "dissolution" (number of times the cluster has been dissolved) and "recovery" (number of times a cluster has been successfully recovered) is printed.

xlim
    transferred to hist.

breaks
    transferred to hist.

Details

Here are some guidelines for interpretation. There is some theoretical justification to consider a Jaccard similarity value smaller or equal to 0.5 as an indication of a "dissolved cluster", see Hennig (2008). Generally, a valid, stable cluster should yield a mean Jaccard similarity value of 0.75 or
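The Jaccard comparison behind the dissolution and recover arguments can be illustrated with a minimal base-R sketch. It is not the package's internal code: the helpers jaccard and classify_cluster, the toy index vectors, and the threshold values 0.5 and 0.75 (taken from the interpretation guidelines above) are assumptions made for illustration. Multiple draws of a point are ignored here, which corresponds to the case described for bscompare where multiple points are not counted.

```r
# Jaccard similarity between two clusters given as vectors of point indices
# (hypothetical helper, not part of fpc).
jaccard <- function(a, b) {
  length(intersect(a, b)) / length(union(a, b))
}

# Hypothetical helper: label one resampling result with the rules
# described for the dissolution and recover arguments.
classify_cluster <- function(sim, dissolution = 0.5, recover = 0.75) {
  if (sim <= dissolution) {
    "dissolved"
  } else if (sim > recover) {
    "recovered"
  } else {
    "neither"
  }
}

# Toy data (assumed): an original cluster and a bootstrap draw of indices.
original_cluster <- c(1, 2, 3, 4, 5)
boot_sample <- c(1, 1, 2, 4, 7, 8, 9, 9, 10, 3)

# "Bootstrap version" of the original cluster: its points that also occur
# in the bootstrap sample; intersect() drops repeated draws.
boot_version <- intersect(original_cluster, boot_sample)

# Most similar cluster found on the resampled data (assumed for the example).
found_cluster <- c(1, 2, 4, 7)

sim <- jaccard(boot_version, found_cluster)
status <- classify_cluster(sim)
print(sim)
print(status)
```

With the package installed, a call along the lines of clusterboot(x, B = 100, bootmethod = "boot", clustermethod = kmeansCBI, krange = 3, seed = 1) would run this kind of comparison over B resampling runs; the data object x and the argument values here are placeholders.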
