16.01.2015 Views

Package 'WGCNA' - Laboratory Web Sites - UCLA


goodGenesMS

Filter genes with too many missing entries across multiple sets

Description

This function checks data for missing entries and returns a list of genes that have non-zero variance in all sets and pass two criteria on the maximum number of missing values in each given set: the fraction of missing values must be below a given threshold, and the total number of missing samples must be below a given threshold.
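The criteria described above can be sketched in base R. This is only an illustration of the idea, not the package's implementation: the helper name `good_genes_sketch` and its argument names are hypothetical, and the exact boundary handling (strict versus non-strict comparisons) is an assumption.

```r
# Hypothetical helper, not part of WGCNA: flags genes that pass the
# missing-value and variance criteria in every set.
good_genes_sketch <- function(multi_expr, min_fraction = 1/2, min_n_samples = 4) {
  n_genes <- ncol(multi_expr[[1]]$data)
  good <- rep(TRUE, n_genes)
  for (set in multi_expr) {
    x <- as.matrix(set$data)              # rows = samples, columns = genes
    n_present <- colSums(!is.na(x))       # non-missing samples per gene
    frac_present <- n_present / nrow(x)   # fraction of non-missing samples
    v <- apply(x, 2, var, na.rm = TRUE)   # per-gene variance, ignoring NAs
    ok <- frac_present >= min_fraction &  # assumed non-strict comparison
      n_present >= min_n_samples &        # assumed non-strict comparison
      !is.na(v) & v > 0
    good <- good & ok                     # a gene must pass in all sets
  }
  good
}

# Toy data: 3 genes in two sets (6 and 5 samples).
set1 <- matrix(c(1, 2, 3, 4, 5, 6,        # gene 1: complete, varying
                 1, NA, NA, NA, NA, 2,    # gene 2: only 2 of 6 present
                 7, 7, 7, 7, 7, 7),       # gene 3: zero variance
               nrow = 6)
set2 <- matrix(c(2, 4, 6, 8, 10,
                 1, 3, 2, 5, 4,
                 1, 2, 3, 4, 5),
               nrow = 5)
toy <- list(list(data = set1), list(data = set2))
good <- good_genes_sketch(toy)
print(good)
```

In this toy input only the first gene passes: the second has too few non-missing samples in the first set, and the third has zero variance there.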

Usage

goodGenesMS(multiExpr,
            useSamples = NULL,
            useGenes = NULL,
            minFraction = 1/2,
            minNSamples = ..minNSamples,
            minNGenes = ..minNGenes,
            verbose = 1, indent = 0)

Arguments

multiExpr      expression data in the multi-set format (see checkSets). A vector of lists, one per set. Each set must contain a component data that contains the expression data, with rows corresponding to samples and columns to genes or probes.

useSamples     optional specifications of which samples to use for the check. Should be a logical vector; samples whose entries are FALSE will be ignored for the missing value counts. Defaults to using all samples.

useGenes       optional specifications of genes for which to perform the check. Should be a logical vector; genes whose entries are FALSE will be ignored. Defaults to using all genes.

minFraction    minimum fraction of non-missing samples for a gene to be considered good.

minNSamples    minimum number of non-missing samples for a gene to be considered good.

minNGenes      minimum number of good genes for the data set to be considered fit for analysis. If the actual number of good genes falls below this threshold, an error will be issued.

verbose        integer level of verbosity. Zero means silent, higher values make the output progressively more and more verbose.

indent         indentation for diagnostic messages. Zero means no indentation, each unit adds two spaces.
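As a rough illustration of the multi-set format described for multiExpr, the sketch below assembles two sets from synthetic data. The helper `make_set`, the sample and gene names, and the random values are made up for the example. The call to goodGenesMS is guarded so that the snippet also runs where WGCNA is not installed.

```r
set.seed(1)
n_genes <- 6

# Hypothetical helper: one set is a list with a `data` component,
# rows = samples, columns = genes.
make_set <- function(n_samples) {
  x <- matrix(rnorm(n_samples * n_genes), nrow = n_samples,
              dimnames = list(paste0("sample", seq_len(n_samples)),
                              paste0("gene", seq_len(n_genes))))
  list(data = x)
}

# Two sets with different numbers of samples but the same genes.
multiExpr <- list(make_set(6), make_set(8))

# Make gene2 mostly missing in the first set (1 of 6 values present).
multiExpr[[1]]$data[1:5, 2] <- NA

# Run the check only if the package is available.
if (requireNamespace("WGCNA", quietly = TRUE)) {
  good <- WGCNA::goodGenesMS(multiExpr, verbose = 0)
  print(good)
}
```

All sets need the same genes in the same column order, since the check is applied per gene across sets.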

Details

The constants ..minNSamples and ..minNGenes are both set to the value 4. For most data sets, the fraction of missing samples criterion will be much more stringent than the absolute number of missing samples criterion.
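A small calculation shows why the fraction criterion usually dominates. It uses the default values stated above (minFraction = 1/2, minimum count of 4) and assumes, for illustration only, that both thresholds are applied as non-strict lower bounds on the number of non-missing samples. The variable names are not part of the package.

```r
min_fraction <- 1/2
min_n_samples <- 4                      # value of ..minNSamples stated above

n_samples <- c(4, 6, 8, 10, 20, 50)     # example set sizes

# Non-missing samples required by each criterion (assumed non-strict bounds).
required_by_fraction <- ceiling(min_fraction * n_samples)
required_by_count <- rep(min_n_samples, length(n_samples))

# Which criterion demands more non-missing samples for each set size.
binding <- ifelse(required_by_fraction > required_by_count, "fraction",
           ifelse(required_by_fraction < required_by_count, "count", "tie"))

print(data.frame(n_samples, required_by_fraction, required_by_count, binding))
```

Under these assumptions the absolute count matters only for sets with fewer than 8 samples. For anything larger, the fraction criterion is the stricter one.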
