
Package 'oligo' - Bioconductor


Package 'oligo'

May 22, 2014

Version 1.28.2
Title Preprocessing tools for oligonucleotide arrays.
Author Benilton Carvalho and Rafael Irizarry. Contributors: Ben Bolstad, Vincent Carey, Wolfgang Huber, Harris Jaffee, Jim MacDonald, Matt Settles
Maintainer Benilton Carvalho
Depends R (>= 2.15.0), BiocGenerics (>= 0.3.2), oligoClasses (>= 1.25.4), Biobase (>= 2.17.8), Biostrings (>= 2.25.12)
Imports affyio (>= 1.25.0), affxparser (>= 1.29.11), BiocGenerics (>= 0.3.2), DBI (>= 0.2-5), ff, graphics, methods, preprocessCore (>= 1.19.0), splines, stats, stats4, utils, zlibbioc
Enhances ff, doMC, doMPI
LinkingTo preprocessCore
Suggests hapmap100kxba, pd.mapping50k.xba240, pd.huex.1.0.st.v2, pd.hg18.60mer.expr, pd.hugene.1.0.st.v1, maqcExpression4plex, genefilter, limma, RColorBrewer, oligoData, RUnit
Description A package to analyze oligonucleotide arrays (expression/SNP/tiling/exon) at probe-level. It currently supports Affymetrix (CEL files) and NimbleGen arrays (XYS files).
License LGPL (>= 2)
Collate AllGenerics.R methods-GeneFeatureSet.R methods-ExonFeatureSet.R methods-ExpressionFeatureSet.R methods-ExpressionSet.R methods-LDS.R methods-FeatureSet.R methods-SnpFeatureSet.R methods-SnpCnvFeatureSet.R methods-TilingFeatureSet.R methods-HtaFeatureSet.R methods-DBPDInfo.R methods-background.R methods-normalization.R methods-summarization.R read.celfiles.R read.xysfiles.R utils-general.R utils-selectors.R todo-snp.R functions-crlmm.R functions-snprma.R justSNPRMA.R justCRLMM.R methods-snp6.R methods-genotype.R methods-PLMset.R zzz.R


LazyLoad Yes
biocViews Microarray, OneChannel, TwoChannel, Preprocessing, SNP, DifferentialExpression, ExonArray, GeneExpression, DataImport

R topics documented:

oligo-package
basecontent
basicPLM
basicRMA
boxplot
chromosome
crlmm
darkColors
fitProbeLevelModel
getAffinitySplineCoefficients
getBaseProfile
getContainer
getCrlmmSummaries
getNetAffx
getNgsColorsInfo
getPlatformDesign
getProbeInfo
getX
hist
image
justSNPRMA
list.xysfiles
MAplot
mm
mmindex
mmSequence
oligo-defunct
oligoPLM-class
paCalls
plotM-methods
pmAllele
pmFragmentLength
pmPosition
pmStrand
probeNames
read.celfiles
read.xysfiles
readSummaries
rma-methods
runDate
sequenceDesignMatrix


snprma
summarize


oligo-package
The oligo package: a tool for low-level analysis of oligonucleotide arrays

Description

The oligo package provides tools to preprocess different oligonucleotide array types: expression, tiling, SNP and exon chips. The supported manufacturers are Affymetrix and NimbleGen.

It offers support for large datasets (when the bigmemory package is loaded) and can execute preprocessing tasks in parallel (if, in addition to bigmemory, the snow package is also loaded).

Details

The package will read the raw intensity files (CEL for Affymetrix; XYS for NimbleGen) and allow the user to perform analyses starting at the feature level.

Reading in the intensity files requires the existence of data packages that contain the chip-specific information (X/Y coordinates; feature types; sequence). These data packages are built using the pdInfoBuilder package.

For Affymetrix SNP arrays, users are asked to download the already built annotation packages from Bioconductor. This is because these packages contain metadata that are not automatically created. The following annotation packages are available:

50K Xba - pd.mapping50k.xba240
50K Hind - pd.mapping50k.hind240
250K Sty - pd.mapping250k.sty
250K Nsp - pd.mapping250k.nsp
GenomeWideSnp 5 (SNP 5.0) - pd.genomewidesnp.5
GenomeWideSnp 6 (SNP 6.0) - pd.genomewidesnp.6

For users interested in genotype calls for SNP 5.0 and 6.0 arrays, we strongly recommend the use of the crlmm package, which implements a more efficient version of CRLMM.

Author(s)

Benilton Carvalho

References

Carvalho, B.; Bengtsson, H.; Speed, T. P. & Irizarry, R. A. Exploration, Normalization, and Genotype Calls of High Density Oligonucleotide SNP Array Data. Biostatistics, 2006.


basecontent
Sequence Base Contents

Description

Function to compute the amounts of each nucleotide in a sequence.

Usage

    basecontent(seq)

Arguments

seq    character vector of length n containing a valid sequence (A/T/C/G)

Value

matrix with n rows and 4 columns with the counts for each base.

Examples

    sequences
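The kind of count described above can be illustrated with a short base-R sketch. It does not call oligo and is not the package's implementation; the helper name `count_bases` and the two input sequences are hypothetical, chosen only to show an n x 4 matrix of per-base counts.

```r
# Hypothetical helper (not part of oligo): count A/C/G/T in each sequence
# and return a matrix with one row per sequence and one column per base.
count_bases <- function(seq) {
  bases <- c("A", "C", "G", "T")
  counts <- t(vapply(
    strsplit(seq, "", fixed = TRUE),
    function(chars) as.integer(table(factor(chars, levels = bases))),
    integer(4)
  ))
  colnames(counts) <- bases
  counts
}

# Two made-up sequences of lengths 11 and 9.
example_counts <- count_bases(c("ATATATCCCCG", "TTTCCGAGC"))
print(example_counts)
```

Each row sums to the length of the corresponding sequence, which is a quick sanity check on any base-content matrix.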


basicPLM (continued)

Value

A list with the following components:

Estimates    A (length(pnVec) x ncol(pmMat)) matrix with probeset summaries.
StdErrors    A (length(pnVec) x ncol(pmMat)) matrix with standard errors of 'Estimates'.
Residuals    A (nrow(pmMat) x ncol(pmMat)) matrix of residuals.

Note

Currently, only RMA-bg-correction and quantile normalization are allowed.

Author(s)

Benilton Carvalho

See Also

rcModelPLM, rcModelPLMr, rcModelPLMrr, rcModelPLMrc, basicRMA

Examples

    set.seed(1)
    pms


basicRMA (continued)

Arguments

pmMat        Matrix of intensities to be processed.
pnVec        Probeset names.
normalize    Logical flag: normalize?
background   Logical flag: background adjustment?
bgversion    Version of background correction.
destructive  Logical flag: use destructive methods?
verbose      Logical flag: verbose.
...          Not currently used.

Value

Matrix.

Examples

    set.seed(1)
    pms
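To make the roles of a PM matrix and a probeset-name vector concrete, here is a base-R sketch of two of the steps usually associated with RMA: quantile normalization followed by median-polish summarization on the log2 scale. It is an illustration under stated assumptions, not the code behind basicRMA: background correction is skipped, the data are simulated, and the names `pm_mat`, `pn_vec`, `quantile_normalize` and `summarize_medpolish` are hypothetical.

```r
set.seed(1)

# Simulated PM intensities (hypothetical data): 20 probes x 4 samples,
# grouped into 5 probesets of 4 probes each.
pm_mat <- matrix(2^rnorm(80, mean = 8), nrow = 20, ncol = 4)
pn_vec <- rep(paste0("probeset", 1:5), each = 4)

# Quantile normalization: force every column to share one empirical distribution.
quantile_normalize <- function(x) {
  ranks <- apply(x, 2, rank, ties.method = "first")
  target <- rowMeans(apply(x, 2, sort))
  normalized <- apply(ranks, 2, function(r) target[r])
  dimnames(normalized) <- dimnames(x)
  normalized
}

# Median-polish summarization on the log2 scale, one probeset at a time.
# The per-sample summary is the overall effect plus the column effect.
summarize_medpolish <- function(x, probesets) {
  rows_by_set <- split(seq_len(nrow(x)), probesets)
  t(vapply(rows_by_set, function(rows) {
    fit <- medpolish(log2(x[rows, , drop = FALSE]), trace.iter = FALSE)
    fit$overall + fit$col
  }, numeric(ncol(x))))
}

pm_norm <- quantile_normalize(pm_mat)
estimates <- summarize_medpolish(pm_norm, pn_vec)
print(dim(estimates))
```

The result has one row per probeset and one column per sample, which is the general shape a probeset-level summary matrix takes.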


boxplot (continued)

Details

The 'transfo' argument will set the transformation to be used. For raw data, 'transfo=log2' is a common practice. For summarized data (which are often in log2-scale), no transformation is needed (therefore 'transfo=identity').

Note

The boxplot methods for FeatureSet and Expression use a sample (via sample) of the probes/probesets to produce the plot. Therefore, the user interested in reproducibility is advised to use set.seed.

See Also

hist, image, sample, set.seed


chromosome
Accessor for chromosome information

Description

Returns chromosome information.

Usage

    pmChr(object)

Arguments

object    TilingFeatureSet or SnpCallSet object

Details

chromosome() returns the chromosomal information for all probes and pmChr() subsets the output to the PM probes only (if a TilingFeatureSet object).

Value

Vector with chromosome information.


crlmm
Genotype Calls

Description

Performs genotype calls via CRLMM (Corrected Robust Linear Model with Maximum-likelihood based distances).

Usage

    crlmm(filenames, outdir, batch_size=40000, balance=1.5,
          minLLRforCalls=c(5, 1, 5), recalibrate=TRUE,
          verbose=TRUE, pkgname, reference=TRUE)

    justCRLMM(filenames, batch_size = 40000, minLLRforCalls = c(5, 1, 5),
              recalibrate = TRUE, balance = 1.5, phenoData = NULL, verbose = TRUE,
              pkgname = NULL, tmpdir=tempdir())

Arguments

filenames       character vector with the filenames.
outdir          directory where the output files (and some temporary files) will be saved.
batch_size      integer defining how many SNPs should be processed at a time.
recalibrate     Logical - should recalibration be performed?
balance         Control parameter to balance homozygote and heterozygote calls.
minLLRforCalls  Minimum thresholds for genotype calls.
verbose         Logical.
phenoData       phenoData object or NULL
pkgname         alt. pdInfo package to be used
reference       logical, defaulting to TRUE ...
tmpdir          Directory where temporary files are going to be stored.

Value

SnpCallSetPlus object.


darkColors
Create set of colors, interpolating through a set of preferred colors.

Description

Create set of colors, interpolating through a set of preferred colors.

Usage

    darkColors(n)
    seqColors(n)
    seqColors2(n)
    divColors(n)

Arguments

n    integer determining number of colors to be generated

Details

darkColors is based on the Dark2 palette in RColorBrewer, therefore useful to describe qualitative features of the data.

seqColors is based on Blues and generates a gradient of blues, therefore useful to describe quantitative features of the data. seqColors2 behaves similarly, but it is based on OrRd (white-orange-red).

divColors is based on the RdBu palette in RColorBrewer, therefore useful to describe quantitative features ranging on two extremes.

Examples

    x
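Interpolating through a fixed set of preferred colors can be sketched with grDevices alone. This is only an assumption about how such a function might work, not oligo's code: the helper `interpolate_colors` is hypothetical, and the eight anchor colors are the Dark2 hex values as published by ColorBrewer, hard-coded here so the sketch does not need RColorBrewer.

```r
# The eight ColorBrewer "Dark2" anchors, hard-coded for illustration.
dark2_anchors <- c("#1B9E77", "#D95F02", "#7570B3", "#E7298A",
                   "#66A61E", "#E6AB02", "#A6761D", "#666666")

# Hypothetical helper: return n colors interpolated through the anchors.
interpolate_colors <- function(n, anchors = dark2_anchors) {
  grDevices::colorRampPalette(anchors)(n)
}

# Asking for more colors than anchors yields intermediate shades.
palette12 <- interpolate_colors(12)
print(palette12)
```

Requesting more colors than there are anchors is where interpolation matters; with n at or below the number of anchors, a qualitative palette could simply be indexed instead.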


fitProbeLevelModel
Tool to fit Probe Level Models.

Description

Fits robust Probe Level linear Models to all the (meta)probesets in a FeatureSet. This is carried out on a (meta)probeset by (meta)probeset basis.

Usage

    fitProbeLevelModel(object, background=TRUE, normalize=TRUE, target="core", method="plm", verbose=TRUE

Arguments

object      FeatureSet object.
background  Do background correction?
normalize   Do normalization?
target      character vector describing the summarization target. Valid values are: 'probeset', 'core' (Gene/Exon), 'full' (Exon), 'extended' (Exon).
method      summarization method to be used.
verbose     verbosity flag.
S4          return final value as an S4 object (oligoPLM) if TRUE. If FALSE, final value is returned as a list.
...         subset to be passed down to getProbeInfo for subsetting. See subset for details.

Value

fitProbeLevelModel returns an oligoPLM object, if S4=TRUE; otherwise, it will return a list.

Note

This is the initial port of fitPLM to oligo. Some features found in the original work by Ben Bolstad (in the affyPLM package) may not be available yet. If you find one of these missing features, please contact Benilton Carvalho.

Author(s)

This is a simplified port from Ben Bolstad's work implemented in the affyPLM package. Problems with the implementation in oligo should be reported to Benilton Carvalho.

References

Bolstad, BM (2004) Low Level Analysis of High-density Oligonucleotide Array Data: Background, Normalization and Summarization. PhD Dissertation. University of California, Berkeley.


fitProbeLevelModel (continued)

See Also

rma, summarizationMethods, subset

Examples

    if (require(oligoData)){
    data(nimbleExpressionFS)
    fit


getBaseProfile
Compute and plot nucleotide profile.

Description

Computes and, optionally, plots nucleotide profile, describing the sequence effect on intensities.

Usage

    getBaseProfile(coefs, probeLength = 25, plot = FALSE, ...)

Arguments

coefs        affinity spline coefficients.
probeLength  length of probes
plot         logical. Plots profile?
...          arguments to be passed to matplot.

Value

Invisibly returns a matrix with estimated effects.


getContainer
Get container information for NimbleGen Tiling Arrays.

Description

Get container information for NimbleGen Tiling Arrays. This is useful for better identification of control probes.

Usage

    getContainer(object, probeType)

Arguments

object     A TilingFeatureSet object.
probeType  String describing which probes to query ('pm', 'bg')

Value

'character' vector with container information.


getCrlmmSummaries
Function to get CRLMM summaries saved to disk

Description

This will read the summaries written to disk and return them to the user as a SnpCallSetPlus or SnpCnvCallSetPlus object.

Usage

    getCrlmmSummaries(tmpdir)

Arguments

tmpdir    directory where CRLMM saved the results to.

Value

If the data were from SNP 5.0 or 6.0 arrays, the function will return a SnpCnvCallSetPlus object. It will return a SnpCallSetPlus object, otherwise.


getNetAffx
NetAffx Biological Annotations

Description

Gets NetAffx Biological Annotations saved in the annotation package (Exon and Gene ST Affymetrix arrays).

Usage

    getNetAffx(object, type = "probeset")

Arguments

object  'ExpressionSet' object (e.g., result of rma())
type    Either 'probeset' or 'transcript', depending on what type of summaries were obtained.

Details

This retrieves NetAffx annotation saved in the (pd) annotation package - annotation(object). It is only available for Exon ST and Gene ST arrays.

The 'type' argument should match the summarization target used to generate 'object'. The 'rma' method allows for two targets: 'probeset' (target='probeset') and 'transcript' (target='core', target='full', target='extended').


getNetAffx (continued)

Value

'AnnotatedDataFrame' that can be used as featureData(object)

Author(s)

Benilton Carvalho


getNgsColorsInfo
Helper function to extract color information for filenames on NimbleGen arrays.

Description

This function will (try to) extract the color information for NimbleGen arrays. This is useful when using read.xysfiles2 to parse XYS files for Tiling applications.

Usage

    getNgsColorsInfo(path = ".", pattern1 = "_532", pattern2 = "_635", ...)

Arguments

path      path where to look for files
pattern1  pattern to match files supposed to go to the first channel
pattern2  pattern to match files supposed to go to the second channel
...       extra arguments for list.xysfiles

Details

Many NimbleGen samples are identified following the pattern sampleID_532.XYS / sampleID_635.XYS.

The function suggests sample names if all the filenames follow the standard above.

Value

A data.frame with, at least, two columns: 'channel1' and 'channel2'. A third column, 'sampleNames', is returned if the filenames follow the sampleID_532.XYS / sampleID_635.XYS standard.

Author(s)

Benilton Carvalho


getPlatformDesign
Retrieve Platform Design object

Description

Retrieve platform design object.

Usage

    getPlatformDesign(object)
    getPD(object)

Arguments

object    FeatureSet object

Details

Retrieve platform design object.

Value

platformDesign or PDInfo object.


getProbeInfo
Probe information selector.

Description

A tool to simplify the selection of probe information, so the user does not need to use the SQL approaches.

Usage

    getProbeInfo(object, field, probeType = "pm", target = "core", sortBy = c("fid", "man_fsetid", "none"),

Arguments

object     FeatureSet object.
field      character string with names of field(s) of interest to be obtained from database.
probeType  character string: 'pm' or 'mm'
target     Used only for Exon or Gene ST arrays: 'core', 'full', 'extended', 'probeset'.
sortBy     Field to be used for sorting.
...        Arguments to be passed to subset


getProbeInfo (continued)

Value

A data.frame with the probe level information.

Note

The code allows for querying info on MM probes; however, it has been used mostly on PM probes.

Author(s)

Benilton Carvalho

Examples

    if (require(oligoData)){
    data(affyGeneFS)
    availProbeInfo(affyGeneFS)
    probeInfo
    head(agenGene)
    }


getX
Accessors for physical array coordinates.

Description

Accessors for physical array coordinates.

Usage

    getX(object, type)
    getY(object, type)

Arguments

object  FeatureSet object
type    'character' defining the type of the probes to be queried. Valid options are 'pm', 'mm', 'bg'

Value

A vector with the requested coordinates.


getX (continued)

Examples

    ## Not run:
    x


image
Display a pseudo-image of a microarray chip

Description

Produces a pseudo-image (graphics::image) for each sample.

Usage

    ## S4 method for signature 'FeatureSet'
    image(x, which, transfo=log2, ...)

    ## S4 method for signature 'PLMset'
    image(x, which=0,
          type=c("weights","resids", "pos.resids","neg.resids","sign.resids"),
          use.log=TRUE, add.legend=FALSE, standardize=FALSE,
          col=NULL, main, ...)

Arguments

x            FeatureSet object
which        integer indices of samples to be plotted (optional).
transfo      function to be applied to the data prior to plotting.
type         Type of statistics to be used.
use.log      Use log.
add.legend   Add legend.
standardize  Standardize residuals.
col          Colors to be used.
main         Main title.
...          parameters to be passed to image

Examples

    if(require(oligoData) & require(pd.hg18.60mer.expr)){
    data(nimbleExpressionFS)
    par(mfrow=c(1, 2))
    image(nimbleExpressionFS, which=4)
    ## fit


justSNPRMA
Summarization of SNP data

Description

This function implements the SNPRMA method for summarization of SNP data. It works directly with the CEL files, saving memory.

Usage

    justSNPRMA(filenames, verbose = TRUE, phenoData = NULL, normalizeToHapmap = TRUE)

Arguments

filenames          character vector with the filenames.
verbose            logical flag for verbosity.
phenoData          a phenoData object or NULL
normalizeToHapmap  Normalize to Hapmap? Should always be TRUE, but it's kept here for future use.

Value

SnpQSet or a SnpCnvQSet, depending on the array type.

Examples

    ## snprmaResults


list.xysfiles (continued)

Details

The functions interface list.files and the user is asked to check that function for further details.

Value

Character vector with the filenames.

See Also

list.files

Examples

    list.xysfiles()


MAplot
MA plots

Description

Create MA plots using a reference array (if one channel) or using channel2 as reference (if two channel).

Usage

    MAplot(object, ...)

    ## S4 method for signature 'FeatureSet'
    MAplot(object, what=pm, transfo=log2, groups,
           refSamples, which, pch=".", summaryFun=rowMedians,
           plotFun=smoothScatter, main="vs pseudo-median reference chip",
           pairs=FALSE, ...)

    ## S4 method for signature 'TilingFeatureSet'
    MAplot(object, what=pm, transfo=log2, groups,
           refSamples, which, pch=".", summaryFun=rowMedians,
           plotFun=smoothScatter, main="vs pseudo-median reference chip",
           pairs=FALSE, ...)

    ## S4 method for signature 'PLMset'
    MAplot(object, what=coefs, transfo=identity, groups,
           refSamples, which, pch=".", summaryFun=rowMedians,
           plotFun=smoothScatter, main="vs pseudo-median reference chip",
           pairs=FALSE, ...)

    ## S4 method for signature 'matrix'
    MAplot(object, what=identity, transfo=identity,
           groups, refSamples, which, pch=".", summaryFun=rowMedians,
           plotFun=smoothScatter, main="vs pseudo-median reference chip",
           pairs=FALSE, ...)

    ## S4 method for signature 'ExpressionSet'
    MAplot(object, what=exprs, transfo=identity,
           groups, refSamples, which, pch=".", summaryFun=rowMedians,
           plotFun=smoothScatter, main="vs pseudo-median reference chip",
           pairs=FALSE, ...)

Arguments

object      FeatureSet, PLMset or ExpressionSet object.
what        function to be applied on object that will extract the statistics of interest, from which log-ratios and average log-intensities will be computed.
transfo     function to transform the data prior to plotting.
groups      factor describing groups of samples that will be combined prior to plotting. If missing, MvA plots are done per sample.
refSamples  integers (indexing samples) to define which subjects will be used to compute the reference set. If missing, a pseudo-reference chip is estimated using summaryFun.
which       integer (indexing samples) describing which samples are to be plotted.
pch         same as pch in plot
summaryFun  function that operates on a matrix and returns a vector that will be used to summarize data belonging to the same group (or reference) on the computation of grouped-stats.
plotFun     function to be used for plotting. Usually smoothScatter, plot or points.
main        string to be used in title.
pairs       logical flag to determine if a matrix of MvA plots is to be generated
...         Other arguments to be passed downstream, like plot arguments.

Details

MAplot will take the following extra arguments:

1. subset: indices of elements to be plotted to reduce the impact of plotting hundreds of thousands of points (if pairs=FALSE only);
2. span: see loess;
3. family.loess: see loess;
4. addLoess: logical flag (default TRUE) to add a loess estimate;
5. parParams: list of params to be passed to par() (if pairs=TRUE only);

Value

Plot
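The quantities behind an MA plot against a pseudo-median reference can be sketched in base R. This is an illustration on simulated data and not the MAplot method itself: the matrix `intensities` and its sample names are hypothetical, and the row median across all samples stands in for the pseudo-reference chip that summaryFun=rowMedians would produce.

```r
set.seed(1)

# Simulated intensities (hypothetical): 100 probes x 3 samples.
intensities <- matrix(2^rnorm(300, mean = 9), nrow = 100, ncol = 3,
                      dimnames = list(NULL, c("s1", "s2", "s3")))

# Transform to the log2 scale, as transfo=log2 would for raw data.
log_int <- log2(intensities)

# Pseudo-median reference chip: the median of each probe across samples.
reference <- apply(log_int, 1, median)

# M: log-ratio of each sample to the reference.
# A: average log-intensity of each sample and the reference.
# The reference vector has one entry per row, so it recycles down each column.
M <- log_int - reference
A <- (log_int + reference) / 2

# Draw the plot for one sample only when running interactively.
if (interactive()) {
  plot(A[, "s1"], M[, "s1"], pch = ".", xlab = "A", ylab = "M",
       main = "s1 vs pseudo-median reference chip")
  abline(h = 0, lty = 2)
}
```

With an odd number of samples the per-probe median of M is exactly zero by construction, so systematic departures of one sample from the M = 0 line are what the plot is meant to expose.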


MAplot (continued)

Author(s)

Benilton Carvalho - based on Ben Bolstad's original MAplot function.

See Also

plot, smoothScatter

Examples

    if(require(oligoData) & require(pd.hg18.60mer.expr)){
    data(nimbleExpressionFS)
    nimbleExpressionFS
    groups


mm (continued)

Details

For all objects but TilingFeatureSet, these methods will return matrices. In case of TilingFeatureSet objects, the value is a 3-dimensional array (probes x samples x channels).

intensity will return the whole intensity matrix associated to the object. pm, mm, bg will return the respective PM/MM/BG matrix.

When applied to ExonFeatureSet or GeneFeatureSet objects, pm will return the PM matrix at the transcript level ('core' probes) by default. The user should set the target argument accordingly if something else is desired. The valid values are: 'probeset' (Exon and Gene arrays), 'core' (Exon and Gene arrays), 'full' (Exon arrays) and 'extended' (Exon arrays).

The target argument has no effect when used on designs other than Gene and Exon ST.

Examples

    if (require(maqcExpression4plex) & require(pd.hg18.60mer.expr)){
    xysPath


Examples

    ## How pm() works
    ## Not run:
    x


oligo-defunct (continued)

Arguments

...    Arguments.

Details

fitPLM was replaced by fitProbeLevelModel, allowing faster execution and providing more specific models. fitPLM was based on the code written by Ben Bolstad in the affyPLM package. However, all the model-fitting functions are now in the package preprocessCore, on which fitProbeLevelModel depends.

coefs and resids, like fitPLM, were inherited from the affyPLM package. They were replaced respectively by coef and residuals, because this is how these statistics are called everywhere else in R.


oligoPLM-class
Class "oligoPLM"

Description

A class to represent Probe Level Models.

Objects from the Class

Objects can be created by calls of the form fitProbeLevelModel(FeatureSetObject), where FeatureSetObject is an object obtained through read.celfiles or read.xysfiles, representing intensities observed for different probes (which are grouped in probesets or meta-probesets) across distinct samples.

Slots

chip.coefs: "matrix" with chip/sample effects - probeset-level
probe.coefs: "numeric" vector with probe effects
weights: "matrix" with weights - probe-level
residuals: "matrix" with residuals - probe-level
se.chip.coefs: "matrix" with standard errors for chip/sample coefficients
se.probe.coefs: "numeric" vector with standard errors for probe effects
residualSE: scale - residual standard error
geometry: array geometry used for plots
method: "character" string describing method used for PLM
manufacturer: "character" string with manufacturer name
annotation: "character" string with the name of the annotation package
narrays: "integer" describing the number of arrays
nprobes: "integer" describing the number of probes before summarization
nprobesets: "integer" describing the number of probesets after summarization


oligoPLM-class (continued)

Methods

annotation signature(object = "oligoPLM"): accessor/replacement method to annotation slot
boxplot signature(x = "oligoPLM"): boxplot method
coef signature(object = "oligoPLM"): accessor/replacement method to coef slot
coefs.probe signature(object = "oligoPLM"): accessor/replacement method to coefs.probe slot
geometry signature(object = "oligoPLM"): accessor/replacement method to geometry slot
image signature(x = "oligoPLM"): image method
manufacturer signature(object = "oligoPLM"): accessor/replacement method to manufacturer slot
method signature(object = "oligoPLM"): accessor/replacement method to method slot
ncol signature(x = "oligoPLM"): accessor/replacement method to ncol slot
nprobes signature(object = "oligoPLM"): accessor/replacement method to nprobes slot
nprobesets signature(object = "oligoPLM"): accessor/replacement method to nprobesets slot
residuals signature(object = "oligoPLM"): accessor/replacement method to residuals slot
residualSE signature(object = "oligoPLM"): accessor/replacement method to residualSE slot
se signature(object = "oligoPLM"): accessor/replacement method to se slot
se.probe signature(object = "oligoPLM"): accessor/replacement method to se.probe slot
show signature(object = "oligoPLM"): show method
weights signature(object = "oligoPLM"): accessor/replacement method to weights slot
NUSE signature(x = "oligoPLM"): Boxplot of Normalized Unscaled Standard Errors (NUSE) or NUSE values.
RLE signature(x = "oligoPLM"): Relative Log Expression boxplot or values.

Author(s)

This is a port from Ben Bolstad's work implemented in the affyPLM package. Problems with the implementation in oligo should be reported to the package's maintainer.

References

Bolstad, BM (2004) Low Level Analysis of High-density Oligonucleotide Array Data: Background, Normalization and Summarization. PhD Dissertation. University of California, Berkeley.

See Also

rma, summarize


oligoPLM-class (continued)

Examples

    ## TODO: review code and fix broken
    ## Not run:
    if (require(oligoData)){
    data(nimbleExpressionFS)
    fit


paCalls (continued)

2. alpha2: a significance threshold in (alpha1, 0.5);
3. tau: a small positive constant;
4. ignore.saturated: if TRUE, do the saturation correction described in the paper, with a saturation level of 46000;

This function performs the hypothesis test:

H0: median(Ri) = tau, corresponding to absence of transcript
H1: median(Ri) > tau, corresponding to presence of transcript

where Ri = (PMi - MMi) / (PMi + MMi) for each i a probe-pair in the probe-set represented by data.

The p-value that is returned estimates the usual quantity:

Pr(observing a more "present looking" probe-set than data | data is absent)

So that small p-values imply presence while large ones imply absence of transcript. The detection call is computed by thresholding the p-value as in:

call "P" if p-value < alpha1
call "M" if alpha1
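The test and thresholding described above can be sketched in base R for a single probeset. Several things here are assumptions rather than statements about paCalls: the text does not name the test, so a one-sided Wilcoxon signed-rank test (the choice made by the MAS5 detection algorithm) is used; the default values of tau, alpha1 and alpha2 are illustrative; the "M" and "A" rules follow the usual MAS5 convention; the saturation correction is omitted; and `detection_call` and the PM/MM vectors are hypothetical.

```r
# Hypothetical helper: detection p-value and call for one probeset.
# Assumes a one-sided Wilcoxon signed-rank test of median(R) = tau vs > tau.
detection_call <- function(pm, mm, tau = 0.015, alpha1 = 0.04, alpha2 = 0.06) {
  # Discrimination score for each probe pair.
  r <- (pm - mm) / (pm + mm)
  p <- wilcox.test(r, mu = tau, alternative = "greater", exact = FALSE)$p.value
  # Assumed thresholds: "P" below alpha1, "M" between alpha1 and alpha2, else "A".
  call <- if (p < alpha1) "P" else if (p < alpha2) "M" else "A"
  list(p.value = p, call = call)
}

# Made-up probeset where PM is clearly above MM in every pair.
pm_present <- c(1200, 1500, 900, 1800, 1100, 1300, 1600, 1400, 1000, 1700, 1250)
mm_present <- c(210, 190, 205, 220, 180, 200, 215, 195, 185, 225, 230)
present <- detection_call(pm_present, mm_present)

# Made-up probeset where PM and MM are nearly indistinguishable.
mm_absent <- rep(500, 11)
pm_absent <- mm_absent + c(-30, 25, -20, 15, -10, 5, -5, 10, -15, 20, -25)
absent <- detection_call(pm_absent, mm_absent)

print(c(present = present$call, absent = absent$call))
```

Because the score Ri is a ratio bounded between -1 and 1, the test is insensitive to the overall brightness of a probeset and responds only to how consistently PM exceeds MM.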


paCalls (continued)

    head(dabgP) ## for probe
    head(dabgPS) ## for probeset
    }
    ## End(Not run)


plotM-methods
Methods for Log-Ratio plotting

Description

The plotM methods are meant to plot log-ratios for different classes of data.

Methods

object = "SnpQSet", i = "character"    Plot log-ratio for SNP data for sample i.
object = "SnpQSet", i = "integer"      Plot log-ratio for SNP data for sample i.
object = "SnpQSet", i = "numeric"      Plot log-ratio for SNP data for sample i.
object = "TilingQSet", i = "missing"   Plot log-ratio for Tiling data for sample i.


pmAllele
Access the allele information for PM probes.

Description

Accessor to the allelic information for PM probes.

Usage

    pmAllele(object)

Arguments

object    SnpFeatureSet or PDInfo object.


pmFragmentLength
Access the fragment length for PM probes.

Description

Accessor to the fragment length for PM probes.

Usage

    pmFragmentLength(object, enzyme, type=c('snp', 'cn'))

Arguments

object  PDInfo or SnpFeatureSet object.
enzyme  Enzyme to be used for query. If missing, all enzymes are used.
type    Type of probes to be used: 'snp' for SNP probes; 'cn' for Copy Number probes.

Value

A list of length equal to the number of enzymes used for digestion. Each element of the list is a data.frame containing:

• row: the row used to link to the PM matrix;
• length: expected fragment length.

Note

There is not a 1:1 relationship between probes and expected fragment length. For one enzyme, a given probe may be associated to multiple fragment lengths. Therefore, the number of rows in the data.frame may not match the number of PM probes and the row column should be used to match the fragment length with the PM matrix.


pmPosition
Accessor to position information

Description

pmPosition will return the genomic position for the (PM) probes.

Usage

    pmPosition(object)
    pmOffset(object)


pmPosition (continued)

Arguments

object    AffySNPPDInfo, TilingFeatureSet or SnpCallSet object

Details

pmPosition will return genomic position for PM probes on a tiling array.

pmOffset will return the offset information for PM probes on SNP arrays.


pmStrand
Accessor to the strand information

Description

Returns the strand information for PM probes (0 - sense / 1 - antisense).

Usage

    pmStrand(object)

Arguments

object    AffySNPPDInfo or TilingFeatureSet object


probeNames
Accessor to feature names

Description

Accessors to featureset names.

Usage

    probeNames(object, subset = NULL, ...)
    probesetNames(object, ...)

Arguments

object  FeatureSet or DBPDInfo
subset  not implemented yet.
...     Arguments (like 'target') passed to downstream methods.

Value

probeNames returns a string with the probeset names for *each probe* on the array. probesetNames, on the other hand, returns the *unique probeset names*.


read.celfiles
Parser to CEL files

Description

Reads CEL files.

Usage

    read.celfiles(..., filenames, pkgname, phenoData, featureData,
                  experimentData, protocolData, notes, verbose=TRUE, sampleNames,
                  rm.mask=FALSE, rm.outliers=FALSE, rm.extra=FALSE, checkType=TRUE)

    read.celfiles2(channel1, channel2, pkgname, phenoData, featureData,
                   experimentData, protocolData, notes, verbose=TRUE, sampleNames,
                   rm.mask=FALSE, rm.outliers=FALSE, rm.extra=FALSE, checkType=TRUE)

Arguments

...             names of files to be read.
filenames       a character vector with the CEL filenames.
channel1        a character vector with the CEL filenames for the first 'channel' on a Tiling application
channel2        a character vector with the CEL filenames for the second 'channel' on a Tiling application
pkgname         alternative data package to be loaded.
phenoData       phenoData
featureData     featureData
experimentData  experimentData
protocolData    protocolData
notes           notes
verbose         logical
sampleNames     character vector with sample names (usually better descriptors than the filenames)
rm.mask         logical. Read masked?
rm.outliers     logical. Remove outliers?
rm.extra        logical. Remove extra?
checkType       logical. Check type of each file? This can be time consuming.


read.celfiles (continued)

Details

When using 'affyio' to read in CEL files, the user can read compressed CEL files (CEL.gz). Additionally, 'affyio' is much faster than 'affxparser'.

The function guesses which annotation package to use from the header of the CEL file. The user can also provide the name of the annotation package to be used (via the pkgname argument). If the annotation package cannot be loaded, the function returns an error. If the annotation package is not available from Bioconductor, one can use the pdInfoBuilder package to build one.

Value

ExpressionFeatureSet  if Expression arrays
ExonFeatureSet        if Exon arrays
SnpFeatureSet         if SNP arrays
TilingFeatureSet      if Tiling arrays

See Also

list.celfiles, read.xysfiles

Examples

    if(require(pd.mapping50k.xba240) & require(hapmap100kxba)){
    celPath


read.xysfiles

Arguments

    ...               file names
    filenames         character vector with filenames.
    channel1          a character vector with the XYS filenames for the first 'channel' on a Tiling application
    channel2          a character vector with the XYS filenames for the second 'channel' on a Tiling application
    pkgname           character vector with alternative PD Info package name
    phenoData         phenoData
    featureData       featureData
    experimentData    experimentData
    protocolData      protocolData
    notes             notes
    verbose           verbose
    sampleNames       character vector with sample names (usually better descriptors than the filenames)
    checkType         logical. Check type of each file? This can be time consuming.

Details

    The function will read the XYS files provided by NimbleGen Systems and return an object of class FeatureSet.

    The function guesses which annotation package to use from the header of the XYS file. The user can also provide the name of the annotation package to be used (via the pkgname argument). If the annotation package cannot be loaded, the function returns an error. If the annotation package is not available from Bioconductor, one can use the pdInfoBuilder package to build one.

Value

    ExpressionFeatureSet    if Expression arrays
    TilingFeatureSet        if Tiling arrays

See Also

    list.xysfiles, read.celfiles

Examples

    if (require(maqcExpression4plex) & require(pd.hg18.60mer.expr)){xysPath


readSummaries    Read summaries generated by crlmm

Description

    This function reads the different summaries generated by crlmm.

Usage

    readSummaries(type, tmpdir)

Arguments

    type      type of summary of character class: 'alleleA', 'alleleB', 'alleleA-sense', 'alleleA-antisense', 'alleleB-sense', 'alleleB-antisense', 'calls', 'llr', 'conf'.
    tmpdir    directory containing the output saved by crlmm

Details

    On the 50K and 250K arrays, given a SNP, there are probes on both strands (sense and antisense). For this reason, the options 'alleleA-sense', 'alleleA-antisense', 'alleleB-sense' and 'alleleB-antisense' should be used **only** with such arrays (XBA, HIND, NSP or STY).

    On the SNP 5.0 and SNP 6.0 platforms, this distinction does not exist in terms of algorithm (note that the actual strand could be queried from the annotation package). For these arrays, options 'alleleA', 'alleleB' are the ones to be used.

    The options calls, llr and conf will return, respectively, the CRLMM calls, log-likelihood ratios (for devel purpose **only**) and CRLMM confidence calls matrices.

Value

    Matrix with values of summaries.


rma-methods    RMA - Robust Multichip Average algorithm

Description

    Robust Multichip Average preprocessing methodology. This strategy allows background subtraction, quantile normalization and summarization (via median-polish).


Usage

    ## S4 method for signature ExonFeatureSet
    rma(object, background=TRUE, normalize=TRUE, subset=NULL, target="core")

    ## S4 method for signature HTAFeatureSet
    rma(object, background=TRUE, normalize=TRUE, subset=NULL, target="core")

    ## S4 method for signature ExpressionFeatureSet
    rma(object, background=TRUE, normalize=TRUE, subset=NULL)

    ## S4 method for signature GeneFeatureSet
    rma(object, background=TRUE, normalize=TRUE, subset=NULL, target="core")

    ## S4 method for signature SnpCnvFeatureSet
    rma(object, background=TRUE, normalize=TRUE, subset=NULL)

Arguments

    object        Exon/HTA/Expression/Gene/SnpCnv-FeatureSet object.
    background    Logical - perform RMA background correction?
    normalize     Logical - perform quantile normalization?
    subset        To be implemented.
    target        Level of summarization (only for Exon/Gene arrays)

Methods

    signature(object = "ExonFeatureSet")
        When applied to an ExonFeatureSet object, rma can produce summaries at different levels: probeset (as defined in the PGF), core genes (as defined in the core.mps file), full genes (as defined in the full.mps file) or extended genes (as defined in the extended.mps file). To determine the level for summarization, use the target argument.

    signature(object = "ExpressionFeatureSet")
        When used on an ExpressionFeatureSet object, rma produces summaries at the probeset level (as defined in the CDF or NDF files, depending on the manufacturer).

    signature(object = "GeneFeatureSet")
        When applied to a GeneFeatureSet object, rma can produce summaries at different levels: probeset (as defined in the PGF) and 'core genes' (as defined in the core.mps file). To determine the level for summarization, use the target argument.

    signature(object = "HTAFeatureSet")
        When applied to a HTAFeatureSet object, rma can produce summaries at different levels: probeset (as defined in the PGF) and 'core genes' (as defined in the core.mps file). To determine the level for summarization, use the target argument.

    signature(object = "SnpCnvFeatureSet")
        If used on a SnpCnvFeatureSet object (i.e., SNP 5.0 or SNP 6.0 arrays), rma will produce summaries for the CNV probes. Note that this is an experimental feature for internal (and quick) assessment of CNV probes. We recommend the use of the 'crlmm' package, which contains a Copy Number tool specifically designed for these data.
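The normalization and summarization steps named in the rma Description can be illustrated with a minimal base-R sketch on a toy matrix. This is not the code used by oligo: the helper names quantile_normalize and summarize_probeset are hypothetical, the background-correction step is omitted, and all rows are assumed to belong to a single probeset.

```r
# Toy PM intensity matrix: 6 probes (rows) x 3 arrays (columns).
set.seed(1)
pm <- matrix(2^rnorm(18, mean = 8), nrow = 6,
             dimnames = list(paste0("probe", 1:6), paste0("array", 1:3)))

# Quantile normalization (hypothetical helper): give every array the same
# empirical distribution, namely the mean of the sorted columns.
quantile_normalize <- function(x) {
  target <- rowMeans(apply(x, 2, sort))
  out <- apply(x, 2, function(col) target[rank(col, ties.method = "first")])
  dimnames(out) <- dimnames(x)
  out
}

# Median-polish summarization (hypothetical helper) of one probeset on the
# log2 scale: the per-array summary is the overall effect plus column effect.
summarize_probeset <- function(x) {
  fit <- medpolish(log2(x), trace.iter = FALSE)
  fit$overall + fit$col
}

normalized <- quantile_normalize(pm)
expr <- summarize_probeset(normalized)
expr  # one log2-scale value per array
```

After normalization every column holds the same set of values in a different order, which is what makes arrays comparable before the probe-level values are collapsed into one summary per array.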


References

    Rafael A. Irizarry, Benjamin M. Bolstad, Francois Collin, Leslie M. Cope, Bridget Hobbs and Terence P. Speed (2003), Summaries of Affymetrix GeneChip probe level data. Nucleic Acids Research 31(4):e15

    Bolstad, B.M., Irizarry R. A., Astrand M., and Speed, T.P. (2003), A Comparison of Normalization Methods for High Density Oligonucleotide Array Data Based on Bias and Variance. Bioinformatics 19(2):185-193

    Irizarry, RA, Hobbs, B, Collin, F, Beazer-Barclay, YD, Antonellis, KJ, Scherf, U, Speed, TP (2003) Exploration, Normalization, and Summaries of High Density Oligonucleotide Array Probe Level Data. Biostatistics. Vol. 4, Number 2: 249-264

See Also

    snprma

Examples

    if (require(maqcExpression4plex) & require(pd.hg18.60mer.expr)){xysPath


sequenceDesignMatrix    Create design matrix for sequences

Description

    Creates design matrix for sequences.

Usage

    sequenceDesignMatrix(seqs)

Arguments

    seqs    character vector of 25-mers.

Details

    This assumes all sequences are 25bp long.

    The design matrix is often used when the objective is to adjust intensities by sequence.

Value

    Matrix with length(seqs) rows and 75 columns.

Examples

    genSequence
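The 75 columns in the Value above are consistent with 25 positions times 3 indicator columns per position. The base-R sketch below builds a matrix of that shape under one assumed encoding: indicators for A, C and G at each position, with T as the reference level. The helper toy_design_matrix is hypothetical, and the column layout produced by sequenceDesignMatrix itself may differ.

```r
# Hypothetical helper (not part of oligo): one 0/1 indicator per position
# for each of the bases A, C and G; T is the assumed reference level.
toy_design_matrix <- function(seqs, bases = c("A", "C", "G")) {
  stopifnot(all(nchar(seqs) == 25))
  chars <- do.call(rbind, strsplit(seqs, ""))      # length(seqs) x 25
  blocks <- lapply(bases, function(b) (chars == b) * 1L)
  out <- do.call(cbind, blocks)                    # length(seqs) x 75
  colnames(out) <- paste0(rep(bases, each = 25), "_",
                          rep(1:25, times = length(bases)))
  out
}

# Four random 25-mers as toy input.
set.seed(1)
seqs <- replicate(4, paste(sample(c("A", "C", "G", "T"), 25, replace = TRUE),
                           collapse = ""))
X <- toy_design_matrix(seqs)
dim(X)  # length(seqs) rows, 75 columns
```

Dropping one base as the reference keeps the columns from being linearly dependent when the matrix is used as a regression design for sequence-based intensity adjustment.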


snprma

Arguments

    object               SnpFeatureSet object
    verbose              Verbosity flag. logical
    normalizeToHapmap    internal

Value

    A SnpQSet object.


summarize    Tools for microarray preprocessing.

Description

    These are tools to preprocess microarray data. They include background correction, normalization and summarization methods.

Usage

    backgroundCorrectionMethods()
    normalizationMethods()
    summarizationMethods()
    backgroundCorrect(object, method=backgroundCorrectionMethods(), copy=TRUE, extra, subset=NULL, target
    summarize(object, probes=rownames(object), method="medianpolish", verbose=TRUE, ...)
    normalize(object, method=normalizationMethods(), copy=TRUE, subset=NULL,target=core, verbose=TRUE, ..
    normalizeToTarget(object, targetDist, method="quantile", copy=TRUE, verbose=TRUE)

Arguments

    object        Object containing probe intensities to be preprocessed.
    method        String determining which method to use at that preprocessing step.
    targetDist    Vector with the target distribution
    probes        Character vector that identifies the name of the probes represented by the rows of object.
    copy          Logical flag determining if data must be copied before processing (TRUE), or if data can be overwritten (FALSE).
    subset        Not yet implemented.
    target        One of the following values: 'core', 'full', 'extended', 'probeset'. Used only with Gene ST and Exon ST designs.
    extra         Extra arguments to be passed to other methods.
    verbose       Logical flag for verbosity.
    ...           Arguments to be passed to methods.


Details

    Number of rows of object must match the length of probes.

Value

    backgroundCorrectionMethods and normalizationMethods will return a character vector with the methods implemented currently.

    backgroundCorrect, normalize and normalizeToTarget will return a matrix with same dimensions as the input matrix. If they are applied to a FeatureSet object, the PM matrix will be used as input.

    The summarize method will return a matrix with length(unique(probes)) rows and ncol(object) columns.

Examples

    ns
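The shape rule stated in the Value section (length(unique(probes)) rows, ncol(object) columns) can be checked with a small base-R stand-in. The function toy_summarize below is hypothetical and is not the implementation behind summarize; it assumes the input is already on the log2 scale and applies stats::medpolish within each probeset.

```r
# Toy log2-scale matrix: 7 probes (rows) x 4 samples (columns).
set.seed(2)
x <- matrix(rnorm(28, mean = 8), nrow = 7,
            dimnames = list(NULL, paste0("sample", 1:4)))

# Probeset label of every row; length(probes) must equal nrow(x).
probes <- c("psA", "psA", "psA", "psB", "psB", "psC", "psC")
stopifnot(length(probes) == nrow(x))

# Hypothetical stand-in for summarize(): median polish within each probeset,
# keeping overall + column effects as the per-sample summary.
toy_summarize <- function(x, probes) {
  groups <- split(seq_len(nrow(x)), probes)
  out <- t(vapply(groups, function(idx) {
    fit <- medpolish(x[idx, , drop = FALSE], trace.iter = FALSE)
    fit$overall + fit$col
  }, numeric(ncol(x))))
  colnames(out) <- colnames(x)
  out
}

s <- toy_summarize(x, probes)
dim(s)  # length(unique(probes)) rows, ncol(x) columns
```

Passing the probeset label of every row through probes is what lets a plain matrix, which carries no design information of its own, be collapsed to one row per probeset.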


