12.07.2015

Analysis of microarray data - VSN International



differentially expressed probes from the rest. There are two methods of estimating the FDR, one based on a mixture model and the other on a non-parametric method.

False Discovery Rate using Bonferroni Method

This menu can be used to estimate false discovery rates (defined in the table below) using a Bonferroni-type procedure. This is a non-parametric approach in which, for each value of lambda, the observed proportion of the sample that is not differentially expressed (π₀) is calculated. The procedure offers two methods of obtaining an overall estimate of π₀. The first uses bootstrapping to choose the value of π₀ that minimises the mean squared error; the second uses a spline smoother to smooth the values of π₀ around the maximum value of lambda. Unadjusted q-values are then calculated from the estimate of π₀ as π₀ × p / (proportion of tests ≤ p), where p is the test probability, for each test value. The Bonferroni q-values are then defined as the minimum of the unadjusted q-values at or above each test value, obtained by stepping down through the sorted p-values.

The following table defines some random variables related to m hypothesis tests:

Significance test            # declared non-significant   # declared significant   Total
# true null hypotheses       U                            V                        m₀
# non-true null hypotheses   T                            S                        m₁ = m − m₀
Total                        W = m − R                    R                        m

m₀ is the number of true null hypotheses.
m − m₀ is the number of false null hypotheses.
U is the number of true negatives.
V is the number of false positives.
T is the number of false negatives.
S is the number of true positives.
H₁ ... Hₘ are the null hypotheses being tested.

In m hypothesis tests of which m₀ are true null hypotheses, R is an observable random variable, and S, T, U and V are unobservable random variables. The proportion of tests that are truly null, π₀, is m₀ divided by m. The false discovery rate (FDR), also known as the q-value of a test, is a commonly used error measure in multiple hypothesis testing. It is defined as FDR = E(V/R | R > 0) × Pr(R > 0), i.e. the expected proportion of false positive findings among all the rejected hypotheses, multiplied by the probability of making at least one rejection; the FDR is zero when R = 0. Similarly, the false rejection rate (FRR) is defined as FRR = E(T/W | W > 0) × Pr(W > 0), i.e. the expected proportion of false negative findings among all the accepted hypotheses, multiplied by the probability of accepting at least one test. We also define the power to be equal to E(S/m₁ | m₁ > 0) × Pr(m₁ > 0).

Opening the Stats | Microarray | Analyse | False Discovery Rate by Bonferroni menu gives the menu shown to the right. Using the F probability values in the file ‘Hyb-ANOVA.gwb’ from the Empirical Bayes section above, we can obtain the estimated false discovery rates for FProb[1].
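The q-value calculation described above can be illustrated with a short Python sketch. This is not GenStat code and does not reproduce the menu's bootstrap or spline estimate of π₀: it assumes a single fixed lambda (0.5 here) as a stand-in, and uses the usual Storey-type form in which the unadjusted q-value is π₀ × p divided by the proportion of tests with p-values no larger than p. The function names `estimate_pi0` and `q_values` and the example p-values are hypothetical.

```python
def estimate_pi0(p_values, lam=0.5):
    """Estimate pi0 at one value of lambda (an assumed simplification).

    Uses the proportion of p-values above lam, rescaled by (1 - lam),
    and caps the result at 1 because pi0 is a proportion.
    """
    m = len(p_values)
    above = sum(1 for p in p_values if p > lam)
    return min(1.0, above / (m * (1.0 - lam)))


def q_values(p_values, pi0):
    """Return step-down q-values in the same order as p_values."""
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Unadjusted q: pi0 * p / (proportion of tests with p-value <= p).
    # For sorted p-values that proportion is rank / m (ties are not pooled).
    unadjusted = [pi0 * p_values[i] * m / rank
                  for rank, i in enumerate(order, start=1)]
    # Step down from the largest p-value, keeping the running minimum,
    # so each q-value is the smallest unadjusted q at or above that test.
    q = [0.0] * m
    running_min = 1.0
    for pos in range(m - 1, -1, -1):
        running_min = min(running_min, unadjusted[pos])
        q[order[pos]] = running_min
    return q


# Made-up p-values for illustration only (not taken from Hyb-ANOVA.gwb).
example_p = [0.001, 0.005, 0.01, 0.02, 0.03, 0.2, 0.6, 0.7, 0.8, 0.9]
example_pi0 = estimate_pi0(example_p, lam=0.5)
example_q = q_values(example_p, example_pi0)
for p, q in zip(example_p, example_q):
    print(f"p = {p:.3f}  q = {q:.4f}")
```

The step-down pass is what makes the q-values non-decreasing in p: without it, a test with a slightly larger p-value could receive a smaller unadjusted q-value than its neighbour.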
