12.07.2015 Views

Analysis of microarray data - VSN International



Analysis of Microarray Data in GenStat

Empirical Bayes error estimation

With the large number of genes analysed in parallel on the same series of slides, the variation in the results for each gene may be thought of as coming from a common error distribution. If all the results were generated from a normal error process, we would expect the variances (squared standard deviations) of the genes to follow a scaled chi-squared distribution. If this were the case, considerable extra power could be obtained by modelling the genes together, borrowing information from the whole distribution of standard deviations.

The empirical Bayes error estimation does this by modelling the distribution of the standard deviations of the results over all probes. The distribution of standard deviations has two components: a single common standard deviation of the uniform error process operating on all genes, and a specific component of variance unique to each gene. A prior distribution for the standard deviations or, equivalently, the variances is assumed. In this approach, the reciprocal of the variance is assumed to be distributed as a multiple of a chi-squared distribution with $d_0$ degrees of freedom, i.e.

$$\frac{1}{s_p^2} \sim \frac{1}{d_0 s_0^2}\,\chi^2_{d_0}$$

If the parameters of this distribution, the prior degrees of freedom $d_0$ and the prior standard deviation $s_0$, are estimated, more information can be gained on an individual probe by shrinking its standard deviation towards the prior standard deviation $s_0$. The relative amount of information in the prior and individual standard deviations of a probe ($s_0$ and $s_p$ respectively) is specified by their degrees of freedom, $d_0$ and $d_p$.
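The two-level error model described above can be illustrated with a small simulation. This is only a sketch, not GenStat code: it assumes the prior applies to the underlying variance of each probe, and the function names and the values chosen for `d0`, `s0` and `dp` are hypothetical.

```python
import random
import statistics


def sample_prior_variance(d0, s0, rng):
    """Draw one underlying variance with 1/variance ~ chi2(d0) / (d0 * s0^2)."""
    # A chi-squared variate with d0 df is a gamma variate with shape d0/2, scale 2.
    chi2 = rng.gammavariate(d0 / 2.0, 2.0)
    return d0 * s0 ** 2 / chi2


def sample_probe_variance(variance, dp, rng):
    """Draw an observed probe variance on dp df: variance * chi2(dp) / dp."""
    return variance * rng.gammavariate(dp / 2.0, 2.0) / dp


rng = random.Random(1)          # fixed seed so the run is repeatable
d0, s0, dp = 4.0, 0.3, 3.0      # hypothetical prior df, prior sd, probe df

true_vars = [sample_prior_variance(d0, s0, rng) for _ in range(5000)]
obs_vars = [sample_probe_variance(v, dp, rng) for v in true_vars]

# The observed variances are noisier than the underlying ones, which is
# why borrowing information across probes can help.
print("median underlying sd:", statistics.median(true_vars) ** 0.5)
print("median observed sd:  ", statistics.median(obs_vars) ** 0.5)
```

With few degrees of freedom per probe, the observed standard deviations scatter widely around the underlying values, which is the situation the shrinkage is meant to address.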
The modified standard deviation, $\tilde{s}_p$, is then obtained from the weighted average of $s_0^2$ and $s_p^2$, with weights given by their degrees of freedom:

$$\tilde{s}_p = \sqrt{\frac{d_0 s_0^2 + d_p s_p^2}{d_0 + d_p}}$$

A modified t-test can then be performed using the modified standard deviation with $d_0 + d_p$ degrees of freedom. The method can also produce the p-values from a test of the differential expression being different from zero.

Using the estimates from the 13-6 to 13-9 series (saved in 13-6to9_Estimates), we can create modified t-statistics and p-values for the contrast effects of DM vs. Control. Opening the menu Stats | Microarrays | Analyse | Empirical Bayes error estimation gives the window to the right, once the fields are filled in with the appropriate column names. The Data Type dropdown list allows the data to be given in 3 formats: means (as in this example), T values, or a Pointer to a set of columns, in which case the means and standard deviations are calculated for each row over the set of columns. (Note: a pointer is a GenStat structure that specifies a list of data structures to be treated as a group; it can be defined using the View | Data View menu.) The result columns are specified in the Save section, with the option of adding these back to the source spreadsheet if it is still open.

Clicking the Options button opens the dialog to the right, which allows you to specify which output is printed in the Output window, which graphs are plotted, and the nature of the t-test performed (two-sided or either of the one-sided tests). Here, a two-sided test is used, the results are printed to the Output window, and only the histograms of the t-values before and after adjustment using the estimated prior parameters are plotted. Clicking the Run button then produces the output and graphs.
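Independently of the GenStat menu, the modified standard deviation and t-statistic described above can be checked by hand. The sketch below is not the GenStat implementation: the function names are hypothetical, the prior parameters are assumed to have been estimated already, and `unscaled_se` is an assumed factor that turns a standard deviation into the standard error of the contrast estimate.

```python
import math


def moderated_sd(s_p, d_p, s_0, d_0):
    """Shrink a probe sd towards the prior sd, weighting variances by their df."""
    return math.sqrt((d_0 * s_0 ** 2 + d_p * s_p ** 2) / (d_0 + d_p))


def moderated_t(estimate, s_p, d_p, s_0, d_0, unscaled_se=1.0):
    """Return the modified t-statistic and its degrees of freedom (d_0 + d_p).

    unscaled_se is a hypothetical multiplier: the standard error of the
    estimate is taken to be (standard deviation * unscaled_se).
    """
    s_tilde = moderated_sd(s_p, d_p, s_0, d_0)
    return estimate / (s_tilde * unscaled_se), d_0 + d_p


# Hypothetical probe: a small observed sd on few degrees of freedom.
estimate, s_p, d_p = 0.5, 0.1, 3
s_0, d_0 = 0.3, 4               # assumed prior estimates
unscaled_se = 0.5

ordinary_t = estimate / (s_p * unscaled_se)
t_mod, df_mod = moderated_t(estimate, s_p, d_p, s_0, d_0, unscaled_se)

print("ordinary t:", ordinary_t, "on", d_p, "df")
print("modified t:", t_mod, "on", df_mod, "df")
```

Because the probe's standard deviation is well below the prior value, the modified standard deviation is pulled upwards and the modified t-statistic is smaller than the ordinary one, but it is tested on more degrees of freedom. A two-sided p-value could then be obtained from a t distribution with `df_mod` degrees of freedom, for example with `scipy.stats.t.sf` if SciPy is available.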
