11.03.2014 Views

a Whole Genome Array Approach - Jacobs University


Background

1.6 Microarray data analysis

Data analysis is an essential step in DNA microarray experiments, since these experiments
normally produce a large amount of information. The data must be adequately processed to
find statistically significant correlations (e.g. co-regulation of genes) within and between
different arrays (FIG 4).

First, the hybridisation signal intensities must be filtered and normalised. This transformation
minimises the bias arising from unequal quantities of starting RNA, differences in labelling or
detection efficiencies of the fluorescent dyes applied, and other systematic biases
(Quackenbush 2002). For an overview of different normalisation methods see Foster and
Ghazal (2003). In the next step, data mining techniques are required to answer the biological
question behind the experiment. Normally, microarray experiments are conducted to identify
genes which are either under- or over-expressed after a shift in the experimental conditions.
For example, we might be interested in genes that show elevated expression because of a drug
treatment. Such genes are most easily found by simple filtering. If the log-transformed data
(method) are used for filtering, differentially expressed genes are inferred by a fixed threshold
cut-off (e.g. a two-fold increase or decrease). Filtering by absolute expression change can
even be used for experiments without replicates.
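The normalisation and threshold filtering described above can be illustrated with a minimal sketch. It is not the procedure used in this work: the gene names and intensities are invented, median centring of the log-ratios stands in for the many normalisation methods available, and the helper names (`log2_ratios`, `median_centre`, `fold_change_filter`) are hypothetical.

```python
import math
from statistics import median


def log2_ratios(treated, control):
    """Return log2(treated / control) for each gene (intensities must be > 0)."""
    return {gene: math.log2(treated[gene] / control[gene]) for gene in treated}


def median_centre(ratios):
    """Simple global normalisation: shift the log-ratios so their median is zero."""
    m = median(ratios.values())
    return {gene: r - m for gene, r in ratios.items()}


def fold_change_filter(ratios, fold=2.0):
    """Return (up, down) gene lists whose change meets the fold threshold."""
    cutoff = math.log2(fold)  # a two-fold change is 1.0 on the log2 scale
    up = sorted(g for g, r in ratios.items() if r >= cutoff)
    down = sorted(g for g, r in ratios.items() if r <= -cutoff)
    return up, down


# Invented single-array intensities (no replicates). The treated channel is
# assumed to be systematically twice as bright, e.g. through a dye bias.
control = {"geneA": 100.0, "geneB": 200.0, "geneC": 400.0, "geneD": 800.0, "geneE": 50.0}
treated = {"geneA": 800.0, "geneB": 400.0, "geneC": 800.0, "geneD": 200.0, "geneE": 100.0}

normalised = median_centre(log2_ratios(treated, control))
up, down = fold_change_filter(normalised, fold=2.0)
print("up:", up, "down:", down)
```

Without the centring step, the three genes that merely carry the assumed two-fold channel bias would also pass the threshold, which is why normalisation precedes filtering.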

However, ranking methods are also available [t-test (Pan 2002), ANOVA (Kerr et al. 2000),
Bayesian methods or the Mann-Whitney test]. All these methods produce errors (false positives
and false negatives); therefore differential gene expression is usually confirmed by RT-PCR
or northern blots (Leung and Cavalieri 2003). If co-regulation of genes (or related arrays) is
of interest, various clustering techniques should be considered. The basic concept of
clustering is to identify and group together similarly expressed genes and to correlate the
observations with biology. The idea is that co-regulated and functionally related genes are
grouped into clusters. Frequently used grouping techniques are hierarchical clustering (Eisen
et al. 1998), k-means clustering (Soukas et al. 2000), self-organising maps (SOMs) (Kohonen
1992) and principal component analysis (PCA) (Raychaudhuri 2000) (for reviews of methods
see Quackenbush 2002; Gollub and Sherlock 2006). No single clustering method can be
applied to all kinds of experiments. Different clustering methods used on the same data set
can reveal unique aspects of the data (Leung and Cavalieri 2003). It is therefore advisable to
analyse the data using several methods rather than just one (Leung 2002).
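The idea of grouping similarly expressed genes can be sketched with a small agglomerative (hierarchical) clustering routine. This is only an illustration under stated assumptions: the expression profiles are invented, the distance is taken as 1 minus the Pearson correlation, average linkage is one common choice among several, and the function names `pearson` and `average_linkage` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long profiles (neither may be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_linkage(profiles, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain.

    Distance between genes: 1 - Pearson r.
    Distance between clusters: mean of all pairwise gene distances.
    """
    genes = list(profiles)
    dist = {(a, b): 1.0 - pearson(profiles[a], profiles[b])
            for a in genes for b in genes if a != b}
    clusters = [[g] for g in genes]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pairs = [(a, b) for a in clusters[i] for b in clusters[j]]
                d = sum(dist[p] for p in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Invented log-ratio profiles over four time points.
profiles = {
    "geneA": [0.1, 1.0, 2.1, 3.0],
    "geneB": [0.0, 0.9, 1.8, 3.2],
    "geneC": [3.0, 2.0, 1.1, 0.0],
    "geneD": [2.8, 2.1, 0.9, 0.2],
    "geneE": [0.2, 1.1, 2.0, 2.9],
}

for cluster in average_linkage(profiles, n_clusters=2):
    print(cluster)
```

With these invented profiles, the genes whose expression rises over time end up in one cluster and those whose expression falls in the other. A different distance measure or linkage rule may group real data differently, which is in line with the advice to compare several methods.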
