27.12.2014 Views

4 - Central Institute of Brackishwater Aquaculture



National Workshop-cum-Training on Bioinformatics and Information Management in Aquaculture

For example, suppose two replicates (A and B) have intensity ratios of 2 and 0.5 after normalization. The mean of these replicates is 1.25 instead of the expected 1, since a two-fold increase and a two-fold decrease should cancel out. If the base-2 logarithmic transformation is applied, the log ratios are 1 and -1. The mean of these log ratios is 0, which corresponds to a mean intensity ratio of 1. Although log-transformation is not always the best choice for microarray data, it is used because the other transformations lack this handy additive property. One downside of the log-transformation is that it introduces systematic errors in the lower end of the expression value distribution.
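The arithmetic behind this example can be checked with a short Python sketch. The replicate values used here (2 and 0.5) are an assumption: they are the only pair of ratios consistent with the means quoted in the text (1.25 before transformation, 0 after), and which value belongs to replicate A or B is arbitrary.

```python
import math
from statistics import mean

# Assumed replicate ratios: 2 and 0.5 are the only pair whose raw mean is
# 1.25 and whose log2 mean is 0, as quoted in the text. The A/B labels
# are arbitrary.
ratios = {"A": 2.0, "B": 0.5}

# Averaging the raw ratios overweights the up-regulated replicate.
raw_mean = mean(list(ratios.values()))

# After a base-2 log transform, two-fold up and two-fold down are symmetric.
log_ratios = {name: math.log2(r) for name, r in ratios.items()}
log_mean = mean(list(log_ratios.values()))

# Transforming the log-scale mean back gives the ratio it corresponds to.
back_transformed = 2 ** log_mean

print("mean of raw ratios:", raw_mean)
print("log2 ratios:", log_ratios)
print("mean of log2 ratios:", log_mean)
print("corresponding ratio:", back_transformed)
```

Averaging on the log scale and transforming back is equivalent to taking the geometric mean of the ratios, which is why the fold changes cancel.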

3.2.3. Replicates: If the experiment includes replicates, their quality can be checked with simple methods, using scatter plots and pairwise correlations or hierarchical clustering techniques. When the distribution of the intensity values is skewed, the median characterizes the central tendency better than the mean. The median and mean can also be used to check the skewness of the distribution. For symmetrical distributions, the mean and median are approximately equal.
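As a rough illustration of these checks, the sketch below computes pairwise Pearson correlations between replicates and compares the mean with the median of a skewed sample. All intensity values are invented for the example, and the helper name `pearson` is hypothetical. It shows one simple way to run the comparison, not a prescribed procedure.

```python
import math
from statistics import mean, median

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical log2 intensities for the same five spots on three replicate chips.
replicates = {
    "rep1": [8.1, 9.4, 10.2, 12.0, 13.5],
    "rep2": [8.0, 9.6, 10.1, 12.2, 13.3],
    "rep3": [11.0, 8.2, 13.1, 9.0, 10.4],  # deliberately discordant chip
}

# Pairwise correlations: a replicate that correlates poorly with all the
# others is a candidate for closer inspection.
names = list(replicates)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        r = pearson(replicates[a], replicates[b])
        print(f"{a} vs {b}: r = {r:.3f}")

# Skewness check: hypothetical raw intensities with a long right tail.
raw = [120, 150, 180, 200, 240, 300, 5200]
print("mean:", round(mean(raw), 1), "median:", median(raw))
```

In this made-up data the first two replicates agree closely while the third does not, and the single very bright spot pulls the mean far above the median, which signals a right-skewed distribution.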

3.2.4. Outliers and Filtering Bad & Uninteresting Data: Outliers in chip experiments can occur at several levels. An entire chip may deviate from all the other replicates, or an individual gene may deviate from the other replicates of the same gene. Outliers should consist mainly of quantification errors. In practice, it is often not easy to distinguish quantification errors from true data, especially if there are no replicate measurements. If the expression ratio is very low (quantification errors) or very high (spot intensity saturation), the result can be assumed to be an artifact and should be removed. Most of the actual outliers, namely those with too low intensity values, should already be removed at the filtering step. Outlier removal is therefore often equivalent to filtering, in which observations with too low or too high intensity values are excluded from further analyses.
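A minimal sketch of such an intensity filter follows. The spot records, the cutoff values `LOW` and `HIGH`, and the helper name `keep` are all assumptions made for illustration. Suitable thresholds depend on the scanner and on the background level of the experiment.

```python
# Hypothetical spot records: (gene id, red intensity, green intensity).
spots = [
    ("gene1", 1500.0, 1400.0),
    ("gene2", 35.0, 900.0),        # red channel close to background
    ("gene3", 65535.0, 20000.0),   # red channel at the 16-bit maximum
    ("gene4", 800.0, 2400.0),
]

LOW = 100.0     # assumed lower cutoff, not taken from the text
HIGH = 60000.0  # assumed upper cutoff, not taken from the text

def keep(spot, low=LOW, high=HIGH):
    """Return True if both channel intensities lie within [low, high]."""
    _, red, green = spot
    return low <= red <= high and low <= green <= high

filtered = [s for s in spots if keep(s)]
removed = [s[0] for s in spots if not keep(s)]

print("kept:", [s[0] for s in filtered])
print("removed:", removed)
```

Filtering on both channels, instead of on the ratio alone, removes spots whose ratio is extreme only because one channel is unreliable.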

3.2.5. Linearity: Linearity means that in the scatter plot of channel 1 (red colour) versus channel 2 (green colour), the relationship between the channels is linear. It is often more informative to produce a scatter plot of the log-transformed intensities, because then the lowest intensities are better represented in the plot. In this kind of plot, the data points fit a straight line, if
