17.11.2012 Views

Codon Evolution Mechanisms and Models


[Figure 13.1 plot. Panel (a): normalized mean against GC content (0 to 100). Panel (b): coefficient of variation against GC content (0 to 100). Series: CAI, Fop, CBI, Nc.]

Figure 13.1 The dependence of indices on the GC content is shown for the indices CAI (circle), Fop (triangle), CBI (plus), and Nc (cross). (a) The normalized mean values for each index: the total mean is subtracted from the mean value of the index for the samples at X% GC, and the difference is divided by the total sample standard deviation ((x − x̄)/s_x). (b) The coefficient of variation (CV) for each index, which is the sample standard deviation divided by the sample mean (s_x/x̄).

[Figure 13.2 plot. Panel (a): normalized mean against length (0 to 500). Panel (b): coefficient of variation against length (0 to 500). Series: CAI, Fop, CBI, Nc.]

Figure 13.2 Length-dependence of codon indices: (a) the normalized mean of the indices at increasing gene length (number of codons); (b) the coefficient of variation. From the plots it can be seen that at short gene lengths the variance is higher and the estimates tend to deviate from the expected value (based on random sequences).

shows for each index the values of the coefficient of variation (CV), the sample standard deviation divided by the sample mean (s_x/x̄). The CV provides a way to compare the variation irrespective of the value of the mean. The variation is not affected by GC content, although CBI has a much larger variance than the other indices.
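The two summary statistics plotted in Figure 13.1 can be sketched in a few lines of Python. This is only an illustrative sketch following the formulas (x − x̄)/s_x and s_x/x̄ given in the text; the function names and the idea of passing the index values as a dictionary keyed by condition (for example GC percentage) are assumptions made here, not the authors' code.

```python
from statistics import mean, stdev


def normalized_means(groups):
    """Return (group mean - total mean) / total sample standard deviation.

    `groups` is a hypothetical input layout: a dict mapping a condition
    (e.g. a GC percentage) to the list of index values observed there.
    """
    all_values = [v for values in groups.values() for v in values]
    total_mean = mean(all_values)
    total_sd = stdev(all_values)  # sample standard deviation (n - 1)
    return {
        key: (mean(values) - total_mean) / total_sd
        for key, values in groups.items()
    }


def coefficient_of_variation(values):
    """Sample standard deviation divided by the sample mean (s_x / x-bar)."""
    return stdev(values) / mean(values)
```

Because the CV is scaled by the mean, it allows the spread of indices with very different ranges (such as Nc and CAI) to be compared on one axis.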

13.6.2 Dependence on gene length

To examine the dependency caused by differing gene lengths we simulate sequences of different lengths with a fixed codon distribution (using that of E. coli). Figure 13.2 shows the dependence of CAI, Fop, CBI, and Nc (a) and their coefficients of variation (b) on sequence length. Clearly the variation is higher for shorter sequences. This undesirable behaviour is due to stochastic sampling effects, and many authors advise against using sequences shorter than 100 amino acids.
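The sampling effect described above can be reproduced with a toy simulation. The sketch below is not the simulation used for Figure 13.2: the codon weights are made-up values for a single amino acid rather than the E. coli codon table, and `toy_fop` is a simplified stand-in for Fop that assumes one "optimal" codon. It only illustrates that, with a fixed codon distribution, the coefficient of variation of a frequency-based index shrinks as the sequence gets longer.

```python
import random
from statistics import mean, stdev

# Hypothetical codon usage for illustration only (NOT the E. coli table).
# Six synonymous leucine codons with made-up relative frequencies.
CODON_WEIGHTS = {
    "CTG": 0.50, "TTA": 0.13, "TTG": 0.13,
    "CTT": 0.10, "CTC": 0.10, "CTA": 0.04,
}
OPTIMAL = {"CTG"}  # assumed optimal codon for the toy index


def simulate_gene(length, rng):
    """Draw `length` codons independently from the fixed distribution."""
    codons = list(CODON_WEIGHTS)
    weights = list(CODON_WEIGHTS.values())
    return rng.choices(codons, weights=weights, k=length)


def toy_fop(gene):
    """Fraction of codons in the gene that belong to the assumed optimal set."""
    return sum(codon in OPTIMAL for codon in gene) / len(gene)


def cv_by_length(lengths, replicates=2000, seed=1):
    """Coefficient of variation of the toy index at each gene length."""
    rng = random.Random(seed)
    result = {}
    for n in lengths:
        values = [toy_fop(simulate_gene(n, rng)) for _ in range(replicates)]
        result[n] = stdev(values) / mean(values)
    return result


cv = cv_by_length([20, 100, 500])
for n, value in cv.items():
    print(n, round(value, 3))
```

For this toy index the counts are binomial, so the CV is expected to fall roughly in proportion to one over the square root of the length, which is consistent with the advice to avoid very short sequences.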

13.6.3 Dependence on the degree of codon degeneracy

The degree of degeneracy has been shown to correlate with codon bias indices (Urrutia and Hurst, 2001). Here we define ‘degree of degeneracy’ as the percentage of four- and six-fold degenerate
