12.07.2015 Views

A computational study of bacterial gene regulation and adaptation ...


Nucleic Acids Research Advance Access published January 18, 2008
Nucleic Acids Research, 2008, 1–8
doi:10.1093/nar/gkm1181

Codon choice in genes depends on flanking sequence information—implications for theoretical reverse translation

Karthikeyan Sivaraman 1, AswinSaiNarain Seshasayee 2, Patrick M. Tarwater 3 and Alexander M. Cole 1,*

1 Department of Molecular Biology and Microbiology, Burnett School of Biomedical Sciences, University of Central Florida, Orlando, FL, 32816, USA, 2 Genomics and Regulatory Systems Group, EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK and 3 Department of Biostatistics, University of Texas School of Public Health, El Paso, TX, 79902, USA

Received September 26, 2007; Revised December 7, 2007; Accepted December 27, 2007

ABSTRACT

Algorithms for theoretical reverse translation have direct applications in degenerate PCR. The conventional practice is to create several degenerate primers each of which variably encode the peptide region of interest. In the current work, for each codon we have analyzed the flanking residues in proteins and determined their influence on codon choice. From this, we created a method for theoretical reverse translation that includes information from flanking residues of the protein in question. Our method, named the neighbor correlation method (NCM) and its enhancement, the consensus-NCM (c-NCM) performed significantly better than the conventional codon-usage statistic method (CSM). Using the methods NCM and c-NCM, we were able to increase the average sequence identity from 77% up to 81%. Furthermore, we revealed a significant increase in coverage, at 80% identity, from ≤20% (CSM) to ≥75% (c-NCM). The algorithms, their applications and implications are discussed herein.

INTRODUCTION

Word usage and codon usage in bacterial genomes has been extensively documented, both in the coding (1) and non-coding regions (2). These reports show that word usage in genomes is non-random and it serves as a biological signature of the organism in question. One such signature is codon usage in open reading frames (ORFs), and is reflected in measures such as the codon adaptation index (CAI) (3). Though CAI provides a convenient measure of codon bias, several reports show that codon usage is not a property of isolated codons and in several cases the bases immediately upstream or downstream affect the translation (4). Such neighboring base effects are well studied in case of stop codon read-through experiments where the flanking base or codon has been shown to affect the accuracy and magnitude of read-through (5). Apart from single bases, the effect of flanking codons has also been well studied in literature. Gutman and Hatfield (6) show that there is a strong first-order Markovian relationship between codons in a gene and this relation is seen even after translation, in proteins. Boycheva and colleagues extended this study to reveal that translation efficiency is strongly dependent on the dicodon pair that encodes for a given amino acid pair (7). They suggest that relative orientations of t-RNA in the ribosome may cause the observed differences in translation efficiency and subsequently certain dicodon pairs are selected evolutionarily. Moura and coworkers use a more recent and larger dataset for an analysis of dicodon usage patterns in both prokaryotes and eukaryotes. Their results suggest that the geometric constraints imposed by the translation machinery are driving forces in the evolution of gene sequences in bacteria (8). Collectively, these results suggest the existence of strong first-order Markovian relationships between codons in a gene. We hypothesized that information content of such correlations is carried over to the proteins, at least in part, when the gene is translated. This information manifests itself as a lack of randomness in the choice of codons and it is apparent when one attempts to theoretically reverse translate a protein sequence.

Reverse translation has been discussed earlier as an abstract logical flow of information from proteins to DNA (9). In this work, we consider the pragmatic problem of theoretical reverse translation itself, rather than that of information flow from proteins to DNA. Theoretical reverse translation of protein sequences has potential applications in primer design for degenerate

*To whom correspondence should be addressed. Tel: 407 823 3633; 407 823 3635; Email: acole@mail.ucf.edu

© 2008 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
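
To make the codon-usage baseline named in the abstract concrete, the following Python sketch reverse translates a protein by always choosing the codon seen most often for each amino acid in a set of reference genes, and then scores the result by per-base sequence identity. It is an illustration under stated assumptions, not the authors' CSM implementation: the function names (split_codons, preferred_codons, reverse_translate, percent_identity) and the toy reference gene are hypothetical, and the text above does not specify how the published method builds its usage table or resolves ties.

```python
from collections import Counter, defaultdict

# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def split_codons(dna):
    """Split an in-frame DNA string into triplets, dropping a trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return [dna[i:i + 3] for i in range(0, usable, 3)]


def preferred_codons(reference_genes):
    """Map each amino acid to its most frequent codon in the reference genes.

    Assumption: ties are broken by first occurrence, which is what
    Counter.most_common does; the source does not say how ties are handled.
    """
    counts = defaultdict(Counter)
    for gene in reference_genes:
        for codon in split_codons(gene):
            counts[CODON_TABLE[codon]][codon] += 1
    return {aa: codon_counts.most_common(1)[0][0]
            for aa, codon_counts in counts.items()}


def reverse_translate(protein, preferred):
    """Reverse translate a protein using one fixed codon per amino acid."""
    return "".join(preferred[aa] for aa in protein)


def percent_identity(predicted, actual):
    """Percentage of positions at which two equal-length sequences agree."""
    if len(predicted) != len(actual):
        raise ValueError("sequences must have the same length")
    matches = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * matches / len(predicted)


# Toy, made-up reference gene (hypothetical data, for illustration only).
toy_reference = ["ATGGCTGCTGCCAAATAA"]
usage = preferred_codons(toy_reference)
guess = reverse_translate("MAK", usage)
print(guess, percent_identity(guess, "ATGGCCAAA"))
```

A neighbor-aware method of the kind the abstract describes would replace the single per-amino-acid lookup with a choice conditioned on the flanking residues; the details of that conditioning are not given in this section, so they are not sketched here.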
