12.07.2015 Views

Boreskov


OP-13

STATISTICS OF GENOME SIZE AND NUCLEOTIDE CONTENT USING DATA OF COMPLETE PROKARYOTIC GENOMES

Orlov Yu.L., Suslov V.V.

Institute of Cytology and Genetics SB RAS, Novosibirsk, Russia

E-mail: orlov@bionet.nsc.ru, valya@bionet.nsc.ru

Research on the evolution of early unicellular organisms relies on complete genome sequencing data, which have become abundant in recent years thanks to new sequencing technologies. The rapid growth of data banks allows us to reexamine the sequence features that determine the minimal genome size and the minimal gene set as a unit of evolution. Several theoretical and experimental studies have estimated the minimal set of genes that are necessary and sufficient to sustain a functioning cell under some ideal conditions. The M. genitalium genome has the smallest genome size, comprising 580 Kbp, with a capacity to encode only 482 genes. The minimal nature of the M. genitalium genome triggered particular interest in the concept of a minimal cell (Mushegian & Koonin, 1996). A comparison of the first two completed bacterial genomes, those of the parasites H. influenzae and M. genitalium, produced a version of the minimal gene set consisting of approximately 250 genes (Koonin, 2000), with later re-estimations. More recent studies have attempted to reconstruct the M. genitalium genome by chemical synthesis (Gibson et al., 2008), and even to engineer a new living organism, referred to as Mycoplasma laboratorium (Endy, 2008).

Several thousand complete, assembled bacterial and archaeal whole-genome sequences were downloaded from the NCBI FTP site (ftp://ftp.ncbi.nlm.nih.gov/genomes/Bacteria/, latest release, May 2011). We compared genome size with gene number and GC content for groups of organisms (Figure 1). For 2065 whole-genome assemblies we found a high linear correlation between genome size and gene number (0.81), as expected, and, more interestingly, between genome size and GC content (0.46). Thus, larger genome size is strongly related to a higher fraction of G and C nucleotides.
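As an illustration of the two quantities compared here, the following minimal Python sketch computes the GC fraction of a sequence and a linear correlation coefficient between two lists of values. It assumes the Pearson coefficient, since the abstract speaks only of "linear correlation" without naming the estimator. The function names `gc_content` and `pearson` and the toy values are hypothetical and are not taken from the authors' pipeline or data.

```python
from math import sqrt


def gc_content(seq):
    """Return the fraction of G and C among unambiguous bases (A, C, G, T).

    Ambiguity codes such as N are ignored (an assumption of this sketch).
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def pearson(xs, ys):
    """Return the Pearson linear correlation coefficient of two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant list")
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Made-up toy sequences, NOT real genomes: one short string per "genome".
    toy_genomes = ["ATATATGC", "ATGCGCATGCAT", "GCGCGCATGCGCGCAT"]
    sizes = [len(g) for g in toy_genomes]
    gc_fractions = [gc_content(g) for g in toy_genomes]
    print(sizes, gc_fractions, pearson(sizes, gc_fractions))
```

In a real analysis, each genome would contribute one (genome size, GC fraction) pair, and the coefficient would be computed over all assemblies or separately for each group.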
The correlation between GC content and genome size is 0.35 for 104 archaeal species and 0.59 for 1478 bacterial species, so it follows the same trend in both groups. We also compared the Kolmogorov complexity of genomic sequences (Lempel-Ziv estimation) using software developed earlier (Orlov & Potapov, 2004) and the computer resources of the Shared Facility Center "Bioinformatics".
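To give a feel for what a Lempel-Ziv complexity estimate measures, the sketch below counts the phrases in a Lempel-Ziv (1976) style parsing of a sequence: each new phrase is the shortest substring that has not occurred earlier in the text. This is a generic textbook variant written for illustration; it is not the software of Orlov & Potapov (2004), whose exact algorithm and normalization are not described here. The function name `lz76_phrase_count` is hypothetical.

```python
def lz76_phrase_count(s):
    """Count phrases in a Lempel-Ziv (1976) style parsing of the string s.

    Each phrase is extended while it can still be copied from the text that
    precedes its last character (overlapping copies are allowed). A trailing
    phrase that never becomes "new" is still counted as one phrase.
    This direct implementation is quadratic or worse in len(s), so it is only
    suitable for short sequences, not for whole genomes.
    """
    n = len(s)
    i = 0
    count = 0
    while i < n:
        length = 1
        # Grow the candidate phrase while it already occurs earlier in the text.
        while i + length <= n and s[i:i + length] in s[:i + length - 1]:
            length += 1
        count += 1
        i += length
    return count


if __name__ == "__main__":
    # A repetitive sequence parses into few phrases, a varied one into more.
    print(lz76_phrase_count("AAAAAAAAAAAAAAAA"))
    print(lz76_phrase_count("ACGTTGCAAGCTTCGA"))
```

A low phrase count relative to sequence length indicates a repetitive, highly compressible sequence. Comparing genomes of different sizes would additionally require a length normalization, which this sketch leaves out.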
