13.07.2015 Views

The Genom of Homo sapiens.pdf


Table 2. Distributions of (Anopheles/D. melanogaster) Ecores in the Sequence of Drosophila in Two Successive Annotations

BDGP                             Genes      Ecores within             Exons      Ecores overlapping   Ecores in genes
annotation   Ecores    Genes     detected   genes            Exons    detected   exons                not in exons
Release 2    47,134    13,468    11,147     42,633           54,771   31,751     41,332               1,072
(%)          (100)     (100)     (83)       (90.5)           (100)    (58)       (88)                 (2.5)
Release 3    46,742    13,666    11,167     43,705           61,085   33,996     42,679               1,026
(%)          (100)     (100)     (82)       (93.5)           (100)    (56)       (91.5)               (2)

lease 3 and observed that, at present, all ecores have been placed in two gene models (Fig. 3, bottom).

Based on remote protein sequence or structure homologies, an additional set of 1,042 D. melanogaster candidate genes has been proposed (Gopal et al. 2001) (http://genomes.rockefeller.edu/dm). Ecores could be found in 18.7% of these new gene models (the list of the matches can be found at www.genoscope.cns/externe/Fly). This low fraction of matches could either result from a very low conservation of these genes between A. gambiae and D. melanogaster, possibly representing a subset of rapidly evolving genes, or indicate that a large fraction of these hypothetical genes should be dismissed. However, Exofish can also serve to validate a number of these potential genes.

A genome-wide analysis was also performed on the assembly of the A. gambiae genome sequence draft (Holt et al. 2002). We found more ecores in the Anopheles assembly (54,069 in release 6.01a) than in the D. melanogaster genome (ratio = 1.16). Several explanations that are not mutually exclusive may account for this observation.
The high number of ecores could reflect (1) an increased coding capacity in the genome of Anopheles, or (2) a larger number of pseudogenes or unmasked transposable elements in Anopheles, or (3) problems in the sequence assembly. The presence of at least two different haplotypes in the A. gambiae strain sequenced is known to have introduced a number of redundancies in the assembly, essentially as linked artefactual duplications and unanchored duplicated scaffolds (Holt et al. 2002). Work is in progress to test these hypotheses. We compared the

Figure 3. Exofish analysis on a region on arm 2L of the genome of D. melanogaster from 2 different releases of annotations, and around the same ecores. (Top) Results from release 2 of BDGP. (Bottom) Results from release 3 of BDGP. (A,D) BDGP annotations on the 5´–3´ strand. (B,E) BDGP annotations on the 3´–5´ strand. The genes are represented by boxes, with exons in black and introns in white. (C,F) Ecores (gray boxes). In release 2 (top), 5 ecores (numbers 7, 8, 9, 11, 12) overlap 4 gene models, and 7 ecores (numbers 1, 2, 3, 4, 5, 6, 10) do not overlap any annotation. In release 3, a large gene model overlaps all the ecores that fall exclusively in exons except ecore number 9. This ecore is part of a gene model on the 5´–3´ strand, which is predicted inside an intron on the 3´–5´ strand.
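
The percentages in Table 2 and the Anopheles/Drosophila ecore ratio quoted in the text can be rechecked from the raw counts. The sketch below is only an illustration: the dictionary layout and the names `TABLE2`, `pct`, and `summarize` are hypothetical. The denominators are an assumption, taken from the columns marked (100) in the table, and the last column is left out because its base is not clear from this extract. The ratio is assumed to use the Release 3 ecore count, which is the choice that reproduces 1.16.

```python
# Counts copied from Table 2. The structure and key names are illustrative only.
TABLE2 = {
    "Release 2": {
        "ecores": 47134, "genes": 13468, "genes_detected": 11147,
        "ecores_within_genes": 42633, "exons": 54771, "exons_detected": 31751,
        "ecores_overlapping_exons": 41332,
    },
    "Release 3": {
        "ecores": 46742, "genes": 13666, "genes_detected": 11167,
        "ecores_within_genes": 43705, "exons": 61085, "exons_detected": 33996,
        "ecores_overlapping_exons": 42679,
    },
}

# Ecore count reported for the Anopheles assembly (release 6.01a).
ANOPHELES_ECORES = 54069


def pct(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


def summarize(row):
    """Recompute the percentage row for one annotation release.

    Assumed denominators: total genes, total ecores, and total exons,
    i.e. the columns shown as (100) in Table 2.
    """
    return {
        "genes_detected": pct(row["genes_detected"], row["genes"]),
        "ecores_within_genes": pct(row["ecores_within_genes"], row["ecores"]),
        "exons_detected": pct(row["exons_detected"], row["exons"]),
        "ecores_overlapping_exons": pct(row["ecores_overlapping_exons"], row["ecores"]),
    }


for release, row in TABLE2.items():
    for name, value in summarize(row).items():
        print(f"{release}  {name}: {value:.1f}%")

# Assumption: the published ratio compares against the Release 3 ecore count.
ratio = ANOPHELES_ECORES / TABLE2["Release 3"]["ecores"]
print(f"Anopheles / Drosophila ecore ratio: {ratio:.2f}")
```

The recomputed values agree with the published percentages to within rounding, which suggests the column assignment used in this reading of Table 2 is consistent with the source.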
