13.07.2015 Views

The Genom of Homo sapiens.pdf

The Genom of Homo sapiens.pdf

The Genom of Homo sapiens.pdf

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

174 GUYON ET AL.only in one <strong>of</strong> the two species (Fig. 1). For CFA5, CS I andCS V, separated by 42.6 Mb in the human sequence, appearas a single contiguous block in dog. CS I is identifiedby 13 anchor sites and spans 8.8 Mb on HSA1p. CS V isidentified by 24 anchor sites and spans 13.8 Mb (Table 1).<strong>The</strong> gene order inside each block is conserved betweenhuman and dog, but CS I is inverted relative to HSA1p.For CFA15, the HSA1p orthologous region is split intotwo conserved segments <strong>of</strong> 14 and 10 anchor sites by anasyntenic region 328 TSP units long, and composed <strong>of</strong>three markers orthologous to a 0.2-Mb interval <strong>of</strong> HSA16.<strong>The</strong> gene orders in these two conserved segments are inthe same orientation, but their relative positions reveal atransposition event in the dog genome (Fig. 1).CS VII on CFA17 is composed <strong>of</strong> two blocks <strong>of</strong> 5 and14 anchor sites, respectively, the second one overlappingthe centromeric region <strong>of</strong> HSA1 and being in inverted orientation,includes 6 genes <strong>of</strong> HSA1q. Although gene orderswithin each <strong>of</strong> these blocks cannot be assessed withcertainty because <strong>of</strong> the RH panel resolution limit, this organizationreveals an inversion event including the centromericregion <strong>of</strong> HSA1 or in the CFA17 orthologous region.<strong>The</strong> whole conserved segment between CFA17 andHSA1 represents 1067 TSP units in the canine map. Inhuman, it includes the 21 Mb <strong>of</strong> the centromere <strong>of</strong> HSA1(HSA1p11.1-q12) and spans 39.8 Mb.In contrast to the above, we see little variation associatedwith CFA2 and CFA6, where 35 and 54 anchor siteshave been mapped, respectively. <strong>The</strong> association <strong>of</strong> homologousgenes in human and dog is contiguous and inaccordance with the definition <strong>of</strong> conserved segments. Finally,we note that the gene order between HSA1p andCFA6 (CS VI) is entirely conserved. However, CS II issubdivided in two blocks <strong>of</strong> conserved order, harboring23 and 7 anchor sites, respectively, highlighting an additionalrearrangement in one <strong>of</strong> the two species.DISCUSSIONUsing a simple statistical model, it can be estimatedthat the 1.5x coverage <strong>of</strong> the dog genome will provide70% <strong>of</strong> the genome sequence, with an average gap length<strong>of</strong> ~480 bases (Lander and Waterman 1988). It can alsobe estimated that sequences containing 100 bp <strong>of</strong> an exon(or exon fragment) will be sufficient to identify dog orthologs<strong>of</strong> most human genes. <strong>The</strong> probability <strong>of</strong> a specific100-base fragment <strong>of</strong> the genome occurring entirelywithin a single sequence read <strong>of</strong> 576 bases is only 0.58.However, most human genes appear to be composed <strong>of</strong> atleast 4 exons (Venter et al. 2001), and given the knownsimilarity in gene structure between humans and canids(see, e.g., Szabo et al. 1996; Credille et al. 2001; Haworthet al. 2001), the same is likely to be true for dog. Consequently,the probability that at least one 100-base exonfragment from a gene is contained within the genomic sequencedata can be estimated as >0.95. It has not yet beendetermined for what proportion <strong>of</strong> human genes 1:1 orthologycan be detected in the dog genome. For mouse,this value has been estimated to be ~80% (Okazaki et al.2002; Waterston et al. 2002). If dogs and humans share asimilar number <strong>of</strong> orthologous genes, we can estimatethat the dog genomic sequence data will yield at least oneorthologous exon fragment for ~80% <strong>of</strong> human genes. Inthis study, fragments <strong>of</strong> putative dog orthologs were identifiedfor 158 <strong>of</strong> 187 (84%) selected human genes. Recently,a more comprehensive analysis has indicated that79% <strong>of</strong> all annotated human genes (and 96% <strong>of</strong> those thathave detectable orthologs in mouse) are represented byorthologous dog sequences in the 1.5x coverage (Kirknesset al. 2003).If the primary objective <strong>of</strong> a sequencing project is togenerate gene-based markers for RH mapping,1.5x sequencecoverage <strong>of</strong> a genome <strong>of</strong>fers several advantagesover large collections <strong>of</strong> ESTs. Unlike cDNA libraries,the representation <strong>of</strong> genes is unaffected by cellular expressionlevels, and identification <strong>of</strong> orthologous exons isnot biased by the length <strong>of</strong> 3´-untranslated mRNA. In addition,the low but significant conservation <strong>of</strong> intronic sequencesbetween species is useful for distinguishing betweenparalogous sequences that share substantialsequence identity within exons.<strong>The</strong> most recent iteration <strong>of</strong> the canine RH map(Guyon et al. 2003) featured 870 markers for which orthologoussequences have been identified on the humangenome. Although the HSA1p orthologous regions wereshown to correspond to five canine chromosomes, CFA2, 5, 6, 15, and 17 in the previous RH map, by increasingmore than fourfold the number <strong>of</strong> markers, this workclearly delineates the gene order and breakpoints forseven conserved segments. <strong>The</strong> largest increase in resolutionis in the HSA1p orthologous region <strong>of</strong> CFA5 (CS Iand V), which now contains 34 genes compared to 4 inthe previous version <strong>of</strong> the canine map.This comparative map allows us to characterize moreprecisely the conserved segments orthologous to HSA1pthat were previously identified by reciprocal chromosomepainting studies (Breen et al. 1999a; Yang et al.1999; Sargan et al. 2000) on CFA5, CFA15, and CFA17(Fig. 1). On CFA5, although contiguous in dog, two conservedsegments (I and V) are split in human. On CFA15,the region is split by a novel asyntenic fragment <strong>of</strong>HSA16, as previously shown by Guyon (Guyon et al.2003), leaving two conserved segments (III and IV) harboringinverted positions in human versus dog. Finally,on CFA17, the region is split into blocks but constitutes aunique conserved segment. <strong>The</strong> two remaining caninechromosomal regions (CFA2 and CFA6) each constitutea unique conserved segment.<strong>The</strong> seven regions span 115 Mb <strong>of</strong> the 123-Mb HSA1parm, indicating that for roughly 8 Mb <strong>of</strong> HSA1p, no caninecounterparts have yet been mapped. Between thoseconserved segments, six regions that contain the breakpoints<strong>of</strong> interest range in length from 0.3 to 3.8 Mb, andrepresent a total <strong>of</strong> 7.3 Mb. Together, those intervals contain113 human genes (http://www.ncbi.nih.nlm.gov/mapview) ranging from 58 genes in the 3.8-Mb regionbetween CS II and III to 6 genes in the 0.4-Mb region betweenCS III and IV. In addition, 42 genes are present inthe 1 Mb most telomeric region <strong>of</strong> HSA1p above CS I.Despite the high density <strong>of</strong> anchor sites along HSA1p

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!