13.07.2015 Views

The Genom of Homo sapiens.pdf



RECENT SEGMENTAL DUPLICATIONS

Comparative Analyses

Are recent interspersed duplications of genomic sequence a common property of all genomes, or is their occurrence largely restricted to the primate genome? Our analysis indicates that two types of recent interspersed duplications exist in the human genome: chromosome-specific and interchromosomal repeats. Similar duplications, at least in apparent size or frequency, have not yet been reported for any other organism. Differences in the methods of genomic and genetic characterization in these species, however, could largely explain this effect. With the advent of whole-genome sequencing, we sought to address this question in an unbiased fashion, by direct examination of nucleotide sequence within other species. An identical analysis was performed for the recently published genomes of Caenorhabditis elegans (Consortium 1998), Drosophila melanogaster (Adams et al. 2000), Takifugu rubripes (Aparicio et al. 2002), and Mus musculus (Waterston et al. 2002). Very little evidence for large (≥20 kb), highly homologous (≥90%) duplications could be found within these species (Table 3). Although retroposon accumulation biases have been documented for Drosophila, no subtelomeric or pericentromeric clustering of duplicated segments could be ascertained. Furthermore, these comparisons indicate that the human genome is enriched 5- to 100-fold for such duplications when compared to genomes of model organisms (Table 3). Although there may be several explanations for this effect (differences in genome size, differential rates of recombination, methodological differences in sequence assembly, etc.), the human genome appears structurally distinct with respect to recent large-scale interspersed duplications.

Table 3. Segmental Duplications in Other Sequenced Organisms

Size      Fly      Worm     Fugu(a)   Mouse    Human(b)
≥1 kb     1.20%    4.25%    2.18%     ND       5.23%
≥5 kb     0.37%    1.50%    0.03%     1.95%    4.78%
≥10 kb    0.08%    0.66%    0.00%     0.70%    4.52%
≥20 kb    0.00%    ND       0.00%     0.11%    4.06%

(a) Takifugu rubripes build 3.0.
(b) Build 31, Nov. 2002.

CONCLUSIONS

Based on our current analysis of genomes, the genome of Homo sapiens is unique in the abundance and distribution of large (≥20 kb), highly homologous (≥95%) segmental duplications. This unusual architecture of the human genome has important practical and biological implications. In the late 1990s, there was considerable debate between advocates of the whole-genome shotgun and those of the clone-ordered approaches for human genome sequence and assembly (Green 1997; Weber and Myers 1997). Our analysis of the initial private and public genome assemblies (Lander et al. 2001; Venter et al. 2001) indicated that neither effectively resolved the organization and sequence of regions containing large segmental duplications. Ironically, combining both of these approaches did provide the most effective means for the identification, characterization, and subsequent resolution of many of these regions. These data suggest that a combined whole-genome shotgun and clone-ordered approach may be the best strategy for the completion of complex genomes that are laden with large segmental duplications. Because the human genome harbors nearly 100 such sites, each greater than 300 kb, high-sequence-identity duplications continue to impede gene annotation and SNP characterization of the human genome. Emerging evidence that such regions vary structurally depending on the human haplotype further complicates their sequence and assembly.
Taken in this light, it is perhaps not surprising that the majority of the large gaps that remain within the human genome project as of April 2003 are flanked by such duplications.

From the biological perspective, this architecture has two important implications: one functional and the other structural. The ability to juxtapose segments that would never have shared proximity in the genome of an ancestral species offers tremendous potential for exon shuffling and domain accretion of the proteome. Many such chimeric transcripts, in which different portions of the transcript originate from diverse regions, have now been documented (Courseaux and Nahon 2001; Bailey et al. 2002a; Stankiewicz and Lupski 2002; Bridgland et al. 2003). Few of these “fusions” appear to produce functional proteins (Hillier et al. 2003). Rare exceptions have been noted, such as the emergence of the TRE2 (also known as USP6) gene specifically within the hominoid lineage. In this case, approximately half of its 30 exons arose from a segmental duplication of the USP32 ancestral gene, whereas the amino-terminal portion of this oncogene originated as a duplication of the TBC1D3 ancestral gene. The fused transcript emerged during the radiation of the great apes, producing a gene with tissue specificity different from either of the progenitor genes (Paulding et al. 2003). In addition to gene innovation through exon shuffling, segmental duplications have the potential to lead to the emergence of novel genes through adaptive evolution. The remarkable positive selection of the morpheus gene family on chromosome 16, where the genes show accelerated amino acid replacement an order of magnitude above the neutral expectation, may be an example of such an effect (Johnson et al. 2001).

From the structural perspective, the current architecture of the human genome suggests that subtle remodeling of many specific chromosomal regions has occurred over short periods of primate evolution. This observation challenges the rather static notion of genome evolution that has emerged from early karyotype and chromosome-painting studies. Whereas the majority of the human genome seems to fit well with this model of conservation, the duplicated regions, in contrast, have been particularly prone to multiple, independent occurrences of rearrangement through duplication. In humans, the majority of intrachromosomally duplicated copies are separated by more than a megabase of intervening sequence. Such architecture has rarely been observed in other model organisms. The mechanism responsible for this event is un-
