13.07.2015 Views

The Genom of Homo sapiens.pdf

The Genom of Homo sapiens.pdf

The Genom of Homo sapiens.pdf

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

20 WATERSTON ET AL.normally will take much longer. Studies <strong>of</strong> homologouselements in experimentally tractable animals will be criticalin this analysis, along with studies <strong>of</strong> human genes intissue culture and in vitro. Ultimately, we will need to understandthese elements in the context <strong>of</strong> the human. Naturalvariation in the human population provides a powerfultool for achieving this understanding. Thus, a keychallenge in the coming years will be to define the variationin the human population, focusing first on commonvariations and then, as methods improve, to extend this toless common variations. In turn the impact <strong>of</strong> these variations,if any, on the human phenotype and in particularon health and disease must be established, leading to anunderstanding <strong>of</strong> the role <strong>of</strong> each element in the whole.<strong>The</strong> complete sequence <strong>of</strong> the human genome is thusonly a beginning. Our efforts on Chromosome 7 were, <strong>of</strong>course, focused on obtaining this reference sequence, butvariation was encountered in the course <strong>of</strong> the analysis,and our results suggest two avenues <strong>of</strong> exploration.In the comparison <strong>of</strong> the cDNAs to genomic sequence,we encountered examples <strong>of</strong> variations with predictableeffects on gene function; that is, alterations to the readingframe that are expected to result in loss <strong>of</strong> function. Thisstrong prediction is testable, and examination <strong>of</strong> the impact<strong>of</strong> such loss-<strong>of</strong>-function variations on the humanphenotype would undoubtedly be highly informativeabout the normal role <strong>of</strong> the gene, just as loss-<strong>of</strong>-functionmutations have been important in the genetic dissection<strong>of</strong> function in experimental animals.<strong>The</strong> number <strong>of</strong> genes with such variations was small,but came at almost no extra cost and with the comparison<strong>of</strong> only two different sequences. As additional humangenome sequences are described, other examples willcome to light. Furthermore, for many genes, missensesubstitutions can have an equally predictable effect onprotein function, expanding the range <strong>of</strong> changes that canbe used. Comparative sequence analysis will reveal thoseresidues under purifying selection, adding still more candidates.As resequencing <strong>of</strong> human genomes becomesroutine, “acquisition by genotype,” as this approachmight be called, will become more widespread, complementingthe more traditional approach <strong>of</strong> “acquisition byphenotype.”<strong>The</strong> regions with unusually high density <strong>of</strong> variationmay point to functionally important regions <strong>of</strong> thegenome. Differences in SNP density have been noted previously,with low density noted in some regions (Millerand Kwok 2001; Miller et al. 2001) and high density inHLA region (Mungall et al. 2003) and around the geneunderlying the ABO blood group antigen (Yip 2002). <strong>The</strong>SNP “deserts” may reflect the relatively recent fixation <strong>of</strong>a single variant in the population, either through selectivemechanisms or through founder effects and chance. <strong>The</strong>regions <strong>of</strong> high SNP density around HLA and the ABOgene are thought to represent instances <strong>of</strong> balanced selection,where multiple versions <strong>of</strong> the region have beenmaintained in the population over millions <strong>of</strong> years forHLA and hundreds <strong>of</strong> thousands <strong>of</strong> years for the ABOlocus.<strong>The</strong> regions described here have a lower density <strong>of</strong>variation than observed in the HLA region and are moresimilar in density to that observed in the ABO locus.<strong>The</strong>ir size (tens <strong>of</strong> kilobases) is similar to that <strong>of</strong> haplotypeblocks described elsewhere in the genome (Gabrielet al. 2002), but whether their boundaries correlate withthe boundaries <strong>of</strong> haplotype blocks remains to be determined.Of course, such regions might simply representthe chance persistence <strong>of</strong> two different haplotype blocksover extensive evolutionary time. However, if balancingselection is responsible for the maintenance <strong>of</strong> these regionsfor the hundreds <strong>of</strong> thousands to millions <strong>of</strong> yearsrequired to accumulate this density <strong>of</strong> variation, understandingtheir role in the generation <strong>of</strong> the variable humanphenotype will be important.Both the enumeration <strong>of</strong> a parts list and a description <strong>of</strong>human variation require the high-quality, essentially completereference human genome sequence that is now availableto all without restriction. Distinguishing genes frompseudogenes was an almost impossible task with the drafthuman sequence and without a high-quality mouse sequencefor comparison. <strong>The</strong> variants that alter the reading<strong>of</strong> genes are sufficiently unusual that errors in sequencewould have obscured them in earlier versions, and evenwith a high-quality human genome, errors in the other datasets predominated. Recognition <strong>of</strong> areas <strong>of</strong> high SNP densityrequires accurate assembly, so that segmentally duplicatedregions are not mistaken for such regions.<strong>The</strong> completion <strong>of</strong> the human genome sequence marksa major milestone in science and comes just 50 years afterthe discovery <strong>of</strong> the structure <strong>of</strong> DNA. For the firsttime, we as a species have before us the genetic instructionset that molds us. Knowledge <strong>of</strong> the sequence presentsthe scientific community with the challenge to understandits content. <strong>The</strong> human genome sequence alsopresents society with enormous challenges, not the least<strong>of</strong> which is to use the knowledge for the betterment <strong>of</strong> humankind.Undoubtedly, the tasks will take longer thansome have forecast, but the potential is clear and the implicationspr<strong>of</strong>ound.ACKNOWLEDGMENTSWe thank the dedicated individuals at the WashingtonUniversity <strong>Genom</strong>e Sequencing Center and other centersacross the world for their contributions throughout thisproject. <strong>The</strong> work was supported through grants from theNational Human <strong>Genom</strong>e Research Institute.REFERENCESAnsorge W., Sproat B., Stegemann J., Schwager C., and ZenkeM. 1987. Automated DNA sequencing: Ultrasensitive detection<strong>of</strong> fluorescent bands during electrophoresis. NucleicAcids Res. 15: 4593.Axelrod J.D. and Majors J. 1989. An improved method forphot<strong>of</strong>ootprinting yeast genes in vivo using Taq polymerase.Nucleic Acids Res. 17: 171.Baer R., Bankier A.T., Biggin M.D., Deininger P.L., Farrell P.J.,Gibson T.J., Hatfull G., Hudson G.S., Satchwell S.C., andSeguin C., et al. 1984. DNA sequence and expression <strong>of</strong> theB95-8 Epstein-Barr virus genome. Nature 310: 207.Birney E. and R. Durbin R. 2000. Using GeneWise in the

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!