Initial sequencing and analysis of the human genome - Vitagenes


because of the availability of tissues from all developmental timepoints. A challenge will be to define the gene-specific patterns of alternative splicing, which may affect half of human genes. Existing collections of ESTs and cDNAs may allow identification of the most abundant of these isoforms, but systematic exploration of this problem may require exhaustive analysis of cDNA libraries from multiple tissues or perhaps high-throughput reverse transcription–PCR studies. Deep understanding of gene function will probably require knowledge of the structure, tissue distribution and abundance of these alternative forms.

Large-scale identification of regulatory regions
The one-dimensional script of the human genome, shared by essentially all cells in all tissues, contains sufficient information to provide for differentiation of hundreds of different cell types, and the ability to respond to a vast array of internal and external influences. Much of this plasticity results from the carefully orchestrated symphony of transcriptional regulation.
Although much has been learned about the cis-acting regulatory motifs of some specific genes, the regulatory signals for most genes remain uncharacterized. Comparative genomics of multiple vertebrates offers the best hope for large-scale identification of such regulatory sites [440]. Previous studies of sequence alignment of regulatory domains of orthologous genes in multiple species have shown a remarkable correlation between sequence conservation, dubbed 'phylogenetic footprints' [441], and the presence of binding motifs for transcription factors. This approach could be particularly powerful if combined with expression array technologies that identify cohorts of genes that are coordinately regulated, implicating a common set of cis-acting regulatory sequences [442–445]. It will also be of considerable interest to study epigenetic modifications such as cytosine methylation on a genome-wide scale, and to determine their biological consequences [446,447]. Towards this end, a pilot Human Epigenome Project has been launched [448,449].

Sequencing of additional large genomes
More generally, comparative genomics allows biologists to peruse evolution's laboratory notebook, to identify conserved functional features and recognize new innovations in specific lineages. Determination of the genome sequence of many organisms is very desirable. Already, projects are underway to sequence the genomes of the mouse, rat, zebrafish and the pufferfishes T.
nigroviridis and Takifugu rubripes. Plans are also under consideration for sequencing additional primates and other organisms that will help define key developments along the vertebrate and nonvertebrate lineages.

To realize the full promise of comparative genomics, however, it needs to become simple and inexpensive to sequence the genome of any organism. Sequencing costs have dropped 100-fold over the last 10 years, corresponding to a roughly twofold decrease every 18 months. This rate is similar to 'Moore's law' concerning improvements in semiconductor manufacture. In both sequencing and semiconductors, such improvement does not happen automatically, but requires aggressive technological innovation fuelled by major investment. Improvements are needed to move current dideoxy sequencing to smaller volumes and more rapid sequencing times, based upon advances such as microchannel technology. More revolutionary methods, such as mass spectrometry, single-molecule sequencing and nanopore approaches [76], have not yet been fully developed, but hold great promise and deserve strong encouragement.

Completing the catalogue of human variation
The human draft genome sequence has already allowed the identification of more than 1.4 million SNPs, comprising a substantial proportion of all common human variation.
This program should be extended to obtain a nearly complete catalogue of common variants and to identify the common ancestral haplotypes present in the population. In principle, these genetic tools should make it possible to perform association studies and linkage disequilibrium studies [376] to identify the genes that confer even relatively modest risk for common diseases. Launching such an intense era of human molecular epidemiology will also require major advances in the cost efficiency of genotyping technology, in the collection of carefully phenotyped patient cohorts and in statistical methods for relating large-scale SNP data to disease phenotype.

From sequence to function
The scientific program outlined above focuses on how the genome sequence can be mined for biological information. In addition, the sequence will serve as a foundation for a broad range of functional genomic tools to help biologists to probe function in a more systematic manner. These will need to include improved techniques and databases for the global analysis of: RNA and protein expression, protein localization, protein–protein interactions and chemical inhibition of pathways. New computational techniques will be needed to use such information to model cellular circuitry.
A full discussion of these important directions is beyond the scope of this paper.

Concluding thoughts
The Human Genome Project is but the latest increment in a remarkable scientific program whose origins stretch back a hundred years to the rediscovery of Mendel's laws and whose end is nowhere in sight. In a sense, it provides a capstone for efforts in the past century to discover genetic information and a foundation for efforts in the coming century to understand it.

We find it humbling to gaze upon the human sequence now coming into focus. In principle, the string of genetic bits holds long-sought secrets of human development, physiology and medicine. In practice, our ability to transform such information into understanding remains woefully inadequate. This paper simply records some initial observations and attempts to frame issues for future study. Fulfilling the true promise of the Human Genome Project will be the work of tens of thousands of scientists around the world, in both academia and industry.
It is for this reason that our highest priority has been to ensure that genome data are available rapidly, freely and without restriction.

The scientific work will have profound long-term consequences for medicine, leading to the elucidation of the underlying molecular mechanisms of disease and thereby facilitating the design in many cases of rational diagnostics and therapeutics targeted at those mechanisms. But the science is only part of the challenge. We must also involve society at large in the work ahead. We must set realistic expectations that the most important benefits will not be reaped overnight. Moreover, understanding and wisdom will be required to ensure that these benefits are implemented broadly and equitably. To that end, serious attention must be paid to the many ethical, legal and social implications (ELSI) raised by the accelerated pace of genetic discovery. This paper has focused on the scientific achievements of the human genome sequencing efforts.
This is not the place to engage in a lengthy discussion of the ELSI issues, which have also been a major research focus of the Human Genome Project, but these issues are of comparable importance and could appropriately fill a paper of equal length.

Finally, it has not escaped our notice that the more we learn about the human genome, the more there is to explore.

"We shall not cease from exploration. And the end of all our exploring will be to arrive where we started, and know the place for the first time."
T. S. Eliot [450]

Received 7 December 2000; accepted 9 January 2001.

1. Correns, C. Untersuchungen über die Xenien bei Zea mays. Berichte der Deutsche Botanische Gesellschaft 17, 410–418 (1899).
2. De Vries, H. Sur la loi de disjonction des hybrides. Comptes Rendus Hebdomadaires, Acad. Sci. Paris 130, 845–847 (1900).
3. von Tschermack, E. Über Künstliche Kreuzung bei Pisum sativum. Berichte der Deutsche Botanische Gesellschaft 18, 232–239 (1900).
4. Sanger, F. et al. Nucleotide sequence of bacteriophage φX174 DNA. Nature 265, 687–695 (1977).
5. Sanger, F. et al. The nucleotide sequence of bacteriophage φX174. J. Mol. Biol. 125, 225–246 (1978).

914 NATURE | VOL 409 | 15 FEBRUARY 2001 | www.nature.com
© 2001 Macmillan Magazines Ltd
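
The cost comparison made under "Sequencing of additional large genomes" (a 100-fold drop over 10 years, described as roughly a twofold decrease every 18 months) can be checked with a little arithmetic. The sketch below assumes a constant exponential decline, which is an assumption made here and not a claim of the text, and the helper name halving_time_months is hypothetical.

```python
import math


def halving_time_months(fold_drop: float, years: float) -> float:
    """Months per twofold decrease, assuming a constant exponential decline.

    fold_drop: total factor by which the cost fell (e.g. 100 for 100-fold).
    years:     period over which that drop occurred.
    """
    # Number of successive twofold steps contained in the total drop.
    halvings = math.log2(fold_drop)
    # Spread the period evenly over those steps.
    return years * 12 / halvings


if __name__ == "__main__":
    # 100-fold over 10 years, the figures quoted in the text.
    print(round(halving_time_months(100, 10), 1))  # prints 18.1
```

A 100-fold drop corresponds to log2(100), about 6.64 halvings, and 120 months divided by 6.64 gives about 18 months per halving, consistent with the comparison to Moore's law.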
