13.07.2015 Views

The Genom of Homo sapiens.pdf



6 ROGERS

Figure 4. Sequencing center contributions to the finished human genome sequence. Abbreviations: (BCM) Baylor College of Medicine; (Beijing) Human Genome Center, Institute of Genetics, Chinese Academy of Sciences; (CSHL) Cold Spring Harbor Laboratory; (GBF) Gesellschaft für Biotechnologische Forschung mbH; (GTC) Genome Therapeutics Corporation; (IMB) Institute for Molecular Biology, Jena; (JGI) Joint Genome Institute, U.S. Department of Energy; (Keio) Keio University; (MPIMG) Max Planck Institute for Molecular Genetics; (RIKEN) RIKEN Genomic Sciences Center; (Sanger Institute) Wellcome Trust Sanger Institute; (SDSTDC) Stanford DNA Sequencing and Technology Center; (SHGC) Stanford Human Genome Center; (TIGR) The Institute for Genome Research; (UOKNOR) University of Oklahoma; (UTSW) University of Texas, Southwestern Medical Center; (UWGC) University of Washington Genome Center; University of Washington Multimegabase Sequencing Center; (WIBR) Whitehead Institute for Biomedical Research, MIT; (WUGSC) Washington University Genome Sequencing Center.

SEQUENCE ANNOTATION

With the finished sequence in hand, the next key step in understanding the biology of the genome is accurate annotation of the human gene set. Ultimately, we require all genome features, including genes, alternative transcripts, sequence variations, promoters, enhancers, and other regulatory motifs to be accurately defined and displayed along the single metric of the genome sequence.

The assembly of the working draft sequence provided the basis for the first completely automated and nonredundant annotation of the human genome. Key features of the sequence analysis method were that it was based on using unfinished sequence and that the annotation could be updated rapidly as the sequence evolved. Ensembl (a joint project between the European Bioinformatics Institute and the Sanger Institute) (Hubbard and Birney 2000; Hubbard et al. 2002) annotates known genes and predicts novel genes, with functional annotation from the InterPro (Apweiler et al. 2001) protein family databases and from disease expression and gene family databases (Enright et al. 1999; Antonarakis and McKusick 2000; Wheeler et al. 2002). The Ensembl gene build system uses a three-step process, incorporating information on exon structure and placement from alignment of genes predicted from human proteins in SPTREMBL (Bairoch and Apweiler 2000); the alignment of paralogous human proteins and proteins from other organisms; and ab initio gene prediction using genscan (Burge and Karlin 1997). Exons that are supported by more than one prediction are clustered to form genes. “Ensembl genes” are regarded as being accurate predicted gene structures with a low false-positive rate, since they are all supported by experimental evidence from protein and nucleotide databases.

Genome annotation can be viewed in genome browsers developed by Ensembl (http://www.ensembl.org), the University of California at Santa Cruz (http://genome.ucsc.edu), and the National Center for Biotechnology
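
The rule stated in the gene build description, that exons supported by more than one prediction are clustered to form genes, can be illustrated with a small Python sketch. This is not the Ensembl implementation. The Exon record, the source labels, the overlap-based definition of "support", and the max_gap distance used to group exons are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Exon:
    chrom: str
    strand: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    source: str  # hypothetical labels: "protein", "homology", "genscan"


def overlaps(a: Exon, b: Exon) -> bool:
    """True if two exons share at least one base on the same strand."""
    return (a.chrom == b.chrom and a.strand == b.strand
            and a.start <= b.end and b.start <= a.end)


def supported_exons(exons):
    """Keep exons overlapped by a prediction from a different source.

    Assumption: "supported by more than one prediction" is modelled
    here as overlap with an exon from at least one other source.
    """
    return [e for e in exons
            if any(o.source != e.source and overlaps(e, o) for o in exons)]


def cluster_into_genes(exons, max_gap=10_000):
    """Group supported exons into genes.

    Assumption: exons on the same chromosome and strand belong to one
    gene when the gap to the previous exon is at most max_gap bases.
    """
    kept = sorted(supported_exons(exons),
                  key=lambda e: (e.chrom, e.strand, e.start))
    genes = []
    for _, group in groupby(kept, key=lambda e: (e.chrom, e.strand)):
        current, current_end = [], 0
        for exon in group:
            if current and exon.start - current_end > max_gap:
                genes.append(current)
                current, current_end = [], 0
            current.append(exon)
            current_end = max(current_end, exon.end)
        genes.append(current)
    return genes


if __name__ == "__main__":
    predictions = [
        Exon("chr1", "+", 100, 200, "protein"),
        Exon("chr1", "+", 120, 210, "genscan"),
        Exon("chr1", "+", 500, 600, "homology"),
        Exon("chr1", "+", 510, 600, "genscan"),
        Exon("chr1", "+", 50_000, 50_100, "genscan"),  # single source only
        Exon("chr1", "+", 90_000, 90_200, "protein"),
        Exon("chr1", "+", 90_050, 90_200, "homology"),
    ]
    for i, gene in enumerate(cluster_into_genes(predictions), start=1):
        print(f"gene {i}: {[(e.start, e.end, e.source) for e in gene]}")
```

In this toy input, the exon predicted only by genscan at position 50,000 is discarded because no other source overlaps it, which mirrors the low false-positive rationale given in the text. A real gene build would work from transcript-level alignments, not from a fixed gap threshold.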
