Initial sequencing and analysis of the human genome - Vitagenes

Table 7 Sequence level contiguity of the draft genome sequence

| Chromosome | Initial sequence contigs: number | Initial sequence contigs: N50 length (kb) | Sequence contigs: number | Sequence contigs: N50 length (kb) | Sequence-contig scaffolds: number | Sequence-contig scaffolds: N50 length (kb) |
|---|---|---|---|---|---|---|
| All | 396,913 | 21.7 | 149,821 | 81.9 | 87,757 | 274.3 |
| 1 | 37,656 | 16.5 | 12,256 | 59.1 | 5,457 | 278.4 |
| 2 | 32,280 | 19.9 | 13,228 | 57.3 | 6,959 | 248.5 |
| 3 | 38,848 | 15.6 | 15,098 | 37.7 | 8,964 | 167.4 |
| 4 | 28,600 | 16.0 | 13,152 | 33.0 | 7,402 | 158.9 |
| 5 | 30,096 | 20.4 | 10,689 | 72.9 | 6,378 | 241.2 |
| 6 | 17,472 | 43.6 | 5,547 | 180.3 | 2,554 | 485.0 |
| 7 | 12,733 | 86.4 | 4,562 | 335.7 | 2,726 | 591.3 |
| 8 | 19,042 | 18.1 | 8,984 | 38.2 | 4,631 | 198.9 |
| 9 | 15,955 | 20.1 | 6,226 | 55.6 | 3,766 | 216.2 |
| 10 | 21,762 | 18.7 | 9,126 | 47.9 | 6,886 | 133.0 |
| 11 | 29,723 | 14.3 | 8,503 | 40.0 | 4,684 | 193.2 |
| 12 | 22,050 | 19.1 | 8,422 | 63.4 | 5,526 | 217.0 |
| 13 | 13,737 | 21.7 | 5,193 | 70.5 | 2,659 | 300.1 |
| 14 | 4,470 | 161.4 | 829 | 1,371.0 | 541 | 2,009.5 |
| 15 | 13,134 | 15.3 | 5,840 | 30.3 | 3,229 | 149.7 |
| 16 | 10,297 | 34.4 | 4,916 | 119.5 | 3,337 | 356.3 |
| 17 | 10,369 | 22.9 | 4,339 | 90.6 | 2,616 | 248.9 |
| 18 | 16,266 | 15.3 | 4,461 | 51.4 | 2,540 | 216.1 |
| 19 | 6,009 | 38.4 | 2,503 | 134.4 | 1,551 | 375.5 |
| 20 | 2,884 | 108.6 | 511 | 1,346.7 | 312 | 813.8 |
| 21 | 103 | 340.0 | 5 | 28,515.3 | 5 | 28,515.3 |
| 22 | 526 | 113.9 | 11 | 23,048.1 | 11 | 23,048.1 |
| X | 11,062 | 58.8 | 4,607 | 218.6 | 2,610 | 450.7 |
| Y | 557 | 154.3 | 140 | 1,388.6 | 106 | 1,439.7 |
| UL | 1,282 | 21.4 | 613 | 46.0 | 297 | 166.4 |

This Table is similar to Table 6 but shows the number and N50 length for various types of sequence contig (see Box 1). See legend to Table 6 concerning treatment of gaps. For sequence contigs in the draft genome sequence, the N50 length ranges from 1.7 to 5.5 times the arithmetic mean for initial sequence contigs, 2.5 to 8.2 times for merged sequence contigs, and 6.1 to 10 times for sequence-contig scaffolds.

Table 8 Chromosome size estimates

| Chromosome* | Sequenced bases† (Mb) | FCC gaps‡: number | FCC gaps‡: total bases in gaps§ (Mb) | SCC gaps∥: number | SCC gaps∥: total bases in gaps (Mb) | Sequence gaps#: number | Sequence gaps#: total bases in gaps¶ (Mb) | Heterochromatin and short arm adjustments** (Mb) | Total estimated chromosome size (including artefactual duplication in draft genome sequence)†† (Mb) | Previously estimated chromosome size‡‡ (Mb) |
|---|---|---|---|---|---|---|---|---|---|---|
| All | 2,692.9 | 897 | 152.0 | 4,076 | 142.7 | 145,514 | 80.6 | 212 | 3,289 | 3,286 |
| 1 | 212.2 | 104 | 17.7 | 347 | 12.1 | 11,803 | 6.5 | 30 | 279 | 263 |
| 2 | 221.6 | 50 | 8.5 | 296 | 10.4 | 12,880 | 7.1 | 3 | 251 | 255 |
| 3 | 186.2 | 71 | 12.1 | 336 | 11.8 | 14,689 | 8.1 | 3 | 221 | 214 |
| 4 | 168.1 | 39 | 6.6 | 343 | 12.0 | 12,768 | 7.1 | 3 | 197 | 203 |
| 5 | 169.7 | 46 | 7.8 | 337 | 11.8 | 10,304 | 5.7 | 3 | 198 | 194 |
| 6 | 158.1 | 15 | 2.6 | 275 | 9.6 | 5,225 | 2.9 | 3 | 176 | 183 |
| 7 | 146.2 | 27 | 4.6 | 195 | 6.8 | 4,338 | 2.4 | 3 | 163 | 171 |
| 8 | 124.3 | 41 | 7.0 | 249 | 8.7 | 8,692 | 4.8 | 3 | 148 | 155 |
| 9 | 106.9 | 19 | 3.2 | 122 | 4.3 | 6,083 | 3.4 | 22 | 140 | 145 |
| 10 | 127.1 | 14 | 2.4 | 163 | 5.7 | 8,947 | 5.0 | 3 | 143 | 144 |
| 11 | 128.6 | 29 | 4.9 | 193 | 6.8 | 8,279 | 4.6 | 3 | 148 | 144 |
| 12 | 124.5 | 26 | 4.4 | 168 | 5.9 | 8,226 | 4.6 | 3 | 142 | 143 |
| 13 | 92.9 | 12 | 2.0 | 115 | 4.0 | 5,065 | 2.8 | 16 | 118 | 114 |
| 14 | 86.9 | 13 | 2.2 | 40 | 1.4 | 775 | 0.4 | 16 | 107 | 109 |
| 15 | 73.4 | 18 | 3.1 | 104 | 3.6 | 5,717 | 3.2 | 17 | 100 | 106 |
| 16 | 73.1 | 55 | 9.4 | 102 | 3.6 | 4,757 | 2.6 | 15 | 104 | 98 |
| 17 | 72.8 | 41 | 7.0 | 95 | 3.3 | 4,261 | 2.4 | 3 | 88 | 92 |
| 18 | 72.9 | 22 | 3.7 | 113 | 4.0 | 4,324 | 2.4 | 3 | 86 | 85 |
| 19 | 55.4 | 49 | 8.3 | 108 | 3.8 | 2,344 | 1.3 | 3 | 72 | 67 |
| 20 | 60.5 | 7 | 1.2 | 33 | 1.2 | 469 | 0.3 | 3 | 66 | 72 |
| 21 | 33.8 | 4 | 0.1 | 0 | 0.0 | 0 | 0.0 | 11 | 45 | 50 |
| 22 | 33.8 | 10 | 1.0 | 0 | 0.0 | 0 | 0.0 | 13 | 48 | 56 |
| X | 127.7 | 141 | 24.0 | 182 | 6.4 | 4,282 | 2.4 | 3 | 163 | 164 |
| Y | 21.8 | 6 | 1.0 | 19 | 0.7 | 113 | 0.1 | 27 | 51 | 59 |
| NA | 5.1 | 0 | 0 | 134 | 0.0 | 577 | 0.3 | 0 | 0 | 0 |
| UL | 9.3 | 38 | 0 | 7 | 0.0 | 566 | 0.3 | 0 | 0 | 0 |

* NA, sequenced clones that could not be associated with fingerprint clone contigs. UL, clone contigs that could not be reliably placed on a chromosome.

† Total number of bases in the draft genome sequence, excluding gaps. Total length of scaffold (including gaps contained within clones) is 2.916 Gb.

‡ Gaps between those fingerprint clone contigs that contain sequenced clones, excluding gaps for centromeres.

§ For unfinished chromosomes, we estimate an average size of 0.17 Mb per FCC gap, based on retrospective estimates of the clone coverage of chromosomes 21 and 22. Gap estimates for chromosomes 21 and 22 are taken from refs 93, 94.

∥ Gaps between sequenced-clone contigs within a fingerprint clone contig. For unfinished chromosomes, we estimate sequenced clone gaps at 0.035 Mb each, based on evaluation of a sample of these gaps.

# Gaps between two sequence contigs within a sequenced-clone contig.

¶ We estimate the average number of bases in sequence gaps from alignments of the initial sequence contigs of unfinished clones (see text) and extrapolation to the whole chromosome.

** Including adjustments for estimates of the sizes of the short arms of the acrocentric chromosomes 13, 14, 15, 21 and 22 (ref. 105), estimates for the centromere and heterochromatic regions of chromosomes 1, 9 and 16 (refs 106, 107) and estimates of 3 Mb for the centromere and 24 Mb for telomeric heterochromatin for the Y chromosome (ref. 108).

†† The sum of the five lengths in the preceding columns. This is an overestimate, because the draft genome sequence contains some artefactual sequence owing to inability to correctly merge all underlying sequence contigs. The total amount of artefactual duplication varies among chromosomes; the overall amount is estimated by computational analysis to be about 100 Mb, or about 3% of the total length given, yielding a total estimated size of about 3,200 Mb for the human genome.

‡‡ Including heterochromatic regions and acrocentric short arm(s) (ref. 105).

NATURE | VOL 409 | 15 FEBRUARY 2001 | www.nature.com © 2001 Macmillan Magazines Ltd 873
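The arithmetic described in the Table 8 footnotes can be checked with a short script. What follows is a minimal sketch and not part of the original article: the constant, dictionary and function names are illustrative assumptions. Only the per-gap sizes (0.17 Mb per FCC gap and 0.035 Mb per SCC gap for unfinished chromosomes) and the row values for chromosomes 1 and 2 are taken from the table and its footnotes.

```python
# Illustrative check of the Table 8 footnote arithmetic.
# All names are hypothetical; the numbers are copied from Table 8.

FCC_GAP_MB = 0.17    # footnote §: average size per FCC gap (unfinished chromosomes)
SCC_GAP_MB = 0.035   # footnote ∥: estimated size per SCC gap (unfinished chromosomes)

# chromosome -> (sequenced bases Mb, FCC gap count, SCC gap count,
#                sequence-gap bases Mb, adjustment Mb, reported total Mb)
ROWS = {
    "1": (212.2, 104, 347, 6.5, 30, 279),
    "2": (221.6, 50, 296, 7.1, 3, 251),
}


def estimated_size_mb(sequenced, n_fcc, n_scc, seq_gap_mb, adjustment_mb):
    """Sum the five length columns, deriving FCC and SCC gap bases from gap counts."""
    fcc_mb = n_fcc * FCC_GAP_MB
    scc_mb = n_scc * SCC_GAP_MB
    return sequenced + fcc_mb + scc_mb + seq_gap_mb + adjustment_mb


for chrom, (seq, n_fcc, n_scc, gap, adj, reported) in ROWS.items():
    total = estimated_size_mb(seq, n_fcc, n_scc, gap, adj)
    # The table reports whole megabases, so agreement is expected only to within rounding.
    print(f"chr{chrom}: computed {total:.1f} Mb, reported {reported} Mb")
```

This check applies only to rows for unfinished chromosomes. According to the footnotes, the gap estimates for chromosomes 21 and 22 are taken from refs 93 and 94, not from the per-gap averages, so those rows (and the "All" row, which includes them) should not be expected to reproduce exactly under this sketch.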
