13.07.2015 Views

The Genom of Homo sapiens.pdf

The Genom of Homo sapiens.pdf

The Genom of Homo sapiens.pdf

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

HUMAN SEQUENCE VARIATION 61Figure 4. LD analysis <strong>of</strong> Chromosome 20. Haplotype blocks (red and black boxes) were computed from LD data on Chromosome 20and are shown for two regions, one each <strong>of</strong> high and low overall LD. <strong>The</strong> analysis is repeated using data from different densities <strong>of</strong>SNPs (average marker spacings in each analysis are listed on the left <strong>of</strong> the figure, and increasing SNP density is indicated by the verticalarrow).gion in high LD was estimated to be 65% in the North Europeansand 45% in the African-Americans (Ke et al.2004). <strong>The</strong> coverage estimates for North Europeans fitsthe trend in Table 1. However, the lower coverage figurefor the African-Americans suggests that more dense SNPsets may need to be used in these populations—echoingthe earlier findings <strong>of</strong> the Gabriel study (Gabriel et al.2002). <strong>The</strong> block patterns also varied significantly whenusing different analytical methods, indicating the need forother ways to describe LD patterns on a fine scale. For example,LD unit maps (Maniatis et al. 2002) reflect continuousviews <strong>of</strong> LD, although they are to some extentsensitive to varying allele frequency. A promising approachis the development <strong>of</strong> methods to estimate recombinationrates on a fine scale using the same data sets describedabove (McVean et al. 2004). <strong>The</strong>se approachescould enable a precise assessment <strong>of</strong> progress in characterization<strong>of</strong> LD in the genome, identify and prioritize regionsthat require more data, and help assess the ability <strong>of</strong>panels <strong>of</strong> anonymous markers to detect unknown diseaserelatedvariants.IMPLICATIONS FOR FUTURE STUDIES<strong>The</strong> first global study <strong>of</strong> patterns <strong>of</strong> common variationin the human genome has crystallized in the form <strong>of</strong> theInternational HapMap Project (International HapMapConsortium 2003). <strong>The</strong> density <strong>of</strong> SNPs that are nowavailable in the public domain is likely to be sufficient forthis initial analysis <strong>of</strong> LD and common haplotypes, althoughsome targeted SNP discovery to fill gaps may benecessary. This project will produce a freely availablegenome-wide panel <strong>of</strong> tested SNPs that can be assessedfor use in association studies. It is intended to reduce theneed for researchers to search for their own SNPs, and toenable better searches for disease variants in genes, chromosomalregions, or ultimately, the whole genome usingthe indirect association approach. Evaluation <strong>of</strong> the parametersthat govern association studies are still required,and the results <strong>of</strong> pilot studies might help calibrate theHapMap and other studies to fit more precisely with theiranticipated applications. In parallel, it is also necessary toaddress the challenge <strong>of</strong> developing sufficiently largesample collections with accurate, accessible phenotypeinformation. Without this, even a perfect SNP panel thatcovers 100% <strong>of</strong> the genome will lack the power to establishan association between genotype and phenotype.We should continue sequence-based discovery <strong>of</strong> variantson a genome-wide basis; this would give us an unbiasedand properly quantified picture <strong>of</strong> the extent <strong>of</strong> commonsequence variation in the human population,replacing the current simulation-based estimates. Achievingthe first step at this level would benefit from upward<strong>of</strong> 100 individual human genome sequences, a task thatrequires a greater than tenfold increase in sequence outputand is likely to demand the successful implementation<strong>of</strong> new technologies for cheap, accurate, high-throughputresequencing. Discovering rare variants (m.a.f. < 1%)would require improvements <strong>of</strong> a further one or two orders<strong>of</strong> magnitude in sequencing. At present, therefore,pursuing rare variants remains in the domain <strong>of</strong> individualcandidate genes and may be best carried out in conjunctionwith a genetic disease study. Both candidategene and whole-genome sequencing will also be instrumentalin characterizing somatic variants that cause cancer,as demonstrated, for example, by Davies et al.(2002). Systematic pursuit <strong>of</strong> these areas <strong>of</strong> research willhelp us gain a view <strong>of</strong> the complete spectrum <strong>of</strong> variantallele frequencies that contributes to human disease.As we learn more about the full extent <strong>of</strong> functional sequencesin the genome, our knowledge <strong>of</strong> the functionallyimportant subset <strong>of</strong> variants will increase (for furtherdiscussion, see Bentley 2004). Developing a panel <strong>of</strong>functional variants for all genes would increase the power<strong>of</strong> association studies. However, we must be aware <strong>of</strong> theunknown; any association study based on the direct approachneeds to take account <strong>of</strong> the possibility that thetrue causative variant is not included in the study. It istherefore necessary to establish the underlying mechanismand explain the functional significance <strong>of</strong> the variant.Within the medical field, understanding the mechanismthat underlies disease will be the most effective wayto make progress in diagnosis, alleviation, and the effectivetreatment <strong>of</strong> disease.

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!