11.07.2015 Views

2DkcTXceO


190 Journey into genetics and genomics

few years. They have identified hundreds of common genetic variants that are associated with complex traits and diseases (http://www.genome.gov/gwastudies/). The emerging next generation sequencing technology offers an exciting new opportunity for sequencing the whole genome, obtaining information about both common and rare variants and structural variation. Next generation sequencing data make it possible to explore the roles of rare genetic variants and mutations in human diseases. Candidate gene sequencing, whole exome sequencing and whole genome sequencing studies are being conducted. High-throughput RNA and epigenetic sequencing data are also rapidly becoming available to study gene regulation and functionality, and the mechanisms of biological systems. A large number of public genomic databases, such as the HapMap Project (http://hapmap.ncbi.nlm.nih.gov/) and the 1000 Genomes Project (www.1000genomes.org), are freely available. The NIH database of Genotypes and Phenotypes (dbGaP) archives data from many NIH-funded GWAS and sequencing studies and distributes them freely to the general research community to enhance new discoveries.

The emerging sequencing technology presents many new opportunities. Whole genome sequencing measures the complete DNA sequence of a subject's genome, about three billion base pairs. Although the current cost of whole genome sequencing prohibits large scale studies, with the rapid advance of biotechnology the "1000 dollar genome" era will come in the near future. This will usher in a new era of predictive and personalized medicine in which full genome sequencing for an individual or patient costs $1000 or less. An individual subject's genome map will help patients and physicians identify effective personalized treatment decisions and intervention strategies.

While the ’omics era presents many exciting research opportunities, the explosion of massive information about the human genome presents extraordinary challenges in data processing, integration, analysis and result interpretation. The volume of whole genome sequencing data is substantially larger than that of GWAS data, and is on the order of tens or hundreds of terabytes (TBs). In recent years, the limited availability of quantitative methods suitable for analyzing these data has emerged as a bottleneck for effectively translating rich information into meaningful knowledge. There is a pressing need to develop statistical methods for these data to bridge the technology and information transfer gap in order to accelerate innovations in disease prevention and treatment. As noted by John McPherson, from the Ontario Institute for Cancer Research,

“There is a growing gap between the generation of massively parallel sequencing output and the ability to process and analyze the resulting data. Bridging this gap is essential, or the coveted $1000 genome will come with a $20,000 analysis price tag.” (McPherson, 2009)

This is an exciting time for statisticians. I discuss in this chapter how I became interested in statistical genetics and genomics a few years ago, lessons
