11.07.2015 Views

2DkcTXceO


182 The road travelled

given marker is proportional to the power of the study at the effect-size associated with that marker. My familiarity with sample-survey theory, which I developed during my PhD thesis on two-phase study design, again came in very handy here. I worked with a post-doctoral Fellow, JuHyun Park, to develop an “inverse-power-weighted” method, similar to the widely used “inverse-probability-weighted” (IPW) methods for analysis of survey data, for inferring the number of underlying susceptibility markers for a trait and their effect-size distribution using published information on known discoveries and the study design of the underlying GWA studies (Park et al., 2010). We inferred genetic architecture for several complex traits using this method and made projections about the expected number of discoveries in GWAS of these traits. We have been very pleased to see that our projections were quite accurate when results from larger and larger GWA studies have come out for these traits since the publication of our report (Allen et al., 2010; Anderson et al., 2011; Eeles et al., 2013; Michailidou et al., 2013).

Realizing how optimal study design is fundamentally related to the underlying genetic architecture of traits, both JuHyun and I continued to delve into these related issues. Again using known discoveries from published studies and information on design of existing studies, we showed that there is very modest or no evidence of an inverse relationship between effect-size and allele frequency for genetic markers, a hypothesis in population genetics postulated from a selection point of view and one that often has been used in the past by scientists to motivate studies of less common and rare variants using sequencing technologies (Park et al., 2011).
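The inverse-power-weighting idea described above can be illustrated with a small sketch. This is not the estimator of Park et al. (2010), whose details are not given in this text; it is a minimal Horvitz-Thompson-style illustration under stated assumptions: a standardized quantitative trait, an additive per-allele effect, a two-sided Wald test at a genome-wide threshold, and reported effect sizes treated as if they were the true ones. The function names, the power approximation, and the example values are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def power_at_effect(beta, maf, n, alpha=5e-8):
    """Approximate power to detect one marker (assumed model, see lead-in).

    beta  : per-allele effect in standard-deviation units of the trait
    maf   : minor allele frequency of the marker
    n     : sample size of the discovery study
    alpha : two-sided significance threshold
    """
    # Non-centrality of the z-statistic under an additive model.
    ncp = abs(beta) * sqrt(2.0 * maf * (1.0 - maf) * n)
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    # Probability that |Z| exceeds the critical value when Z ~ N(ncp, 1).
    return _STD_NORMAL.cdf(ncp - z_crit) + _STD_NORMAL.cdf(-ncp - z_crit)


def inverse_power_weighted_count(discovered, n, alpha=5e-8):
    """Estimate the total number of markers from the discovered ones.

    Each discovered marker with (beta, maf) stands in for 1 / power markers
    of similar effect size, by analogy with inverse-probability weighting
    in survey sampling.
    """
    return sum(1.0 / power_at_effect(beta, maf, n, alpha)
               for beta, maf in discovered)


if __name__ == "__main__":
    # Hypothetical discoveries: (per-allele effect, minor allele frequency).
    hits = [(0.10, 0.30), (0.07, 0.25), (0.05, 0.40)]
    print(inverse_power_weighted_count(hits, n=20000))
```

Markers with small effects are discovered with low power, so each one that is found carries a large weight. A real analysis would also need to account for the bias in reported effect sizes of newly discovered markers, which this sketch ignores.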
From the design point of view, we conjectured that the lack of a strong relationship between allele frequency and effect-size implies future studies for less common and rare variants will require even larger sample sizes than current GWAS to make comparable numbers of discoveries for underlying susceptibility loci.

Understanding its implications for discoveries made us question the implication of genetic architecture for risk-prediction, another hotly debated topic. Interestingly, while the modern statistical literature is very rich regarding optimal algorithms for building models, very little attention is given to more fundamental design questions, such as how our ability to predict a trait is inherently limited by sample-size of the training datasets and the genetic architecture of the trait, or more generally the etiologic architecture that may involve both genetic and non-genetic factors. This motivated us to develop a mathematical approximation for the relationship between expected predictive performance of models, sample size of training datasets and genetic architecture of traits. Based on these formulations, we projected that the highly polygenic nature of complex traits implies future GWAS will require extremely large sample sizes, possibly of a higher order of magnitude than even some of the largest GWAS to date, for substantial improvement of risk-prediction based on genetic information (Chatterjee et al., 2013).

Although the study of genetic architecture and its implications for study designs is now a significant part of my research portfolio, it was not by design
