Gao X, Starmer J, Martin ER. A multiple testing correction method for ...



the SNPs all together if the eigenvalues can be derived. In the situations where the high dimensionality prohibits the calculation of eigenvalues, we can analyze the SNPs on each chromosome separately or according to the gene functions and then sum all of the Meff values together. The total Meff can be used to calculate the adjusted PWER. In genome-wide association studies, we have to partition the SNPs into several parts and analyze them separately. Since SNPs on different chromosomes are expected to be in linkage equilibrium in general populations, the genome-wide effective number of independent tests can be obtained by summing the chromosome-specific Meff values. For each chromosome, we may use the partition-ligation approach by dividing the SNPs into several parts, and then sum the Meff values from each partition, similar to how we tested our Alzheimer SNP data set. The total Meff is used in the final adjustment calculation. Due to the interblock correlations that are unlikely to be captured in this partition-ligation approach, the total Meff may be slightly conservative. However, the interblock correlations may be reduced if we partition SNPs according to their haplotype block structure.
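
The partition-and-sum procedure described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the example eigenvalues, the rule that Meff for a partition is the number of leading eigenvalues needed to reach a fixed share of the total variance (0.995 here), and the Bonferroni-style division of the significance level by the total Meff are all assumptions made for the example. In practice the eigenvalues for each partition might come from the SNP correlation matrix of that partition (for instance via `numpy.linalg.eigvalsh`).

```python
from typing import Iterable, Sequence


def meff_from_eigenvalues(eigenvalues: Sequence[float],
                          var_fraction: float = 0.995) -> int:
    """Count the leading eigenvalues needed to reach var_fraction of the variance.

    Assumption: both this cutoff rule and the 0.995 default are illustrative
    choices for the sketch, not values taken from the text above.
    """
    # Clip tiny negative eigenvalues (numerical noise) and sort largest first.
    ordered = sorted((max(ev, 0.0) for ev in eigenvalues), reverse=True)
    total = sum(ordered)
    if total <= 0.0:
        raise ValueError("eigenvalues must have a positive sum")
    running = 0.0
    for count, ev in enumerate(ordered, start=1):
        running += ev
        if running / total >= var_fraction:
            return count
    return len(ordered)


def total_meff(eigenvalues_by_partition: Iterable[Sequence[float]],
               var_fraction: float = 0.995) -> int:
    """Sum the per-partition Meff values (one partition per chromosome or block)."""
    return sum(meff_from_eigenvalues(evs, var_fraction)
               for evs in eigenvalues_by_partition)


def adjusted_pwer(alpha: float, meff: int) -> float:
    """Bonferroni-style point-wise threshold (assumed form): alpha / Meff."""
    if meff < 1:
        raise ValueError("meff must be at least 1")
    return alpha / meff


# Hypothetical eigenvalues for three partitions (not real data).
partitions = [
    [3.9, 0.09, 0.007, 0.003],        # 4 SNPs in strong LD
    [1.0, 1.0, 1.0],                  # 3 uncorrelated SNPs
    [2.5, 2.4, 0.06, 0.03, 0.01],     # 5 SNPs forming two correlated groups
]

per_partition = [meff_from_eigenvalues(evs) for evs in partitions]
meff_sum = total_meff(partitions)
threshold = adjusted_pwer(0.05, meff_sum)

print("Meff per partition:", per_partition)
print("Total Meff:", meff_sum)
print("Adjusted point-wise threshold:", threshold)
```

Because each partition is analyzed on its own, correlations between partitions are ignored, so the summed Meff can only overstate the number of independent tests. That is the sense in which the text calls the total slightly conservative.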

In summary, the simpleM algorithm provides a highly accurate approximation to the permutation-based correction threshold and is easily implemented. It is shown to be simple, fast and more accurate than recently developed methods and is comparable to the permutation-based correction threshold using both simulated and real SNP data. The efficiency and accuracy of the simpleM method make it an attractive choice for multiple testing adjustment when there is high intermarker LD in the SNP data set as in candidate gene or genome-wide association studies.

ACKNOWLEDGMENTS

This work was supported in part by NIH grants NS39764, AG019757 and AG20135 and NIEHS T32 ES007126. We thank Dr. Gary Beecham who prepared the Alzheimer data for us. We thank Dr. Richard Morris for initial inspiration.

Genet. Epidemiol.
