Gao X, Starmer J, Martin ER. A multiple testing correction method for ...
368 Gao et al.<br />
the SNPs all together if the eigenvalues can be<br />
derived. In the situations where the high dimensionality<br />
prohibits the calculation of eigenvalues, we can<br />
analyze the SNPs on each chromosome separately or<br />
according to the gene functions and then sum all of<br />
the Meff values together. The total Meff can be used to<br />
calculate the adjusted PWER. In genome-wide<br />
association studies, we have to partition the SNPs<br />
into several parts and analyze them separately. Since<br />
SNPs on different chromosomes are expected to be<br />
in linkage equilibrium in general populations, the<br />
genome-wide effective number of independent tests<br />
can be obtained by summing the chromosome<br />
specific Meff values. For each chromosome, we may<br />
use the partition-ligation approach by dividing the<br />
SNPs into several parts, and then sum the Meff<br />
values from each partition, similar to how we tested<br />
our Alzheimer SNP data set. The total Meff is used in<br />
the final adjustment calculation. Due to the interblock<br />
correlations that are unlikely to be captured in<br />
this partition-ligation approach, the total Meff may<br />
be slightly conservative. However, the interblock<br />
correlations may be reduced if we partition SNPs<br />
according to their haplotype block structure.<br />
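The summation described above can be illustrated with a minimal sketch. It makes several assumptions that this section does not state: each partition's Meff is taken as the smallest number of leading eigenvalues that explain a fixed fraction of the total variance (0.995 here), and the adjusted PWER is computed with a Sidak-type formula. The function names, partition labels and eigenvalues are hypothetical and serve only as illustration.

```python
from itertools import accumulate


def meff_from_eigenvalues(eigenvalues, variance_fraction=0.995):
    """Count the leading eigenvalues needed to reach the variance fraction.

    The 0.995 default is an assumption of this sketch, not a value taken
    from the surrounding text.
    """
    ordered = sorted(eigenvalues, reverse=True)
    target = variance_fraction * sum(ordered)
    for count, running_total in enumerate(accumulate(ordered), start=1):
        if running_total >= target:
            return count
    return len(ordered)


def total_meff(eigenvalues_by_partition):
    """Sum the partition-specific Meff values into one total Meff."""
    return sum(meff_from_eigenvalues(ev) for ev in eigenvalues_by_partition.values())


def adjusted_pwer(alpha, meff):
    """Sidak-type point-wise error rate for meff independent tests (assumed form)."""
    return 1.0 - (1.0 - alpha) ** (1.0 / meff)


# Made-up eigenvalues of the SNP correlation matrix for each partition.
partitions = {
    "chr1_part1": [2.99, 0.006, 0.004],     # three SNPs in near-complete LD
    "chr1_part2": [1.0, 1.0],               # two uncorrelated SNPs
    "chr2": [3.5, 0.47, 0.02, 0.01],        # four SNPs with moderate LD
}

meff = total_meff(partitions)
threshold = adjusted_pwer(0.05, meff)
print(f"total Meff = {meff}, adjusted PWER = {threshold:.5f}")
```

With these illustrative inputs the partitions contribute 1, 2 and 3 effective tests, giving a total Meff of 6 for 9 SNPs. Because correlations between partitions are ignored, the total can only overstate the number of independent tests, which is why the resulting threshold may be slightly conservative.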
In summary, the simpleM algorithm provides a<br />
highly accurate approximation to the permutation-based<br />
correction threshold and is easily implemented.<br />
It is shown to be simple, fast and more accurate than<br />
recently developed methods and is comparable to the<br />
permutation-based correction threshold using both<br />
simulated and real SNP data. The efficiency and<br />
accuracy of the simpleM method make it an attractive<br />
choice for multiple testing adjustment when there is<br />
high intermarker LD in the SNP data set as in<br />
candidate gene or genome-wide association studies.<br />
ACKNOWLEDGMENTS<br />
This work was supported in part by NIH grants<br />
NS39764, AG019757 and AG20135 and NIEHS T32<br />
ES007126. We thank Dr. Gary Beecham who prepared<br />
the Alzheimer data for us. We thank Dr.<br />
Richard Morris for initial inspiration.<br />
Genet. Epidemiol.<br />