…of sample treatment outcomes between the two means being compared. Our work shows that range-based methods lead to a higher proportion of separations than individual p-value methods. In connection with the FDR, it appears that an old test procedure, the Newman–Keuls, which fell out of favor because it did not control FWER, does control FDR. Extensive simulation results support this conclusion; proofs are incomplete. Interpretability is the main issue I'm working on at present.

5.4 General comments on multiplicity

Although championed by the very eminent John Tukey, multiplicity was a backwater of research and the issues were ignored by many researchers. This area has become much more prominent with the recent advent of "Big Data." Technological advances in recent years, as we know, have made massive amounts of data available, bringing the desire to test thousands if not millions of hypotheses; application areas include, for example, genomics, proteomics, neurology, and astronomy. It becomes impossible to ignore the multiplicity issues in these cases, and the field has enjoyed a remarkable development within the last 20 years. Much of the development has been applied in the context of big data. The FDR as a criterion is often especially relevant in this context. Many variants of the FDR criterion have been proposed, a number of them in combination with the use of empirical Bayes methods to estimate the proportion of true hypotheses. Resampling methods are also widely used to take dependencies into account. Another recent approach involves consideration of a balance between Type I and Type II errors, often in the context of simultaneous treatment of FDR and some type of false nondiscovery rate.

Yet the problems of multiplicity are just as pressing in small-data situations, although often not recognized by practitioners in those areas. According to Young (2009), many epidemiologists feel they don't have to take multiplicity into account. Young and others claim that the great majority of apparent results in these fields are Type I errors; see Ioannidis (2005). Many of the newer approaches can't be applied satisfactorily in small-data problems.

The examples cited above (one-way ANOVA designs and the large-data problems noted) are what might be called well-structured testing problems. In general, there is a single set of hypotheses to be treated uniformly in testing, although there are variations. Most methodological research applies in this context. However, there have always been data problems of a very different kind, which might be referred to as ill-structured. These are cases in which there are hypotheses of different types, and often different importance, and it isn't clear how to structure them into families, each of which would be treated with a nominal error control measure.
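Since the FDR criterion figures so prominently in the discussion above, a small illustration may help. The following is a minimal sketch, in Python with NumPy, of the classical Benjamini–Hochberg step-up procedure, the prototypical FDR-controlling method; the function name and the example p-values are illustrative inventions, not anything from the text.

```python
# A minimal sketch of the Benjamini-Hochberg step-up procedure, which
# controls the false discovery rate at level q for independent (or
# positively dependent) p-values. Illustrative only.
import numpy as np

def benjamini_hochberg(pvalues, q=0.05):
    """Return a boolean array marking which hypotheses are rejected
    while controlling the FDR at level q."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)            # indices that sort the p-values
    sorted_p = p[order]
    # Find the largest k with p_(k) <= (k/m) * q, then reject H_(1)..H_(k).
    thresholds = q * np.arange(1, m + 1) / m
    below = np.nonzero(sorted_p <= thresholds)[0]
    reject = np.zeros(m, dtype=bool)
    if below.size > 0:
        k = below.max()              # zero-based index of the cutoff
        reject[order[: k + 1]] = True
    return reject

# Fabricated p-values for demonstration:
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.5, 0.9]
print(benjamini_hochberg(pvals, q=0.05))
```

On these ten fabricated p-values the procedure rejects only the two smallest (0.001 and 0.008). A Bonferroni correction controlling FWER at the same level would apply the far stricter threshold 0.05/10 = 0.005 to every hypothesis and reject only one, which illustrates why FDR control is the criterion of choice when thousands or millions of hypotheses are in play.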
