11.07.2015 Views

2DkcTXceO


A.P. Dempster

24.5 Nonparametric inference

When I became a PhD student in the mid-50s, Sam Wilks suggested to me that the topic of “nonparametric” or “distribution-free” statistical inference had been largely worked through in the 1940s, in no small part through his efforts, implying that I might want to look elsewhere for a research topic. I conclude here by sketching how DS could introduce new thinking that goes back to the roots of this important topic.

A use of binomial sampling probabilities similar to that in my coin-tossing example arises in connection with sampling a univariate continuous observable. In a 1939 obituary for “Student” (W.S. Gosset), Fisher recalled that Gosset had somewhere remarked that given a random sample of size 2 with a continuous observable, the probability is 1/2 that the population median lies between the observations, with the remaining probabilities 1/4 and 1/4 evenly split between the two sides of the data. In a footnote, Fisher pointed out how Student’s remark could be generalized to use binomial sampling probabilities to locate with computed probabilistic uncertainty any nominated population quantile in each of the n + 1 intervals determined by the data. In DS terms, the same ordered uniformly distributed auxiliaries used in connection with “binomial” sampling a dichotomy extend easily to provide marginal mass distribution posteriors for any unknown quantile of the population distribution, not just the quantiles at the observed data points.
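The binomial calculation behind Student's remark and Fisher's generalization can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from the text: the function names `quantile_interval_probs` and `simulate_interval_freqs` are hypothetical, and the simulation assumes a Uniform(0, 1) population purely for convenience, since the interval probabilities do not depend on the continuous population chosen. The idea is that the p-quantile falls in interval i (counting from the left tail, i = 0, ..., n) exactly when i of the n observations fall below it, which is a Binomial(n, p) event.

```python
import random
from math import comb


def quantile_interval_probs(n, p):
    """Binomial probabilities that the population p-quantile lies in each of
    the n + 1 intervals determined by n ordered observations.

    Interval i (i = 0..n) is bounded by the i-th and (i+1)-th order
    statistics, with the two tails as intervals 0 and n.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]


def simulate_interval_freqs(n, p, trials=20000, seed=0):
    """Monte Carlo check, assuming a Uniform(0, 1) population (illustrative).

    For that population the p-quantile is p itself, so the interval holding
    the quantile is indexed by the number of draws that fall below p.
    """
    rng = random.Random(seed)
    counts = [0] * (n + 1)
    for _ in range(trials):
        below = sum(rng.random() < p for _ in range(n))
        counts[below] += 1
    return [c / trials for c in counts]


# Student's case: a sample of size 2 and the median (p = 1/2).
print(quantile_interval_probs(2, 0.5))  # [0.25, 0.5, 0.25]
```

Calling `quantile_interval_probs(30, 0.9)`, for example, would give the corresponding probabilities for the upper decile across the 31 intervals of a sample of size 30, and `simulate_interval_freqs` can be used to compare those values against simulated frequencies.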
When the DS analysis is extended to placing an arbitrary population quantile in intervals other than those exactly determined by the observations, (p, q, r) inferences arise that in general have r > 0, including r = 1 for assertions concerning the population CDF in regions in the tails of the data beyond the largest and smallest observations. In addition, DS would have allowed Student to extend his analysis to predict that a third sample draw will lie in each of the three regions determined by the data with equal probabilities 1/3, or more generally with equal probabilities 1/(n + 1) in the n + 1 regions determined by n observations.

The “nonparametric” analysis that Fisher pioneered in his 1939 footnote serves to illustrate DS-ECP logic in action. It can also serve to illustrate the need for critical examination of particular models and consequent analyses. Consider the situation of a statistician faced with analysis of a sample of modest size, such as n = 30, where a casual examination of the data suggests that the underlying population distribution has a smooth CDF but does not conform to an obvious simple parametric form such as Gaussian. After plotting the data, it would not be surprising to see that the lengths of intervals between successive ordered data points over the middle ranges of the data vary by a factor of two or three. The nonparametric model asserts that these intervals have equal probabilities 1/(n + 1) = 1/31 of containing the next draw, but a broker offering both sides of bets based on these probabilities would soon be losing money because takers would bet with higher and lower probabilities for
