
8 Nonparametric and Robust Inference

and

$$
\hat{p}_H(y) \;=\; \sum_{k=1}^{m} \frac{n_k}{n v_k}\, I_{T_k}(y). \tag{8.33}
$$

As we have noted already, this is a bona fide estimator:

$$
\hat{p}_H(y) \ge 0
$$

and

$$
\int_{\mathbb{R}^d} \hat{p}_H(y)\,\mathrm{d}y \;=\; \sum_{k=1}^{m} \frac{n_k}{n v_k}\, v_k \;=\; 1.
$$
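As a quick numerical check of the bona fide property, here is a minimal NumPy sketch for the case $d = 1$ with unequal bin widths; the uniform sample and the particular bin edges are arbitrary illustrative choices, not from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.uniform(size=1000)             # n = 1000 observations on [0, 1)
n = y.size

# Unequal-width bins T_k = [t_k, t_{k+1}) covering the support.
edges = np.array([0.0, 0.1, 0.3, 0.35, 0.6, 1.0])
n_k, _ = np.histogram(y, bins=edges)   # bin counts n_k
v_k = np.diff(edges)                   # bin "volumes" v_k (widths, since d = 1)

# Histogram estimate on each bin, as in (8.33).
p_hat = n_k / (n * v_k)
```

Since every observation falls in exactly one bin, $\sum_k n_k = n$, so $\sum_k \hat{p}_k v_k = 1$ and each $\hat{p}_k \ge 0$.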

Although our discussion generally concerns observations on multivariate random variables, we should occasionally consider simple univariate observations. One reason the univariate case is simpler is that the derivative is a scalar function. Another reason we use the univariate case as a model is that it is easier to visualize. The graph of the density of a univariate random variable is two-dimensional, and densities of other types of random variables are of higher dimension, so only in the univariate case can the density estimates be graphed directly.

In the univariate case, we assume that the support is the finite interval $[a, b]$. We partition $[a, b]$ into a grid of $m$ nonoverlapping bins $T_k = [t_{n,k}, t_{n,k+1})$ where

$$
a = t_{n,1} < t_{n,2} < \cdots < t_{n,m+1} = b.
$$

The univariate histogram is

$$
\hat{p}_H(y) \;=\; \sum_{k=1}^{m} \frac{n_k}{n\,(t_{n,k+1} - t_{n,k})}\, I_{T_k}(y). \tag{8.34}
$$

If the bins are of equal width, say $h$ (that is, $t_{n,k} = t_{n,k-1} + h$), the histogram is

$$
\hat{p}_H(y) = \frac{n_k}{nh}, \quad \text{for } y \in T_k.
$$
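The equal-width formula can be implemented directly. The following sketch (the function name `histogram_density` and the uniform sample are my own illustrative choices, not from the text) locates the bin $T_k$ containing a point $y_0$ and returns $n_k/(nh)$.

```python
import numpy as np

def histogram_density(y0, data, a, b, m):
    """Equal-width histogram estimate p_H(y0) = n_k / (n h) on [a, b] with m bins."""
    n = data.size
    h = (b - a) / m                        # common bin width
    k = min(int((y0 - a) // h), m - 1)     # index of the bin containing y0
    lo, hi = a + k * h, a + (k + 1) * h    # T_k = [lo, hi)
    n_k = np.count_nonzero((data >= lo) & (data < hi))
    return n_k / (n * h)

rng = np.random.default_rng(1)
data = rng.uniform(size=500)
p = histogram_density(0.37, data, a=0.0, b=1.0, m=10)
```

Evaluating the estimate at one point per bin and summing, weighted by $h$, recovers the unit integral, since the bins partition the support.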

This class of functions consists of polynomial splines of degree 0 with fixed knots, and the histogram is the maximum likelihood estimator over the class of step functions. Generalized versions of the histogram can be defined with respect to splines of higher degree. Splines with degree higher than 1 may yield negative estimators, but such histograms are also maximum likelihood estimators over those classes of functions.

The histogram as we have defined it is sometimes called a “density histogram”, whereas a “frequency histogram” is not normalized by $n$.
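In NumPy terms (an illustrative aside, not from the text), `np.histogram` returns the frequency histogram by default and the density histogram with `density=True`:

```python
import numpy as np

rng = np.random.default_rng(2)
y = rng.uniform(size=200)
n = y.size
edges = np.linspace(0.0, 1.0, 6)       # 5 equal-width bins, h = 0.2
h = edges[1] - edges[0]

freq, _ = np.histogram(y, bins=edges)                 # frequency histogram: counts n_k
dens, _ = np.histogram(y, bins=edges, density=True)   # density histogram: n_k / (n h)
```

The frequency histogram sums to $n$, while the density histogram integrates to 1.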

Theory of Statistics © 2000–2013 James E. Gentle
