20.03.2013 Views

From Algorithms to Z-Scores - matloff - University of California, Davis


12.2. BIAS AND VARIANCE

people, and for $\overline{W}$ and $s$. For convenience, let’s suppose we record that last column as $s^2$ instead of $s$.

Now, say we want to estimate the population variance $\sigma^2$. As discussed earlier, the natural estimator for it would be the sample variance, $s^2$. What (12.39) says is that after looking at an infinite number of lines in the notebook, the average value of $s^2$ would be just...a...little...bit...too...small. All the $s^2$ values would average out to $0.999\sigma^2$, rather than to $\sigma^2$. We might say that $s^2$ has a little bit more tendency to underestimate $\sigma^2$ than to overestimate it.

So, (12.39) implies that $s^2$ is a biased estimator of the population variance $\sigma^2$, with the amount of bias being

$$\frac{n-1}{n} \cdot \sigma^2 - \sigma^2 = -\frac{1}{n} \cdot \sigma^2 \tag{12.40}$$
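The "notebook" thought experiment can be imitated numerically. The following is only a rough sketch, not part of the original text: it assumes a normal population with mean 0, and the values `n = 5`, `sigma = 2.0`, the number of notebook lines, and the helper names are all hypothetical choices made for illustration. It averages $s^2$ (computed with divisor $n$, as in this section) over many simulated samples and compares the result with the value $\frac{n-1}{n}\sigma^2$ claimed by (12.39).

```python
import random


def sample_variance_divisor_n(sample):
    """Sample variance with divisor n, the s^2 used in this section."""
    n = len(sample)
    mean = sum(sample) / n
    return sum((w - mean) ** 2 for w in sample) / n


def average_s2(n, sigma, lines, seed=0):
    """Average s^2 over many 'notebook lines', each a fresh sample of size n.

    Assumption: the population is normal with mean 0 and standard
    deviation sigma. Any population with variance sigma^2 would do.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(lines):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        total += sample_variance_divisor_n(sample)
    return total / lines


# Hypothetical settings; a small n makes the bias easy to see.
n, sigma = 5, 2.0
avg = average_s2(n, sigma, lines=100_000)

expected = (n - 1) / n * sigma ** 2   # value claimed by (12.39)
bias = expected - sigma ** 2          # (12.40), equal to -sigma^2 / n

print("average of s^2 over the notebook:", avg)
print("(n-1)/n * sigma^2:               ", expected)
print("bias from (12.40):               ", bias)
```

With a finite number of lines the simulated average only approximates $\frac{n-1}{n}\sigma^2$; the agreement should tighten as `lines` grows.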

Let’s prove (12.39). As before, let $W$ be a random variable distributed as the population, and let $W_1, ..., W_n$ be a random sample from that population. So, $EW_i = \mu$ and $Var(W_i) = \sigma^2$, where again $\mu$ and $\sigma^2$ are the population mean and variance.

It will be more convenient to work with $ns^2$ than $s^2$, since it will avoid a lot of dividing by $n$. So, write

\begin{align}
ns^2 &= \sum_{i=1}^{n} (W_i - \overline{W})^2 && \text{(def.)} \tag{12.41} \\
&= \sum_{i=1}^{n} \left[ (W_i - \mu) + (\mu - \overline{W}) \right]^2 && \text{(alg.)} \tag{12.42} \\
&= \sum_{i=1}^{n} (W_i - \mu)^2 + 2(\mu - \overline{W}) \sum_{i=1}^{n} (W_i - \mu) + n(\mu - \overline{W})^2 && \text{(alg.)} \tag{12.43}
\end{align}

But that middle sum is

$$\sum_{i=1}^{n} (W_i - \mu) = \sum_{i=1}^{n} W_i - n\mu = n\overline{W} - n\mu \tag{12.44}$$

So,

$$ns^2 = \sum_{i=1}^{n} (W_i - \mu)^2 - n(\overline{W} - \mu)^2 \tag{12.45}$$
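The algebra in (12.41) through (12.45) holds for any numbers $W_1, ..., W_n$ and any $\mu$, so it can be spot-checked on a single sample. The sketch below is an illustration only: the population parameters, the sample size and the variable names are hypothetical, and a normal population is assumed purely for convenience.

```python
import random

# Hypothetical population parameters and sample size.
mu, sigma, n = 10.0, 3.0, 8
rng = random.Random(1)
w = [rng.gauss(mu, sigma) for _ in range(n)]
w_bar = sum(w) / n

# Left side: n * s^2 as defined in (12.41).
lhs = sum((wi - w_bar) ** 2 for wi in w)

# The middle sum in (12.43), which (12.44) says equals n*w_bar - n*mu.
middle = sum(wi - mu for wi in w)

# Right side of (12.45).
rhs = sum((wi - mu) ** 2 for wi in w) - n * (w_bar - mu) ** 2

print("n s^2 from (12.41):  ", lhs)
print("right side of (12.45):", rhs)
print("middle sum vs. n*w_bar - n*mu:", middle, n * w_bar - n * mu)
```

The two sides agree up to floating-point rounding, as expected from an identity that involves no probability at all.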
