28.07.2014 Views

Answers to Homework #7 - Statistics


Homework #7 solutions

Stat 640

1. True. Let the columns of $X_0$ be a basis for $V_0$. Then the projection of $y$ onto $V_0$ is

$$\Pi(y \mid V_0) = X_0(X_0'X_0)^{-1}X_0'y = X_0(X_0'X_0)^{-1}X_0'\hat{y} + X_0(X_0'X_0)^{-1}X_0'(y - \hat{y}),$$

and the second term on the right is zero because the columns of $X_0$ are contained in $V$ and hence orthogonal to $y - \hat{y}$.
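The identity in problem 1 can be checked numerically. The sketch below is only an illustration and not part of the original solution: the matrices `X` and `X0`, the vector `y`, and the helper `proj` are all made-up names and simulated values.

```r
set.seed(1)
n  <- 10
X  <- cbind(1, rnorm(n), rnorm(n))   # columns span V (arbitrary example)
X0 <- X[, 1:2]                       # columns span V0, a subspace of V
y  <- rnorm(n)

# Hypothetical helper: projection of v onto the column space of A
proj <- function(A, v) A %*% solve(crossprod(A), crossprod(A, v))

yhat <- proj(X, y)                   # projection of y onto V
# Projecting y or yhat onto V0 should give the same vector
gap <- max(abs(proj(X0, y) - proj(X0, yhat)))
print(gap)
```

The printed gap should be zero up to rounding error, since $y - \hat{y}$ is orthogonal to every column of `X0`.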

2. Let the parameter vector be $(\beta, \alpha_1, \alpha_2, \alpha_3)$, where $\beta$ is the joint slope and the $\alpha$s are the expected dry weights for the middle fertilizer amount. (Note that we use the “centered” coding for fertilizer values.) We will find simultaneous confidence intervals for $\eta = c'\beta$ where $c_1 = 0$ and $c_2 + c_3 + c_4 = 0$. Thus, $q = 2$. Because of the way we have coded the variables, $X'X$ is a diagonal matrix with values 10, 5, 5, 5. For any of the desired pairwise contrasts, we have $c'(X'X)^{-1}c = 2/5$, and $F_{0.95}(2, 11) = 3.98$. For these data, we get $S = 2.612$, so that we can compare all pairwise differences in the $\alpha$ parameters to

$$S\sqrt{q\,c'(X'X)^{-1}c\,F_{0.95}(q, n - k)} = 4.66.$$

The estimates are $\hat{\alpha}_1 = 45.4$, $\hat{\alpha}_2 = 46.8$, and $\hat{\alpha}_3 = 50.6$. Therefore, we find that the expected dry weights using soils A and C are significantly different: their difference of 5.2 exceeds 4.66, while the other two pairwise differences (1.4 and 3.8) do not.
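The arithmetic in problem 2 can be reproduced in a few lines of R. This is a sketch using only the numbers quoted in the solution; the variable names are my own, and labeling the three estimates A, B, C in that order is an assumption.

```r
S      <- 2.612    # residual standard deviation quoted in the solution
q      <- 2        # dimension of the space of contrasts
cXXc   <- 2 / 5    # c'(X'X)^{-1}c for any pairwise contrast
df_res <- 11       # n - k

# Scheffe-type threshold for a pairwise difference
half_width <- S * sqrt(q * cXXc * qf(0.95, q, df_res))

# Estimates from the solution; the soil labels are assumed
alpha_hat <- c(A = 45.4, B = 46.8, C = 50.6)
diffs <- outer(alpha_hat, alpha_hat, "-")
significant <- abs(diffs) > half_width

print(round(half_width, 2))
print(significant)
```

Only the A versus C entries of `significant` should come out `TRUE`.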

3. Here is my code; it should work for any k and n. I get 4.05 for the 95th percentile.

# Simulate the null distribution of the studentized range
# for k groups with n observations each
nloop=100000
q=numeric(nloop)
k=4
n=5
xm=numeric(k);xv=numeric(k)
for(iloop in 1:nloop){
  x=matrix(rnorm(k*n),nrow=k)   # each row holds one group
  for(i in 1:k){
    xm[i]=mean(x[i,])
    xv[i]=var(x[i,])
  }
  sp=mean(xv)                   # pooled variance (equal group sizes)
  dmax=0
  for(i in 1:(k-1)){
    for(j in (i+1):k){
      dist=abs(xm[i]-xm[j])
      if(dist>dmax){dmax=dist}
    }
  }
  q[iloop]=dmax/sqrt(sp/n)      # largest pairwise difference in standard-error units
}
hist(q)
sort(q)[round(0.95*nloop)]      # empirical 95th percentile

[Figure: Histogram of q. Horizontal axis: q (0 to 10); vertical axis: Frequency.]
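As a cross-check on the simulation in problem 3, the statistic being simulated (largest pairwise difference in means divided by the pooled standard error) is the studentized range, and base R provides its quantile function as `qtukey`. The sketch below assumes the same `k` and `n` as the code above.

```r
k <- 4
n <- 5
# 95th percentile of the studentized range with k means and k(n-1) error df
q95 <- qtukey(0.95, nmeans = k, df = k * (n - 1))
print(round(q95, 2))
```

This should agree with the simulated 95th percentile of about 4.05, up to Monte Carlo error.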

4. Whatever $j$ we choose, we will have $\mu = (6, \ldots, 6, 10, \ldots, 10, 12, \ldots, 12, 8, \ldots, 8)'$, so that $\mu_0 = (8, \ldots, 8, 10, \ldots, 10)'$ and $\|\mu - \mu_0\|^2 = 16j$. Then $\delta = 16j/4 = 4j$. The critical value for the test is $F_{0.95}(2, 4(j - 1))$. Let's try some values of $j$.

For $j = 3$, we have $F_{0.95}(2, 8) = 4.459$, $\delta = 12$, and $P(F > 4.459) = .72$ under $H_a$.

For $j = 4$, we have $F_{0.95}(2, 12) = 3.885$, $\delta = 16$, and $P(F > 3.885) = .89$ under $H_a$.

For $j = 5$, we have $F_{0.95}(2, 16) = 3.634$, $\delta = 20$, and $P(F > 3.634) = .96$ under $H_a$.
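The power values in problem 4 can be computed with the noncentral F distribution in R. The sketch below assumes that $\delta$ is used directly as the `ncp` argument of `pf`; the function name `power4` is my own.

```r
# Power of the level-0.05 F test in problem 4 with j observations per group
power4 <- function(j) {
  df1   <- 2
  df2   <- 4 * (j - 1)
  delta <- 4 * j                      # noncentrality parameter, 16j / 4
  crit  <- qf(0.95, df1, df2)         # critical value under H0
  1 - pf(crit, df1, df2, ncp = delta) # P(F > crit) under Ha
}

js <- 3:5
powers <- sapply(js, power4)
print(round(powers, 2))
```

The printed values can be compared with the three powers listed above for $j = 3, 4, 5$.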

5. We have $\mu = (16, \ldots, 16, 20, \ldots, 20)'$, where there are $2j$ elements equal to 16 and $3j$ elements equal to 20, and $\mu_0 = (18.4, \ldots, 18.4)'$. Then $\|\mu - \mu_0\|^2 = 2j(2.4)^2 + 3j(1.6)^2 = 19.2j$, so $\delta = 19.2j/25 = .768j$.

For $j = 5$, we have $F_{0.95}(4, 20) = 2.866$, $\delta = 3.84$, and $P(F > 2.866) = .25$ under $H_a$.

For $j = 10$, we have $F_{0.95}(4, 45) = 2.579$, $\delta = 7.68$, and $P(F > 2.579) = .53$ under $H_a$.

For $j = 20$, we have $F_{0.95}(4, 95) = 2.467$, $\delta = 15.36$, and $P(F > 2.467) = .88$ under $H_a$.

For $j = 17$, we have $F_{0.95}(4, 80) = 2.486$, $\delta = 13.06$, and $P(F > 2.486) = .81$ under $H_a$.
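Rather than trying values of $j$ by hand in problem 5, one can search for the smallest $j$ that reaches a given power. The sketch below assumes a target power of 0.8, which the solution does not state explicitly; `power5` and `target` are names I made up.

```r
# Power of the level-0.05 F test in problem 5 with j observations per group
power5 <- function(j) {
  df1   <- 4
  df2   <- 5 * (j - 1)
  delta <- 0.768 * j                  # noncentrality parameter, 19.2j / 25
  1 - pf(qf(0.95, df1, df2), df1, df2, ncp = delta)
}

target <- 0.8   # assumed target power
j <- 2
while (power5(j) < target) j <- j + 1
print(c(j = j, power = round(power5(j), 3)))
```

The loop stops at the first $j$ whose power is at least `target`; changing `target` gives the sample size for other power requirements.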

6. (a) Let $\beta_0$ be the expected mercury measurement at the plant, let $\beta_1$ be the downstream rate of increase of mercury, and let $\beta_2$ be the upstream rate of increase of mercury. Suppose the measurements are ordered so that $y_1$ is the measurement two kilometers downstream, $y_9$ is the measurement two kilometers upstream, and the others are consecutive up the river. Then $y = X\beta + \epsilon$, where $\beta = (\beta_0, \beta_1, \beta_2)'$ and

$$X = \begin{pmatrix}
1 & 2 & 0 \\
1 & 1.5 & 0 \\
1 & 1 & 0 \\
1 & .5 & 0 \\
1 & 0 & 0 \\
1 & 0 & .5 \\
1 & 0 & 1 \\
1 & 0 & 1.5 \\
1 & 0 & 2
\end{pmatrix}.$$

(b) $H_0: \beta_1 = \beta_2 = 0$; $H_a$: at least one is not zero.


(c) Let $SSE_0$ be $\sum_{i=1}^{9} (y_i - \bar{y})^2$, the sum of squared residuals under $H_0$. Let $SSE_1$ be the sum of squared residuals under the full model in (a). Then

$$F = \frac{(SSE_0 - SSE_1)/2}{SSE_1/6}$$

is distributed as $F(2, 6)$ if the null hypothesis is really true.
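The F statistic in (c) can be computed in R from the two fitted models. The sketch below uses simulated measurements, since no data are given here: the vector `y` is hypothetical (generated under $H_0$), and the names `dist_km`, `down`, and `up` are my own.

```r
set.seed(640)
dist_km <- seq(-2, 2, by = 0.5)   # negative = downstream, positive = upstream
down <- pmax(-dist_km, 0)         # second column of X in (a)
up   <- pmax(dist_km, 0)          # third column of X in (a)
y <- 700 + rnorm(9, sd = 10)      # hypothetical data, simulated under H0

fit0 <- lm(y ~ 1)                 # null model: common mean
fit1 <- lm(y ~ down + up)         # full model from (a)
sse0 <- sum(resid(fit0)^2)
sse1 <- sum(resid(fit1)^2)

f_stat  <- ((sse0 - sse1) / 2) / (sse1 / 6)
p_value <- 1 - pf(f_stat, 2, 6)
print(c(F = f_stat, p = p_value))
```

Calling `anova(fit0, fit1)` reports the same F statistic, which is a convenient check on the hand computation.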

(d) If the environmentalists are correct,

$$E(y) = \mu = (.75, .8125, .875, .9375, 1, .875, .75, .625, .5) \times 800 = (600, 650, 700, 750, 800, 700, 600, 500, 400).$$

For the null hypothesis $\mu_0$, we project $\mu$ onto the one-vector, to get $(633.33)\mathbf{1}$, and $\|\mu - \mu_0\|^2 = 125000$. The estimate of the model variance is $\sigma^2 = 100$, so $\delta = 125000/100 = 1250$. We will reject $H_0$ if our $F$-statistic is greater than $F_{0.95}(2, 6) = 5.143$. The power of the test is about one!
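The numbers in (d) can be verified with a short R sketch. It uses the design matrix from (a) and the mean vector from (d); the coefficient vector `c(800, -100, -200)` is my own reading of $\mu$ (800 at the plant, falling 100 per kilometer downstream and 200 per kilometer upstream), not something stated in the solution.

```r
X <- cbind(1,
           c(2, 1.5, 1, 0.5, 0, 0, 0, 0, 0),
           c(0, 0, 0, 0, 0, 0.5, 1, 1.5, 2))
mu  <- c(600, 650, 700, 750, 800, 700, 600, 500, 400)
mu0 <- rep(mean(mu), 9)            # projection of mu onto the one-vector

# mu lies in the column space of X (assumed coefficients, see lead-in)
fit_gap <- max(abs(X %*% c(800, -100, -200) - mu))

sigma2 <- 100
delta  <- sum((mu - mu0)^2) / sigma2
crit   <- qf(0.95, 2, 6)
power  <- 1 - pf(crit, 2, 6, ncp = delta)
print(c(delta = delta, crit = round(crit, 3), power = power))
```

With a noncentrality parameter this large, the computed power is indistinguishable from one.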
