
Homework #5 Solutions

1. (a) No; the one-vector is not in the column space of the design matrix. (b) Yes; the one-vector is in the column space of the design matrix. (c) Yes. (d) Yes. (e) No.

2. We have

   X′X = [ 5   0 ]        and  (X′X)⁻¹ = [ 1/5    0   ]
         [ 0  50 ],                      [  0   1/50 ].

   We can compute the projection matrix using H = X(X′X)⁻¹X′ to get

   H = [ 19/50  16/50  13/50  10/50  −8/50 ]
       [ 16/50  14/50  12/50  10/50  −2/50 ]
       [ 13/50  12/50  11/50  10/50   4/50 ]
       [ 10/50  10/50  10/50  10/50  10/50 ]
       [ −8/50  −2/50   4/50  10/50  46/50 ].

The covariance matrix of the residual vector is (I − H)σ², so the variance of each individual residual can be read from the diagonal of the hat matrix: var(e1) = (31/50)σ², var(e2) = (36/50)σ², var(e3) = (39/50)σ², var(e4) = (40/50)σ², and var(e5) = (4/50)σ². The 5th residual has substantially smaller variance than the others.

The leverages can be read from the diagonal of the hat matrix as well: 19/50, 14/50, 11/50, 10/50, and 46/50. Note that the 5th observation has a leverage more than twice as large as any of the others.

Now we find the actual residual vector given the data. We get X′y = (36, 25)′ and β̂ = (7.2, 0.5)′. The residual vector is e = (4.3, −0.2, −1.7, −4.2, 1.8)′. The leave-one-out residuals, e(−i) = ei/(1 − hii), are: e(−1) = 6.94, e(−2) = −0.28, e(−3) = −2.18, e(−4) = −5.25, and e(−5) = 22.5.

The 5th observation has a large leverage and is very influential. If we fit a line to the first four observations, we get an intercept estimate of 2.7 and a slope estimate of −2.2. When we plug in x = 6, the predicted value of y is −10.5. Note that e(−5) is the difference between the observed value at x5 and the predicted value at x5, with that observation not used in the prediction.
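As a check, the numbers above can be reproduced in exact arithmetic. The sketch below uses hypothetical data x = (−3, −2, −1, 0, 6) and y = (10, 6, 5, 3, 12); these values are not stated in this solution, but they are consistent with X′X = diag(5, 50), X′y = (36, 25)′, and the hat matrix above:

```python
from fractions import Fraction as F

# Hypothetical data, reconstructed to be consistent with the solution's
# X'X = diag(5, 50), X'y = (36, 25)', and hat-matrix diagonal.
x = [F(v) for v in (-3, -2, -1, 0, 6)]
y = [F(v) for v in (10, 6, 5, 3, 12)]
n, sxx = len(x), sum(xi * xi for xi in x)

# Least squares for the centered model y = b0 + b1*x (the x's sum to
# zero, so the normal equations decouple).
b0 = sum(y) / n
b1 = sum(xi * yi for xi, yi in zip(x, y)) / sxx

# Leverages h_ii = 1/n + x_i^2/Sxx, residuals, leave-one-out residuals.
h = [F(1, n) + xi * xi / sxx for xi in x]
e = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
loo = [ei / (1 - hi) for ei, hi in zip(e, h)]

print([str(hi) for hi in h])  # leverages 19/50, 14/50, 11/50, 10/50, 46/50
print(float(loo[4]))          # 22.5
```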

3. The 95th percentile of the simulated distribution is about 7, and the test statistic is only 2.57, so we do not reject the null hypothesis. There is no evidence of a violation of the equal-variances assumption.

Here is the R code I used:



# Code to simulate the null distribution of Fmax when the
# sample sizes are 12, 8, 10, 10, 14.
nj = c(12, 8, 10, 10, 14)
nloop = 100000
svec = numeric(5)
fmax = numeric(nloop)
for (i in 1:nloop) {
  for (j in 1:5) {
    x = rnorm(nj[j])
    svec[j] = var(x)
  }
  fmax[i] = max(svec) / min(svec)
}
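For readers without R, a rough Python equivalent of the same simulation (a sketch, with a fixed seed and fewer replicates than the R run):

```python
import random
import statistics

# Simulate the null distribution of Fmax = max(s_j^2)/min(s_j^2)
# for 5 normal samples with sizes 12, 8, 10, 10, 14.
random.seed(1)
nj = [12, 8, 10, 10, 14]
nloop = 20000  # fewer replicates than the R run; still enough for a cutoff

fmax = []
for _ in range(nloop):
    svec = [statistics.variance(random.gauss(0, 1) for _ in range(n))
            for n in nj]
    fmax.append(max(svec) / min(svec))

fmax.sort()
q95 = fmax[int(0.95 * nloop)]  # simulated 95th percentile, roughly 7
```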

[Figure: histogram of the simulated Fmax distribution]

4. The fit using two parabolas with vertex at the origin is shown in panel (a), with the residuals in panel (b). The hypothesis test gives p = .044, which is significant at α = .05, but just barely. Because the residual variance seems to grow over time, and because we expect the rust measurements to be more variable the longer the metal sits, we decide to use a weighted model. If we assume that the variance of y is proportional to x, the fit is shown in panel (c). It is not noticeably different from the unweighted fit; however, the p-value is now p = .074. The residuals look better, but now let's assume that the variance of y is proportional to x². The fit again looks very similar, but now we get p = .141.

[Figure, panels (a)–(e), x-axis months 1–4: (a) data and unweighted fit; (b) residuals from the unweighted fit; (c) data and weighted fit (variance ∝ x); (d) residuals from the weighted fit (variance ∝ x); (e) residuals from the weighted fit (variance ∝ x²)]

To explain why the results are not significant after we weight the analysis, we point out to the scientists that most of the difference in the data is seen at month 4, where the paint A measurements (green dots) are smaller than the paint B measurements. If we don't account for the fact that the variance is also higher there, these differences seem more significant. Notice that for the first month's measurements, the means are about the same. When we give these values more weight, the differences between paints are not significant.


Here is the R code for the analysis and the plots. The variable x is the time (months), a is the indicator for paint A, and y is the measurement of the rust.

plot(x - 1/20 + a/10, y, col = a + 2)
title("(a)")
# try the IID model:
x1 = x^2; x2 = x^2 * a
xmat = cbind(x1, x2)
xpxinv = solve(t(xmat) %*% xmat)
betahat = xpxinv %*% t(xmat) %*% y
yhat = xmat %*% betahat
sse = sum((y - yhat)^2)
tstat = betahat[2] / sqrt(sse/38 * xpxinv[2, 2])
# two-sided p-value
pval = (1 - pt(abs(tstat), 38)) * 2
lines(8:42/10, betahat[1] * (8:42/10)^2, col = 2)
lines(8:42/10, sum(betahat) * (8:42/10)^2, col = 3)
# Plot the residuals and notice "fanning out"
res = y - xmat %*% betahat
plot(x, res)
lines(c(.6, 4.4), c(0, 0), lty = 3)
title("(b)")
# Now assume variance is proportional to x;
# get t-stat in transformed model
ytr = y / sqrt(x)
x1tr = x^2 / sqrt(x)
x2tr = x^2 * a / sqrt(x)
xmattr = cbind(x1tr, x2tr)
xpxinvtr = solve(t(xmattr) %*% xmattr)
betatr = xpxinvtr %*% t(xmattr) %*% ytr
yhattr = xmattr %*% betatr
ssetr = sum((ytr - yhattr)^2)
tstattr = betatr[2] / sqrt(xpxinvtr[2, 2] * ssetr/38)
# two-sided p-value
pvaltr = (1 - pt(abs(tstattr), 38)) * 2
plot(x - 1/20 + a/10, y, col = a + 2)
lines(8:42/10, betatr[1] * (8:42/10)^2, col = 2)
lines(8:42/10, sum(betatr) * (8:42/10)^2, col = 3)
title("(c)")
# plot residuals again -- in transformed model
restr = ytr - xmattr %*% betatr
plot(x, restr)
lines(c(.6, 4.4), c(0, 0), lty = 3)
title("(d)")
# Now assume variance is proportional to x^2;
# get t-stat in transformed model
ytr = y / x
x1tr = x^2 / x
x2tr = x^2 * a / x
xmattr = cbind(x1tr, x2tr)
xpxinvtr = solve(t(xmattr) %*% xmattr)
betatr = xpxinvtr %*% t(xmattr) %*% ytr
yhattr = xmattr %*% betatr
ssetr = sum((ytr - yhattr)^2)
tstattr = betatr[2] / sqrt(xpxinvtr[2, 2] * ssetr/38)
# two-sided p-value
pvaltr = (1 - pt(abs(tstattr), 38)) * 2
# plot residuals again -- in transformed model
restr = ytr - xmattr %*% betatr
plot(x, restr)
lines(c(.6, 4.4), c(0, 0), lty = 3)
title("(e)")

5. (a) We can assume that the water evaporates linearly with time. Let yi be the amount of evaporation for the one-inch beaker on the ith day, and let y(5+i) be the amount of evaporation for the two-inch beaker removed on the ith day, for i = 1, ..., 5. Let β1 be the evaporation rate (ml per day) from the one-inch beakers, and let β2 be the evaporation rate (ml per day) from the two-inch beakers. Then we can write y = Xβ + ε, where

   X = [ 1 0 ]
       [ 2 0 ]
       [ 3 0 ]
       [ 4 0 ]
       [ 5 0 ]
       [ 0 1 ]
       [ 0 2 ]
       [ 0 3 ]
       [ 0 4 ]
       [ 0 5 ].

(b) The null hypothesis is H0: β2 = 2β1, and the alternative is β2 ≠ 2β1. We can use c = (−2, 1)′ and write H0: c′β = 0. The test concerning linear combinations of parameters tells us that

   T = c′β̂ / √( c′(X′X)⁻¹c · SSE/(n − k) )

has a t(n − k) density under H0. Here X′X = diag(55, 55), so plugging things in to simplify, we get c′(X′X)⁻¹c = (4 + 1)/55 = 1/11, n − k = 8, and

   T = [ (y6 + 2y7 + 3y8 + 4y9 + 5y10) − 2(y1 + 2y2 + 3y3 + 4y4 + 5y5) ] / [ 55 √(SSE/88) ].
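The arithmetic for c′(X′X)⁻¹c can be verified exactly; this sketch builds the design matrix from part (a):

```python
from fractions import Fraction as F

# Design matrix from part (a): days 1..5 for each beaker type.
X = [[F(i), F(0)] for i in range(1, 6)] + [[F(0), F(i)] for i in range(1, 6)]

# X'X is diagonal here (the two columns never overlap), so inverting it
# just inverts the diagonal entries.
xtx = [[sum(row[i] * row[j] for row in X) for j in range(2)] for i in range(2)]
c = [F(-2), F(1)]
quad = sum(ci * ci / xtx[i][i] for i, ci in enumerate(c))

print([[int(v) for v in row] for row in xtx])  # [[55, 0], [0, 55]]
print(quad)                                    # 1/11
```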

6. Let β0 be the level of white blood cells before the injection, let β1 be the rate of change (per hour) after the injection, and let β = (β0, β1)′. If x is the number of hours after 9:00, then we have the design matrix

   X = [ 1 0 ]
       [ 1 0 ]
       [ 1 0 ]
       [ 1 0 ]
       [ 1 1 ]
       [ 1 2 ]
       [ 1 3 ]
       [ 1 4 ],

and the model is y = Xβ + ε.
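As a minimal sketch, the design matrix and X′X can be built from the measurement times (four readings before the injection at x = 0, then 1 to 4 hours after):

```python
from fractions import Fraction as F

# Hours after 9:00 for the eight measurements, matching the matrix above.
hours = [0, 0, 0, 0, 1, 2, 3, 4]
X = [[F(1), F(h)] for h in hours]  # intercept column, then hours column

# X'X collects n, sum(x), and sum(x^2).
xtx = [[sum(row[i] * row[j] for row in X) for j in range(2)] for i in range(2)]
print([[int(v) for v in row] for row in xtx])  # [[8, 10], [10, 30]]
```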


7. If β = (β1, β2, β3, β4)′, then we can make

   X = [ 1 0 1 0 ]
       [ 1 0 1 0 ]
       [ 1 0 0 1 ]
       [ 1 0 0 1 ]
       [ 0 1 0 1 ]
       [ 0 1 0 1 ]
       [ 0 1 1 0 ]
       [ 0 1 1 0 ].

(b) The row space has three dimensions. We can use rows to find estimable linear combinations; two are β1 + β3 and β1 + β4. We cannot estimate β1 because (1, 0, 0, 0) is not in the row space. Similarly, we cannot estimate β1 + β2.
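These estimability claims can be checked numerically: c′β is estimable exactly when c lies in the row space of X, i.e. when appending c as an extra row does not raise the rank. A sketch in exact arithmetic:

```python
from fractions import Fraction as F

def rank(rows):
    """Rank via Gaussian elimination over exact rationals."""
    M = [[F(v) for v in row] for row in rows]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

X = [[1, 0, 1, 0], [1, 0, 1, 0], [1, 0, 0, 1], [1, 0, 0, 1],
     [0, 1, 0, 1], [0, 1, 0, 1], [0, 1, 1, 0], [0, 1, 1, 0]]

print(rank(X))                    # 3
print(rank(X + [[1, 0, 0, 0]]))   # 4 -> beta_1 is not estimable
print(rank(X + [[1, 1, 0, 0]]))   # 4 -> beta_1 + beta_2 is not estimable
print(rank(X + [[1, 0, 1, 0]]))   # 3 -> beta_1 + beta_3 is estimable
```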

(c) We can simply use the first three dummies in the model, so that β̃ = (β1, β2, β3)′ and

   X̃ = [ 1 0 1 ]
        [ 1 0 1 ]
        [ 1 0 0 ]
        [ 1 0 0 ]
        [ 0 1 0 ]
        [ 0 1 0 ]
        [ 0 1 1 ]
        [ 0 1 1 ].

(d) β1 is the expected tumor size after two weeks of Treatment A, low dose; β2 is the expected tumor size after two weeks of Treatment B, low dose; and β3 is the expected increase in tumor size after two weeks of treatment using the high dose compared to the low dose, for either A or B.

8. Let x = (1, 2, 3, 2, 3, 4)′, da = (1, 1, 1, 0, 0, 0)′, and 1 = (1, 1, 1, 1, 1, 1)′. Then we want to project the response y onto the linear space L(1, da, x). We know that we will have "confounding" because the pot B plants are given more fertilizer. Let's compute the VIF for fertilizer. To do this, we get the R² for the model using x as the response and da (and 1) as predictors. We get x̂ = (2, 2, 2, 3, 3, 3)′, so SSE = 4. Also, SST = 5.5, so R² = 1 − 4/5.5 = 3/11 and V = 1/(1 − R²) = 11/8.
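The VIF arithmetic can be verified exactly with a short sketch:

```python
from fractions import Fraction as F

x = [F(v) for v in (1, 2, 3, 2, 3, 4)]
da = [1, 1, 1, 0, 0, 0]

# Regressing x on the intercept and d_a just fits the two group means.
mean_a = sum(xi for xi, d in zip(x, da) if d) / 3
mean_b = sum(xi for xi, d in zip(x, da) if not d) / 3
xhat = [mean_a if d else mean_b for d in da]

sse = sum((xi - xh) ** 2 for xi, xh in zip(x, xhat))
xbar = sum(x) / 6
sst = sum((xi - xbar) ** 2 for xi in x)
r2 = 1 - sse / sst
vif = 1 / (1 - r2)

print(xhat == [2, 2, 2, 3, 3, 3], sse, float(sst))  # True 4 5.5
print(r2, vif)                                      # 3/11 11/8
```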

9. Note that the design matrix is not full rank, so we expect a zero eigenvalue. We solve

   det [ 4 − λ    2      2   ]
       [   2    2 − λ    0   ]  =  −λ(λ − 2)(λ − 6)  =  0
       [   2      0    2 − λ ]

to get ordered eigenvalues 6, 2, and 0. To get the eigenvector associated with λ = 6, we solve

   [ 4 2 2 ] [ x1 ]      [ x1 ]
   [ 2 2 0 ] [ x2 ]  = 6 [ x2 ].
   [ 2 0 2 ] [ x3 ]      [ x3 ]

We get v1 = (2/√6, 1/√6, 1/√6)′. Similarly, v2 = (0, 1/√2, −1/√2)′ and v3 = (1/√3, −1/√3, −1/√3)′.

We have chosen length one for all of the eigenvectors; note also that they form an orthogonal set. The spectral decomposition is:

   [ 4 2 2 ]   [ 2/√6    0     1/√3 ] [ 6 0 0 ] [ 2/√6   1/√6   1/√6 ]
   [ 2 2 0 ] = [ 1/√6   1/√2  −1/√3 ] [ 0 2 0 ] [  0     1/√2  −1/√2 ].
   [ 2 0 2 ]   [ 1/√6  −1/√2  −1/√3 ] [ 0 0 0 ] [ 1/√3  −1/√3  −1/√3 ]
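A quick exact check of the eigenpairs and their orthogonality (using the unnormalized eigenvectors, since rescaling by 1/√6, 1/√2, 1/√3 does not affect Av = λv):

```python
from fractions import Fraction as F

A = [[4, 2, 2], [2, 2, 0], [2, 0, 2]]

def matvec(M, v):
    return [sum(F(M[i][j]) * v[j] for j in range(3)) for i in range(3)]

# Unnormalized eigenvectors from the solution, paired with eigenvalues.
pairs = [(6, [2, 1, 1]), (2, [0, 1, -1]), (0, [1, -1, -1])]
for lam, v in pairs:
    assert matvec(A, v) == [lam * vi for vi in v]  # A v = lambda v

# The eigenvectors are mutually orthogonal: all pairwise dot products vanish.
vs = [v for _, v in pairs]
dots = [sum(a * b for a, b in zip(vs[i], vs[j]))
        for i in range(3) for j in range(i + 1, 3)]
print(dots)  # [0, 0, 0]
```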
