Answers to Homework #5 - Statistics
Homework #5 Solutions
1. (a) No; the one-vector is not in the column space of the design matrix. (b) Yes; the one-vector is in the column space of the design matrix. (c) Yes. (d) Yes. (e) No.
2. We have

       X′X = [ 5   0 ]          (X′X)⁻¹ = [ 1/5    0   ]
             [ 0  50 ]                    [  0   1/50  ] .

   We can compute the projection matrix using H = X(X′X)⁻¹X′, to get

       H = [ 19/50  16/50  13/50  10/50  −8/50 ]
           [ 16/50  14/50  12/50  10/50  −2/50 ]
           [ 13/50  12/50  11/50  10/50   4/50 ]
           [ 10/50  10/50  10/50  10/50  10/50 ]
           [ −8/50  −2/50   4/50  10/50  46/50 ] .

   The covariance matrix for the residual vector is (I − H)σ², so the variance of each individual residual can be taken from the diagonal of the hat matrix: var(e1) = (31/50)σ²; var(e2) = (36/50)σ²; var(e3) = (39/50)σ²; var(e4) = (40/50)σ²; and var(e5) = (4/50)σ². The 5th residual has substantially smaller variance than the others.

   The leverages can be taken from the diagonal of the hat matrix as well: 19/50, 14/50, 11/50, 10/50, and 46/50. Note that the 5th observation has a leverage more than twice as big as any of the others.
   Now we find the actual residual vector given the data. We get X′y = (36, 25)′, and β̂ = (7.2, 0.5)′. The residual vector is e = (4.3, −0.2, −1.7, −4.2, 1.8)′. The leave-one-out residuals, e(−i) = ei/(1 − hii), are: e(−1) = 6.94, e(−2) = −0.28, e(−3) = −2.18, e(−4) = −5.25, and e(−5) = 22.5.

   The 5th observation has a large leverage and is very influential. If we fit a line to the first four observations, we get an intercept estimate of 2.7 and a slope estimate of −2.2. When we plug in x = 6, the predicted value of y is −10.5. Note that e(−5) is the difference between the observed value at x5 and the predicted value at x5, with that observation not used in the prediction.
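The data values themselves are not printed above, but x = (−3, −2, −1, 0, 6) and y = (10, 6, 5, 3, 12) reproduce every quantity in this solution (X′y, β̂, the residuals, and the leverages). Under that assumption, the computations can be checked in R:

```r
# Reconstructed data (assumed; chosen to match X'y = (36, 25)' and the residuals above)
x <- c(-3, -2, -1, 0, 6)
y <- c(10, 6, 5, 3, 12)
X <- cbind(1, x)                     # design matrix with intercept column
XtXinv <- solve(t(X) %*% X)          # (X'X)^{-1} = diag(1/5, 1/50)
H <- X %*% XtXinv %*% t(X)           # hat matrix
h <- diag(H)                         # leverages: 19/50, 14/50, 11/50, 10/50, 46/50
betahat <- XtXinv %*% t(X) %*% y     # (7.2, 0.5)'
e <- y - X %*% betahat               # residuals (4.3, -0.2, -1.7, -4.2, 1.8)'
loo <- e / (1 - h)                   # leave-one-out residuals e_i / (1 - h_ii)
```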
3. The 95th percentile of the simulated distribution is about 7, and the test statistic is only 2.57, so we do not reject the null hypothesis. There is no evidence of a violation of the equal-variances assumption.

   Here is the R code I used:
# code to simulate the null distribution of Fmax when the
# sample sizes are 12, 8, 10, 10, 14.
nj=c(12,8,10,10,14)
nloop=100000
svec=numeric(5)
fmax=numeric(nloop)
for(i in 1:nloop){
  for(j in 1:5){
    x=rnorm(nj[j])
    svec[j]=var(x)
  }
  fmax[i]=max(svec)/min(svec)
}
[Figure: histogram of the simulated null distribution of Fmax]
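The critical value quoted above can be read off from the simulated draws; here is a minimal sketch (not part of the original listing, rerun with a smaller `nloop` and a seed for speed and reproducibility):

```r
# Rerun the Fmax simulation compactly, then take the 95th percentile
set.seed(1)
nj <- c(12, 8, 10, 10, 14)
nloop <- 5000
fmax <- replicate(nloop, {
  svec <- sapply(nj, function(n) var(rnorm(n)))  # sample variance in each group
  max(svec) / min(svec)                          # Fmax statistic
})
crit <- quantile(fmax, 0.95)   # roughly 7, matching the text
hist(fmax, breaks = 50)        # histogram of the simulated null distribution
```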
4. The fit using two parabolas with vertex at the origin is shown in (a), with the residuals in (b). The hypothesis test gives p = .044, which is significant at α = .05, but just barely. Because the residual variance seems to be growing in time, and because we expect the rust measurements to be more variable if the metal sits longer, we decide to use a weighted model. If we assume that the variance of y is proportional to x, the fit is shown in (c). It is not noticeably different from the unweighted fit; however, the p-value is now p = .074. The residuals in (d) look better, but now let's assume that the variance of y is proportional to x². The fit again looks very similar, but now we get p = .141, with residuals in (e).
   [Figure: (a) rust measurements vs. time with the two fitted parabolas; (b) residuals (res) from the unweighted fit; (c) data with the fit weighted for variance proportional to x; (d) residuals (restr) from that weighted fit; (e) residuals (restr) from the fit weighted for variance proportional to x²; each panel has an x-axis running from 1.0 to 4.0 months.]
   To explain why the results are not significant after we weight the analysis, we point out to the scientists that most of the difference in the data is seen at month 4, where the paint A measurements (green dots) are smaller than the paint B measurements. If we don't account for the fact that the variance is also higher here, these differences seem more significant. Notice that for the first month's measurements, the means are about the same. When we weight these values more, the differences in paint are not significant.

   Here is the R code for the analysis and the plots. The variable x is the time (months), a is the indicator for paint A, and y is the measurement of the rust.
plot(x-1/20+a/10,y,col=a+2)
title("(a)")
# try the IID model:
x1=x^2;x2=x^2*a
xmat=cbind(x1,x2)
xpxinv=solve(t(xmat)%*%xmat)
betahat=xpxinv%*%t(xmat)%*%y
yhat=xmat%*%betahat
sse=sum((y-yhat)^2)
tstat=betahat[2]/sqrt(sse/38*xpxinv[2,2])
# two-sided p-value
pval=(1-pt(abs(tstat),38))*2
lines(8:42/10,betahat[1]*(8:42/10)^2,col=2)
lines(8:42/10,sum(betahat)*(8:42/10)^2,col=3)
# Plot the residuals and notice "fanning out"
res=y-xmat%*%betahat
plot(x,res)
lines(c(.6,4.4),c(0,0),lty=3)
title("(b)")
# Now assume variance is proportional to x;
# get t-stat in transformed model
ytr=y/sqrt(x)
x1tr=x^2/sqrt(x)
x2tr=x^2*a/sqrt(x)
xmattr=cbind(x1tr,x2tr)
xpxinvtr=solve(t(xmattr)%*%xmattr)
betatr=xpxinvtr%*%t(xmattr)%*%ytr
yhattr=xmattr%*%betatr
ssetr=sum((ytr-yhattr)^2)
tstattr=betatr[2]/sqrt(xpxinvtr[2,2]*ssetr/38)
# two-sided p-value
pvaltr=(1-pt(abs(tstattr),38))*2
plot(x-1/20+a/10,y,col=a+2)
lines(8:42/10,betatr[1]*(8:42/10)^2,col=2)
lines(8:42/10,sum(betatr)*(8:42/10)^2,col=3)
title("(c)")
# plot residuals again -- in transformed model
restr=ytr-xmattr%*%betatr
plot(x,restr)
lines(c(.6,4.4),c(0,0),lty=3)
title("(d)")
# Now assume variance is proportional to x^2;
# get t-stat in transformed model
ytr=y/x
x1tr=x^2/x
x2tr=x^2*a/x
xmattr=cbind(x1tr,x2tr)
xpxinvtr=solve(t(xmattr)%*%xmattr)
betatr=xpxinvtr%*%t(xmattr)%*%ytr
yhattr=xmattr%*%betatr
ssetr=sum((ytr-yhattr)^2)
tstattr=betatr[2]/sqrt(xpxinvtr[2,2]*ssetr/38)
# two-sided p-value
pvaltr=(1-pt(abs(tstattr),38))*2
# plot residuals again -- in transformed model
restr=ytr-xmattr%*%betatr
plot(x,restr)
lines(c(.6,4.4),c(0,0),lty=3)
title("(e)")
5. (a) We can assume that the water evaporates linearly with time. Let yi be the amount of evaporation for the one-inch beaker removed on the ith day, and let y5+i be the amount of evaporation for the two-inch beaker removed on the ith day, for i = 1, …, 5. Let β1 be the evaporation rate (ml per day) from the one-inch beakers, and let β2 be the evaporation rate (ml per day) from the two-inch beakers. Then we can write y = Xβ + ε, where

       X = [ 1  0 ]
           [ 2  0 ]
           [ 3  0 ]
           [ 4  0 ]
           [ 5  0 ]
           [ 0  1 ]
           [ 0  2 ]
           [ 0  3 ]
           [ 0  4 ]
           [ 0  5 ] .
   (b) The null hypothesis is H0: β2 = 2β1, and the alternative is β2 ≠ 2β1. We can use c = (−2, 1)′ and write H0: c′β = 0. The test concerning linear combinations of parameters tells us that

       T = c′β̂ / sqrt( c′(X′X)⁻¹c · SSE/(n − k) )

   has a t(n − k) density under H0. Here X′X = diag(55, 55), so β̂1 = (y1 + 2y2 + 3y3 + 4y4 + 5y5)/55 and β̂2 = (y6 + 2y7 + 3y8 + 4y9 + 5y10)/55. Plugging things in to simplify, we get c′(X′X)⁻¹c = 5/55 = 1/11, n − k = 8, and

       T = [ (y6 + 2y7 + 3y8 + 4y9 + 5y10) − 2(y1 + 2y2 + 3y3 + 4y4 + 5y5) ] / 55
           ──────────────────────────────────────────────────────────────────────
                                      sqrt(SSE/88) .
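The constants in the denominator can be verified numerically; a short sketch using the design matrix from part (a):

```r
# Design matrix from part (a): days 1..5 for each beaker size
X <- cbind(c(1:5, rep(0, 5)), c(rep(0, 5), 1:5))
XtXinv <- solve(t(X) %*% X)          # diag(1/55, 1/55)
cvec <- c(-2, 1)                     # H0: c'beta = 0, i.e. beta2 = 2*beta1
cXc <- t(cvec) %*% XtXinv %*% cvec   # c'(X'X)^{-1}c = 5/55 = 1/11
df <- nrow(X) - ncol(X)              # n - k = 8
```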
6. Let β0 be the level of white blood cells before the injection, let β1 be the rate of change (per hour) after the injection, and let β = (β0, β1)′. If x is the number of hours after 9:00, then we have the design matrix

       X = [ 1  0 ]
           [ 1  0 ]
           [ 1  0 ]
           [ 1  0 ]
           [ 1  1 ]
           [ 1  2 ]
           [ 1  3 ]
           [ 1  4 ] ,

   and the model is y = Xβ + ε.
7. (a) If β = (β1, β2, β3, β4)′, then we can make

       X = [ 1  0  1  0 ]
           [ 1  0  1  0 ]
           [ 1  0  0  1 ]
           [ 1  0  0  1 ]
           [ 0  1  0  1 ]
           [ 0  1  0  1 ]
           [ 0  1  1  0 ]
           [ 0  1  1  0 ] .

   (b) There are three dimensions in the row space. We can use rows to find estimable linear combinations! Two are β1 + β3 and β1 + β4. We cannot estimate β1 because (1, 0, 0, 0) is not in the row space. Similarly, we can't estimate β1 + β2.
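These estimability claims can be checked numerically (a sketch, not part of the original solution): c′β is estimable exactly when c lies in the row space of X, i.e. when appending c as an extra row does not increase the rank.

```r
# Design matrix from part (a)
X <- matrix(c(1,0,1,0,  1,0,1,0,  1,0,0,1,  1,0,0,1,
              0,1,0,1,  0,1,0,1,  0,1,1,0,  0,1,1,0),
            ncol = 4, byrow = TRUE)
# c'beta is estimable iff adding c as a row leaves the rank unchanged
estimable <- function(cvec) qr(rbind(X, cvec))$rank == qr(X)$rank
estimable(c(1, 0, 1, 0))   # beta1 + beta3: TRUE
estimable(c(1, 0, 0, 0))   # beta1 alone:   FALSE
estimable(c(1, 1, 0, 0))   # beta1 + beta2: FALSE
```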
   (c) We can simply use the first three dummies in the model, so that β̃ = (β1, β2, β3)′, and

       X̃ = [ 1  0  1 ]
            [ 1  0  1 ]
            [ 1  0  0 ]
            [ 1  0  0 ]
            [ 0  1  0 ]
            [ 0  1  0 ]
            [ 0  1  1 ]
            [ 0  1  1 ] .
   (d) β1 is the expected tumor size after two weeks of Treatment A, low dose; β2 is the expected tumor size after two weeks of Treatment B, low dose; β3 is the expected increase in tumor size after two weeks of treatment using the high dose compared to the low dose, for either A or B.
8. Let x = (1, 2, 3, 2, 3, 4)′, d_a = (1, 1, 1, 0, 0, 0)′, and 1 = (1, 1, 1, 1, 1, 1)′. Then we want to project the response y onto the linear space L(1, d_a, x). We know that we will have "confounding" because the pot B plants are given more fertilizer. Let's compute the VIF for fertilizer. To do this, we get the R² for the model using x as the response and d_a (and 1) as predictors. We get x̂ = (2, 2, 2, 3, 3, 3)′, so SSE = 4. Also, SST = 5.5, and R² = 1 − 4/5.5 = 3/11. So V = 1/(1 − R²) = 11/8.
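The same VIF can be obtained directly from a regression of the fertilizer amounts on the pot indicator; a brief sketch:

```r
# Regress fertilizer (x) on the pot A indicator (da) to get R^2, then the VIF
x  <- c(1, 2, 3, 2, 3, 4)
da <- c(1, 1, 1, 0, 0, 0)
r2  <- summary(lm(x ~ da))$r.squared   # R^2 = 3/11
vif <- 1 / (1 - r2)                    # VIF = 11/8
```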
9. Note that the design matrix is not full rank; we expect to get a zero eigenvalue. We solve

       det [ 4−λ    2     2  ]
           [  2    2−λ    0  ] = −λ(λ − 2)(λ − 6) = 0
           [  2     0    2−λ ]

   to get ordered eigenvalues 6, 2, and 0. To get the eigenvector associated with λ = 6, we solve

       [ 4  2  2 ] [ x1 ]     [ x1 ]
       [ 2  2  0 ] [ x2 ] = 6 [ x2 ] .
       [ 2  0  2 ] [ x3 ]     [ x3 ]

   We get v1 = (2/√6, 1/√6, 1/√6)′. Similarly, v2 = (0, 1/√2, −1/√2)′, and v3 = (1/√3, −1/√3, −1/√3)′.
   We have chosen length one for all of the eigenvectors; also note that they form an orthogonal set. The spectral decomposition is:

       [ 4  2  2 ]   [ 2/√6    0     1/√3 ] [ 6  0  0 ] [ 2/√6   1/√6   1/√6 ]
       [ 2  2  0 ] = [ 1/√6   1/√2  −1/√3 ] [ 0  2  0 ] [  0     1/√2  −1/√2 ]
       [ 2  0  2 ]   [ 1/√6  −1/√2  −1/√3 ] [ 0  0  0 ] [ 1/√3  −1/√3  −1/√3 ] .
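As a check (a sketch, not part of the original solution), R's eigen() reproduces the eigenvalues and the decomposition:

```r
A <- matrix(c(4, 2, 2,
              2, 2, 0,
              2, 0, 2), nrow = 3, byrow = TRUE)
ev <- eigen(A)                # eigenvalues in decreasing order: 6, 2, 0
V  <- ev$vectors              # orthonormal eigenvectors (columns)
A_rebuilt <- V %*% diag(ev$values) %*% t(V)   # spectral decomposition recovers A
```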