STA 36-786: Bayesian Theoretical Statistics I
Assignment 3: Spring 2013<br />
Due: Tuesday February 26 at 10:30 a.m.<br />
Show all your work to obtain full/partial credit. You are not to consult outside sources other<br />
than your class notes/slides/reference books for this assignment (except for the instructor<br />
or TA). No late assignments will be accepted. Please follow the instructions for writing up<br />
solutions given out on 1.12.13. Start each problem on a new page.<br />
1. Let<br />
Y |θ ∼ Exp(θ)<br />
θ ∼ Gamma(a, b).<br />
Suppose we have a new observation Ỹ |θ ∼ Exp(θ), where conditional on θ, Y and Ỹ<br />
are independent. Show that<br />
$$p(\tilde{y} \mid y) = \frac{b(a+1)(by+1)^{a+1}}{(b\tilde{y} + by + 1)^{a+2}},$$<br />
where a is an integer. (Note that this is a valid density function that integrates to 1.)<br />
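Solution: Here $\mathrm{Exp}(\theta)$ has rate $\theta$ and $\mathrm{Gamma}(a, b)$ has shape $a$ and scale $b$. Observe that<br />
$$p(\theta \mid y) \propto p(\theta)\,p(y \mid \theta) \propto \left(\theta^{a-1} e^{-b^{-1}\theta}\right)\left(\theta e^{-\theta y}\right) = \theta^{a} e^{-(b^{-1}+y)\theta}.$$<br />
Thus $\theta \mid y \sim \mathrm{Gamma}\left(a+1, (b^{-1}+y)^{-1}\right)$. Next, recall that<br />
$$p(\tilde{y} \mid y) = \int_0^\infty p(\tilde{y} \mid \theta)\,p(\theta \mid y)\,d\theta = \int_0^\infty \theta e^{-\theta\tilde{y}}\,\frac{(b^{-1}+y)^{a+1}}{\Gamma(a+1)}\,\theta^{a} e^{-(b^{-1}+y)\theta}\,d\theta$$<br />
$$= \frac{(b^{-1}+y)^{a+1}}{\Gamma(a+1)} \int_0^\infty \theta^{a+1} e^{-(b^{-1}+y+\tilde{y})\theta}\,d\theta = \frac{(b^{-1}+y)^{a+1}}{\Gamma(a+1)} \cdot \frac{\Gamma(a+2)}{(b^{-1}+y+\tilde{y})^{a+2}} = \frac{b(a+1)(1+by)^{a+1}}{(1+by+b\tilde{y})^{a+2}},$$<br />
where the last step uses $\Gamma(a+2) = (a+1)\Gamma(a+1)$ and multiplies the numerator and denominator by $b^{a+2}$.<br />
Observe that<br />
$$\int_0^\infty p(\tilde{y} \mid y)\,d\tilde{y} = \int_0^\infty \frac{b(a+1)(1+by)^{a+1}}{(1+by+b\tilde{y})^{a+2}}\,d\tilde{y} = \left. -\frac{(1+by)^{a+1}}{(1+by+b\tilde{y})^{a+1}} \right|_0^\infty = 1.$$<br />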
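The closed form can be sanity-checked numerically. The sketch below compares it with a midpoint-rule approximation of the integral over θ. The function names, the integration limit, and the values of a, b, y and ỹ are illustrative assumptions, not part of the assignment.<br />

```python
import math


def predictive_closed_form(y_new, y, a, b):
    """Closed-form p(y_new | y) from the derivation (b is the Gamma scale)."""
    return b * (a + 1) * (1 + b * y) ** (a + 1) / (1 + b * y + b * y_new) ** (a + 2)


def predictive_by_quadrature(y_new, y, a, b, upper=60.0, n_steps=20000):
    """Midpoint-rule approximation of the integral of p(y_new|theta) p(theta|y)."""
    rate = 1.0 / b + y  # posterior rate b^{-1} + y
    log_norm = (a + 1) * math.log(rate) - math.lgamma(a + 1)
    h = upper / n_steps
    total = 0.0
    for k in range(n_steps):
        theta = (k + 0.5) * h
        log_posterior = log_norm + a * math.log(theta) - rate * theta
        total += theta * math.exp(-theta * y_new + log_posterior)
    return total * h


# Hypothetical values chosen only for illustration.
a, b, y, y_new = 3, 2.0, 1.5, 0.7
closed = predictive_closed_form(y_new, y, a, b)
approx = predictive_by_quadrature(y_new, y, a, b)
print(closed, approx)
```

The upper limit of 60 is an assumption that works here because the posterior rate is 2, so the truncated tail is negligible; other parameter values may need a different limit.<br />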
2. Suppose<br />
$$X_1, \ldots, X_n \mid \theta \overset{\text{iid}}{\sim} \mathrm{Poisson}(\theta).$$<br />
(a) Find Jeffreys’ prior. Is it proper or improper?<br />
(b) Find $p(\theta \mid x_1, \ldots, x_n)$ under Jeffreys’ prior.<br />
Solution:<br />
(a) Since $X_1, \ldots, X_n \mid \theta$ are iid, $I_{X_1,\ldots,X_n}(\theta) = n I_{X_1}(\theta) \propto I_{X_1}(\theta)$. Hence, to<br />
determine Jeffreys’ prior, it is sufficient to compute the Fisher information for a<br />
single observation. Since $X_1 \mid \theta \sim \mathrm{Poisson}(\theta)$,<br />
$$\log f(X_1 \mid \theta) = -\theta + X_1 \log\theta - \log\Gamma(X_1 + 1).$$<br />
Hence,<br />
$$I(\theta) = -E\left[\left.\frac{d^2 \log f(X_1 \mid \theta)}{d\theta^2}\,\right|\,\theta\right] = E\left[X_1 \theta^{-2} \mid \theta\right] = \theta^{-1}.$$<br />
Thus, $p(\theta) \propto I(\theta)^{1/2} = \theta^{-1/2}$. Moreover,<br />
$$\int_0^\infty \theta^{-1/2}\,d\theta = \left. 2\theta^{1/2} \right|_0^\infty = \infty.$$<br />
Since $p(\theta)$ is not integrable, Jeffreys’ prior in this case is improper.<br />
(b)<br />
$$p(\theta \mid x_1, \ldots, x_n) \propto p(\theta)\,p(x_1, \ldots, x_n \mid \theta) \propto \theta^{-1/2} \prod_{i=1}^{n} \frac{e^{-\theta}\theta^{x_i}}{\Gamma(x_i + 1)} \propto \theta^{n\bar{x} - 1/2} e^{-n\theta}.$$<br />
Conclude that $\theta \mid x_1, \ldots, x_n \sim \mathrm{Gamma}\left(n\bar{x} + \tfrac{1}{2}, n^{-1}\right)$, with shape $n\bar{x} + \tfrac{1}{2}$ and scale $n^{-1}$.<br />
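As a quick check of both parts, the sketch below evaluates the Fisher information numerically as the expected squared score and computes the posterior parameters for a small sample. The function names, the truncation point of the sum, and the counts are hypothetical and chosen only for illustration.<br />

```python
import math


def poisson_fisher_information(theta, x_max=200):
    """E[(d/dtheta log f(X|theta))^2], summed over x = 0..x_max."""
    total = 0.0
    for x in range(x_max + 1):
        log_pmf = -theta + x * math.log(theta) - math.lgamma(x + 1)
        score = x / theta - 1.0  # derivative of the log pmf in theta
        total += math.exp(log_pmf) * score ** 2
    return total


def jeffreys_posterior(xs):
    """(shape, scale) of the Gamma posterior under p(theta) proportional to theta^(-1/2)."""
    n = len(xs)
    return sum(xs) + 0.5, 1.0 / n


# Hypothetical counts chosen only for illustration.
xs = [2, 0, 3, 1]
shape, scale = jeffreys_posterior(xs)
posterior_mean = shape * scale
fisher_at_2 = poisson_fisher_information(2.0)
print(shape, scale, posterior_mean, fisher_at_2)
```

The expected squared score and the negative expected second derivative agree for the Poisson model, so the numerical value should be close to 1/θ.<br />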
3. Consider dose response models. The setup is the following: animals are tested in the<br />
development of drugs or other chemical compounds. An experimenter administers various<br />
dose levels to k batches of animals. The response variable is a dichotomous (binary)<br />
outcome, for example alive or dead, or tumor or no tumor. Let $x_i$ represent<br />
the $i$th dose level, $n_i$ the number of animals receiving the $i$th dose, and $y_i$ the number<br />
of positive outcomes among those $n_i$ animals.<br />
(a) Suppose that $y_i \overset{\text{ind}}{\sim} \mathrm{Binomial}(n_i, \theta_i)$, where $\theta_i$ is the probability of death (or<br />
tumor) for an animal that receives dose $x_i$. The typical model for<br />
$\theta_i$ is a logistic regression. That is, we suppose that $\mathrm{logit}(\theta_i) = \alpha + \beta x_i$. Write<br />
out the likelihood in a simple form (it will contain a product).<br />
(b) Find Jeffreys’ prior for (α, β). Also, write down the equations you need to solve<br />
to find the posterior mode of (α, β) under a uniform prior on (α, β).<br />
Solution:<br />
(a) It follows from $\mathrm{logit}(\theta_i) = \alpha + \beta x_i$ that $\theta_i = \frac{\exp(\alpha + \beta x_i)}{1 + \exp(\alpha + \beta x_i)}$. Hence, from the<br />
problem description, $y_i \mid (\alpha, \beta) \sim \mathrm{Binomial}\left(n_i, \frac{\exp(\alpha + \beta x_i)}{1 + \exp(\alpha + \beta x_i)}\right)$. Conclude that<br />
$$p(y_1, \ldots, y_k \mid (\alpha, \beta)) = \prod_{i=1}^{k} \binom{n_i}{y_i} \left(\frac{\exp(\alpha + \beta x_i)}{1 + \exp(\alpha + \beta x_i)}\right)^{y_i} \left(\frac{1}{1 + \exp(\alpha + \beta x_i)}\right)^{n_i - y_i}$$<br />
$$= \prod_{i=1}^{k} \binom{n_i}{y_i} \left(\exp(\alpha + \beta x_i)\right)^{y_i} \left(1 + \exp(\alpha + \beta x_i)\right)^{-n_i}.$$<br />
(b) Using the previous item, observe that:<br />
$$\log p(y_1, \ldots, y_k \mid (\alpha, \beta)) = \sum_{i=1}^{k} \log\binom{n_i}{y_i} + \sum_{i=1}^{k} y_i(\alpha + \beta x_i) - \sum_{i=1}^{k} n_i \log\left(1 + \exp(\alpha + \beta x_i)\right).$$<br />
Hence, the gradient of the log-likelihood is:<br />
$$\frac{\partial \log p(y_1, \ldots, y_k \mid (\alpha, \beta))}{\partial \alpha} = k\bar{y} - \sum_{i=1}^{k} \frac{n_i \exp(\alpha + \beta x_i)}{1 + \exp(\alpha + \beta x_i)},$$<br />
$$\frac{\partial \log p(y_1, \ldots, y_k \mid (\alpha, \beta))}{\partial \beta} = \sum_{i=1}^{k} x_i y_i - \sum_{i=1}^{k} \frac{n_i x_i \exp(\alpha + \beta x_i)}{1 + \exp(\alpha + \beta x_i)},$$<br />
where $\bar{y} = k^{-1}\sum_{i=1}^{k} y_i$.<br />
Under a uniform prior on (α, β), the posterior is proportional to the likelihood.<br />
Thus, the posterior mode coincides with the MLE. To find the<br />
MLE, we must solve the system of equations obtained by setting the gradient to 0.<br />
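The gradient formulas can be checked against central finite differences of the log-likelihood. The following is a minimal sketch; the dose levels, batch sizes, outcome counts, and function names are hypothetical and are not taken from the assignment.<br />

```python
import math

# Hypothetical toy data (not from the assignment): doses, batch sizes, positives.
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
ns = [10, 10, 10, 10, 10]
ys = [1, 3, 5, 8, 9]


def log_likelihood(alpha, beta):
    """Binomial-logistic log-likelihood, including the binomial coefficients."""
    total = 0.0
    for x, n, y in zip(xs, ns, ys):
        eta = alpha + beta * x
        total += math.log(math.comb(n, y)) + y * eta - n * math.log1p(math.exp(eta))
    return total


def gradient(alpha, beta):
    """Analytic partial derivatives with respect to alpha and beta."""
    d_alpha = float(sum(ys))
    d_beta = sum(x * y for x, y in zip(xs, ys))
    for x, n in zip(xs, ns):
        p = 1.0 / (1.0 + math.exp(-(alpha + beta * x)))  # exp(eta) / (1 + exp(eta))
        d_alpha -= n * p
        d_beta -= n * x * p
    return d_alpha, d_beta


def numeric_gradient(alpha, beta, eps=1e-6):
    """Central finite differences of the log-likelihood."""
    d_alpha = (log_likelihood(alpha + eps, beta) - log_likelihood(alpha - eps, beta)) / (2 * eps)
    d_beta = (log_likelihood(alpha, beta + eps) - log_likelihood(alpha, beta - eps)) / (2 * eps)
    return d_alpha, d_beta


analytic = gradient(0.2, 1.0)
numeric = numeric_gradient(0.2, 1.0)
print(analytic, numeric)
```

Agreement of the two gradients at an arbitrary point is only a consistency check on the formulas; it does not locate the mode.<br />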
The Hessian matrix of the log-likelihood is:<br />
$$H \log p(y_1, \ldots, y_k \mid (\alpha, \beta)) = \begin{bmatrix} -\sum_{i=1}^{k} \frac{n_i \exp(\alpha + \beta x_i)}{(1 + \exp(\alpha + \beta x_i))^2} & -\sum_{i=1}^{k} \frac{n_i x_i \exp(\alpha + \beta x_i)}{(1 + \exp(\alpha + \beta x_i))^2} \\ -\sum_{i=1}^{k} \frac{n_i x_i \exp(\alpha + \beta x_i)}{(1 + \exp(\alpha + \beta x_i))^2} & -\sum_{i=1}^{k} \frac{n_i x_i^2 \exp(\alpha + \beta x_i)}{(1 + \exp(\alpha + \beta x_i))^2} \end{bmatrix}$$<br />
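Since the Hessian matrix does not depend on the $y_i$, its expectation equals the matrix itself, so<br />
$$I(\alpha, \beta) = -H(\alpha, \beta) = \begin{bmatrix} \sum_{i=1}^{k} \frac{n_i \exp(\alpha + \beta x_i)}{(1 + \exp(\alpha + \beta x_i))^2} & \sum_{i=1}^{k} \frac{n_i x_i \exp(\alpha + \beta x_i)}{(1 + \exp(\alpha + \beta x_i))^2} \\ \sum_{i=1}^{k} \frac{n_i x_i \exp(\alpha + \beta x_i)}{(1 + \exp(\alpha + \beta x_i))^2} & \sum_{i=1}^{k} \frac{n_i x_i^2 \exp(\alpha + \beta x_i)}{(1 + \exp(\alpha + \beta x_i))^2} \end{bmatrix}$$<br />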
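The system obtained by setting the gradient to 0 has no closed-form solution. One option, which the solution above does not prescribe and which is used here only as an assumption, is Newton-Raphson with the update (α, β) ← (α, β) + I(α, β)⁻¹ × gradient. The toy data, starting point, tolerance, and function names below are hypothetical.<br />

```python
import math

# Hypothetical toy data (not from the assignment): doses, batch sizes, positives.
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
ns = [10, 10, 10, 10, 10]
ys = [1, 3, 5, 8, 9]


def score_and_information(alpha, beta):
    """Gradient of the log-likelihood and the entries of I(alpha, beta)."""
    g_a = g_b = 0.0
    i_aa = i_ab = i_bb = 0.0
    for x, n, y in zip(xs, ns, ys):
        p = 1.0 / (1.0 + math.exp(-(alpha + beta * x)))
        w = n * p * (1.0 - p)  # n_i exp(eta_i) / (1 + exp(eta_i))^2
        g_a += y - n * p
        g_b += x * (y - n * p)
        i_aa += w
        i_ab += w * x
        i_bb += w * x * x
    return (g_a, g_b), (i_aa, i_ab, i_bb)


def posterior_mode(alpha=0.0, beta=0.0, tol=1e-10, max_iter=50):
    """Newton-Raphson iterations; under a uniform prior the mode is the MLE."""
    for _ in range(max_iter):
        (g_a, g_b), (i_aa, i_ab, i_bb) = score_and_information(alpha, beta)
        det = i_aa * i_bb - i_ab ** 2
        # Solve I * step = gradient using the explicit 2x2 inverse.
        step_a = (i_bb * g_a - i_ab * g_b) / det
        step_b = (i_aa * g_b - i_ab * g_a) / det
        alpha, beta = alpha + step_a, beta + step_b
        if abs(step_a) + abs(step_b) < tol:
            break
    return alpha, beta


alpha_hat, beta_hat = posterior_mode()
print(alpha_hat, beta_hat)
```

Because the observed and expected information coincide in this model, these iterations are the same as Fisher scoring. Convergence from the starting point (0, 0) is not guaranteed for every data set, for example when the outcomes are perfectly separated by dose.<br />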
Conclude that the Jeffreys prior is<br />
$$p(\alpha, \beta) \propto |I(\alpha, \beta)|^{1/2} = \left( \sum_{i,j} \frac{n_i \exp(\alpha + \beta x_i)\, n_j \exp(\alpha + \beta x_j)\,(x_j^2 - x_i x_j)}{(1 + \exp(\alpha + \beta x_i))^2 (1 + \exp(\alpha + \beta x_j))^2} \right)^{1/2}$$<br />
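The double sum is the determinant of I(α, β) written out term by term. A short sketch can confirm that the two expressions agree numerically; the dose levels, batch sizes, evaluation point, and function names are hypothetical.<br />

```python
import math

# Hypothetical dose levels and batch sizes (the prior does not involve the y_i).
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
ns = [10, 10, 10, 10, 10]


def weights(alpha, beta):
    """w_i = n_i exp(eta_i) / (1 + exp(eta_i))^2 with eta_i = alpha + beta x_i."""
    result = []
    for x, n in zip(xs, ns):
        e = math.exp(alpha + beta * x)
        result.append(n * e / (1.0 + e) ** 2)
    return result


def jeffreys_via_determinant(alpha, beta):
    """Unnormalized Jeffreys prior as the square root of det I(alpha, beta)."""
    w = weights(alpha, beta)
    i_aa = sum(w)
    i_ab = sum(wi * x for wi, x in zip(w, xs))
    i_bb = sum(wi * x * x for wi, x in zip(w, xs))
    return math.sqrt(i_aa * i_bb - i_ab ** 2)


def jeffreys_via_double_sum(alpha, beta):
    """Same quantity from the double sum over i and j."""
    w = weights(alpha, beta)
    total = 0.0
    for wi, xi in zip(w, xs):
        for wj, xj in zip(w, xs):
            total += wi * wj * (xj ** 2 - xi * xj)
    return math.sqrt(total)


by_det = jeffreys_via_determinant(0.3, 1.2)
by_sum = jeffreys_via_double_sum(0.3, 1.2)
print(by_det, by_sum)
```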
4. Consider the Galenshore distribution (it’s just a transformed Gamma density). That<br />
is, let $Y \mid \theta \sim \mathrm{Galenshore}(a, \theta)$. Then<br />
$$p(y \mid \theta) = \frac{2}{\Gamma(a)}\,\theta^{2a} y^{2a-1} e^{-\theta^2 y^2}, \qquad y > 0,\ a > 0,\ \theta > 0.$$<br />
(a) Consider<br />
$$Y \mid \theta \sim \mathrm{Galenshore}(a, \theta), \quad a \text{ known},\ \theta \text{ unknown},$$<br />
$$\theta \sim \mathrm{Galenshore}(c, d).$$<br />
Find the posterior distribution of $\theta \mid y$.<br />
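Solution:<br />
$$p(\theta \mid y) \propto \frac{2}{\Gamma(a)}\,\theta^{2a} y^{2a-1} e^{-\theta^2 y^2} \cdot \frac{2}{\Gamma(c)}\,d^{2c}\,\theta^{2c-1} e^{-d^2\theta^2}$$<br />
$$\propto \theta^{2a} e^{-\theta^2 y^2}\,\theta^{2c-1} e^{-d^2\theta^2}$$<br />
$$= \theta^{2(a+c)-1} e^{-\theta^2 (y^2 + d^2)}.$$<br />
Thus, $\theta \mid y \sim \mathrm{Galenshore}\left(a + c, \sqrt{y^2 + d^2}\right)$.<br />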
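The conjugacy result can be checked numerically: if the posterior is Galenshore(a + c, √(y² + d²)), then the product of likelihood and prior divided by that density must be the same constant for every θ. The sketch below assumes hypothetical values for a, c, d and y and a hypothetical helper `galenshore_pdf`.<br />

```python
import math


def galenshore_pdf(y, a, theta):
    """Galenshore(a, theta) density evaluated at y > 0."""
    return 2.0 / math.gamma(a) * theta ** (2 * a) * y ** (2 * a - 1) * math.exp(-(theta ** 2) * y ** 2)


# Hypothetical values chosen only for illustration.
a, c, d, y = 2.0, 1.5, 0.8, 1.2
post_a = a + c
post_theta = math.sqrt(y ** 2 + d ** 2)


def unnormalized_posterior(theta):
    """Likelihood p(y | theta) times the Galenshore(c, d) prior on theta."""
    return galenshore_pdf(y, a, theta) * galenshore_pdf(theta, c, d)


# The ratio should not depend on theta; its common value is the marginal p(y).
ratios = [
    unnormalized_posterior(t) / galenshore_pdf(t, post_a, post_theta)
    for t in (0.3, 0.9, 1.7)
]
print(ratios)
```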