STA 36-786: Bayesian Theoretical Statistics I


Assignment 3: Spring 2013

Due: Tuesday, February 26 at 10:30 a.m.

Show all your work to obtain full/partial credit. You are not to consult outside sources other than your class notes/slides/reference books for this assignment (except for the instructor or TA). No late assignments will be accepted. Please follow the instructions for writing up solutions given out on 1.12.13. Start each problem on a new page.

1. Let

$$Y \mid \theta \sim \text{Exp}(\theta), \qquad \theta \sim \text{Gamma}(a, b).$$

Suppose we have a new observation $\tilde{Y} \mid \theta \sim \text{Exp}(\theta)$, where conditional on $\theta$, $Y$ and $\tilde{Y}$ are independent. Show that

$$p(\tilde{y} \mid y) = \frac{b(a+1)(by+1)^{a+1}}{(b\tilde{y} + by + 1)^{a+2}},$$

where $a$ is an integer. (Note that this is a valid density function that integrates to 1.)

Solution: Observe that

$$p(\theta \mid y) \propto p(\theta)\,p(y \mid \theta) \propto \left(\theta^{a-1} e^{-b^{-1}\theta}\right)\left(\theta e^{-\theta y}\right) = \theta^{a} e^{-(b^{-1}+y)\theta}.$$

Thus $\theta \mid y \sim \text{Gamma}\left(a+1, (b^{-1}+y)^{-1}\right)$. Next, recall that

$$
\begin{aligned}
p(\tilde{y} \mid y) &= \int_0^\infty p(\tilde{y} \mid \theta)\,p(\theta \mid y)\,d\theta
= \int_0^\infty \theta e^{-\theta\tilde{y}}\,\frac{(b^{-1}+y)^{a+1}}{\Gamma(a+1)}\,\theta^{a} e^{-(b^{-1}+y)\theta}\,d\theta \\
&= \frac{(b^{-1}+y)^{a+1}}{\Gamma(a+1)} \int_0^\infty \theta^{a+1} e^{-(b^{-1}+y+\tilde{y})\theta}\,d\theta \\
&= \frac{(b^{-1}+y)^{a+1}}{\Gamma(a+1)} \cdot \frac{\Gamma(a+2)}{(b^{-1}+y+\tilde{y})^{a+2}}
= \frac{b(a+1)(1+by)^{a+1}}{(1+by+b\tilde{y})^{a+2}}.
\end{aligned}
$$

Observe that

$$\int_0^\infty p(\tilde{y} \mid y)\,d\tilde{y} = \int_0^\infty \frac{b(a+1)(1+by)^{a+1}}{(1+by+b\tilde{y})^{a+2}}\,d\tilde{y} = \left. -\frac{(1+by)^{a+1}}{(1+by+b\tilde{y})^{a+1}} \right|_0^\infty = 1.$$
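The closed form for Problem 1 can be sanity-checked numerically. The following is a minimal sketch, not part of the original solution: it compares the closed-form predictive density with a midpoint-rule approximation of the integral over $\theta$. The function names, the integration limits, and the test values of `a`, `b`, `y` are all illustrative assumptions.

```python
import math

def predictive_closed_form(y_new, y, a, b):
    """Closed-form p(y_new | y); b is the scale of the Gamma(a, b) prior."""
    return b * (a + 1) * (1 + b * y) ** (a + 1) / (1 + b * y + b * y_new) ** (a + 2)

def predictive_numeric(y_new, y, a, b, upper=40.0, steps=50_000):
    """Midpoint-rule approximation of the integral of p(y_new | theta) p(theta | y)."""
    rate = 1.0 / b + y                           # posterior rate of theta | y
    norm = rate ** (a + 1) / math.gamma(a + 1)   # Gamma(a + 1, rate) normalizing constant
    h = upper / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h
        likelihood = theta * math.exp(-theta * y_new)            # Exp(theta) density at y_new
        posterior = norm * theta ** a * math.exp(-rate * theta)  # density of theta | y
        total += likelihood * posterior
    return total * h

def predictive_cdf(y_new, y, a, b):
    """CDF implied by the antiderivative used in the normalization check."""
    return 1.0 - ((1 + b * y) / (1 + b * y + b * y_new)) ** (a + 1)
```

For example, `predictive_closed_form(1.2, 0.7, 2, 1.5)` and `predictive_numeric(1.2, 0.7, 2, 1.5)` should agree closely, and `predictive_cdf` should approach 1 as `y_new` grows.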

2. Suppose

$$X_1, \ldots, X_n \mid \theta \overset{iid}{\sim} \text{Poisson}(\theta).$$

(a) Find Jeffreys' prior. Is it proper or improper?

(b) Find $p(\theta \mid x_1, \ldots, x_n)$ under Jeffreys' prior.

Solution:

(a) Since $(X_1, \ldots, X_n) \mid \theta$ are iid, $I_{X_1,\ldots,X_n}(\theta) = n I_{X_1}(\theta) \propto I_{X_1}(\theta)$. Hence, in order to determine Jeffreys' prior, it is sufficient to compute the Fisher information for a single observation. Since $X_1 \mid \theta \sim \text{Poisson}(\theta)$,

$$\log f(X_1 \mid \theta) = -\theta + X_1 \log(\theta) - \log \Gamma(X_1 + 1).$$

Hence,

$$I(\theta) = -E\left[ \left. \frac{d^2 \log f(X_1 \mid \theta)}{d\theta^2} \,\right|\, \theta \right] = E\left[X_1 \theta^{-2} \mid \theta\right] = \theta^{-1}.$$

Thus, $p(\theta) \propto I(\theta)^{1/2} = \theta^{-1/2}$. Moreover,

$$\int_0^\infty \theta^{-1/2}\,d\theta = \left. 2\theta^{1/2} \right|_0^\infty = \infty.$$

Since $p(\theta)$ is not integrable, Jeffreys' prior in this case is improper.

(b)

$$p(\theta \mid x_1, \ldots, x_n) \propto p(\theta)\,p(x_1, \ldots, x_n \mid \theta) \propto \theta^{-1/2} \prod_{i=1}^{n} \frac{e^{-\theta}\theta^{x_i}}{\Gamma(x_i + 1)} \propto \theta^{n\bar{x} - 1/2} e^{-n\theta}.$$

Conclude that $\theta \mid x_1, \ldots, x_n \sim \text{Gamma}\left(n\bar{x} + \tfrac{1}{2},\, n^{-1}\right)$.
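Both steps of Problem 2 lend themselves to a quick numerical check. The sketch below is an illustration only: it approximates $E[X_1\theta^{-2} \mid \theta]$ by a truncated sum over the Poisson pmf and returns the Gamma posterior parameters (shape, scale) under Jeffreys' prior. The helper names and the truncation point `x_max` are assumptions made for this example.

```python
import math

def poisson_pmf(x, theta):
    """Poisson(theta) pmf, computed on the log scale for stability."""
    return math.exp(-theta + x * math.log(theta) - math.lgamma(x + 1))

def fisher_information(theta, x_max=200):
    """Approximate I(theta) = E[X / theta^2] by truncating the sum at x_max."""
    return sum(x / theta ** 2 * poisson_pmf(x, theta) for x in range(x_max + 1))

def jeffreys_posterior(xs):
    """(shape, scale) of the Gamma posterior under p(theta) proportional to theta^(-1/2)."""
    n = len(xs)
    return sum(xs) + 0.5, 1.0 / n
```

With `theta = 3.0`, `fisher_information` should be close to `1 / 3`, matching $I(\theta) = \theta^{-1}$. The counts passed to `jeffreys_posterior` are whatever sample is at hand; any values used to try it out are hypothetical.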

3. Consider dose-response models. The setup is the following: animals are tested in the development of drugs or other chemical compounds. Someone administers various dose levels to $k$ batches of animals. The response variable is a dichotomous (binary) outcome, such as alive or dead, or tumor or no tumor. Let $x_i$ represent the $i$th dose level, $n_i$ the number of animals receiving the $i$th dose, and $y_i$ the number of positive outcomes among those $n_i$ animals.

(a) Suppose that $y_i \overset{ind}{\sim} \text{Binomial}(n_i, \theta_i)$, where $\theta_i$ is the probability of death (or tumor) for an animal that receives dose $x_i$. The typical way of modeling $\theta_i$ is through a logistic regression. That is, we suppose that $\text{logit}(\theta_i) = \alpha + \beta x_i$. Write out the likelihood in a simple form (it will contain a product).

(b) Find Jeffreys' prior for $(\alpha, \beta)$. Also, write down the equations you need to solve to find the posterior modes of $\alpha$ and $\beta$ under the uniform prior for $\alpha$ and $\beta$.

Solution:

(a) It follows from $\text{logit}(\theta_i) = \alpha + \beta x_i$ that $\theta_i = \dfrac{\exp(\alpha + \beta x_i)}{1 + \exp(\alpha + \beta x_i)}$. Hence, from the problem description, $y_i \mid (\alpha, \beta) \sim \text{Binomial}\left(n_i, \dfrac{\exp(\alpha + \beta x_i)}{1 + \exp(\alpha + \beta x_i)}\right)$. Conclude that

$$
\begin{aligned}
p(y_1, \ldots, y_k \mid (\alpha, \beta)) &= \prod_{i=1}^{k} \binom{n_i}{y_i} \left( \frac{\exp(\alpha + \beta x_i)}{1 + \exp(\alpha + \beta x_i)} \right)^{y_i} \left( \frac{1}{1 + \exp(\alpha + \beta x_i)} \right)^{n_i - y_i} \\
&= \prod_{i=1}^{k} \binom{n_i}{y_i} \left(\exp(\alpha + \beta x_i)\right)^{y_i} \left(1 + \exp(\alpha + \beta x_i)\right)^{-n_i}.
\end{aligned}
$$

(b) Using the previous item, observe that:

$$\log p(y_1, \ldots, y_k \mid (\alpha, \beta)) = \sum_{i=1}^{k} \log \binom{n_i}{y_i} + \sum_{i=1}^{k} y_i(\alpha + \beta x_i) - \sum_{i=1}^{k} n_i \log\left(1 + \exp(\alpha + \beta x_i)\right).$$

Hence, the gradient of the log-likelihood is:

$$
\begin{aligned}
\frac{\partial \log p(y_1, \ldots, y_k \mid (\alpha, \beta))}{\partial \alpha} &= k\bar{y} - \sum_{i=1}^{k} \frac{n_i \exp(\alpha + \beta x_i)}{1 + \exp(\alpha + \beta x_i)}, \\
\frac{\partial \log p(y_1, \ldots, y_k \mid (\alpha, \beta))}{\partial \beta} &= \sum_{i=1}^{k} x_i y_i - \sum_{i=1}^{k} \frac{n_i x_i \exp(\alpha + \beta x_i)}{1 + \exp(\alpha + \beta x_i)},
\end{aligned}
$$

where $k\bar{y} = \sum_{i=1}^{k} y_i$.

Taking a uniform prior on $(\alpha, \beta)$, the posterior is proportional to the likelihood. Thus, the mode of the posterior coincides with the MLE. In order to find the MLE, we must solve the system of equations obtained by setting the gradient to 0.
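One way to gain confidence in the gradient is to compare it against central finite differences of the log-likelihood. The sketch below does this; the dose levels, batch sizes, and counts in `DOSES`, `SIZES`, `COUNTS` are hypothetical values invented for illustration, and the function names are not from the original solution.

```python
import math

# Hypothetical illustrative data: dose levels, batch sizes, positive outcomes.
DOSES = [-1.0, -0.5, 0.0, 0.5, 1.0]
SIZES = [10, 10, 10, 10, 10]
COUNTS = [1, 3, 5, 8, 9]

def log_likelihood(alpha, beta, x, n, y):
    """Binomial-logistic log-likelihood, including the binomial coefficients."""
    total = 0.0
    for xi, ni, yi in zip(x, n, y):
        eta = alpha + beta * xi
        total += math.log(math.comb(ni, yi)) + yi * eta - ni * math.log1p(math.exp(eta))
    return total

def gradient(alpha, beta, x, n, y):
    """Analytic gradient (d/d alpha, d/d beta) of the log-likelihood."""
    g_alpha, g_beta = 0.0, 0.0
    for xi, ni, yi in zip(x, n, y):
        e = math.exp(alpha + beta * xi)
        resid = yi - ni * e / (1.0 + e)
        g_alpha += resid
        g_beta += xi * resid
    return g_alpha, g_beta

def finite_difference_gradient(alpha, beta, x, n, y, h=1e-6):
    """Central-difference approximation of the same gradient."""
    d_alpha = (log_likelihood(alpha + h, beta, x, n, y)
               - log_likelihood(alpha - h, beta, x, n, y)) / (2 * h)
    d_beta = (log_likelihood(alpha, beta + h, x, n, y)
              - log_likelihood(alpha, beta - h, x, n, y)) / (2 * h)
    return d_alpha, d_beta
```

At any test point, for instance `(0.3, 1.2)`, the two gradients should agree to several decimal places.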

The Hessian matrix of the log-likelihood is:

$$
H \log p(y_1, \ldots, y_k \mid (\alpha, \beta)) =
\begin{bmatrix}
-\sum_{i=1}^{k} \dfrac{n_i \exp(\alpha + \beta x_i)}{(1 + \exp(\alpha + \beta x_i))^2} & -\sum_{i=1}^{k} \dfrac{n_i x_i \exp(\alpha + \beta x_i)}{(1 + \exp(\alpha + \beta x_i))^2} \\[2ex]
-\sum_{i=1}^{k} \dfrac{n_i x_i \exp(\alpha + \beta x_i)}{(1 + \exp(\alpha + \beta x_i))^2} & -\sum_{i=1}^{k} \dfrac{n_i x_i^2 \exp(\alpha + \beta x_i)}{(1 + \exp(\alpha + \beta x_i))^2}
\end{bmatrix}.
$$

Since the Hessian matrix does not depend on the $y_i$, its expectation equals itself, so

$$
I(\alpha, \beta) = -H(\alpha, \beta) =
\begin{bmatrix}
\sum_{i=1}^{k} \dfrac{n_i \exp(\alpha + \beta x_i)}{(1 + \exp(\alpha + \beta x_i))^2} & \sum_{i=1}^{k} \dfrac{n_i x_i \exp(\alpha + \beta x_i)}{(1 + \exp(\alpha + \beta x_i))^2} \\[2ex]
\sum_{i=1}^{k} \dfrac{n_i x_i \exp(\alpha + \beta x_i)}{(1 + \exp(\alpha + \beta x_i))^2} & \sum_{i=1}^{k} \dfrac{n_i x_i^2 \exp(\alpha + \beta x_i)}{(1 + \exp(\alpha + \beta x_i))^2}
\end{bmatrix}.
$$
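The mode equations have no closed-form solution, so they are typically solved iteratively. As a minimal sketch (one possible approach, not one prescribed by the assignment), Newton's method can use the gradient together with the information matrix $I(\alpha, \beta)$. The data below are hypothetical, and the starting point, tolerance, and function names are assumptions.

```python
import math

# Hypothetical illustrative data: dose levels, batch sizes, positive outcomes.
DOSES = [-1.0, -0.5, 0.0, 0.5, 1.0]
SIZES = [10, 10, 10, 10, 10]
COUNTS = [1, 3, 5, 8, 9]

def expit(z):
    """Inverse logit: exp(z) / (1 + exp(z))."""
    return 1.0 / (1.0 + math.exp(-z))

def gradient(alpha, beta, x, n, y):
    """Gradient of the log-likelihood with respect to (alpha, beta)."""
    resid = [yi - ni * expit(alpha + beta * xi) for xi, ni, yi in zip(x, n, y)]
    return sum(resid), sum(xi * ri for xi, ri in zip(x, resid))

def information(alpha, beta, x, n):
    """Entries (I11, I12, I22) of the Fisher information matrix."""
    # exp(z) / (1 + exp(z))^2 equals expit(z) * (1 - expit(z)).
    w = [ni * expit(alpha + beta * xi) * (1.0 - expit(alpha + beta * xi))
         for xi, ni in zip(x, n)]
    i11 = sum(w)
    i12 = sum(wi * xi for wi, xi in zip(w, x))
    i22 = sum(wi * xi * xi for wi, xi in zip(w, x))
    return i11, i12, i22

def posterior_mode(x, n, y, tol=1e-10, max_iter=100):
    """Newton iterations for the mode under a uniform prior, started at (0, 0)."""
    alpha, beta = 0.0, 0.0
    for _ in range(max_iter):
        g_a, g_b = gradient(alpha, beta, x, n, y)
        if max(abs(g_a), abs(g_b)) < tol:
            break
        i11, i12, i22 = information(alpha, beta, x, n)
        det = i11 * i22 - i12 * i12
        # Newton step: (alpha, beta) += I^{-1} g, with the 2x2 inverse written out.
        alpha += (i22 * g_a - i12 * g_b) / det
        beta += (i11 * g_b - i12 * g_a) / det
    return alpha, beta
```

Calling `posterior_mode(DOSES, SIZES, COUNTS)` returns a pair at which both gradient components are numerically zero. Plain Newton steps are not guaranteed to converge for every data set (for example, when the outcomes are perfectly separated by dose), so a production implementation would add safeguards.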


Conclude that the Jeffreys prior is

$$p(\alpha, \beta) \propto |I(\alpha, \beta)|^{1/2} = \left( \sum_{i,j} \frac{n_i \exp(\alpha + \beta x_i)\, n_j \exp(\alpha + \beta x_j)\,(x_j^2 - x_i x_j)}{(1 + \exp(\alpha + \beta x_i))^2 (1 + \exp(\alpha + \beta x_j))^2} \right)^{1/2}.$$
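The double-sum expression is just the determinant $I_{11} I_{22} - I_{12}^2$ expanded. A small sketch can confirm that the two forms agree numerically; the dose levels and batch sizes here are hypothetical, and the function names are assumptions made for the example.

```python
import math

# Hypothetical illustrative design: dose levels and batch sizes.
DOSES = [-1.0, -0.5, 0.0, 0.5, 1.0]
SIZES = [10, 10, 10, 10, 10]

def weights(alpha, beta, x, n):
    """w_i = n_i exp(alpha + beta x_i) / (1 + exp(alpha + beta x_i))^2."""
    out = []
    for xi, ni in zip(x, n):
        e = math.exp(alpha + beta * xi)
        out.append(ni * e / (1.0 + e) ** 2)
    return out

def jeffreys_from_determinant(alpha, beta, x, n):
    """Unnormalized Jeffreys prior as sqrt(I11 * I22 - I12^2)."""
    w = weights(alpha, beta, x, n)
    i11 = sum(w)
    i12 = sum(wi * xi for wi, xi in zip(w, x))
    i22 = sum(wi * xi * xi for wi, xi in zip(w, x))
    return math.sqrt(i11 * i22 - i12 * i12)

def jeffreys_from_double_sum(alpha, beta, x, n):
    """Same quantity via the double sum over (i, j) of w_i w_j (x_j^2 - x_i x_j)."""
    w = weights(alpha, beta, x, n)
    total = sum(wi * wj * (xj * xj - xi * xj)
                for wi, xi in zip(w, x)
                for wj, xj in zip(w, x))
    return math.sqrt(total)
```

Evaluating both functions at the same `(alpha, beta)` should give the same value up to floating-point error.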

4. Consider the Galenshore distribution (it's just a transformed Gamma density). That is, let $Y \mid \theta \sim \text{Galenshore}(a, \theta)$. Then

$$p(y \mid \theta) = \frac{2}{\Gamma(a)}\,\theta^{2a} y^{2a-1} e^{-\theta^2 y^2}, \qquad y > 0,\ a > 0,\ \theta > 0.$$

(a) Consider

$$Y \mid \theta \sim \text{Galenshore}(a, \theta), \quad a \text{ known},\ \theta \text{ unknown},$$

$$\theta \sim \text{Galenshore}(c, d).$$

Find the posterior distribution of $\theta \mid y$.

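Solution:

$$
\begin{aligned}
p(\theta \mid y) &\propto \frac{2}{\Gamma(a)}\,\theta^{2a} y^{2a-1} e^{-\theta^2 y^2} \cdot \frac{2}{\Gamma(c)}\,d^{2c}\,\theta^{2c-1} e^{-d^2\theta^2} \\
&\propto \theta^{2a} e^{-\theta^2 y^2}\,\theta^{2c-1} e^{-d^2\theta^2} \\
&= \theta^{2(a+c)-1} e^{-\theta^2 (y^2 + d^2)}.
\end{aligned}
$$

Thus, $\theta \mid y \sim \text{Galenshore}\left(a + c, \sqrt{y^2 + d^2}\right)$.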
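The conjugacy result can be checked numerically by normalizing likelihood times prior on a grid and comparing with the claimed Galenshore density. This is a rough sketch under assumed test values; the function names, the integration range, and the particular `a`, `c`, `d`, `y` used to try it are illustrative choices only.

```python
import math

def galenshore_pdf(t, a, theta):
    """Galenshore(a, theta) density evaluated at t > 0."""
    return (2.0 / math.gamma(a)) * theta ** (2 * a) * t ** (2 * a - 1) * math.exp(-(theta * t) ** 2)

def posterior_closed_form(theta, y, a, c, d):
    """Claimed posterior: Galenshore(a + c, sqrt(y^2 + d^2)) density at theta."""
    return galenshore_pdf(theta, a + c, math.sqrt(y * y + d * d))

def posterior_numeric(theta, y, a, c, d, upper=10.0, steps=20_000):
    """Likelihood times prior, normalized by a midpoint-rule integral over theta."""
    def unnormalized(t):
        # galenshore_pdf(y, a, t) is the likelihood of y given theta = t;
        # galenshore_pdf(t, c, d) is the prior density of theta at t.
        return galenshore_pdf(y, a, t) * galenshore_pdf(t, c, d)
    h = upper / steps
    z = sum(unnormalized((i + 0.5) * h) for i in range(steps)) * h
    return unnormalized(theta) / z
```

For instance, with `y = 0.8`, `a = 2.0`, `c = 1.5`, `d = 1.0`, the two posterior functions should agree closely at any `theta` in the bulk of the distribution.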

