Solutions to Homework Set No. 4 – Probability Theory (235A), Fall 2011
1. A function ϕ : (a, b) → R is called convex if for any x, y ∈ (a, b) and α ∈ [0, 1] we have

ϕ(αx + (1 − α)y) ≤ αϕ(x) + (1 − α)ϕ(y).

(a) Prove that an equivalent condition for ϕ to be convex is that for any x < z < y in (a, b) we have

(ϕ(z) − ϕ(x))/(z − x) ≤ (ϕ(y) − ϕ(z))/(y − z).    (1)

Deduce using the mean value theorem that if ϕ is twice continuously differentiable and satisfies ϕ″ ≥ 0 then it is convex.
Solution. Let x ≤ z ≤ y and α = (y − z)/(y − x), so that z = αx + (1 − α)y and α ∈ [0, 1]. If ϕ is convex, then

ϕ(z) ≤ ((y − z)/(y − x)) ϕ(x) + ((z − x)/(y − x)) ϕ(y).

Writing ϕ(z) = ((y − z)/(y − x)) ϕ(z) + ((z − x)/(y − x)) ϕ(z) and rearranging gives

(ϕ(z) − ϕ(x))/(z − x) ≤ (ϕ(y) − ϕ(z))/(y − z) for any x < z < y.

Reversing these steps recovers the original convexity condition. For the second part, apply the mean value theorem on [x, z] and [z, y]: there exist ξ1 ∈ (x, z) and ξ2 ∈ (z, y) with ϕ′(ξ1) = (ϕ(z) − ϕ(x))/(z − x) and ϕ′(ξ2) = (ϕ(y) − ϕ(z))/(y − z). Since ϕ″ ≥ 0, ϕ′ is nondecreasing, so ϕ′(ξ1) ≤ ϕ′(ξ2), which gives (1) and hence convexity.
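Not part of the solution, but the slope criterion (1) is easy to check numerically. The sketch below (plain Python; ϕ = exp is an arbitrary smooth choice with ϕ″ ≥ 0) samples random triples x < z < y and verifies that the left secant slope never exceeds the right one:

```python
import math
import random

def slope(phi, u, v):
    """Secant slope of phi between u and v."""
    return (phi(v) - phi(u)) / (v - u)

# phi(x) = e^x has phi'' = e^x >= 0, so by part (a) it is convex and
# every left secant slope at z should be <= the corresponding right slope.
phi = math.exp
random.seed(0)
for _ in range(1000):
    x, z, y = sorted(random.uniform(-3.0, 3.0) for _ in range(3))
    if x < z < y:  # skip degenerate triples
        assert slope(phi, x, z) <= slope(phi, z, y) + 1e-12
print("slope condition (1) holds for phi = exp")
```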
(b) Prove Jensen’s inequality, which says that if X is a random variable such that P(X ∈ (a, b)) = 1 and ϕ : (a, b) → R is convex, then

ϕ(EX) ≤ E(ϕ(X)).

Hint. Start by proving the following property of a convex function: if ϕ is convex then at any point x0 ∈ (a, b), ϕ has a supporting line, that is, a linear function y(x) = ax + b such that y(x0) = ϕ(x0) and such that ϕ(x) ≥ y(x) for all x ∈ (a, b) (to prove its existence, use the characterization of convexity from part (a) to show that the left-sided derivative of ϕ at x0 is less than or equal to the right-sided derivative at x0; the supporting line is a line passing through the point (x0, ϕ(x0)) whose slope lies between these two numbers). Now take the supporting line function at x0 = EX and see what happens.
Solution. Once the supporting line is available, the inequality is quick to prove. As in the hint, let L(x) = αx + β be a supporting line for ϕ at x0 = EX. Then ϕ(X) ≥ L(X) = αX + β pointwise, so E(ϕ(X)) ≥ αEX + β = L(EX) = ϕ(EX).

Existence of the supporting line. Fix z ∈ (a, b). By (1), every slope of a chord to the left of z is at most every slope of a chord to the right of z, so

B := sup_{w<z} (ϕ(z) − ϕ(w))/(z − w) ≤ inf_{w>z} (ϕ(w) − ϕ(z))/(w − z) =: A,

and both A and B depend only on z. The definition of B gives ϕ(x) ≥ ϕ(z) + B(x − z) for any x ≤ z, and the definition of A gives ϕ(y) ≥ ϕ(z) + A(y − z) for any y ≥ z. Setting α = (A + B)/2, so that B ≤ α ≤ A, both bounds imply

ϕ(x) ≥ ϕ(z) + α(x − z) =: L(x) for any x ∈ (a, b).

So, for each fixed z, L is a linear function with L(z) = ϕ(z) and ϕ(x) ≥ L(x) for all x, i.e., L is a supporting line at z.
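A Monte Carlo illustration (not a proof) of Jensen’s inequality, with the arbitrary choices ϕ(x) = x² and X ∼ U(0, 1), for which ϕ(EX) = 1/4 and E(ϕ(X)) = 1/3:

```python
import random

# Monte Carlo illustration of Jensen: phi(EX) <= E(phi(X))
# for the convex function phi(x) = x^2 and X ~ U(0, 1).
random.seed(1)
n = 100_000
xs = [random.random() for _ in range(n)]
mean_x = sum(xs) / n                      # estimates EX = 1/2
mean_phi = sum(x * x for x in xs) / n     # estimates E(X^2) = 1/3
assert mean_x ** 2 <= mean_phi            # Jensen's inequality
assert abs(mean_phi - 1 / 3) < 0.01       # E(X^2) = 1/3 for U(0, 1)
print(f"phi(EX) = {mean_x**2:.4f} <= E(phi(X)) = {mean_phi:.4f}")
```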
2. If X is a random variable satisfying a ≤ X ≤ b, prove that

V(X) ≤ (b − a)²/4,    (2)

and identify when equality holds.

Solution. Suppose first that a < b. Since (X − E(X))² is a quadratic function of X, the chord through (a, (a − E(X))²) and (b, (b − E(X))²) lies above the parabola for a ≤ X ≤ b. That is,

(X − E(X))² ≤ (((b − EX)² − (a − EX)²)/(b − a)) (X − a) + (a − EX)².

Taking the expectation and simplifying,

V(X) ≤ (EX − a)(b − EX) ≤ (b − a)²/4.

Equality is attained in the first inequality when X takes only the values a and b (where the parabola and the chord intersect), and in the second inequality when EX − a = b − EX, that is, EX = (a + b)/2. Hence X must be a random variable on {a, b} with EX = (a + b)/2, i.e., P(X = a) = P(X = b) = 1/2. If a = b, the inequality (in fact, equality) is obvious.
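The bound (2) can be sanity-checked numerically: the sketch below draws random distributions on a grid in an arbitrary interval [a, b] = [−1, 3], verifies V(X) ≤ (b − a)²/4 for each, and checks the equality case of a fair two-point law on {a, b}:

```python
import random

# Check V(X) <= (b - a)^2 / 4 for random pmfs supported on a grid in [a, b].
random.seed(2)
a, b = -1.0, 3.0
points = [a + (b - a) * k / 10 for k in range(11)]
for _ in range(200):
    w = [random.random() for _ in points]
    s = sum(w)
    p = [wi / s for wi in w]                                # a random pmf
    m = sum(pi * x for pi, x in zip(p, points))             # E(X)
    v = sum(pi * (x - m) ** 2 for pi, x in zip(p, points))  # V(X)
    assert v <= (b - a) ** 2 / 4 + 1e-12

# Equality case: P(X = a) = P(X = b) = 1/2.
m = (a + b) / 2
v = 0.5 * (a - m) ** 2 + 0.5 * (b - m) ** 2
assert abs(v - (b - a) ** 2 / 4) < 1e-12
print("variance bound verified; equality for the fair two-point law")
```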
3. Let X1, X2, . . . be a sequence of i.i.d. (independent and identically distributed) random variables with distribution U(0, 1). Define events A1, A2, . . . by

An = {Xn = max(X1, X2, . . . , Xn)}

(if An occurred, we say that n is a record time).

(a) Prove that A1, A2, . . . are independent events. Hint: For each n ≥ 1, let πn be the random permutation of (1, 2, . . . , n) obtained by forgetting the values of (X1, . . . , Xn) and only retaining their respective order. In other words, define

πn(k) = #{1 ≤ j ≤ n : Xj ≤ Xk}.

By considering the joint density f_{X1,...,Xn} (a uniform density on the n-dimensional unit cube), show that πn is a uniformly random permutation of n elements, i.e. P(πn = σ) = 1/n! for any permutation σ ∈ Sn. Deduce that the event An = {πn(n) = n} is independent of πn−1 and therefore is independent of the previous events (A1, . . . , An−1), which are all determined by πn−1.
Solution. Let σ = (i1, . . . , in) ∈ Sn. Since the Xi are independent with Xi ∼ U(0, 1), the joint density satisfies f_{X1,...,Xn} = 1 on the unit cube (and 0 elsewhere). So,

P(πn = σ) = P(X_{i1} ≤ · · · ≤ X_{in}) = ∫₀¹ ∫₀^{x_{in}} · · · ∫₀^{x_{i2}} 1 dx_{i1} · · · dx_{in} = 1/n!.

Now, the probability that Xi ≤ Xn for all 1 ≤ i ≤ n is

P(An) = P(πn(n) = n) = (n − 1)!/n! = 1/n,

where (n − 1)! counts the orderings obtained by fixing the largest, Xn, and permuting the rest.

For any σ ∈ Sn, P(πn = σ) = 1/n!, and for any σn−1 ∈ Sn−1, P(πn−1 = σn−1) = 1/(n − 1)!. The event {πn−1 = σn−1} ∩ An corresponds to exactly one permutation in Sn (namely σn−1 extended by πn(n) = n), so

P(An, πn−1 = σn−1) = 1/n! = (1/n) · 1/(n − 1)! = P(An) P(πn−1 = σn−1),

which shows that An is independent of πn−1. Since A1, . . . , An−1 are all determined by πn−1, it follows that An is independent of (A1, . . . , An−1), and hence the Ai are independent.
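Because the argument reduces to counting permutations, it can be verified exactly by enumeration. The sketch below checks, over all of S4, that P(Ak) = 1/k and that the joint probability of A2, A3, A4 factors into the product:

```python
from itertools import permutations
from fractions import Fraction

# Exact check over S_n (n = 4): A_k = {position k holds the max of the
# first k entries}; verify P(A_k) = 1/k and independence by enumeration.
n = 4
perms = list(permutations(range(1, n + 1)))

def record(sigma, k):
    """True if position k holds the max of the first k entries (event A_k)."""
    return sigma[k - 1] == max(sigma[:k])

for k in range(1, n + 1):
    pk = Fraction(sum(record(s, k) for s in perms), len(perms))
    assert pk == Fraction(1, k)

# Joint probability of A_2, A_3, A_4 factors into the product of the P(A_k).
joint = Fraction(sum(all(record(s, k) for k in (2, 3, 4)) for s in perms),
                 len(perms))
assert joint == Fraction(1, 2) * Fraction(1, 3) * Fraction(1, 4)
print("P(A_k) = 1/k and independence verified exactly for n = 4")
```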
(b) Define

Rn = Σ_{k=1}^{n} 1_{Ak} = #{1 ≤ k ≤ n : k is a record time}, (n = 1, 2, . . .).

Compute E(Rn) and V(Rn). Deduce that if (mn)_{n=1}^{∞} is a sequence of positive numbers such that mn ↑ ∞, however slowly, then the number Rn of record times up to time n satisfies

P(|Rn − log n| > mn √(log n)) → 0 as n → ∞.
Solution. Using linearity and P(Ak) = 1/k,

E(Rn) = Σ_{k=1}^{n} E(1_{Ak}) = Σ_{k=1}^{n} P(Ak) = Σ_{k=1}^{n} 1/k = log n + O(1).

By part (a) the Ak are independent, so the variance of the sum is the sum of the variances:

V(Rn) = V(Σ_{k=1}^{n} 1_{Ak}) = Σ_{k=1}^{n} V(1_{Ak}) = Σ_{k=1}^{n} (1/k − 1/k²) = log n + O(1).

The claim about P(|Rn − log n| > mn √(log n)) now follows from Chebyshev’s inequality: since |E(Rn) − log n| = O(1), for large n the event |Rn − log n| > mn √(log n) forces |Rn − E(Rn)| > (mn/2) √(log n), and

P(|Rn − E(Rn)| > (mn/2) √(log n)) ≤ V(Rn)/((mn²/4) log n) = O(1/mn²) → 0.
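The harmonic-sum formulas for E(Rn) and V(Rn) can be checked numerically; the sketch below confirms that both stay within a bounded distance of log n as n grows:

```python
import math

# Check that E(R_n) = sum 1/k and V(R_n) = sum (1/k - 1/k^2)
# are both log n + O(1): the gaps stay bounded as n grows.
for n in (10, 100, 10_000, 1_000_000):
    H = sum(1.0 / k for k in range(1, n + 1))                 # E(R_n)
    V = sum(1.0 / k - 1.0 / k**2 for k in range(1, n + 1))    # V(R_n)
    assert abs(H - math.log(n)) < 1.5   # gap -> Euler's constant ~ 0.577
    assert abs(V - math.log(n)) < 1.5   # gap -> 0.577 - pi^2/6 ~ -1.068
print("E(R_n) and V(R_n) are log n + O(1)")
```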
4. Compute E(X) and V(X) when X is a random variable having each of the following distributions:

1. X ∼ Binomial(n, p) ; E(X) = np, V(X) = np(1 − p)
2. X ∼ Poisson(λ) ; E(X) = λ, V(X) = λ
3. X ∼ Geom(p) ; E(X) = 1/p, V(X) = (1 − p)/p²
4. X ∼ U{1, 2, . . . , n} ; E(X) = (n + 1)/2, V(X) = (n + 1)(n − 1)/12
5. X ∼ U(a, b) ; E(X) = (a + b)/2, V(X) = (b − a)²/12
6. X ∼ Exp(λ) ; E(X) = 1/λ, V(X) = 1/λ²

6. (a) If X, Y are independent r.v.’s taking values in Z, show that

P(X + Y = n) = Σ_{k=−∞}^{∞} P(X = k) P(Y = n − k), (n ∈ Z)

(compare this formula with the convolution formula in the case of r.v.’s with density).
Solution. {X + Y = n} = ⋃_{k∈Z} {X = k, Y = n − k}. Since the events in the union are disjoint,

P(X + Y = n) = Σ_{k=−∞}^{∞} P(X = k, Y = n − k) = Σ_{k=−∞}^{∞} P(X = k) P(Y = n − k),

where the last equality uses the independence of X and Y.
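The discrete convolution formula is straightforward to implement; the sketch below represents a pmf as a dict from integers to probabilities and checks it on the (arbitrary) example of two fair dice:

```python
# Discrete convolution of two pmfs on the integers, as in part (a):
# P(X + Y = n) = sum_k P(X = k) P(Y = n - k).  A pmf is a dict k -> prob.
def convolve(px, py):
    pz = {}
    for k, p in px.items():
        for j, q in py.items():
            pz[k + j] = pz.get(k + j, 0.0) + p * q
    return pz

# Example: two independent fair dice; P(sum = 7) = 6/36 = 1/6.
die = {k: 1 / 6 for k in range(1, 7)}
total = convolve(die, die)
assert abs(total[7] - 1 / 6) < 1e-12
assert abs(sum(total.values()) - 1.0) < 1e-12   # still a pmf
print("convolution pmf sums to 1; P(sum of two dice = 7) = 1/6")
```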
(b) Use this to show that if X ∼ Poisson(λ) and Y ∼ Poisson(µ) are independent then X + Y ∼ Poisson(λ + µ).

Solution. By part (a),

P(X + Y = n) = Σ_{k=0}^{n} e^{−λ} (λ^k/k!) · e^{−µ} (µ^{n−k}/(n − k)!)
= e^{−(λ+µ)} (1/n!) Σ_{k=0}^{n} C(n, k) λ^k µ^{n−k}
= e^{−(λ+µ)} (λ + µ)^n/n!,

where the last step is the binomial theorem. This implies that X + Y ∼ Poisson(λ + µ).
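As a numerical check of this computation (with the arbitrary parameters λ = 1.5, µ = 2.5), the sketch below evaluates the convolution sum term by term and compares it against the Poisson(λ + µ) pmf:

```python
import math

def poisson_pmf(lam, k):
    """P(Poisson(lam) = k) = e^{-lam} lam^k / k!."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Check X + Y ~ Poisson(lam + mu) via the discrete convolution formula.
lam, mu = 1.5, 2.5
for n in range(20):
    conv = sum(poisson_pmf(lam, k) * poisson_pmf(mu, n - k)
               for k in range(n + 1))
    assert abs(conv - poisson_pmf(lam + mu, n)) < 1e-12
print("convolution of Poisson(1.5) and Poisson(2.5) matches Poisson(4.0)")
```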
(c) Use the same “discrete convolution” formula to prove directly that if X ∼ Bin(n, p) and Y ∼ Bin(m, p) are independent then X + Y ∼ Bin(n + m, p). You may make use of the combinatorial identity (known as the Vandermonde identity or Chu–Vandermonde identity)

Σ_{j=0}^{k} C(n, j) C(m, k − j) = C(n + m, k), (n, m ≥ 0, 0 ≤ k ≤ n + m).

Solution. By the discrete convolution formula,

P(X + Y = k) = Σ_{j=0}^{k} P(X = j, Y = k − j)
= Σ_{j=0}^{k} C(n, j) p^j (1 − p)^{n−j} C(m, k − j) p^{k−j} (1 − p)^{m−(k−j)}
= p^k (1 − p)^{n+m−k} Σ_{j=0}^{k} C(n, j) C(m, k − j)
= C(n + m, k) p^k (1 − p)^{n+m−k},

so X + Y ∼ Bin(n + m, p).
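Both the convolution computation and the Vandermonde identity behind it can be verified exactly for small parameters; the sketch below uses the arbitrary choices n = 5, m = 7, p = 0.3:

```python
from math import comb

def binom_pmf(n, p, k):
    """P(Bin(n, p) = k) = C(n, k) p^k (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Check X + Y ~ Bin(n + m, p) and the Vandermonde identity behind it.
n, m, p = 5, 7, 0.3
for k in range(n + m + 1):
    js = range(max(0, k - m), min(n, k) + 1)   # j with valid C(n,j), C(m,k-j)
    conv = sum(binom_pmf(n, p, j) * binom_pmf(m, p, k - j) for j in js)
    assert abs(conv - binom_pmf(n + m, p, k)) < 1e-12
    # Vandermonde: sum_j C(n, j) C(m, k - j) = C(n + m, k)
    assert sum(comb(n, j) * comb(m, k - j) for j in js) == comb(n + m, k)
print("Bin(5, 0.3) convolved with Bin(7, 0.3) equals Bin(12, 0.3)")
```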
6. Prove that if X is a random variable that is independent of itself, then X is a.s. constant, i.e., there is a constant c ∈ R such that P(X = c) = 1.

Solution. Let FX be the c.d.f. of X. Since X is independent of itself,

P(X ≤ x) = P(X ≤ x, X ≤ x) = P(X ≤ x) P(X ≤ x),

which implies that FX(x) = 0 or 1 for each x ∈ R. Recall that FX is nondecreasing, lim_{x→−∞} FX(x) = 0 and lim_{x→∞} FX(x) = 1. Thus there exists c ∈ R (take c = sup{x : FX(x) = 0}) such that

lim_{x→c−} FX(x) = 0 and lim_{x→c+} FX(x) = 1.

Since FX is right-continuous, P(X = c) = FX(c) − FX(c−) = 1 − 0 = 1.
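The key step is that F(x) = F(x)² forces F(x) ∈ {0, 1}. A small illustration (with a fair coin and a constant as the two test cases, both hypothetical examples):

```python
# F(x) = F(x)^2 forces F(x) in {0, 1}: a fair coin fails, a constant passes.
def fx_fair_coin(x):
    """CDF of a fair coin on {0, 1}."""
    return 0.0 if x < 0 else (0.5 if x < 1 else 1.0)

t = fx_fair_coin(0.5)
assert t != t * t   # 0.5 != 0.25: a fair coin is not independent of itself

def fx_const(x, c=2.0):
    """CDF of the constant random variable X = c."""
    return 1.0 if x >= c else 0.0

for x in (-1.0, 0.0, 2.0, 3.0):
    t = fx_const(x)
    assert t == t * t   # CDF only takes the values 0 and 1
print("only {0, 1}-valued CDFs satisfy F(x) = F(x)^2")
```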