12.07.2015 Views

Stat 5101 Lecture Notes - School of Statistics

2.5. PROBABILITY THEORY AS LINEAR ALGEBRA

and a definite integral over an interval of length zero is zero.

Often when we want to assert a fact, it turns out that the best we can get from probability is an assertion “with probability one” or “except for an event of probability zero.” The most important of these is the following theorem, which is essentially the same as Theorem 5 of Chapter 4 in Lindgren.

Theorem 2.32. If $Y = 0$ with probability one, then $E(Y) = 0$. Conversely, if $Y \ge 0$ and $E(Y) = 0$, then $Y = 0$ with probability one.

The phrase “$Y = 0$ with probability one” means $P(Y = 0) = 1$. The proof of the theorem involves dominated convergence and is beyond the scope of this course.

Applying linearity of expectation to the first half of the theorem, we get an obvious corollary.

Corollary 2.33. If $X = Y$ with probability one, then $E(X) = E(Y)$.

If $X = Y$ with probability one, then the set

$$A = \{\, s : X(s) \neq Y(s) \,\}$$

has probability zero. Thus a colloquial way to rephrase the corollary is “what happens on a set of probability zero doesn’t matter.” Another rephrasing is “a random variable can be arbitrarily redefined on a set of probability zero without changing any expectations.”

There are two more corollaries of this theorem that are important in statistics.

Corollary 2.34. $\operatorname{var}(X) = 0$ if and only if $X$ is constant with probability one.

Proof. First, suppose $X = a$ with probability one. Then $E(X) = a = \mu$, and $(X - \mu)^2$ equals zero with probability one, hence by Theorem 2.32 its expectation, which is $\operatorname{var}(X)$, is zero.

Conversely, by the second part of Theorem 2.32, $\operatorname{var}(X) = E\{(X - \mu)^2\} = 0$ implies $(X - \mu)^2 = 0$ with probability one because $(X - \mu)^2$ is a random variable that is nonnegative and integrates to zero.
Since $(X - \mu)^2$ is zero only when $X = \mu$, this implies $X = \mu$ with probability one.

Corollary 2.35. $|\operatorname{cor}(X, Y)| = 1$ if and only if there exist constants $\alpha$ and $\beta$ such that $Y = \alpha + \beta X$ with probability one.

Proof. First suppose $Y = \alpha + \beta X$ with probability one. Then by (2.33)

$$\operatorname{cor}(\alpha + \beta X, X) = \operatorname{sign}(\beta) \operatorname{cor}(X, X) = \pm 1.$$

That proves one direction of the “if and only if.”

To prove the other direction, we assume $\rho_{X,Y} = \pm 1$ and have to prove that $Y = \alpha + \beta X$ with probability one, where $\alpha$ and $\beta$ are constants we may choose. I claim that the appropriate choices are

$$\beta = \rho_{X,Y} \frac{\sigma_Y}{\sigma_X}$$

$$\alpha = \mu_Y - \beta \mu_X$$
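The claimed choices of the slope and intercept can be checked numerically on a small example. The sketch below is an illustration only, not part of the proof. It assumes a hypothetical finite sample space with four outcomes, one of which has probability zero, and sets Y = 3 − 2X everywhere except on that probability-zero outcome. The helper names (`expectation`, `moments`) and all numbers are made up for the illustration. The code computes the correlation, then the slope and intercept from the formulas just stated, and finally the probability of the event that Y equals the intercept plus the slope times X.

```python
import math


def expectation(values, probs):
    """Expectation of a random variable on a finite sample space."""
    return sum(v * p for v, p in zip(values, probs))


def moments(x, y, probs):
    """Return (mu_x, mu_y, sd_x, sd_y, rho) for random variables x and y."""
    mu_x = expectation(x, probs)
    mu_y = expectation(y, probs)
    var_x = expectation([(a - mu_x) ** 2 for a in x], probs)
    var_y = expectation([(b - mu_y) ** 2 for b in y], probs)
    cov = expectation([(a - mu_x) * (b - mu_y) for a, b in zip(x, y)], probs)
    rho = cov / math.sqrt(var_x * var_y)
    return mu_x, mu_y, math.sqrt(var_x), math.sqrt(var_y), rho


# Hypothetical sample space: four outcomes, the last one has probability zero.
probs = [0.2, 0.3, 0.5, 0.0]
x = [1.0, 2.0, 4.0, 7.0]

# Y = 3 - 2X, except that Y is redefined arbitrarily on the
# probability-zero outcome (this should not change any expectation).
y = [3.0 - 2.0 * a for a in x]
y[3] = 100.0

mu_x, mu_y, sd_x, sd_y, rho = moments(x, y, probs)

# The choices claimed in the proof of Corollary 2.35.
beta = rho * sd_y / sd_x
alpha = mu_y - beta * mu_x

# Probability of the event { s : Y(s) = alpha + beta * X(s) }.
prob_linear = sum(
    p
    for a, b, p in zip(x, y, probs)
    if math.isclose(b, alpha + beta * a, abs_tol=1e-9)
)

print(f"rho = {rho:.6f}")
print(f"beta = {beta:.6f}, alpha = {alpha:.6f}")
print(f"P(Y = alpha + beta X) = {prob_linear:.6f}")
```

In this example the correlation is −1 up to floating-point rounding, the formulas recover the slope −2 and intercept 3, and the linear relation holds with probability one even though it fails at the outcome of probability zero, in line with the remark that what happens on a set of probability zero does not matter.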
