we see that $T \sim \mathrm{Geom}(\theta)$, with $\theta = \frac{s}{s+(1-s)p}$. Note that this is consistent with (a).

The distribution of $T$ can also be obtained by a story proof. Imagine that just before each tournament she may play in, Judit retires with probability $s$ (if she retires, she does not play in that or future tournaments). Her tournament history can be written as a sequence of W (win), L (lose), R (retire), ending in the first R, where the probabilities of W, L, R are $(1-s)p$, $(1-s)(1-p)$, $s$, respectively. For calculating $T$, the losses can be ignored: we want to count the number of W's before the first R. The probability that a result is R given that it is W or R is $\frac{s}{s+(1-s)p}$, so we again have $T \sim \mathrm{Geom}\left(\frac{s}{s+(1-s)p}\right)$.

2. Let $X_1, X_2$ be i.i.d., and let $\bar{X} = \frac{1}{2}(X_1 + X_2)$. In many statistics problems, it is useful or important to obtain a conditional expectation given $\bar{X}$. As an example of this, find $E(w_1 X_1 + w_2 X_2 \mid \bar{X})$, where $w_1, w_2$ are constants with $w_1 + w_2 = 1$.

By symmetry, $E(X_1 \mid \bar{X}) = E(X_2 \mid \bar{X})$, and by linearity and taking out what's known, $E(X_1 \mid \bar{X}) + E(X_2 \mid \bar{X}) = E(X_1 + X_2 \mid \bar{X}) = X_1 + X_2$. So $E(X_1 \mid \bar{X}) = E(X_2 \mid \bar{X}) = \bar{X}$ (this was also derived in class). Thus,
$$E(w_1 X_1 + w_2 X_2 \mid \bar{X}) = w_1 E(X_1 \mid \bar{X}) + w_2 E(X_2 \mid \bar{X}) = w_1 \bar{X} + w_2 \bar{X} = \bar{X}.$$
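The story proof above can be spot-checked by simulation. In the sketch below, the parameter values $p = 0.7$ and $s = 0.1$ are illustrative choices, not from the problem; with the convention that $\mathrm{Geom}(\theta)$ counts failures before the first success, the sample mean of $T$ should be close to $(1-\theta)/\theta$.

```python
import random

def simulate_T(p, s, rng):
    """Simulate Judit's number of wins T before retiring.

    Before each tournament she retires with probability s; otherwise she
    plays and wins with probability p. Losses are irrelevant to T, so we
    only count wins.
    """
    wins = 0
    while rng.random() >= s:      # she does not retire this time
        if rng.random() < p:      # she plays the tournament and wins
            wins += 1
    return wins

rng = random.Random(42)
p, s = 0.7, 0.1                   # illustrative values, not from the problem
n = 100_000
samples = [simulate_T(p, s, rng) for _ in range(n)]

theta = s / (s + (1 - s) * p)
# Geom(theta) here counts failures before the first success, so
# E(T) = (1 - theta) / theta; the sample mean should be close to this.
print(sum(samples) / n, (1 - theta) / theta)
```

With these values, $\theta = 0.1/0.73 \approx 0.137$ and $(1-\theta)/\theta = 6.3$, which the simulated mean should match to within Monte Carlo error.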
3. A certain stock has low volatility on some days and high volatility on other days. Suppose that the probability of a low volatility day is $p$ and of a high volatility day is $q = 1 - p$, and that on low volatility days the percent change in the stock price is $\mathcal{N}(0, \sigma_1^2)$, while on high volatility days the percent change is $\mathcal{N}(0, \sigma_2^2)$, with $\sigma_1 < \sigma_2$.

Let $X$ be the percent change of the stock on a certain day. The distribution is said to be a mixture of two Normal distributions, and a convenient way to represent $X$ is as $X = I_1 X_1 + I_2 X_2$, where $I_1$ is the indicator r.v. of having a low volatility day, $I_2 = 1 - I_1$, $X_j \sim \mathcal{N}(0, \sigma_j^2)$, and $I_1, X_1, X_2$ are independent.

(a) Find the variance of $X$ in two ways: using Eve's Law, and by calculating $\mathrm{Cov}(I_1 X_1 + I_2 X_2, I_1 X_1 + I_2 X_2)$ directly.

By Eve's Law,
$$\mathrm{Var}(X) = E(\mathrm{Var}(X \mid I_1)) + \mathrm{Var}(E(X \mid I_1)) = E(I_1^2 \sigma_1^2 + (1 - I_1)^2 \sigma_2^2) + \mathrm{Var}(0) = p\sigma_1^2 + (1-p)\sigma_2^2,$$
since $I_1^2 = I_1$ and $I_2^2 = I_2$. For the covariance method, expand
$$\mathrm{Var}(X) = \mathrm{Cov}(I_1 X_1 + I_2 X_2, I_1 X_1 + I_2 X_2) = \mathrm{Var}(I_1 X_1) + \mathrm{Var}(I_2 X_2) + 2\,\mathrm{Cov}(I_1 X_1, I_2 X_2).$$
Then $\mathrm{Var}(I_1 X_1) = E(I_1^2 X_1^2) - (E(I_1 X_1))^2 = E(I_1)E(X_1^2) = p\,\mathrm{Var}(X_1)$, since $E(I_1 X_1) = E(I_1)E(X_1) = 0$. Similarly, $\mathrm{Var}(I_2 X_2) = (1-p)\mathrm{Var}(X_2)$. And $\mathrm{Cov}(I_1 X_1, I_2 X_2) =$
$E(I_1 I_2 X_1 X_2) - E(I_1 X_1)E(I_2 X_2) = 0$, since $I_1 I_2$ always equals 0. So again we have
$$\mathrm{Var}(X) = p\sigma_1^2 + (1-p)\sigma_2^2.$$

(b) The kurtosis of an r.v. $Y$ with mean $\mu$ and standard deviation $\sigma$ is defined by
$$\mathrm{Kurt}(Y) = \frac{E(Y - \mu)^4}{\sigma^4} - 3.$$
This is a measure of how heavy-tailed the distribution of $Y$ is. Find the kurtosis of $X$ (in terms of $p, q, \sigma_1^2, \sigma_2^2$, fully simplified). The result will show that even though the kurtosis of any Normal distribution is 0, the kurtosis of $X$ is positive and in fact can be very large depending on the parameter values.

Note that $(I_1 X_1 + I_2 X_2)^4 = I_1 X_1^4 + I_2 X_2^4$, since the cross terms disappear (because $I_1 I_2$ is always 0) and any positive power of an indicator r.v. is that indicator r.v.! So
$$E(X^4) = E(I_1 X_1^4 + I_2 X_2^4) = 3p\sigma_1^4 + 3q\sigma_2^4,$$
using the fact that the fourth moment of a $\mathcal{N}(0, \sigma^2)$ r.v. is $3\sigma^4$. Alternatively, we can use $E(X^4) = E(X^4 \mid I_1 = 1)p + E(X^4 \mid I_1 = 0)q$ to find $E(X^4)$. The mean of $X$ is $E(I_1 X_1) + E(I_2 X_2) = 0$, so the kurtosis of $X$ is
$$\mathrm{Kurt}(X) = \frac{3p\sigma_1^4 + 3q\sigma_2^4}{(p\sigma_1^2 + q\sigma_2^2)^2} - 3.$$
This becomes 0 if $\sigma_1 = \sigma_2$, since then we have a Normal distribution rather than a mixture of two different Normal distributions. For $\sigma_1 < \sigma_2$, the kurtosis is positive since $p\sigma_1^4 + q\sigma_2^4 > (p\sigma_1^2 + q\sigma_2^2)^2$, as seen by a Jensen's inequality argument, or by interpreting this as saying $E(Y^2) > (EY)^2$, where $Y$ is $\sigma_1^2$ with probability $p$ and $\sigma_2^2$ with probability $q$.

4. We wish to estimate an unknown parameter $\theta$, based on an r.v. $X$ we will get to observe. As in the Bayesian perspective, assume that $X$ and $\theta$ have a joint distribution. Let $\hat{\theta}$ be the estimator (which is a function of $X$). Then $\hat{\theta}$ is said to be unbiased if $E(\hat{\theta} \mid \theta) = \theta$, and $\hat{\theta}$ is said to be the Bayes procedure if $E(\theta \mid X) = \hat{\theta}$.

(a) Let $\hat{\theta}$ be unbiased.
Find $E(\hat{\theta} - \theta)^2$ (the average squared difference between the estimator and the true value of $\theta$), in terms of marginal moments of $\hat{\theta}$ and $\theta$.

Hint: condition on $\theta$.
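Returning to Problem 3, both the variance and the kurtosis formulas for the Normal mixture can be sanity-checked by Monte Carlo. In the sketch below, the values $p = 0.3$, $\sigma_1 = 1$, $\sigma_2 = 3$ are illustrative choices, not from the problem.

```python
import random

# Monte Carlo check of Var(X) = p*s1^2 + q*s2^2 and
# Kurt(X) = (3p*s1^4 + 3q*s2^4)/(p*s1^2 + q*s2^2)^2 - 3
# for the two-Normal mixture, with illustrative parameter values.
rng = random.Random(0)
p, s1, s2 = 0.3, 1.0, 3.0
n = 200_000

xs = []
for _ in range(n):
    low = rng.random() < p            # I1: indicator of a low-volatility day
    sigma = s1 if low else s2
    xs.append(rng.gauss(0.0, sigma))  # X = I1*X1 + I2*X2

mean = sum(xs) / n                    # should be near 0
var = sum(x * x for x in xs) / n - mean * mean
m4 = sum(x ** 4 for x in xs) / n
kurt = m4 / var ** 2 - 3

q = 1 - p
var_theory = p * s1 ** 2 + q * s2 ** 2
kurt_theory = (3 * p * s1 ** 4 + 3 * q * s2 ** 4) / var_theory ** 2 - 3
print(var, var_theory, kurt, kurt_theory)
```

With these values, $\mathrm{Var}(X) = 0.3 \cdot 1 + 0.7 \cdot 9 = 6.6$ and the excess kurtosis is about $0.93$, strictly positive as the Jensen's inequality argument predicts.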