Stat 134 Fall 2011: Distributions of numbers of children
Stat 134 Fall 2011: Distributions of numbers of children<br />
Michael Lugo<br />
November 16, 2011<br />
We considered the following model in class today: in a certain town, the probability<br />
that a random family has $t$ children is $P(T = t) = (1 - p)^t p$ for $t = 0, 1, 2, \ldots$. All children have probability<br />
$1/2$ of being a boy; the gender of each child is independent of all other children and of the<br />
number of children the family has. So if $B$ is the number of boys that a family has, then<br />
$P(B = b \mid T = t) = 2^{-t} \binom{t}{b}$. We want to know the distribution of the number of boys.<br />
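The model can be simulated directly, which gives an empirical distribution of the number of boys to compare against later. The following is a minimal sketch, not part of the original notes; the function names, the seed, and the choice $p = 0.3$ are illustrative assumptions.

```python
import random
from collections import Counter


def sample_family(p, rng):
    """Draw (T, B): T children with P(T = t) = (1 - p)^t * p,
    each child independently a boy with probability 1/2."""
    t = 0
    # Each pass adds a child with probability 1 - p and stops with probability p.
    while rng.random() >= p:
        t += 1
    b = sum(rng.random() < 0.5 for _ in range(t))
    return t, b


def empirical_boy_distribution(p, n_families, seed=0):
    """Relative frequency of each observed number of boys in n_families draws."""
    rng = random.Random(seed)
    counts = Counter(sample_family(p, rng)[1] for _ in range(n_families))
    return {b: counts[b] / n_families for b in sorted(counts)}


if __name__ == "__main__":
    # p = 0.3 is an arbitrary illustrative value.
    for b, freq in empirical_boy_distribution(p=0.3, n_families=100_000).items():
        print(b, round(freq, 4))
```

The stopping loop produces $T$ with exactly the stated law, because reaching $t$ children requires $t$ "continue" outcomes followed by one "stop".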
By the definition of conditional probability, summing over the possible values of $T$, we can say<br />
\[ P(B = b) = \sum_{t} P(B = b \mid T = t) P(T = t) = \sum_{t \ge b} \binom{t}{b} 2^{-t} (1 - p)^t p \]<br />
but this sum is tricky! To be able to do it let's make it even harder; we'll find $\sum_{b \ge 0} P(B = b) z^b$,<br />
the generating function of $B$. This is given by the double sum<br />
\[ \sum_{b \ge 0} P(B = b) z^b = \sum_{n \ge 0} \left( \sum_{t \ge n} \binom{t}{n} 2^{-t} q^t p \right) z^n \]<br />
where $q = 1 - p$ and the summation index on the right has been renamed from $b$ to $n$. Often with these sums it turns out to be useful to interchange the order of<br />
summation. If we do this we get<br />
\[ \sum_{t \ge 0} \sum_{n=0}^{t} \binom{t}{n} 2^{-t} q^t p z^n \]<br />
and pulling the factors that do not depend on $n$ out of the inner sum gives<br />
\[ \sum_{t \ge 0} \left( \frac{q}{2} \right)^t p \sum_{n=0}^{t} \binom{t}{n} z^n . \]<br />
Now the inner sum (over $n$) can be done by the binomial theorem; it's $(1 + z)^t$. So we're left<br />
with<br />
\[ \sum_{t \ge 0} \left( \frac{q}{2} \right)^t p (1 + z)^t . \]<br />
This is just a geometric series, which is a little more obvious if it's written in the form<br />
\[ p \sum_{t \ge 0} \left( \frac{q(1 + z)}{2} \right)^t . \]<br />
Its sum (valid when $|q(1 + z)/2| < 1$) is therefore<br />
\[ \frac{p}{1 - \frac{q}{2}(1 + z)} \]<br />
and after some rewriting (multiply numerator and denominator by $2/(1 + p)$ and use $q = 1 - p$) this is<br />
\[ \frac{\frac{2p}{1+p}}{1 - \frac{1-p}{1+p} z} . \]<br />
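As a numerical sanity check on the generating function, one can evaluate the double sum over $t$ and $n$ directly and compare it with the closed form $p / (1 - \frac{q}{2}(1 + z))$. This is only a sketch: the function names and the truncation point `t_max` are assumptions made for illustration, and the comparison is meaningful only where the series converges, that is, where $|q(1 + z)/2| < 1$.

```python
import math


def pgf_by_double_sum(z, p, t_max=400):
    """Truncated sum over t of (q/2)^t * p * sum_{n=0}^{t} C(t, n) z^n."""
    q = 1.0 - p
    total = 0.0
    for t in range(t_max + 1):
        # By the binomial theorem this inner sum equals (1 + z)^t.
        inner = sum(math.comb(t, n) * z ** n for n in range(t + 1))
        total += (q / 2.0) ** t * p * inner
    return total


def pgf_closed_form(z, p):
    """Sum of the geometric series: p / (1 - (q/2)(1 + z))."""
    q = 1.0 - p
    return p / (1.0 - q * (1.0 + z) / 2.0)


if __name__ == "__main__":
    for z in (0.0, 0.5, 1.0):
        print(z, pgf_by_double_sum(z, 0.3), pgf_closed_form(z, 0.3))
```

At $z = 1$ both expressions should equal 1, since the probabilities $P(B = b)$ sum to 1.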
Of course this is again the sum of a geometric series, now in $z$; we can expand it to get<br />
\[ \frac{2p}{1+p} + \frac{2p}{1+p} \cdot \frac{1-p}{1+p} z + \frac{2p}{1+p} \left( \frac{1-p}{1+p} \right)^2 z^2 + \cdots \]<br />
and the coefficient of $z^b$ is $P(B = b)$; this is<br />
\[ P(B = b) = \frac{2p}{1+p} \left( \frac{1-p}{1+p} \right)^b . \]<br />
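The closed form for $P(B = b)$ can be checked against the original "tricky" sum $\sum_{t \ge b} \binom{t}{b} 2^{-t} (1 - p)^t p$ by truncating that sum numerically. A minimal sketch, with hypothetical function names and an assumed truncation point `t_max` (the terms decay at least as fast as $(1/2)^t$ times a polynomial in $t$, so the tail beyond 400 is negligible):

```python
import math


def pmf_boys_by_sum(b, p, t_max=400):
    """Truncated version of sum_{t >= b} C(t, b) * 2^{-t} * (1 - p)^t * p."""
    q = 1.0 - p
    return sum(math.comb(t, b) * (q / 2.0) ** t * p for t in range(b, t_max + 1))


def pmf_boys_closed_form(b, p):
    """Geometric pmf with parameter 2p/(1+p): (2p/(1+p)) * ((1-p)/(1+p))^b."""
    return (2.0 * p / (1.0 + p)) * ((1.0 - p) / (1.0 + p)) ** b


if __name__ == "__main__":
    p = 0.3  # arbitrary illustrative value
    for b in range(5):
        print(b, pmf_boys_by_sum(b, p), pmf_boys_closed_form(b, p))
```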
In particular, the number of boys that a family has is geometric, with parameter $2p/(1 + p)$.<br />
Also, if you recall that the mean of a geometric($p$) is just $(1 - p)/p$, then you find that<br />
$E(T) = (1 - p)/p$ and<br />
\[ E(B) = \frac{1 - \frac{2p}{1+p}}{\frac{2p}{1+p}} = \frac{1 + p - 2p}{2p} = \frac{1 - p}{2p} . \]<br />
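The two means can be compared in exact rational arithmetic, which avoids any rounding questions. The sketch below assumes the value $p = 3/10$ purely for illustration, and `mean_of_geometric` is a hypothetical helper name.

```python
from fractions import Fraction


def mean_of_geometric(success_prob):
    """Mean (1 - s)/s of a geometric distribution on {0, 1, 2, ...} with parameter s."""
    return (1 - success_prob) / success_prob


p = Fraction(3, 10)                              # illustrative choice of p
mean_children = mean_of_geometric(p)             # E(T) = (1 - p)/p
mean_boys = mean_of_geometric(2 * p / (1 + p))   # E(B), using parameter 2p/(1 + p)
print(mean_children, mean_boys)                  # prints: 7/3 7/6
```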
So $E(B) = E(T)/2$, which is of course what you'd expect: the expected number of boys is<br />
exactly half the expected number of children. What's surprising is that $B$ turns out to be<br />
geometric. We can work out $P(T = t \mid B = b)$ as well; for $t \ge b$,<br />
\begin{align*}
P(T = t \mid B = b) &= \frac{P(T = t) P(B = b \mid T = t)}{P(B = b)} \\
&= \frac{(1 - p)^t p \, 2^{-t} \binom{t}{b}}{\frac{2p}{1+p} \left( \frac{1-p}{1+p} \right)^b} \\
&= \left( \frac{1-p}{2} \right)^t \binom{t}{b} \left( \frac{1+p}{2} \right) \left( \frac{1+p}{1-p} \right)^b \\
&= \binom{(t+1) - 1}{(b+1) - 1} \left( \frac{1+p}{2} \right)^{b+1} \left( \frac{1-p}{2} \right)^{t-b} .
\end{align*}<br />
The reason for writing it in this form is for comparison with the negative binomial. The<br />
number of trials $T_r$ until the $r$th success in Bernoulli($\pi$) trials has distribution<br />
\[ P(T_r = t) = \binom{t - 1}{r - 1} \pi^r (1 - \pi)^{t - r} \]<br />
and so comparing these (with $r = b + 1$, $\pi = (1 + p)/2$, and $t + 1$ in place of $t$), we see that, given $B = b$, the distribution of $T + 1$ is the distribution<br />
of the waiting time until the $(b + 1)$st success in Bernoulli($(1 + p)/2$) trials.<br />
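The identification with the negative binomial can be checked numerically: compute $P(T = t \mid B = b)$ from Bayes' rule and compare it with the negative binomial probability that the $(b + 1)$st success occurs on trial $t + 1$. This is a sketch under the notes' formulas; the function names are hypothetical.

```python
import math


def posterior_children(t, b, p):
    """P(T = t | B = b) by Bayes' rule, using the geometric closed form for P(B = b)."""
    if t < b:
        return 0.0
    joint = (1.0 - p) ** t * p * math.comb(t, b) * 2.0 ** (-t)
    marginal = (2.0 * p / (1.0 + p)) * ((1.0 - p) / (1.0 + p)) ** b
    return joint / marginal


def negative_binomial_pmf(t, r, pi):
    """P(T_r = t): the r-th success in Bernoulli(pi) trials occurs on trial t."""
    if t < r:
        return 0.0
    return math.comb(t - 1, r - 1) * pi ** r * (1.0 - pi) ** (t - r)


if __name__ == "__main__":
    p, b = 0.3, 2  # arbitrary illustrative values
    for t in range(b, b + 5):
        print(t, posterior_children(t, b, p),
              negative_binomial_pmf(t + 1, b + 1, (1.0 + p) / 2.0))
```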
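In particular, since the mean waiting time until the $r$th success in Bernoulli($\pi$) trials is $r/\pi$, that means that<br />
\[ E(T + 1 \mid B = b) = \frac{b + 1}{\frac{1+p}{2}} \]<br />
and so we get the formula<br />
\[ E(T \mid B = b) = \frac{b + 1}{\frac{1+p}{2}} - 1 = \frac{2b - p + 1}{p + 1} . \]<br />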
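The conditional mean can be cross-checked by summing $t \cdot P(T = t \mid B = b)$ directly, with the posterior written in its negative binomial form. A minimal sketch: the function names and the truncation point `t_max` are assumptions, and the truncation is harmless here because the terms decay like $((1 - p)/2)^t$.

```python
import math


def conditional_mean_children(b, p, t_max=400):
    """Truncated sum of t * P(T = t | B = b) over t >= b."""
    pi = (1.0 + p) / 2.0
    return sum(
        t * math.comb(t, b) * pi ** (b + 1) * (1.0 - pi) ** (t - b)
        for t in range(b, t_max + 1)
    )


def conditional_mean_formula(b, p):
    """Closed form (2b - p + 1) / (p + 1)."""
    return (2.0 * b - p + 1.0) / (p + 1.0)


if __name__ == "__main__":
    for p in (0.01, 0.3, 0.99):  # arbitrary illustrative values
        print(p, conditional_mean_children(3, p), conditional_mean_formula(3, p))
```

Evaluating `conditional_mean_formula` at values of `p` very close to 0 or 1 gives a quick feel for how the formula behaves at the two extremes of family size.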
In particular, as $p \to 0$ (which corresponds to large families) we get<br />
\[ \lim_{p \to 0} \frac{2b - p + 1}{p + 1} = 2b + 1 \]<br />
and so if we learn a family has $b$ boys, we expect them to have $2b + 1$ children; the 2 here is<br />
probably not surprising but the 1 is. In the limit as $p \to 1$ (small families) we have<br />
\[ \lim_{p \to 1} \frac{2b - p + 1}{p + 1} = b . \]<br />
If people almost never have children, then if we learn a family has, say, three boys, it's<br />
very likely that that's all of their children.<br />