
Stat 134 Fall 2011: Distributions of numbers of children

Michael Lugo

November 16, 2011

We considered the following model in class today: in a certain town, the probability that a random family has $t$ children is $P(T = t) = (1 - p)^t p$. All children have probability 1/2 of being a boy; the gender of each child is independent of all other children and of the number of children the family has. So if $B$ is the number of boys that a family has, then

$$P(B = b \mid T = t) = 2^{-t} \binom{t}{b}.$$

We want to know the distribution of the number of boys.
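The model can be explored by simulation before doing any algebra. The following is a minimal sketch, not part of the original notes: the function names, the seed, and the sample size are illustrative assumptions, and it requires $0 < p \le 1$ so that the sampling loop terminates.

```python
import random


def sample_family(p, rng):
    """Draw (children, boys) for one family under the model in the text."""
    # T counts failures before the first success, so P(T = t) = (1 - p)**t * p.
    children = 0
    while rng.random() >= p:
        children += 1
    # Each child is a boy with probability 1/2, independently of everything else.
    boys = sum(rng.random() < 0.5 for _ in range(children))
    return children, boys


def empirical_boy_distribution(p, n_families, seed=0):
    """Estimate P(B = b) by simulation; returns a dict b -> relative frequency."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(n_families):
        _, boys = sample_family(p, rng)
        counts[boys] = counts.get(boys, 0) + 1
    return {b: c / n_families for b, c in sorted(counts.items())}
```

Calling `empirical_boy_distribution(0.3, 100_000)` gives estimated frequencies for $b = 0, 1, 2, \ldots$ that can be compared against whatever closed form is derived for $P(B = b)$.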

By the definition of conditional probability we can say

$$P(B = b) = \sum_t P(B = b \mid T = t) P(T = t) = \sum_{t \ge b} \binom{t}{b} 2^{-t} (1 - p)^t p$$

but this sum is tricky! To be able to do it let's make it even harder; we'll find $\sum_{b \ge 0} P(B = b) z^b$, the generating function of $B$. This is given by the double sum

$$\sum_{b \ge 0} P(B = b) z^b = \sum_{n \ge 0} \left( \sum_{t \ge n} \binom{t}{n} 2^{-t} q^t p \right) z^n$$

where $q = 1 - p$ and the summation index on the right has been renamed from $b$ to $n$. Often with these sums it turns out to be useful to interchange the order of summation. If we do this we get

$$\sum_{t \ge 0} \sum_{n=0}^{t} \binom{t}{n} 2^{-t} q^t p z^n$$

and pulling out the factors that do not depend on $n$ gives

$$\sum_{t \ge 0} \left( \frac{q}{2} \right)^t p \sum_{n=0}^{t} \binom{t}{n} z^n.$$

Now the inner sum (over $n$) can be done by the binomial theorem; it's $(1 + z)^t$. So we're left with

$$\sum_{t \ge 0} \left( \frac{q}{2} \right)^t p (1 + z)^t.$$


This is just a geometric series, which is a little more obvious if it's written in the form

$$p \sum_{t \ge 0} \left( \frac{q(1 + z)}{2} \right)^t.$$

Its sum is therefore

$$\frac{p}{1 - \frac{q}{2}(1 + z)}$$

and after substituting $q = 1 - p$ and dividing the numerator and denominator by $(1 + p)/2$ this is

$$\frac{\frac{2p}{1+p}}{1 - \frac{1-p}{1+p} z}.$$
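As a sanity check on the algebra, the double sum can be evaluated numerically and compared with the closed form just obtained. This is only a sketch: the function names and the truncation point `t_max` are assumptions made for illustration, and the comparison is meaningful only where the series converges, that is, for $|q(1 + z)/2| < 1$.

```python
from math import comb


def gf_double_sum(z, p, t_max=200):
    """Truncated double sum over t <= t_max and n <= t of C(t, n) 2^-t q^t p z^n."""
    q = 1.0 - p
    total = 0.0
    for t in range(t_max + 1):
        # Inner sum over n; by the binomial theorem this equals (1 + z)**t.
        inner = sum(comb(t, n) * z**n for n in range(t + 1))
        total += (q / 2.0) ** t * p * inner
    return total


def gf_closed_form(z, p):
    """Closed form (2p/(1+p)) / (1 - ((1-p)/(1+p)) z) from the text."""
    return (2.0 * p / (1.0 + p)) / (1.0 - (1.0 - p) / (1.0 + p) * z)
```

Evaluating both functions at the same `z` and `p` should give values that agree up to truncation and rounding error; at `z = 1` both should be close to 1, since a probability generating function evaluated at 1 sums all the probabilities.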

Of course this is just a geometric series; we can expand it to get

$$\frac{2p}{1+p} + \frac{2p}{1+p} \cdot \frac{1-p}{1+p} z + \frac{2p}{1+p} \left( \frac{1-p}{1+p} \right)^2 z^2 + \cdots$$

and the coefficient of $z^b$ is $P(B = b)$; this is

$$P(B = b) = \frac{2p}{1+p} \left( \frac{1-p}{1+p} \right)^b.$$
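The closed form for $P(B = b)$ can be checked directly against the original sum over $t \ge b$. The sketch below is illustrative only: the function names are made up for this example, and the infinite sum is cut off at an assumed `t_max`, which is harmless here because the terms decay geometrically.

```python
from math import comb


def pmf_boys_by_sum(b, p, t_max=300):
    """P(B = b) as the sum over t >= b of C(t, b) 2^-t (1-p)^t p, truncated at t_max."""
    half_q = (1.0 - p) / 2.0
    return sum(comb(t, b) * half_q**t * p for t in range(b, t_max + 1))


def pmf_boys_closed_form(b, p):
    """P(B = b) = r (1 - r)^b with r = 2p / (1 + p), so 1 - r = (1-p)/(1+p)."""
    r = 2.0 * p / (1.0 + p)
    return r * (1.0 - r) ** b
```

For a fixed `p`, comparing the two functions over a range of `b` values is a quick way to catch an algebra slip in the generating-function argument.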

In particular, the number of boys that a family has is geometric, with parameter $2p/(1 + p)$. Also, if you recall that the mean of a geometric($p$) is just $(1 - p)/p$, then you find that $E(T) = (1 - p)/p$ and

$$E(B) = \frac{1 - \frac{2p}{1+p}}{\frac{2p}{1+p}} = \frac{1 + p - 2p}{2p} = \frac{1 - p}{2p}.$$
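The two means can also be computed by brute force from the geometric probabilities. This is a rough numerical sketch with assumed names and an assumed cutoff `k_max`; it uses the convention from the text that a geometric variable takes the values $0, 1, 2, \ldots$.

```python
def geometric_mean_by_sum(r, k_max=5000):
    """Mean of P(X = k) = (1 - r)**k * r for k = 0, 1, 2, ..., truncated at k_max."""
    return sum(k * (1.0 - r) ** k * r for k in range(k_max + 1))


def expected_children(p):
    """E(T), where T is geometric with parameter p."""
    return geometric_mean_by_sum(p)


def expected_boys(p):
    """E(B), where B is geometric with parameter 2p / (1 + p)."""
    return geometric_mean_by_sum(2.0 * p / (1.0 + p))
```

For any `p` that is not too close to 0, `expected_boys(p)` should come out very close to half of `expected_children(p)`; for very small `p` the cutoff `k_max` would need to be raised.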

So $E(B) = E(T)/2$, which is of course what you'd expect: the expected number of boys is exactly half the expected number of children. What's surprising is that $B$ turns out to be geometric. We can work out $P(T = t \mid B = b)$ as well:

$$\begin{aligned}
P(T = t \mid B = b) &= \frac{P(T = t) P(B = b \mid T = t)}{P(B = b)} \\
&= \frac{(1 - p)^t p \, 2^{-t} \binom{t}{b}}{\frac{2p}{1+p} \left( \frac{1-p}{1+p} \right)^b} \\
&= \left( \frac{1-p}{2} \right)^t \binom{t}{b} \left( \frac{1+p}{2} \right) \left( \frac{1+p}{1-p} \right)^b \\
&= \binom{(t+1) - 1}{(b+1) - 1} \left( \frac{1+p}{2} \right)^{b+1} \left( \frac{1-p}{2} \right)^{t-b}.
\end{aligned}$$


The reason for writing it in this form is for comparison with the negative binomial. The number of trials $T_r$ until the $r$th success in Bernoulli($\pi$) trials has distribution

$$P(T_r = t) = \binom{t-1}{r-1} \pi^r (1 - \pi)^{t-r}$$

and so comparing these, we see that, given $B = b$, the distribution of $T + 1$ is the distribution of the waiting time until the $(b + 1)$st success in Bernoulli($(1 + p)/2$) trials.
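The identification with the negative binomial can be tested numerically by computing $P(T = t \mid B = b)$ straight from Bayes' rule and comparing it with the waiting-time probabilities. The code below is a sketch under assumptions: the function names are hypothetical, and the denominator $P(B = b)$ is approximated by a sum truncated at `t_max`.

```python
from math import comb


def conditional_children_pmf(t, b, p, t_max=300):
    """P(T = t | B = b) by Bayes' rule, with P(B = b) from a truncated sum."""
    def joint(s):
        # P(T = s) * P(B = b | T = s)
        return (1.0 - p) ** s * p * comb(s, b) * 2.0 ** (-s)

    if t < b:
        return 0.0
    denominator = sum(joint(s) for s in range(b, t_max + 1))
    return joint(t) / denominator


def negative_binomial_pmf(t, r, pi):
    """P(T_r = t): the r-th success in Bernoulli(pi) trials occurs on trial t."""
    if t < r:
        return 0.0
    return comb(t - 1, r - 1) * pi**r * (1.0 - pi) ** (t - r)
```

The claim in the text corresponds to `conditional_children_pmf(t, b, p)` matching `negative_binomial_pmf(t + 1, b + 1, (1 + p) / 2)` for every `t >= b`; the shift by one in both arguments is where the "$T + 1$" and "$(b + 1)$st success" come from.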

In particular, since the waiting time until the $r$th success in Bernoulli($\pi$) trials has mean $r/\pi$, that means that

$$E(T + 1 \mid B = b) = \frac{b + 1}{\frac{1+p}{2}}$$

and so we get the formula

$$E(T \mid B = b) = \frac{b + 1}{\frac{1+p}{2}} - 1 = \frac{2b - p + 1}{p + 1}.$$
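The formula for $E(T \mid B = b)$ can be compared with a direct sum over the conditional distribution $P(T = t \mid B = b) = \binom{t}{b} \left( \frac{1+p}{2} \right)^{b+1} \left( \frac{1-p}{2} \right)^{t-b}$. As with any such check this is a sketch: the names are illustrative and the sum is truncated at an assumed `t_max`.

```python
from math import comb


def conditional_mean_children_by_sum(b, p, t_max=400):
    """E(T | B = b) as a truncated sum of t * P(T = t | B = b) over t >= b."""
    pi = (1.0 + p) / 2.0
    return sum(
        t * comb(t, b) * pi ** (b + 1) * (1.0 - pi) ** (t - b)
        for t in range(b, t_max + 1)
    )


def conditional_mean_children_formula(b, p):
    """Closed form (2b - p + 1) / (p + 1) from the text."""
    return (2.0 * b - p + 1.0) / (p + 1.0)
```

Plugging different values of `p` into `conditional_mean_children_formula` also shows how the expected family size for a given number of boys shrinks as `p` grows.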

In particular, as $p \to 0$ (which corresponds to large families) we get

$$\lim_{p \to 0} \frac{2b - p + 1}{p + 1} = 2b + 1$$

and so if we learn a family has $b$ boys, we expect them to have $2b + 1$ children; the 2 here is probably not surprising but the 1 is. In the limit as $p \to 1$ (small families) we have

$$\lim_{p \to 1} \frac{2b - p + 1}{p + 1} = b.$$

If families only rarely have any children at all, then if we learn a family has, say, three boys, it's very likely that that's all of their children.
