Stat 134 Fall 2011: Distributions of numbers of children

Michael Lugo 

November 16, 2011 

We considered the following model in class today: in a certain town, the probability that a random family has $t$ children is $P(T = t) = (1 - p)^t p$. All children have probability 1/2 of being a boy; the gender of each child is independent of all other children and of the number of children the family has. So if $B$ is the number of boys that a family has, then $P(B = b \mid T = t) = 2^{-t} \binom{t}{b}$. We want to know the distribution of the number of boys.
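To get a feel for the model before doing any algebra, here is a minimal simulation sketch in Python. The function names, the seed, the sample size, and the value p = 0.3 are illustrative assumptions rather than anything from the lecture, and the sampler assumes p > 0 so that the loop terminates.

```python
import random
from collections import Counter

def sample_family(p, rng):
    """Draw (children, boys) for one family under the model above."""
    # T counts failures before the first success, so P(T = t) = (1 - p)^t * p.
    children = 0
    while rng.random() >= p:
        children += 1
    # Each child is independently a boy with probability 1/2.
    boys = sum(1 for _ in range(children) if rng.random() < 0.5)
    return children, boys

def simulate_boys(p, n_families, seed=0):
    """Return the empirical frequencies of the number of boys."""
    rng = random.Random(seed)
    counts = Counter(sample_family(p, rng)[1] for _ in range(n_families))
    return {b: counts[b] / n_families for b in sorted(counts)}

if __name__ == "__main__":
    # Print the empirical distribution of B for an illustrative p.
    for b, freq in simulate_boys(p=0.3, n_families=100_000).items():
        print(b, round(freq, 4))
```

The printed frequencies are only estimates; they give something to compare against whatever exact distribution the calculation produces.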

By the definition of conditional probability we can say 

$$P(B = b) = \sum_t P(B = b \mid T = t)\, P(T = t) = \sum_{t \ge b} \binom{t}{b} 2^{-t} (1 - p)^t p$$

but this sum is tricky! To be able to do it, let's make it even harder: we'll find $\sum_{b \ge 0} P(B = b) z^b$, the generating function of $B$. This is given by the double sum

$$\sum_{b \ge 0} P(B = b) z^b = \sum_{n \ge 0} \left( \sum_{t \ge n} \binom{t}{n} 2^{-t} q^t p \right) z^n$$

where $q = 1 - p$. Often with these sums it turns out to be useful to interchange the order of summation. If we do this we get

$$\sum_{t \ge 0} \sum_{n = 0}^{t} \binom{t}{n} 2^{-t} q^t p z^n$$

and pulling out constants gives

$$\sum_{t \ge 0} \left( \frac{q}{2} \right)^t p \sum_{n = 0}^{t} \binom{t}{n} z^n.$$

Now the inner sum (over $n$) can be done by the binomial theorem; it's $(1 + z)^t$. So we're left with

$$\sum_{t \ge 0} \left( \frac{q}{2} \right)^t p (1 + z)^t.$$

This is just a geometric series, which is a little more obvious if it's written in the form

$$p \sum_{t \ge 0} \left( \frac{q(1 + z)}{2} \right)^t.$$

Its sum is therefore

$$\frac{p}{1 - \frac{q}{2}(1 + z)}$$

and after some rewriting this is

$$\frac{\frac{2p}{1+p}}{1 - \frac{1-p}{1+p} z}.$$
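As a numerical sanity check on the interchange of summation and the two closed forms, one can evaluate the double sum directly and compare. This is only a sketch: the function names and the truncation point `t_max` are assumptions made for illustration, and the comparison is only meaningful when $|q(1 + z)/2| < 1$, so that the series converges.

```python
from math import comb

def pgf_double_sum(z, p, t_max=300):
    """Truncated double sum: sum over t <= t_max of (q/2)^t p sum_n C(t, n) z^n."""
    q = 1 - p
    total = 0.0
    for t in range(t_max + 1):
        inner = sum(comb(t, n) * z ** n for n in range(t + 1))
        total += (q / 2) ** t * p * inner
    return total

def pgf_geometric_sum(z, p):
    """Closed form p / (1 - (q/2)(1 + z)) from summing the geometric series."""
    q = 1 - p
    return p / (1 - (q / 2) * (1 + z))

def pgf_rewritten(z, p):
    """The same function written as (2p/(1+p)) / (1 - ((1-p)/(1+p)) z)."""
    return (2 * p / (1 + p)) / (1 - (1 - p) / (1 + p) * z)

if __name__ == "__main__":
    z, p = 0.5, 0.3  # illustrative values only
    print(pgf_double_sum(z, p), pgf_geometric_sum(z, p), pgf_rewritten(z, p))
```

The three printed numbers should agree up to floating-point and truncation error.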

Of course this is just a geometric series; we can expand it to get

$$\frac{2p}{1+p} + \frac{2p}{1+p} \cdot \frac{1-p}{1+p}\, z + \frac{2p}{1+p} \left( \frac{1-p}{1+p} \right)^2 z^2 + \cdots$$

and the coefficient of $z^b$ is $P(B = b)$; this is

$$P(B = b) = \frac{2p}{1+p} \left( \frac{1-p}{1+p} \right)^b.$$
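The closed form for $P(B = b)$ can be checked against the original "tricky" sum over $t$. The sketch below assumes small values of $b$ and truncates the sum at a hypothetical `t_max`; the function names are illustrative, not from the lecture.

```python
from math import comb

def pmf_boys_by_sum(b, p, t_max=400):
    """Truncated sum over t >= b of C(t, b) 2^(-t) (1 - p)^t p."""
    q = 1 - p
    return sum(comb(t, b) * (q / 2) ** t * p for t in range(b, t_max + 1))

def pmf_boys_closed_form(b, p):
    """Geometric form (2p/(1+p)) ((1-p)/(1+p))^b read off the generating function."""
    return (2 * p / (1 + p)) * ((1 - p) / (1 + p)) ** b

if __name__ == "__main__":
    p = 0.3  # illustrative value only
    for b in range(5):
        print(b, pmf_boys_by_sum(b, p), pmf_boys_closed_form(b, p))
```

Because $q/2 < 1/2$, the terms of the sum decay quickly and the truncation error is negligible for small $b$.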

In particular, the number of boys that a family has is geometric, with parameter $2p/(1 + p)$. Also, if you recall that the mean of a geometric($p$) is just $(1 - p)/p$, then you find that $E(T) = (1 - p)/p$ and

$$E(B) = \frac{1 - \frac{2p}{1+p}}{\frac{2p}{1+p}} = \frac{1 + p - 2p}{2p} = \frac{1 - p}{2p}.$$
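The arithmetic in the last display is easy to confirm exactly with rational numbers. In this sketch the helper names and the choice $p = 1/3$ are assumptions for illustration; "geometric" means the version on $\{0, 1, 2, \ldots\}$ used throughout these notes.

```python
from fractions import Fraction

def geometric_mean_failures(success_prob):
    """Mean (1 - s)/s of a geometric distribution on {0, 1, 2, ...} with parameter s."""
    return (1 - success_prob) / success_prob

def expected_children(p):
    """E(T) for T geometric with parameter p."""
    return geometric_mean_failures(p)

def expected_boys(p):
    """E(B) for B geometric with parameter 2p/(1 + p)."""
    return geometric_mean_failures(2 * p / (1 + p))

if __name__ == "__main__":
    p = Fraction(1, 3)  # illustrative value only
    print(expected_children(p))  # 2
    print(expected_boys(p))      # 1
```

Using `Fraction` keeps every step exact, so the comparison with $(1 - p)/(2p)$ involves no rounding.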

So $E(B) = E(T)/2$, which is of course what you'd expect: the expected number of boys is exactly half the expected number of children. What's surprising is that $B$ turns out to be geometric. We can work out $P(T = t \mid B = b)$ as well:

$$\begin{aligned}
P(T = t \mid B = b) &= \frac{P(T = t)\, P(B = b \mid T = t)}{P(B = b)} \\
&= \frac{(1 - p)^t p\, 2^{-t} \binom{t}{b}}{\frac{2p}{1+p} \left( \frac{1-p}{1+p} \right)^b} \\
&= \binom{t}{b} \left( \frac{1-p}{2} \right)^t \left( \frac{1+p}{2} \right) \left( \frac{1+p}{1-p} \right)^b \\
&= \binom{(t + 1) - 1}{(b + 1) - 1} \left( \frac{1+p}{2} \right)^{b+1} \left( \frac{1-p}{2} \right)^{t-b}.
\end{aligned}$$

The reason for writing it in this form is for comparison with the negative binomial. The number of trials $T_r$ until the $r$th success in Bernoulli($\pi$) trials has distribution

$$P(T_r = t) = \binom{t - 1}{r - 1} \pi^r (1 - \pi)^{t - r}$$

and so comparing these, we see that, given $B = b$, the distribution of $T + 1$ is the distribution of the waiting time until the $(b + 1)$st success in Bernoulli($(1 + p)/2$) trials.
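The identification with the negative binomial can be verified exactly for particular values. The sketch below computes $P(T = t \mid B = b)$ from Bayes' rule and compares it with the waiting-time distribution evaluated at $t + 1$ trials, $r = b + 1$ successes and $\pi = (1 + p)/2$. The function names and the test value $p = 1/4$ are illustrative assumptions.

```python
from fractions import Fraction
from math import comb

def conditional_children_pmf(t, b, p):
    """P(T = t | B = b) via Bayes' rule, using the geometric form of P(B = b)."""
    if t < b:
        return Fraction(0)
    joint = (1 - p) ** t * p * Fraction(comb(t, b), 2 ** t)
    marginal = (2 * p / (1 + p)) * ((1 - p) / (1 + p)) ** b
    return joint / marginal

def negative_binomial_pmf(trials, r, pi):
    """P(T_r = trials): the r-th success occurs on exactly the given trial."""
    if trials < r:
        return Fraction(0)
    return comb(trials - 1, r - 1) * pi ** r * (1 - pi) ** (trials - r)

if __name__ == "__main__":
    p = Fraction(1, 4)  # illustrative value only
    b = 2
    for t in range(b, b + 5):
        lhs = conditional_children_pmf(t, b, p)
        rhs = negative_binomial_pmf(t + 1, b + 1, (1 + p) / 2)
        print(t, lhs, rhs, lhs == rhs)
```

With `Fraction` inputs the two sides are compared as exact rationals rather than as floating-point approximations.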

In particular that means that

$$E(T + 1 \mid B = b) = \frac{b + 1}{\frac{1+p}{2}}$$

and so we get the formula

$$E(T \mid B = b) = \frac{b + 1}{\frac{1+p}{2}} - 1 = \frac{2b - p + 1}{p + 1}.$$
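The formula for $E(T \mid B = b)$ can be checked numerically by summing $t \cdot P(T = t \mid B = b)$ directly. This is a rough sketch: the function names and the truncation parameter `extra_terms` are assumptions, and the truncation is only safe for small $b$.

```python
from math import comb

def conditional_mean_children(b, p, extra_terms=400):
    """Approximate E(T | B = b) by summing over t = b, ..., b + extra_terms."""
    pi = (1 + p) / 2
    total = 0.0
    for t in range(b, b + extra_terms + 1):
        # P(T = t | B = b) = C(t, b) pi^(b+1) (1 - pi)^(t-b)
        pmf = comb(t, b) * pi ** (b + 1) * (1 - pi) ** (t - b)
        total += t * pmf
    return total

def conditional_mean_formula(b, p):
    """Closed form (2b - p + 1) / (p + 1)."""
    return (2 * b - p + 1) / (p + 1)

if __name__ == "__main__":
    for b, p in [(0, 0.05), (3, 0.3), (5, 0.9)]:  # illustrative values only
        print(b, p, conditional_mean_children(b, p), conditional_mean_formula(b, p))
```

Since $1 - \pi = (1 - p)/2 < 1/2$, the tail of the sum decays geometrically and the truncated sum should match the closed form to many digits.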

In particular, as $p \to 0$ (which corresponds to large families) we get

$$\lim_{p \to 0} \frac{2b - p + 1}{p + 1} = 2b + 1$$

and so if we learn a family has $b$ boys, we expect them to have $2b + 1$ children; the 2 here is probably not surprising but the 1 is. In the limit as $p \to 1$ (small families) we have

$$\lim_{p \to 1} \frac{2b - p + 1}{p + 1} = b.$$

If people just don't have children at all, then if we learn a family has, say, three boys, it's very likely that that's all of their children.
