08.10.2016 Views

Foundations of Data Science

2dLYwbK


are $O(\ln n)$ in size and the expected size of the small components is $O(1)$.

An important tool in our analysis of branching processes is the generating function. The generating function for a nonnegative integer valued random variable $y$ is

$$f(x) = \sum_{i=0}^{\infty} p_i x^i,$$

where $p_i$ is the probability that $y$ equals $i$. The reader not familiar with generating functions should consult Section 12.8 of the appendix.
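
As a minimal sketch of the definition, a generating function with finitely many nonzero $p_i$ can be stored as its coefficient list and evaluated directly. The offspring distribution `p` below and the helper name `evaluate_gf` are illustrative assumptions, not taken from the text.

```python
def evaluate_gf(p, x):
    """Evaluate f(x) = sum_i p[i] * x**i using Horner's rule."""
    result = 0.0
    for coeff in reversed(p):
        result = result * x + coeff
    return result


# Hypothetical offspring distribution: 0, 1, or 2 children
# with probabilities 1/4, 1/2, 1/4 (an assumption for illustration).
p = [0.25, 0.5, 0.25]

f_at_one = evaluate_gf(p, 1.0)   # the probabilities sum to one
f_at_zero = evaluate_gf(p, 0.0)  # f(0) is p_0
f_at_half = evaluate_gf(p, 0.5)

print(f_at_one, f_at_zero, f_at_half)  # prints: 1.0 0.25 0.5625
```

Evaluating at $x = 1$ recovers the total probability, and evaluating at $x = 0$ recovers $p_0$.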

Let the random variable $z_j$ be the number of children in the $j^{th}$ generation and let $f_j(x)$ be the generating function for $z_j$. Then $f_1(x) = f(x)$ is the generating function for the first generation, where $f(x)$ is the generating function for the number of children at a node in the tree. The generating function for the $2^{nd}$ generation is $f_2(x) = f(f(x))$. In general, the generating function for the $(j+1)^{st}$ generation is given by $f_{j+1}(x) = f_j(f(x))$. To see this, observe two things.

First, the generating function for the sum of two independent, identically distributed integer valued random variables $x_1$ and $x_2$ is the square of their generating function

$$f^2(x) = p_0^2 + (p_0 p_1 + p_1 p_0)\,x + (p_0 p_2 + p_1 p_1 + p_2 p_0)\,x^2 + \cdots.$$

For $x_1 + x_2$ to have value zero, both $x_1$ and $x_2$ must have value zero; for $x_1 + x_2$ to have value one, exactly one of $x_1$ or $x_2$ must have value zero and the other must have value one; and so on. In general, the generating function for the sum of $i$ independent random variables, each with generating function $f(x)$, is $f^i(x)$. Here the superscript denotes a power of $f$, whereas the subscript in $f_j$ denotes the generating function for the $j^{th}$ generation.
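
The first observation can be checked numerically. The sketch below compares the coefficients of $f^i(x)$, computed by polynomial multiplication, with the distribution of a sum of $i$ independent draws obtained by enumerating every outcome. The distribution `p` and the helper names `poly_mul`, `poly_pow`, and `sum_distribution` are assumptions made for illustration.

```python
from itertools import product


def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for k, bk in enumerate(b):
            out[i + k] += ai * bk
    return out


def poly_pow(p, i):
    """Coefficients of f^i(x), the i-th power of the generating function."""
    result = [1.0]
    for _ in range(i):
        result = poly_mul(result, p)
    return result


def sum_distribution(p, i):
    """Distribution of the sum of i independent draws from p, by enumeration."""
    dist = [0.0] * (i * (len(p) - 1) + 1)
    for outcome in product(range(len(p)), repeat=i):
        prob = 1.0
        for value in outcome:
            prob *= p[value]
        dist[sum(outcome)] += prob
    return dist


# Hypothetical distribution used only for this check.
p = [0.25, 0.5, 0.25]

square = poly_pow(p, 2)          # coefficients of f^2(x)
brute = sum_distribution(p, 2)   # distribution of x_1 + x_2

print(square)
print(brute)
```

The enumeration relies on independence: the probability of each pair of outcomes is taken to be the product of the individual probabilities.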

The second observation is that the coefficient of $x^i$ in $f_j(x)$ is the probability of there being $i$ children in the $j^{th}$ generation. If there are $i$ children in the $j^{th}$ generation, the number of children in the $(j+1)^{st}$ generation is the sum of $i$ independent random variables, each with generating function $f(x)$. Thus, the generating function for the $(j+1)^{st}$ generation, given $i$ children in the $j^{th}$ generation, is $f^i(x)$. The generating function for the $(j+1)^{st}$ generation is given by

$$f_{j+1}(x) = \sum_{i=0}^{\infty} \text{Prob}(z_j = i)\, f^i(x).$$

If $f_j(x) = \sum_{i=0}^{\infty} a_i x^i$, then $a_i = \text{Prob}(z_j = i)$, so $f_{j+1}$ is obtained by substituting $f(x)$ for $x$ in $f_j(x)$; that is, $f_{j+1}(x) = f_j(f(x))$.
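
The recurrence can be illustrated with a short sketch that computes the coefficients of $f_{j+1}$ in two ways: by substituting $f(x)$ for $x$ in $f_j(x)$, and by forming the weighted sum of powers $f^i(x)$. The distribution `p` and all helper names are assumptions for illustration, and the code handles only distributions with finitely many nonzero $p_i$.

```python
def poly_add(a, b):
    """Add two polynomials given as coefficient lists (lowest degree first)."""
    n = max(len(a), len(b))
    return [(a[k] if k < len(a) else 0.0) + (b[k] if k < len(b) else 0.0)
            for k in range(n)]


def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for k, bk in enumerate(b):
            out[i + k] += ai * bk
    return out


def compose(outer, inner):
    """Coefficients of outer(inner(x)), using Horner's rule on polynomials."""
    result = [outer[-1]]
    for coeff in reversed(outer[:-1]):
        result = poly_mul(result, inner)
        result[0] += coeff
    return result


def next_generation_by_sum(fj, f):
    """Coefficients of sum_i Prob(z_j = i) * f^i(x), where fj[i] = Prob(z_j = i)."""
    total = [0.0]
    power = [1.0]  # f^0(x) = 1
    for prob in fj:
        total = poly_add(total, [prob * c for c in power])
        power = poly_mul(power, f)
    return total


def generation_gf(f, j):
    """Coefficients of f_j, built from f_1 = f and f_{j+1}(x) = f_j(f(x))."""
    fj = list(f)
    for _ in range(j - 1):
        fj = compose(fj, f)
    return fj


# Hypothetical offspring distribution used only for this check.
p = [0.25, 0.5, 0.25]

f2 = generation_gf(p, 2)
f3_by_substitution = compose(f2, p)
f3_by_sum = next_generation_by_sum(f2, p)

print(f2)
print(f3_by_substitution)
```

Entry `i` of each resulting list is the probability of there being `i` children in that generation, so each list sums to one.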

Since $f(x)$ and its iterates, $f_2, f_3, \ldots$, are all polynomials in $x$ with nonnegative coefficients, $f(x)$ and its iterates are all monotonically increasing and convex on the unit interval. Since the probabilities of the number of children of a node sum to one, if $p_0 < 1$, some coefficient of $x$ to a power other than zero in $f(x)$ is nonzero and $f(x)$ is strictly increasing.
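
A rough numerical check of these properties, for one assumed distribution, is to sample $f$ and its first few iterates on a grid over the unit interval and inspect first and second differences. The distribution `p`, the grid spacing, and the tolerance are all illustrative assumptions; a grid check of this kind is a sanity test and not a proof.

```python
def f(p, x):
    """Evaluate the generating function with coefficients p at x (Horner's rule)."""
    result = 0.0
    for coeff in reversed(p):
        result = result * x + coeff
    return result


def iterate(p, j, x):
    """Evaluate the j-th iterate f_j(x) by applying f to x a total of j times."""
    for _ in range(j):
        x = f(p, x)
    return x


def increasing_and_convex(values, tol=1e-12):
    """Return (strictly increasing?, convex up to tol?) for equally spaced samples."""
    first = [b - a for a, b in zip(values, values[1:])]
    second = [b - a for a, b in zip(first, first[1:])]
    return all(d > 0 for d in first), all(d >= -tol for d in second)


# Hypothetical offspring distribution with p_0 < 1.
p = [0.25, 0.5, 0.25]
xs = [k / 100 for k in range(101)]  # grid on the unit interval

results = {j: increasing_and_convex([iterate(p, j, x) for x in xs])
           for j in range(1, 5)}
print(results)
```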
