Foundations of Data Science

The density of $y$ is the unit variance, zero mean Gaussian, thus $E(yy^T) = I$.

Bernoulli trials and the binomial distribution

A Bernoulli trial has two possible outcomes, called success or failure, with probabilities $p$ and $1-p$, respectively. If there are $n$ independent Bernoulli trials, the probability of exactly $k$ successes is given by the binomial distribution
$$B(n, p) = \binom{n}{k} p^k (1-p)^{n-k}.$$

The mean and variance of the binomial distribution $B(n, p)$ are $np$ and $np(1-p)$, respectively. The mean of the binomial distribution is $np$, by linearity of expectations. The variance is $np(1-p)$ since each trial has variance $p(1-p)$ and the variance of a sum of independent random variables is the sum of their variances.
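As a quick numerical check (an illustrative sketch, not from the text), one can simulate Bernoulli trials and compare the empirical mean and variance with $np$ and $np(1-p)$; the use of numpy and the particular values of $n$, $p$, and the sample count are assumptions of the sketch.

```python
import numpy as np

# Sketch: empirically check that B(n, p) has mean np and variance np(1-p).
# The values of n, p, and the number of samples are arbitrary choices.
rng = np.random.default_rng(0)
n, p = 100, 0.3
samples = rng.binomial(n, p, size=200_000)

print("empirical mean:", samples.mean(), " expected:", n * p)
print("empirical var: ", samples.var(), " expected:", n * p * (1 - p))
```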

Let $x_1$ be the number of successes in $n_1$ trials and let $x_2$ be the number of successes in $n_2$ trials. The probability distribution of the sum of the successes, $x_1 + x_2$, is the same as the distribution of $x_1 + x_2$ successes in $n_1 + n_2$ trials. Thus, $B(n_1, p) + B(n_2, p) = B(n_1 + n_2, p)$.
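This identity can also be checked numerically (an illustrative sketch, not part of the original text): convolving the probability mass functions of $B(n_1, p)$ and $B(n_2, p)$ reproduces the pmf of $B(n_1 + n_2, p)$. The use of scipy and the particular parameter values are assumptions.

```python
import numpy as np
from scipy.stats import binom

# Sketch: the distribution of x1 + x2, with x1 ~ B(n1, p) and x2 ~ B(n2, p)
# independent, equals B(n1 + n2, p).  Parameter values are arbitrary.
n1, n2, p = 7, 12, 0.4
pmf1 = binom.pmf(np.arange(n1 + 1), n1, p)
pmf2 = binom.pmf(np.arange(n2 + 1), n2, p)

sum_pmf = np.convolve(pmf1, pmf2)                  # pmf of x1 + x2
direct = binom.pmf(np.arange(n1 + n2 + 1), n1 + n2, p)

print(np.max(np.abs(sum_pmf - direct)))            # ~1e-16, i.e. identical
```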

When $p$ is a constant, the expected degree of vertices in $G(n, p)$ increases with $n$. For example, in $G\left(n, \frac{1}{2}\right)$, the expected degree of a vertex is $n/2$. In many real applications, we will be concerned with $G(n, p)$ where $p = d/n$, for $d$ a constant; i.e., graphs whose expected degree is a constant $d$ independent of $n$. Holding $d = np$ constant as $n$ goes to infinity, the binomial distribution
$$\text{Prob}(k) = \binom{n}{k} p^k (1-p)^{n-k}$$
approaches the Poisson distribution
$$\text{Prob}(k) = \frac{(np)^k}{k!} e^{-np} = \frac{d^k}{k!} e^{-d}.$$

To see this, assume $k = o(n)$ and use the approximations $n - k \cong n$, $\binom{n}{k} \cong \frac{n^k}{k!}$, and $\left(1 - \frac{1}{n}\right)^{n-k} \cong e^{-1}$ to approximate the binomial distribution by
$$\lim_{n \to \infty} \binom{n}{k} p^k (1-p)^{n-k} = \frac{n^k}{k!} \left(\frac{d}{n}\right)^k \left(1 - \frac{d}{n}\right)^n = \frac{d^k}{k!} e^{-d}.$$

Note that for $p = \frac{d}{n}$, where $d$ is a constant independent of $n$, the probability of the binomial distribution falls off rapidly for $k > d$, and is essentially zero for all but some finite number of values of $k$. This justifies the $k = o(n)$ assumption. Thus, the Poisson distribution is a good approximation.
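To make the convergence concrete (again an illustrative sketch, not from the original text), the following compares the $B(n, d/n)$ probabilities with the Poisson probabilities $\frac{d^k}{k!} e^{-d}$ as $n$ grows; scipy and the chosen values of $d$ and $n$ are assumptions.

```python
import numpy as np
from scipy.stats import binom, poisson

# Sketch: with p = d/n and d fixed, B(n, p) approaches Poisson(d) as n grows.
d = 3
ks = np.arange(15)        # values of k carrying essentially all of the mass
for n in (10, 100, 1000, 10_000):
    gap = np.max(np.abs(binom.pmf(ks, n, d / n) - poisson.pmf(ks, d)))
    print(f"n = {n:6d}   max |binomial - Poisson| = {gap:.2e}")
```

The printed gap shrinks roughly like $1/n$, illustrating why the Poisson distribution is used as the limiting degree distribution for $G(n, d/n)$.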

