08.10.2016 Views

Foundations of Data Science

2dLYwbK

2dLYwbK

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

The normal distribution is<br />

1<br />

√ e − 1 (x−m) 2<br />

2 σ 2<br />

2πσ<br />

where m is the mean and σ 2 1<br />

is the variance. The coefficient √<br />

2πσ<br />

makes the integral <strong>of</strong><br />

the distribution be one. If we measure distance in units <strong>of</strong> the standard deviation σ from<br />

the mean, then<br />

φ(x) = √ 1 e − 1 2 x2<br />

2π<br />

Standard tables give values <strong>of</strong> the integral<br />

∫ t<br />

0<br />

φ(x)dx<br />

and from these values one can compute probability integrals for a normal distribution<br />

with mean m and variance σ 2 .<br />

General Gaussians<br />

So far we have seen spherical Gaussian densities in R d . The word spherical indicates<br />

that the level curves <strong>of</strong> the density are spheres. If a random vector y in R d has a spherical<br />

Gaussian density with zero mean, then y i and y j , i ≠ j, are independent. However, in<br />

many situations the variables are correlated. To model these Gaussians, level curves that<br />

are ellipsoids rather than spheres are used.<br />

For a random vector x, the covariance <strong>of</strong> x i and x j is E((x i − µ i )(x j − µ j )). We list<br />

the covariances in a matrix called the covariance matrix, denoted Σ. 36 Since x and µ are<br />

column vectors, (x − µ)(x − µ) T is a d × d matrix. Expectation <strong>of</strong> a matrix or vector<br />

means componentwise expectation.<br />

Σ = E ( (x − µ)(x − µ) T ) .<br />

The general Gaussian density with mean µ and positive definite covariance matrix Σ is<br />

1<br />

f(x) = √<br />

(−<br />

(2π)d det(Σ) exp 1 )<br />

2 (x − µ)T Σ −1 (x − µ) .<br />

To compute the covariance matrix <strong>of</strong> the Gaussian, substitute y = Σ −1/2 (x − µ). Noting<br />

that a positive definite symmetric matrix has a square root:<br />

E((x − µ)(x − µ) T = E(Σ 1/2 yy T Σ 1/2 )<br />

= Σ 1/2 ( E(yy T ) ) Σ 1/2 = Σ.<br />

36 Σ is the standard notation for the covariance matrix. We will use it sparingly so as not to confuse<br />

with the summation sign.<br />

392

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!