08.10.2016 Views

Foundations of Data Science


a probability distribution function.

In many situations, a probability distribution function does not exist. For example, for the uniform probability on the interval [0,1], the probability of any specific value is zero. What we can do is define a probability density function p(x) such that

\[ \mathrm{Prob}(a < x < b) = \int_a^b p(x)\,dx. \]
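As a rough numerical illustration of the density definition, the sketch below approximates the integral of the uniform density on [0,1] with a midpoint rule. The function names `uniform_density` and `prob_between`, and the step count, are illustrative choices and do not come from the text.

```python
def uniform_density(x):
    """Density of the uniform distribution on [0, 1]."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


def prob_between(p, a, b, steps=100_000):
    """Approximate Prob(a < x < b) as the integral of p over [a, b] (midpoint rule)."""
    width = (b - a) / steps
    return sum(p(a + (i + 0.5) * width) for i in range(steps)) * width


# Probability of an interval is the area under the density over that interval.
interval_prob = prob_between(uniform_density, 0.25, 0.75)

# A single point is an interval of width zero, so its probability is zero.
point_prob = prob_between(uniform_density, 0.5, 0.5)

print(interval_prob)  # approximately 0.5
print(point_prob)     # 0.0
```

The second call shows why a probability distribution function over individual values is not useful here: every single value has probability zero, while intervals have positive probability.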

If x is a continuous random variable for which a density function exists, then the cumulative distribution function f(a) is defined by

\[ f(a) = \int_{-\infty}^{a} p(x)\,dx, \]

which gives the probability that x ≤ a.
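To make the cumulative distribution function concrete, here is a minimal sketch that integrates a density numerically and compares the result with a known closed form. The exponential density p(x) = e^{-x} for x ≥ 0 is an assumed example, not one used in the text, and the helper `cdf` with its `lower` parameter is hypothetical: it relies on the density being zero below `lower`, so the integral from −∞ can start there.

```python
import math


def exp_density(x):
    """Assumed example density: p(x) = exp(-x) for x >= 0, and 0 otherwise."""
    return math.exp(-x) if x >= 0 else 0.0


def cdf(p, a, lower, steps=100_000):
    """Approximate f(a), the integral of p from -infinity to a.

    Assumes p is zero below `lower`, so integration starts at `lower`.
    """
    if a <= lower:
        return 0.0
    width = (a - lower) / steps
    return sum(p(lower + (i + 0.5) * width) for i in range(steps)) * width


approx = cdf(exp_density, 2.0, lower=0.0)
exact = 1.0 - math.exp(-2.0)  # closed form of the CDF for this density

print(approx, exact)  # the two values should agree to several decimal places
```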

12.4.1 Sample Space, Events, Independence<br />

There may be more than one relevant random variable in a situation. For example, if one tosses n coins, there are n random variables, $x_1, x_2, \ldots, x_n$, taking on values 0 and 1, a 1 for heads and a 0 for tails. The set of possible outcomes, the sample space, is $\{0,1\}^n$. An event is a subset of the sample space. The event of an odd number of heads consists of all elements of $\{0,1\}^n$ with an odd number of 1's.
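The sample space and the odd-number-of-heads event can be enumerated directly for small n. The sketch below uses n = 3 as an arbitrary illustrative choice; the variable names are not from the text.

```python
from itertools import product

n = 3  # number of coins (illustrative choice)

# The sample space {0, 1}^n: every tuple of n values, 1 for heads, 0 for tails.
sample_space = list(product((0, 1), repeat=n))

# An event is a subset of the sample space; here, an odd number of heads.
odd_heads = [outcome for outcome in sample_space if sum(outcome) % 2 == 1]

print(len(sample_space))  # 8
print(odd_heads)          # [(0, 0, 1), (0, 1, 0), (1, 0, 0), (1, 1, 1)]
```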

Let A and B be two events. The joint occurrence of the two events is denoted by (A∧B). The conditional probability of event A given that event B has occurred is denoted by Prob(A|B) and is given by

\[ \mathrm{Prob}(A|B) = \frac{\mathrm{Prob}(A \wedge B)}{\mathrm{Prob}(B)}. \]
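A small worked case of the conditional probability formula, assuming two fair coins so that every outcome in the sample space is equally likely. The events and helper names (`prob`, `conditional`, `both_heads`, `at_least_one_head`) are illustrative and not from the text.

```python
from fractions import Fraction
from itertools import product

# Two fair coins: each of the four outcomes has probability 1/4 (assumption).
sample_space = list(product((0, 1), repeat=2))


def prob(event):
    """Probability of an event, given as a predicate on outcomes."""
    favourable = sum(1 for outcome in sample_space if event(outcome))
    return Fraction(favourable, len(sample_space))


def both_heads(outcome):
    return outcome == (1, 1)


def at_least_one_head(outcome):
    return sum(outcome) >= 1


def conditional(a, b):
    """Prob(A|B) = Prob(A and B) / Prob(B); requires Prob(B) > 0."""
    return prob(lambda o: a(o) and b(o)) / prob(b)


result = conditional(both_heads, at_least_one_head)
print(result)  # 1/3
```

Here Prob(A ∧ B) = 1/4 and Prob(B) = 3/4, so the ratio is 1/3.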

Events A and B are independent if the occurrence of one event has no influence on the probability of the other. That is, Prob(A|B) = Prob(A) or equivalently, Prob(A ∧ B) = Prob(A)Prob(B). Two random variables x and y are independent if for every possible set A of values for x and every possible set B of values for y, the events x ∈ A and y ∈ B are independent.
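The product condition Prob(A ∧ B) = Prob(A)Prob(B) can be checked mechanically on a small sample space. The sketch below again assumes two fair coins; the events and the helper `are_independent` are hypothetical names chosen for illustration.

```python
from fractions import Fraction
from itertools import product

# Two fair coins, all four outcomes equally likely (assumption).
sample_space = list(product((0, 1), repeat=2))


def prob(event):
    """Probability of an event, given as a predicate on outcomes."""
    favourable = sum(1 for outcome in sample_space if event(outcome))
    return Fraction(favourable, len(sample_space))


def are_independent(a, b):
    """Check Prob(A and B) == Prob(A) * Prob(B) exactly."""
    return prob(lambda o: a(o) and b(o)) == prob(a) * prob(b)


def first_is_heads(outcome):
    return outcome[0] == 1


def second_is_heads(outcome):
    return outcome[1] == 1


def both_heads(outcome):
    return outcome == (1, 1)


independent_pair = are_independent(first_is_heads, second_is_heads)
dependent_pair = are_independent(first_is_heads, both_heads)

print(independent_pair)  # True: 1/4 == 1/2 * 1/2
print(dependent_pair)    # False: 1/4 != 1/2 * 1/4
```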

A collection of n random variables $x_1, x_2, \ldots, x_n$ is mutually independent if for all possible sets $A_1, A_2, \ldots, A_n$ of values of $x_1, x_2, \ldots, x_n$,

\[ \mathrm{Prob}(x_1 \in A_1, x_2 \in A_2, \ldots, x_n \in A_n) = \mathrm{Prob}(x_1 \in A_1)\,\mathrm{Prob}(x_2 \in A_2) \cdots \mathrm{Prob}(x_n \in A_n). \]

If the random variables are discrete, it would suffice to say that for any real numbers $a_1, a_2, \ldots, a_n$

\[ \mathrm{Prob}(x_1 = a_1, x_2 = a_2, \ldots, x_n = a_n) = \mathrm{Prob}(x_1 = a_1)\,\mathrm{Prob}(x_2 = a_2) \cdots \mathrm{Prob}(x_n = a_n). \]
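For discrete variables, the pointwise product condition can be tested by brute force. The sketch below uses a standard example that is not taken from the text: two fair bits and their exclusive-or. Each pair of these three variables satisfies the product condition, yet the three together do not, so mutual independence is a strictly stronger requirement than pairwise independence. The helpers `marginal` and `satisfies_product_rule` are hypothetical names.

```python
from fractions import Fraction
from itertools import product

# Joint distribution of (x1, x2, x3) with x1, x2 fair bits and x3 = x1 XOR x2.
# This construction is an assumed example chosen for illustration.
joint = {(a, b, a ^ b): Fraction(1, 4) for a, b in product((0, 1), repeat=2)}


def marginal(indices, values):
    """Prob(x_i = v for each listed index i), summed over the joint distribution."""
    return sum(
        p
        for outcome, p in joint.items()
        if all(outcome[i] == v for i, v in zip(indices, values))
    )


def satisfies_product_rule(indices):
    """Check Prob(all x_i = a_i) == product of Prob(x_i = a_i) for all values.

    Only the values 0 and 1 need checking; for any other real number
    both sides of the equation are zero.
    """
    for values in product((0, 1), repeat=len(indices)):
        expected = Fraction(1)
        for i, v in zip(indices, values):
            expected *= marginal((i,), (v,))
        if marginal(indices, values) != expected:
            return False
    return True


pairwise = [satisfies_product_rule(pair) for pair in ((0, 1), (0, 2), (1, 2))]
mutual = satisfies_product_rule((0, 1, 2))

print(pairwise)  # [True, True, True]
print(mutual)    # False: Prob(0, 0, 0) is 1/4, but the product of marginals is 1/8
```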
