Theory of Statistics - George Mason University


distribution used in an expectation, we may use notation for the expectation operator similar to that we use on the individual distribution, as described on page 23. Given the random variables X_1 and X_2, we use the notation E_{X_1} to indicate an expectation taken with respect to the marginal distribution of X_1.

We often denote the expectation taken with respect to the joint distribution as simply E, but for emphasis, we may use the notation E_{X_1,X_2}. We also use notation of the form E_P, where P denotes the relevant probability distribution of whatever form, or E_\theta in a parametric family of probability distributions.
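For example, if X_1 and X_2 have joint PDF f_{X_1,X_2}, then for a measurable function g,

E_{X_1}(g(X_1)) = \int g(x_1) f_{X_1}(x_1)\,dx_1,

where f_{X_1}(x_1) = \int f_{X_1,X_2}(x_1, x_2)\,dx_2 is the marginal PDF of X_1.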

Expectations of PDFs and of Likelihoods

If the marginal PDFs of the random variables X_1 and X_2 are f_{X_1} and f_{X_2}, we have the equalities

E_{X_1}\left(\frac{f_{X_2}(X_1)}{f_{X_1}(X_1)}\right) = E_{X_2}\left(\frac{f_{X_1}(X_2)}{f_{X_2}(X_2)}\right) = 1.    (1.61)
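The first equality can be seen by writing the expectation as an integral with respect to the distribution of X_1; assuming the support of f_{X_2} is contained in that of f_{X_1}, so that the ratio is defined a.e.,

E_{X_1}\left(\frac{f_{X_2}(X_1)}{f_{X_1}(X_1)}\right) = \int \frac{f_{X_2}(x)}{f_{X_1}(x)}\, f_{X_1}(x)\,dx = \int f_{X_2}(x)\,dx = 1,

and the second equality follows in the same way with the roles of X_1 and X_2 interchanged.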

On the other hand,

E_{X_1}(-\log(f_{X_1}(X_1))) \le E_{X_1}(-\log(f_{X_2}(X_1))),    (1.62)

with equality only if f_{X_1}(x) = f_{X_2}(x) a.e. (see page 41).
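These two facts are easy to check numerically. The following sketch, with arbitrary choices of the two normal distributions and the sample size, estimates the expectations in equations (1.61) and (1.62) by Monte Carlo:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Two distributions with PDFs f_X1 and f_X2 (arbitrary illustrative choices).
f_x1 = stats.norm(loc=0.0, scale=1.0)
f_x2 = stats.norm(loc=0.5, scale=1.0)

# Sample from the distribution of X1.
x = rng.normal(loc=0.0, scale=1.0, size=1_000_000)

# Equation (1.61): E_X1( f_X2(X1) / f_X1(X1) ) = 1.
print(np.mean(f_x2.pdf(x) / f_x1.pdf(x)))  # approximately 1

# Inequality (1.62): E_X1(-log f_X1(X1)) <= E_X1(-log f_X2(X1)).
print(np.mean(-f_x1.logpdf(x)), np.mean(-f_x2.logpdf(x)))  # first <= second
```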

When the distributions are in the same parametric family, we may write f_\theta with different values of \theta instead of f_{X_1} and f_{X_2}. In that case, it is more natural to think of the functions as likelihoods, since the parameter is the variable. From equation (1.61), for example, we have for the likelihood ratio,

E_{\theta_1}\left(\frac{L(\theta_2; X)}{L(\theta_1; X)}\right) = 1.    (1.63)
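As a concrete instance, consider the N(\theta, 1) family, in which L(\theta; x) = \exp(-(x - \theta)^2/2)/\sqrt{2\pi}. The likelihood ratio is

\frac{L(\theta_2; X)}{L(\theta_1; X)} = \exp\left((\theta_2 - \theta_1)X - \frac{\theta_2^2 - \theta_1^2}{2}\right),

and applying the normal moment generating function E_{\theta_1}(e^{tX}) = e^{\theta_1 t + t^2/2} with t = \theta_2 - \theta_1 shows that the exponents cancel, so the expectation reduces to e^0 = 1, in agreement with equation (1.63).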

Covariance and Correlation

Expectations are also used to define relationships among random variables. We will first consider expectations of scalar random variables, and then discuss expectations of vector and matrix random variables.

For two scalar random variables, X and Y, useful measures of a linear relationship between them are the covariance and correlation. The covariance of X and Y, if it exists, is denoted by Cov(X, Y), and is defined as

Cov(X, Y) = E((X - E(X))(Y - E(Y))).    (1.64)

From the Cauchy-Schwarz inequality (B.21) (see page 845), we see that

(Cov(X, Y))^2 \le V(X)V(Y).    (1.65)
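A short numerical illustration, with arbitrarily chosen simulation parameters, of the sample covariance, the correlation it induces, and the bound in inequality (1.65):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Simulate linearly related pairs: Y = 2*X + noise (arbitrary coefficients).
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(scale=0.5, size=n)

# Sample analogues of Cov(X, Y), V(X), and V(Y).
cov_xy = np.cov(x, y)[0, 1]
var_x, var_y = x.var(ddof=1), y.var(ddof=1)

print(cov_xy)                           # approximately 2.0
print(cov_xy**2 <= var_x * var_y)       # True, as inequality (1.65) requires
print(cov_xy / np.sqrt(var_x * var_y))  # correlation, approximately 0.97
```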

Theory of Statistics © 2000–2013 James E. Gentle
