Theory of Statistics - George Mason University

Power of a Statistical Test


We call the probability of rejecting $H_0$ the power of the test, and denote it by $\beta$, or for the particular test $\delta(X)$, by $\beta_\delta$. The power is defined over the full set of distributions in the union of the hypotheses. For hypotheses concerning the parameter $\theta$, as in Example 7.1, the power can be represented as a curve $\beta(\theta)$, as shown in Figure 7.1. We define the power function of the test, for any given $\theta \in \Theta$, as

$$\beta_\delta(\theta) = \mathrm{E}_\theta(\delta(X)). \tag{7.9}$$
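As a concrete illustration of a power function, the following minimal sketch evaluates $\beta_\delta(\theta) = \mathrm{E}_\theta(\delta(X))$ in closed form for one simple test. The setting is an assumption made here for illustration and is not taken from Example 7.1: the sample is $n$ i.i.d. $\mathrm{N}(\theta, \sigma^2)$ observations with $\sigma$ known, and $\delta$ rejects $H_0: \theta \le 0$ when $\sqrt{n}\,\bar{X}/\sigma$ exceeds the upper-$\alpha$ standard normal quantile. The function name and default values are hypothetical.

```python
from statistics import NormalDist


def power_one_sided_z(theta, n=25, sigma=1.0, alpha=0.05):
    """Power of the (assumed) one-sided z-test at the parameter value theta.

    The test rejects H0: theta <= 0 when sqrt(n) * mean(X) / sigma > z_{1-alpha}.
    Because delta(X) is 0 or 1, E_theta(delta(X)) is the rejection probability.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha)
    # Under theta, sqrt(n) * mean(X) / sigma ~ N(sqrt(n) * theta / sigma, 1).
    shift = (n ** 0.5) * theta / sigma
    return 1.0 - std_normal.cdf(z_crit - shift)


# Tabulate the power curve at a few parameter values.
for theta in (-0.5, 0.0, 0.25, 0.5, 1.0):
    print(f"theta = {theta:5.2f}   power = {power_one_sided_z(theta):.4f}")
```

In this sketch the power equals $\alpha$ at the boundary $\theta = 0$ and increases with $\theta$, which is the shape one expects of a power curve for a one-sided test.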

The power in the case that $H_1$ is true is 1 minus the probability of a type II error. Thus, minimizing the error in equation (7.7) is equivalent to maximizing the power within $\Theta_1$.

The probability of a type II error is generally a function of the true distribution of the sample, $P_\theta$, and hence so is the power, which we may emphasize by the notation $\beta_\delta(P_\theta)$ or $\beta_\delta(\theta)$. In much of the following, we will assume that $\theta \in \Theta \subseteq \mathbb{R}^k$; that is, the statistical inference is “parametric”. This setup is primarily one of convenience, because most concepts carry over to more general nonparametric situations. There are some cases, however, when they do not, as for example, when we speak of continuity with respect to $\theta$. We now can focus on the test under either hypothesis (that is, under either subset of the family of distributions) in a unified fashion.

Because the power is generally a function of $\theta$, what does maximizing the power mean? That is, maximize it for what values of $\theta$? Ideally, we would like a procedure that yields the maximum for all values of $\theta$; that is, one that is most powerful for all values of $\theta$. We call such a procedure a uniformly most powerful or UMP test. For a given problem, finding such procedures, or establishing that they do not exist, will be one of our primary objectives.
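To see why a test that is most powerful at every parameter value need not exist, the following sketch compares two level-$\alpha$ tests of $H_0: \theta = 0$ against $H_1: \theta \ne 0$. The normal-mean setting with known $\sigma$, the two tests, and all function names are assumptions introduced here for illustration; they are not from the text. The upper-tailed test has higher power for $\theta > 0$, while the two-sided test has higher power for $\theta < 0$, so neither dominates the other over the whole alternative.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def power_upper(theta, n=25, sigma=1.0, alpha=0.05):
    """Power of the test that rejects when Z > z_{1-alpha}."""
    shift = (n ** 0.5) * theta / sigma  # mean of Z under theta
    z_crit = STD_NORMAL.inv_cdf(1.0 - alpha)
    return 1.0 - STD_NORMAL.cdf(z_crit - shift)


def power_two_sided(theta, n=25, sigma=1.0, alpha=0.05):
    """Power of the test that rejects when |Z| > z_{1-alpha/2}."""
    shift = (n ** 0.5) * theta / sigma
    z_crit = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    upper_tail = 1.0 - STD_NORMAL.cdf(z_crit - shift)
    lower_tail = STD_NORMAL.cdf(-z_crit - shift)
    return upper_tail + lower_tail


# Both tests have size alpha at theta = 0, but their power curves cross there.
for theta in (-0.3, 0.0, 0.3):
    print(f"theta = {theta:5.2f}   upper = {power_upper(theta):.4f}"
          f"   two-sided = {power_two_sided(theta):.4f}")
```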

In some cases, $\beta_\delta(\theta)$ may be a continuous function of $\theta$. Such cases may allow us to apply analytic methods for identifying most powerful tests within a class of tests satisfying certain desirable restrictions. (We do this on page 521.)

Randomized Tests

We defined a randomized test (page 289) as one whose range is not a.s. $\{0, 1\}$. Because, under this definition, a randomized test does not yield a “yes/no” decision about the hypothesis being tested, a test with a random component is more useful.

Given a randomized test $\delta(X)$ that maps $\mathcal{X}$ onto $\{0, 1\} \cup D_R$, we can construct a test with a random component using the rule that if $\delta(X) \in D_R$, then the experiment $R$ is performed with $\delta_R(X)$ chosen so that the overall probability of a type I error is the desired level. The experiment $R$ is independent of the random variable to whose distribution the hypothesis applies. As a practical matter, a $\mathrm{U}(0, 1)$ random variable can be used to define the random experiment. The random variable itself is often simulated on a computer.
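The construction can be sketched for a discrete statistic, where an exact level usually cannot be reached without randomization. The setting below is an assumption chosen for illustration, not one taken from the text: $X \sim \mathrm{binomial}(n, \theta)$, the test rejects for large $X$, and on the boundary value $X = c$ it rejects with probability $\gamma$, with $(c, \gamma)$ chosen so that the type I error probability at $\theta_0$ is exactly $\alpha$. The boundary plays the role of $\delta(X) \in D_R$, and the $\mathrm{U}(0, 1)$ draw plays the role of the experiment $R$. All names are hypothetical.

```python
import random
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ binomial(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def randomized_cutoff(n, theta0, alpha):
    """Find (c, gamma) with P(X > c) + gamma * P(X = c) = alpha under theta0."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    tail = 0.0  # running value of P(X > c) under theta0
    for c in range(n, -1, -1):
        p_c = binom_pmf(c, n, theta0)
        if tail + p_c > alpha:
            return c, (alpha - tail) / p_c
        tail += p_c
    return 0, 1.0  # reached only through rounding error; reject for every x


def delta(x, c, gamma):
    """Test function: 1 (reject), 0 (do not reject), or gamma on the boundary."""
    if x > c:
        return 1.0
    if x == c:
        return gamma
    return 0.0


def decide(x, c, gamma, rng=random):
    """Turn delta(x) into a yes/no decision using an independent U(0,1) draw."""
    d = delta(x, c, gamma)
    if d in (0.0, 1.0):
        return int(d)
    return 1 if rng.random() < d else 0


n, theta0, alpha = 10, 0.5, 0.05
c, gamma = randomized_cutoff(n, theta0, alpha)
size = sum(delta(x, c, gamma) * binom_pmf(x, n, theta0) for x in range(n + 1))
print(f"c = {c}, gamma = {gamma:.4f}, size = {size:.4f}")
```

The quantity `size` is $\mathrm{E}_{\theta_0}(\delta(X))$, so the same test function serves both to compute the level and, through `decide`, to produce the final decision.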

Theory of Statistics ©2000–2013 James E. Gentle
