ST5224: Advanced Statistical Theory II
Chen Zehua
Department of Statistics & Applied Probability
Thursday, February 16, 2012
Lecture 9: Unbiased tests and UMPU

Unbiased tests

When a UMP test does not exist, we may use the same approach used in estimation problems, i.e., imposing a reasonable restriction on the tests. One such restriction is unbiasedness.

A UMP test T of size α has the property that

β_T(P) ≤ α, P ∈ P_0, and β_T(P) ≥ α, P ∈ P_1,   (1)

since T is at least as good as the silly test T ≡ α. This leads to the following definition.
Definition 6.3
Let α be a given level of significance. A test T for H_0: P ∈ P_0 versus H_1: P ∈ P_1 is said to be unbiased of level α if and only if (1) holds. A test of size α is called a uniformly most powerful unbiased (UMPU) test if and only if it is UMP within the class of unbiased tests of level α.

The consideration of UMPU tests can be confined to tests based on a sufficient statistic. Why?
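As a quick numerical illustration of Definition 6.3 (this example is not from the lecture; the normal model, α = 0.05, and the cutoff z are choices made here), consider X ~ N(θ, 1) and H_0: θ = 0 versus H_1: θ ≠ 0. The test rejecting when |X| > z_{0.975} has power function β(θ) = 1 − Φ(z − θ) + Φ(−z − θ), which equals α at θ = 0 and is at least α elsewhere, so the test is unbiased:

```python
from math import erf, sqrt

def Phi(x):
    # standard normal c.d.f.
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

Z = 1.959963984540054  # 0.975 normal quantile, so the size is alpha = 0.05

def power(theta, z=Z):
    # Power of the test rejecting when |X| > z, with X ~ N(theta, 1):
    # beta(theta) = P(X > z) + P(X < -z) = 1 - Phi(z - theta) + Phi(-z - theta)
    return 1.0 - Phi(z - theta) + Phi(-z - theta)

alpha = power(0.0)      # size of the test: beta(0) = alpha
print(round(alpha, 6))  # → 0.05
# unbiasedness: beta(theta) >= alpha away from theta = 0
print(all(power(t) >= alpha for t in [-3.0, -1.0, -0.5, 0.5, 1.0, 3.0]))  # → True
```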
UMPU tests in exponential families

Consider the following hypotheses:

H_0: θ ∈ Θ_0 versus H_1: θ ∈ Θ_1,

where θ = θ(P) is a functional from P onto Θ, and Θ_0 and Θ_1 are disjoint with Θ_0 ∪ Θ_1 = Θ. (P_j = {P : θ ∈ Θ_j}, j = 0, 1.)

Definition 6.4 (Similarity)
Consider the hypotheses H_0: θ ∈ Θ_0 vs H_1: θ ∈ Θ_1. Let α be a given level of significance and let Θ̄_01 be the common boundary of Θ_0 and Θ_1, i.e., the set of points θ that are points or limit points of both Θ_0 and Θ_1. A test T is similar on Θ̄_01 if and only if β_T(P) = α for all θ ∈ Θ̄_01.
Remark
It is more convenient to work with similarity than with unbiasedness when testing H_0: θ ∈ Θ_0 vs H_1: θ ∈ Θ_1.
Continuity of the power function

For a given test T, the power function β_T(P) is said to be continuous in θ if and only if for any {θ_j : j = 0, 1, 2, ...} ⊂ Θ, θ_j → θ_0 implies β_T(P_j) → β_T(P_0), where P_j ∈ P satisfies θ(P_j) = θ_j, j = 0, 1, .... If β_T is a function of θ, then this continuity property is simply the continuity of β_T(θ).

Lemma 6.5
Consider hypotheses H_0: θ ∈ Θ_0 vs H_1: θ ∈ Θ_1. Suppose that, for every T, β_T(P) is continuous in θ. If T* is uniformly most powerful among all similar tests and has size α, then T* is a UMPU test.

Proof
Under the continuity assumption on β_T, the class of similar tests contains the class of unbiased tests: an unbiased test of level α satisfies β_T ≤ α on Θ_0 and β_T ≥ α on Θ_1, so continuity forces β_T = α on the common boundary Θ̄_01. Since T* is uniformly at least as powerful as the similar test T ≡ α, T* is unbiased. Hence, T* is a UMPU test.
Neyman structure

Let U(X) be a sufficient statistic for P ∈ P̄ = {P : θ ∈ Θ̄_01} and let P̄_U be the family of distributions of U as P ranges over P̄. A test T is said to have Neyman structure w.r.t. U if

E[T(X)|U] = α  a.s. P̄_U.

Clearly, if T has Neyman structure, then

E[T(X)] = E{E[T(X)|U]} = α,  P ∈ P̄,

i.e., T is similar on Θ̄_01. If all tests similar on Θ̄_01 have Neyman structure w.r.t. U, then working with tests having Neyman structure is the same as working with tests similar on Θ̄_01.

Lemma 6.6
Let U(X) be a sufficient statistic for P ∈ P̄. A necessary and sufficient condition for all tests similar on Θ̄_01 to have Neyman structure w.r.t. U is that U is boundedly complete for P ∈ P̄.
Proof
(i) Suppose first that U is boundedly complete for P ∈ P̄. Let T(X) be a test similar on Θ̄_01. Then E[T(X) − α] = 0 for all P ∈ P̄. From the boundedness of T(X), E[T(X)|U] is bounded. Since E{E[T(X)|U] − α} = E[T(X) − α] = 0 for all P ∈ P̄ and U is boundedly complete, E[T(X)|U] = α a.s. P̄_U, i.e., T has Neyman structure.

(ii) Suppose now that all tests similar on Θ̄_01 have Neyman structure w.r.t. U, and suppose that U is not boundedly complete for P ∈ P̄. Then there is a function h such that |h(u)| ≤ C, E[h(U)] = 0 for all P ∈ P̄, and h(U) ≠ 0 with positive probability for some P ∈ P̄. Let T(X) = α + c h(U), where c = min{α, 1 − α}/C; this choice of c guarantees 0 ≤ T ≤ 1, so T is a valid test. Then T is similar on Θ̄_01 but does not have Neyman structure w.r.t. U (because h(U) ≠ 0 with positive probability). This contradiction shows that U must be boundedly complete for P ∈ P̄. This proves the result.
Theorem 6.4 (UMPU tests in exponential families)
Suppose that X has the following p.d.f. w.r.t. a σ-finite measure:

f_{θ,ϕ}(x) = exp{θY(x) + ϕ^τ U(x) − ζ(θ, ϕ)},

where θ is a real-valued parameter, ϕ is a vector-valued parameter, and Y (real-valued) and U (vector-valued) are statistics.

(i) For testing H_0: θ ≤ θ_0 versus H_1: θ > θ_0, a UMPU test of size α is

T*(Y, U) = 1 if Y > c(U);  γ(U) if Y = c(U);  0 if Y < c(U),

where c(u) and γ(u) are Borel functions determined by

E_{θ_0}[T*(Y, U)|U = u] = α for every u,

and E_{θ_0} is the expectation w.r.t. f_{θ_0,ϕ}.

(ii) For testing H_0: θ ≤ θ_1 or θ ≥ θ_2 versus H_1: θ_1 < θ < θ_2, a UMPU test of size α is

T*(Y, U) = 1 if c_1(U) < Y < c_2(U);  γ_i(U) if Y = c_i(U), i = 1, 2;  0 if Y < c_1(U) or Y > c_2(U),
where the c_i(u)'s and γ_i(u)'s are Borel functions determined by

E_{θ_1}[T*(Y, U)|U = u] = E_{θ_2}[T*(Y, U)|U = u] = α for every u.

(iii) For testing H_0: θ_1 ≤ θ ≤ θ_2 versus H_1: θ < θ_1 or θ > θ_2, a UMPU test of size α is

T*(Y, U) = 1 if Y < c_1(U) or Y > c_2(U);  γ_i(U) if Y = c_i(U), i = 1, 2;  0 if c_1(U) < Y < c_2(U),

where the c_i(u)'s and γ_i(u)'s are Borel functions determined by

E_{θ_1}[T*(Y, U)|U = u] = E_{θ_2}[T*(Y, U)|U = u] = α for every u.

(iv) For testing H_0: θ = θ_0 versus H_1: θ ≠ θ_0, a UMPU test of size α is given by T*(Y, U) in (iii), where the c_i(u)'s and γ_i(u)'s are Borel functions determined by

E_{θ_0}[T*(Y, U)|U = u] = α for every u

and

E_{θ_0}[T*(Y, U)Y|U = u] = α E_{θ_0}(Y|U = u) for every u.
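For a discrete conditional distribution, the constants c(u) and γ(u) in part (i) can be computed directly from the requirement E_{θ_0}[T*(Y, U)|U = u] = α. A minimal numerical sketch, assuming purely for illustration that Y given U = u is Bi(p_0, n) with placeholder values n = 10, p_0 = 1/2, α = 0.05:

```python
from math import comb

def binom_pmf(y, n, p):
    # pmf of Bi(p, n) at y
    return comb(n, y) * p**y * (1.0 - p)**(n - y)

def umpu_constants(alpha, n, p0):
    """Find (c, gamma) with P(Y > c) + gamma * P(Y = c) = alpha,
    where Y ~ Bi(p0, n); this makes T* = 1{Y > c} + gamma * 1{Y = c}
    have conditional size exactly alpha."""
    tail = 0.0  # running value of P(Y > c)
    for c in range(n, -1, -1):
        pc = binom_pmf(c, n, p0)
        if tail + pc >= alpha:
            gamma = (alpha - tail) / pc  # randomize on the boundary Y = c
            return c, gamma
        tail += pc
    return 0, alpha  # degenerate case: alpha exceeds the mass above 0

c, gamma = umpu_constants(0.05, 10, 0.5)
size = (sum(binom_pmf(y, 10, 0.5) for y in range(c + 1, 11))
        + gamma * binom_pmf(c, 10, 0.5))
print(c, round(size, 12))  # → 8 0.05  (conditional size equals alpha exactly)
```

The randomization γ on the boundary {Y = c} is what lets the test attain size α exactly despite the discreteness of Y.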
Proof
By sufficiency, we only need to consider tests that are functions of (Y, U). It follows from Theorem 2.1(i) that the p.d.f. of (Y, U) (w.r.t. a σ-finite measure) is in a natural exponential family of the form exp{θy + ϕ^τ u − ζ(θ, ϕ)} and, given U = u, the p.d.f. of the conditional distribution of Y (w.r.t. a σ-finite measure ν_u) is in a natural exponential family of the form exp{θy − ζ_u(θ)}.

The hypotheses in (i)-(iv) are of the form H_0: θ ∈ Θ_0 vs H_1: θ ∈ Θ_1 with Θ̄_01 = {(θ, ϕ) : θ = θ_0} or Θ̄_01 = {(θ, ϕ) : θ = θ_i, i = 1, 2}. In case (i) or (iv), U is sufficient and complete for P ∈ P̄ and, hence, Lemma 6.6 applies. In case (ii) or (iii), applying Lemma 6.6 to each {(θ, ϕ) : θ = θ_i} also shows that working with tests having Neyman structure is the same as working with tests similar on Θ̄_01. By Theorem 2.1, the power functions of all tests are continuous and, hence, Lemma 6.5 applies.
Thus, for (i), it suffices to show that T* is UMP among all tests T satisfying

E_{θ_0}[T(Y, U)|U = u] = α for every u,   (2)

and for (ii) or (iii), it suffices to show that T* is UMP among all tests T satisfying

E_{θ_1}[T(Y, U)|U = u] = E_{θ_2}[T(Y, U)|U = u] = α for every u.

For (iv), any unbiased T should satisfy (2) and

∂/∂θ E_{θ,ϕ}[T(Y, U)] = 0,  θ ∈ Θ̄_01.   (3)

One can show (exercise) that (3) is equivalent to

E_{θ,ϕ}[T(Y, U)Y − αY] = 0,  θ ∈ Θ̄_01.   (4)

Using the argument in the proof of Lemma 6.6, one can show (exercise) that (4) is equivalent to

E_{θ_0}[T(Y, U)Y|U = u] = α E_{θ_0}(Y|U = u) for every u.   (5)

Hence, for (iv), it suffices to show that T* is UMP among all tests T satisfying (2) and (5).
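As a hint toward the first exercise (this step is not spelled out in the slides), one may differentiate under the integral sign and use the exponential-family identity ∂ζ/∂θ = E_{θ,ϕ}(Y), which gives

∂/∂θ E_{θ,ϕ}[T(Y, U)] = E_{θ,ϕ}[T(Y, U)Y] − E_{θ,ϕ}[T(Y, U)] E_{θ,ϕ}(Y).

Since an unbiased test of level α is similar on Θ̄_01 (so E_{θ,ϕ}[T(Y, U)] = α there), setting this derivative to zero as in (3) yields E_{θ,ϕ}[T(Y, U)Y] = α E_{θ,ϕ}(Y), which is (4).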
Note that the power function of any test T(Y, U) is

β_T(θ, ϕ) = ∫ [ ∫ T(y, u) dP_{Y|U=u}(y) ] dP_U(u).

Thus, it suffices to show that for every fixed u and θ ∈ Θ_1, T* maximizes

∫ T(y, u) dP_{Y|U=u}(y)

over all T subject to the given side conditions. Since P_{Y|U=u} is in a one-parameter exponential family, the results in (i) and (ii) follow from Corollary 6.1 and Theorem 6.3, respectively. The result in (iii) follows from Theorem 6.3(ii) by considering 1 − T*.

To prove the result in (iv), it suffices to show that if Y has the p.d.f. exp{θy − ζ_u(θ)} and if u is treated as a constant in (2) and (5), then T* in (iii) with fixed u is UMP subject to conditions (2) and (5). We now omit u in the following proof for (iv), which is very similar to the proof of Theorem 6.3.
First, (α, α E_{θ_0}(Y)) is an interior point of the set of points (E_{θ_0}[T(Y)], E_{θ_0}[T(Y)Y]) as T ranges over all tests of the form T(Y). By Lemma 6.2 and Proposition 6.1, for testing θ = θ_0 versus θ = θ_1, the UMP test is equal to 1 when

(k_1 + k_2 y) e^{θ_0 y} < C(θ_0, θ_1) e^{θ_1 y},

where the k_i's and C(θ_0, θ_1) are constants. This inequality is equivalent to

a_1 + a_2 y < e^{by}

for some constants a_1, a_2, and b. This region is either one-sided or the outside of an interval. By Theorem 6.2(ii), a one-sided test has a strictly monotone power function and therefore cannot satisfy (5). Thus, this test must have the form of T* in (iii). Since T* in (iii) does not depend on θ_1, by Lemma 6.1, it is UMP over all tests satisfying (2) and (5), in particular, at least as powerful as the test ≡ α. Thus, T* is UMPU.

Finally, it can be shown that all the c- and γ-functions in (i)-(iv) are Borel functions of u (see Lehmann (1986, p. 149)).
Example 6.11
A problem arising in many different contexts is the comparison of two treatments. If the observations are integer-valued, the problem often reduces to testing the equality of two Poisson distributions (e.g., a comparison of the radioactivity of two substances or the car accident rates in two cities) or two binomial distributions (when the observation is the number of successes in a sequence of trials for each treatment).

Consider first the Poisson problem in which X_1 and X_2 are independently distributed as the Poisson distributions P(λ_1) and P(λ_2), respectively. The p.d.f. of X = (X_1, X_2) is

[e^{−(λ_1+λ_2)}/(x_1! x_2!)] exp{x_2 log(λ_2/λ_1) + (x_1 + x_2) log λ_1}

w.r.t. the counting measure on {(i, j) : i = 0, 1, 2, ..., j = 0, 1, 2, ...}.
Example 6.11 (continued)
Let θ = log(λ_2/λ_1). Then hypotheses such as λ_1 = λ_2 and λ_1 ≥ λ_2 are equivalent to θ = 0 and θ ≤ 0, respectively. The p.d.f. of X is in a multiparameter exponential family with ϕ = log λ_1, Y = X_2, and U = X_1 + X_2. Thus, Theorem 6.4 applies.

To obtain the various tests in Theorem 6.4, it is enough to derive the conditional distribution of Y = X_2 given U = X_1 + X_2 = u. Using the fact that X_1 + X_2 has the Poisson distribution P(λ_1 + λ_2), one can show that

P(Y = y|U = u) = (u choose y) p^y (1 − p)^{u−y} I_{{0,1,...,u}}(y),  u = 0, 1, 2, ...,

where p = λ_2/(λ_1 + λ_2) = e^θ/(1 + e^θ). This is the binomial distribution Bi(p, u). On the boundary set Θ̄_01, θ = θ_j (a known value) and the distribution P_{Y|U=u} is known.
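The binomial form of P(Y = y|U = u) can be checked numerically from the joint Poisson distribution. A small sketch (the rates λ_1 = 2, λ_2 = 3 and the value u = 7 are arbitrary choices for illustration):

```python
from math import comb, exp, factorial

def pois_pmf(k, lam):
    # pmf of the Poisson distribution P(lam) at k
    return exp(-lam) * lam**k / factorial(k)

lam1, lam2, u = 2.0, 3.0, 7
p = lam2 / (lam1 + lam2)  # = e^theta / (1 + e^theta) with theta = log(lam2/lam1)

# conditional pmf P(X2 = y | X1 + X2 = u), using X1 + X2 ~ P(lam1 + lam2)
cond = [pois_pmf(u - y, lam1) * pois_pmf(y, lam2) / pois_pmf(u, lam1 + lam2)
        for y in range(u + 1)]
# binomial pmf Bi(p, u)
binom = [comb(u, y) * p**y * (1.0 - p)**(u - y) for y in range(u + 1)]

print(all(abs(a - b) < 1e-12 for a, b in zip(cond, binom)))  # → True
```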
Example 6.11 (continued)
Consider next the binomial problem in which X_j, j = 1, 2, are independently distributed as the binomial distributions Bi(p_j, n_j), j = 1, 2, respectively, where the n_j's are known but the p_j's are unknown. The p.d.f. of X = (X_1, X_2) is

(n_1 choose x_1)(n_2 choose x_2)(1 − p_1)^{n_1}(1 − p_2)^{n_2} exp{x_2 log[p_2(1 − p_1)/(p_1(1 − p_2))] + (x_1 + x_2) log[p_1/(1 − p_1)]}

w.r.t. the counting measure on {(i, j) : i = 0, 1, ..., n_1, j = 0, 1, ..., n_2}. This p.d.f. is in a multiparameter exponential family with

θ = log[p_2(1 − p_1)/(p_1(1 − p_2))],  Y = X_2, and U = X_1 + X_2.

Thus, Theorem 6.4 applies. Note that hypotheses such as p_1 = p_2 and p_1 ≥ p_2 are equivalent to θ = 0 and θ ≤ 0, respectively.
Example 6.11 (continued)
Using the joint distribution of (X_1, X_2), one can show (exercise) that

P(Y = y|U = u) = K_u(θ) (n_1 choose u − y)(n_2 choose y) e^{θy} I_A(y),  u = 0, 1, ..., n_1 + n_2,

where

A = {y : y = 0, 1, ..., min{u, n_2}, u − y ≤ n_1}

and

K_u(θ) = [ Σ_{y∈A} (n_1 choose u − y)(n_2 choose y) e^{θy} ]^{−1}.

If θ = 0, this distribution reduces to a known distribution: the hypergeometric distribution HG(u, n_2, n_1) (Table 1.1, page 18).
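This conditional law, and its reduction to the hypergeometric distribution at θ = 0 (the setting of Fisher's exact test), can be verified numerically. A sketch with arbitrary illustrative values n_1 = 5, n_2 = 4, u = 6:

```python
from math import comb, exp

def cond_pmf(y, u, n1, n2, theta):
    # P(Y = y | U = u): proportional to C(n1, u - y) * C(n2, y) * e^{theta*y}
    # on the support A = {y : 0 <= y <= min(u, n2), u - y <= n1}
    A = [t for t in range(0, min(u, n2) + 1) if u - t <= n1]
    w = {t: comb(n1, u - t) * comb(n2, t) * exp(theta * t) for t in A}
    K = 1.0 / sum(w.values())  # normalizing constant K_u(theta)
    return K * w.get(y, 0.0)

def hypergeom_pmf(y, u, n1, n2):
    # pmf of HG(u, n2, n1): y drawn from the n2 group, u - y from the n1 group
    return comb(n2, y) * comb(n1, u - y) / comb(n1 + n2, u)

n1, n2, u = 5, 4, 6
match = all(abs(cond_pmf(y, u, n1, n2, 0.0) - hypergeom_pmf(y, u, n1, n2)) < 1e-12
            for y in range(max(0, u - n1), min(u, n2) + 1))
print(match)  # → True
```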