22.02.2013 Views

Supplementary Web Material - Biometrics

Web-based supplementary materials for Calculating sample size for studies with expected all-or-none nonadherence and selection bias by Michelle D. Shardell and Samer S. El-Kamary

May 15, 2008

Web Appendix: Sample size calculations using external or internal pilot data that accommodate noncompliance

Sample size formulas and test statistics

FHF proposed methods for calculating sample size for normal and Poisson F while accounting for variability of existing data. The approach of calculating “calibrated power” for normal F with assumed equal variances by integrating out the variance estimator in FHF Section 3 was motivated by external pilot data when $\rho_c = 1$. Many trials do not have preliminary data, including compliance information, at the outset. Thus, one approach for calculating sample size is to perform an unblinded internal pilot study as a source for deriving assumptions about means, variances, and compliance. Interestingly, the critical value for calibrated power in FHF for external pilot studies equals the exact “inflation factor” in Zucker et al. (1999, p. 3502) needed for internal pilot studies (also calculated by integrating out the variance estimator). To circumvent the numerical integration required for calibrated power and exact inflation factors, methods have been proposed that use approximate inflation factors in sample size calculations when $\rho_c = 1$ under the assumption of equal variances between groups (Wittes et al., 1999; Zucker et al., 1999; Miller, 2005). These authors also proposed adaptations of the t-test when internal pilot data are collected to avoid elevated type I errors. However, no previous work has accommodated noncompliance in this context.

In this Web Appendix, we adapt previously proposed approximate sample size formulas for external and internal pilot data to accommodate noncompliance and unequal variances. In the case of internal pilot studies, we also adapt two test statistics to accommodate noncompliance. The first method adapted is the second-segment Stein (1945) (SS) method described in Zucker et al. (1999). The second method adapted is the approach by Miller (2005) in which bounded bias (BB) of the variance estimate is accommodated in the sample size formula and t-test.

Assuming equal variances in both groups, $\rho_c = 1$, and $r = 1$, the original Stein (1945) approach involved calculating

\[
N = 2\left\{\frac{t_{\alpha/2,\,2(n_p-1)} + t_{\beta,\,2(n_p-1)}}{\delta}\right\}^2 S_p^2 + 1, \tag{1}
\]

where $S_p^2$ is the pooled variance estimate calculated from the pilot data, $t_{q,\nu}$ is the $1-q$ quantile of the t distribution with $\nu$ degrees of freedom, and $n_p$ is the sample size per group in the pilot study. Thus, the approximate inflation factor used in (1) to account for variance uncertainty is $\{(t_{\alpha/2,\,2(n_p-1)} + t_{\beta,\,2(n_p-1)})/(z_{\alpha/2} + z_\beta)\}^2$, whereas the exact inflation factor corresponding to calibrated power in FHF is $\{(z_{\alpha/2} + z_{\beta^*})/(z_{\alpha/2} + z_\beta)\}^2$, where $1 - \beta^*$ is calibrated power (Zucker et al., 1999).
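As a rough numerical illustration of (1) and of the approximate inflation factor, the following minimal Python sketch may help. It is not the authors' code: the function names are hypothetical, and the t quantile is obtained with a simple Simpson-rule and bisection routine written here only to avoid external dependencies (a statistical library's quantile function would normally be used instead). The result of (1) is left unrounded.

```python
import math
from statistics import NormalDist


def t_pdf(x, nu):
    """Density of Student's t distribution with nu degrees of freedom."""
    log_c = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
             - 0.5 * math.log(nu * math.pi))
    return math.exp(log_c - (nu + 1) / 2 * math.log1p(x * x / nu))


def t_upper_quantile(q, nu, steps=1000):
    """Return t_{q,nu}, the 1 - q quantile, for 0 < q < 0.5.

    Stand-in numerical routine (assumption): Simpson's rule for the
    CDF on [0, x], then bisection on x.
    """
    def cdf(x):
        h = x / steps
        total = t_pdf(0.0, nu) + t_pdf(x, nu)
        for i in range(1, steps):
            total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
        return 0.5 + total * h / 3

    lo, hi = 0.0, 50.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if cdf(mid) < 1 - q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def stein_sample_size(s2_pooled, n_pilot, delta, alpha=0.05, beta=0.20):
    """Per-group N from equation (1), using 2(n_p - 1) degrees of freedom."""
    nu = 2 * (n_pilot - 1)
    t_sum = t_upper_quantile(alpha / 2, nu) + t_upper_quantile(beta, nu)
    return 2 * (t_sum / delta) ** 2 * s2_pooled + 1


def approx_inflation_factor(n_pilot, alpha=0.05, beta=0.20):
    """Ratio {(t + t) / (z + z)}^2 described after equation (1)."""
    nu = 2 * (n_pilot - 1)
    z = NormalDist()
    t_sum = t_upper_quantile(alpha / 2, nu) + t_upper_quantile(beta, nu)
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(1 - beta)
    return (t_sum / z_sum) ** 2
```

For example, `stein_sample_size(1.0, 10, 0.5)` evaluates (1) for a pilot study with 10 subjects per group and a pooled variance estimate of 1; the inflation factor shrinks toward one as `n_pilot` grows.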

If the pilot study is internal, then $n_s = N - n_p$ additional observations per group are collected, and the test statistic is then $t_{Stein} = \sqrt{N/2}\,(\bar{Y}_1 - \bar{Y}_0)/S_p$, to be compared with $t_{\alpha/2,\,2(n_p-1)}$. The SS procedure in Zucker et al. (1999) uses (1), but with a different test statistic, and can only be used if some minimum sample size for the second segment (i.e., $n_s$) is imposed. Let $D_z = \bar{Y}_{zs} - \bar{Y}_{zp}$ be the difference in mean observed study outcome in group $z$ between the second-segment data and the pilot data. Further, let $S_s^2$ be the pooled variance estimate of the second-segment data. The test statistic is then $t_{SS} = \sqrt{N/2}\,(\bar{Y}_1 - \bar{Y}_0)/S_{SS}$, to be compared with $t_{\alpha/2,\,2n_s}$, where

\[
S_{SS}^2 = (2n_s)^{-1}\left\{2(n_s-1)S_s^2 + \frac{n_p n_s}{N}\left(D_1^2 + D_0^2\right)\right\}.
\]

Before describing our adaptation of the SS approach, we describe the BB method by Miller (2005). Assuming equal variances in both groups, $\rho_c = 1$, and $r = 1$, the approach involved calculating $v = 2\{(z_{\alpha/2} + z_\beta)/\delta\}^2$ and setting $N = \max\{vS_p^2 + 1,\ n_p + n_{s(\min)}\}$, where $n_{s(\min)} \ge 0$ is an arbitrary minimum for the second segment. Let $S^2$ be the pooled variance estimate from all of the study data. Miller (2005) showed that

\[
-\frac{n_p-1}{(n_p-2)v} \le E(S^2) - \sigma^2 \le 0,
\]

where $\sigma^2$ is the true variance for both groups. The proposed correction for this bias is the variance estimator

\[
S_{BB}^2 = S^2 + \frac{n_p-1}{(n_p-2)v}\,1\{N > n_p + n_{s(\min)}\},
\]

where $1\{\cdot\}$ is the indicator function, with test statistic $t_{BB} = \sqrt{N/2}\,(\bar{Y}_1 - \bar{Y}_0)/S_{BB}$, compared with $t_{\alpha/2,\,2(N-1)}$.
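The BB quantities described above use only normal quantiles, so they can be sketched with the Python standard library. This is an illustrative sketch under the stated assumptions (equal variances, $\rho_c = 1$, $r = 1$); the function names are hypothetical and are not taken from Miller (2005).

```python
from statistics import NormalDist


def miller_v(delta, alpha=0.05, beta=0.20):
    """v = 2{(z_{alpha/2} + z_beta) / delta}^2."""
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(1 - beta)
    return 2 * (z_sum / delta) ** 2


def bb_sample_size(s2_pilot, n_pilot, n_s_min, v):
    """N = max{v * S_p^2 + 1, n_p + n_s(min)} (left unrounded)."""
    return max(v * s2_pilot + 1, n_pilot + n_s_min)


def bb_variance(s2_all, n_total, n_pilot, n_s_min, v):
    """S_BB^2: pooled variance plus the bias bound when N > n_p + n_s(min)."""
    correction = (n_pilot - 1) / ((n_pilot - 2) * v)
    return s2_all + (correction if n_total > n_pilot + n_s_min else 0.0)
```

With $\delta = 0.5$, $\alpha = 0.05$, and 80% power, `miller_v(0.5)` is roughly 62.8, so the correction term is small (about 0.018 for `n_pilot = 10`) relative to a variance of 1.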

We consider new sample-size formulas for internal and external pilot studies that adapt (1) and the approach of Miller (2005). We also propose two adaptations of the t-test that extend $t_{SS}$ and $t_{BB}$ for internal pilot studies. The extensions of the sample size calculation involve 1) using the t inflation factor as in (1), 2) letting treatment-group variances and sample sizes differ (and using the Welch approximation for degrees of freedom), and 3) incorporating compliance-group proportion estimates. We calculate a noncompliance-corrected $v$, $v_{nc} = \{(t_{\alpha/2,\,\nu_p} + t_{\beta,\,\nu_p})/(\delta\hat{\rho}_{cp})\}^2$, where $\nu_p$ is the degrees of freedom calculated with the Welch approximation using the pilot data, and $\hat{\rho}_{cp}$ is the estimated proportion of compliers calculated from the pilot data.

For external pilot studies, the sample size formula for the control group is

\[
N_0 = v_{nc}\left(S_{0p}^2 + S_{1p}^2/r\right) + 1. \tag{2}
\]

For internal pilot studies, we propose a formula that produces restricted designs, i.e., $N_0 > n_{0p}$ (Wittes et al., 1999):

\[
N_0 = \max\left\{v_{nc}\left(S_{0p}^2 + S_{1p}^2/r\right) + 1,\ n_{0p} + n_{0s(\min)}\right\}, \tag{3}
\]

where $S_{zp}^2$ is the sample variance of group $z$ calculated from the pilot data, $n_{zp}$ is the sample size of the pilot data in group $z$, and $n_{0s(\min)}$ is an arbitrary minimum sample size for the control group second segment. We assume that the pre-specified ratio $r$ is maintained for both stages of the study so that $n_{1p}/n_{0p} = n_{1s(\min)}/n_{0s(\min)} = r$.
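The calculations in (2) and (3) can be sketched as follows. This is a hedged illustration only: the function names are hypothetical, the t critical values are passed in by the caller (for example, from a statistics package), and `welch_df` uses the standard Welch-Satterthwaite expression, which is assumed here to be what is meant by the Welch approximation for $\nu_p$.

```python
def welch_df(s2_0, n0, s2_1, n1):
    """Welch-Satterthwaite degrees of freedom from two group variances.

    Assumption: standard formula, applied to pilot variances and sizes.
    """
    a, b = s2_0 / n0, s2_1 / n1
    return (a + b) ** 2 / (a ** 2 / (n0 - 1) + b ** 2 / (n1 - 1))


def v_nc(t_alpha_half, t_beta, delta, rho_c_hat):
    """Noncompliance-corrected v = {(t + t) / (delta * rho_hat)}^2."""
    return ((t_alpha_half + t_beta) / (delta * rho_c_hat)) ** 2


def control_group_size(v, s2_0p, s2_1p, r=1.0, n0p=None, n0s_min=0):
    """Control-group N_0 (left unrounded).

    External pilot (n0p is None): equation (2).
    Internal pilot: equation (3), i.e. the maximum of (2) and
    n0p + n0s_min.
    """
    n0 = v * (s2_0p + s2_1p / r) + 1
    if n0p is None:
        return n0
    return max(n0, n0p + n0s_min)
```

For instance, with equal pilot variances and 10 subjects per group, `welch_df(1.0, 10, 1.0, 10)` returns 18, matching the $2(n_p - 1)$ degrees of freedom used in (1).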

Our adaptation of $t_{SS}$ for internal pilot studies involves deriving group-specific variance estimates. Let

\[
S_{zSS}^2 = (n_{zs})^{-1}\left\{(n_{zs}-1)S_{zs}^2 + \frac{n_{zp}n_{zs}}{N_z}D_z^2\right\}.
\]

The noncompliance SS (NSS) test statistic is then $t_{NSS} = (S_{0SS}^2/N_0 + S_{1SS}^2/N_1)^{-1/2}(\bar{Y}_1 - \bar{Y}_0)$, compared with $t_{\alpha/2,\,\nu_{NSS}}$, where

\[
\nu_{NSS} = \frac{\left(S_{0SS}^2/N_0 + S_{1SS}^2/N_1\right)^2}{\left(S_{0SS}^2/N_0\right)^2/n_{0s} + \left(S_{1SS}^2/N_1\right)^2/n_{1s}}.
\]

Note that $n_{zs}$ are the degrees of freedom for $S_{zSS}^2$ and were hence used in $\nu_{NSS}$ instead of $n_{zs} - 1$.
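To make the NSS statistic concrete, the sketch below computes $S_{zSS}^2$, $t_{NSS}$, and $\nu_{NSS}$ from raw outcome lists. It is an illustration with hypothetical function names, and it assumes that $N_z = n_{zp} + n_{zs}$ and that $\bar{Y}_z$ is the mean of all observations (pilot plus second segment) in group $z$.

```python
import math
from statistics import mean, variance


def nss_group_variance(pilot, second):
    """S_zSS^2 for one group from its pilot and second-segment outcomes."""
    n_p, n_s = len(pilot), len(second)
    n_total = n_p + n_s                      # assumed N_z = n_zp + n_zs
    d = mean(second) - mean(pilot)           # D_z
    return ((n_s - 1) * variance(second) + n_p * n_s / n_total * d ** 2) / n_s


def nss_test(pilot0, second0, pilot1, second1):
    """Return (t_NSS, nu_NSS) for control (0) and treatment (1) groups."""
    n0 = len(pilot0) + len(second0)
    n1 = len(pilot1) + len(second1)
    a = nss_group_variance(pilot0, second0) / n0
    b = nss_group_variance(pilot1, second1) / n1
    # Assumed: group means use all observations in each group.
    diff = mean(list(pilot1) + list(second1)) - mean(list(pilot0) + list(second0))
    t_stat = diff / math.sqrt(a + b)
    # n_zs (not n_zs - 1) are the degrees of freedom of S_zSS^2.
    nu = (a + b) ** 2 / (a ** 2 / len(second0) + b ** 2 / len(second1))
    return t_stat, nu
```

When both groups have identical data, the statistic is zero and $\nu_{NSS}$ reduces to $2n_s$, the degrees of freedom used for $t_{SS}$ in the equal-variance case.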

Our adaptation of $t_{BB}$ involves accounting for the potential bias in two variance estimates for the special case of $r = 1$, thus $n_{1p} = n_{0p} = n_p$, $n_{1s} = n_{0s} = n_s$, $n_{1s(\min)} = n_{0s(\min)} = n_{s(\min)}$, and $N_1 = N_0 = N$. We calculate the variance

\[
S_{0BB}^2 + S_{1BB}^2 = S_0^2 + S_1^2 + 2\,\frac{n_p-1}{(\nu_p-2)v_{nc}}\,1\{N > n_p + n_{s(\min)}\}, \tag{4}
\]

where the term $-2\,\frac{n_p-1}{(\nu_p-2)v_{nc}}$ is an approximate bound for the bias of $S_0^2 + S_1^2$ based on the proof of Theorem 1 in Miller (2005) using the Welch approximation of $S_{0p}^2 + S_{1p}^2$ to a chi-square random variable. The derivation of (4) is found in the subsection below. The noncompliance BB (NBB) test statistic is then $t_{NBB} = (S_{0BB}^2/N + S_{1BB}^2/N)^{-1/2}(\bar{Y}_1 - \bar{Y}_0)$, compared with $t_{\alpha/2,\,\nu_{NBB}}$, where

\[
\nu_{NBB} = \frac{\left(S_{0BB}^2/N + S_{1BB}^2/N\right)^2}{\left(S_{0BB}^2/N\right)^2/(N-1) + \left(S_{1BB}^2/N\right)^2/(N-1)}.
\]

The proposed sample size formulas involve adapting the approximate inflation factor in (1) to handle unequal variances that may be due to noncompliance with selection bias. However, use of the Welch approximation in sample size calculations and both tests in the presence of noncompliance is not entirely accurate. When $\rho_c = 1$, the Welch degrees of freedom are used to approximate the distribution of the variance of mean differences, a sum of weighted chi-square random variables, by that of a chi-square random variable. However, in the presence of noncompliance, the variance of mean differences is a sum of weighted central and noncentral chi-square random variables, where the noncentrality parameters depend on true adherence-subgroup means and compliance group distribution. The source of this inaccuracy is a special case of averaging over a factor causing heterogeneity in the treatment-specific outcome distribution. Even in cases with $\rho_c = 1$, other factors not considered in the sample size calculation (e.g., sex and age) that may also cause heterogeneity in the outcome distribution are averaged over. A closed-form approximate method that is reasonably accurate in accounting for heterogeneity from noncompliance is of practical utility; thus, it is of interest to evaluate the performance of the proposed procedures for different levels of noncompliance and selection bias.


Derivation of Bounded Bias Expression (4)

This subsection can be skipped without loss of continuity. From the work of Wittes et al. (1999), the bias of $S_z^2$ for internal pilot studies with $r = 1$ can be expressed as

\[
E(S_z^2 - \sigma_z^2) = E\left\{\frac{(n_p-1)(S_{zp}^2 - \sigma_z^2)}{n_p+n_s-1}\right\} = (n_p-1)\operatorname{cov}\left(S_{zp}^2,\ \frac{1}{n_p+n_s-1}\right),
\]

because $n_s$ is a random variable. Thus,

\[
\begin{aligned}
E\{S_0^2 + S_1^2 - (\sigma_0^2 + \sigma_1^2)\}
&= (n_p-1)\operatorname{cov}\left(S_{0p}^2 + S_{1p}^2,\ \frac{1}{n_p+n_s-1}\right) \\
&= (n_p-1)\operatorname{cov}\left[S_{0p}^2 + S_{1p}^2,\ \frac{1}{\max\left\{\dfrac{v(S_{0p}^2 + S_{1p}^2)}{2\rho_c^2},\ n_p + n_{s(\min)} - 1\right\}}\right] \\
&> (n_p-1)\operatorname{cov}\left\{S_{0p}^2 + S_{1p}^2,\ \frac{1}{v(S_{0p}^2 + S_{1p}^2)/(2\rho_c^2)}\right\} \\
&= \frac{n_p-1}{v/(2\rho_c^2)}\left\{1 - E\left(\frac{\sigma_0^2 + \sigma_1^2}{S_{0p}^2 + S_{1p}^2}\right)\right\} \\
&\approx \frac{n_p-1}{v/(2\rho_c^2)}\cdot\left(1 - \frac{\nu_p}{\nu_p-2}\right),
\end{aligned}
\]

where we approximate $E\{(\sigma_0^2 + \sigma_1^2)/(S_{0p}^2 + S_{1p}^2)\}$ by $E(\nu_p/X)$, where $X$ is a chi-square random variable with $\nu_p$ degrees of freedom. Therefore, $E(X^{-1}) = 1/(\nu_p - 2)$. The inequality follows from Lemma A.1 in Miller (2005). Estimating $v/(2\rho_c^2)$ with $v_{nc}$ completes the derivation.
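As a check on the final step, substituting $v_{nc}$ for $v/(2\rho_c^2)$ in the last approximation and simplifying the bracket recovers the bound quoted after (4):

```latex
\[
\frac{n_p-1}{v_{nc}}\left(1 - \frac{\nu_p}{\nu_p-2}\right)
= \frac{n_p-1}{v_{nc}}\cdot\frac{-2}{\nu_p-2}
= -2\,\frac{n_p-1}{(\nu_p-2)\,v_{nc}}.
\]
```

The correction added in (4) is the negative of this quantity, which is why it enters with a plus sign.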



Simulation study

We assess the performance of our proposed sample size calculations and tests via extensive simulation studies under a variety of specifications for $\rho_c$, selection bias, and variance inequality. In all specifications, the sample size is calculated to detect $\delta = 0.5$ with 80% power, where $\mu_{c1} = 0.5$, $\mu_{c0} = 0$, and $\sigma_{c0}^2 = \sigma_n^2 = 1$, using two-sided tests with $\alpha = 0.05$. Pilot studies were simulated with $r = 1$ and $n_{0p} = n_{1p} = 10, 20, 30, 50$. For internal pilot studies, the sample size was calculated using (3), where $n_{0s(\min)} = 10$. For external pilot studies, (2) was used to calculate sample size. For each specification, 100,000 ‘studies’ were simulated to estimate empirical power and type I error of the naive t-test. For internal pilot studies, $t_{NSS}$ and $t_{NBB}$ were also evaluated.
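The Monte Carlo standard errors quoted in the Web Table captions can be approximated with the usual binomial formula $\sqrt{p(1-p)/n}$. The short sketch below assumes this is how they were obtained, which the source does not state; the function name is hypothetical.

```python
import math


def mc_standard_error(p, n_sims):
    """Binomial standard error of an estimated proportion p from n_sims runs."""
    return math.sqrt(p * (1 - p) / n_sims)


# Standard errors in percentage points for 100,000 simulated studies.
se_type1 = 100 * mc_standard_error(0.05, 100_000)   # nominal type I error 5%
se_power = 100 * mc_standard_error(0.80, 100_000)   # nominal power 80%
```

Under this assumption the values are about 0.069 percentage points for type I error and 0.13 for power, in line with the figures of roughly 0.06% and 0.13% given in the captions.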

The empirical type I error for the approaches with internal pilot data is found in Web Table 1. We see that both $t_{NSS}$ and $t_{NBB}$ are more conservative and control type I error better than the naive t-test. However, as in Wittes et al. (1999) where $\rho_c = 1$, when $n_{zp}$ and $n_{zs}$ are large, empirical type I error is approximately unbiased for the naive t-test. The largest biases occurred when $\sigma_{c1}^2 \ne \sigma_a^2$. The $t_{NSS}$ and $t_{NBB}$ statistics perform similarly to each other. The empirical power (when $\mu_{c1} = 0.5$ and $\mu_{c0} = 0$) for the approaches with internal pilot data is found in Web Table 2. As in Wittes et al. (1999), when $n_{zp}$ is a small percentage of the required sample size, the study is under-powered. In particular, empirical power is lowest in the specification with $\rho_c = 0.20$, but improves with increasing $n_{zp}$. As in Zucker et al. (1999) where $\rho_c = 1$, $t_{NSS}$ performs well for attaining adequate power when $n_{zp}$ is sufficient, at least 10% of the required sample size for the specifications explored in this study, regardless of selection bias. Again, the performances of $t_{NSS}$ and $t_{NBB}$ are similar.

The empirical type I error and power of the naive t-test for external pilot studies are found in Web Table 3. We see that the approach performs well and achieves close to nominal power when $n_{zp} = 50$ for all specifications studied except when $\rho_c < 0.5$ (i.e., the specifications requiring the largest sample size). We also see that, as with internal pilot studies, the median required sample size per group decreases with increasing $n_{zp}$. This result reflects that, when $n_{zp}$ is larger, inflation factors are closer to one owing to larger degrees of freedom in (2) and more precise estimates of variance and $\rho_c$ from pilot data.

Conclusion

Although more accurate approximations for the distribution of a sum of weighted noncentral chi-square variables have been proposed (e.g., Castaño-Martínez and López-Blázquez, 2005), their mathematical complexity limits their use and practicality for calculating sample size. Further, the simulation studies show that using the Welch approximation for degrees of freedom in the inflation factor and resulting t-tests performs reasonably well for controlling type I error and attaining desired power in the presence of noncompliance. The main factor that impacts the performance of the methods is the ratio of $n_{zp}$ to the total required sample size. If $\rho_c$ is expected to be small, then $n_{zp}$ should be chosen to have a sufficient number in each adherence subgroup to provide a reasonable estimate of subgroup means and variances for precisely estimating treatment-group variances, particularly when large levels of selection bias are expected.

Zucker et al. (1999) discussed concerns about jeopardizing blinding for internal pilot studies that are relevant here. In particular, unblinding may risk interim testing in addition to sample size calculation, which further impacts type I error. The authors noted that interim sample size calculation without testing can be allowed by disclosing treatment-group variance estimates, but not interim mean differences. To adapt this recommendation to allow noncompliance, the interim compliance proportion estimate also needs to be revealed.

References

Castaño-Martínez, A. and López-Blázquez, F. (2005). Distribution of a sum of weighted noncentral chi-square variables. Test 14, 397-415.

Miller, F. (2005). Variance estimation in clinical studies with interim sample size reestimation. Biometrics 61, 355-361.

Stein, C. (1945). A two-sample test for a linear hypothesis whose power is independent of the variance. Annals of Mathematical Statistics 16, 243-258.

Wittes, J., Schabenberger, O., Zucker, D., Brittain, E., and Proschan, M. (1999). Internal pilot studies I: Type I error rate of the naive t-test. Statistics in Medicine 18, 3481-3491.

Zucker, D.M., Wittes, J.T., Schabenberger, O., and Brittain, E. (1999). Internal pilot studies II: Comparison of various procedures. Statistics in Medicine 18, 3493-3509.


Table 1: Comparison of empirical type I error to nominal type I error (5%) with sample sizes calculated using Equation 3 with internal pilot data to detect $\delta = 0.5$ with $\mu_{c1} = 0.5$ and $\mu_{c0} = 0$, where $\sigma_{c0}^2 = \sigma_n^2 = 1$ and $\rho_n = \rho_a = (1 - \rho_c)/2$. Two groups compared using the naive t-test, $t_{NSS}$, and $t_{NBB}$. N = median sample size per group. Monte Carlo standard error = 0.06% from 100,000 iterations.

Columns: $\rho_c$, $\mu_a$, $\mu_n$, $\sigma_{c1}^2$, $\sigma_a^2$, followed by N, t, $t_{NSS}$, $t_{NBB}$ for each of $n_{zp} = 10$, $20$, $30$, $50$ (in that order).

0.20 0.00 0.00 1.00 1.00 1671 5.06 5.05 5.05 1618 5.13 5.13 5.12 1586 4.95 4.93 4.94 1584 5.02 5.00 5.01<br />

0.30 0.00 0.00 1.00 1.00 743 5.04 5.01 5.02 718 4.99 4.96 4.97 719 5.06 5.03 5.04 708 5.02 4.99 5.00<br />

0.50 0.00 0.00 1.00 1.00 271 5.07 5.01 5.02 261 5.12 5.08 5.06 257 5.09 5.05 5.04 256 5.03 4.94 4.98<br />

0.75 0.00 0.00 1.00 1.00 122 5.13 4.98 4.99 117 5.18 5.07 5.09 116 5.18 5.09 5.08 120 5.08 5.04 4.99<br />

0.50 -0.25 0.00 1.00 1.00 274 5.00 4.97 4.97 264 5.06 4.98 5.00 260 5.10 5.06 5.06 258 5.05 5.00 5.00<br />

0.50 -0.25 0.25 1.00 1.00 278 5.13 5.06 5.08 268 5.15 5.09 5.10 266 5.03 4.98 4.99 263 5.05 5.00 4.99<br />

0.50 0.25 0.25 1.00 1.00 275 5.00 4.93 4.96 264 5.09 5.04 5.04 262 5.05 5.00 5.00 260 5.05 5.02 5.00<br />

0.50 0.25 -0.25 1.00 1.00 280 4.99 4.96 4.94 268 5.05 5.02 5.01 266 5.10 5.04 5.04 263 4.98 4.94 4.94<br />

0.50 0.50 -0.25 1.00 1.00 289 5.07 5.01 5.02 279 5.17 5.11 5.11 277 5.00 4.96 4.97 275 5.02 4.97 4.99<br />

0.50 0.25 -0.50 1.00 1.00 290 5.02 4.97 4.97 279 5.03 4.98 4.97 277 4.92 4.88 4.88 274 4.85 4.83 4.82<br />

0.50 0.50 -0.50 1.00 1.00 304 5.09 5.03 5.04 293 5.07 5.02 5.03 290 4.98 4.93 4.93 288 5.07 5.03 5.03<br />

0.50 0.50 0.50 1.00 1.00 287 5.03 4.95 4.98 276 4.98 4.92 4.93 275 5.08 5.04 5.03 272 5.03 4.98 4.98<br />

0.50 0.00 0.00 1.50 1.00 304 5.02 4.97 4.97 293 5.11 5.07 5.06 290 4.86 4.82 4.82 287 4.97 4.92 4.93<br />

0.50 0.00 0.00 1.50 1.50 337 5.04 5.00 5.00 324 4.98 4.93 4.95 321 4.90 4.86 4.85 327 5.12 5.08 5.09<br />

0.50 0.00 0.00 0.75 1.00 255 5.01 4.95 4.96 244 5.07 5.01 5.00 241 5.11 5.05 5.07 247 5.06 4.99 5.01<br />

0.50 0.00 0.00 0.75 0.75 240 5.12 5.06 5.07 231 5.05 4.98 4.99 229 5.10 5.03 5.03 239 5.07 5.04 5.02<br />

0.50 0.00 0.00 2.00 1.00 338 5.01 4.94 4.95 324 4.97 4.92 4.94 321 5.10 5.02 5.04 319 5.20 5.16 5.16<br />

0.50 0.00 0.00 2.00 2.00 404 5.07 5.04 5.03 390 4.99 4.94 4.95 385 4.97 4.91 4.93 384 4.99 4.95 4.95<br />

0.50 0.00 0.00 0.50 1.00 237 5.02 4.96 4.96 227 5.17 5.11 5.10 225 5.08 4.98 5.02 224 5.07 5.01 5.02<br />

0.50 0.00 0.00 0.50 0.50 204 5.02 4.93 4.94 195 5.08 5.03 5.00 194 5.04 4.96 4.97 192 5.07 5.02 5.00<br />

0.50 0.25 -0.25 0.75 0.75 245 5.01 4.94 4.95 235 5.12 5.06 5.06 234 5.00 4.97 4.94 240 5.02 4.99 4.97<br />

0.50 0.25 -0.50 1.50 1.50 357 5.07 5.01 5.02 344 5.04 4.98 5.00 339 4.86 4.85 4.83 350 4.89 4.86 4.85<br />

0.50 0.50 -0.50 2.00 2.00 438 4.92 4.87 4.88 420 4.93 4.89 4.89 418 5.01 4.99 4.98 423 4.92 4.89 4.89<br />

0.50 0.50 -0.50 0.50 0.50 237 5.07 5.00 5.00 227 5.10 4.99 5.03 226 5.06 5.00 5.00 231 5.02 4.94 4.96<br />

0.50 0.50 -0.50 2.00 0.50 339 5.13 5.08 5.09 326 5.08 5.03 5.04 322 5.02 4.99 4.98 327 5.23 5.21 5.20


Table 2: Comparison of empirical power to nominal power (80%) from studies with sample sizes calculated using Equation 3 with internal pilot data to detect $\delta = 0.5$ with $\mu_{c1} = 0.5$ and $\mu_{c0} = 0$, where $\sigma_{c0}^2 = \sigma_n^2 = 1$ and $\rho_n = \rho_a = (1 - \rho_c)/2$. Two groups compared using the naive t-test, $t_{NSS}$, and $t_{NBB}$. N = median sample size per group. Monte Carlo standard error = 0.13% from 100,000 iterations.

Columns: $\rho_c$, $\mu_a$, $\mu_n$, $\sigma_{c1}^2$, $\sigma_a^2$, followed by N, t, $t_{NSS}$, $t_{NBB}$ for each of $n_{zp} = 10$, $20$, $30$, $50$ (in that order).

0.20 0.00 0.00 1.00 1.00 1685 69.35 69.35 69.32 1643 72.30 72.28 72.27 1628 73.98 73.96 73.95 1627 75.60 75.56 75.57<br />

0.30 0.00 0.00 1.00 1.00 761 73.65 73.57 73.66 739 75.83 75.75 75.78 732 76.81 76.74 76.76 726 78.03 77.96 77.97<br />

0.50 0.00 0.00 1.00 1.00 278 78.02 77.82 77.89 269 79.10 78.84 78.97 266 79.73 79.49 79.60 264 80.24 79.99 80.14<br />

0.75 0.00 0.00 1.00 1.00 125 81.02 80.60 80.67 120 81.06 80.46 80.84 118 81.32 80.75 81.08 120 81.11 80.26 80.93<br />

0.50 -0.25 0.00 1.00 1.00 285 77.77 77.58 77.64 275 78.52 78.33 78.42 273 79.39 79.16 79.28 271 79.72 79.46 79.62<br />

0.50 -0.25 0.25 1.00 1.00 287 77.57 77.39 77.42 277 78.72 78.48 78.58 274 79.14 78.94 79.05 272 79.62 79.35 79.52<br />

0.50 0.25 0.25 1.00 1.00 275 77.85 77.63 77.70 264 79.09 78.85 78.95 262 79.55 79.31 79.40 259 80.21 79.95 80.08<br />

0.50 0.25 -0.25 1.00 1.00 287 78.65 78.47 78.53 276 79.73 79.50 79.61 274 80.32 80.10 80.21 271 80.98 80.72 80.88<br />

0.50 0.50 -0.25 1.00 1.00 295 78.57 78.35 78.45 284 79.83 79.62 79.73 281 80.78 80.54 80.67 278 81.09 80.82 81.00<br />

0.50 0.25 -0.50 1.00 1.00 306 78.64 78.44 78.51 291 79.89 79.66 79.79 288 80.83 80.53 80.71 286 81.35 81.13 81.25<br />

0.50 0.50 -0.50 1.00 1.00 314 78.55 78.34 78.43 300 80.04 79.78 79.91 297 80.85 80.62 80.76 295 81.61 81.35 81.49<br />

0.50 0.50 0.50 1.00 1.00 277 78.19 77.99 78.04 268 79.16 78.93 79.04 265 79.90 79.64 79.80 263 79.89 79.62 79.79<br />

0.50 0.00 0.00 1.50 1.00 315 77.76 77.60 77.64 301 79.08 78.87 78.96 297 79.59 79.39 79.48 296 80.05 79.87 79.95<br />

0.50 0.00 0.00 1.50 1.50 346 77.71 77.54 77.60 332 78.86 78.70 78.77 329 79.44 79.26 79.32 326 80.01 79.84 79.94<br />

0.50 0.00 0.00 0.75 1.00 263 78.03 77.80 77.89 252 79.25 78.99 79.12 249 79.68 79.43 79.56 247 80.58 80.27 80.46<br />

0.50 0.00 0.00 0.75 0.75 253 78.70 78.49 78.56 243 79.63 79.36 79.49 240 80.06 79.77 79.92 239 80.78 80.46 80.67<br />

0.50 0.00 0.00 2.00 1.00 346 77.89 77.74 77.78 334 78.84 78.68 78.75 330 79.48 79.32 79.39 328 79.91 79.76 79.82<br />

0.50 0.00 0.00 2.00 2.00 413 77.67 77.53 77.59 399 78.68 78.55 78.60 393 79.34 79.21 79.28 389 79.90 79.71 79.83<br />

0.50 0.00 0.00 0.50 1.00 246 78.26 77.97 78.08 236 79.38 79.06 79.24 233 79.88 79.56 79.76 231 80.29 79.94 80.17<br />

0.50 0.00 0.00 0.50 0.50 212 78.13 77.81 77.93 204 79.30 78.94 79.12 202 80.08 79.73 79.94 200 80.77 80.42 80.61<br />

0.50 0.25 -0.25 0.75 0.75 252 78.32 78.08 78.19 243 79.93 79.65 79.80 241 80.50 80.21 80.38 240 81.35 81.08 81.22<br />

0.50 0.25 -0.50 1.50 1.50 371 78.18 78.01 78.07 358 79.63 79.47 79.54 353 80.27 80.07 80.17 349 80.71 80.52 80.63<br />

0.50 0.50 -0.50 2.00 2.00 447 78.42 78.27 78.32 430 79.53 79.39 79.46 425 80.18 80.03 80.12 422 80.74 80.55 80.66<br />

0.50 0.50 -0.50 0.50 0.50 246 79.08 78.76 78.91 235 80.70 80.34 80.53 234 81.50 81.12 81.37 232 82.39 82.03 82.26<br />

0.50 0.50 -0.50 2.00 0.50 347 78.89 78.71 78.77 333 79.96 79.77 79.86 330 80.39 80.20 80.29 326 81.42 81.21 81.34


Table 3: Comparison of empirical type I error ($\alpha_e$) and power ($1 - \beta_e$) to nominal type I error (5%) and power (80%), respectively, from studies with sample sizes calculated using Equation 2 with external pilot data to detect $\delta = 0.5$ with $\mu_{c1} = 0.5$ and $\mu_{c0} = 0$, where $\sigma_{c0}^2 = \sigma_n^2 = 1$ and $\rho_n = \rho_a = (1 - \rho_c)/2$. Two groups compared using the naive t-test. $N_{H_0}$ = median sample size when the null hypothesis is true; $N_{H_1}$ = median sample size when the alternative hypothesis is true. Monte Carlo standard error = 0.06% for $\alpha_e$ and 0.13% for $1 - \beta_e$ from 100,000 iterations.

Columns: $\rho_c$, $\mu_a$, $\mu_n$, $\sigma_{c1}^2$, $\sigma_a^2$, followed by $N_{H_0}$, $\alpha_e$, $N_{H_1}$, $1 - \beta_e$ for each of $n_{zp} = 10$, $20$, $30$, $50$ (in that order).

0.20 0.00 0.00 1.00 1.00 1676 5.10 1714 69.67 1651 4.94 1620 72.10 1633 5.04 1601 73.38 1582 5.01 1616 75.10<br />

0.30 0.00 0.00 1.00 1.00 740 4.99 760 73.53 738 4.96 720 75.49 731 4.86 712 76.41 702 4.93 721 77.78<br />

0.50 0.00 0.00 1.00 1.00 270 4.83 270 77.49 268 5.03 260 78.56 265 5.09 265 78.96 254 4.86 262 79.72<br />

0.75 0.00 0.00 1.00 1.00 121 4.96 121 79.90 118 5.04 116 80.40 117 4.98 117 80.31 114 5.01 116 80.66<br />

0.50 -0.25 0.00 1.00 1.00 275 4.91 275 77.81 274 5.12 262 78.59 272 5.06 272 79.04 257 4.97 269 79.75<br />

0.50 -0.25 0.25 1.00 1.00 278 5.05 278 77.68 275 4.98 267 78.59 272 4.89 272 78.78 263 5.01 271 79.38<br />

0.50 0.25 0.25 1.00 1.00 274 5.01 274 77.64 264 4.95 263 78.74 260 5.08 260 79.02 259 4.88 258 79.55<br />

0.50 0.25 -0.25 1.00 1.00 279 5.10 279 77.42 276 4.98 268 78.43 271 4.91 271 79.06 263 5.20 271 79.49<br />

0.50 0.50 -0.25 1.00 1.00 289 5.00 289 77.45 282 4.99 278 78.54 279 4.92 279 78.93 274 4.96 278 79.52<br />

0.50 0.25 -0.50 1.00 1.00 289 4.91 289 77.38 290 5.00 278 78.35 288 4.93 288 78.80 273 4.93 285 79.57<br />

0.50 0.50 -0.50 1.00 1.00 304 4.95 304 77.27 301 4.98 293 78.43 296 5.10 296 78.91 287 5.04 295 79.38<br />

0.50 0.50 0.50 1.00 1.00 286 4.97 286 77.45 267 4.93 275 78.54 264 5.07 264 78.97 270 4.99 262 79.50<br />

0.50 0.00 0.00 1.50 1.00 303 4.99 311 77.70 299 5.11 291 78.74 297 4.97 288 79.10 287 5.06 294 79.49<br />

0.50 0.00 0.00 1.50 1.50 336 4.93 345 77.49 323 4.93 323 78.49 328 5.03 320 78.98 325 5.01 326 79.41<br />

0.50 0.00 0.00 0.75 1.00 252 5.03 260 77.42 243 4.89 243 78.33 248 4.95 240 79.16 246 4.86 247 79.50<br />

0.50 0.00 0.00 0.75 0.75 240 4.85 240 77.79 230 4.92 230 78.48 238 4.92 226 78.82 238 5.00 238 79.47<br />

0.50 0.00 0.00 2.00 1.00 338 4.90 347 77.85 332 5.06 324 78.67 329 4.99 321 78.93 318 5.03 326 79.55<br />

0.50 0.00 0.00 2.00 2.00 402 5.02 410 77.31 397 4.94 389 78.60 394 5.00 386 79.18 381 4.96 390 79.48<br />

0.50 0.00 0.00 0.50 1.00 236 5.04 244 77.04 235 4.99 227 78.16 232 5.09 224 78.83 223 4.93 231 79.40<br />

0.50 0.00 0.00 0.50 0.50 202 5.01 211 77.52 203 4.99 194 78.44 201 5.01 193 79.01 191 5.03 199 79.91<br />

0.50 0.25 -0.25 0.75 0.75 244 4.90 252 77.20 234 5.00 234 78.68 240 5.02 232 79.06 239 5.16 239 79.40<br />

0.50 0.25 -0.50 1.50 1.50 357 5.16 370 77.17 344 4.97 344 78.29 351 4.91 339 78.96 350 5.05 349 79.30<br />

0.50 0.50 -0.50 2.00 2.00 438 4.93 447 77.17 418 4.93 418 78.29 425 4.96 416 79.17 421 4.99 422 79.53<br />

0.50 0.50 -0.50 0.50 0.50 237 4.87 245 76.75 226 4.95 226 78.17 233 5.12 225 78.77 231 5.06 231 79.44<br />

0.50 0.50 -0.50 2.00 0.50 340 4.99 348 77.40 324 5.07 324 78.53 329 5.02 321 79.09 327 5.06 327 79.52
