Confounding, interaction, and mediation in multivariable/multivariate ...
Mediation<br />
Confounding<br />
Interaction<br />
Confounding, interaction, and mediation in<br />
multivariable/multivariate regression modeling<br />
William Wu<br />
Department of Biostatistics<br />
Cancer Biostatistics Center, Vanderbilt-Ingram Cancer Center<br />
May 21, 2010<br />
William Wu<br />
Cancer Biostatistics
Outline<br />
1 Mediation<br />
3 examples<br />
Definition and identification<br />
An application<br />
2 Confounding<br />
Definition and determination<br />
Methods for confounding effect<br />
Unobserved confounding<br />
Difference from mediation<br />
3 Interaction<br />
Definition<br />
Determination of interaction<br />
Difference from mediation and confounding<br />
MEDIATION<br />
Pingsheng’s study<br />
Study question:<br />
Whether and how childhood asthma is associated with maternal smoking and<br />
bronchiolitis during infancy.<br />
Preliminary modeling finding:<br />
A significant association between maternal smoking and asthma was found, but the<br />
association disappeared after adjusting for bronchiolitis in multivariable modeling.<br />
“We hypothesize that one mechanism through which maternal smoking during<br />
pregnancy contributes to the known increased risk of developing childhood asthma is<br />
through increasing the risk of an important intermediate event, bronchiolitis during<br />
infancy.”<br />
Dan Weeks’ talk<br />
In the paper:<br />
• ‘Interpretation of Genetic Association Studies: Markers with Replicated Highly<br />
Significant Odds Ratios May Be Poor Classifiers.’<br />
• ‘Although a set of SNPs can be strongly associated with disease risk with<br />
extremely small P-values, the same set of SNPs may not necessarily have high<br />
discrimination ability or may not dramatically improve the discrimination ability<br />
of a classification model constructed using “conventional” non-genetic risk factors<br />
without the SNPs.’<br />
PLoS Genetics 2009;5(2)<br />
Adriana Gonzales’ study<br />
Study question:<br />
How the biomarkers EGFR, AKT, and Ki-67 are correlated among 69 osteosarcoma<br />
patients.<br />
H Wu et al. Biomarker Insights 2007;2:469-76<br />
EGFR pathway<br />
[Figure: EGFR signaling pathway diagram. Labels: PTEN, PI3-K, AKT, pY, STAT, RAS, RAF, SOS, GRB2, MEK, MAPK, gene transcription, cell cycle progression, Jun, Fos, Myc, cyclin D1, DNA. Downstream outcomes: proliferation/maturation, survival (anti-apoptosis), angiogenesis, metastasis.]<br />
Signaling events are ordered both spatially and temporally<br />
Possible approaches to Adriana’s data<br />
• Correlation analysis of the 3 biomarkers<br />
• Regression of Ki-67 on the other 2 biomarkers<br />
What is mediation<br />
A mediation effect occurs when a third variable (the mediator, M) carries the influence<br />
of a given independent variable (X) to a given dependent variable (Y).<br />
Mediation models explain how an effect occurred by hypothesizing a causal sequence.<br />
[Diagram: X → Mediator (path a); Mediator → Y (path b); X → Y (path c)]<br />
Approaches to identification<br />
• Approach 1: Causal steps<br />
• Approach 2: Statistical test<br />
Approach 1: causal steps<br />
Model 1<br />
[Diagram: X → Y with path coefficient $\tau$ and error term $\varepsilon_{(1)}$]<br />
$Y = \beta_{0(1)} + \tau X + \varepsilon_{(1)}$ (1)<br />
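Causal steps<br />
Model 2 and Model 3<br />
[Diagram: X → M (path $\alpha$, error $\varepsilon_{(3)}$); M → Y (path $\beta$); X → Y (path $\tau'$, error $\varepsilon_{(2)}$)]<br />
$Y = \beta_{0(2)} + \tau' X + \beta M + \varepsilon_{(2)}$ (2)<br />
$M = \beta_{0(3)} + \alpha X + \varepsilon_{(3)}$ (3)<br />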
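A short derivation may help connect Models 1 to 3. It is a sketch that assumes all three linear models hold for the same data, with $\tau$ the coefficient of X in Model 1, $\tau'$ and $\beta$ the coefficients of X and M in Model 2, and $\alpha$ the coefficient of X in Model 3. Substituting Model 3 into Model 2 gives:

```latex
\begin{align*}
Y &= \beta_{0(2)} + \tau' X + \beta M + \varepsilon_{(2)} \\
  &= \beta_{0(2)} + \tau' X + \beta\bigl(\beta_{0(3)} + \alpha X + \varepsilon_{(3)}\bigr) + \varepsilon_{(2)} \\
  &= \bigl(\beta_{0(2)} + \beta\,\beta_{0(3)}\bigr) + (\tau' + \alpha\beta)\,X
     + \bigl(\beta\,\varepsilon_{(3)} + \varepsilon_{(2)}\bigr)
\end{align*}
```

The coefficient of X in the last line plays the role of $\tau$ in Model 1, so the total effect splits into a direct part $\tau'$ and a mediated part $\alpha\beta$.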
Conditions for a significant mediation effect:<br />
• $\tau$ in Model 1<br />
The total effect of the independent variable X on the dependent variable Y must<br />
be significant.<br />
• $\alpha$ in Model 3<br />
The path from X to M must be significant.<br />
• $\beta$ in Model 2<br />
The path from M to Y must be significant.<br />
• $\tau'$ in Model 2<br />
Mediation is supported when $\tau'$ becomes nonsignificant once M is included. If the<br />
direct effect of X on Y drops to zero, this is complete mediation.<br />
Approach 2: statistical test of mediation<br />
Sobel test:<br />
tests the product of the coefficients of the two paths a and b ($\alpha$ and $\beta$).<br />
$z = \dfrac{\alpha\beta}{\sqrt{\alpha^2\sigma_\beta^2 + \beta^2\sigma_\alpha^2}}$, where $\sigma_\alpha$ and $\sigma_\beta$ are the standard errors of $\alpha$ and $\beta$.<br />
The null hypothesis is $\alpha\beta = 0$.<br />
MacKinnon and Dwyer (1994) and MacKinnon, Warsi, and Dwyer (1995)<br />
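The Sobel statistic is simple enough to compute by hand from two coefficients and their standard errors. The Python sketch below is one way to do it, using only the standard library. The function name `sobel_test` and the input values are hypothetical and serve only to exercise the formula. The two-sided p-value assumes the statistic is approximately standard normal under the null.

```python
import math

def sobel_test(alpha, se_alpha, beta, se_beta):
    """Return (z, two-sided p) for the Sobel test of the product alpha * beta."""
    # Standard error of the product alpha * beta (Sobel approximation).
    se_product = math.sqrt(alpha**2 * se_beta**2 + beta**2 * se_alpha**2)
    z = (alpha * beta) / se_product
    # Two-sided p-value from the standard normal: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# Hypothetical inputs, chosen only to illustrate the calculation.
z, p = sobel_test(alpha=0.5, se_alpha=0.1, beta=0.4, se_beta=0.1)
print(f"z = {z:.3f}, two-sided p = {p:.4f}")
```

In practice `alpha` and `se_alpha` would come from Model 3, and `beta` and `se_beta` from Model 2.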
Adriana’s study<br />
• The initial variable (EGFR) and the mediating variable (AKT) were both<br />
immunostaining indices.<br />
• The outcome variable (Ki-67) was a cancer cell proliferation index<br />
and also an immunostaining index.<br />
Approach 1: the 3 models<br />
• Model1 <- ols(Ki67 ~ EGFR + age + sex)<br />
• Model2 <- ols(Ki67 ~ EGFR + AKT + age + sex)<br />
• Model3 <- ols(AKT ~ EGFR + age + sex)<br />
Joint modeling (SEM)<br />
[Path diagram: variables Age, Gender, EGFR, AKT, and Ki-67, with error terms e1, e2, and d1, each labeled 0, 1.]<br />
Approach 1: results<br />
Model | Coefficient | SE | F | p<br />
ols(Ki67 ~ EGFR + …) | $\tau$: 0.0003069 | 0.0001400 | 4.80 | 0.0340<br />
ols(AKT ~ EGFR + …) | $\alpha$: 0.6584 | 0.1996 | 10.89 | 0.0020<br />
ols(Ki67 ~ EGFR + AKT + …) | $\beta$: 0.0003019 | 0.0000993 | 9.24 | 0.0042<br />
ols(Ki67 ~ EGFR + AKT + …) | $\tau'$: 0.00009687 | 0.0001444 | 0.45 | 0.5061<br />
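The causal-steps conditions (significant total effect, significant X to M path, significant M to Y path, and a nonsignificant direct effect once M is included) can be checked mechanically against the reported p-values. The Python sketch below assumes a 0.05 significance level, which the slides do not state. The function name `causal_steps` is hypothetical.

```python
def causal_steps(p_tau, p_alpha, p_beta, p_tau_prime, level=0.05):
    """Apply the four causal-steps conditions to p-values from Models 1 to 3."""
    checks = {
        "total effect (tau, Model 1) significant": p_tau < level,
        "X -> M path (alpha, Model 3) significant": p_alpha < level,
        "M -> Y path (beta, Model 2) significant": p_beta < level,
        "direct effect (tau', Model 2) not significant": p_tau_prime >= level,
    }
    # The pattern matches complete mediation only if every condition holds.
    return checks, all(checks.values())

# p-values as reported in the results for the EGFR, AKT, Ki-67 example.
checks, consistent_with_complete_mediation = causal_steps(
    p_tau=0.0340, p_alpha=0.0020, p_beta=0.0042, p_tau_prime=0.5061
)
for name, ok in checks.items():
    print(f"{name}: {ok}")
print("pattern consistent with complete mediation:",
      consistent_with_complete_mediation)
```

A pass here only says the significance pattern matches the causal-steps criteria. It does not by itself establish the causal ordering of EGFR, AKT, and Ki-67.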
Components of mediation model<br />
• Total effect =<br />
$\alpha\beta + \tau' = 0.6584 \times 0.0003019 + 0.00009687 = 0.0002957$ ($\approx \tau = 0.0003069$)<br />
• Direct effect =<br />
$\tau' = 0.00009687$<br />
• Mediated effect =<br />
$\alpha\beta = 0.6584 \times 0.0003019 = 0.0001988$<br />
(equivalently, total $-$ direct $= 0.0002957 - 0.00009687 = 0.0001988$)<br />
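The decomposition can be reproduced with a few lines of arithmetic. The Python sketch below uses the coefficients reported on the slides. The variable names are illustrative.

```python
# Coefficients as reported on the slides for the EGFR -> AKT -> Ki-67 example.
alpha = 0.6584          # X -> M path (Model 3)
beta = 0.0003019        # M -> Y path (Model 2)
tau_prime = 0.00009687  # direct effect of X on Y (Model 2)
tau = 0.0003069         # total effect of X on Y estimated by Model 1

mediated = alpha * beta        # mediated (indirect) effect
total = mediated + tau_prime   # total effect implied by the decomposition

print(f"mediated effect  = {mediated:.7f}")
print(f"direct effect    = {tau_prime:.8f}")
print(f"implied total    = {total:.7f}")
print(f"Model 1 estimate = {tau:.7f}")
```

Comparing the implied total with the Model 1 estimate is a quick consistency check on the three fitted models.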
Approach 2: results<br />
Test | p value<br />
Sobel | 0.00000152<br />
Goodman (I) | 0.00000172<br />
Goodman (II) | 0.00000133<br />
All three tests found a significant mediating effect for AKT.<br />
CONFOUNDING<br />
What is a confounder<br />
Criteria for a confounder<br />
1 It is a risk factor for the disease, independent of the putative<br />
risk factor (exposure variable, X).<br />
2 It is associated with the putative risk factor (exposure).<br />
3 It is not on the causal pathway between exposure and disease.<br />
Confounding model<br />
The association between X (exposure) and Y (outcome) is<br />
distorted by the presence of another variable, C (the confounder).<br />
[Diagram with nodes X, C, and Y and paths labeled $\alpha$ and $\beta$]<br />
An example<br />
• Age may confound the positive relationship between annual<br />
income and cancer incidence in the US.<br />
1 Older individuals are more likely to get cancer.<br />
2 Older individuals are likely to earn more money than younger<br />
ones, who have not spent as much time in the workforce.<br />
3 Income does not cause age, so age is not on the causal pathway from income to cancer.<br />
Confounding<br />
• Not an issue in a randomized study<br />
Randomization removes, in expectation, the correlation between confounder and<br />
exposure, e.g., age and annual income (i.e., we should have a roughly equal<br />
age distribution in each annual income group).<br />
• An issue in an observational study.<br />
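A small simulation can illustrate why random assignment breaks the link between a confounder and the exposure. The Python sketch below uses only the standard library, and everything in it is hypothetical: the age distribution, the rule that makes the exposure depend on age in the observational setting, and the helper `pearson`.

```python
import random

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(2010)  # fixed seed so the run is repeatable
n = 10_000
age = [rng.gauss(50, 10) for _ in range(n)]  # hypothetical confounder

# Observational setting: exposure (1 = high income) depends on age plus noise.
exposure_obs = [1 if a + rng.gauss(0, 10) > 50 else 0 for a in age]

# Randomized setting: exposure assigned by a fair coin, ignoring age.
exposure_rand = [rng.randint(0, 1) for _ in range(n)]

r_obs = pearson(age, exposure_obs)
r_rand = pearson(age, exposure_rand)
print(f"corr(age, exposure), observational: {r_obs:.3f}")
print(f"corr(age, exposure), randomized:    {r_rand:.3f}")
```

Under random assignment the correlation should be close to zero, which is why age can no longer distort the exposure effect.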
Consequence of confounding<br />
• Biases the effect estimate.<br />
• Widens the confidence interval.<br />
• Including additional potential confounders may only widen<br />
the CI and have no impact on the effect estimate.<br />
• Confounding is the masking of the true effect of a risk factor<br />
on a disease or outcome by the presence of another variable.<br />
• The presence of a confounding effect can lead to a spurious<br />
association between exposure and outcome.<br />
How to pick potential confounders<br />
philosophically<br />
• Not a simple question<br />
• Your subject-matter knowledge<br />
• Prior experience with data<br />
• The three criteria for confounders<br />
How to pick potential confounders<br />
statistically<br />
• In multivariable logistic regression, for example, one rule of thumb is to<br />
include a potential confounder in the model if adjusting for it changes the odds<br />
ratio for the exposure by 10% or more. The criterion is not whether the<br />
confounder itself is statistically significant, but how much the exposure effect<br />
changes when it is added. If the effect changes by 10% or more, we consider the<br />
variable a confounder and leave it in the model.<br />
• P values do not reveal confounding. Only the change in $\beta$ between the<br />
adjusted and unadjusted models indicates whether confounding is at work.<br />
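The 10% rule of thumb can be written as a one-line comparison of the crude and adjusted odds ratios. In the Python sketch below, the function name `change_in_estimate` and the odds ratios are hypothetical. Measuring the change relative to the adjusted odds ratio is an assumption, since the rule as stated does not fix the denominator.

```python
def change_in_estimate(crude_or, adjusted_or, threshold=0.10):
    """Return (relative change, flag) for the change-in-estimate rule of thumb."""
    # Assumption: the change is measured relative to the adjusted odds ratio.
    relative_change = abs(crude_or - adjusted_or) / adjusted_or
    return relative_change, relative_change >= threshold

# Hypothetical odds ratios for the exposure, before and after adjustment.
for crude, adjusted in [(2.0, 1.6), (2.0, 1.9)]:
    change, keep = change_in_estimate(crude, adjusted)
    print(f"crude OR {crude}, adjusted OR {adjusted}: "
          f"change {change:.1%}, treat as confounder: {keep}")
```

The flag depends only on how far the exposure estimate moves, not on the p-value of the added covariate.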
Design stage<br />
1 Randomization<br />
2 Restriction (of the study population to one category of a<br />
confounder; this limits generalizability)<br />
3 Matching (matching on every confounder is not feasible, so residual confounding<br />
can still bias the estimate)<br />
Data analysis stage<br />
1 Covariate adjustment (for a few to a dozen confounders)<br />
2 Propensity score (for all measured confounders)<br />
What is unobserved confounding<br />
Confounding that remains after adjustment for observed<br />
confounders is referred to as residual confounding. It includes<br />
confounding by unmeasured variables as well as by inaccurately measured<br />
confounders.<br />
Methods for unobserved confounding<br />
• A challenge<br />
unobserved confounders cannot be adjusted for<br />
only randomization addresses them<br />
• But you can<br />
adjust for as many confounders as the data allow<br />
improve the measurement (measurement error in confounders<br />
leads to residual confounding)<br />
use more appropriate scaling of the measurement<br />
more ...<br />
Other proposed approaches for unobserved confounding<br />
BIOMETRICS 56, 915-921, September 2000<br />
When Should Epidemiologic Regressions Use Random Coefficients?<br />
Sander Greenland<br />
Department of Epidemiology, UCLA School of Public Health,<br />
Los Angeles, California 90095-1772, U.S.A.<br />
SUMMARY. Regression models with random coefficients arise naturally in both frequentist and Bayesian approaches<br />
to estimation problems. They are becoming widely available in standard computer packages under the headings of<br />
generalized linear mixed models, hierarchical models, and multilevel models. I here argue that such models offer a<br />
more scientifically defensible framework for epidemiologic analysis than the fixed-effects models now prevalent in<br />
epidemiology. The argument invokes an antiparsimony principle attributed to L. J. Savage, which is that models should<br />
be rich enough to reflect the complexity of the relations under study. It also invokes the countervailing principle that<br />
you cannot estimate anything if you try to estimate everything (often used to justify parsimony). Regression with<br />
random coefficients offers a rational compromise between these principles as well as an alternative to analyses based<br />
on standard variable-selection algorithms and their attendant distortion of uncertainty assessments. These points are<br />
illustrated with an analysis of data on diet, nutrition, and breast cancer.<br />
Confounding is different from mediation in:<br />
• Temporality (in mediation, exposure occurs first, then M, then the outcome,<br />
which conceptually follows an experimental design)<br />
• Directionality<br />
• Causality<br />
• Confounders are often demographic variables that typically cannot<br />
be changed in an experimental design. Mediators are by<br />
definition capable of being changed and are often selected based<br />
on malleability.<br />
• Statistical test<br />
Statistical test for confounding<br />
TECHNICAL REPORT R-256, January 1998<br />
[Slide shows a scanned excerpt of this report; its text did not survive extraction.]<br />
INTERACTION<br />
What is interaction<br />
• An interaction means that the effect of X on Y depends on<br />
the level of a third variable.<br />
• No causal sequence is implied by interaction.<br />
• Also known as effect modification or moderation<br />
Understanding interaction effect<br />
• We have the following regression on $x_1$ and $x_2$:<br />
$y = \alpha + \beta_1 x_1 + \beta_2 x_2 + \beta_3 (x_1 x_2) + \varepsilon$<br />
• The null hypothesis is $H_0: \beta_3 = 0$, i.e., the product of the two<br />
variables, $x_1$ and $x_2$, has no effect on $y$.<br />
• The test of $H_0: \beta_3 = 0$ is a test for parallelism of the two<br />
slopes (if $x_2$ has two levels).<br />
Understanding interaction effect<br />
• Given:<br />
$y = \alpha + \beta_1 x_1 + \beta_2 x_2 + \beta_3 (x_1 x_2) + \varepsilon$<br />
• Without the interaction term, the effect of $x_1$ on $y$ is measured by $\beta_1$.<br />
• With the interaction term, the effect of $x_1$ on $y$ is measured by<br />
$\beta_1 + \beta_3 x_2$.<br />
The effect changes as $x_2$ increases.<br />
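The expression for the effect of $x_1$ follows from collecting the terms in $x_1$. A brief sketch of that step, using the same symbols as the regression above and assuming $\varepsilon$ has mean zero given $x_1$ and $x_2$:

```latex
\begin{align*}
y &= \alpha + \beta_1 x_1 + \beta_2 x_2 + \beta_3 (x_1 x_2) + \varepsilon \\
  &= (\alpha + \beta_2 x_2) + (\beta_1 + \beta_3 x_2)\, x_1 + \varepsilon, \\
\frac{\partial\, \mathrm{E}[y \mid x_1, x_2]}{\partial x_1} &= \beta_1 + \beta_3 x_2 .
\end{align*}
```

When $\beta_3 = 0$ the slope reduces to $\beta_1$ for every value of $x_2$, which is the no-interaction case.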
When x 2 has two levels<br />
Effect (slope) of X1 on Y does depend on X2 value.<br />
[Figure: $Y$ (0 to 12) plotted against $X_1$ (0 to 1.5), with one line for each level of $X_2$.]<br />
Model: $Y = 1 + 2X_1 + 3X_2 + 4X_1 X_2$<br />
$X_2 = 1$: $Y = 1 + 2X_1 + 3(1) + 4X_1(1) = 4 + 6X_1$<br />
$X_2 = 0$: $Y = 1 + 2X_1 + 3(0) + 4X_1(0) = 1 + 2X_1$<br />
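The arithmetic behind the two lines can be checked with a few lines of Python. This is a minimal sketch for the model $Y = 1 + 2X_1 + 3X_2 + 4X_1 X_2$ with no error term; the function names `predicted_y` and `line_for_level` are hypothetical and chosen only for this illustration.<br />

```python
def predicted_y(x1, x2):
    """Mean response for the model Y = 1 + 2*X1 + 3*X2 + 4*X1*X2."""
    return 1 + 2 * x1 + 3 * x2 + 4 * x1 * x2


def line_for_level(x2):
    """Return (intercept, slope) of Y as a function of X1 at a fixed X2."""
    intercept = predicted_y(0, x2)
    # Y is linear in X1 for fixed X2, so a unit step in X1 gives the slope.
    slope = predicted_y(1, x2) - predicted_y(0, x2)
    return intercept, slope


for level in (0, 1):
    intercept, slope = line_for_level(level)
    print(f"X2 = {level}: Y = {intercept} + {slope} * X1")
```

The slope moves from 2 to 6 between the two levels of $X_2$; the difference of 4 is the coefficient of the product term.<br />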
How to determine interaction<br />
philosophically<br />
• An interaction can itself be a question of interest in the study.<br />
• Interactions to be tested are usually pre-specified.<br />
An example about the philosophy<br />
For example, imagine a study that tests the effect of a treatment<br />
on an outcome measure. The treatment variable is composed of<br />
two groups, e.g., treatment and control. The result is that the<br />
mean of the treatment group is higher than the mean of the<br />
control group. But what if the researcher is also interested in<br />
whether the treatment is equally effective for females and males?<br />
That is, does the treatment effect differ by gender group? This<br />
is a question of interaction.<br />
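The question in this example can be written as a difference of differences. The sketch below uses made-up cell means (they are illustrative only and do not come from any study), and the names `cell_means` and `treatment_effect` are hypothetical.<br />

```python
# Hypothetical mean outcome in each gender-by-arm cell (illustrative numbers only).
cell_means = {
    ("female", "treatment"): 14.0,
    ("female", "control"): 10.0,
    ("male", "treatment"): 11.0,
    ("male", "control"): 10.0,
}


def treatment_effect(gender):
    """Treatment mean minus control mean within one gender group."""
    return cell_means[(gender, "treatment")] - cell_means[(gender, "control")]


effect_female = treatment_effect("female")
effect_male = treatment_effect("male")

# The interaction contrast is the difference between the two treatment effects.
# A value of zero would mean the treatment is equally effective in both groups.
interaction_contrast = effect_female - effect_male

print("effect in females:", effect_female)
print("effect in males:", effect_male)
print("interaction contrast:", interaction_contrast)
```

In a regression with 0/1 indicators for treatment and gender, this contrast is what the coefficient of the treatment-by-gender product term estimates.<br />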
How to determine interaction<br />
statistically<br />
• A likelihood ratio test can be used to test the interaction.<br />
• Interaction terms can be excluded from the model if they are<br />
jointly non-significant.<br />
• A main effect may become non-significant once the interaction is<br />
included in the model.<br />
• Main effects alone do not tell the whole story in the presence of a<br />
significant interaction.<br />
• Stratified estimates should be reported if the interaction<br />
tests significant.<br />
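A likelihood ratio test of a single interaction term can be sketched with the standard library alone. The example assumes a linear model with Gaussian errors, for which the statistic is $n \log(\mathrm{RSS}_0 / \mathrm{RSS}_1)$ and is compared with a chi-square distribution on 1 degree of freedom. The data are simulated with made-up coefficients, and the helpers `solve`, `ols_rss`, and `interaction_lrt` are hypothetical names written for this illustration, not part of any library.<br />

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols_rss(rows, y):
    """Least squares via the normal equations; returns (coefficients, residual sum of squares)."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve(xtx, xty)
    rss = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(rows, y))
    return beta, rss


def interaction_lrt(x1, x2, y):
    """Compare y ~ x1 + x2 (reduced) with y ~ x1 + x2 + x1*x2 (full)."""
    n = len(y)
    reduced = [[1.0, a, b] for a, b in zip(x1, x2)]
    full = [[1.0, a, b, a * b] for a, b in zip(x1, x2)]
    _, rss0 = ols_rss(reduced, y)
    beta, rss1 = ols_rss(full, y)
    stat = max(n * math.log(rss0 / rss1), 0.0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return beta, stat, p_value


# Simulated data (assumption: true coefficients 1, 2, 3, 4 and unit-variance noise).
rng = random.Random(0)
x1 = [rng.uniform(0, 1.5) for _ in range(200)]
x2 = [float(i % 2) for i in range(200)]
y = [1 + 2 * a + 3 * b + 4 * a * b + rng.gauss(0, 1) for a, b in zip(x1, x2)]

beta, stat, p_value = interaction_lrt(x1, x2, y)
print("estimated interaction coefficient:", beta[3])
print("LRT statistic:", stat, "p-value:", p_value)
```

With several interaction terms, the same comparison applies with the degrees of freedom equal to the number of terms dropped, which is how the terms can be assessed as a whole; the 1-df p-value formula used here would then need to be replaced.<br />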
A note<br />
The statistical power to detect an interaction is 5 times<br />
lower than the power to detect a main effect. So, p = 0.10 could be considered<br />
significant for an interaction test. Keep in mind that we do not want to miss any important<br />
interaction.<br />
Interaction differs from mediation and confounding in that:<br />
• No causal pathway is assumed.<br />
• The test is on the product of two measurements (for mediation, the test is on the product of two<br />
coefficients).<br />
• It can be tested statistically (confounding cannot).<br />
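The contrast between "product of two measurements" and "product of two coefficients" can be made explicit. The notation below for the mediation model ($a$, $b$, $c'$, mediator $m$) is a common textbook convention assumed here for illustration; it may differ from the notation used on the mediation slides.<br />

```latex
\begin{align*}
\text{Interaction:}\quad & y = \alpha + \beta_1 x_1 + \beta_2 x_2 + \beta_3 (x_1 x_2) + \varepsilon,
  & H_0 &: \beta_3 = 0 \\
\text{Mediation:}\quad & m = a_0 + a\,x + \varepsilon_1, \qquad y = b_0 + c'\,x + b\,m + \varepsilon_2,
  & H_0 &: a\,b = 0
\end{align*}
```

In the interaction model the product $x_1 x_2$ is a regressor and the test concerns its single coefficient; in the mediation model the tested quantity $ab$ is itself a product of two estimated coefficients.<br />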
Acknowledgements<br />
Pingsheng Wu, Ph.D.<br />
Adriana Gonzalez, M.D., Ph.D.<br />
Debra Friedman, M.D., Ph.D.<br />
Thank you!<br />