Learning Statistics with R - A tutorial for psychology students and other beginners, 2018a

16.5.1 The F test comparing two models

Let’s frame this in a slightly more abstract way. We’ll say that our full model can be written as an R formula that contains several different terms, say Y ~ A + B + C + D. Our null model only contains some subset of these terms, say Y ~ A + B. Some of these terms might be main effect terms, others might be interaction terms. It really doesn’t matter. The only thing that matters here is that we want to treat some of these terms as the “starting point” (i.e., the terms in the null model, A and B), and we want to see if including the other terms (i.e., C and D) leads to a significant improvement in model performance, over and above what could be achieved by a model that includes only A and B. In essence, we have null and alternative hypotheses that look like this:

Hypothesis     Correct model?   R formula for correct model
Null           M0               Y ~ A + B
Alternative    M1               Y ~ A + B + C + D
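As a concrete illustration, the two formulas in the table can be fitted as a pair of nested models. The sketch below is only one way to set this up: it assumes simulated data and uses `lm()`, and the data frame `dat`, the sample size and the effect sizes are all invented for illustration rather than taken from the text.

```r
# Simulated data: variable names match the formulas, but the values
# and effect sizes are made up purely for illustration
set.seed(42)
n <- 200
dat <- data.frame(A = rnorm(n), B = rnorm(n), C = rnorm(n), D = rnorm(n))
dat$Y <- 2 + 0.5 * dat$A - 0.3 * dat$B + 0.4 * dat$C + rnorm(n)

M0 <- lm(Y ~ A + B, data = dat)          # null model
M1 <- lm(Y ~ A + B + C + D, data = dat)  # full (alternative) model

# Every term in M0 also appears in M1, so M0 is nested inside M1
terms0 <- attr(terms(M0), "term.labels")
terms1 <- attr(terms(M1), "term.labels")
nested <- all(terms0 %in% terms1)
print(nested)
```

The nesting check matters because the comparison described here only makes sense when the null model's terms are a subset of the full model's terms.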

Is there a way of making this comparison directly?

To answer this, let’s go back to fundamentals. As we saw in Chapter 14, the F-test is constructed from two kinds of quantity: sums of squares (SS) and degrees of freedom (df). These two things define a mean square value (MS = SS/df), and we obtain our F statistic by contrasting the MS value associated with “the thing we’re interested in” (the model) with the MS value associated with “everything else” (the residuals). What we want to do is figure out how to talk about the SS value that is associated with the difference between two models. It’s actually not all that hard to do.

Let’s start with the fundamental rule that we used throughout the chapter on regression:

$$
\mathrm{SS}_T = \mathrm{SS}_M + \mathrm{SS}_R
$$

That is, the total sums of squares (i.e., the overall variability of the outcome variable) can be decomposed into two parts: the variability associated with the model, $\mathrm{SS}_M$, and the residual or leftover variability, $\mathrm{SS}_R$. However, it’s kind of useful to rearrange this equation slightly, and say that the SS value associated with a model is defined like this:

$$
\mathrm{SS}_M = \mathrm{SS}_T - \mathrm{SS}_R
$$

Now, in our scenario, we have two models: the null model (M0) and the full model (M1):

$$
\begin{aligned}
\mathrm{SS}_{M0} &= \mathrm{SS}_T - \mathrm{SS}_{R0} \\
\mathrm{SS}_{M1} &= \mathrm{SS}_T - \mathrm{SS}_{R1}
\end{aligned}
$$

Next, let’s think about what it is we actually care about here. What we’re interested in is the difference between the full model and the null model. So, if we want to preserve the idea that what we’re doing is an “analysis of the variance” (ANOVA) in the outcome variable, what we should do is define the SS associated with the difference to be equal to the difference in the SS:

$$
\begin{aligned}
\mathrm{SS}_\Delta &= \mathrm{SS}_{M1} - \mathrm{SS}_{M0} \\
&= (\mathrm{SS}_T - \mathrm{SS}_{R1}) - (\mathrm{SS}_T - \mathrm{SS}_{R0}) \\
&= \mathrm{SS}_{R0} - \mathrm{SS}_{R1}
\end{aligned}
$$
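The identity above can be checked numerically. The following is a minimal sketch on simulated data (the data frame `dat` and its effect sizes are assumptions made up for this example), computing each sum of squares by hand from two `lm()` fits and confirming that the two routes to the SS for the difference agree.

```r
# Simulated data: values and effect sizes are invented for illustration
set.seed(42)
n <- 200
dat <- data.frame(A = rnorm(n), B = rnorm(n), C = rnorm(n), D = rnorm(n))
dat$Y <- 2 + 0.5 * dat$A - 0.3 * dat$B + 0.4 * dat$C + rnorm(n)

M0 <- lm(Y ~ A + B, data = dat)          # null model
M1 <- lm(Y ~ A + B + C + D, data = dat)  # full model

ss_t  <- sum((dat$Y - mean(dat$Y))^2)  # total sum of squares
ss_r0 <- sum(residuals(M0)^2)          # residual SS for M0
ss_r1 <- sum(residuals(M1)^2)          # residual SS for M1
ss_m0 <- ss_t - ss_r0                  # model SS for M0
ss_m1 <- ss_t - ss_r1                  # model SS for M1

# SS for the difference, computed two ways
ss_delta_resid <- ss_r0 - ss_r1
ss_delta_model <- ss_m1 - ss_m0
same <- isTRUE(all.equal(ss_delta_resid, ss_delta_model))
print(same)
```

Because the total sum of squares cancels, only the two residual sums of squares are needed in practice.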

Now that we have our degrees of freedom, we can calculate mean squares and F values in the usual way. Specifically, we’re interested in the mean square for the difference between models, and the mean square for the residuals associated with the full model (M1), which are given by

$$
\mathrm{MS}_\Delta = \frac{\mathrm{SS}_\Delta}{\mathrm{df}_\Delta}
$$
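To see how these quantities fit together, here is a hedged sketch in R on simulated data (the data frame `dat` and its effect sizes are invented for illustration). Two things in it are assumptions not spelled out in the passage above: the degrees of freedom for the difference are taken to be the difference in residual degrees of freedom between the two models (that is, the number of extra terms in M1), and the F statistic is formed as the ratio of the mean square for the difference to the residual mean square of M1. The hand calculation is then compared against R's built-in `anova()` for two nested `lm` fits.

```r
# Simulated data: values and effect sizes are invented for illustration
set.seed(42)
n <- 200
dat <- data.frame(A = rnorm(n), B = rnorm(n), C = rnorm(n), D = rnorm(n))
dat$Y <- 2 + 0.5 * dat$A - 0.3 * dat$B + 0.4 * dat$C + rnorm(n)

M0 <- lm(Y ~ A + B, data = dat)          # null model
M1 <- lm(Y ~ A + B + C + D, data = dat)  # full model

# Sums of squares
ss_r0    <- sum(residuals(M0)^2)
ss_r1    <- sum(residuals(M1)^2)
ss_delta <- ss_r0 - ss_r1

# Degrees of freedom (assumption: df for the difference is the
# number of extra parameters in the full model)
df_delta <- df.residual(M0) - df.residual(M1)
df_r1    <- df.residual(M1)

# Mean squares and the F statistic
ms_delta <- ss_delta / df_delta
ms_r1    <- ss_r1 / df_r1
f_stat   <- ms_delta / ms_r1
p_value  <- pf(f_stat, df_delta, df_r1, lower.tail = FALSE)

# Built-in comparison of the two nested models
tab <- anova(M0, M1)
print(tab)
print(c(F = f_stat, p = p_value))
```

If the hand calculation is set up as assumed, `f_stat` and `p_value` should agree with the `F` and `Pr(>F)` columns in the second row of the `anova()` table.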
