• data. The data frame containing the variables.

As we saw with aov() in Chapter 14, the output of the lm() function is a fairly complicated object, with quite a lot of technical information buried under the hood. Because this technical information is used by other functions, it's generally a good idea to create a variable that stores the results of your regression. With this in mind, to run my linear regression, the command I want to use is this:

> regression.1 <- lm( formula = dan.grump ~ dan.sleep, data = parenthood )
> print( regression.1 )

Call:
lm(formula = dan.grump ~ dan.sleep, data = parenthood)

Coefficients:
(Intercept)    dan.sleep
    125.956       -8.937

This looks promising. There are two separate pieces of information here. Firstly, R is politely reminding us what the command was that we used to specify the model in the first place, which can be helpful. More importantly from our perspective, however, is the second part, in which R gives us the intercept b̂₀ = 125.96 and the slope b̂₁ = −8.94. In other words, the best-fitting regression line that I plotted in Figure 15.2 has this formula:

Ŷᵢ = −8.94 Xᵢ + 125.96
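
If you'd like to re-draw something like Figure 15.2 yourself, a minimal sketch is to draw a scatterplot of the raw data and then overlay the fitted line; conveniently, abline() can pull the intercept and slope straight out of an lm object:

> plot( dan.grump ~ dan.sleep, data = parenthood )   # scatterplot of grumpiness against sleep
> abline( regression.1 )                             # overlay the fitted regression line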

15.2.2 Interpreting the estimated model<br />

The most important thing to be able to understand is how to interpret these coefficients. Let's start with b̂₁, the slope. If we remember the definition of the slope, a regression coefficient of b̂₁ = −8.94 means that if I increase Xᵢ by 1, then I'm decreasing Yᵢ by 8.94. That is, each additional hour of sleep that I gain will improve my mood, reducing my grumpiness by 8.94 grumpiness points. What about the intercept? Well, since b̂₀ corresponds to "the expected value of Yᵢ when Xᵢ equals 0", it's pretty straightforward. It implies that if I get zero hours of sleep (Xᵢ = 0) then my grumpiness will go off the scale, to an insane value of (Yᵢ = 125.96). Best to be avoided, I think.
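
To make these interpretations a little more concrete, here's a small sketch (again using the regression.1 object from above) that extracts the estimated coefficients and then asks the model what grumpiness it predicts for, say, eight hours of sleep:

> coef( regression.1 )                                          # the estimated intercept and slope
> predict( regression.1, newdata = data.frame( dan.sleep = 8 ) )  # predicted grumpiness after 8 hours

The predicted value is just the intercept plus the slope times eight, i.e. 125.96 − 8.94 × 8 ≈ 54.5 grumpiness points.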

15.3 Multiple linear regression

The simple linear regression model that we've discussed up to this point assumes that there's a single predictor variable that you're interested in, in this case dan.sleep. In fact, up to this point, every statistical tool that we've talked about has assumed that your analysis uses one predictor variable and one outcome variable. However, in many (perhaps most) research projects you actually have multiple predictors that you want to examine. If so, it would be nice to be able to extend the linear regression framework to allow for multiple predictors.
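
As a preview, the formula interface of lm() extends naturally: additional predictors are simply added to the right hand side of the ~. A minimal sketch, assuming the parenthood data frame also contained a second predictor (here a hypothetical variable baby.sleep), would look like this:

> regression.2 <- lm( dan.grump ~ dan.sleep + baby.sleep,  # two predictors on the right of the ~
+                     data = parenthood )
> print( regression.2 )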
