Multiple Regression Analysis - essentiavitae.com
RHenson Multiple Regression
Module 3 Assignment 3: Multiple Regression
Robin Henson
August 8, 2009
NSG 6163 Health Outcomes
Texas Woman’s University
Assign 3: Multiple Regression
Use the sleep.sav data set and the multiple correlation/regression procedures that you have learned to use in SPSS to generate the statistics necessary to complete the following tasks. Place your answers in a Word file and post by the Assignment Function under Module 3 Assignments by August 8, 2009, at 2400 hrs.
PART 1
Run a standard multiple regression procedure to explore how factors such as gender (sex), age (age), physical fitness (fitrate) and depression (depress) impact the level of daytime sleepiness (totSAS, the dependent variable). Provide relevant computer output to support your check of the assumptions, outliers and multicollinearity. Present and interpret your findings. Copy and paste relevant output from your SPSS run into your Word file.
Descriptive Statistics:
Sleepy & Associated Sensations (totSAS) is ordinal data that is treated as continuous data. Scores range from 5 (low sleepiness) to 50 (extreme sleepiness). There are 251 valid cases and the mean score is 26.04 (SD 10.52).
Gender (sex) is dichotomous data: 0 = female and 1 = male. There are 271 valid cases and the mean is .45 (SD .498), so there are slightly more female participants than male participants.
Age (age) is continuous data. There are 248 valid cases and the mean age is 43.87 (SD 12.68).
Physical fitness (fitrate) is ordinal data that is treated as continuous data. Scores range from 1 (very poor) to 10 (very good). There are 266 valid cases and the mean score is 6.42 (SD 1.72). On average, these individuals rate their fitness above the midpoint of the scale.
Depression (HADS Depression) is ordinal data that is treated as continuous data. Scores range from 0 (no depression) to 21 (severe depression). There are 269 valid cases and the mean score is 3.50 (SD 2.99). On average, these individuals report low levels of depression.
Descriptive Statistics
                                  Mean     Std. Deviation   N
sleepy & assoc sensations scale   26.04    10.520           251
Sex                               .45      .498             271
Age                               43.87    12.684           248
physical fitness                  6.42     1.717            266
HADS Depression                   3.50     2.993            269
Correlations
                                                       sleepy & assoc                       physical   HADS
                                                       sensations scale   sex      age      fitness    Depression
Pearson Correlation   sleepy & assoc sensations scale  1.000              -.199    -.141    -.267      .482
                      Sex                              -.199              1.000    -.017    .110       -.071
                      Age                              -.141              -.017    1.000    -.039      -.004
                      physical fitness                 -.267              .110     -.039    1.000      -.314
                      HADS Depression                  .482               -.071    -.004    -.314      1.000
Sig. (1-tailed)       sleepy & assoc sensations scale  .                  .001     .017     .000       .000
                      Sex                              .001               .        .393     .037       .124
                      Age                              .017               .393     .        .271       .473
                      physical fitness                 .000               .037     .271     .          .000
                      HADS Depression                  .000               .124     .473     .000       .
N                     sleepy & assoc sensations scale  251                251      230      247        249
                      Sex                              251                271      248      266        269
                      Age                              230                248      248      243        246
                      physical fitness                 247                266      243      266        265
                      HADS Depression                  249                269      246      265        269
Variables Entered/Removed (b)
Model   Variables Entered                                 Variables Removed   Method
1       HADS Depression, age, sex, physical fitness (a)   .                   Enter
a. All requested variables entered.
b. Dependent Variable: sleepy & assoc sensations scale
Model Summary (b)
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .541 (a)   .293       .280                8.927
a. Predictors: (Constant), HADS Depression, age, sex, physical fitness
b. Dependent Variable: sleepy & assoc sensations scale
ANOVA (b)
Model            Sum of Squares   df    Mean Square   F        Sig.
1   Regression   7413.343         4     1853.336      23.258   .000 (a)
    Residual     17929.187        225   79.685
    Total        25342.530        229
a. Predictors: (Constant), HADS Depression, age, sex, physical fitness
b. Dependent Variable: sleepy & assoc sensations scale
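The Model Summary and ANOVA figures can be cross-checked by hand. The short Python sketch below is not part of the SPSS run; it simply recomputes R square, adjusted R square, the F ratio and the standard error of the estimate from the sums of squares and degrees of freedom printed in the ANOVA table, using the usual textbook formulas. The variable names are illustrative only.

```python
# Values copied from the ANOVA table for the standard regression model.
ss_regression = 7413.343
ss_residual = 17929.187
df_regression = 4      # number of predictors in the model
df_residual = 225

ss_total = ss_regression + ss_residual
df_total = df_regression + df_residual   # equals N - 1

# R square: share of total variation accounted for by the predictors.
r_square = ss_regression / ss_total

# Adjusted R square: penalises R square for the number of predictors.
adj_r_square = 1 - (1 - r_square) * df_total / df_residual

# F ratio: mean square regression divided by mean square residual.
f_ratio = (ss_regression / df_regression) / (ss_residual / df_residual)

# Standard error of the estimate: square root of the mean square residual.
std_error_estimate = (ss_residual / df_residual) ** 0.5

print(round(r_square, 3), round(adj_r_square, 3),
      round(f_ratio, 3), round(std_error_estimate, 3))
```

Rounded to three decimals, these values agree with the .293, .280, 23.258 and 8.927 reported by SPSS.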
Coefficients (a)
                     Unstandardized        Standardized                    95.0% Confidence
                     Coefficients          Coefficients                    Interval for B             Correlations                   Collinearity Statistics
Model                B         Std. Error  Beta          t        Sig.     Lower Bound  Upper Bound   Zero-order  Partial  Part     Tolerance  VIF
1  (Constant)        32.211    3.481                     9.255    .000     25.352       39.069
   Sex               -3.323    1.193       -.157         -2.786   .006     -5.673       -.973         -.199       -.183    -.156    .986       1.014
   Age               -.121     .047        -.146         -2.604   .010     -.213        -.029         -.141       -.171    -.146    .998       1.002
   physical fitness  -.731     .364        -.119         -2.008   .046     -1.447       -.014         -.267       -.133    -.113    .892       1.121
   HADS Depression   1.522     .208        .433          7.329    .000     1.113        1.932         .482        .439     .411     .900       1.111
a. Dependent Variable: sleepy & assoc sensations scale
Collinearity Diagnostics (a)
                                                 Variance Proportions
Model  Dimension  Eigenvalue  Condition Index    (Constant)  sex    age    physical fitness  HADS Depression
1      1          4.027       1.000              .00         .02    .00    .00               .02
       2          .539        2.734              .00         .69    .00    .00               .21
       3          .343        3.424              .00         .28    .02    .03               .58
       4          .071        7.509              .00         .01    .62    .34               .03
       5          .020        14.235             .99         .00    .35    .63               .16
a. Dependent Variable: sleepy & assoc sensations scale
Residuals Statistics (a)
                                    Minimum   Maximum   Mean    Std. Deviation   N
Predicted Value                     15.04     42.28     26.15   5.653            242
Std. Predicted Value                -1.933    2.853     .018    .993             242
Standard Error of Predicted Value   .816      2.269     1.281   .283             242
Adjusted Predicted Value            15.04     41.95     26.27   5.577            225
Residual                            -24.781   19.438    .178    8.839            225
Std. Residual                       -2.776    2.178     .020    .990             225
Stud. Residual                      -2.810    2.207     .020    1.002            225
Deleted Residual                    -25.393   19.980    .172    9.050            225
Stud. Deleted Residual              -2.854    2.226     .019    1.006            225
Mahal. Distance                     .919      13.804    3.949   2.314            242
Cook's Distance                     .000      .075      .005    .009             225
Centered Leverage Value             .004      .060      .017    .010             242
a. Dependent Variable: sleepy & assoc sensations scale
State 2 applicable research questions (Hint: use questions on page 151 of SPSS survival manual as an example)
1. How well do the variables gender, age, fitness, and depression predict totSAS? How much variance in totSAS can be explained by scores on these variables/scales?
2. Which is the best predictor of totSAS: gender, age, fitness, or depression?
Were the assumptions met? Yes, as described below:
A. Sample Size: Based on the rule N > 50 + 8m (where m is the number of independent variables), the minimum sample size is 50 + 8(4) = 82. With 248 cases, the sample size is adequate. The assumption for sample size is not violated.
B. Multicollinearity:
Correlation of independent variables- The independent variables sex (r = -.199), age (r = -.141), and fitrate (r = -.267) each have a small negative correlation with totSAS. The independent variable HADS Depression has a moderate positive correlation with totSAS (r = .482). All 4 variables are
There is an apparent linear relationship between the independent variables and totSAS, with no apparent curvature. This supports the conclusion that there is no violation of linearity.
G. Outliers
Scatterplot- The scatterplot shows no potential outliers. All standardized residuals fall within -3.3 to 3.3.
Mahalanobis Distances (MAH_1)- Outliers can also be checked using the Mahalanobis distances that SPSS saves as MAH_1. The critical chi-square value for 4 independent variables is 18.47. The maximum value for this data file is 13.804, which does not exceed the critical value. Had the maximum exceeded the critical value, the researcher would need to look up the MAH_1 value for each potential outlier in the SPSS data file and consider whether any case exceeding the critical value should be removed. In this instance, no cases need to be considered for removal. There is no major violation of the assumption regarding outliers.
Cook’s Distance- The maximum value for Cook’s Distance is .075, which is well below 1, so no case has an undue influence on the model. Had the maximum exceeded 1, the SPSS data file would need to be checked for all cases with a COO_1 value > 1, and the researcher would need to consider removing those high-value cases.
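The sample-size rule and the Mahalanobis cutoff used above can both be reproduced with a few lines of Python. This is only an illustrative sketch, not SPSS output. It assumes the conventional alpha of .001 for the Mahalanobis check (the level that reproduces the 18.47 quoted above), and it uses a closed-form chi-square tail probability that holds only for an even number of degrees of freedom. The function names are made up for this example.

```python
import math

def min_sample_size(num_predictors: int) -> int:
    """Rule of thumb quoted in the text: N should exceed 50 + 8m."""
    return 50 + 8 * num_predictors

def chi2_upper_tail_even_df(x: float, df: int) -> float:
    """P(X > x) for a chi-square variable; closed form valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("this sketch handles positive even df only")
    half = x / 2
    return math.exp(-half) * sum(half ** i / math.factorial(i)
                                 for i in range(df // 2))

def chi2_critical_even_df(alpha: float, df: int) -> float:
    """Bisection search for the value whose upper-tail probability equals alpha."""
    low, high = 0.0, 200.0
    for _ in range(100):
        mid = (low + high) / 2
        if chi2_upper_tail_even_df(mid, df) > alpha:
            low = mid      # tail still too large, move right
        else:
            high = mid     # tail small enough, move left
    return (low + high) / 2

required_n = min_sample_size(4)                  # 4 independent variables
critical = chi2_critical_even_df(0.001, df=4)    # assumed alpha = .001
max_mahalanobis = 13.804                         # from Residuals Statistics

print(required_n)                     # 82
print(round(critical, 2))             # 18.47
print(max_mahalanobis > critical)     # False: no multivariate outliers flagged
```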
Evaluating the model:
29% (R square = .293) of the variance in totSAS is explained by the model (gender, age, physical fitness, and depression). The adjusted R square (.280) corrects the optimistic overestimate that R square gives when the study sample is small, which is not a problem in this case because the sample size is adequate. The ANOVA table indicates that this model reaches statistical significance [F(4,225) = 23.258, p
explaining totSAS (p
Run a hierarchical multiple regression procedure using the same variables you used in regression number one. Control for gender and age then examine the effects of physical fitness and depression on daytime sleepiness. Present and interpret your findings. Copy and paste relevant computer output that displays results and provide a narrative summary of the output.
State your research question.
Research question:
1. If we control for the possible effect of gender and age, is our set of variables (physical fitness and depression) still able to predict a significant amount of the variance in daytime sleepiness (totSAS)?
Descriptive Statistics
                                  Mean     Std. Deviation   N
sleepy & assoc sensations scale   26.04    10.520           251
Sex                               .45      .498             271
Age                               43.87    12.684           248
physical fitness                  6.42     1.717            266
HADS Depression                   3.50     2.993            269
Correlations
                                                       sleepy & assoc                       physical   HADS
                                                       sensations scale   Sex      age      fitness    Depression
Pearson Correlation   sleepy & assoc sensations scale  1.000              -.199    -.141    -.267      .482
                      Sex                              -.199              1.000    -.017    .110       -.071
                      Age                              -.141              -.017    1.000    -.039      -.004
                      physical fitness                 -.267              .110     -.039    1.000      -.314
                      HADS Depression                  .482               -.071    -.004    -.314      1.000
Sig. (1-tailed)       sleepy & assoc sensations scale  .                  .001     .017     .000       .000
                      Sex                              .001               .        .393     .037       .124
                      Age                              .017               .393     .        .271       .473
                      physical fitness                 .000               .037     .271     .          .000
                      HADS Depression                  .000               .124     .473     .000       .
N                     sleepy & assoc sensations scale  251                251      230      247        249
                      Sex                              251                271      248      266        269
                      Age                              230                248      248      243        246
                      physical fitness                 247                266      243      266        265
                      HADS Depression                  249                269      246      265        269
Variables Entered/Removed (b)
Model   Variables Entered                       Variables Removed   Method
1       age, sex (a)                            .                   Enter
2       HADS Depression, physical fitness (a)   .                   Enter
a. All requested variables entered.
b. Dependent Variable: sleepy & assoc sensations scale
Model Summary (c)
                                                Std. Error of   Change Statistics
Model   R          R Square   Adjusted R Square  the Estimate    R Square Change   F Change   df1   df2   Sig. F Change
1       .245 (a)   .060       .052               10.243          .060              7.267      2     227   .001
2       .541 (b)   .293       .280               8.927           .232              36.948     2     225   .000
a. Predictors: (Constant), age, sex
b. Predictors: (Constant), age, sex, HADS Depression, physical fitness
c. Dependent Variable: sleepy & assoc sensations scale
ANOVA (c)
Model            Sum of Squares   df    Mean Square   F        Sig.
1   Regression   1524.909         2     762.455       7.267    .001 (a)
    Residual     23817.621        227   104.923
    Total        25342.530        229
2   Regression   7413.343         4     1853.336      23.258   .000 (b)
    Residual     17929.187        225   79.685
    Total        25342.530        229
a. Predictors: (Constant), age, sex
b. Predictors: (Constant), age, sex, HADS Depression, physical fitness
c. Dependent Variable: sleepy & assoc sensations scale
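The Change Statistics in the Model Summary follow directly from the two ANOVA blocks. As a rough check (a sketch only, using the standard formula for an F test on the increment in R square, with illustrative variable names), the R square change and F change for Step 2 can be recomputed in Python from the sums of squares above.

```python
# Values copied from the hierarchical ANOVA table.
ss_total = 25342.530
ss_reg_step1, df_reg_step1 = 1524.909, 2     # age, sex
ss_reg_step2, df_reg_step2 = 7413.343, 4     # plus fitness and depression
ss_res_step2, df_res_step2 = 17929.187, 225

r2_step1 = ss_reg_step1 / ss_total
r2_step2 = ss_reg_step2 / ss_total

# Extra variance explained by the Block 2 variables.
r2_change = r2_step2 - r2_step1

# F change: added regression sum of squares per added predictor,
# divided by the residual mean square of the larger model.
df_change = df_reg_step2 - df_reg_step1
f_change = ((ss_reg_step2 - ss_reg_step1) / df_change) / (ss_res_step2 / df_res_step2)

print(round(r2_step1, 3), round(r2_step2, 3),
      round(r2_change, 3), round(f_change, 3))
```

Rounded to three decimals, the results match the .060, .293, .232 and 36.948 shown in the Model Summary, with df1 = 2 and df2 = 225.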
Coefficients (a)
                     Unstandardized        Standardized
                     Coefficients          Coefficients                    Correlations                   Collinearity Statistics
Model                B         Std. Error  Beta          t        Sig.     Zero-order  Partial  Part     Tolerance  VIF
1  (Constant)        33.184    2.521                     13.162   .000
   Sex               -4.246    1.359       -.201         -3.123   .002     -.199       -.203    -.201    1.000      1.000
   Age               -.120     .053        -.144         -2.240   .026     -.141       -.147    -.144    1.000      1.000
2  (Constant)        32.211    3.481                     9.255    .000
   Sex               -3.323    1.193       -.157         -2.786   .006     -.199       -.183    -.156    .986       1.014
   Age               -.121     .047        -.146         -2.604   .010     -.141       -.171    -.146    .998       1.002
   physical fitness  -.731     .364        -.119         -2.008   .046     -.267       -.133    -.113    .892       1.121
   HADS Depression   1.522     .208        .433          7.329    .000     .482        .439     .411     .900       1.111
a. Dependent Variable: sleepy & assoc sensations scale
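The standardized coefficients (Beta) in Model 2 can be recovered from the unstandardized coefficients (B) with the usual identity Beta = B x (SD of predictor / SD of outcome). The Python sketch below is an illustration rather than part of the SPSS output. It assumes the standard deviations printed in the Descriptive Statistics table are the ones SPSS used, and the dictionary layout is purely for this example.

```python
# Standard deviation of the dependent variable (totSAS), from Descriptive Statistics.
sd_outcome = 10.520

# predictor name -> (unstandardized B from Model 2, SD of the predictor)
predictors = {
    "sex": (-3.323, 0.498),
    "age": (-0.121, 12.684),
    "physical fitness": (-0.731, 1.717),
    "HADS Depression": (1.522, 2.993),
}

# Beta = B * SD_x / SD_y puts every predictor on the same standard-deviation scale.
betas = {name: b * sd_x / sd_outcome for name, (b, sd_x) in predictors.items()}

# The predictor with the largest absolute Beta makes the strongest unique contribution.
strongest = max(betas, key=lambda name: abs(betas[name]))

for name, beta in betas.items():
    print(f"{name}: {beta:.3f}")
print("strongest:", strongest)
```

The recomputed values round to -.157, -.146, -.119 and .433, in line with the Beta column, and HADS Depression has the largest absolute Beta.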
Excluded Variables (b)
                                                                   Collinearity Statistics
Model                  Beta In     t        Sig.   Partial Correlation   Tolerance   VIF     Minimum Tolerance
1   physical fitness   -.254 (a)   -4.046   .000   -.260                 .987        1.014   .987
    HADS Depression    .470 (a)    8.303    .000   .483                  .995        1.005   .995
a. Predictors in the Model: (Constant), age, sex
b. Dependent Variable: sleepy & assoc sensations scale
Collinearity Diagnostics (a)
                                                 Variance Proportions
Model  Dimension  Eigenvalue  Condition Index    (Constant)  Sex    Age    physical fitness  HADS Depression
1      1          2.522       1.000              .01         .06    .01
       2          .440        2.395              .02         .92    .03
       3          .038        8.112              .97         .02    .96
2      1          4.027       1.000              .00         .02    .00    .00               .02
       2          .539        2.734              .00         .69    .00    .00               .21
       3          .343        3.424              .00         .28    .02    .03               .58
       4          .071        7.509              .00         .01    .62    .34               .03
       5          .020        14.235             .99         .00    .35    .63               .16
a. Dependent Variable: sleepy & assoc sensations scale
Residuals Statistics (a)
                                    Minimum   Maximum   Mean    Std. Deviation   N
Predicted Value                     15.04     42.28     26.15   5.653            242
Std. Predicted Value                -1.933    2.853     .018    .993             242
Standard Error of Predicted Value   .816      2.269     1.281   .283             242
Adjusted Predicted Value            15.04     41.95     26.27   5.577            225
Residual                            -24.781   19.438    .178    8.839            225
Std. Residual                       -2.776    2.178     .020    .990             225
Stud. Residual                      -2.810    2.207     .020    1.002            225
Deleted Residual                    -25.393   19.980    .172    9.050            225
Stud. Deleted Residual              -2.854    2.226     .019    1.006            225
Mahal. Distance                     .919      13.804    3.949   2.314            242
Cook's Distance                     .000      .075      .005    .009             225
Centered Leverage Value             .004      .060      .017    .010             242
a. Dependent Variable: sleepy & assoc sensations scale
STEP 1: Evaluating the model:
After the variables in Block 1 (age and sex) have been entered, the model explains 6.0% of the variance in totSAS (R square = .060; adjusted R square = .052). After Block 2 (fitrate and HADS Depression) has been included, the model as a whole explains 29.3% (R square = .293; adjusted R square = .280).
How much of this overall variance is explained by the variables fitrate and HADS Depression after the effects of age and gender are removed? The R square change for Model 2 is .232. Fitrate and HADS Depression explain an additional 23% of the variance in totSAS, even when the effects of gender and age are statistically controlled for. This is a statistically significant contribution (Sig. F change = .000). The ANOVA table indicates that the model as a whole is statistically significant [F(4,225) = 23.26, p
(beta = -.157), age (beta = -.146), and fitness (beta = -.119). Sex (r = -.199), age (r = -.141), and physical fitness (r = -.267) each show a low negative correlation with totSAS. HADS Depression has a moderate positive correlation (r = .482) with totSAS.
Analyses conducted to ensure that there was no violation of the assumptions of sample size, multicollinearity, normality, homoscedasticity, and linearity can be found in the first section of this assignment.
Collinearity diagnostics: Tolerance values for each variable are large (.892-.998), indicating that each variable's multiple correlation with the other variables is low and suggesting a low probability of multicollinearity. The VIF values for each variable (1.002-1.121) are well below 10, which also indicates a low probability of multicollinearity. The Tolerance and VIF values therefore give no indication of multicollinearity.
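Tolerance and VIF are two views of the same quantity, since VIF = 1 / Tolerance. The brief Python sketch below recomputes the Model 2 VIF values from the Tolerance column and flags any variable that would breach the commonly cited cutoffs. The VIF cutoff of 10 comes from the text above; the Tolerance cutoff of .10 is an assumption added here as its usual companion.

```python
# Tolerance values for Model 2, copied from the Coefficients table.
tolerances = {
    "sex": 0.986,
    "age": 0.998,
    "physical fitness": 0.892,
    "HADS Depression": 0.900,
}

# VIF is the reciprocal of Tolerance.
vifs = {name: 1 / tol for name, tol in tolerances.items()}

# Assumed cutoffs: Tolerance below .10 or VIF above 10 suggests multicollinearity.
flagged = [name for name, tol in tolerances.items()
           if tol < 0.10 or vifs[name] > 10]

for name, vif in vifs.items():
    print(f"{name}: VIF = {vif:.3f}")
print("flagged:", flagged)
```

The recomputed VIF values round to 1.014, 1.002, 1.121 and 1.111, and no variable is flagged.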
Outliers: The final model contains 4 independent variables, so the critical chi-square value is 18.47, and the maximum MAH_1 value obtained (13.804) does not exceed this value. Had it done so, the researcher would examine the MAH_1 values in the SPSS Data View window and consider removing cases with large values. To determine whether an outlier has an influence on the results, the researcher would check Cook’s distance. If the maximum were > 1, the SPSS data file would need to be checked for all cases with a COO_1 value > 1, and the researcher would need to consider removing those high-value cases. In this model, the maximum Cook’s distance is .075, which suggests that there are no problems with outliers. There is no major violation of the assumption regarding outliers.
Be sure to answer your research question:
1. If we control for the possible effect of gender and age, is our set of variables (physical fitness and depression) still able to predict a significant amount of the variance in daytime sleepiness (totSAS)? Yes. When controlling for gender and age, physical fitness and depression explain an additional 23% of the variance in daytime sleepiness.
Narrative Summary:
Hierarchical multiple regression was used to assess the ability of two measures (fitrate and HADS Depression) to predict sleepiness and associated sensations (totSAS), after controlling for age and gender. Preliminary analyses were conducted to ensure that there was no violation of the assumptions of normality, linearity, multicollinearity, and homoscedasticity. Gender and age were entered at Step 1, explaining 6.0% of the variance in totSAS (R square = .060; adjusted R square = .052). After entry of fitrate and the HADS Depression scale at Step 2, the total variance explained by the model as a whole was 29.3% (R square = .293; adjusted R square = .280), [F(4,225) = 23.26, p
(beta=.433, p