A Step by Step Guide for SPSS and Exercise Studies
Statistical tests 125
outliers in other analyses (e.g., MANOVA). In such cases, the dependent variable should be a separate column in the data file with the case numbers. Cook's distance shows how much the regression coefficients would change if a particular case were omitted. Norusis (1998) suggests that Cook's distances greater than 1 usually deserve scrutiny, as such cases may be too influential. Leverage values also measure multivariate outliers. This distance measure ranges from 0 to close to 1, with greater values indicating potential outliers. As a rule of thumb, Norusis (1998) suggests looking at values greater than 2p/N, where p is the number of independent variables and N is the number of cases. However, this rule of thumb flags too many cases in small samples.
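SPSS produces these measures behind the Save dialog, but it may help to see how they are defined. The following Python sketch is illustrative only (it is not part of SPSS; it assumes NumPy and uses the usual textbook formulas) and computes leverage values and Cook's distances for an ordinary least-squares fit:

```python
import numpy as np

def leverage_and_cooks(X, y):
    """Leverage (hat) values and Cook's distances for an OLS fit.

    X: (n,) or (n, p) array of independent variables (no constant);
    y: (n,) dependent variable. A constant term is added, as SPSS
    does by default.
    """
    n = len(y)
    Xc = np.column_stack([np.ones(n), X])
    k = Xc.shape[1]                      # parameters, incl. constant
    H = Xc @ np.linalg.inv(Xc.T @ Xc) @ Xc.T
    h = np.diag(H)                       # leverage: between 0 and 1
    resid = y - H @ y
    mse = resid @ resid / (n - k)
    # Cook's D: how much the coefficients would change if case i
    # were omitted, expressed through its residual and leverage
    cooks = (resid**2 / (k * mse)) * h / (1 - h) ** 2
    return h, cooks
```

Cases with a Cook's distance above 1, or a leverage value above 2p/N, would then be candidates for scrutiny, following the rules of thumb above. Note that SPSS may report a centred version of leverage; the sketch uses the plain hat-matrix diagonal.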
The Save option also contains a number of Influence Statistics. These statistics identify cases that exert considerable influence on the calculation of various coefficients. DfBeta(s) show how much the regression coefficient of each independent variable and the constant term would change if a particular case were excluded from the analysis. Standardized DfBeta(s) contain the same information for standardised regression coefficients. Norusis (1998) proposes another rule of thumb, which states that cases should be scrutinised if they have absolute standardised values greater than 2/√N. DfFit shows the change in the predicted value of the dependent variable if a particular case is omitted. Standardized DfFit shows the standardised changes in the predicted values. Again, you can use the 2/√N rule to identify influential cases.
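These leave-one-out definitions can be made concrete. The sketch below is illustrative rather than SPSS output (it assumes NumPy and literally refits the model without each case in turn, which is fine for small data sets) and computes standardized DfBeta and DfFit values:

```python
import numpy as np

def standardized_dfbeta_dffit(X, y):
    """Standardized DfBeta(s) and DfFit by literal case deletion."""
    n = len(y)
    Xc = np.column_stack([np.ones(n), X])
    k = Xc.shape[1]
    XtX_inv = np.linalg.inv(Xc.T @ Xc)
    b = XtX_inv @ Xc.T @ y               # full-sample coefficients
    h = np.einsum('ij,jk,ik->i', Xc, XtX_inv, Xc)  # leverages
    dfbetas = np.empty((n, k))
    dffits = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        b_i, *_ = np.linalg.lstsq(Xc[keep], y[keep], rcond=None)
        resid_i = y[keep] - Xc[keep] @ b_i
        s_i = np.sqrt(resid_i @ resid_i / (n - 1 - k))  # deleted s
        # change in each coefficient, standardised by its std. error
        dfbetas[i] = (b - b_i) / (s_i * np.sqrt(np.diag(XtX_inv)))
        # change in the predicted value for case i, standardised
        dffits[i] = (Xc[i] @ b - Xc[i] @ b_i) / (s_i * np.sqrt(h[i]))
    return dfbetas, dffits
```

Cases with absolute standardised values above 2/√N would then warrant a closer look, per the rule of thumb above.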
In Dialog box 82 click Plots (Dialog box 86). This option can be used to examine the assumptions underlying the regression analysis and to identify outliers and influential cases. A number of scatterplots can be plotted using the dependent variable (DEPENDNT), the standardised predicted values of the dependent variable (ZPRED), the standardised residuals (ZRESID), the residuals for a case when this case is excluded from the regression (DRESID), the predicted value of a case when the latter is excluded from the regression (ADJPRED), the studentized residuals (SRESID), and the studentized residuals for a case when it is excluded (deleted) from the regression (SDRESID). To obtain a bivariate scatterplot with any of the above variables, move one of them into the Y box and the other into the X box. To create more than one scatterplot, use the Next button.
Norusis (1998) suggests a number of scatterplots for examining the assumptions of regression analysis. For example, to check the assumption of homoscedasticity, you can plot ZPRED against DEPENDNT. However, it is easier to examine this assumption if you plot the residuals against the predicted values. Norusis (1998) recommends the use of SRESID, as these should be normally distributed with a relatively large sample size. SRESID can be plotted against ZPRED (for an example of a similar plot, see Figure 25). Note that if you save the residuals in the data file (using the Save option), you can plot them against each of the independent variables. To create such plots, use the Simple Scatterplot option in the Graphs menu. If the linearity assumption is met, such plots should not show any patterns; if they do, the relationship between the dependent variable and the particular independent variable is probably not linear.
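If you prefer to inspect these diagnostics outside SPSS, the quantities behind such a plot can be reproduced directly. The sketch below is illustrative (it assumes NumPy and uses internally studentized residuals); it computes ZPRED and SRESID, which you could then pass to any scatterplot routine:

```python
import numpy as np

def zpred_sresid(X, y):
    """Standardised predicted values (ZPRED) and studentized
    residuals (SRESID) for a residuals-vs-predicted plot."""
    n = len(y)
    Xc = np.column_stack([np.ones(n), X])
    k = Xc.shape[1]
    H = Xc @ np.linalg.inv(Xc.T @ Xc) @ Xc.T
    pred = H @ y
    resid = y - pred
    s2 = resid @ resid / (n - k)          # residual variance
    zpred = (pred - pred.mean()) / pred.std(ddof=1)
    sresid = resid / np.sqrt(s2 * (1 - np.diag(H)))
    return zpred, sresid
```

A funnel or fan shape in the scatter of SRESID against ZPRED suggests heteroscedasticity, and a curved pattern suggests non-linearity; with the assumptions met, the points should scatter evenly around zero.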