14.03.2014 Views

Modeling and Multivariate Methods - SAS


Chapter 3 Fitting Standard Least Squares Models

Regression Reports

should consider adding interaction terms, if appropriate, or try to better capture the functional form of a regressor.

Table 3.7 shows information about the calculations in the Lack of Fit report.

Table 3.7 Description of the Lack of Fit Report

Source
Lists the three sources of variation: Lack of Fit, Pure Error, and Total Error.

DF
Records an associated DF for each source of error:

• The DF for Total Error are also found on the Error line of the Analysis of Variance table. It is the difference between the C. Total DF and the Model DF found in that table. The Error DF is partitioned into degrees of freedom for lack of fit and for pure error.

• The Pure Error DF is pooled from each group where there are multiple rows with the same values for each effect. For example, in the sample data, Big Class.jmp, there is one instance where two subjects have the same values of age and weight (Chris and Alfred are both 14 and have a weight of 99). This gives 1(2 - 1) = 1 DF for Pure Error. In general, if there are g groups having multiple rows with identical values for each effect, the pooled DF, denoted $DF_p$, can be calculated as follows:

$$DF_p = \sum_{i=1}^{g} (n_i - 1)$$

where $n_i$ is the number of replicates in the $i$th group.

• The Lack of Fit DF is the difference between the Total Error and Pure Error DFs.

Sum of Squares
Records an associated sum of squares (SS) for each source of error:

• The Total Error SS is the sum of squares found on the Error line of the corresponding Analysis of Variance table.

• The Pure Error SS is pooled from each group where there are multiple rows with the same values for each effect. This estimates the portion of the true random error that is not explained by model effects. In general, if there are g groups having multiple rows with like values for each effect, the pooled SS, denoted $SS_p$, is calculated as follows:

$$SS_p = \sum_{i=1}^{g} SS_i$$

where $SS_i$ is the sum of squares for the $i$th group corrected for its mean.

• The Lack of Fit SS is the difference between the Total Error and Pure Error sum of squares. If the lack of fit SS is large, it is possible that the model is not appropriate for the data. The F-ratio described below tests whether the variation due to lack of fit is small, relative to the Pure Error.
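The pooling described in Table 3.7 can be illustrated with a short Python sketch. This is not JMP code and not how JMP computes the report. The function names `pure_error` and `lack_of_fit`, the data rows, and the Total Error values are all hypothetical, chosen only to make the arithmetic easy to follow. Rows are grouped by identical effect values, each group with replicates contributes n_i - 1 degrees of freedom and its mean-corrected sum of squares, and the Lack of Fit figures are obtained by subtraction from Total Error.

```python
from collections import defaultdict


def pure_error(rows):
    """Pool pure-error DF and SS over groups of rows with identical effect values.

    rows: iterable of (effects, response) pairs, where effects is a tuple
    holding the value of every model effect for that row.
    """
    groups = defaultdict(list)
    for effects, response in rows:
        groups[effects].append(response)

    df_p = 0
    ss_p = 0.0
    for responses in groups.values():
        n_i = len(responses)
        if n_i < 2:
            continue  # a group without replicates adds nothing to pure error
        mean_i = sum(responses) / n_i
        df_p += n_i - 1
        ss_p += sum((y - mean_i) ** 2 for y in responses)
    return df_p, ss_p


def lack_of_fit(total_error_df, total_error_ss, df_p, ss_p):
    """Lack of Fit DF and SS as Total Error minus Pure Error."""
    return total_error_df - df_p, total_error_ss - ss_p


# Made-up rows: (effect values, response). Two groups have replicates.
rows = [
    ((12, 85), 59.0),
    ((13, 105), 65.0),
    ((13, 105), 63.0),
    ((13, 105), 61.0),
    ((14, 99), 64.0),
    ((14, 99), 62.0),
    ((15, 112), 66.0),
    ((16, 128), 68.0),
]

df_p, ss_p = pure_error(rows)
print(df_p, ss_p)  # 3 10.0

# Placeholder Total Error values; in practice these would be read from the
# Error line of the Analysis of Variance table, not typed in by hand.
total_error_df = 5
total_error_ss = 16.5
print(lack_of_fit(total_error_df, total_error_ss, df_p, ss_p))  # (2, 6.5)
```

In this made-up data, the group of three replicates contributes 2 DF and the group of two contributes 1 DF, so the pooled Pure Error DF is 3. Rows without replicates are skipped because they carry no information about pure error.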
