Final Technical Report: - Southwest Fisheries Science Center - NOAA

4.2 Modeling Framework: GLM and GAM

4.2.1 Comparisons of GAM Algorithms

During the comparison of GAM algorithms, we found a bug in the step.gam function of the R package gam that had not previously been reported to the R mailing lists and was unknown to the package developer (pers. comm. with Hastie). The bug prevented step.gam from including the offset term for survey effort in any encounter rate model examined during the stepwise search. As a result, we modeled only group size (and not encounter rates) using the step.gam algorithm from R package gam.
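To see why a missing effort offset rules out encounter rate modeling, the sketch below fits the simplest possible case. It assumes a Poisson model with a log link and an intercept only, which the text above does not specify; the function name, the counts, and the effort values are all hypothetical. With an offset of log(effort), the fitted intercept is the log of sightings per unit effort; without it, the same data yield the log of sightings per segment, which is not a rate.

```python
import math

def fit_intercept_only(counts, effort=None):
    """Closed-form MLE of b0 in log(mu_i) = b0 + log(effort_i), Poisson counts.

    Assumption for illustration only: intercept-only Poisson model, log link.
    With effort=None the offset is dropped (every effort is treated as 1).
    """
    if effort is None:
        effort = [1.0] * len(counts)
    # Score equation: sum(y_i) = exp(b0) * sum(effort_i)
    return math.log(sum(counts) / sum(effort))

# Hypothetical survey segments: sightings and kilometers searched
counts = [2, 0, 5, 1]
effort_km = [10.0, 5.0, 40.0, 5.0]

rate_with_offset = math.exp(fit_intercept_only(counts, effort_km))  # per km
mean_without_offset = math.exp(fit_intercept_only(counts))          # per segment

print(f"with offset:    {rate_with_offset:.4f} sightings per km")
print(f"without offset: {mean_without_offset:.4f} sightings per segment")
```

Because the segments differ in length, the two quantities are not interchangeable, so a stepwise search that silently drops the offset compares models of the wrong response.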

The group size GAMs built using the S-PLUS and R package gam algorithms were essentially identical: the best models contained exactly the same predictor variables and associated degrees of freedom, and the parameterizations of the smoothing splines were identical, except for small differences that were likely due to the numerical precision of the software platforms.

GAMs built using R package mgcv were more variable. The mgcv gam algorithm allows users to adjust more parameters and settings when building models than the S-PLUS analogue does. For the knowledgeable user, this flexibility enables fine-tuning of the GAMs. On the other hand, the numerous adjustable arguments make the algorithm less user-friendly, because more time must be invested in learning how to build appropriate models.

Tables 10 and 11 show the range of encounter rate and group size models, respectively, selected as the final model by mgcv gam under each specified combination of settings for the gam.method, smoothing spline, and gamma arguments. The paired models for each species/response variable provided in these tables were chosen based on the sum of the absolute values of the deviations of the observed-to-predicted ratios of the response variable in the geographic strata shown in Figure 7. The “simple models” in Tables 10 and 11 are the models having relatively few effective degrees of freedom and the smallest sum of absolute deviations of the observed-to-predicted ratios. Similarly, the “complex models” are those having a relatively large number of effective degrees of freedom in addition to good agreement between observed and predicted values of the response variable. For cases in which a single model clearly outperformed all of the others, only one model is presented in the table.
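The selection criterion can be sketched as follows. This is an illustration under a stated assumption: the deviation of each stratum's observed-to-predicted ratio is taken to be its distance from 1 (perfect agreement), which the text implies but does not spell out. The function name and all stratum values are hypothetical.

```python
def sum_abs_ratio_deviation(observed, predicted):
    """Sum over strata of |observed / predicted - 1|.

    Assumption: "deviation" means distance of the ratio from 1.
    """
    return sum(abs(o / p - 1.0) for o, p in zip(observed, predicted))

# Hypothetical per-stratum totals of the response variable
observed = [12.0, 30.0, 8.0]
predicted_by_model = {
    "model_a": [10.0, 32.0, 10.0],
    "model_b": [12.0, 29.0, 8.0],
}

scores = {
    name: sum_abs_ratio_deviation(observed, pred)
    for name, pred in predicted_by_model.items()
}
best = min(scores, key=scores.get)  # smallest sum of absolute deviations

for name, score in scores.items():
    print(f"{name}: {score:.4f}")
print("preferred:", best)
```

A ratio-based score weights each stratum equally regardless of how many animals it contains, so a poor prediction in a sparsely populated stratum counts as much as one in a dense stratum.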

The variability in model complexity can be illustrated using the rough-toothed dolphin encounter rate models, where the preferred simple model had 8.9 degrees of freedom and the preferred complex model had over fifty degrees of freedom. The sum of absolute deviations of the observed-to-predicted ratios is smaller for the complex model. This is to be expected because the data used for predictions were also used to build the models; in this scenario, a more complex model can track the data more closely and is therefore more likely to exhibit fidelity to them.
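The effect of evaluating models on their own training data can be shown with a toy example. The sketch below does not use a GAM: it swaps in a piecewise-constant least-squares fit, with the number of bins standing in for effective degrees of freedom, and the data are hypothetical. Because each finer binning nests the coarser one, the in-sample error can only stay the same or shrink as complexity grows.

```python
def in_sample_sse(y, n_bins):
    """Sum of squared errors of a piecewise-constant fit, scored on the
    same data used to fit it. Bins are contiguous, equal-sized index ranges."""
    n = len(y)
    sse = 0.0
    for b in range(n_bins):
        chunk = y[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue
        mean = sum(chunk) / len(chunk)  # least-squares fit within the bin
        sse += sum((v - mean) ** 2 for v in chunk)
    return sse

# Hypothetical response values ordered along a single predictor
y = [1.0, 3.0, 2.0, 5.0, 4.0, 7.0, 6.0, 9.0]

# 1, 2, 4, and 8 bins form nested partitions of the 8 observations
errors = {k: in_sample_sse(y, k) for k in (1, 2, 4, 8)}
for k, err in errors.items():
    print(f"{k} bins: in-sample SSE = {err:.3f}")
```

With one bin per observation the in-sample error reaches zero, which says nothing about how the fit would perform on data that were withheld from model building.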
