A spatial, climate-determined risk rating for Scleroderris disease of ...
1400 Can. J. For. Res. Vol. 28, 1998
Table 2. Diagnostics from four logistic regression analyses: backward stepwise selection (11 variables), XTMPCLDQ and PRPCLDQ together, XTMPCLDQ alone, and PRPCLDQ alone.
al. (1996) generated mathematical climate surfaces for the province
using the thin-plate smoothing-spline techniques of Hutchinson
(1991). These monthly surfaces were created as a function of
latitude, longitude, and elevation to capture both spatial and temporal
variations in lapse rates. Errors associated with the surfaces
are approximately ±0.5°C for temperature-related variables and
10–20 mm for precipitation.
These surfaces enable us to append climate variables to
georeferenced (latitude, longitude, elevation) historical field survey
data. The climate variables can also be mapped by coupling the
mathematical surfaces to a digital elevation model (a regular grid
of latitude, longitude, and elevation representing the topography of
an area; see Moore et al. 1991). A digital elevation model for Ontario
has been developed from the National Topographic Series
1:250 000 digital topographic data, resolved at a 1-km grid. Mathematical
surfaces were developed for long-term monthly means
of minimum temperature, maximum temperature, and
precipitation. Secondary climate variables that are likely to reflect
processes determining distributions of organisms were derived
from these primary surfaces (see Mackey et al. 1996 for further details
on climate modelling).
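As an illustration of how a secondary variable might be derived from the primary monthly surfaces, the sketch below computes the two coldest-quarter variables used later in the analysis for a single location. It rests on assumptions that the text does not state: a quarter is taken to be any three consecutive months (wrapping from December to January), monthly mean temperature is taken as the midpoint of the monthly minimum and maximum, and the function name `coldest_quarter` and the monthly values are invented for the example.

```python
def coldest_quarter(tmin, tmax, precip):
    """Return (start_month_index, mean_temp, total_precip) for the coldest quarter.

    tmin, tmax, precip: 12 long-term monthly values each (January..December).
    Assumption: a quarter is any 3 consecutive months, wrapping Dec -> Jan.
    """
    if not (len(tmin) == len(tmax) == len(precip) == 12):
        raise ValueError("expected 12 monthly values for each input")
    # Assumption: monthly mean temperature is the midpoint of min and max.
    tmean = [(lo + hi) / 2.0 for lo, hi in zip(tmin, tmax)]
    best = None
    for start in range(12):
        months = [(start + k) % 12 for k in range(3)]
        q_temp = sum(tmean[m] for m in months) / 3.0
        if best is None or q_temp < best[1]:
            # Precipitation is summed over the same three months.
            q_prec = sum(precip[m] for m in months)
            best = (start, q_temp, q_prec)
    return best


# Made-up monthly values for one hypothetical location (deg C and mm).
tmin = [-20, -18, -11, -2, 4, 9, 12, 11, 6, 1, -6, -15]
tmax = [-8, -6, 0, 9, 17, 22, 25, 23, 18, 10, 2, -5]
precip = [50, 40, 55, 60, 75, 85, 90, 85, 90, 75, 70, 60]

start, xtmpcldq, prpcldq = coldest_quarter(tmin, tmax, precip)
print(start, round(xtmpcldq, 2), prpcldq)
```

Applied to every cell of a gridded set of monthly values, the same calculation would yield mapped versions of the two variables.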
Analysis<br />
We used logistic regression analysis to examine the probability
of occurrence of Scleroderris disease as a function of aspects of
temperature and precipitation. For each sample location where
Scleroderris disease was confirmed as present or absent, we attached
an estimate of each of the climate variables listed in
Table 1. The severity of the disease was not considered, because all plantations
are at risk once the disease becomes established. We ran
several different models. First, we used all climate variables in a
backward stepwise selection procedure to identify the best
predictive ability obtainable from these explanatory variables,
using α = 0.05 as the criterion for removing a variable. Then we
selected two variables (the mean temperature of the coldest quarter
and the precipitation of the coldest quarter) that we anticipated
would explain variance in the probability of occurrence, based on
our experience and on previously published work on the ecology
of the disease. We compared these two models using concordance
(an index of classification accuracy). We were attempting to identify
the most parsimonious model without incurring a large loss in
our ability to predict (based on classification). We then ran models
for each of the two selected climate variables separately and compared
their concordance with that of the two-variable model to evaluate the
usefulness of these variables on their own. Once we identified our
final model, we examined it in more detail by randomly
splitting the original data in half, fitting the model to one half of
the data, and generating a classification table from the other half.
We repeated this procedure 10 times and summarized the
results. This procedure examines the ability of the model to predict
the occurrence of the species for locations that are not included in
the model and is therefore a more rigorous test of classification accuracy.
This test also indicates how dependent these results are on
the individual observations that are included in the model. Lastly,
we used the regression equation from the final model to predict the
probability of occurrence over the entire area of Ontario using the
gridded estimates of climate. We compared this map of probability
of occurrence to the original map of the observed locations of the
disease.

| Diagnostics | Backward stepwise selection (11 variables) | XTMPCLDQ and PRPCLDQ | XTMPCLDQ (mean temperature in coldest quarter) | PRPCLDQ (precipitation in coldest quarter) |
|---|---|---|---|---|
| –2log L (intercept only) | 1534.2 | 1534.2 | 1534.2 | 1534.2 |
| –2log L (intercept + covariates) | 821.0 | 1089.9 | 1294.0 | 1520.9 |
| Chi-square for covariates | 713.2 | 444.3 | 240.3 | 13.3 |
| df (model) | 11 | 2 | 1 | 1 |
| P (model) | 0.0001 | 0.0001 | 0.0001 | 0.0003 |
| Concordance (%) | 91.4 | 84.4 | 74.0 | 57.6 |
| Best probability | 0.46 | 0.40 | 0.34 | 0.40 |
| Sensitivity | 85.6 | 76.8 | 60.8 | 56.7 |
| Specificity | 83.6 | 77.9 | 66.6 | 64.1 |
| False negatives | 22.3 | 30.1 | 45.1 | 48.6 |
| False positives | 10.4 | 16.6 | 28.3 | 31.2 |

Note: There were 1139 observations in the data set: 457 observations of presence and 682 observations of absence. See Table 1
for a description of the climate variables.
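The split-half check described in the Analysis section can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the procedure actually run for the paper: the data are synthetic, the two standardized predictors merely stand in for climate variables, the model is fitted by plain batch gradient ascent rather than a statistics package, the 0.5 cutoff is arbitrary, and all function names are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(w, x):
    # w[0] is the intercept; w[1:] are the slopes.
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_logistic(X, y, lr=1.0, epochs=200):
    """Approximate maximum-likelihood fit by batch gradient ascent."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(y)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = yi - predict(w, xi)
            grad[0] += err
            for j, xj in enumerate(xi, start=1):
                grad[j] += err * xj
        w = [wj + lr * g / n for wj, g in zip(w, grad)]
    return w


def classification_table(w, X, y, cutoff=0.5):
    """Counts of (true pos, false pos, true neg, false neg) at a cutoff."""
    tp = fp = tn = fn = 0
    for xi, yi in zip(X, y):
        predicted = 1 if predict(w, xi) >= cutoff else 0
        if predicted == 1 and yi == 1:
            tp += 1
        elif predicted == 1 and yi == 0:
            fp += 1
        elif predicted == 0 and yi == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def split_half_validation(X, y, repeats=10, seed=0):
    """Fit on a random half, classify the other half, repeat."""
    rng = random.Random(seed)
    idx = list(range(len(y)))
    accuracies = []
    for _ in range(repeats):
        rng.shuffle(idx)
        half = len(idx) // 2
        train, test = idx[:half], idx[half:]
        w = fit_logistic([X[i] for i in train], [y[i] for i in train])
        tp, fp, tn, fn = classification_table(
            w, [X[i] for i in test], [y[i] for i in test])
        accuracies.append(100.0 * (tp + tn) / len(test))
    return accuracies


# Synthetic, standardized predictors standing in for two climate variables.
rng = random.Random(42)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(300)]
y = [1 if rng.random() < sigmoid(-1.5 * a + 1.0 * b) else 0 for a, b in X]

accuracies = split_half_validation(X, y)
print("mean % correct:", round(sum(accuracies) / len(accuracies), 1))
print("range:", round(min(accuracies), 1), "to", round(max(accuracies), 1))
```

A narrow range across the 10 repeats would suggest that the classification results do not hinge on which observations happen to be in the fitting half.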
The results show a good match between the observed distribution
of Scleroderris disease (Fig. 1a) and the map of
probability of occurrence (Fig. 1b). We have not made predictions
for the northernmost parts of Ontario because of
the relative scarcity of appropriate hosts in that area and because
we have no data that sample that environmental
space. Disease distribution was found to be closely related to
certain climatic variables. With the backward stepwise procedure,
11 of the 23 climate variables were retained in the logistic
regression model (Table 1). The overall concordance
was 91.4%, an extremely high value, with a sensitivity (the
percentage of observed presences correctly predicted as present) of
85.6% and a specificity (the percentage of observed absences correctly
predicted as absent) of 83.6% (Table 2). This model has a
very good fit but is not very interpretable because it contains
11 explanatory variables. Also, more explanatory
variables result in a model that is increasingly dependent
on a particular set of observations.
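The diagnostics quoted above can be illustrated with a small sketch. It assumes that concordance is the usual "percent concordant" statistic, that is, the share of presence/absence pairs in which the presence receives the higher predicted probability; the text itself only calls it an index of classification accuracy. Sensitivity and specificity are computed as the percentages of observed presences and observed absences that are classified correctly at a chosen cutoff. The ten observations, their predicted probabilities, and the 0.4 cutoff are invented for the example.

```python
def percent_concordant(y, p):
    """Percentage of presence/absence pairs where the presence has the higher p.

    Tied pairs count as neither concordant nor discordant in this sketch.
    """
    pos = [pi for yi, pi in zip(y, p) if yi == 1]
    neg = [pi for yi, pi in zip(y, p) if yi == 0]
    concordant = sum(1 for a in pos for b in neg if a > b)
    return 100.0 * concordant / (len(pos) * len(neg))


def sensitivity_specificity(y, p, cutoff):
    """Percent of observed presences / absences classified correctly."""
    tp = sum(1 for yi, pi in zip(y, p) if yi == 1 and pi >= cutoff)
    fn = sum(1 for yi, pi in zip(y, p) if yi == 1 and pi < cutoff)
    tn = sum(1 for yi, pi in zip(y, p) if yi == 0 and pi < cutoff)
    fp = sum(1 for yi, pi in zip(y, p) if yi == 0 and pi >= cutoff)
    return 100.0 * tp / (tp + fn), 100.0 * tn / (tn + fp)


# Made-up observations (1 = disease present) and predicted probabilities.
y = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
p = [0.9, 0.7, 0.45, 0.2, 0.6, 0.3, 0.25, 0.15, 0.1, 0.05]

concordance = percent_concordant(y, p)
sens, spec = sensitivity_specificity(y, p, cutoff=0.4)
print(round(concordance, 1), round(sens, 1), round(spec, 1))
```

Concordance does not depend on the cutoff, whereas sensitivity and specificity trade off against each other as the cutoff moves.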
On the basis of our knowledge of the ecology of the fungus,
two variables, mean temperature of the coldest quarter
and precipitation in the coldest quarter, were tested in a separate
model. This two-variable model also fit well,
with a concordance of 84.4%, sensitivity of 76.8%, and
specificity of 77.9% (Table 2). This indicates that most of
the 11 explanatory variables in the first model do not
© 1998 NRC Canada