considering autocorrelation in predictive models - Department of ...
Definition of the Problem
• Non-linear relationships may exist between species and environments, but these relationships may
be incorrectly modeled as linear;
• Classical statistical modeling may fail to identify the relationships between different
kinds of data if their spatial arrangement is not taken into account (Besag, 1974);
• The spatial resolution of data should be taken into account: coarser grains lead to spatial smoothing
of data.
Spatial autocorrelation has many interpretations. We present some of them (Dubin, 1998):
• As a nuisance parameter, inserted in a model specification because its presence is necessary for a
good description, although it is not the topic of interest.
• As self-correlation, arising from the geographical context within which attribute values occur.
• As a map pattern, viewed in terms of trends, gradients or mosaics across a map.
• As a diagnostic tool for proper sampling design, model misspecification, nonconstant variance or
outliers.
• As redundant (duplicate) information in geographical data, connected to the estimation of missing values,
as well as to the notion of effective sample size and degrees of freedom.
• As a missing-variables indicator/surrogate, popular in spatial econometrics (Anselin, 1988).
• As an outcome of areal unit demarcation in statistical analysis.
Consequently, when analyzing spatial data, it is important to check for autocorrelation. If there is no
evidence of spatial autocorrelation, then proceeding with a standard approach is acceptable. However, if
there is evidence of spatial autocorrelation, then one of the underlying assumptions of your analysis may
be violated and your results may not be valid.
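As an illustration of such a check, the sketch below computes global Moran's I, one commonly used statistic for spatial autocorrelation. This section does not name a specific test, so the choice of Moran's I is an assumption made here for illustration. The function names `morans_i` and `chain_weights` and the binary adjacency weights for sites on a line are likewise hypothetical. Values near zero suggest no spatial autocorrelation, positive values suggest that neighbouring sites are similar, and negative values suggest that they are dissimilar.

```python
import numpy as np

def morans_i(values, weights):
    """Global Moran's I for n observations and an n x n spatial weight matrix.

    I = (n / S0) * sum_ij w_ij (x_i - m)(x_j - m) / sum_i (x_i - m)^2,
    where m is the sample mean and S0 is the sum of all weights.
    """
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    n = x.size
    if w.shape != (n, n):
        raise ValueError("weights must be an n x n matrix")
    dev = x - x.mean()          # deviations from the mean
    s0 = w.sum()                # total weight
    return (n / s0) * (dev @ w @ dev) / (dev @ dev)

def chain_weights(n):
    """Binary weights for n sites on a line (hypothetical layout):
    two sites are neighbours if they are adjacent."""
    w = np.zeros((n, n))
    idx = np.arange(n - 1)
    w[idx, idx + 1] = 1.0
    w[idx + 1, idx] = 1.0
    return w

# A smooth gradient (similar neighbours) versus an alternating pattern.
w = chain_weights(4)
print(morans_i([1.0, 2.0, 3.0, 4.0], w))
print(morans_i([1.0, -1.0, 1.0, -1.0], w))
```

In practice the weight matrix would encode the actual neighbourhood structure of the data, and the observed statistic would be compared with its distribution under random permutation of the values before drawing a conclusion.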
Spatial autocorrelation is more complicated than temporal autocorrelation, as it can occur in any
direction. Moreover, the simultaneous presence of spatial and temporal autocorrelation
is more complex still.
Spatio-Temporal Autocorrelation
In nature, data are commonly affected not only by spatial autocorrelation,
but also by temporal autocorrelation at the same time. In that case, we speak of
spatio-temporal autocorrelation, a special case of autocorrelation.
Spatio-temporal autocorrelation is a property of a random variable whose values, at pairs of locations
a certain distance apart in space and time, are more similar or less similar than expected for pairs
of observations at randomly selected locations and times. It is the actual correlation among values of a
variable strictly due to their proximity in location and time.
For a continuous variable X, measured at locations separated by a spatial offset (∆a, ∆b) and at times
separated by a lag τ, the spatio-temporal autocorrelation can be expressed as:
$$
\mathrm{SpatioTemporal\_AC}(\Delta a, \Delta b, \tau) =
\frac{E\left[(X_{a,b,k} - \mu_x)(X_{a+\Delta a,\, b+\Delta b,\, k+\tau} - \mu_x)\right]}
     {E\left[(X_{a,b,k} - \mu_x)^2\right]}
\tag{2.22}
$$
where $d = \sqrt{(\Delta a)^2 + (\Delta b)^2}$ is the spatial distance between the values of the variable X at the two locations,
τ is the temporal lag, E is the expected value operator and $\mu_x$ is the mean of the variable X.
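The expression above can be estimated from data by replacing the expectations with sample averages. The following is a minimal sketch under several assumptions that are not stated in the text: the observations lie on a regular grid stored as an array `x[a, b, k]`, the offsets are non-negative integers, a single global mean applies (stationarity), and the covariance is normalised by the sample variance, as is usual for an autocorrelation coefficient. The function name `spatio_temporal_ac` is hypothetical.

```python
import numpy as np

def spatio_temporal_ac(x, da, db, tau):
    """Sample spatio-temporal autocorrelation of a gridded variable.

    x   : array indexed as x[a, b, k] (two spatial axes, one time axis)
    da  : spatial offset along the first axis  (non-negative integer)
    db  : spatial offset along the second axis (non-negative integer)
    tau : temporal lag                          (non-negative integer)
    """
    x = np.asarray(x, dtype=float)
    if x.ndim != 3:
        raise ValueError("x must be a 3-D array indexed as x[a, b, k]")
    if min(da, db, tau) < 0:
        raise ValueError("offsets must be non-negative in this sketch")
    na, nb, nk = x.shape
    if da >= na or db >= nb or tau >= nk:
        raise ValueError("offset exceeds grid size")

    dev = x - x.mean()                          # X - mu_x
    base = dev[:na - da, :nb - db, :nk - tau]   # X[a, b, k]
    shifted = dev[da:, db:, tau:]               # X[a+da, b+db, k+tau]
    cov = np.mean(base * shifted)               # numerator estimate
    var = np.mean(dev ** 2)                     # variance estimate (assumed normalisation)
    return cov / var

# Usage on synthetic data: independent noise should give values near zero
# at any non-zero offset, and exactly 1 at zero offset.
rng = np.random.default_rng(0)
noise = rng.normal(size=(20, 20, 30))
print(spatio_temporal_ac(noise, 0, 0, 0))
print(spatio_temporal_ac(noise, 1, 0, 2))
```

Averaging the products over only the overlapping part of the grid keeps the sketch simple; for large offsets relative to the grid size few pairs remain, so the estimate becomes noisy.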