Analysing spatial point patterns in R<br />
Adrian Baddeley<br />
CSIRO and University of Western Australia<br />
Adrian.Baddeley@csiro.au<br />
adrian@maths.uwa.edu.au<br />
Workshop Notes<br />
Version 3<br />
October 2008<br />
Copyright © CSIRO 2008<br />
Abstract<br />
This is a detailed set of notes for a workshop on Analysing spatial point patterns that has<br />
been held several times in Australia and New Zealand in 2006–2008.<br />
It covers statistical methods that are currently feasible in practice and available in public<br />
domain software. Some of these techniques are well established in the applications literature,<br />
while some are very recent developments.<br />
The workshop uses the statistical package R and is based on spatstat, an add-on library<br />
for R for the analysis of spatial data.<br />
Topics covered include: statistical formulation and methodological issues; data input<br />
and handling; R concepts such as classes and methods; nonparametric intensity estimates;<br />
goodness-of-fit testing for Complete Spatial Randomness; maximum likelihood inference for<br />
Poisson processes; model validation for Poisson processes; distance methods and summary<br />
functions such as Ripley’s K function; non-Poisson point process models; simulation techniques;<br />
fitting models using summary statistics; Gibbs point process models; fitting Gibbs<br />
models; simulating Gibbs models; validating Gibbs models; multitype and marked point patterns;<br />
exploratory analysis of marked point patterns; multitype Poisson process models and<br />
maximum likelihood inference; multitype Gibbs process models and maximum pseudolikelihood;<br />
and line segment data.<br />
This version of the notes requires R version 2.7.0 or later, and spatstat version 1.14-5<br />
or later.<br />
Acknowledgements<br />
The author gratefully acknowledges countless comments and suggestions from workshop participants,<br />
and the support of CSIRO Mathematical and Information Sciences, The New<br />
Zealand Statistical Association, The University of Waikato, The Statistical Society<br />
of Australia and The University of Western Australia.<br />
Copyright © CSIRO Australia 2008<br />
All rights are reserved. Permission to reproduce individual copies of this document for<br />
personal use is granted. Redistribution in any other form is prohibited.<br />
The information contained in this document is based on a number of technical, circumstantial<br />
or otherwise specified assumptions and parameters. The user must make its own analysis and<br />
assessment of the suitability of the information or material contained in or generated from this<br />
document. To the extent permitted by law, CSIRO excludes all liability to any party for any<br />
expenses, losses, damages and costs arising directly or indirectly from using this document.<br />
Contents<br />
PART I. OVERVIEW<br />
1 Introduction<br />
2 Statistical formulation<br />
3 The R system<br />
4 Introduction to spatstat<br />
PART II. DATA TYPES<br />
5 Objects, classes and methods in R<br />
6 Point patterns in spatstat<br />
7 Windows in spatstat<br />
8 Manipulating point patterns<br />
9 Pixel images in spatstat<br />
10 Tessellations<br />
PART III. INTENSITY AND RANDOMNESS<br />
11 Methods 1: Investigating intensity<br />
12 Methods 2: Tests of Complete Spatial Randomness<br />
13 Methods 3: Maximum likelihood for Poisson processes<br />
14 Methods 4: checking a fitted Poisson model<br />
PART IV. INTERACTION<br />
15 Simple models of non-Poisson patterns<br />
16 Methods 5: Distance methods for point patterns<br />
17 Methods 6: simulation envelopes and goodness-of-fit tests<br />
18 Methods 7: model-fitting using summary statistics<br />
19 Methods 8: adjusting for inhomogeneity<br />
20 Gibbs models<br />
21 Methods 9: fitting Gibbs models<br />
22 Methods 10: validation of fitted Gibbs models<br />
PART IV. MARKED POINT PATTERNS<br />
23 Marked point patterns<br />
24 Handling marked point pattern data<br />
25 Methods 11: exploratory tools for marked point patterns<br />
26 Methods 12: multitype Poisson models<br />
27 Methods 13: Gibbs models for multitype point patterns<br />
28 Line segment data<br />
29 Further information on spatstat<br />
Bibliography<br />
Index<br />
PART I. OVERVIEW<br />
The first part of the workshop is a quick overview of spatial statistics for point patterns, and a<br />
very quick introduction to the software.<br />
1 Introduction<br />
1.1 Types of data<br />
1.1.1 Points<br />
A point pattern dataset gives the locations of objects/events occurring in a study region.<br />
The points could represent trees, animal nests, earthquake epicentres, petty crimes, domiciles<br />
of new cases of influenza, galaxies, etc.<br />
The points might be situated in a region of the two-dimensional (2D) plane, or on the Earth’s<br />
surface, or a 3D volume, etc. They could be points in space-time (e.g. earthquake epicentre<br />
location and time). The software presented here is only applicable to 2D point patterns (but<br />
we’re working on it).<br />
1.1.2 Marks<br />
The points may have extra information called marks attached to them. The mark represents an<br />
“attribute” of the point. The mark variable could be categorical, e.g. species or disease status:<br />
[Figure: point pattern with categorical marks; legend: off, on]<br />
The mark variable could be continuous, e.g. tree diameter.<br />
The mark could be multivariate, or even more complicated.<br />
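To make the idea of points, a study region and marks concrete, here is a minimal sketch in base R. It is only an illustration: the data values, the names `window`, `trees`, `status` and `diameter` are invented for this example, and this is not how spatstat stores point patterns.

```r
# Hypothetical toy data: coordinates in a 10 x 10 study region, with one
# categorical mark (status) and one continuous mark (diameter).
window <- list(xrange = c(0, 10), yrange = c(0, 10))
trees <- data.frame(
  x        = c(1.2, 3.4, 5.6, 7.8, 9.0),
  y        = c(2.1, 8.7, 4.3, 6.5, 0.9),
  status   = factor(c("off", "on", "on", "off", "on")),
  diameter = c(12.5, 30.1, 22.4, 8.9, 41.0)
)

# Every point must lie inside the study region.
inside <- trees$x >= window$xrange[1] & trees$x <= window$xrange[2] &
          trees$y >= window$yrange[1] & trees$y <= window$yrange[2]
stopifnot(all(inside))

# A categorical mark splits the pattern into sub-patterns, one per level.
by_status <- split(trees[, c("x", "y")], trees$status)
n_by_status <- sapply(by_status, nrow)
print(n_by_status)

# A continuous mark is summarised numerically.
print(summary(trees$diameter))
```

Keeping the study region alongside the coordinates matters because the same set of coordinates can represent different patterns depending on where points could have been observed.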
1.1.3 Covariates<br />
Our dataset may also include covariates: any data that we treat as explanatory, rather than<br />
as part of the ‘response’.<br />
Covariate data may be a spatial function Z(u) defined at all spatial locations u, e.g. altitude,<br />
soil pH, displayed as a pixel image or a contour plot:<br />
[Figure: a spatial covariate displayed as a pixel image and as a contour plot]<br />
Covariate data may be another spatial pattern such as another point pattern, or a line<br />
segment pattern, e.g. a map of geological faults:<br />
[Figure: map of geological faults]<br />
1.2 Typical scientific questions<br />
1.2.1 Intensity<br />
‘Intensity’ is the average density of points (expected number of points per unit area). Intensity<br />
may be constant (‘uniform’) or may vary from location to location (‘non-uniform’ or ‘inhomogeneous’).<br />
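As a rough illustration of this definition, the sketch below estimates a constant intensity as the number of points divided by the area, and then uses counts in sub-regions (quadrats) to show how an inhomogeneous pattern differs. It uses base R and simulated data; the unit-square region, the Beta(3, 1) distribution and the helper `quadrat_counts` are assumptions made for the example only.

```r
set.seed(1)

# Hypothetical study region: the unit square, so its area is 1.
area <- 1

# Uniform pattern: 200 points placed independently and uniformly.
n <- 200
unif <- data.frame(x = runif(n), y = runif(n))

# Inhomogeneous pattern: x is drawn from a Beta(3, 1) distribution, which
# puts more points near x = 1, so the density rises from left to right.
inhom <- data.frame(x = rbeta(n, 3, 1), y = runif(n))

# Estimate of a constant intensity: number of points per unit area.
lambda_hat <- n / area

# Quadrat counts: divide the square into a k x k grid and count points.
# Rows of the table index the x-interval, columns the y-interval.
quadrat_counts <- function(p, k = 2) {
  breaks <- seq(0, 1, length.out = k + 1)
  table(cut(p$x, breaks, include.lowest = TRUE),
        cut(p$y, breaks, include.lowest = TRUE))
}
counts_unif  <- quadrat_counts(unif)
counts_inhom <- quadrat_counts(inhom)

# Counts divided by quadrat area give a local intensity estimate.
print(counts_unif  / (area / 4))
print(counts_inhom / (area / 4))
```

For the uniform pattern the four local estimates should be broadly similar; for the inhomogeneous pattern the quadrats with larger x contain far more points.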
[Figure: two point patterns; panels: uniform, inhomogeneous]<br />
1.2.2 Interaction<br />
‘Interpoint interaction’ is stochastic dependence between the points in a point pattern. Usually<br />
we expect dependence to be strongest between points that are close to one another.<br />
[Figure: three point patterns; panels: independent, regular, clustered]<br />
Example 1 (Japanese pines) Locations of 65 saplings of Japanese pine in a 5.7 × 5.7 metre<br />
square sampling region in a natural stand.<br />
Main question: is the spacing between saplings greater than would be expected for a random<br />
pattern? (reflecting competition for resources)<br />
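One informal way to see what "spacing greater than expected for a random pattern" means is to compare nearest-neighbour distances across independent, regular and clustered patterns. The sketch below does this in base R with simulated data. The three simulated patterns, the parameter values and the helper `mean_nn_dist` are assumptions for illustration; they are not the Japanese pines data and not necessarily the method used later in these notes.

```r
set.seed(42)

# Mean distance from each point to its nearest neighbour.
mean_nn_dist <- function(x, y) {
  d <- as.matrix(dist(cbind(x, y)))
  diag(d) <- Inf            # ignore the zero distance from a point to itself
  mean(apply(d, 1, min))
}

n <- 100

# Independent: points placed uniformly at random in the unit square.
ind_x <- runif(n); ind_y <- runif(n)

# Regular: a 10 x 10 grid with a small random jitter.
g <- expand.grid(x = (1:10 - 0.5) / 10, y = (1:10 - 0.5) / 10)
reg_x <- g$x + runif(n, -0.02, 0.02)
reg_y <- g$y + runif(n, -0.02, 0.02)

# Clustered: 10 random parents, each with 10 offspring scattered nearby.
# (Offspring may fall slightly outside the square; this sketch ignores that.)
px <- rep(runif(10), each = 10); py <- rep(runif(10), each = 10)
clu_x <- px + rnorm(n, sd = 0.02)
clu_y <- py + rnorm(n, sd = 0.02)

nn <- c(independent = mean_nn_dist(ind_x, ind_y),
        regular     = mean_nn_dist(reg_x, reg_y),
        clustered   = mean_nn_dist(clu_x, clu_y))
print(round(nn, 3))
```

With the same number of points in the same region, the regular pattern has the largest mean nearest-neighbour distance and the clustered pattern the smallest, with the independent pattern in between.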
[Figure: Japanese Pines]<br />
1.2.3 Covariate effects<br />
For a point pattern dataset with covariate data, we typically<br />
• investigate whether the intensity depends on the covariates<br />
• allow for covariate effects on intensity before studying interaction between points<br />
Example 2 (Tropical rainforest data) Locations of 3605 trees in a tropical rainforest, with<br />
supplementary grid map of elevation (altitude).<br />
Main questions: (1) does tree density depend on slope? (2) after accounting for variation in<br />
tree density due to slope, is there evidence of clustering of trees?<br />
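A simple way to investigate whether intensity depends on a covariate is to divide the region into classes of covariate value and compare the number of points per unit area in each class. The sketch below illustrates this in base R. Everything in it is an assumption for the example: the covariate `Z` (the x-coordinate, standing in for a terrain variable), the simulated points and the three equal-area classes. It does not use the rainforest data.

```r
set.seed(7)

# Hypothetical covariate on the unit square: Z(u) = x-coordinate of u,
# standing in for a terrain variable such as slope.
Z <- function(x, y) x

# Simulate points whose intensity increases with Z, by thinning a uniform
# pattern: a candidate at u is kept with probability Z(u).
m <- 2000
cand <- data.frame(x = runif(m), y = runif(m))
keep <- runif(m) < Z(cand$x, cand$y)
pts  <- cand[keep, ]

# Divide the region into three covariate classes of equal area.
breaks  <- c(0, 1/3, 2/3, 1)
classes <- cut(Z(pts$x, pts$y), breaks, include.lowest = TRUE,
               labels = c("low", "medium", "high"))
class_area <- rep(1/3, 3)   # each class is a vertical strip of area 1/3

# Estimated intensity in each class: points per unit area.
intensity_by_class <- as.vector(table(classes)) / class_area
names(intensity_by_class) <- levels(classes)
print(round(intensity_by_class))
```

If intensity did not depend on the covariate, the three estimates would be roughly equal; here they increase from the low class to the high class because of how the points were simulated.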
[Figure: map of tree locations in the tropical rainforest data]<br />
Example 3 (Queensland copper data) An intensive mineralogical survey yields a map of<br />
copper deposits (essentially pointlike at this scale) and geological faults (straight lines). The<br />
faults can easily be observed from satellites, but the copper deposits are hard to find. The main<br />
question is whether the faults are ‘predictive’ for copper deposits (e.g. copper less/more likely to<br />
be found near faults).<br />
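To ask whether deposits tend to lie near faults, one natural covariate is the distance from each location to the nearest fault. The sketch below computes that distance in base R for straight line segments. The helper `dist_to_segment` and the small `faults` and `deposits` tables are invented for illustration and are not the Queensland data.

```r
# Distance from points (px, py) to the line segment from (x0, y0) to (x1, y1).
# px and py may be vectors; the segment endpoints are single numbers.
dist_to_segment <- function(px, py, x0, y0, x1, y1) {
  dx <- x1 - x0; dy <- y1 - y0
  len2 <- dx^2 + dy^2
  # Position of the closest point along the segment, clamped to [0, 1].
  t <- if (len2 == 0) 0 else ((px - x0) * dx + (py - y0) * dy) / len2
  t <- pmin(1, pmax(0, t))
  sqrt((px - (x0 + t * dx))^2 + (py - (y0 + t * dy))^2)
}

# Hypothetical data: two faults (one per row) and three deposits.
faults <- data.frame(x0 = c(0, 5), y0 = c(0, 0), x1 = c(4, 5), y1 = c(0, 3))
deposits <- data.frame(x = c(2, 6, 4.5), y = c(1, 1, 4))

# One column of distances per fault, one row per deposit.
d_all <- sapply(seq_len(nrow(faults)), function(i)
  dist_to_segment(deposits$x, deposits$y,
                  faults$x0[i], faults$y0[i], faults$x1[i], faults$y1[i]))

# Covariate value at each deposit: distance to the nearest fault.
nearest_fault_dist <- apply(d_all, 1, min)
print(nearest_fault_dist)
```

The same function can be evaluated on a grid of locations to turn the fault map into a spatial function Z(u), which can then be treated like any other covariate.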
Example 4 (Chorley-Ribble data) An apparent cluster of cases of cancer of the larynx occurred<br />
near a disused industrial incinerator. The area health authority mapped the domicile<br />
locations of all cases (58) of cancer of the larynx and, for control purposes, a random sample of<br />
cases (978) of lung cancer.<br />
Main question: after allowing for spatial variation in density of the susceptible population<br />
(for which the lung cancer cases are a surrogate), is there evidence of raised incidence of laryngeal<br />
cancer near the incinerator?<br />
[Figure: Chorley-Ribble Data; legend: larynx, lung, incinerator]<br />
1.2.4 Segregation of points with different marks<br />
In a marked point pattern, we need to investigate whether points with different mark values are<br />
‘segregated’ (found in different parts of the study region).<br />
Example 5 (Lansing Woods) In a 20-acre study region in Lansing Woods, Michigan, the<br />
locations of 2251 trees and the botanical classification of each tree were recorded.<br />
Main question: is the study region divided into domains where a single tree species dominates,<br />
or are the different species randomly interspersed?<br />
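A crude first look at segregation is to cross-tabulate the mark against sub-regions of the study area and compare the mix of types in each sub-region. The sketch below shows this in base R on simulated data. The two species "A" and "B", the Beta distributions and the split into left and right halves are assumptions for illustration only; the Lansing Woods data are not used.

```r
set.seed(3)

# Hypothetical two-species pattern in the unit square. Species A is drawn
# mostly from the left half and species B mostly from the right half, so
# the two species are segregated by construction.
n <- 300
species <- factor(sample(c("A", "B"), n, replace = TRUE))
x <- ifelse(species == "A", rbeta(n, 1, 3), rbeta(n, 3, 1))
y <- runif(n)

# Cross-tabulate species against the half of the region containing the point.
half <- cut(x, c(0, 0.5, 1), labels = c("left", "right"), include.lowest = TRUE)
tab <- table(species, half)
print(tab)

# Proportion of each species within each half. Under random interspersion
# the two columns would be similar; here they differ markedly.
prop <- prop.table(tab, margin = 2)
print(round(prop, 2))
```

A table like this depends on the arbitrary choice of sub-regions, so it is only a starting point for the question posed in Example 5.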
[Figure: Lansing Woods tree locations by species; panels: blackoak, hickory, maple]<br />
Example 6 (Longleaf Pines) In a forest of Longleaf Pine trees in Georgia, USA, the locations<br />
of 584 trees were recorded along with their diameter at breast height (dbh), a convenient surrogate<br />
measure of size and age.<br />
Main question: explain any spatial variation in the density and age of trees.<br />
[Figure: Longleaf Pines]<br />
1.2.5 Dependence between points of different types<br />
In a point pattern dataset with categorical marks (also known as a multitype point pattern), dependence<br />
between the different types may be formulated either as<br />
• interaction between the sub-pattern of points of type i and the sub-pattern of points of<br />
type j; or<br />
• dependence between the mark values of points at two specified locations.<br />
Example 7 (Amacrine cells) The retina is a flat sheet containing several layers of cells.<br />
Amacrine cells occupy two adjacent layers, the ‘on’ and ‘off’ layers. In a microscope field of<br />
view, the locations of all amacrine cells were mapped, and classified into ‘on’ and ‘off’.<br />
Main question: is there evidence that the ‘on’ and ‘off’ layers grew independently of one<br />
another?<br />
[Figure: amacrine cells, with points labelled ‘off’ and ‘on’]<br />
Example 8 (Ants’ nests) The nests of two species of ants in a plot in Greece were mapped.<br />
Auxiliary information records a field/scrub boundary, and the position of a walking track.<br />
Main question: does species A intentionally place its nests close to species B?<br />
[Figure: ants’ nests of species A and B, with the field and scrub regions]<br />
1.3 Overview of statistical methods<br />
Statistical methods for spatial point patterns have a quirky history, and have not yet coalesced<br />
into a mature statistical methodology. They include<br />
• summary statistics: the applied literature is dominated by ad hoc methods based on<br />
evaluating a summary statistic (e.g. average distance from a point to its nearest neighbour)<br />
with very little statistical theory to support them.<br />
• comparison to Poisson process: in the applied literature, hypothesis tests are invoked<br />
chiefly to decide whether the point pattern is ‘completely random’ (a uniform Poisson point<br />
process), whether or not this is scientifically relevant. Many misunderstandings prevail.<br />
• modelling: only in the last decade has it finally become possible to formulate and fit<br />
realistic models to point pattern data. There is still a lot of work to be done, e.g. in<br />
algorithms, model choice and goodness-of-fit.<br />
We’ll cover both classical and modern methods. Useful textbooks include [17, 19, 23, 31, 37,<br />
46]. An important recent survey is [38].<br />
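As a rough illustration of the kind of summary statistic mentioned in the first bullet, the following base R sketch computes the average nearest-neighbour distance for a small artificial pattern. The data are hypothetical (uniform random points in the unit square), and the reference value 1/(2√λ) is the standard mean nearest-neighbour distance for a uniform Poisson process of intensity λ, ignoring edge effects.<br />

```r
# Hypothetical example data: 50 points placed uniformly in the unit square
set.seed(42)
n <- 50
xy <- cbind(x = runif(n), y = runif(n))

# Matrix of all pairwise distances; exclude each point's distance to itself
d <- as.matrix(dist(xy))
diag(d) <- Inf

# Distance from each point to its nearest neighbour, and the summary statistic
nn_dist <- apply(d, 1, min)
mean_nn <- mean(nn_dist)

# Reference value for a uniform Poisson process of the same intensity
# (edge effects are ignored, so this is only a rough benchmark)
lambda <- n / 1                       # points per unit area; window area is 1
expected_nn <- 1 / (2 * sqrt(lambda))

c(observed = mean_nn, poisson_reference = expected_nn)
```

Comparing the two numbers by eye is exactly the sort of ad hoc procedure the text warns about; the sketch is meant only to show what is being computed.<br />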
2 Statistical formulation<br />
2.1 Point processes<br />
In this workshop, the observed point pattern x will be treated as a realisation of a random<br />
point process X in two-dimensional space. A point process is simply a random set of points;<br />
the number of points is random, as well as the locations of the points. Our goal is usually to<br />
estimate parameters of the distribution of X.<br />
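To make the idea of a random set of points concrete, here is a minimal base R sketch that generates realisations of a uniform Poisson point process in a rectangle. The function name `sim_uniform_poisson` and the values of `lambda`, `width` and `height` are illustrative assumptions, not part of the workshop material.<br />

```r
# Simulate one realisation of a uniform Poisson point process with
# intensity lambda in the rectangle [0, width] x [0, height].
sim_uniform_poisson <- function(lambda, width, height) {
  # The number of points is itself random: Poisson with mean lambda * area
  n <- rpois(1, lambda * width * height)
  # Given n, the locations are independent and uniform in the rectangle
  data.frame(x = runif(n, 0, width), y = runif(n, 0, height))
}

set.seed(2008)
x1 <- sim_uniform_poisson(lambda = 50, width = 2, height = 1)
x2 <- sim_uniform_poisson(lambda = 50, width = 2, height = 1)

# Two realisations of the same process generally contain different numbers of points
c(nrow(x1), nrow(x2))
```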
2.2 Should I treat the data as a point process?<br />
Treating the point pattern as a point process effectively assumes that the pattern is random<br />
(the locations of the points, and the number of points, are random) and that the pattern is<br />
the observation or ‘response’ of interest. A realisation of a point process is an unordered set of<br />
points, so the points do not have a serial order (unless there are marks attached).<br />
Example 9 A silicon wafer is inspected for defects in the crystal surface, and the locations of<br />
all defects are recorded.<br />
This can be analysed as a point process in two dimensions, assuming the defects are point-like.<br />
We’re interested in the intensity of defects, spacing between defects, etc.<br />
Example 10 Earthquake aftershocks in Japan are detected and their latitude, longitude and<br />
time of occurrence are recorded.<br />
This can be analysed as a point process in space-time (where space is the two-dimensional<br />
plane or the Earth’s surface). If the occurrence times are ignored, it becomes a spatial point<br />
process.<br />
Example 11 The locations of petty crimes that occurred in the past week are plotted on a street<br />
map of Chicago.<br />
This can be analysed as a point process. We’re interested in the intensity (propensity for<br />
crimes to occur), any spatial variation in intensity, clusters of crimes, etc. One issue here is<br />
whether the recorded crime locations can be anywhere in two-dimensional space, or whether<br />
they are actually restricted to locations on the streets (making them a point process on a<br />
one-dimensional network).<br />
Example 12 A tiger shark is captured, tagged with a satellite transmitter, and released. Over<br />
the next month its location is reported daily. These points are plotted on a map.<br />
It is probably not appropriate to analyse these data as a spatial point process. At the very<br />
least, the time of each observation should be included. They could be treated as a space-time<br />
point process, except that it’s a strange process, as it consists of exactly one point at each instant<br />
of time. These data should really be treated as a sparse sample of a continuous trajectory, and<br />
analysed using other methods (which, alas, are fairly underdeveloped). See the R package trip.<br />
Example 13 A herd of deer is photographed from the air at noon each day for 10 days. Each<br />
photograph is processed to produce a point pattern of individual deer locations on a map.<br />
Each day produces a point pattern that could be analysed as a realisation of a point process.<br />
However, the observations on successive days are dependent (e.g. constant herd size, systematic<br />
foraging behaviour). Assuming individual deer cannot be identified from day to day, this is<br />
effectively a ‘repeated measures’ dataset where each response is a point pattern. Methods for<br />
this problem are in their infancy.<br />
Example 14 In a designed controlled experiment, silicon wafers are produced under various<br />
conditions. Each wafer is inspected for defects in the crystal surface, and the locations of all<br />
defects are recorded as a point pattern.<br />
This is a designed experiment in which the response is a point pattern. Methods for this<br />
problem are in their infancy. There are some methods for replicated spatial point patterns<br />
[9, 13, 24, 25, 29] that apply when each experimental group contains several point patterns.<br />
Example 15 The points are not the original data, but were obtained after processing the data.<br />
For example,<br />
• the original dataset is a pattern of small blobs, and the points are the blob centres;<br />
• the original dataset is a collection of line segments, and the points are the endpoints,<br />
crossing points, midpoints etc;<br />
• the original dataset is a space-filling tessellation of biological cells, and the points are the<br />
centres of the cells.<br />
This is a grey area. Point process methodology can be applied, and may be more powerful<br />
or more flexible than existing methodology for the unprocessed data. However, the origin of the<br />
point pattern may lead to artefacts (for example, the centres of biological cells never lie very close<br />
together, because cells have nonzero size) which must be taken into account in the analysis.<br />
2.3 Assumptions about the data<br />
The “standard model” assumes that the point process X extends throughout 2-D space, but<br />
is observed only inside a region W, the “sampling window”. Our data consist of an unordered<br />
set<br />
x = {x_1, ..., x_n}, x_i ∈ W, n ≥ 0<br />
of points x_i in W. The window W is fixed and known. Usually our goal is inference about<br />
parameters of X.<br />
Data are often supplied without information about the sampling window W. It is important<br />
to know the window W, since we need to know where points were not observed. Even<br />
something as simple as estimating the density of points depends on the window. It would be<br />
wrong, or at least different, to analyse a point pattern dataset by “guessing” the appropriate<br />
window. An analogy may be drawn with the difference between sequential experiments and<br />
experiments in which the sample size is fixed a priori.<br />
For the same reason, it is not sufficient to observe the values of covariates at the data points<br />
only. In order to investigate the dependence of the point process on the covariate, we need to<br />
have at least some observations of the covariate at other (“non-data”) locations.<br />
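The dependence of even a simple density estimate on the window can be seen in a small base R sketch. The coordinates and the 10 by 10 ‘true’ window below are hypothetical; the point is only that guessing the window (here, as the bounding box of the data) changes the answer.<br />

```r
# Hypothetical point locations, recorded inside a 10 x 10 sampling window
x <- c(2.1, 3.4, 3.9, 5.0, 6.2, 7.7)
y <- c(4.0, 4.8, 5.5, 4.2, 5.9, 4.6)
n <- length(x)

# Density estimate using the known sampling window W = [0, 10] x [0, 10]
area_true <- 10 * 10
intensity_true <- n / area_true

# Density estimate if the window is "guessed" as the bounding box of the points
area_bbox <- diff(range(x)) * diff(range(y))
intensity_bbox <- n / area_bbox

# The guessed window ignores the empty space where no points were observed,
# so it gives a much larger density estimate
c(true_window = intensity_true, bounding_box = intensity_bbox)
```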
It’s implicitly assumed that all points of X within W have been mapped without omission.<br />
Most models we use will assume that random points could have been observed at any location<br />
in the window W, without further constraint. (Examples where this does not apply: GPS<br />
locations of cars will usually lie along roads; certain cells lie only inside certain tissues.)<br />
When thinking about methodological issues it’s often useful to think about the discretised<br />
version of a point process. Suppose the window W is chopped into infinitely many ‘pixels’.<br />
Each pixel is assigned the value I = 1 if it contains a point of X, and I = 0 otherwise. This<br />
array of 0’s and 1’s constitutes the data that must be modelled. (For example, we clearly cannot<br />
model the dependence of these indicators on a covariate if we only observe the covariate value at<br />
the locations where I = 1.)<br />
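The discretised view can be sketched in base R with a finite pixel grid. The points, the 20 by 20 grid and the covariate Z (simply the x coordinate of each pixel centre) are hypothetical choices made for illustration.<br />

```r
# Hypothetical pattern: 30 points in the unit square
set.seed(7)
n <- 30
x <- runif(n)
y <- runif(n)

# Chop the window into a 20 x 20 grid and find the pixel containing each point
npix <- 20
col_idx <- pmin(floor(x * npix) + 1, npix)
row_idx <- pmin(floor(y * npix) + 1, npix)

# Indicator array: I = 1 if the pixel contains a point, I = 0 otherwise
I <- matrix(0L, nrow = npix, ncol = npix)
I[cbind(row_idx, col_idx)] <- 1L

# Hypothetical covariate known on EVERY pixel: x coordinate of the pixel centre
Z <- matrix((seq_len(npix) - 0.5) / npix, nrow = npix, ncol = npix, byrow = TRUE)

# The data to be modelled are the pairs (I, Z) over all pixels
pixels <- data.frame(I = as.vector(I), Z = as.vector(Z))

# Restricting to the data points leaves a response that is constant (all 1),
# so the dependence of I on Z could not be studied from these rows alone
only_points <- pixels[pixels$I == 1, ]
table(pixels$I)
```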
[Figure: a 20 by 20 array of 0’s and 1’s, illustrating the pixel indicators of a discretised point pattern]<br />
2.4 Marks and covariates<br />
The main differences between marks and covariates are that<br />
• marks are associated with data points;<br />
• marks are part of the ‘response’ (the point pattern) while covariates are ‘explanatory’.<br />
2.4.1 Marks<br />
A mark variable may be interpreted as an additional coordinate for the point: for example<br />
a point process of earthquake epicentre locations (longitude, latitude), with marks giving the<br />
occurrence time of each earthquake, can alternatively be viewed as a point process in space-time<br />
with coordinates (longitude, latitude, time).<br />
A marked point process of points in space S with marks belonging to a set M is mathematically<br />
defined as a point process in the Cartesian product S × M. The space M of possible marks<br />
may be ‘anything’. In current applications, typically the mark is either a categorical variable<br />
(so that the points are grouped into ‘types’) or a real number. Multivariate marks consisting of<br />
several such variables are also common.<br />
A marked point pattern is an unordered set<br />
y = {(x_1, m_1), ..., (x_n, m_n)}, x_i ∈ W, m_i ∈ M<br />
where x_i are the locations and m_i are the corresponding marks.<br />
Marked point patterns are discussed in detail in section 23.<br />
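One simple way to hold a marked point pattern in base R is a data frame with one row per point. The sketch below uses made-up tree data with a categorical mark (`species`) and a real-valued mark (`dbh`); the column names and values are assumptions for illustration only.<br />

```r
# Hypothetical marked point pattern: locations plus two marks per point
trees <- data.frame(
  x = c(1.2, 3.5, 4.1, 7.8, 9.0),
  y = c(2.2, 8.1, 5.5, 1.3, 6.7),
  species = factor(c("maple", "oak", "maple", "oak", "maple")),  # categorical mark
  dbh = c(23.5, 41.0, 18.2, 35.7, 29.9)                          # real-valued mark
)

# A categorical mark groups the points into 'types':
# one sub-pattern of locations per species
by_type <- split(trees[, c("x", "y")], trees$species)

# A real-valued mark can be read as an extra coordinate,
# giving points in the product space S x M
points_in_product_space <- as.matrix(trees[, c("x", "y", "dbh")])

sapply(by_type, nrow)
```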
2.4.2 Covariates<br />
Any kind of data may be recruited as an explanatory variable (covariate).<br />
A ‘spatial function’, ‘spatial covariate’ or ‘geostatistical covariate’ is a function Z(u) observable<br />
(potentially) at every spatial location u ∈ W. Values of Z(u) may be available for a fine<br />
grid of locations u.<br />
[Figure: image of a spatial covariate on a fine grid of locations]<br />
Alternatively, the values of a spatial function Z(u) may be observable only at some scattered sampling<br />
locations u. An example is the measurement of soil pH at a few sampling locations. In this case,<br />
the value of the covariate Z must be observed for all points x_i of the point pattern x, and must<br />
also be observed at some other ‘non-data’ or ‘background’ locations u ∈ W with u ∉ x.<br />
Alternatively, the covariate information may consist of another spatial pattern, such as a<br />
point pattern or a line segment pattern. The way in which this covariate information enters<br />
the analysis or statistical model depends very much on the context and the choice of model.<br />
Typically the covariate pattern would be used to define a surrogate spatial function Z, for<br />
example, Z(u) may be the distance from u to the nearest line segment.<br />
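The surrogate covariate just described can be sketched in base R. The two line segments, the helper `dist_to_segment` and the evaluation grid are hypothetical; the code only illustrates how a line segment pattern can be turned into a function Z(u) that is defined at every location.<br />

```r
# Distance from the point(s) (px, py) to the segment from (x0, y0) to (x1, y1)
dist_to_segment <- function(px, py, x0, y0, x1, y1) {
  dx <- x1 - x0
  dy <- y1 - y0
  len2 <- dx^2 + dy^2
  # Relative position of the closest point along the segment, clamped to [0, 1]
  t <- if (len2 == 0) 0 else ((px - x0) * dx + (py - y0) * dy) / len2
  t <- pmin(pmax(t, 0), 1)
  sqrt((px - (x0 + t * dx))^2 + (py - (y0 + t * dy))^2)
}

# Hypothetical covariate pattern: two segments, one per row (x0, y0, x1, y1)
segs <- rbind(c(0, 0, 1, 0),
              c(1, 0, 1, 1))

# Surrogate spatial function: Z(u) = distance from u to the nearest segment
Z <- function(ux, uy) {
  d <- sapply(seq_len(nrow(segs)), function(i) {
    dist_to_segment(ux, uy, segs[i, 1], segs[i, 2], segs[i, 3], segs[i, 4])
  })
  # d has one column per segment when several locations are supplied
  if (is.matrix(d)) apply(d, 1, min) else min(d)
}

# Z can be evaluated anywhere, e.g. on a grid covering the window,
# not just at the data points
grid <- expand.grid(x = seq(0, 1, by = 0.25), y = seq(0, 1, by = 0.25))
grid$Z <- Z(grid$x, grid$y)
head(grid)
```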
3 The R system<br />
We will be using the statistical package R.<br />
3.1 How to obtain R<br />
R is free software with an open-source licence. You can download it from r-project.org and<br />
it should be easy to install on any computer (see the instructions at the website).<br />
Books and online tutorials are available to help you learn to use R.<br />
3.2 How commands are printed in the notes<br />
You can run an R session using either a point-and-click interface or a line-by-line command<br />
interpreter. In these notes, R commands are printed as they would appear when typed at the<br />
command line. So a typical series of R commands looks like this:<br />
> pi/2<br />
> sin(pi/2)<br />
> x <- sqrt(2)<br />
> x<br />
Note that you are not meant to type the > symbol; this is just the prompt for command input<br />
in R. To type the first command, just type pi/2.<br />
In these notes we will sometimes also print the response that R gives to a set of commands.<br />
In the example above, it would look like this:<br />
> pi/2<br />
[1] 1.570796<br />
> sin(pi/2)<br />
[1] 1<br />
> x <- sqrt(2)<br />
> x<br />
[1] 1.414214<br />
If the input is too long, R will break it into several lines, and print the character + to indicate<br />
that the input continues from the previous line. (You don’t type the +.) Also, if you type an<br />
expression involving brackets and hit Return before all the open brackets have been closed, then<br />
R will print a + indicating that it expects you to finish the expression.<br />
> folderol <- 1.2<br />
> sin(folderol * folderol * folderol * folderol * folderol * folderol *<br />
+ folderol * folderol * folderol * folderol)<br />
[1] -0.09132148<br />
3.3 Contributed libraries for R<br />
In addition to the basic R system, the R website also offers many add-on modules (‘libraries’ or<br />
‘packages’) contributed by users. These can be downloaded from cran.r-project.org (under<br />
‘Contributed Packages’).<br />
Packages that may be useful for analysing spatial data include:<br />
ads: spatial point pattern analysis<br />
DCluster: detecting clusters in spatial count data<br />
fields: curve and function fitting<br />
geoR: model-based geostatistical methods<br />
geoRglm: model-based geostatistical methods<br />
GeoXp: interactive spatial exploratory data analysis<br />
grasp: spatial prediction<br />
maptools: geographical information systems<br />
rgdal: interface to GDAL geographical data analysis<br />
sp: base library for some spatial data analysis packages<br />
spatclus: detecting clusters in spatial point pattern data<br />
spatialCovariance: spatial covariance for data on grids<br />
spatialkernel: interpolation and segregation of point patterns<br />
spatstat: spatial point pattern analysis and modelling<br />
spBayes: Gaussian spatial process MCMC (grid data)<br />
spdep: spatial statistics for variables observed at fixed sites<br />
spgwr: geographically weighted regression<br />
splancs: spatial and space-time point pattern analysis<br />
spsurvey: spatial survey methods<br />
trip: analysis of spatial trip data<br />
To make use of a package, you need to:<br />
1. download the package code (once only) without unpacking;<br />
2. ‘install’ the package code on your system (once only);<br />
3. ‘load’ the package into your current R session using the command library (each time you<br />
start a new R session).<br />
The installation step is performed automatically using R, not by manually unpacking the code.<br />
Installation is usually a very easy process.<br />
Instructions on how to install a package are given at cran.r-project.org. If you are running<br />
Windows, first start an R session. Then try the pull-down menu item Packages → Install<br />
packages. If this menu item is available, then you will be able to download and install any<br />
desired packages by simply selecting the package name from the pull-down list. If this menu item<br />
is not available (for internet security reasons), you can manually download packages by going<br />
to the CRAN website under Contributed packages → Windows binaries and downloading<br />
the desired zip files of Windows binary files. To perform step 2, start an R session and use the<br />
menu item Packages → Install from local zip files to install.<br />
If you are running Linux, step 1 is performed manually by going to the CRAN website under<br />
Contributed Packages and downloading the tar file packagename.tar.gz. Step 2 is performed<br />
by issuing the command R CMD INSTALL packagename.tar.gz.<br />
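The three steps can also be carried out from the R command line. The sketch below wraps the standard functions `install.packages` and `requireNamespace` in a small helper; the helper name `ensure_package` is hypothetical, and the download step assumes a working internet connection and a configured CRAN mirror.<br />

```r
# Return TRUE once the named package is available,
# installing it from CRAN first if it is missing.
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)          # steps 1 and 2: download and install (once only)
  }
  requireNamespace(pkg, quietly = TRUE)
}

# 'stats' ships with R, so this call returns TRUE without downloading anything
have_stats <- ensure_package("stats")

# For the workshop one would run, for example:
#   ensure_package("spatstat")
#   library(spatstat)              # step 3: load, once per R session
```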
4 Introduction to spatstat<br />
4.1 The spatstat package<br />
Spatstat is a contributed R package for analysing spatial data, written by Adrian Baddeley and<br />
Rolf Turner. Current versions of spatstat deal mainly with spatial point patterns in two<br />
dimensions. The package supports<br />
• creation, manipulation and plotting of point patterns<br />
• exploratory data analysis<br />
• simulation of point process models<br />
• parametric model-fitting<br />
• hypothesis tests, residual plots, diagnostics<br />
Spatstat is one of the largest contributed packages available for R, with over 300 user-level<br />
functions and a 500-page manual. It has its own web domain, www.spatstat.org, offering<br />
information about the package.<br />
Spatstat can be downloaded from cran.r-project.org (under ‘Contributed packages’, then<br />
‘spatstat’). To install spatstat you will also need to download the packages mgcv and sm.<br />
4.2 Please acknowledge spatstat<br />
If you use spatstat for research that leads to publications, it would be much appreciated if<br />
you could acknowledge spatstat in your publications, preferably citing [4]. Citations help us<br />
to justify the expenditure of time and effort on maintaining and developing the package.<br />
4.3 Getting started<br />
Here is a quick demonstration of spatstat in action. You can follow the demonstration by<br />
typing the commands into R.<br />
To begin any analysis using spatstat, first start the R system, and type<br />
> library(spatstat)<br />
The response will be something like this:<br />
> library(spatstat)<br />
This is mgcv 1.3-20<br />
spatstat 1.14-5<br />
Type 'help(spatstat)' for information<br />
The printout shows that, before loading spatstat, the system has loaded the package mgcv<br />
that is required by spatstat. Then it loads spatstat, showing the version number of the<br />
package.<br />
For a list of the commands available in spatstat, type<br />
> help(spatstat)<br />
To get information on a particular command, type help(command).<br />
To gain an impression of what is available in spatstat, you can run the package demonstration<br />
by typing demo(spatstat).<br />
4.4 Inspecting data<br />
For our first demonstration, we’ll use one of the standard point pattern datasets that is installed<br />
with the package. The ‘Swedish Pines’ dataset represents the positions of 71 trees in a forest plot<br />
9.6 by 10.0 metres.<br />
> data(swedishpines)<br />
To avoid typing ‘swedishpines’ all the time, let us copy the data to another dataset with a<br />
shorter name:<br />
> X <- swedishpines<br />
> plot(X)<br />
[Figure: plot of the point pattern X.]<br />
Simply typing the name of the dataset gives you some basic information:<br />
> X<br />
planar point pattern: 71 points<br />
window: rectangle = [0, 96] x [0, 100] units (one unit = 0.1 metres)<br />
Let’s study the intensity (density of points) in this point pattern. For a few basic summary<br />
statistics, type<br />
> summary(X)<br />
Planar point pattern: 71 points<br />
Average intensity 0.0074 points per square unit (one unit = 0.1 metres)<br />
Window: rectangle = [0, 96] x [0, 100] units<br />
Window area = 9600 square units<br />
Unit of length: 0.1 metres<br />
The coordinates are in decimetres (0.1 metre), so the average intensity is 0.0074 trees per<br />
square decimetre or 0.74 trees per square metre.<br />
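The unit conversion can be checked by hand. The following is a minimal sketch in base R that uses only the figures quoted above (71 trees, a window of 96 by 100 decimetres); the variable names are illustrative and are not part of spatstat.<br />

```r
# Worked arithmetic for the average intensity of the Swedish Pines data.
n_trees   <- 71     # number of points in the pattern
width_dm  <- 96     # window width in decimetres (9.6 metres)
height_dm <- 100    # window height in decimetres (10.0 metres)

area_dm2   <- width_dm * height_dm     # window area: 9600 square decimetres
lambda_dm2 <- n_trees / area_dm2       # trees per square decimetre
lambda_m2  <- lambda_dm2 * 10^2        # 1 square metre = 100 square decimetres

# Round to two significant digits, as in the summary output
print(signif(c(per_dm2 = lambda_dm2, per_m2 = lambda_m2), 2))
```

The factor of 100 arises because intensity is a count per unit area, so a tenfold change in the unit of length changes the intensity a hundredfold.<br />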
To get an impression of local spatial variations in intensity, we can plot a kernel estimate of<br />
intensity:<br />
> plot(density(X, 10))<br />
[Figure: image plot of density(X, 10), with a colour scale running from 0.002 to 0.014.]<br />
where 10 is my chosen value for the standard deviation of the Gaussian smoothing kernel.<br />
If you prefer a contour plot,<br />
> contour(density(X, 10), axes = FALSE)<br />
[Figure: contour plot of density(X, 10), with contour levels from 0.003 to 0.015.]<br />
The contours are labelled in density units of “trees per square decimetre”.<br />
4.5 Exploratory data analysis<br />
Spatstat is designed to support all the standard types of exploratory data analysis for point<br />
patterns.<br />
One example is quadrat counting. The study region is divided into rectangles (‘quadrats’) of<br />
equal size, and the number of points in each rectangle is counted.<br />
> Q <- quadratcount(X, nx = 4, ny = 3)<br />
> Q<br />
             x<br />
y             [0,24] (24,48] (48,72] (72,96]<br />
  (66.7,100]       7       3       6       5<br />
  (33.3,66.7]      5       9       7       7<br />
  [0,33.3]         4       3       6       9<br />
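To make the idea of quadrat counting concrete, here is a rough sketch in base R that bins points into a 4 by 3 grid using cut and table. It is not how spatstat implements quadratcount, and the coordinates are simulated uniformly at random (an assumption made for illustration), so the counts will differ from those shown above.<br />

```r
# Quadrat counting by hand on simulated points in a 96 x 100 window.
set.seed(1)
n <- 71
x <- runif(n, min = 0, max = 96)
y <- runif(n, min = 0, max = 100)

# Breakpoints for 4 columns in x and 3 rows in y
x_breaks <- seq(0, 96,  length.out = 5)
y_breaks <- seq(0, 100, length.out = 4)

x_bin <- cut(x, x_breaks, include.lowest = TRUE)
y_bin <- cut(y, y_breaks, include.lowest = TRUE)

# Reverse the y levels so that the top row of the table is the top of the map
y_bin <- factor(y_bin, levels = rev(levels(y_bin)))

counts <- table(y = y_bin, x = x_bin)
print(counts)
```

Every point falls in exactly one quadrat, so the counts add up to the number of points.<br />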
> plot(X)<br />
> plot(Q, add = TRUE, cex = 2)<br />
[Figure: the point pattern X with the quadrat counts superimposed.]<br />
Another example is Ripley’s K function. I’ll explain more about the K function later. For<br />
now, we’ll just demonstrate how easy it is to compute and plot it. To compute the K function<br />
for a point pattern X, type Kest(X). This returns an object which can be plotted.<br />
> K <- Kest(X)<br />
> plot(K)<br />
[Figure: plot of K against r (one unit = 0.1 metres), with K(r) on the vertical axis from 0 to 1500 and r on the horizontal axis from 0 to 20.]<br />
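For readers who want a feel for what is being computed, the sketch below evaluates a naive estimate of K(r) in base R: the window area times the number of ordered pairs of distinct points lying within distance r of each other, divided by n(n-1). This is only an illustration under stated assumptions: it uses simulated points in the unit square and applies no edge correction, so it is not the calculation performed by Kest.<br />

```r
# Naive (uncorrected) estimate of Ripley's K function on simulated points.
set.seed(42)
n <- 200
x <- runif(n)
y <- runif(n)
area <- 1                               # area of the unit square

d <- as.matrix(dist(cbind(x, y)))       # all pairwise distances
diag(d) <- Inf                          # exclude each point paired with itself

# Ordered pairs (i, j), i != j, with distance <= r, scaled by area / (n (n - 1))
naive_K <- function(r) area * sum(d <= r) / (n * (n - 1))

r <- c(0.02, 0.05, 0.10)
K_hat <- sapply(r, naive_K)

# For a completely random (Poisson) pattern the theoretical value is pi * r^2
print(cbind(r = r, K_hat = K_hat, K_poisson = pi * r^2))
```

Because points near the boundary have neighbours outside the window that are not observed, this naive version tends to underestimate K at larger distances.<br />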
4.6 Multitype point patterns<br />
A marked point pattern in which the marks are a categorical variable is usually called a multitype<br />
point pattern. The ‘types’ are the different values or levels of the mark variable.<br />
Here is the famous Lansing Woods dataset recording the positions of 2251 trees of 6 different<br />
species (hickories, maples, red oaks, white oaks, black oaks and miscellaneous trees).<br />
> data(lansing)<br />
> lansing<br />
marked planar point pattern: 2251 points<br />
multitype, with levels = blackoak hickory maple misc redoak whiteoak<br />
window: rectangle = [0, 1] x [0, 1] units (one unit = 924 feet)<br />
> summary(lansing)<br />
Marked planar point pattern: 2251 points<br />
Average intensity 2250 points per square unit (one unit = 924 feet)<br />
*Pattern contains duplicated points*<br />
Multitype:<br />
         frequency proportion intensity<br />
blackoak       135     0.0600       135<br />
hickory        703     0.3120       703<br />
maple          514     0.2280       514<br />
misc           105     0.0466       105<br />
redoak         346     0.1540       346<br />
whiteoak       448     0.1990       448<br />
Window: rectangle = [0, 1] x [0, 1] units<br />
Window area = 1 square unit<br />
Unit of length: 924 feet<br />
> plot(lansing)<br />
blackoak hickory maple misc redoak whiteoak<br />
       1       2     3    4      5        6<br />
[Figure: plot of the lansing pattern, with a different plot symbol for each species.]<br />
In this plot, each type of point (i.e. each species of tree) is represented by a different plot<br />
symbol. The output printed by plot(lansing) explains the encoding: black oak is coded as symbol 1<br />
(open circle) and so on.<br />
An alternative way to plot these data is to split them into 6 point patterns, each pattern<br />
containing the trees of one species. This is done using split:<br />
> plot(split(lansing))<br />
[Figure: split(lansing), with one panel for each of blackoak, hickory, maple, misc, redoak and whiteoak.]<br />
The result of split(lansing) is a list of point patterns. The names of the list entries are the<br />
names of the types (in this case "blackoak", "hickory", etc). To extract one of these patterns,<br />
e.g. the hickories,<br />
> hick <- split(lansing)$hickory<br />
> plot(hick)<br />
[Figure: plot of the point pattern hick.]<br />
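The same split-then-extract idiom works for ordinary data frames in base R, which may help to fix the idea. The sketch below uses a small made-up table of tree locations (the coordinates and species labels are hypothetical, not taken from the Lansing Woods data).<br />

```r
# Splitting a data frame by a factor gives a named list, one entry per level.
trees <- data.frame(
  x = c(0.10, 0.40, 0.70, 0.20, 0.90),
  y = c(0.80, 0.30, 0.50, 0.60, 0.10),
  species = factor(c("hickory", "maple", "hickory", "redoak", "maple"))
)

by_species <- split(trees, trees$species)
print(names(by_species))      # the list entries are named after the levels

hick <- by_species$hickory    # extract one component by name
print(hick)
```

As with split(lansing), the list entries are named after the levels of the factor, so individual components can be extracted with the $ operator.<br />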
4.7 Installed datasets<br />
For reference, here is a list of the standard point pattern datasets that are supplied with the<br />
installation of spatstat:<br />
name           description                         marks         covariates   window<br />
amacrine       Hughes’ rabbit amacrine cells       2 types       ·            □<br />
anemones       Upton-Fingleton sea anemones        diameter      ·            □<br />
ants           Harkness-Isham ant nests            2 species     2 zones      convex poly<br />
bei            Tropical rainforest trees           ·             topography   □<br />
betacells      Wässle et al. cat retinal ganglia   2 types       ·            □<br />
bramblecanes   Bramble Canes                       3 ages        ·            □<br />
cells          Crick-Ripley biological cells       ·             ·            □<br />
chorley        Chorley-South Ribble cancers        case/control  ·            irregular<br />
copper         Queensland copper deposits          ·             fault lines  □<br />
demopat        artificial data                     2 types       ·            irregular<br />
finpines       Finnish Pines                       diameter      ·            □<br />
hamster        Aherne’s hamster tumour data        2 types       ·            □<br />
humberside     Humberside child leukaemia          case/control  ·            irregular<br />
japanesepines  Japanese Pines                      ·             ·            □<br />
lansing        Lansing Woods                       6 species     ·            □<br />
longleaf       Longleaf Pine trees                 diameter      ·            □<br />
murchison      Murchison gold deposits             ·             fault lines  irregular<br />
nbfires        New Brunswick fires                 several       ·            irregular<br />
nztrees        Mark-Esler-Ripley NZ trees          ·             ·            □<br />
ponderosa      Getis-Franklin Ponderosa pines      ·             ·            □<br />
redwood        Strauss-Ripley redwood saplings     ·             ·            □<br />
redwoodfull    Strauss redwood map (full set)      ·             2 zones      □<br />
simdat         Simulated point pattern             ·             ·            □<br />
spruces        Spruce trees in Saxony              diameter      ·            □<br />
swedishpines   Strand-Ripley Swedish pines         ·             ·            □<br />
urkiola        Urkiola Woods, Spain                2 species     ·            irregular<br />
The symbol □ indicates that the window for the pattern is a rectangle.<br />
To flick through a nice display of all these datasets, type demo(data).<br />
To access one of these datasets, type data(name) where name is the name listed above. To<br />
see information about the dataset, type help(name). To plot the dataset, type plot(name).<br />
4.8 Point-and-click on the screen<br />
There is a graphical interface which allows you to draw a point pattern on the screen. Type<br />
> X <- clickppp()<br />
> plot(X)<br />
PART II. DATA TYPES<br />
In Part II of the workshop, we look at the different types of spatial data (point patterns, windows,<br />
pixel images, etc) and how to manipulate them in spatstat.<br />
5 Objects, classes and methods in R<br />
The tutorial examples above have used some of the ‘object-oriented’ features of R. It is very<br />
useful to know a little about how these work.<br />
5.1 Classes in R<br />
R is an ‘object-oriented’ language. A dataset with some kind of structure on it (e.g. a contingency<br />
table, a time series, a point pattern) is treated as a single ‘object’.<br />
For example, R includes a dataset sunspots which is a time series containing monthly sunspot<br />
counts from 1749 to 1983. This dataset can be manipulated as if it were a single object:<br />
> plot(sunspots)<br />
> summary(sunspots)<br />
> X <- sunspots<br />
Every object in R belongs to a ‘class’, which can be queried using class:<br />
> class(sunspots)<br />
[1] "ts"<br />
Standard operations, such as printing, plotting, or calculating the sample mean, are defined<br />
separately for each class of object.<br />
For example, typing plot(sunspots) invokes the generic command plot. Now sunspots is<br />
an object of class "ts" representing a time series, and there is a special “method” for plotting<br />
time series, called plot.ts. So the system executes plot.ts(sunspots). It is said that the plot<br />
command is “dispatched” to the method plot.ts. The plot method for time series produces a<br />
display that is sensible for time series, with axes properly annotated.<br />
Tip: to find out how to modify the plot for an object of class "foo", consult<br />
help(plot.foo) rather than help(plot).<br />
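Method dispatch is easy to try out for yourself. The following sketch defines a toy class "foo" (a made-up name, echoing the tip above) with its own print and summary methods, and then looks up the plot method for time series. Only base R is used.<br />

```r
# A toy object of (hypothetical) class "foo"
obj <- structure(list(values = c(2, 4, 6)), class = "foo")

# Methods are ordinary functions named generic.class
print.foo <- function(x, ...) {
  cat("<foo object holding", length(x$values), "values>\n")
  invisible(x)
}
summary.foo <- function(object, ...) {
  c(n = length(object$values), mean = mean(object$values))
}

print(obj)            # the generic print dispatches to print.foo
s <- summary(obj)     # the generic summary dispatches to summary.foo
print(s)

# The same mechanism is at work for time series
print(class(sunspots))
ts_method <- getS3method("plot", "ts")   # the function that plot(sunspots) calls
print(is.function(ts_method))
```

The generic function inspects the class of its first argument and calls the function whose name is the generic name followed by a dot and the class name.<br />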
5.2 Classes in spatstat<br />
To handle point pattern datasets and related data, the spatstat package defines the following<br />
classes of objects:<br />
• ppp: planar point pattern<br />
• owin: spatial region (‘observation window’)<br />
• im: pixel image<br />
• psp: pattern of line segments<br />
• tess: tessellation<br />
[Figure: examples of each class: rectangular, polygonal and binary mask windows (class owin); a point pattern (class ppp); a pixel image (class im); a line segment pattern (class psp); and a tessellation (class tess).]<br />
Most of the functionality in spatstat works on such objects. To use this functionality, you’ll<br />
need to read your raw data into R and then convert it into an object of the appropriate format.<br />
In particular spatstat has methods for plot, print and summary for each of these classes.<br />
For example, the plot method for point patterns, plot.ppp, ensures that the x and y scales<br />
are equal, and does various other things that are sensible when plotting a spatial point pattern<br />
rather than just a list of (x,y) pairs.<br />
> data(humberside)<br />
> plot(humberside)<br />
[Figure: plot of the humberside point pattern.]<br />
Exercise 1 Find out how to modify the command plot(swedishpines) so that the title reads<br />
“Swedish Pines data” and the points are represented by plus-signs instead of circles.<br />
When you type print(swedishpines) or just swedishpines, this invokes the generic command print,<br />
which dispatches to the method print.ppp, which prints some sensible information<br />
about the point pattern swedishpines at the terminal.<br />
> swedishpines<br />
planar point pattern: 71 points<br />
window: rectangle = [0, 96] x [0, 100] units (one unit = 0.1 metres)<br />
The generic command summary is meant to provide basic summary statistics for a dataset.<br />
When you type summary(swedishpines) this is dispatched to the method summary.ppp, which<br />
computes a sensible set of summary statistics for a point pattern, and prints them at the terminal.<br />
> summary(swedishpines)<br />
Planar point pattern: 71 points<br />
Average intensity 0.0074 points per square unit (one unit = 0.1 metres)<br />
Window: rectangle = [0, 96] x [0, 100] units<br />
Window area = 9600 square units<br />
Unit of length: 0.1 metres<br />
The command density is also generic. It is normally used to compute a kernel density<br />
estimate of a probability distribution from a vector of numbers. (This “default method” is<br />
called density.default.) But there is also a method for point patterns, so that when you type<br />
density(swedishpines), this is dispatched to density.ppp which computes a two-dimensional<br />
kernel estimate of the intensity function.<br />
> plot(density(swedishpines, sigma = 10))<br />
[Figure: image plot of density(swedishpines, sigma = 10), with a colour scale running from 0.002 to 0.014.]<br />
To see a list of all methods available in R for a particular generic function such as plot:<br />
> methods(plot)<br />
To see a list of all methods that are available for a particular class such as ppp:<br />
> methods(class = "ppp")<br />
[1] [.ppp [<-.ppp ...<br />
5.3 Return values<br />
Some commands, such as plot, are used mainly for their side effects and print nothing at the terminal:<br />
> plot(cells)<br />
There’s still a return value from the function, which can be captured by assigning the result<br />
to a variable:<br />
> a <- plot(cells)<br />
> a<br />
NULL<br />
Tip: Many plotting commands return a value which is useful if you want to annotate<br />
the plot. In spatstat the function plot.ppp plots a point pattern and returns<br />
information about the encoding of the marks. After plotting a multitype pattern, to<br />
make a nice legend for the plot, save the result of the plot call and pass it to the<br />
legend command:<br />
> data(lansing)<br />
> a <- plot(lansing)<br />
> legend(-0.25, 0.5, names(a), pch = a)<br />
[Figure: plot of the lansing pattern with a legend listing blackoak, hickory, maple, misc, redoak and whiteoak.]<br />
Tip: To find out the format of the output returned by a particular function fun,<br />
type help(fun) and read the section headed ‘Value’.<br />
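A function can return a value without printing it by wrapping the result in invisible. The sketch below shows this behaviour with a small made-up function, show_range (a hypothetical name used purely for illustration); it mimics the way a plotting command can do its work and still hand back something useful.<br />

```r
# A function that prints a message as a side effect and returns its result invisibly
show_range <- function(v) {
  rng <- range(v)
  cat("range:", rng[1], "to", rng[2], "\n")
  invisible(rng)
}

show_range(c(3, 1, 2))          # the message appears, but no value is auto-printed

rng <- show_range(c(3, 1, 2))   # the return value can still be captured
print(rng)
```

Capturing the result in a variable is all that is needed to get at a value that would otherwise pass unnoticed.<br />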
5.3.2 Returning an object<br />
A function which performs a complicated analysis of your data will typically return an object<br />
belonging to a special class. This is a convenient way to handle calculations that yield large or<br />
complicated output. It enables you to store the result for later use, and provides methods for<br />
handling the result.<br />
Many of the functions in spatstat return an object of a special class. For example, the<br />
value returned by density.ppp is a pixel image (an object of class "im"). This is effectively a<br />
large matrix, giving the values of the kernel estimate of intensity at each point in a fine regular<br />
grid of locations.<br />
> Z <- density(swedishpines, 10)<br />
> Z<br />
real-valued pixel image<br />
100 x 100 pixel array (ny, nx)<br />
enclosing rectangle: [0, 96] x [0, 100] units (one unit = 0.1 metres)<br />
The class of pixel images in spatstat has methods for print, summary, plot and so on.<br />
> summary(Z)<br />
real-valued pixel image<br />
100 x 100 pixel array (ny, nx)<br />
enclosing rectangle: [0, 96] x [0, 100] units<br />
dimensions of each pixel: 0.96 x 1 units<br />
(one unit = 0.1 metres)<br />
Image is defined on the full rectangular grid<br />
Frame area = 9600 square units<br />
Pixel values :<br />
range = [0.00188947243195950,0.0155470858797917]<br />
integral = 71.3036909843861<br />
mean = 0.00742746781087355<br />
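Because the pixel image is effectively a matrix, its contents can be inspected directly. The following sketch assumes that a reasonably recent version of spatstat is installed; the pixel dimensions and numerical values it prints may differ from the output shown above, which comes from spatstat 1.14-5.<br />

```r
library(spatstat)
data(swedishpines)

# Kernel estimate of intensity, returned as a pixel image
Z <- density(swedishpines, sigma = 10)
print(class(Z))

# Extract the pixel values as an ordinary matrix
M <- as.matrix(Z)
print(dim(M))

# Approximate the integral of the intensity: sum of pixel values times pixel area.
# For a kernel estimate of intensity this should be close to the number of points (71).
pixel_area <- Z$xstep * Z$ystep
approx_integral <- sum(M) * pixel_area
print(approx_integral)
```

The components xstep and ystep hold the width and height of one pixel; see help(im) for the full list of components.<br />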
Another example is the command Kest which estimates Ripley’s K-function. The value<br />
returned by Kest is an object of class "fv" (‘function value table’) containing the estimated<br />
values of K(r), obtained using several different estimators, for a range of r values. This class<br />
has methods for print, plot and so on.<br />
> u <- Kest(swedishpines)<br />
> u<br />
Function value object (class ’fv’)<br />
for the function r -> K(r)<br />
Entries:<br />
id      label      description<br />
--      -----      -----------<br />
r       r          distance argument r<br />
theo    Kpois(r)   theoretical Poisson K(r)<br />
border  Kbord(r)   border-corrected estimate of K(r)<br />
trans   Ktrans(r)  translation-corrected estimate of K(r)<br />
iso     Kiso(r)    Ripley isotropic correction estimate of K(r)<br />
--------------------------------------<br />
Default plot formula:<br />
. ~ r<br />
Recommended range of argument r: [0, 24]<br />
Unit of length: 0.1 metres<br />
> plot(u)<br />
[Figure: plot of u, showing K(r) against r (one unit = 0.1 metres).]<br />
6 Point patterns in spatstat<br />
To analyse your own point pattern data in spatstat, you’ll need to read the raw data into R<br />
and convert them into an object of class "ppp". This tutorial gives one basic recipe.<br />
6.1 Basic recipe<br />
In many cases, the observation window is a rectangle. The following steps will then be sufficient.<br />
1. store the x and y coordinates for the points in two vectors x and y.<br />
2. if there are marks attached to the points, store the corresponding marks in a vector m.<br />
(Note: only a single mark value per point is allowed; multivariate marks are not supported.<br />
But we’re working on it.)<br />
3. create the point pattern object by<br />
> ppp(x, y, xrange, yrange)<br />
or, if there are marks,<br />
> ppp(x, y, xrange, yrange, marks = m)<br />
where xrange, yrange are vectors of length 2 giving the minimum and maximum x and y<br />
coordinates of the rectangular window.<br />
The value returned by the function ppp is an object of class "ppp" representing a point<br />
pattern.<br />
If the window is not a rectangle, then you need to use a command like<br />
> ppp(x, y, window = W)<br />
where W is a window object. See Section 7.5 for details on how to do this.<br />
Entering coordinate data<br />
Suppose we have recorded the x,y coordinates of 25 points that lie in a rectangle [0,2] × [0,1].<br />
They can be entered into R in various ways, for example by typing them directly:<br />
> x <- ...<br />
> y <- ...<br />
> P <- ppp(x, y, c(0, 2), c(0, 1))<br />
> plot(P)<br />
[Figure: plot of the point pattern P.]<br />
> P<br />
planar point pattern: 25 points<br />
window: rectangle = [0, 2] x [0, 1] units<br />
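Here is a minimal runnable sketch of the basic recipe. It assumes spatstat is installed, and it uses simulated coordinates and marks in place of recorded data (an assumption made only so that the example is self-contained).<br />

```r
library(spatstat)

# Step 1: coordinates of 25 points in the rectangle [0,2] x [0,1] (simulated here)
set.seed(2008)
x <- runif(25, min = 0, max = 2)
y <- runif(25, min = 0, max = 1)

# Step 2 (optional): one mark value per point (simulated here)
m <- runif(25, min = 0, max = 40)

# Step 3: create the point pattern objects
P <- ppp(x, y, c(0, 2), c(0, 1))               # unmarked pattern
Q <- ppp(x, y, c(0, 2), c(0, 1), marks = m)    # marked pattern

print(P)
print(Q)
```

In practice the vectors x, y and m would usually come from a data file, for example from the columns of a data frame read in with read.table.<br />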
Marked point pattern<br />
Mark values may have any atomic type: numeric, integer, character, logical, or complex. For<br />
example, let’s take a vector of real numbers:<br />
> m <- ...<br />
> Q <- ppp(x, y, c(0, 2), c(0, 1), marks = m)<br />
> Q<br />
marked planar point pattern: 25 points<br />
marks are numeric, of type ’double’<br />
window: rectangle = [0, 2] x [0, 1] units<br />
> plot(Q)<br />
         0         10         20         30         40<br />
0.00000000 0.04323888 0.08647777 0.12971665 0.17295553<br />
The last line of output is the return value from plot(Q), which indicates the scale used to plot<br />
the marks. The mark value 10 was plotted as a circle of radius 0.0432.<br />
[Figure: plot of the marked point pattern Q, with circles scaled according to the marks.]<br />
Categorical marks<br />
When the mark is a categorical variable, we have a multitype point pattern. The ‘types’ are the<br />
different levels of the mark variable. The mark values should be stored as a ‘factor’ in R.<br />
For example, let’s attach random marks to the pattern, taking two possible values Yes and<br />
No with equal probability.<br />
> m <- sample(c("Yes", "No"), 25, replace = TRUE)<br />
> m <- factor(m)<br />
> YN <- ppp(x, y, c(0, 2), c(0, 1), marks = m)<br />
> YN<br />
marked planar point pattern: 25 points<br />
multitype, with levels = No Yes<br />
window: rectangle = [0, 2] x [0, 1] units<br />
> plot(YN)<br />
 No Yes<br />
  1   2<br />
[Figure: plot of the multitype point pattern YN.]<br />
If the marks are intended to be a categorical variable, ensure that m is stored as<br />
a ‘factor’.<br />
The output printed by plot(YN) indicates how the marks were plotted: the mark No was plotted as<br />
symbol 1 (circle) and mark Yes was plotted as symbol 2 (triangle).<br />
Notice that the factor levels have been re-sorted alphabetically (by default). This is one of<br />
the common slip-ups with factors in R. To stipulate a different ordering of the levels,<br />
> m <- factor(m, levels = c("Yes", "No"))<br />
> YN <- ppp(x, y, c(0, 2), c(0, 1), marks = m)<br />
> YN<br />
marked planar point pattern: 25 points<br />
multitype, with levels = Yes No<br />
window: rectangle = [0, 2] x [0, 1] units<br />
Tip: whenever you create a factor, check that the factor levels are as you intended,<br />
using levels(x).<br />
Other ways of adding marks to a point pattern will be described in Section 25.<br />
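The behaviour of factor levels can be checked in base R without creating a point pattern at all. This short sketch uses a made-up vector of four responses.<br />

```r
# Character data converted to a factor: levels are sorted alphabetically by default
responses <- c("Yes", "No", "No", "Yes")

f_default <- factor(responses)
print(levels(f_default))        # "No" comes before "Yes"

# Stipulating the order of the levels explicitly
f_ordered <- factor(responses, levels = c("Yes", "No"))
print(levels(f_ordered))

# The underlying integer codes change with the level order
print(as.integer(f_default))
print(as.integer(f_ordered))
```

The integer codes matter in practice, because plotting commands typically assign symbols and colours according to the position of each level.<br />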
6.2 Checking data<br />
It is prudent to check for quirks in the data.<br />
• Print out the coordinate values and marks to check for errors in data entry, and to determine<br />
whether the coordinates have been rounded.<br />
• Duplicated points are surprisingly common in data files (i.e. where two records in the file<br />
refer to the same (x,y) location). Once you have entered the coordinates into R as a two-column<br />
matrix or a data frame D say, you can check for duplication using the command<br />
any(duplicated(D)). If your data are already in the form of a point pattern X, you can<br />
also type any(duplicated(X)) to detect duplication. To remove duplicated points, type<br />
Y <- unique(X).<br />
• Plotting the point pattern is always wise. Look for unexpected patterns, and points that<br />
lie outside the window.<br />
• On a plot of a point pattern X, you can identify an individual point by typing plot(X);<br />
identify(X) then clicking on the point.<br />
The function ppp automatically checks for duplicated points, and for points that lie outside<br />
the specified window.<br />
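The checks on raw coordinates can be sketched in base R as follows. The data frame D below is made up for illustration, with one deliberately repeated record and one point lying outside an assumed window [0,2] × [0,1].<br />

```r
# Made-up coordinate records: row 3 repeats row 1, and row 4 lies outside the window
D <- data.frame(x = c(0.1, 0.5, 0.1, 2.3),
                y = c(0.2, 0.7, 0.2, 0.4))

# Are any records duplicated?
print(any(duplicated(D)))

# Keep only the first occurrence of each location
D_unique <- unique(D)
print(nrow(D_unique))

# Which records fall outside the (assumed) rectangular window [0,2] x [0,1]?
outside <- D$x < 0 | D$x > 2 | D$y < 0 | D$y > 1
print(D[outside, ])
```

Running such checks before calling ppp makes it easier to trace a problem back to a particular record in the data file.<br />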
6.3 Units<br />
A point pattern X may include information about the units of length in which the x and y<br />
coordinates are recorded. This information is optional; it merely enables the package to print<br />
better reports and to annotate the axes in plots.<br />
If the x and y coordinates in the point pattern P were recorded in metres, type<br />
> unitname(P) <- c("metre", "metres")<br />
The package help file help(spatstat) lists all the available options.<br />
Note that it is a standard naming convention in R that, for a class "foo", there should<br />
be a ‘creator’ function foo that creates objects of this class from raw numerical data, and a<br />
‘converter’ function as.foo that converts data from other formats into objects of class "foo".<br />
We adhere to this convention in spatstat:<br />
Class     Creator   Converter<br />
"ppp"     ppp       as.ppp<br />
"owin"    owin      as.owin<br />
"im"      im        as.im<br />
More alternatives for using ppp will be covered in Section 7.5.<br />
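The creator and converter convention can be illustrated with a toy class in base R. In the sketch below the class "interval" and its functions are entirely hypothetical; base R itself follows the same pattern with data.frame and as.data.frame.<br />

```r
# Creator: builds an object of (hypothetical) class "interval" from raw numbers
interval <- function(lo, hi) {
  stopifnot(is.numeric(lo), is.numeric(hi), lo <= hi)
  structure(list(lo = lo, hi = hi), class = "interval")
}

# Converter: a generic function with one method per input format
as.interval <- function(x, ...) UseMethod("as.interval")
as.interval.interval <- function(x, ...) x                            # already an interval
as.interval.numeric  <- function(x, ...) interval(min(x), max(x))     # range of a numeric vector

a <- interval(0, 2)              # created from raw numerical data
b <- as.interval(c(3, 1, 2))     # converted from a numeric vector
print(class(b))
print(c(b$lo, b$hi))

# The same convention in base R
df <- as.data.frame(matrix(1:4, nrow = 2))
print(class(df))
```

Making the converter generic means that support for a new input format can be added later simply by writing another method.<br />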
7 W<strong>in</strong>dows <strong>in</strong> spatstat<br />
Many comm<strong>and</strong>s <strong>in</strong> spatstat require us to specify a w<strong>in</strong>dow, study region or doma<strong>in</strong>. It will<br />
be h<strong>and</strong>y to know more about w<strong>in</strong>dows <strong>in</strong> spatstat.<br />
An object of class "owin" (“observation window”) represents a region or window in two-dimensional<br />
space. The window may be<br />
• a rectangle;<br />
• a polygon or polygons, with polygonal holes; or<br />
• an irregular shape represented by a b<strong>in</strong>ary pixel image mask.<br />
[Figure: examples of a rectangular window, a polygonal window and a binary mask window.]<br />
Objects <strong>of</strong> this class are created by the function ow<strong>in</strong>. There are methods for pr<strong>in</strong>t<strong>in</strong>g <strong>and</strong><br />
plott<strong>in</strong>g w<strong>in</strong>dows, <strong>and</strong> numerous geometrical operations.<br />
7.1 Mak<strong>in</strong>g w<strong>in</strong>dows<br />
7.1.1 Rectangular w<strong>in</strong>dow<br />
To create a rectangular w<strong>in</strong>dow, type<br />
> ow<strong>in</strong>(xrange, yrange)<br />
where xrange, yrange are vectors <strong>of</strong> length 2 giv<strong>in</strong>g the x <strong>and</strong> y dimensions, respectively, <strong>of</strong><br />
the rectangle.<br />
> ow<strong>in</strong>(c(0, 3), c(1, 2))<br />
w<strong>in</strong>dow: rectangle = [0, 3] x [1, 2] units<br />
For a square w<strong>in</strong>dow you can also use square:<br />
> square(5)<br />
w<strong>in</strong>dow: rectangle = [0, 5] x [0, 5] units<br />
7.1.2 Circular w<strong>in</strong>dow<br />
For a circular window use disc:<br />
> W <- disc(radius = 3, centre = c(0, 0))<br />
7.1.3 Polygonal w<strong>in</strong>dow<br />
Spatstat supports polygonal w<strong>in</strong>dows <strong>of</strong> arbitrary shape <strong>and</strong> topology. That is, the boundary<br />
<strong>of</strong> the w<strong>in</strong>dow may consist <strong>of</strong> one or more closed polygonal curves, which do not <strong>in</strong>tersect<br />
themselves or each other. The w<strong>in</strong>dow may have ‘holes’. Type<br />
> ow<strong>in</strong>(poly = p)<br />
or<br />
> ow<strong>in</strong>(poly = p, xrange, yrange)<br />
to create a polygonal w<strong>in</strong>dow. The argument poly=p <strong>in</strong>dicates that the w<strong>in</strong>dow is polygonal<br />
<strong>and</strong> its boundary is given by the dataset p. Note we must use the “name=value” syntax to give<br />
the argument poly. The arguments xrange <strong>and</strong> yrange are optional here; if they are absent,<br />
the x <strong>and</strong> y dimensions <strong>of</strong> the bound<strong>in</strong>g rectangle will be computed from the polygon.<br />
If the w<strong>in</strong>dow boundary is a s<strong>in</strong>gle polygon, then p should be a list with components x <strong>and</strong> y<br />
giv<strong>in</strong>g the coord<strong>in</strong>ates <strong>of</strong> the vertices <strong>of</strong> the w<strong>in</strong>dow boundary, traversed anticlockwise. For<br />
example, the triangle with corners (0,0), (1,0) and (0,1) is created by<br />
> Z <- owin(poly = list(x = c(0, 1, 0), y = c(0, 0, 1)))<br />
> plot(Z)<br />
Note that polygons should not be closed, i.e. the last vertex should not equal the first<br />
vertex. The same convention is used <strong>in</strong> the st<strong>and</strong>ard plott<strong>in</strong>g function polygon().<br />
If the w<strong>in</strong>dow boundary consists <strong>of</strong> several separate polygons, then p should be a list, each<br />
<strong>of</strong> whose components p[[i]] is a list with components x <strong>and</strong> y describ<strong>in</strong>g one <strong>of</strong> the polygons.<br />
The vertices <strong>of</strong> each polygon should be traversed anticlockwise for external boundaries <strong>and</strong><br />
clockwise for internal boundaries (holes). For example, the following creates a triangle<br />
with a square hole.<br />
> Z <- owin(poly = list(list(x = c(0, 8, 0), y = c(0, 0, 8)),<br />
+     list(x = c(2, 2, 3, 3), y = c(2, 3, 3, 2))))<br />
> plot(Z)<br />
Notice that the first boundary polygon is traversed anticlockwise <strong>and</strong> the second clockwise,<br />
because it is a hole.<br />
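The direction in which a list of vertices is traversed can be checked numerically before the polygon is passed to owin. The following is a minimal sketch in base R; the helper signed_area is hypothetical (it is not part of spatstat) and uses the standard shoelace formula, which gives a positive value for anticlockwise traversal and a negative value for clockwise traversal.<br />

```r
# Signed area of a polygon by the shoelace formula:
# positive for anticlockwise traversal, negative for clockwise.
# The polygon is not closed: the last vertex is joined back to the first.
signed_area <- function(x, y) {
  n <- length(x)
  xnext <- x[c(2:n, 1)]   # vertices shifted by one, wrapping around
  ynext <- y[c(2:n, 1)]
  sum(x * ynext - xnext * y) / 2
}

tri <- list(x = c(0, 1, 0), y = c(0, 0, 1))   # corners (0,0), (1,0), (0,1)

a_forward <- signed_area(tri$x, tri$y)              # 0.5: anticlockwise
a_reverse <- signed_area(rev(tri$x), rev(tri$y))    # -0.5: clockwise
```

A boundary whose signed area has the wrong sign for its role (outer boundary or hole) can be corrected by reversing the order of its vertices with rev.<br />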
It is <strong>of</strong>ten useful to plot a polygonal w<strong>in</strong>dow with l<strong>in</strong>e shad<strong>in</strong>g:<br />
> plot(Z, hatch = TRUE)<br />
7.1.4 B<strong>in</strong>ary mask<br />
A w<strong>in</strong>dow may be def<strong>in</strong>ed by a discrete pixel approximation. Type<br />
ow<strong>in</strong>(mask=m, xrange, yrange)<br />
to create the window object. Here m should be a matrix with logical entries; it will be interpreted<br />
as a binary pixel image whose entries are TRUE where the corresponding pixel belongs to the<br />
window.<br />
The rectangle with dimensions xrange, yrange is divided into equal rectangular pixels. The<br />
correspondence between matrix indices m[i,j] and cartesian coordinates is slightly idiosyncratic:<br />
the rows <strong>of</strong> m correspond to the y coord<strong>in</strong>ate, <strong>and</strong> the columns to the x coord<strong>in</strong>ate. The entry<br />
m[i,j] is TRUE if the <strong>po<strong>in</strong>t</strong> (xx[j],yy[i]) (sic) belongs to the w<strong>in</strong>dow, where xx, yy are<br />
vectors <strong>of</strong> pixel coord<strong>in</strong>ates equally spaced over xrange <strong>and</strong> yrange respectively. The length <strong>of</strong><br />
xx is ncol(m) while the length <strong>of</strong> yy is nrow(m).<br />
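The row/column convention is easy to get wrong, so here is a small sketch in base R that builds a mask matrix for a disc. The grid size, the disc and the use of pixel centres for xx and yy are assumptions made for illustration.<br />

```r
xrange <- c(0, 4)
yrange <- c(0, 2)
nc <- 8   # number of columns = pixels in the x direction
nr <- 4   # number of rows    = pixels in the y direction

# Pixel coordinates, equally spaced over xrange and yrange
# (assumption: pixel centres rather than pixel corners).
dx <- diff(xrange) / nc
dy <- diff(yrange) / nr
xx <- seq(xrange[1] + dx / 2, xrange[2] - dx / 2, length.out = nc)
yy <- seq(yrange[1] + dy / 2, yrange[2] - dy / 2, length.out = nr)

# m[i, j] refers to the point (xx[j], yy[i]): rows follow y, columns follow x.
# Here the window is the disc of radius 1 centred at (2, 1).
m <- outer(yy, xx, function(y, x) (x - 2)^2 + (y - 1)^2 <= 1)

dim(m)   # nr rows, nc columns
```

The matrix m could then be supplied as owin(mask = m, xrange = xrange, yrange = yrange).<br />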
In some GIS applications the study region will be given as a b<strong>in</strong>ary pixel image. A safe<br />
strategy is to dump the data from the GIS system to a text file, <strong>and</strong> read the text file <strong>in</strong>to R<br />
us<strong>in</strong>g scan. Then reformat it as a matrix, <strong>and</strong> use ow<strong>in</strong> to create the w<strong>in</strong>dow object.<br />
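The scan-and-reformat step might look like the following sketch in base R. The text layout (one image row per line, 1 for pixels inside the region and 0 outside) and the number of columns are assumptions; a real GIS export may differ, and the literal string stands in for the exported file.<br />

```r
# Stand-in for a text file dumped from the GIS: one row of the image per line,
# 1 = inside the study region, 0 = outside (hypothetical layout).
gis_dump <- "0 1 1 0
1 1 1 1
0 1 1 0"

v <- scan(text = gis_dump, quiet = TRUE)   # for a real file, pass the file name
ncols <- 4                                 # assumed known from the GIS export

# scan() reads the values row by row, so fill the matrix by rows,
# converting the 0/1 values to logical.
m <- matrix(as.logical(v), ncol = ncols, byrow = TRUE)

# Many GIS exports list the top (largest y) row first. If the rows of the
# mask are to run in order of increasing y, reverse the row order.
m_flipped <- m[nrow(m):1, , drop = FALSE]
```

Whether the flip is needed depends on the export format, so it is worth plotting the resulting window to check its orientation.<br />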
To convert a rectangle or polygonal window to a binary mask, use as.mask.<br />
> W <- as.mask(Z)<br />
> plot(W)<br />
7.2 Convert<strong>in</strong>g from GIS formats<br />
There is a wide variety <strong>of</strong> s<strong>of</strong>tware packages for h<strong>and</strong>l<strong>in</strong>g <strong>spatial</strong> data, especially Geographical<br />
Information Systems (GIS). These packages use many different formats to represent <strong>spatial</strong> data.<br />
Typically spatstat does not support these formats: this would not be good s<strong>of</strong>tware design.<br />
Specialised R packages exist for h<strong>and</strong>l<strong>in</strong>g different <strong>spatial</strong> data file formats. The most useful<br />
ones are rgdal, shapefiles and maptools. These packages will make it possible for you to read<br />
your data from a file <strong>in</strong>to an R session. The rgdal package has the most functionality, but can<br />
sometimes be difficult to <strong>in</strong>stall, as it requires <strong>in</strong>stallation <strong>of</strong> an external library on your system.<br />
The packages shapefiles <strong>and</strong> maptools have no such difficulty.<br />
The package sp provides generic support for <strong>spatial</strong> data types <strong>in</strong> R. It enables you to convert<br />
between different representations <strong>of</strong> your data <strong>in</strong> R.<br />
The usual procedure for convert<strong>in</strong>g <strong>spatial</strong> data is:<br />
1. read your data file <strong>in</strong>to R us<strong>in</strong>g a package designed specifically for that file format (e.g.<br />
shapefiles for ESRI shapefiles), convert<strong>in</strong>g it <strong>in</strong>to an R dataset;<br />
2. convert this R dataset <strong>in</strong>to a generic format used by sp;<br />
3. convert the generic sp format to the required spatstat format, us<strong>in</strong>g sp.<br />
For example, if your w<strong>in</strong>dow (<strong>spatial</strong> region) is supplied as an “ESRI shapefile” with a name<br />
like myfile.shp, then type the following:<br />
> library(maptools)<br />
> S <- readShapePoly("myfile.shp")<br />
> library(sp)<br />
> SP <- as(S, "SpatialPolygons")<br />
> W <- as(SP, "owin")<br />
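For a point pattern supplied as a shapefile, the analogous steps are:<br />
> S <- readShapePoints("myfile.shp")<br />
> SP <- as(S, "SpatialPoints")<br />
> P <- as(SP, "ppp")<br />
7.3 Functions that return a window<br />
> elev <- bei.extra$elev<br />
> W <- levelset(elev, 145, ">")<br />
> plot(W)<br />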
The result W is a w<strong>in</strong>dow.<br />
The accompany<strong>in</strong>g dataset bei.extra$grad is a pixel image <strong>of</strong> the slope (gradient) <strong>of</strong> the<br />
terrain. To find the subset where altitude is below 140 and slope exceeds 0.1,<br />
> grad <- bei.extra$grad<br />
> V <- solutionset(elev < 140 & grad > 0.1)<br />
> plot(V)<br />
7.4 Operations on w<strong>in</strong>dows<br />
Basic methods for the class "ow<strong>in</strong>" <strong>in</strong>clude<br />
pr<strong>in</strong>t.ow<strong>in</strong> pr<strong>in</strong>t short description <strong>of</strong> a w<strong>in</strong>dow<br />
summary.ow<strong>in</strong> pr<strong>in</strong>t detailed summary <strong>of</strong> a w<strong>in</strong>dow<br />
plot.ow<strong>in</strong> plot a w<strong>in</strong>dow<br />
Numerous geometrical operations are implemented for w<strong>in</strong>dow objects. They <strong>in</strong>clude:<br />
area.ow<strong>in</strong> compute w<strong>in</strong>dow’s area<br />
diameter compute w<strong>in</strong>dow’s diameter<br />
<strong>in</strong>tersect.ow<strong>in</strong> <strong>in</strong>tersection <strong>of</strong> two w<strong>in</strong>dows<br />
union.ow<strong>in</strong> union <strong>of</strong> two w<strong>in</strong>dows<br />
bound<strong>in</strong>g.box F<strong>in</strong>d a tight bound<strong>in</strong>g box for the w<strong>in</strong>dow<br />
complement.ow<strong>in</strong> swap <strong>in</strong>side <strong>and</strong> outside<br />
rotate rotate w<strong>in</strong>dow<br />
shift translate w<strong>in</strong>dow<br />
aff<strong>in</strong>e apply aff<strong>in</strong>e transformation<br />
rescale change scale <strong>and</strong> adjust units<br />
as.mask convert to b<strong>in</strong>ary image mask<br />
dilate.ow<strong>in</strong> morphological dilation<br />
erode.ow<strong>in</strong> morphological erosion<br />
eroded.areas compute areas <strong>of</strong> eroded w<strong>in</strong>dows<br />
<strong>in</strong>side.ow<strong>in</strong> determ<strong>in</strong>e whether a <strong>po<strong>in</strong>t</strong> is <strong>in</strong>side a w<strong>in</strong>dow<br />
distmap.ow<strong>in</strong> distance transform image<br />
centroid.ow<strong>in</strong> compute centroid (centre <strong>of</strong> mass) <strong>of</strong> w<strong>in</strong>dow<br />
is.subset.ow<strong>in</strong> determ<strong>in</strong>e whether one w<strong>in</strong>dow conta<strong>in</strong>s another<br />
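A few of these operations can be tried on two overlapping squares. This is a sketch under the assumption that the function names listed above are available in the installed version of spatstat; the windows A and B are made up for illustration.<br />

```r
library(spatstat)

A <- owin(c(0, 2), c(0, 2))
B <- owin(c(1, 3), c(1, 3))

AB  <- intersect.owin(A, B)   # overlap: the square [1, 2] x [1, 2]
AuB <- union.owin(A, B)       # union of the two squares

area_A   <- area.owin(A)      # 2 x 2 = 4
area_AB  <- area.owin(AB)     # 1 x 1 = 1
area_AuB <- area.owin(AuB)    # 4 + 4 - 1 = 7

sub_ok <- is.subset.owin(AB, A)               # is the overlap contained in A?
in_A   <- inside.owin(x = 2.5, y = 2.5, w = A)   # is (2.5, 2.5) inside A?
```

The areas obey inclusion-exclusion, the overlap is a subset of A, and the point (2.5, 2.5) lies in B but not in A.<br />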
7.5 Creat<strong>in</strong>g a <strong>po<strong>in</strong>t</strong> pattern <strong>in</strong> any w<strong>in</strong>dow<br />
As we saw <strong>in</strong> Section 6.1, the function ppp() will create a <strong>po<strong>in</strong>t</strong> pattern (an object <strong>of</strong> class<br />
"ppp") from raw numerical data <strong>in</strong> R.<br />
Suppose the x,y coord<strong>in</strong>ates <strong>of</strong> the <strong>po<strong>in</strong>t</strong>s <strong>of</strong> the pattern are conta<strong>in</strong>ed <strong>in</strong> vectors x <strong>and</strong> y <strong>of</strong><br />
equal length. Then<br />
ppp(x, y, other.arguments)<br />
will create the <strong>po<strong>in</strong>t</strong> pattern. The ‘other arguments’ must determ<strong>in</strong>e a w<strong>in</strong>dow for the pattern,<br />
<strong>in</strong> one <strong>of</strong> two ways:<br />
• the other arguments can be passed to ow<strong>in</strong> to determ<strong>in</strong>e a w<strong>in</strong>dow:<br />
ppp(x, y, xrange, yrange) <strong>po<strong>in</strong>t</strong> pattern <strong>in</strong> rectangle<br />
ppp(x, y, poly=p) <strong>po<strong>in</strong>t</strong> pattern <strong>in</strong> polygonal w<strong>in</strong>dow<br />
ppp(x, y, poly=p, xrange, yrange) <strong>po<strong>in</strong>t</strong> pattern <strong>in</strong> polygonal w<strong>in</strong>dow<br />
ppp(x, y, mask=m, xrange, yrange) <strong>po<strong>in</strong>t</strong> pattern <strong>in</strong> b<strong>in</strong>ary mask w<strong>in</strong>dow<br />
• if W is a w<strong>in</strong>dow object (class "ow<strong>in</strong>") then<br />
> ppp(x, y, w<strong>in</strong>dow = W)<br />
will create the <strong>po<strong>in</strong>t</strong> pattern.<br />
You may already have a w<strong>in</strong>dow W (an object <strong>of</strong> class "ow<strong>in</strong>") ready to h<strong>and</strong>, <strong>and</strong> now want<br />
to create a pattern <strong>of</strong> <strong>po<strong>in</strong>t</strong>s <strong>in</strong> this w<strong>in</strong>dow. For example you may want to put a new <strong>po<strong>in</strong>t</strong><br />
pattern <strong>in</strong>side the w<strong>in</strong>dow <strong>of</strong> an exist<strong>in</strong>g <strong>po<strong>in</strong>t</strong> pattern X; the w<strong>in</strong>dow is accessed as X$w<strong>in</strong>dow,<br />
so type<br />
ppp(x, y, w<strong>in</strong>dow=X$w<strong>in</strong>dow)<br />
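As a minimal sketch of this idiom, the following creates a pattern in a triangular window and then places a second set of points in the same window. All coordinates are made up for illustration.<br />

```r
library(spatstat)

# An existing pattern X in a polygonal (triangular) window.
tri <- owin(poly = list(x = c(0, 1, 0), y = c(0, 0, 1)))
X <- ppp(x = c(0.1, 0.3, 0.2), y = c(0.2, 0.1, 0.6), window = tri)

# New coordinates, placed in the same window as X.
xnew <- c(0.25, 0.5)
ynew <- c(0.25, 0.1)
Y <- ppp(xnew, ynew, window = X$window)

Y$n   # number of points in the new pattern
```

The new pattern Y inherits the triangular window of X, so any later analysis of Y uses the same study region.<br />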
8 Manipulat<strong>in</strong>g <strong>po<strong>in</strong>t</strong> <strong>patterns</strong><br />
Before proceed<strong>in</strong>g, we need to know more about how to manipulate <strong>and</strong> <strong>in</strong>terrogate <strong>po<strong>in</strong>t</strong> pattern<br />
data.<br />
8.1 Format <strong>of</strong> ppp objects<br />
A <strong>po<strong>in</strong>t</strong> pattern is represented <strong>in</strong> spatstat by an object <strong>of</strong> the class "ppp". This conta<strong>in</strong>s the<br />
coord<strong>in</strong>ates <strong>of</strong> the <strong>po<strong>in</strong>t</strong>s, optional ‘mark’ values attached to the <strong>po<strong>in</strong>t</strong>s, <strong>and</strong> a description <strong>of</strong> the<br />
study region or <strong>spatial</strong> ‘w<strong>in</strong>dow’.<br />
8.1.1 Format<br />
A <strong>po<strong>in</strong>t</strong> pattern object P has the follow<strong>in</strong>g components:<br />
• P$n is the number <strong>of</strong> <strong>po<strong>in</strong>t</strong>s (which may be zero).<br />
• P$x is a numeric vector conta<strong>in</strong><strong>in</strong>g the x coord<strong>in</strong>ates <strong>of</strong> the <strong>po<strong>in</strong>t</strong>s. Its length equals P$n<br />
(<strong>and</strong> may be zero).<br />
• P$y is a numeric vector conta<strong>in</strong><strong>in</strong>g the y coord<strong>in</strong>ates <strong>of</strong> the <strong>po<strong>in</strong>t</strong>s. Its length also equals<br />
P$n.<br />
• P$marks conta<strong>in</strong>s the marks. It is either NULL, or a vector <strong>of</strong> length P$n conta<strong>in</strong><strong>in</strong>g the<br />
mark values. The entries <strong>of</strong> P$marks may be <strong>of</strong> any atomic type (character, numeric,<br />
logical, complex).<br />
• P$window is an object of class "owin" (“observation window”) determining the study region<br />
or <strong>spatial</strong> ‘w<strong>in</strong>dow’.<br />
You can extract these components <strong>in</strong>dividually; for example, to make a histogram <strong>of</strong> the<br />
x coord<strong>in</strong>ates just type hist(P$x). However, do not assign values to these components<br />
directly, or you may create <strong>in</strong>consistencies <strong>in</strong> the data which cause spatstat to crash. To<br />
manipulate <strong>po<strong>in</strong>t</strong> <strong>patterns</strong>, use the functions provided.<br />
Although a <strong>po<strong>in</strong>t</strong> pattern should be treated as an unordered set, the coord<strong>in</strong>ates are obviously<br />
stored in a particular order, and can be addressed using that order.<br />
> data(longleaf)<br />
> x <- longleaf$x<br />
> y <- longleaf$y<br />
> diameter <- longleaf$marks<br />
> cbind(x, y, diameter)[1:5, ]<br />
x y diameter<br />
[1,] 200.0 8.8 32.9<br />
[2,] 199.3 10.0 53.5<br />
[3,] 193.6 22.4 68.0<br />
[4,] 167.7 35.6 17.7<br />
[5,] 183.9 45.4 36.9<br />
If the marks are a categorical variable, then P$marks is a factor.<br />
> data(chorley)<br />
> x <- chorley$x<br />
> y <- chorley$y<br />
> type <- chorley$marks<br />
> data.frame(x, y, type)[55:60, ]<br />
x y type<br />
55 355.6 413.9 larynx<br />
56 355.5 413.9 larynx<br />
57 355.7 413.9 larynx<br />
58 355.6 414.1 larynx<br />
59 359.0 417.3 lung<br />
60 353.1 426.9 lung<br />
> is.factor(type)<br />
[1] TRUE<br />
> levels(type)<br />
[1] "larynx" "lung"<br />
> table(type)<br />
type<br />
larynx lung<br />
58 978<br />
8.1.2 A <strong>po<strong>in</strong>t</strong> pattern needs a w<strong>in</strong>dow<br />
Note especially that, when you create a new <strong>po<strong>in</strong>t</strong> pattern object, you need to specify the <strong>spatial</strong><br />
region or w<strong>in</strong>dow <strong>in</strong> which the pattern was observed. In spatstat, the observation w<strong>in</strong>dow is an<br />
<strong>in</strong>tegral part <strong>of</strong> the <strong>po<strong>in</strong>t</strong> pattern. A <strong>po<strong>in</strong>t</strong> pattern dataset consists <strong>of</strong> knowledge about where<br />
<strong>po<strong>in</strong>t</strong>s were not observed, as well as the locations where they were observed. Even someth<strong>in</strong>g as<br />
simple as estimat<strong>in</strong>g the <strong>in</strong>tensity <strong>of</strong> the pattern depends on the w<strong>in</strong>dow <strong>of</strong> observation. It would<br />
be wrong, or at least different, to analyze a <strong>po<strong>in</strong>t</strong> pattern dataset by “guess<strong>in</strong>g” the appropriate<br />
w<strong>in</strong>dow (e.g. by comput<strong>in</strong>g the convex hull <strong>of</strong> the <strong>po<strong>in</strong>t</strong>s). An analogy may be drawn with the<br />
difference between sequential experiments <strong>and</strong> experiments <strong>in</strong> which the sample size is fixed a<br />
priori.<br />
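A small numerical sketch in base R shows how much the choice of window can matter. The four points are made up, and the bounding box of the points is used as the guessed window because it is simpler to compute than the convex hull mentioned above.<br />

```r
# Four points observed in the unit square [0, 1] x [0, 1] (made-up data).
x <- c(0.2, 0.4, 0.6, 0.8)
y <- c(0.3, 0.7, 0.4, 0.6)

# Intensity estimate using the true observation window (area 1).
lambda_true <- length(x) / 1

# Intensity estimate using a window guessed from the data:
# the bounding box of the points, which is smaller than the true window.
bbox_area <- diff(range(x)) * diff(range(y))   # 0.6 x 0.4 = 0.24
lambda_guess <- length(x) / bbox_area
```

The guessed window ignores the empty space around the points, so lambda_guess is several times larger than lambda_true even though the data are identical.<br />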
Often, the w<strong>in</strong>dow <strong>of</strong> observation is a rectangle, so this requirement just means that we have<br />
to specify the x <strong>and</strong> y dimensions <strong>of</strong> the rectangle when we create the <strong>po<strong>in</strong>t</strong> pattern. W<strong>in</strong>dows<br />
with a more complicated shape can easily be represented <strong>in</strong> spatstat, as described below.<br />
For situations where the w<strong>in</strong>dow is really unknown, spatstat provides the function ripras<br />
to compute the Ripley-Rasson estimator <strong>of</strong> the w<strong>in</strong>dow, given only the <strong>po<strong>in</strong>t</strong> locations.<br />
8.2 Operations on ppp objects<br />
Directly manipulat<strong>in</strong>g the entries <strong>in</strong>side an object is not safe. It is also unnecessary, because<br />
these manipulations can be performed us<strong>in</strong>g functions or operators.<br />
For <strong>po<strong>in</strong>t</strong> <strong>patterns</strong> (objects <strong>of</strong> class "ppp") there are the follow<strong>in</strong>g operations.<br />
8.2.1 Extract<strong>in</strong>g subsets<br />
Recall that <strong>in</strong> R the subset operator is [ ]. If x is a vector <strong>of</strong> numbers, then x[s] extracts an<br />
element or subset <strong>of</strong> x. The subset <strong>in</strong>dex s can be<br />
• a positive <strong>in</strong>teger: x[3] means the third element <strong>of</strong> x;<br />
• a vector <strong>of</strong> positive <strong>in</strong>tegers <strong>in</strong>dicat<strong>in</strong>g which elements to extract: x[c(2,4,6)] extracts<br />
the 2nd, 4th <strong>and</strong> 6th elements <strong>of</strong> x;<br />
• a vector <strong>of</strong> negative <strong>in</strong>tegers <strong>in</strong>dicat<strong>in</strong>g which elements not to extract: x[-1] means all<br />
elements <strong>of</strong> x except the first one;<br />
• a vector <strong>of</strong> logical values, <strong>of</strong> the same length as x, with each TRUE entry <strong>of</strong> s <strong>in</strong>dicat<strong>in</strong>g<br />
that the correspond<strong>in</strong>g entry <strong>of</strong> x should be extracted, <strong>and</strong> FALSE <strong>in</strong>dicat<strong>in</strong>g that it should<br />
not be extracted. For example x[x > 3.1] extracts those elements <strong>of</strong> x which are greater<br />
than 3.1.<br />
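The four kinds of subset index can be tried on a short vector. The values are made up for illustration.<br />

```r
x <- c(2.5, 3.7, 1.2, 4.8, 3.0, 5.1)

one   <- x[3]             # single positive integer: the third element
some  <- x[c(2, 4, 6)]    # vector of positive integers: 2nd, 4th and 6th elements
rest  <- x[-1]            # negative integer: everything except the first element
large <- x[x > 3.1]       # logical vector: the elements greater than 3.1
```

Note that x > 3.1 is itself a logical vector of the same length as x, which is why it can be used as an index.<br />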
To extract a subset <strong>of</strong> a <strong>po<strong>in</strong>t</strong> pattern <strong>in</strong> spatstat, we also use the subset operator [ ]. If<br />
X is a <strong>po<strong>in</strong>t</strong> pattern then X[s] is also a <strong>po<strong>in</strong>t</strong> pattern, consist<strong>in</strong>g <strong>of</strong> those <strong>po<strong>in</strong>t</strong>s <strong>of</strong> X selected by<br />
the subset index s, where s can be any of the types listed above. (Recall that the points<br />
<strong>in</strong> a <strong>po<strong>in</strong>t</strong> pattern object are stored <strong>in</strong> a particular order; this is the order <strong>in</strong> which they are<br />
<strong>in</strong>dexed by s.)<br />
> data(bei)<br />
> bei<br />
planar <strong>po<strong>in</strong>t</strong> pattern: 3604 <strong>po<strong>in</strong>t</strong>s<br />
w<strong>in</strong>dow: rectangle = [0, 1000] x [0, 500] metres<br />
> bei[1:10]<br />
planar <strong>po<strong>in</strong>t</strong> pattern: 10 <strong>po<strong>in</strong>t</strong>s<br />
w<strong>in</strong>dow: rectangle = [0, 1000] x [0, 500] metres<br />
It is also possible to extract the subset def<strong>in</strong>ed by a <strong>spatial</strong> region. If X is a <strong>po<strong>in</strong>t</strong> pattern<br />
<strong>and</strong> W is a <strong>spatial</strong> w<strong>in</strong>dow (object <strong>of</strong> class "ow<strong>in</strong>") then X[W] is the <strong>po<strong>in</strong>t</strong> pattern consist<strong>in</strong>g <strong>of</strong><br />
all points of X that lie inside W.<br />
> W <- owin(c(100, 800), c(100, 400))<br />
> W<br />
w<strong>in</strong>dow: rectangle = [100, 800] x [100, 400] units<br />
> bei[W]<br />
planar <strong>po<strong>in</strong>t</strong> pattern: 918 <strong>po<strong>in</strong>t</strong>s<br />
w<strong>in</strong>dow: rectangle = [100, 800] x [100, 400] units<br />
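The same operations can be checked by hand on a tiny pattern whose points are easy to count. The coordinates and windows below are made up for illustration.<br />

```r
library(spatstat)

X <- ppp(x = c(1, 2, 3, 4, 5), y = c(5, 4, 3, 2, 1),
         window = owin(c(0, 6), c(0, 6)))

n_first <- X[1:2]$n         # first two points
n_rest  <- X[-1]$n          # all points except the first
n_right <- X[X$x > 2.5]$n   # logical index: points with x coordinate above 2.5

W  <- owin(c(0, 3.5), c(0, 3.5))
XW <- X[W]                  # points of X falling inside W
XW$n
```

Only the point (3, 3) lies inside W, so XW contains a single point, and its window is W rather than the original window of X.<br />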
Tip: You may need to put quotes around the subset operator <strong>in</strong> some contexts.<br />
The generic subset operator is [ but the help file is summoned by typ<strong>in</strong>g help("[").<br />
The subset method for <strong>po<strong>in</strong>t</strong> <strong>patterns</strong> is called [.ppp but the help file is summoned<br />
by typ<strong>in</strong>g help("[.ppp").<br />
The comm<strong>and</strong> split.ppp allows you to divide a <strong>po<strong>in</strong>t</strong> pattern <strong>in</strong>to sub-<strong>patterns</strong>, <strong>and</strong> the<br />
comm<strong>and</strong> by.ppp allows you to perform an operation on each sub-pattern.<br />
8.2.2 Fiddl<strong>in</strong>g with marks<br />
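To extract the marks from a point pattern, use marks:<br />
> m <- marks(X)<br />
To attach marks to an unmarked pattern, use the operator %mark%; to remove marks, use unmark:<br />
> radii <- rexp(redwood$n, rate = 10)<br />
> X <- redwood %mark% radii<br />
> X<br />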
marked planar <strong>po<strong>in</strong>t</strong> pattern: 62 <strong>po<strong>in</strong>t</strong>s<br />
marks are numeric, <strong>of</strong> type ’double’<br />
w<strong>in</strong>dow: rectangle = [0, 1] x [-1, 0] units<br />
> unmark(X)<br />
planar <strong>po<strong>in</strong>t</strong> pattern: 62 <strong>po<strong>in</strong>t</strong>s<br />
w<strong>in</strong>dow: rectangle = [0, 1] x [-1, 0] units<br />
For a <strong>po<strong>in</strong>t</strong> pattern with real-valued marks, the method cut.ppp for the generic function<br />
cut will divide the range <strong>of</strong> mark values <strong>in</strong>to several discrete b<strong>and</strong>s, yield<strong>in</strong>g a <strong>po<strong>in</strong>t</strong> pattern<br />
with categorical marks:<br />
> Y <- cut(X, breaks = c(0, 1, 10, Inf))<br />
> Y<br />
marked planar <strong>po<strong>in</strong>t</strong> pattern: 62 <strong>po<strong>in</strong>t</strong>s<br />
multitype, with levels = (0,1] (1,10] (10,Inf]<br />
w<strong>in</strong>dow: rectangle = [0, 1] x [-1, 0] units<br />
8.2.3 Chang<strong>in</strong>g scales <strong>and</strong> units<br />
A scalar dilation can be applied us<strong>in</strong>g aff<strong>in</strong>e. For example, the Swedish P<strong>in</strong>es data were<br />
recorded in decimetres. To convert the coordinates to metres, we could type<br />
> data(swedishpines)<br />
> X <- affine(swedishpines, mat = diag(c(1/10, 1/10)))<br />
> unitname(X) <- c("metre", "metres")<br />
> X<br />
planar <strong>po<strong>in</strong>t</strong> pattern: 71 <strong>po<strong>in</strong>t</strong>s<br />
w<strong>in</strong>dow: rectangle = [0, 9.6] x [0, 10] metres<br />
The command rescale performs the same function:<br />
> data(swedishpines)<br />
> X <- rescale(swedishpines, 10)<br />
> X<br />
planar <strong>po<strong>in</strong>t</strong> pattern: 71 <strong>po<strong>in</strong>t</strong>s<br />
w<strong>in</strong>dow: rectangle = [0, 9.6] x [0, 10] metres<br />
Beware that this does not change the marks <strong>in</strong> the <strong>po<strong>in</strong>t</strong> pattern. If your marks represent<br />
tree diameter <strong>and</strong> you want to rescale them as well, this must be done by h<strong>and</strong>.<br />
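Rescaling the marks by hand might look like the following sketch. The data are hypothetical (three trees with positions and diameters both recorded in decimetres), and the call assumes that rescale(X, s) divides the coordinates by s, as described in the spatstat help for rescale.<br />

```r
library(spatstat)

# Hypothetical data: tree positions and diameters, all in decimetres.
X <- ppp(x = c(10, 20, 30), y = c(5, 15, 25),
         window = owin(c(0, 40), c(0, 40)),
         marks = c(2.0, 3.5, 1.5))

Y <- rescale(X, 10)                   # coordinates: decimetres -> metres
unitname(Y) <- c("metre", "metres")

old_marks <- marks(Y)                 # unchanged by rescale: still in decimetres
marks(Y) <- marks(Y) / 10             # convert the diameters by hand
```

After the last line both the coordinates and the diameters of Y are expressed in metres.<br />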
8.2.4 Geometrical transformations<br />
The comm<strong>and</strong>s rotate, shift <strong>and</strong> aff<strong>in</strong>e apply two-dimensional rotation, vector shifts, <strong>and</strong><br />
aff<strong>in</strong>e transformations, respectively.<br />
8.2.5 R<strong>and</strong>om perturbations <strong>of</strong> a <strong>po<strong>in</strong>t</strong> pattern<br />
It is sometimes useful to r<strong>and</strong>omise the data, for example for hypothesis test<strong>in</strong>g. The comm<strong>and</strong><br />
rshift will apply the same r<strong>and</strong>om shift to each <strong>po<strong>in</strong>t</strong>, while rjitter will apply a different<br />
r<strong>and</strong>om shift to each <strong>po<strong>in</strong>t</strong>. The comm<strong>and</strong> quadratresample performs a block resampl<strong>in</strong>g<br />
procedure <strong>in</strong> which the w<strong>in</strong>dow is divided <strong>in</strong>to rectangles <strong>and</strong> these rectangles are r<strong>and</strong>omly<br />
resampled.<br />
8.3 Example<br />
We will use one <strong>of</strong> the st<strong>and</strong>ard <strong>po<strong>in</strong>t</strong> pattern datasets that is <strong>in</strong>stalled with the package. The<br />
NZ trees dataset represents the positions of 86 trees in a forest plot 153 by 95 feet.<br />
> data(nztrees)<br />
> nztrees<br />
planar <strong>po<strong>in</strong>t</strong> pattern: 86 <strong>po<strong>in</strong>t</strong>s<br />
w<strong>in</strong>dow: rectangle = [0, 153] x [0, 95] feet<br />
> plot(nztrees)<br />
To get an impression <strong>of</strong> local <strong>spatial</strong> variations <strong>in</strong> <strong>in</strong>tensity, we plot a kernel density estimate<br />
<strong>of</strong> <strong>in</strong>tensity.<br />
> contour(density(nztrees, 10), axes = FALSE)<br />
[Figure: contour plot of density(nztrees, 10).]<br />
The density surface has a steep slope at the top right-h<strong>and</strong> corner <strong>of</strong> the study region.<br />
Look<strong>in</strong>g at the plot <strong>of</strong> the <strong>po<strong>in</strong>t</strong> pattern itself, we can see a cluster <strong>of</strong> trees at the top right.<br />
You may also notice a l<strong>in</strong>e <strong>of</strong> trees at the right-h<strong>and</strong> edge <strong>of</strong> the study region. It looks<br />
as though the study region may have <strong>in</strong>cluded some trees that were planted as a boundary or<br />
avenue. This sticks out like a sore thumb if we plot the x coord<strong>in</strong>ates <strong>of</strong> the trees:<br />
> hist(nztrees$x, nclass = 25)<br />
[Figure: histogram of nztrees$x.]<br />
We might want to exclude the right-h<strong>and</strong> boundary from the study region, to focus on the<br />
pattern <strong>of</strong> the rema<strong>in</strong><strong>in</strong>g trees. Let’s say we decide to trim a 5-foot marg<strong>in</strong> <strong>of</strong>f the right-h<strong>and</strong><br />
side.<br />
First we create the new, trimmed study region:<br />
> chopped <- owin(c(0, 148), c(0, 95))<br />
or, starting from the window of the original pattern,<br />
> win <- nztrees$window<br />
> chopped <- trim.rectangle(win, xmargin = c(0, 5), ymargin = 0)<br />
> chopped<br />
window: rectangle = [0, 148] x [0, 95] feet<br />
(Notice that chopped is not a <strong>po<strong>in</strong>t</strong> pattern, but simply a rectangle <strong>in</strong> the plane.)<br />
Then, us<strong>in</strong>g the subset operator [.ppp, we simply extract the subset <strong>of</strong> the orig<strong>in</strong>al <strong>po<strong>in</strong>t</strong><br />
pattern that lies inside the new window:<br />
> nzchop <- nztrees[chopped]<br />
> summary(nzchop)<br />
Planar <strong>po<strong>in</strong>t</strong> pattern: 78 <strong>po<strong>in</strong>t</strong>s<br />
Average <strong>in</strong>tensity 0.00555 <strong>po<strong>in</strong>t</strong>s per square foot<br />
W<strong>in</strong>dow: rectangle = [0, 148] x [0, 95] feet<br />
W<strong>in</strong>dow area = 14060 square feet<br />
Unit <strong>of</strong> length: 1 foot<br />
> plot(density(nzchop, 10))<br />
> plot(nzchop, add = TRUE)<br />
[Figure: image plot of density(nzchop, 10) with the points of nzchop superimposed.]<br />
Remov<strong>in</strong>g the right marg<strong>in</strong> seems to have produced a much more uniform pattern.<br />
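The figures in the summary above can be verified by hand: the average intensity is simply the number of points divided by the area of the window. A short check in base R, using only the numbers printed in the summary:<br />

```r
n_points <- 78
window_area <- 148 * 95            # width x height of the trimmed rectangle, in square feet
intensity <- n_points / window_area

window_area                        # 14060
round(intensity, 5)                # 0.00555
```

This agrees with the window area of 14060 square feet and the average intensity of 0.00555 points per square foot reported by summary(nzchop).<br />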
8.4 Splitt<strong>in</strong>g <strong>and</strong> comb<strong>in</strong><strong>in</strong>g <strong>po<strong>in</strong>t</strong> <strong>patterns</strong><br />
Sometimes it is useful to split a <strong>po<strong>in</strong>t</strong> pattern dataset <strong>in</strong>to several sub-<strong>patterns</strong>, <strong>and</strong> perform<br />
some calculations on each sub-pattern.<br />
8.4.1 Splitt<strong>in</strong>g a <strong>po<strong>in</strong>t</strong> pattern <strong>in</strong>to sub-<strong>patterns</strong><br />
The powerful R comm<strong>and</strong> split has a method for <strong>po<strong>in</strong>t</strong> <strong>patterns</strong>. This enables the user to<br />
divide a <strong>po<strong>in</strong>t</strong> pattern <strong>in</strong>to sub-<strong>patterns</strong> us<strong>in</strong>g any suitable criterion.<br />
• If X is a marked <strong>po<strong>in</strong>t</strong> pattern, <strong>and</strong> the marks are a factor, then split(X) separates the<br />
data <strong>po<strong>in</strong>t</strong>s <strong>in</strong>to different <strong>po<strong>in</strong>t</strong> <strong>patterns</strong> accord<strong>in</strong>g to their mark value.<br />
• If Z is a pixel image with factor values, then split(X,Z) separates the data <strong>po<strong>in</strong>t</strong>s <strong>in</strong>to<br />
different <strong>po<strong>in</strong>t</strong> <strong>patterns</strong> accord<strong>in</strong>g to the pixel value <strong>of</strong> Z at each <strong>po<strong>in</strong>t</strong>.<br />
• If Z is a tessellation, then split(X,Z) separates the <strong>po<strong>in</strong>t</strong> pattern X <strong>in</strong>to sub-<strong>patterns</strong><br />
del<strong>in</strong>eated by the tiles <strong>of</strong> Z.<br />
In each case the result is a list <strong>of</strong> <strong>po<strong>in</strong>t</strong> <strong>patterns</strong>. You can then use the R comm<strong>and</strong> lapply<br />
to perform any desired operation on each element <strong>of</strong> the list. For example, to apply adaptive<br />
estimation <strong>of</strong> <strong>in</strong>tensity to each species <strong>of</strong> tree <strong>in</strong> the Lans<strong>in</strong>g Woods data,<br />
> data(lans<strong>in</strong>g)<br />
> V <- split(lansing)<br />
> A <- lapply(V, adaptive.density)<br />
> plot(as.listof(A))<br />
A neater way to operate on sub-<strong>patterns</strong> is to use by.ppp, a method for the R function<br />
by. The call by(X, INDICES=Z, FUN=f) is essentially equivalent to lapply(split(X,Z), f).<br />
It splits the dataset X <strong>in</strong>to sub-<strong>patterns</strong> accord<strong>in</strong>g to Z, then applies the function f to each<br />
sub-pattern. So to apply adaptive estimation <strong>of</strong> <strong>in</strong>tensity to each species <strong>of</strong> tree <strong>in</strong> the Lans<strong>in</strong>g<br />
Woods data,<br />
> data(lans<strong>in</strong>g)<br />
> A <- by(lansing, FUN = adaptive.density)<br />
> plot(A)<br />
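As a quick check that split followed by lapply and by give the same answer, the following sketch counts the trees of each species in both ways.<br />
It assumes a current spatstat installation; npoints is used only as a simple stand-in for the function f.<br />

```r
library(spatstat)

# One sub-pattern per species: split() uses the factor-valued marks
V <- split(lansing)
counts_split <- sapply(V, npoints)

# The same calculation with by(); INDICES defaults to the marks
counts_by <- unlist(by(lansing, FUN = npoints))

print(counts_split)
```

Any function that accepts a point pattern (a summary function or a density estimate, for example) could be used in place of npoints.<br />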
8.4.2 Comb<strong>in</strong><strong>in</strong>g <strong>po<strong>in</strong>t</strong> <strong>patterns</strong><br />
Any number <strong>of</strong> <strong>po<strong>in</strong>t</strong> <strong>patterns</strong> can be comb<strong>in</strong>ed to make a s<strong>in</strong>gle pattern, us<strong>in</strong>g superimpose.<br />
> X <- runifpoint(20)<br />
> Y <- runifpoint(10)<br />
> superimpose(X, Y)<br />
planar <strong>po<strong>in</strong>t</strong> pattern: 30 <strong>po<strong>in</strong>t</strong>s<br />
w<strong>in</strong>dow: rectangle = [0, 1] x [0, 1] units<br />
The argument W, if given, specifies the w<strong>in</strong>dow for the comb<strong>in</strong>ed <strong>po<strong>in</strong>t</strong> pattern.<br />
> superimpose(X, Y, W = square(2))<br />
planar <strong>po<strong>in</strong>t</strong> pattern: 30 <strong>po<strong>in</strong>t</strong>s<br />
w<strong>in</strong>dow: rectangle = [0, 2] x [0, 2] units<br />
To attach a separate mark to each component pattern, use argument names:<br />
> superimpose(Hooray = X, Boo = Y)<br />
marked planar <strong>po<strong>in</strong>t</strong> pattern: 30 <strong>po<strong>in</strong>t</strong>s<br />
multitype, with levels = Hooray Boo<br />
w<strong>in</strong>dow: rectangle = [0, 1] x [0, 1] units<br />
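Splitting and superimposing are inverse operations in the following sense: if the component patterns are given names, split recovers them from the combined pattern.<br />
A minimal sketch, assuming two hypothetical random patterns of 20 and 10 points in the unit square:<br />

```r
library(spatstat)
set.seed(42)

X <- runifpoint(20)   # hypothetical component patterns
Y <- runifpoint(10)

# Combine, marking each point by the pattern it came from
XY <- superimpose(Hooray = X, Boo = Y)

# Split the combined pattern by its marks to get the components back
parts <- split(XY)
print(sapply(parts, npoints))
```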
9 Pixel images <strong>in</strong> spatstat<br />
An object <strong>of</strong> class "im" represents a pixel image. It specifies a rectangular grid <strong>of</strong> locations<br />
(“pixels”) <strong>in</strong> two dimensional space, <strong>and</strong> a numerical value for each pixel. The pixel values<br />
can be real numbers, <strong>in</strong>tegers, complex numbers, s<strong>in</strong>gle characters or str<strong>in</strong>gs, logical values or<br />
categorical values. A pixel’s value can also be NA, mean<strong>in</strong>g that it is not def<strong>in</strong>ed at that location.<br />
A pixel image represents a <strong>spatial</strong> function Z(u) <strong>in</strong> many different contexts. It may conta<strong>in</strong><br />
experimental data (such as a map <strong>of</strong> terra<strong>in</strong> elevation) or computed values (such as a kernel<br />
estimate <strong>of</strong> <strong>po<strong>in</strong>t</strong> process <strong>in</strong>tensity) or it may be directly obta<strong>in</strong>ed from a camera (such as a<br />
satellite image).<br />
9.1 Creat<strong>in</strong>g a pixel image<br />
9.1.1 Creat<strong>in</strong>g an image from raw data<br />
To create a pixel image from raw data, use im:<br />
> im(mat, xcol, yrow)<br />
where mat is a matrix conta<strong>in</strong><strong>in</strong>g the pixel values. The pixel values could have been generated<br />
by h<strong>and</strong>, or read from a file.<br />
The correspondence between matrix indices mat[i,j] and cartesian coordinates is slightly<br />
idiosyncratic: the rows of mat correspond to the y coordinate, and the columns to the x coordinate.<br />
The argument xcol is a vector <strong>of</strong> equally-spaced x coord<strong>in</strong>ate values correspond<strong>in</strong>g to the<br />
columns <strong>of</strong> mat, <strong>and</strong> yrow is a vector <strong>of</strong> equally-spaced y coord<strong>in</strong>ate values correspond<strong>in</strong>g to<br />
the rows <strong>of</strong> mat. These vectors determ<strong>in</strong>e the <strong>spatial</strong> position <strong>of</strong> the pixel grid. The length <strong>of</strong><br />
xcol is ncol(mat) while the length <strong>of</strong> yrow is nrow(mat). If mat is not a matrix, it will be<br />
converted <strong>in</strong>to a matrix with nrow(mat) = length(yrow) <strong>and</strong> ncol(mat) = length(xcol).<br />
> vec <- ...<br />
> mat <- ...<br />
> noisy <- im(mat, ...)<br />
> plot(noisy)<br />
[Figure: the image noisy, with a colour ribbon running from −6 to 6.]<br />
For some strange reason, R does not allow matrices with categorical (factor) values. To<br />
create a pixel image with categorical values, leave the pixel values as a vector. The im comm<strong>and</strong><br />
will reshape it:<br />
> cutvec <- cut(vec, 3)<br />
> cutnoise <- im(cutvec, ...)<br />
> plot(cutnoise)<br />
[Figure: the factor-valued image cutnoise, with levels (−7.09,−1.98], (−1.98,3.14] and (3.14,8.25].]<br />
Although mat was a matrix, cutvec is a vector, with factor values. F<strong>in</strong>ally cutnoise is a<br />
factor-valued image.<br />
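The two constructions can be sketched end to end as follows.<br />
The pixel values, the 30 by 40 grid and the coordinate vectors are hypothetical choices made for illustration, not the ones used for the figures above.<br />

```r
library(spatstat)
set.seed(1)

# Hypothetical raw data: 1200 values arranged as 30 rows (y) by 40 columns (x)
vec <- rnorm(1200)
mat <- matrix(vec, nrow = 30, ncol = 40)

# Equally spaced pixel-centre coordinates, one per column and one per row
xc <- seq(0.5, 39.5, by = 1)
yr <- seq(0.5, 29.5, by = 1)

noisy <- im(mat, xcol = xc, yrow = yr)

# Row 1, column 1 of the matrix sits at x = xc[1], y = yr[1]
noisy[list(x = 0.5, y = 0.5)] == mat[1, 1]

# Factor values: pass the vector and let im() reshape it
cutvec <- cut(vec, 3)
cutnoise <- im(cutvec, xcol = xc, yrow = yr)
cutnoise$type
```

When the values are supplied as a vector, xcol and yrow are needed so that im can work out the dimensions of the grid.<br />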
9.1.2 Convert<strong>in</strong>g a function to an image<br />
The comm<strong>and</strong> as.im will convert other types <strong>of</strong> data to a pixel image.<br />
A function f(x,y) can be converted <strong>in</strong>to a pixel image. This makes it easy to create a pixel<br />
image <strong>in</strong> which the pixel values are def<strong>in</strong>ed by an algebraic formula <strong>in</strong> the x <strong>and</strong> y coord<strong>in</strong>ates.<br />
> f <- function(x, y) ...<br />
> w <- ...<br />
> Z <- as.im(f, w)<br />
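For instance, here is a minimal sketch of this conversion.<br />
The formula and the window are hypothetical choices for illustration; the function must be vectorised, because it is evaluated on whole vectors of pixel coordinates.<br />

```r
library(spatstat)

# Hypothetical formula: squared distance from the origin
f <- function(x, y) { x^2 + y^2 }

# Hypothetical window: the square [-1, 1] x [-1, 1]
w <- owin(c(-1, 1), c(-1, 1))

# Evaluate f at the centre of every pixel in a grid covering w
Z <- as.im(f, w)
print(range(as.matrix(Z)))
```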
as.im converts other data to a pixel image<br />
density.ppp kernel smooth<strong>in</strong>g <strong>of</strong> <strong>po<strong>in</strong>t</strong> pattern<br />
density.psp kernel smooth<strong>in</strong>g <strong>of</strong> l<strong>in</strong>e segment pattern<br />
distmap.ow<strong>in</strong> distance function <strong>of</strong> w<strong>in</strong>dow<br />
distmap.ppp distance function <strong>of</strong> <strong>po<strong>in</strong>t</strong> pattern<br />
distmap.psp distance function <strong>of</strong> l<strong>in</strong>e segment pattern<br />
setcov geometric covariance function <strong>of</strong> a w<strong>in</strong>dow<br />
predict.ppm fitted <strong>in</strong>tensity <strong>of</strong> a <strong>po<strong>in</strong>t</strong> process model<br />
[.im subset <strong>of</strong> an image (or look up pixel values)<br />
shift.im vector shift <strong>of</strong> image doma<strong>in</strong><br />
rescale.im rescal<strong>in</strong>g <strong>of</strong> image doma<strong>in</strong><br />
eval.im evaluate any expression <strong>in</strong>volv<strong>in</strong>g images<br />
cut.im convert numeric image to factor image<br />
split.im divide pixel image <strong>in</strong>to sub-images<br />
by.im apply function to subsets <strong>of</strong> pixel image<br />
<strong>in</strong>terp.im <strong>spatial</strong> <strong>in</strong>terpolation <strong>of</strong> image<br />
blur <strong>spatial</strong> blurr<strong>in</strong>g <strong>and</strong> extrapolation <strong>of</strong> image<br />
9.2 Inspect<strong>in</strong>g an image<br />
9.2.1 Plott<strong>in</strong>g an image<br />
Methods for plott<strong>in</strong>g an image object <strong>in</strong>clude:<br />
plot.im display as colour image<br />
contour.im contour plot<br />
persp.im perspective plot <strong>of</strong> surface<br />
These are methods for generic functions, so you would type plot(Z), contour(Z) or persp(Z)<br />
to display a pixel image Z.<br />
> opa <- par(mfrow = c(1, 3))<br />
> data(redwood)<br />
> D <- density(redwood)<br />
> plot(D)<br />
> persp(D)<br />
> contour(D)<br />
> par(opa)<br />
[Figure: three panels showing D as a colour image (colour ribbon from 0 to 120), as a perspective plot, and as a contour plot.]<br />
For plot.im, note that the default colour map for image plots <strong>in</strong> R has only 12 colours<br />
<strong>and</strong> can convey a mislead<strong>in</strong>g impression <strong>of</strong> the gradation <strong>of</strong> pixel values <strong>in</strong> the image. Use the<br />
argument col to control the colour map.<br />
> opa <- par(mfrow = c(1, 2))<br />
> plot(Z)<br />
> plot(Z, col = grey(seq(1, 0, length = 512)))<br />
> par(opa)<br />
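[Figure: two panels showing the image Z with the default colour map and with a 512-level grey scale; colour ribbons from 0 to 1.5.]<br />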
For persp.im, see also the help for persp.default for the names <strong>of</strong> various arguments to<br />
control the appearance <strong>of</strong> the plot. For example, the view<strong>in</strong>g direction is controlled by the angles<br />
theta <strong>and</strong> phi.<br />
> persp(density(redwood), theta = 30)<br />
[Figure: perspective plot of density(redwood).]<br />
Similarly for contour.im, consult also the help file for contour.default to control the<br />
appearance <strong>of</strong> the contours.<br />
For some <strong>in</strong>spir<strong>in</strong>g examples <strong>of</strong> perspective <strong>and</strong> contour plots with beautiful colour schemes<br />
<strong>and</strong> shad<strong>in</strong>g, see the R graphics demonstration by typ<strong>in</strong>g demo(graphics).<br />
9.2.2 Exploratory analysis<br />
To <strong>in</strong>spect an image, the follow<strong>in</strong>g are useful.<br />
as.matrix extract matrix <strong>of</strong> pixel values from image<br />
cut.im convert numeric image to factor image<br />
hist.im histogram <strong>of</strong> pixel values<br />
For an image Z with any type <strong>of</strong> values, plot(cut(Z, 3)) will divide the pixel values <strong>in</strong>to 3<br />
b<strong>and</strong>s, <strong>and</strong> display the image with the 3 b<strong>and</strong>s rendered <strong>in</strong> 3 different colours.<br />
To compute numerical summaries <strong>of</strong> pixel values, like the median or order statistics <strong>of</strong> the<br />
pixel values, extract the pixel values us<strong>in</strong>g as.matrix(Z) then apply the summary operation.<br />
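A minimal sketch of this recipe, using a hypothetical image on the unit square rather than real data:<br />

```r
library(spatstat)

# Hypothetical image: Z(x, y) = x + y on the unit square
Z <- as.im(function(x, y) { x + y }, unit.square())

# Extract the pixel values and flatten them to a plain vector
v <- as.vector(as.matrix(Z))

# Any ordinary summary can now be applied; na.rm guards against undefined pixels
med <- median(v, na.rm = TRUE)
print(med)
print(quantile(v, probs = c(0.25, 0.75), na.rm = TRUE))
```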
9.3 Manipulat<strong>in</strong>g images<br />
9.3.1 Subsets <strong>of</strong> an image<br />
The subset operator [ has a method for pixel images, [.im:<br />
> X[S]<br />
> X[S, drop = TRUE]<br />
The subset to be extracted is determ<strong>in</strong>ed by the <strong>in</strong>dex argument S.<br />
• If S is a <strong>po<strong>in</strong>t</strong> pattern, or a list(x,y), then the values <strong>of</strong> the pixel image X at these <strong>po<strong>in</strong>t</strong>s<br />
are extracted, <strong>and</strong> returned as a vector.<br />
• If S is a w<strong>in</strong>dow (an object <strong>of</strong> class "ow<strong>in</strong>"), the values <strong>of</strong> the image <strong>in</strong>side this w<strong>in</strong>dow<br />
are extracted. The result is a pixel image if possible, <strong>and</strong> a numeric vector otherwise (see<br />
help("[.im") for details).<br />
• If S is a pixel image with logical values, it is <strong>in</strong>terpreted as a w<strong>in</strong>dow (with TRUE <strong>in</strong>side<br />
the w<strong>in</strong>dow).<br />
The logical argument drop determ<strong>in</strong>es whether pixel values that are undef<strong>in</strong>ed are omitted<br />
(drop = TRUE) or returned as the value NA (drop=FALSE).<br />
See help("[.im") for full details.<br />
The subset operator can be used to look up the value <strong>of</strong> a pixel image at a s<strong>in</strong>gle <strong>po<strong>in</strong>t</strong>:<br />
> data(bei)<br />
> elev <- bei.extra$elev<br />
> elev[list(x = 142, y = 356)]<br />
[1] 147.08<br />
or to display a subregion:<br />
> S <- owin(...)<br />
> plot(elev[S])<br />
[Figure: the image elev[S], with a colour ribbon from 140.5 to 143.5.]<br />
This can even be performed <strong>in</strong>teractively, us<strong>in</strong>g the R function locator to click on a <strong>po<strong>in</strong>t</strong><br />
<strong>in</strong> the w<strong>in</strong>dow:<br />
> elev[locator(1)]<br />
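The two main uses of the subset operator can be tried out on a synthetic image.<br />
In this sketch the image, the lookup location and the window S are all hypothetical choices.<br />

```r
library(spatstat)

# Hypothetical image: Z(x, y) = x + y on the unit square
Z <- as.im(function(x, y) { x + y }, unit.square())

# Look up the pixel value at one location; the answer is the value of the
# pixel containing the point, so it is close to (not exactly) 0.25 + 0.75
val <- Z[list(x = 0.25, y = 0.75)]
print(val)

# Restrict the image to a rectangular window; the result is again an image
S <- owin(c(0, 0.5), c(0, 0.5))
ZS <- Z[S]
print(dim(ZS))
```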
9.3.2 Computation with images<br />
The h<strong>and</strong>y function eval.im allows us to perform pixel-by-pixel calculations on an image or on<br />
several compatible images.<br />
If Z is a pixel image, to take the logarithm <strong>of</strong> each pixel value,<br />
> logZ <- eval.im(log(Z))<br />
> C <- ...<br />
> W <- ...<br />
> V <- eval.im(Z > 3)<br />
> U <- eval.im(ifelse(Z > 3, 42, Z))<br />
Other functions which manipulate images <strong>in</strong>clude the follow<strong>in</strong>g:<br />
shift.im vector shift <strong>of</strong> an image<br />
cut.im convert numeric image to factor image<br />
split.im divide pixel image <strong>in</strong>to sub-images<br />
by.im apply function to subsets <strong>of</strong> pixel image<br />
<strong>in</strong>terp.im <strong>spatial</strong>ly <strong>in</strong>terpolate an image<br />
levelset threshold an image (produces a w<strong>in</strong>dow)<br />
solutionset f<strong>in</strong>d the region where a statement is true (produces a w<strong>in</strong>dow)<br />
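The following sketch exercises eval.im and levelset on a synthetic image.<br />
The image Z and the threshold 3 are hypothetical choices; the names logZ, V, U and W are used purely for illustration.<br />

```r
library(spatstat)

# Hypothetical image with values between 1 and 5, increasing from left to right
Z <- as.im(function(x, y) { 1 + 4 * x }, unit.square())

logZ <- eval.im(log(Z))              # pixel-by-pixel logarithm
V <- eval.im(Z > 3)                  # logical-valued image
U <- eval.im(ifelse(Z > 3, 42, Z))   # replace every value above 3 by 42

# Threshold the image: the result is a window, not an image
W <- levelset(Z, 3, ">")
print(area(W))
```

Because Z exceeds 3 exactly where x is greater than 0.5, the window W is the right-hand half of the unit square.<br />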
10 Tessellations<br />
A “tessellation” is a division <strong>of</strong> space <strong>in</strong>to non-overlapp<strong>in</strong>g regions (“tiles”).<br />
[Figure: an example of a tessellation.]<br />
Tessellations have several uses <strong>in</strong> spatstat. The tessellation may be ‘real’, for example,<br />
a cont<strong>in</strong>ent divided <strong>in</strong>to states or prov<strong>in</strong>ces. The tessellation may be completely artificial, for<br />
example, the rectangular quadrats which we use <strong>in</strong> quadrat count<strong>in</strong>g. Or the tessellation may<br />
be computed from other data, for example, the Dirichlet tessellation def<strong>in</strong>ed by a set <strong>of</strong> <strong>po<strong>in</strong>t</strong>s.<br />
10.1 Creat<strong>in</strong>g a tessellation<br />
An object <strong>of</strong> class "tess" represents a tessellation. Currently spatstat supports three k<strong>in</strong>ds <strong>of</strong><br />
tessellations:<br />
• rectangular tessellations <strong>in</strong> which the tiles are rectangles with sides parallel to the<br />
coord<strong>in</strong>ate axes;<br />
• tile lists, tessellations consist<strong>in</strong>g <strong>of</strong> a list <strong>of</strong> w<strong>in</strong>dows, usually polygonal w<strong>in</strong>dows;<br />
• pixellated tessellations, <strong>in</strong> which space is divided <strong>in</strong>to pixels <strong>and</strong> each tile occupies a<br />
subset <strong>of</strong> the pixel grid.<br />
[Figure: examples of the three kinds of tessellation: rectangular, tile list and pixellated.]<br />
All three types <strong>of</strong> tessellation can be created by the comm<strong>and</strong> tess.<br />
To create a rectangular tessellation:<br />
> tess(xgrid = xg, ygrid = yg)<br />
where xg <strong>and</strong> yg are vectors <strong>of</strong> coord<strong>in</strong>ates <strong>of</strong> vertical <strong>and</strong> horizontal l<strong>in</strong>es determ<strong>in</strong><strong>in</strong>g a<br />
grid <strong>of</strong> rectangles. Alternatively, if you want to divide a rectangular w<strong>in</strong>dow W <strong>in</strong>to rectangles<br />
<strong>of</strong> equal size, you can type<br />
> quadrats(W, nx, ny)<br />
where nx, ny are the numbers of rectangles in the x and y directions, respectively. A common<br />
use <strong>of</strong> this comm<strong>and</strong> is to create quadrats for a quadrat-count<strong>in</strong>g method.<br />
To create a tessellation from a list <strong>of</strong> w<strong>in</strong>dows,<br />
> tess(tiles = z)<br />
where z is a list of objects of class "owin". The windows should not be overlapping; currently<br />
spatstat does not check this. This comm<strong>and</strong> is commonly used when the study region is divided<br />
<strong>in</strong>to adm<strong>in</strong>istrative regions (states, départements, postcodes, counties) <strong>and</strong> the boundaries <strong>of</strong><br />
each sub-region are provided by GIS data files.<br />
To create a tessellation from a pixel image,<br />
> tess(image = Z)<br />
where Z is a pixel image with factor values. Each level <strong>of</strong> the factor represents a different tile<br />
<strong>of</strong> the tessellation. The pixels that have a particular value <strong>of</strong> the factor constitute a tile. This<br />
comm<strong>and</strong> is <strong>of</strong>ten used to separate the l<strong>and</strong>cover types <strong>in</strong> a l<strong>and</strong>cover image (a pixel image <strong>in</strong><br />
which each pixel is labelled by the type <strong>of</strong> vegetation or l<strong>and</strong> use at that location) <strong>in</strong>to different<br />
regions.<br />
The comm<strong>and</strong> as.tess can also be used to convert other types <strong>of</strong> data to a tessellation.<br />
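Here is a small sketch that builds one tessellation of each kind on the unit square.<br />
The grids, the two tiles and the factor-valued image are hypothetical choices made for illustration.<br />

```r
library(spatstat)

# Rectangular tessellation from grid lines: 2 x 2 = 4 tiles
T1 <- tess(xgrid = c(0, 0.5, 1), ygrid = c(0, 0.5, 1))

# Equal rectangles dividing a window: 3 x 2 = 6 tiles
T2 <- quadrats(unit.square(), nx = 3, ny = 2)

# Tile list: two non-overlapping windows
z <- list(owin(c(0, 0.5), c(0, 1)), owin(c(0.5, 1), c(0, 1)))
T3 <- tess(tiles = z)

# Pixellated tessellation from a factor-valued image with 3 levels
Zf <- cut(as.im(function(x, y) { x }, unit.square()), 3)
T4 <- tess(image = Zf)

# Number of tiles in each tessellation
print(sapply(list(T1, T2, T3, T4), function(tt) length(tiles(tt))))
```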
10.2 Computed tessellations<br />
There are two comm<strong>and</strong>s which compute a tessellation from a <strong>po<strong>in</strong>t</strong> pattern.<br />
The comm<strong>and</strong> dirichlet(X) computes the Dirichlet tessellation or Voronoi tessellation <strong>of</strong><br />
the <strong>po<strong>in</strong>t</strong> pattern X. The tile associated with a given <strong>po<strong>in</strong>t</strong> <strong>of</strong> the pattern X is the region <strong>of</strong> space<br />
which is closer to that <strong>po<strong>in</strong>t</strong> than to any other <strong>po<strong>in</strong>t</strong> <strong>of</strong> X. The Dirichlet tiles are polygons. The<br />
comm<strong>and</strong> dirichlet(X) computes these polygons <strong>and</strong> <strong>in</strong>tersects them with the w<strong>in</strong>dow <strong>of</strong> X.<br />
> X <- ...<br />
> plot(dirichlet(X))<br />
[Figure: plot of dirichlet(X).]<br />
The comm<strong>and</strong> delaunay(X) computes the Delaunay triangulation <strong>of</strong> the <strong>po<strong>in</strong>t</strong> pattern X.<br />
Strictly speak<strong>in</strong>g this is not a tessellation but a network or graph, formed by jo<strong>in</strong><strong>in</strong>g some <strong>of</strong> the<br />
<strong>po<strong>in</strong>t</strong>s <strong>of</strong> X by straight l<strong>in</strong>es. Two <strong>po<strong>in</strong>t</strong>s <strong>of</strong> X are jo<strong>in</strong>ed if their Dirichlet tiles share a common<br />
edge. The result<strong>in</strong>g network forms a set <strong>of</strong> non-overlapp<strong>in</strong>g triangles. These triangles cover the<br />
convex hull <strong>of</strong> X rather than the entire w<strong>in</strong>dow <strong>of</strong> X.<br />
> plot(delaunay(X))<br />
[Figure: plot of delaunay(X).]<br />
10.3 Operations <strong>in</strong>volv<strong>in</strong>g a tessellation<br />
There are methods for pr<strong>in</strong>t, plot <strong>and</strong> [ for tessellations.<br />
Use the comm<strong>and</strong> tiles to extract a list <strong>of</strong> the tiles <strong>in</strong> a tessellation. The result is a list<br />
<strong>of</strong> w<strong>in</strong>dows ("ow<strong>in</strong>" objects). This can be h<strong>and</strong>y if, for example, you want to compute some<br />
characteristic <strong>of</strong> the tiles <strong>in</strong> a tessellation, such as their areas or diameters:<br />
> X <- runifpoint(10)<br />
> V <- dirichlet(X)<br />
> U <- tiles(V)<br />
> unlist(lapply(U, area.owin))<br />
Tile 1 Tile 2 Tile 3 Tile 4 Tile 5 Tile 6 Tile 7<br />
0.08923453 0.18978755 0.06111215 0.11110646 0.07150611 0.05885538 0.05563879<br />
Tile 8 Tile 9 Tile 10<br />
0.19671046 0.08942098 0.07662757<br />
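The ten areas printed above add up to 1, the area of the unit square, as they must for a tessellation of the window.<br />
The sketch below repeats the check on a fresh hypothetical pattern, using the generic area function of current spatstat versions, and contrasts it with the Delaunay triangles, which cover only the convex hull.<br />

```r
library(spatstat)
set.seed(1)

X <- runifpoint(10)   # hypothetical pattern in the unit square

# Dirichlet tiles are clipped to the window, so their areas sum to area(W)
V <- dirichlet(X)
areas <- sapply(tiles(V), area)
print(sum(areas))

# Delaunay triangles cover the convex hull of X, so their total area is smaller
D <- delaunay(X)
tri_areas <- sapply(tiles(D), area)
print(sum(tri_areas))
```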
Tessellations can be used to classify the <strong>po<strong>in</strong>t</strong>s <strong>of</strong> a <strong>po<strong>in</strong>t</strong> pattern, <strong>in</strong> split.ppp, cut.ppp<br />
<strong>and</strong> by.ppp. If X is a <strong>po<strong>in</strong>t</strong> pattern <strong>and</strong> V is a tessellation, then<br />
• cut(X,V) attaches marks to the <strong>po<strong>in</strong>t</strong>s <strong>of</strong> X identify<strong>in</strong>g which tile <strong>of</strong> V each <strong>po<strong>in</strong>t</strong> falls <strong>in</strong>to;<br />
• split(X,V) divides the <strong>po<strong>in</strong>t</strong> pattern <strong>in</strong>to sub-<strong>patterns</strong> accord<strong>in</strong>g to the tiles <strong>of</strong> V, <strong>and</strong><br />
returns a list <strong>of</strong> the sub-<strong>patterns</strong>;<br />
• by(X,V,FUN) divides the <strong>po<strong>in</strong>t</strong> pattern <strong>in</strong>to sub-<strong>patterns</strong> accord<strong>in</strong>g to the tiles <strong>of</strong> V, applies<br />
the function FUN to each sub-pattern, <strong>and</strong> returns the results as a list.<br />
> par(mfrow = c(1, 3))<br />
> X <- ...<br />
> plot(X)<br />
> Z <- ...<br />
> plot(Z)<br />
> plot(cut(X, Z))<br />
[Figure: three panels showing X, Z and cut(X, Z); the marks of cut(X, Z) take the values 1 to 16.]<br />
> par(mfrow = c(1, 1))<br />
> plot(split(X, Z))<br />
[Figure: plot of split(X, Z), showing the 16 sub-patterns in panels labelled 1 to 16.]<br />
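The three operations classify the points in the same way, which can be checked by counting points per tile.<br />
In this sketch the pattern of 100 random points and the 4 by 4 grid of quadrats are hypothetical choices.<br />

```r
library(spatstat)
set.seed(2)

X <- runifpoint(100)                       # hypothetical pattern
Z <- quadrats(Window(X), nx = 4, ny = 4)   # 16 rectangular tiles

# cut: each point is marked by the tile it falls in
cX <- cut(X, Z)
n_cut <- table(marks(cX))

# split: one sub-pattern per tile
n_split <- sapply(split(X, Z), npoints)

# by: apply a function to each sub-pattern
n_by <- unlist(by(X, Z, npoints))

print(n_split)
```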
If we plot two tessellations on the same <strong>spatial</strong> doma<strong>in</strong>, what we see is another tessellation.<br />
The “<strong>in</strong>tersection” (or “overlay” or “common ref<strong>in</strong>ement”) <strong>of</strong> two tessellations X <strong>and</strong> Y is the<br />
tessellation whose tiles are the <strong>in</strong>tersections between tiles <strong>of</strong> X <strong>and</strong> tiles <strong>of</strong> Y. The comm<strong>and</strong><br />
<strong>in</strong>tersect.tess computes the <strong>in</strong>tersection <strong>of</strong> two tessellations.<br />
> opa <- par(mfrow = c(1, 3))<br />
> plot(X)<br />
> plot(Y)<br />
> plot(<strong>in</strong>tersect.tess(X, Y))<br />
> par(opa)<br />
[Figure: three panels showing the tessellations X and Y and their intersection intersect.tess(X, Y).]<br />
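A small sketch of the intersection, using two hypothetical rectangular tessellations of the unit square (named A and B here to avoid confusion with point patterns):<br />

```r
library(spatstat)

# Two vertical strips, and three horizontal strips
A <- quadrats(unit.square(), nx = 2, ny = 1)
B <- quadrats(unit.square(), nx = 1, ny = 3)

# Common refinement: every non-empty intersection of a tile of A with a tile of B
AB <- intersect.tess(A, B)
print(length(tiles(AB)))
```

Each of the two strips of A meets each of the three strips of B, so the intersection has six tiles.<br />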
PART III. INTENSITY AND RANDOMNESS<br />
F<strong>in</strong>ally we can start work<strong>in</strong>g on statistical methods for analys<strong>in</strong>g <strong>po<strong>in</strong>t</strong> pattern data. Part III<br />
<strong>of</strong> the workshop discusses how to <strong>in</strong>vestigate the <strong>in</strong>tensity <strong>of</strong> a <strong>po<strong>in</strong>t</strong> pattern, <strong>and</strong> how to assess<br />
whether a pattern is completely r<strong>and</strong>om.<br />
11 Methods 1: Investigat<strong>in</strong>g <strong>in</strong>tensity<br />
When we analyse numerical data, we <strong>of</strong>ten beg<strong>in</strong> by tak<strong>in</strong>g the sample mean. The analogue <strong>of</strong><br />
the mean or expected value <strong>of</strong> a r<strong>and</strong>om variable is the <strong>in</strong>tensity <strong>of</strong> a <strong>po<strong>in</strong>t</strong> process.<br />
‘Intensity’ is the average density <strong>of</strong> <strong>po<strong>in</strong>t</strong>s (expected number <strong>of</strong> <strong>po<strong>in</strong>t</strong>s per unit area). Intensity<br />
may be constant (‘uniform’ or ‘homogeneous’) or may vary from location to location<br />
(‘<strong>in</strong>homogeneous’). Investigation <strong>of</strong> the <strong>in</strong>tensity should be one <strong>of</strong> the first steps <strong>in</strong> analys<strong>in</strong>g a<br />
<strong>po<strong>in</strong>t</strong> pattern.<br />
11.1 Uniform <strong>in</strong>tensity<br />
11.1.1 Theory<br />
If the <strong>po<strong>in</strong>t</strong> process X is homogeneous, then for any sub-region B <strong>of</strong> two-dimensional space, the<br />
expected number <strong>of</strong> <strong>po<strong>in</strong>t</strong>s <strong>in</strong> B is proportional to the area <strong>of</strong> B:<br />
$$E[N(X \cap B)] = \lambda \, \mathrm{area}(B)$$<br />
and the constant of proportionality λ is the intensity. Intensity units are numbers per unit area<br />
(length$^{-2}$). If we know that a point process is homogeneous, then the empirical density of points,<br />
$$\bar\lambda = \frac{n(x)}{\mathrm{area}(W)},$$<br />
is an unbiased estimator of the true intensity λ.<br />
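The unbiasedness follows in one line from the definition of homogeneity, taking B = W and writing $\bar\lambda$ for the empirical density n(x)/area(W):<br />

```latex
\[
E[\bar\lambda]
  = \frac{E[N(X \cap W)]}{\mathrm{area}(W)}
  = \frac{\lambda \, \mathrm{area}(W)}{\mathrm{area}(W)}
  = \lambda .
\]
```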
11.1.2 Implementation <strong>in</strong> spatstat<br />
To compute this estimator of λ in spatstat, use summary.ppp:<br />
> data(bei)<br />
> summary(bei)<br />
Planar <strong>po<strong>in</strong>t</strong> pattern: 3604 <strong>po<strong>in</strong>t</strong>s<br />
Average <strong>in</strong>tensity 0.00721 <strong>po<strong>in</strong>t</strong>s per square metre<br />
W<strong>in</strong>dow: rectangle = [0, 1000] x [0, 500] metres<br />
W<strong>in</strong>dow area = 5e+05 square metres<br />
Unit <strong>of</strong> length: 1 metre<br />
The estimated intensity is 0.00721 points per square metre. To extract this intensity<br />
value, type<br />
> lamb <- summary(bei)$intensity<br />
> lamb<br />
[1] 0.007208<br />
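The same value can be obtained directly from the definition, as the number of points divided by the window area.<br />
This sketch assumes the helper functions npoints, Window and area of current spatstat versions.<br />

```r
library(spatstat)

# Number of points divided by window area: 3604 / 500000
lamb <- npoints(bei) / area(Window(bei))
print(lamb)

# Agrees with the value reported by summary.ppp
print(all.equal(lamb, summary(bei)$intensity))
```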
11.2 Inhomogeneous <strong>in</strong>tensity<br />
11.2.1 Theory<br />
In general the <strong>in</strong>tensity <strong>of</strong> a <strong>po<strong>in</strong>t</strong> process will vary from place to place. Assume that the<br />
expected number <strong>of</strong> <strong>po<strong>in</strong>t</strong>s fall<strong>in</strong>g <strong>in</strong> a small region <strong>of</strong> area du around a location u is equal to<br />
λ(u)du. Then λ(u) is the “<strong>in</strong>tensity function” <strong>of</strong> the process, satisfy<strong>in</strong>g<br />
$$E[N(X \cap B)] = \int_B \lambda(u) \, du$$<br />
for all regions B.<br />
More generally there could be singular concentrations of intensity (e.g. earthquake epicentres<br />
may be concentrated along a fault line) so that an intensity function does not exist. Then we<br />
speak of the “intensity measure” Λ defined by<br />
$$\Lambda(B) = E[N(X \cap B)]$$<br />
for each $B \subset \mathbb{R}^2$, assuming the expectation is finite.<br />
If it is suspected that the <strong>in</strong>tensity may be <strong>in</strong>homogeneous, the <strong>in</strong>tensity function or <strong>in</strong>tensity<br />
measure can be estimated nonparametrically by techniques such as quadrat count<strong>in</strong>g <strong>and</strong> kernel<br />
smooth<strong>in</strong>g.<br />
In quadrat counting, the window W is divided into subregions (‘quadrats’) $B_1, \ldots, B_m$ of<br />
equal area. We count the numbers of points falling in each quadrat, $n_j = n(x \cap B_j)$ for $j = 1, \ldots, m$.<br />
These are unbiased estimators of the corresponding intensity measure values $\Lambda(B_j)$.<br />
The usual kernel estimator of the intensity function is<br />
$$\tilde\lambda(u) = e(u) \sum_{i=1}^{n} \kappa(u - x_i), \qquad (1)$$<br />
where κ(u) is the kernel (an arbitrary probability density) and<br />
$$e(u)^{-1} = \int_W \kappa(u - v) \, dv \qquad (2)$$<br />
is an edge effect bias correction. Clearly $\tilde\lambda(u)$ is an unbiased estimator of<br />
$$\lambda^{*}(u) = e(u) \int_W \kappa(u - v) \lambda(v) \, dv,$$<br />
a smoothed version of the true intensity function λ(u). The choice of smoothing kernel κ involves<br />
a trade-off between bias and variance.<br />
Intensity can also be estimated us<strong>in</strong>g parametric methods, as we expla<strong>in</strong> <strong>in</strong> Section 13.<br />
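To make equations (1) and (2) concrete, here is a sketch that evaluates the kernel estimator by hand at a single location.<br />
It assumes an isotropic Gaussian kernel, a rectangular window and a hypothetical bandwidth of 70 metres; spatstat is loaded only for the bei data.<br />

```r
library(spatstat)

sigma <- 70                      # hypothetical bandwidth (metres)
W  <- Window(bei)
xr <- W$xrange
yr <- W$yrange

# Isotropic Gaussian kernel with standard deviation sigma
kern <- function(dx, dy) {
  dnorm(dx, sd = sigma) * dnorm(dy, sd = sigma)
}

# Equation (2): integral over the rectangle W of the kernel centred at u
einv <- function(ux, uy) {
  (pnorm(xr[2], mean = ux, sd = sigma) - pnorm(xr[1], mean = ux, sd = sigma)) *
  (pnorm(yr[2], mean = uy, sd = sigma) - pnorm(yr[1], mean = uy, sd = sigma))
}

# Equation (1): edge-corrected sum of kernel contributions from all data points
lambda_tilde <- function(ux, uy) {
  sum(kern(ux - bei$x, uy - bei$y)) / einv(ux, uy)
}

print(einv(500, 250))         # close to 1 in the middle of the window
print(einv(0, 0))             # close to 1/4 at a corner
print(lambda_tilde(500, 250))
```

At a corner only about a quarter of the kernel mass lies inside W, so the correction roughly quadruples the raw kernel sum there. In practice one would normally call density(bei, sigma = 70), which computes a comparable estimate on a pixel grid.<br />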
11.2.2 Implementation <strong>in</strong> spatstat<br />
Quadrat count<strong>in</strong>g is performed <strong>in</strong> spatstat by the function quadratcount.<br />
> quadratcount(bei, nx = 4, ny = 2)<br />
x<br />
y [0,250] (250,500] (500,750] (750,1e+03]<br />
(250,500] 666 677 130 481<br />
[0,250] 544 165 643 298<br />
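Dividing each count by the area of its quadrat turns the counts into local intensity estimates.<br />
A minimal sketch for the 4 by 2 grid above, in which every quadrat is 250 by 250 metres:<br />

```r
library(spatstat)

Q <- quadratcount(bei, nx = 4, ny = 2)

# Every point of the pattern falls in exactly one quadrat
print(sum(Q))

# Estimated intensity in each quadrat (points per square metre)
lam <- as.vector(Q) / (250 * 250)
print(range(lam))
```

The average of the eight local estimates equals the overall estimate 3604/500000, because the quadrats have equal area.<br />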
> Q <- quadratcount(bei, ...)<br />
> plot(bei, cex = 0.5, pch = "+")<br />
> plot(Q, add = TRUE, cex = 2)<br />
[Figure: the bei point pattern, plotted with "+" symbols, overlaid with a 6 × 3 grid of quadrats, each labelled with its count. Counts as extracted from the figure: 337 608 162 73 105 268; 422 49 17 52 128 146; 231 134 92 406 310 64.]
The value returned by quadratcount is an object belonging to the special class "quadratcount". We have used the plot method for this class to get the display above.
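To make the counting step concrete, here is a minimal base R sketch of quadrat counting on a rectangular grid, followed by the conversion of counts to intensity estimates (count divided by quadrat area). The helper quadrat_counts and the simulated coordinates are hypothetical. This is not how quadratcount is implemented, only an illustration of the same idea.

```r
# Count points in an nx-by-ny grid of equal rectangles covering the window.
# (Hypothetical helper for illustration; not a spatstat function.)
quadrat_counts <- function(x, y, xrange, yrange, nx, ny) {
  xbreaks <- seq(xrange[1], xrange[2], length.out = nx + 1)
  ybreaks <- seq(yrange[1], yrange[2], length.out = ny + 1)
  table(y = cut(y, ybreaks, include.lowest = TRUE),
        x = cut(x, xbreaks, include.lowest = TRUE))
}

set.seed(1)
# Hypothetical data: 500 uniform points in a 1000 x 500 window
px <- runif(500, 0, 1000)
py <- runif(500, 0, 500)
counts <- quadrat_counts(px, py, c(0, 1000), c(0, 500), nx = 4, ny = 2)

# Each count divided by its quadrat area estimates the average intensity there
quadrat_area <- (1000 / 4) * (500 / 2)
intensity_est <- counts / quadrat_area
print(counts)
print(round(intensity_est, 5))
```

Note that the rows of this table run from low to high y, whereas the quadratcount output shown earlier lists the upper row of quadrats first.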
Kernel density (or intensity) estimation using an isotropic Gaussian kernel is implemented in spatstat by the function density.ppp, a method for the generic command density.
> den <- density(bei, sigma = 70)
> plot(den)
> plot(bei, add = TRUE, cex = 0.5)
[Figure: pixel image of the kernel intensity estimate den, with the points of bei overlaid.]
The value returned by density.ppp is a pixel image (object of class "im"). This class has methods for print, summary, plot, contour (contour plots), persp (perspective plots) and so on.
> persp(den)
[Figure: perspective plot of den over the x and y coordinates.]
> contour(den)
[Figure: contour plot of den, with contour levels from 0.002 to 0.02 in steps of 0.002.]
Alternatively, there is an adaptive estimator of intensity which uses a fraction f of the data to construct a Dirichlet tessellation, then forms an intensity estimate that is constant in each tile of the tessellation:
> aden <- adaptive.density(bei, f = 0.01, nrep = 10)
> plot(aden, main = "Adaptive intensity")
> plot(bei, add = TRUE, cex = 0.5)
[Figure: "Adaptive intensity", the pixel image aden with the points of bei overlaid.]
The value returned by adaptive.density is also a pixel image (object of class "im").
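The idea behind the adaptive estimator can be sketched in one dimension, where the Dirichlet tiles are simply intervals whose boundaries are the midpoints between neighbouring sites. The function adaptive_intensity_1d below is a hypothetical base R illustration under that simplifying assumption, not the algorithm used by adaptive.density. It holds out a fraction f of the points as tile centres, counts the remaining points in each tile, and rescales the counts to allow for the held-out fraction.

```r
# One-dimensional analogue of the tessellation-based adaptive estimator.
# (Hypothetical helper for illustration; assumes 0 < f < 1 and no tied values.)
adaptive_intensity_1d <- function(x, f, a, b) {
  n <- length(x)
  m <- max(1, round(f * n))          # number of points used as tile centres
  idx <- sample.int(n, m)
  sites <- sort(x[idx])
  rest <- x[-idx]                    # remaining points, to be counted
  # Tile boundaries: window ends plus midpoints between neighbouring sites
  breaks <- c(a, (sites[-1] + sites[-m]) / 2, b)
  counts <- tabulate(findInterval(rest, breaks, rightmost.closed = TRUE),
                     nbins = m)
  # Constant estimate in each tile, rescaled because only a fraction
  # (1 - m / n) of the data was counted
  list(breaks = breaks, lambda = counts / ((1 - m / n) * diff(breaks)))
}

set.seed(3)
x <- rbeta(200, 2, 5)   # hypothetical data on [0, 1], denser near 0.2
est <- adaptive_intensity_1d(x, f = 0.1, a = 0, b = 1)
print(round(est$lambda))
```

Because the tiles are small where the sites are dense, the estimate adapts its resolution to the data. The rescaling ensures that the estimate integrates to the total number of points.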
11.3 Quadrats determined by a covariate
In quadrat counting methods, any choice of quadrats is permissible. From a theoretical viewpoint, the quadrats do not have to be rectangles of equal area, and could be regions of any shape.
Quadrat counting is more useful if we choose the quadrats in a meaningful way. One way to do this is to define the quadrats using covariate information.
For example, the tropical rainforest point pattern dataset bei comes with an extra set of covariate data bei.extra, which contains a pixel image of terrain elevation bei.extra$elev and a pixel image of terrain slope bei.extra$grad. It might be useful to split the study region into several sub-regions according to the terrain slope.
> data(bei)
> Z <- bei.extra$grad
> b <- quantile(Z, probs = (0:4)/4)
> Zcut <- cut(Z, breaks = b, labels = 1:4)
> V <- tess(image = Zcut)
> plot(V)
> plot(bei, add = TRUE, pch = "+")
[Figure: the tessellation V, whose four tiles are the slope classes, with the points of bei overlaid as "+" symbols.]
The call to quantile gave us the quartiles of the slope values, so the four tiles in the<br />
tessellation V have equal area (ignoring discretisation effects). In other words, we have divided<br />
the study region into four zones of equal area according to the terrain slope.<br />
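The equal-area property can be checked on a toy example. The following is a minimal sketch in base R, assuming a hypothetical matrix of pixel values Z in place of the real slope image (it does not use spatstat or the bei data):<br />

```r
# Hypothetical stand-in for a pixel image of slope values (not the bei data)
set.seed(1)
Z <- matrix(runif(100 * 50, min = 0, max = 0.3), nrow = 50, ncol = 100)

# The quartiles of the pixel values serve as break points
b <- quantile(Z, probs = (0:4) / 4)

# Classify every pixel into one of four slope classes
Zcut <- cut(Z, breaks = b, labels = 1:4, include.lowest = TRUE)

# Fraction of pixels (hence of area) falling in each class
area_fraction <- table(Zcut) / length(Zcut)
print(area_fraction)
```

Each class should hold about a quarter of the pixels, which is the sense in which the four zones have equal area up to discretisation.<br />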
We can now use this tessellation to study the point pattern bei. We could invoke the<br />
commands split, cut or by to divide the points according to this tessellation and manipulate<br />
the sub-patterns.<br />
The command quadratcount also works with tessellations:<br />
> qb <- quadratcount(bei, tess = V)<br />
> qb<br />
tile<br />
1 2 3 4<br />
271 984 1028 1321<br />
> plot(qb)<br />
[Figure: plot of qb, with each tile annotated by its count]<br />
The text annotations show the number of trees in each region. Since the four regions have<br />
equal area, the counts should be approximately equal if there is a uniform density of trees.<br />
Obviously they are not equal; there appears to be a strong preference for steeper slopes.<br />
12 Methods 2: Tests of Complete Spatial Randomness<br />
The basic ‘reference’ or ‘benchmark’ model of a point process is the uniform Poisson point<br />
process in the plane with intensity λ, sometimes called Complete Spatial Randomness (CSR).<br />
Its basic properties are<br />
• the number of points falling in any region A has a Poisson distribution with mean λ · area(A)<br />
• given that there are n points inside region A, the locations of these points are i.i.d. and<br />
uniformly distributed inside A<br />
• the contents of two disjoint regions A and B are independent.<br />
The uniform Poisson process is often the ‘null model’ in an analysis. For historical reasons,<br />
many applied writers focus on establishing that their data do not conform to a uniform Poisson<br />
process.<br />
12.1 Definition<br />
The homogeneous Poisson process of intensity λ > 0 has the properties<br />
(PP1): the number N(X ∩ B) of points falling in any region B is a Poisson random variable;<br />
(PP2): the expected number of points falling in B is E[N(X ∩ B)] = λ · area(B);<br />
(PP3): if B1, B2 are disjoint sets then N(X ∩ B1) and N(X ∩ B2) are independent random variables;<br />
(PP4): given that N(X ∩ B) = n, the n points are independent and uniformly distributed in B.<br />
The list is redundant; (PP2) and (PP3) are sufficient.<br />
This process is often called “Complete Spatial Randomness” (CSR), especially in biological<br />
science. Under CSR, points are independent of each other and have the same propensity to be<br />
found at any location.<br />
It is easy to simulate the Poisson process directly by following the properties (PP1)–(PP4).<br />
In spatstat, use the command rpoispp (by convention, random data generators have names<br />
beginning with r).<br />
> plot(rpoispp(100))<br />
[Figure: a realisation of rpoispp(100)]<br />
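To see what simulating "by following the properties" amounts to, here is a minimal base R sketch for a rectangular window. The function name sim_csr and its arguments are hypothetical, introduced only for illustration; this is not how rpoispp is implemented internally.<br />

```r
# Simulate a homogeneous Poisson process on the rectangle [0, a] x [0, b]
# by following (PP1), (PP2) and (PP4) directly. Base R only.
sim_csr <- function(lambda, a = 1, b = 1) {
  # (PP1)-(PP2): the number of points is Poisson with mean lambda * area
  n <- rpois(1, lambda * a * b)
  # (PP4): given n, the locations are i.i.d. uniform in the rectangle
  data.frame(x = runif(n, 0, a), y = runif(n, 0, b))
}

set.seed(42)
pts <- sim_csr(100)
nrow(pts)   # the count itself is random, with mean 100
plot(pts$x, pts$y, asp = 1, pch = 3, main = "sim_csr(100)")
```

Fixing n instead of drawing it from a Poisson distribution gives the conditional simulation performed by runifpoint.<br />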
Conceptually, if we discretise a homogeneous Poisson process into infinitesimal pixels, the<br />
indicators I are independent and identically distributed, with success probability P{I = 1} =<br />
λ dA where dA is the infinitesimal area of a pixel.<br />
To develop some intuition about completely random patterns, it’s useful to repeat the command<br />
plot(rpoispp(100)) several times (use the up-arrow key to recall the previous command<br />
line) so that you see several replicates of the Poisson process. In particular you will notice that<br />
the points in a homogeneous Poisson process are not ‘uniformly spread’: there are empty gaps<br />
and clusters of points.<br />
The command rpoispp has arguments lambda (the intensity) and win (the window in which<br />
to simulate). The default window is the unit square.<br />
> data(letterR)<br />
> plot(rpoispp(100, win = letterR))<br />
[Figure: a realisation of rpoispp(100, win = letterR)]<br />
If you want to simulate a Poisson process conditionally on a fixed number of points, use the<br />
command runifpoint.<br />
> runifpoint(100)<br />
planar point pattern: 100 points<br />
window: rectangle = [0, 1] x [0, 1] units<br />
12.2 Quadrat counting tests for CSR<br />
In classical literature, the homogeneous Poisson process (CSR) is usually taken as the appropriate<br />
‘null’ model for a point pattern. Our basic task in analysing a point pattern is to find evidence<br />
against CSR.<br />
A classical test for the null hypothesis of CSR is the χ² test based on quadrat counts. As<br />
explained earlier, the window W is divided into subregions (‘quadrats’) B1, ..., Bm of equal area.<br />
We count the numbers of points falling in each quadrat, nj = n(x ∩ Bj) for j = 1, ..., m. Under<br />
the null hypothesis of CSR, the nj are i.i.d. Poisson random variables with the same expected<br />
value. The Pearson χ² goodness-of-fit test can be used.<br />
> quadrat.test(nzchop, nx = 3, ny = 2)<br />
Chi-squared test of CSR using quadrat counts<br />
data: nzchop<br />
X-squared = 5.0769, df = 5, p-value = 0.4066<br />
The value returned by quadrat.test is an object of class "htest" (the standard R class<br />
for hypothesis tests). Printing the object (as shown above) gives comprehensible output about<br />
the outcome of the test. Inspecting the p-value, we see that the test does not reject the null<br />
hypothesis of CSR for the (chopped) New Zealand trees data.<br />
The return value of quadrat.test also belongs to the special class "quadrat.test". Plotting<br />
the object will display the quadrats, annotated by their observed and expected counts and the<br />
Pearson residuals (observed counts nj at top left; expected count at top right; Pearson residuals<br />
at bottom).<br />
> M <- quadrat.test(nzchop, nx = 3, ny = 2)<br />
> M<br />
Chi-squared test of CSR using quadrat counts<br />
data: nzchop<br />
X-squared = 5.0769, df = 5, p-value = 0.4066<br />
> plot(nzchop)<br />
> plot(M, add = TRUE, cex = 2)<br />
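[Figure: nzchop with the 3 × 2 quadrats overlaid. Top row: observed counts 9, 14, 17 (expected 13 each), Pearson residuals −1.1, 0.28, 1.1. Bottom row: observed counts 17, 9, 12 (expected 13 each), Pearson residuals 1.1, −1.1, −0.28.]<br />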
The p-value can also be extracted by<br />
> M$p.value<br />
[1] 0.4065648<br />
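The test statistic and p-value can be recomputed by hand from the six observed counts shown on the plot of M (9, 14, 17 in the top row; 17, 9, 12 in the bottom row). This is a sketch in base R of the Pearson calculation only; it assumes equal-area quadrats and does not call spatstat.<br />

```r
# Observed quadrat counts for nzchop, read off the plot of M (3 x 2 quadrats)
counts <- c(9, 14, 17, 17, 9, 12)
m <- length(counts)

# Under CSR with equal-area quadrats every expected count is n / m
expected <- sum(counts) / m

# Pearson chi-squared statistic and its upper-tail p-value on m - 1 df
X2 <- sum((counts - expected)^2 / expected)
pval <- pchisq(X2, df = m - 1, lower.tail = FALSE)

# Pearson residuals, as annotated at the bottom of each quadrat
pearson <- (counts - expected) / sqrt(expected)

round(c(X2 = X2, p.value = pval), 4)
round(pearson, 2)
```

The rounded values should agree with the printed output of quadrat.test above and with the residuals shown on the plot.<br />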
12.3 Critique<br />
Since this kind of technique is often used in the applied literature, a few comments are appropriate.<br />
The main critique of the quadrat test approach is the lack of information. This is a goodness-of-fit<br />
test in which the alternative hypothesis H1 is simply the negation of H0, that is, the<br />
alternative is that “the process is not a homogeneous Poisson process”. A point process may<br />
fail to satisfy properties (PP1)–(PP4) either because it violates (PP2) by having non-uniform<br />
intensity, or because it violates (PP3)–(PP4) by exhibiting dependence between points. There<br />
are too many types of departure from H0.<br />
The usual justification for the classical χ² goodness-of-fit test is to assume that the counts<br />
are independent, and derive a test of the null hypothesis that all counts have the same expected<br />
value. Invoking it here is slightly naive, since the independence of counts is also open to question<br />
here.<br />
Indeed we can also turn things around and view the χ² test as a test of the Poisson distributional<br />
properties (PP2)–(PP3) assuming that the intensity is uniform. The Pearson χ² test<br />
statistic<br />
$$X^2 = \sum_j \frac{(n_j - n/m)^2}{n/m}$$<br />
(where $n = \sum_j n_j$ is the total number of points) coincides, up to a constant factor, with the<br />
sample variance-to-mean ratio of the counts nj, which is often interpreted as a measure of<br />
over/under-dispersion of the counts nj assuming they have constant mean.<br />
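The constant factor can be made explicit: with the usual sample variance (divisor m − 1), the statistic equals (m − 1) times the variance-to-mean ratio. A quick numerical check in base R, using the six nzchop quadrat counts from Section 12.2:<br />

```r
# Quadrat counts for nzchop from Section 12.2
counts <- c(9, 14, 17, 17, 9, 12)
m <- length(counts)

# Pearson statistic with common expected count n / m = mean(counts)
X2 <- sum((counts - mean(counts))^2 / mean(counts))

# Sample variance-to-mean ratio (var() uses the divisor m - 1)
vmr <- var(counts) / mean(counts)

# The two agree up to the constant factor m - 1
isTRUE(all.equal(X2, (m - 1) * vmr))
```

A ratio near 1 is what a Poisson distribution predicts; here the ratio is close to 1, consistent with the non-significant test result.<br />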
The power of the quadrat test depends on the size of quadrats, and falls to zero for quadrats<br />
which are either very large or very small. The power also depends on the alternative hypothesis,<br />
in particular on the ‘spatial scale’ of any departures from the assumptions of constant intensity<br />
and independence of points. The choice of quadrat size carries an implicit assumption about the<br />
spatial scale.<br />
12.4 Kolmogorov-Smirnov test of CSR<br />
Typically a more powerful test of CSR is the Kolmogorov-Smirnov test in which we compare<br />
the observed and expected distributions of the values of some function T.<br />
We specify a real-valued function T(x,y) defined at all locations (x,y) in the window. We<br />
evaluate this function at each of the data points. Then we compare this empirical distribution<br />
of values of T with the predicted distribution of values of T under CSR, using the classical<br />
Kolmogorov-Smirnov test.<br />
In spatstat the spatial Kolmogorov-Smirnov test is performed by kstest. This function is<br />
generic. The method for point patterns, kstest.ppp, performs the Kolmogorov-Smirnov test<br />
for CSR.<br />
If X is the data point pattern, then<br />
> kstest(X, fun)<br />
performs the test, where fun is a function(x,y) in the R language.<br />
For example, let’s consider the nzchop data and choose the function T to be the x coordinate,<br />
T(x,y) = x. This means we are simply comparing the observed and expected distributions of<br />
the x coordinate.<br />
> kstest(nzchop, function(x, y) {<br />
+ x<br />
+ })<br />
Spatial Kolmogorov-Smirnov test of CSR<br />
data: covariate ’function(x, y) { x}’ evaluated at points of ’nzchop’<br />
and transformed to uniform distribution under CSR<br />
D = 0.0717, p-value = 0.7913<br />
alternative hypothesis: two-sided<br />
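The idea behind this test can be illustrated without spatstat in the simplest case. In the unit square, the x coordinate of a point under CSR is uniformly distributed on [0, 1], so the test with T(x,y) = x reduces to a classical one-sample Kolmogorov-Smirnov test against the uniform distribution. The sketch below uses simulated points (hypothetical data, not nzchop) and the base R function ks.test; for a non-rectangular window the reference distribution of T is no longer uniform and would have to be computed from the window.<br />

```r
# Hypothetical data: a CSR pattern in the unit square (not the nzchop data)
set.seed(2008)
n <- rpois(1, 100)
x <- runif(n)
y <- runif(n)

# Test function T(x, y) = x, evaluated at the data points
Tfun <- function(x, y) x
Tvals <- Tfun(x, y)

# Under CSR in the unit square, T is Uniform(0, 1)
ks <- ks.test(Tvals, "punif", min = 0, max = 1)
ks$statistic
ks$p.value
```

Because the data here really are generated under CSR, small p-values should occur only at the nominal rate.<br />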
The result of kstest is an object of class "htest" (the standard R class for hypothesis<br />
tests) and also of class "kstest" so that it can be printed and plotted. The print method<br />
(demonstrated above) reports information about the hypothesis test such as the p-value. The<br />
plot method displays the observed and expected distribution functions.<br />
> KS <- kstest(nzchop, function(x, y) { x })<br />
> plot(KS)<br />
> pval
12.5 Using covariate data<br />
... that the density of trees depends on terrain slope. To test this formally we can divide the region<br />
into irregular quadrats according to the terrain slope, and apply the χ² test. The command<br />
quadrat.test accepts a tessellation and uses the tiles of the tessellation as the quadrats:<br />
> data(bei)<br />
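> Z <- bei.extra$grad<br />
> b <- quantile(Z, probs = (0:4)/4)<br />
> Zcut <- cut(Z, breaks = b, labels = 1:4)<br />
> V <- tess(image = Zcut)<br />
> quadrat.test(bei, tess = V)<br />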
Chi-squared test of CSR using quadrat counts<br />
data: bei<br />
X-squared = 661.8402, df = 3, p-value < 2.2e-16<br />
Because of the large counts in these regions, we can probably ignore concerns about independence,<br />
and conclude that the trees are not uniform in their intensity.<br />
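Since the four tiles have equal area, this statistic is just the Pearson χ² statistic for equal expected counts, applied to the tile counts 271, 984, 1028, 1321 reported by quadratcount in Section 11.3. A sketch of that check with the base R function chisq.test (which assumes equal cell probabilities by default, so it ignores the small discretisation differences in tile area):<br />

```r
# Tile counts of bei trees in the four slope classes (Section 11.3)
counts <- c(271, 984, 1028, 1321)

# Goodness-of-fit test against equal expected counts
res <- chisq.test(counts)

res$statistic   # Pearson X-squared
res$parameter   # degrees of freedom
res$p.value
```

The statistic and degrees of freedom should match the quadrat.test printout above.<br />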
A more powerful test (if that were needed!) is the Kolmogorov-Smirnov test using the slope<br />
covariate:<br />
> KS <- kstest(bei, Z)<br />
> plot(KS)<br />
> KS<br />
Spatial Kolmogorov-Smirnov test of CSR<br />
data: covariate ’Z’ evaluated at points of ’bei’<br />
and transformed to uniform distribution under CSR<br />
D = 0.1871, p-value < 2.2e-16<br />
alternative hypothesis: two-sided<br />
[Figure: plot of KS, titled "Spatial Kolmogorov-Smirnov test of CSR based on distribution of covariate ’Z’, p-value = 0"; horizontal axis Z, vertical axis probability]<br />
The Kolmogorov-Smirnov test would typically be preferred if the covariate Z has continuously-varying<br />
numerical values. If the covariate is a factor or discrete variable, then the<br />
Kolmogorov-Smirnov test is ineffective because of tied values, and the χ² test based on quadrat counts would<br />
be preferable.<br />
13 Methods 3: Maximum likelihood for Poisson processes<br />
If we are willing to assume (tentatively) that the points are independent, then we can apply<br />
some decent statistical methods to the investigation of the intensity.<br />
13.1 Inhomogeneous Poisson process<br />
The inhomogeneous Poisson process with intensity function λ(u), u ∈ R², is a modification of<br />
the homogeneous Poisson process, in which properties (PP2) and (PP4) above are replaced by<br />
(PP2’): the number N(X ∩ B) of points falling in a region B has expectation<br />
$$E[N(X \cap B)] = \int_B \lambda(u)\,du.$$<br />
(PP4’): given that N(X ∩ B) = n, the n points are independent and identically distributed, with<br />
common probability density $f(u) = \lambda(u)/I$, where $I = \int_B \lambda(u)\,du$.<br />
This process can also be simulated with rpoispp, following the same properties. The intensity<br />
argument lambda can be a constant, a function(x,y) giving the values of the intensity function<br />
at coordinates x,y, or a pixel image containing the intensity values at a grid of locations.<br />
> lambda plot(rpoispp(lambda))<br />
[Figure: a realisation of rpoispp(lambda)]<br />
If we discretise an inhomogeneous Poisson process, the indicators I are independent, but<br />
have unequal success probabilities, P{I(u) = 1} = λ(u) dA.<br />
The inhomogeneous Poisson process is a plausible model for point patterns under several<br />
scenarios. One is random thinning: suppose that a homogeneous Poisson process of intensity<br />
β is generated, and that each point is either deleted or retained, independently of other points.<br />
Suppose the probability of retaining a point at the location u is p(u). Then the resulting process<br />
of retained points is inhomogeneous Poisson, with intensity λ(u) = βp(u).<br />
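The thinning construction translates directly into code. Below is a minimal base R sketch on the unit square; the value of β and the retention probability p(x, y) = (x + y)/2 are hypothetical choices made only for illustration.<br />

```r
set.seed(1)

# Intensity of the initial homogeneous Poisson process (hypothetical value)
beta <- 200

# Hypothetical retention probability; takes values in [0, 1] on the unit square
p <- function(x, y) (x + y) / 2

# Step 1: homogeneous Poisson process of intensity beta on the unit square
n <- rpois(1, beta)
x <- runif(n)
y <- runif(n)

# Step 2: retain each point independently with probability p(u)
keep <- runif(n) < p(x, y)
thinned <- data.frame(x = x[keep], y = y[keep])

# The retained points have intensity lambda(u) = beta * p(u);
# here the expected number retained is beta * 0.5 = 100
nrow(thinned)
```

Plotting thinned should show points becoming denser towards the top right corner, where p is largest.<br />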
Consider, for example, a model of plant propagation which assumes that seeds are randomly<br />
dispersed according to a Poisson process, and seeds randomly germinate or do not germinate,<br />
independently of each other, with a germination probability that depends on the local soil<br />
conditions. The resulting pattern of plants is an inhomogeneous Poisson process.<br />
13.2 Likelihood methods<br />
The log-likelihood for the homogeneous Poisson process with intensity λ is<br />
$$\log L(\lambda; x) = n(x) \log \lambda - \lambda \, \mathrm{area}(W) \qquad (3)$$<br />
where n(x) is the number of points in the dataset x. The maximum likelihood estimator of λ is<br />
$$\hat\lambda = \frac{n(x)}{\mathrm{area}(W)}$$<br />
which is also an unbiased estimator. The variance of $\hat\lambda$ is $\mathrm{var}[\hat\lambda] = \lambda/\mathrm{area}(W)$.<br />
Consider an inhomogeneous Poisson process with intensity function λθ(u) depending on a<br />
parameter θ. The log-likelihood for θ is<br />
$$\log L(\theta; x) = \sum_{i=1}^{n} \log \lambda_\theta(x_i) - \int_W \lambda_\theta(u)\,du \qquad (4)$$<br />
This is a well-behaved likelihood; for example if log λθ(u) is linear in θ, then the log-likelihood<br />
is concave, so there is a unique MLE. However, the MLE $\hat\theta$ is not analytically tractable, so it<br />
must be computed using numerical algorithms such as Newton’s method.<br />
The usual asymptotic theory of maximum likelihood applies: under suitable large sample<br />
conditions, the MLE of θ is asymptotically normal. If we wish to test CSR, the likelihood ratio<br />
test statistic<br />
$$R = 2 \log \frac{L(\hat\theta)}{L(\hat\lambda)}$$<br />
is asymptotically χ² under CSR, and this gives an asymptotically optimal test of CSR against<br />
the alternative of an inhomogeneous Poisson process with intensity λθ(u).<br />
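As an illustration of (3), (4) and the likelihood ratio test, the following base R sketch fits the model log λθ(x,y) = θ0 + θ1 x on the unit square by direct numerical maximisation. Everything here is an assumption made for the example: the data are simulated, the integral in (4) is available in closed form only because the window is the unit square, and the general-purpose optimiser optim (quasi-Newton BFGS) stands in for whatever algorithm a real implementation would use.<br />

```r
set.seed(13)

# Hypothetical data on the unit square: intensity proportional to exp(1.5 * x)
theta1_true <- 1.5
n <- rpois(1, 150)
u <- runif(n)
x <- log(1 + u * (exp(theta1_true) - 1)) / theta1_true  # inverse-CDF sampling
# (the y coordinates play no role in this model and are omitted)

# Integral of exp(theta0 + theta1 * x) over the unit square
integral <- function(theta) {
  t1 <- theta[2]
  ratio <- if (abs(t1) < 1e-8) 1 else expm1(t1) / t1
  exp(theta[1]) * ratio
}

# Log-likelihood (4) for theta = (theta0, theta1)
loglik <- function(theta) sum(theta[1] + theta[2] * x) - integral(theta)

# Numerical maximisation, started at the homogeneous fit (log(n), 0)
fit <- optim(c(log(n), 0), function(th) -loglik(th), method = "BFGS")
theta_hat <- fit$par

# Log-likelihood (3) at the homogeneous MLE lambda_hat = n / area(W) = n
loglik0 <- n * log(n) - n

# Likelihood ratio statistic, referred to chi-squared with 1 degree of freedom
R <- 2 * (loglik(theta_hat) - loglik0)
pval <- pchisq(R, df = 1, lower.tail = FALSE)
c(theta0 = theta_hat[1], theta1 = theta_hat[2], R = R, p.value = pval)
```

The one degree of freedom reflects the single extra parameter θ1 in the alternative model.<br />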
13.3 Fitting Poisson processes in spatstat<br />
Mark Berman and Rolf Turner [14] (see also [34, 18, 35]) developed a clever computational device<br />
for finding the MLE of θ by exploiting a formal similarity between the Poisson log-likelihood (4)<br />
and that of a loglinear Poisson regression.<br />
The Berman-Turner algorithm is implemented in spatstat. The intensity function λθ(u)<br />
must be loglinear in the parameter θ:<br />
log λθ(u) = θ · S(u) (5)<br />
where S(u) is a real-valued or vector-valued function of location u. The form of S is arbitrary so<br />
this is not much of a restriction. In practice S(u) could be a function of the spatial coordinates<br />
of u, or an observed covariate, or a mixture of both. Assuming (5), the log-likelihood (4) is a<br />
concave function of θ, so maximum likelihood is well-behaved.<br />
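The link to Poisson regression can be felt with a cruder device than the Berman-Turner scheme itself. In the base R sketch below (not the spatstat implementation, and not the Berman-Turner quadrature), the unit square is cut into pixels, the points are counted in each pixel, and a loglinear Poisson regression is fitted to the counts with log(pixel area) as an offset. The data, the grid size k and all variable names are hypothetical.<br />

```r
set.seed(5)

# Hypothetical data on the unit square: intensity proportional to exp(1.5 * x)
n <- rpois(1, 300)
u <- runif(n)
x <- log(1 + u * (exp(1.5) - 1)) / 1.5
y <- runif(n)

# Discretise into a k x k grid of pixels and count the points in each pixel
k <- 20
dA <- 1 / k^2
breaks <- seq(0, 1, length.out = k + 1)
ix <- cut(x, breaks = breaks, include.lowest = TRUE)
iy <- cut(y, breaks = breaks, include.lowest = TRUE)

# Pixel centres; x varies fastest, matching as.vector(table(ix, iy))
centres <- (seq_len(k) - 0.5) / k
grid <- expand.grid(x = centres, y = centres)
grid$counts <- as.vector(table(ix, iy))
grid$log_dA <- log(dA)

# Loglinear Poisson regression: log E[count] = log(dA) + theta0 + theta1 * x
fit <- glm(counts ~ x + offset(log_dA), family = poisson, data = grid)
coef(fit)
```

The fitted coefficients approximate θ0 and θ1 in (5) with S(u) = x; the approximation improves as the pixels get smaller.<br />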
13.3.1 Model-fitting function<br />
The fitting function is called ppm (‘point process model’) and is very closely analogous to the<br />
model fitting functions in R such as lm and glm. The statistic S(u) is specified by an R language<br />
formula, like the formulas used to specify the systematic relationship in a linear model or<br />
generalised linear model. The basic syntax is:<br />
> ppm(X, ~trend)<br />
where X is the point pattern dataset, and ~trend is an R formula with no left-hand side. This<br />
should be viewed as a model with log link, so the formula ~trend specifies the form of the<br />
logarithm of the intensity function.<br />
To fit the homogeneous Poisson model:<br />
> ppm(bei, ~1)<br />
Stationary Poisson process<br />
Uniform intensity: 0.007208<br />
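This fitted intensity is simply the closed-form MLE from Section 13.2, the number of points divided by the window area. The arithmetic below is a sketch that assumes the bei window is the 1000 × 500 metre rectangle (an assumption not stated in this section); the total of 3604 trees is the sum of the four tile counts from Section 11.3.<br />

```r
# Total number of trees: sum of the tile counts 271, 984, 1028, 1321
n <- sum(c(271, 984, 1028, 1321))

# Assumption: the bei window is the rectangle 1000 x 500 (metres)
areaW <- 1000 * 500

# MLE of the uniform intensity, and a plug-in standard error
# based on var = lambda / area(W)
lambda_hat <- n / areaW
se <- sqrt(lambda_hat / areaW)

lambda_hat
se
```

Under that assumption lambda_hat reproduces the value printed by ppm.<br />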
To fit an inhomogeneous Poisson model with an intensity that is log-linear in the cartesian<br />
coordinates, i.e. λθ((x,y)) = exp(θ0 + θ1x + θ2y),<br />
> ppm(bei, ~x + y)<br />
Nonstationary Poisson process<br />
Trend formula: ~x + y<br />
Fitted coefficients for trend formula:<br />
(Intercept) x y<br />
-4.7245290274 -0.0008031288 0.0006496090<br />
Here x and y are reserved names that always refer to the cartesian coordinates. In the output,<br />
the ‘fitted coefficients’ are the maximum likelihood estimates of θ0, θ1, θ2, the coefficients of the<br />
‘linear predictor’. The fitted intensity function is<br />
λθ((x,y)) = exp(−4.724529 − 0.000803x + 0.00065y).<br />
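To make the log link concrete, the fitted intensity can be evaluated by hand from the printed coefficients. The helper name fitted_intensity below is hypothetical; in practice the predict method discussed later does this job.<br />

```r
# Fitted coefficients copied from the ppm printout above
theta_hat <- c(-4.7245290274, -0.0008031288, 0.0006496090)

# Intensity = exp(linear predictor), because the model has a log link
fitted_intensity <- function(x, y) {
  exp(theta_hat[1] + theta_hat[2] * x + theta_hat[3] * y)
}

fitted_intensity(0, 0)                    # intensity at the origin
fitted_intensity(c(0, 500, 1000), 250)    # decreasing in x, as theta1 < 0
```

Because the coefficient of x is negative and that of y is positive, the fitted intensity falls from left to right and rises from bottom to top.<br />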
To fit an inhomogeneous Poisson model with an intensity that is log-quadratic in the cartesian<br />
coordinates, i.e. such that log λθ((x,y)) is a quadratic in x and y:<br />
> ppm(bei, ~polynom(x, y, 2))<br />
Nonstationary Poisson process<br />
Trend formula: ~polynom(x, y, 2)<br />
Fitted coefficients for trend formula:<br />
(Intercept) polynom(x, y, 2)[x] polynom(x, y, 2)[y]<br />
-4.275762e+00 -1.609187e-03 -4.895166e-03<br />
polynom(x, y, 2)[x^2] polynom(x, y, 2)[x.y] polynom(x, y, 2)[y^2]<br />
1.625968e-06 -2.836387e-06 1.331331e-05<br />
Essentially any kind of model formula can be used, involving the reserved names x and y<br />
and any covariates (as we explain later).<br />
To fit a model with constant but unequal intensities on each side of the vertical line x = 500,<br />
the explanatory variable S(u) should be a factor with two levels, left and right say, taking<br />
the value left when the location u is to the left of the line x = 500.<br />
> side <- function(z) factor(ifelse(z < 500, "left", "right"))<br />
> ppm(bei, ~side(x))<br />
Nonstationary Poisson process<br />
Trend formula: ~side(x)<br />
Fitted coefficients for trend formula:<br />
(Intercept) side(x)right<br />
-4.8026460 -0.2792705<br />
When factors are involved, the interpretation of the coefficients depends on which ‘contrasts’<br />
are in force. By default the ‘treatment contrasts’ are assumed. This means that the treatment<br />
effect is taken to be zero for the first level of the factor, and the estimated treatment effects for<br />
other levels are effectively estimates of the difference from the first level. In this case "left"<br />
comes alphabetically before "right", so by default, the first level is "left". The fitted model<br />
is<br />
$$\lambda_\theta((x,y)) = \begin{cases} \exp(-4.8026) & \text{if } x < 500 \\ \exp(-4.8026 + (-0.2793)) & \text{if } x \ge 500 \end{cases}$$<br />
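The treatment-contrast reading can be checked numerically. In the sketch below the two fitted intensities are computed from the printed coefficients and converted to expected counts, assuming (this is not stated in the section) that the bei window is the 1000 × 500 rectangle, so that each side of the line x = 500 has area 500 × 500.<br />

```r
# Fitted coefficients copied from the ppm printout above
intercept <- -4.8026460     # log intensity for the first level, "left"
effect_right <- -0.2792705  # difference in log intensity, "right" minus "left"

lambda_left <- exp(intercept)
lambda_right <- exp(intercept + effect_right)

# Assumption: window is 1000 x 500, so each half has area 500 * 500
half_area <- 500 * 500
expected_left <- lambda_left * half_area
expected_right <- lambda_right * half_area

c(left = expected_left, right = expected_right,
  total = expected_left + expected_right)
```

Under that assumption the two expected counts should add up to roughly the 3604 trees in the dataset, as they must for a model with a separate constant intensity on each side.<br />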
Rather than relying on such interpretations, it is prudent to use the command predict to<br />
compute predicted values of the model, as explained in Section 13.4 below.<br />
13.3.2 Models involving spatial covariates<br />
It is also possible to fit an inhomogeneous Poisson process model with an intensity function that<br />
depends on an observed covariate. Let Z(u) be a covariate that has been measured at every<br />
location u in the study window. Then Z(u), or any transformation of it, can serve as the statistic<br />
S(u) in the parametric form (5) for the intensity function.<br />
The point pattern dataset bei is supplied with accompanying covariate data bei.extra.<br />
The covariates are the elevation (altitude) and the slope of the terrain at each location in the<br />
window, given as two pixel images bei.extra$elev and bei.extra$grad.<br />
> data(bei)<br />
> grad <- bei.extra$grad<br />
> plot(grad)<br />
[Figure: pixel image of the slope covariate grad]<br />
To fit the inhomogeneous Poisson model with intensity which is a loglinear function of slope,<br />
i.e.<br />
λ(u) = exp(β0 + β1Z(u)) (6)<br />
where β0, β1 are parameters and Z(u) is the slope at location u, we type<br />
> ppm(bei, ~slope, covariates = list(slope = grad))<br />
Nonstationary Poisson process<br />
Trend formula: ~slope<br />
Fitted coefficients for trend formula:<br />
(Intercept) slope<br />
-5.390553 5.022021<br />
In the call to ppm, the argument covariates should be a list of name=value pairs. The names<br />
should match the variables appearing in the model formula. The values should be pixel images.<br />
The printout includes the fitted coefficients \(\beta_0, \beta_1\), so the fitted model is<br />
\[ \lambda(u) = \exp(-5.390553 + 5.022021\, Z(u)). \tag{7} \]<br />
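To see what the fitted model (7) implies, the intensity can be evaluated by hand at a few slope values. This is a small base-R sketch using only the coefficients printed above; the slope values and the names lambda_fit and rate_ratio are illustrative choices, not part of spatstat.<br />

```r
# Fitted loglinear intensity (7), using the coefficients printed by ppm
beta0 <- -5.390553
beta1 <- 5.022021
lambda_fit <- function(z) exp(beta0 + beta1 * z)

z <- c(0, 0.1, 0.2, 0.3)                 # illustrative slope values
data.frame(slope = z, intensity = lambda_fit(z))

# Multiplicative change in intensity for an increase of 0.1 in slope
rate_ratio <- exp(beta1 * 0.1)
rate_ratio
```

Because the model is loglinear, each increase of 0.1 in slope multiplies the fitted intensity by the same factor, roughly 1.65 here.<br />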
It might be more appropriate to fit the inhomogeneous Poisson model with intensity that is<br />
proportional to slope,<br />
\[ \lambda(u) = \beta Z(u) \tag{8} \]<br />
where again Z(u) is the slope at u. Equivalently<br />
\[ \log \lambda(u) = \log \beta + \log Z(u). \tag{9} \]<br />
There is no coefficient in front of the term log Z(u) in (9), so this term is an ‘offset’. To fit this<br />
model, we type<br />
> ppm(bei, ~offset(log(slope)), covariates = list(slope = grad))<br />
Nonstationary Poisson process<br />
Trend formula: ~offset(log(slope))<br />
Fitted coefficients for trend formula:<br />
(Intercept)<br />
-2.427127<br />
The fitted coefficient is the constant log β appearing in (9), so converting back to the form<br />
(8), the fitted model is<br />
\[ \lambda(u) = e^{-2.427127} Z(u) = 0.0883\, Z(u). \]<br />
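The conversion from the fitted intercept back to the constant β in (8) is a one-line calculation. The sketch below repeats it in base R and, for comparison, tabulates the two fitted intensities (7) and (8) at a few illustrative slope values; the variable names are chosen here for illustration.<br />

```r
log_beta <- -2.427127              # fitted intercept of the offset model
beta <- exp(log_beta)              # proportionality constant in (8)
round(beta, 4)

z <- c(0.05, 0.1, 0.2, 0.3)        # illustrative slope values
data.frame(
  slope        = z,
  loglinear    = exp(-5.390553 + 5.022021 * z),   # fitted model (7)
  proportional = beta * z                         # fitted model (8)
)
```

The proportional model forces the intensity to vanish as the slope tends to zero, whereas the loglinear model keeps a positive baseline intensity on flat terrain.<br />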
13.4 Fitted models<br />
The value returned by the model-fitting function ppm is an object of class "ppm" that represents<br />
the fitted model. This is analogous to the fitting of linear models (lm), generalised linear models<br />
(glm) and so on.<br />
13.4.1 Standard operations<br />
The following standard operations on fitted models in R can be applied to point process models<br />
(i.e. these operations have methods for the class "ppm"):<br />
print          print basic information<br />
summary        print detailed summary information<br />
plot           plot the fitted intensity<br />
predict        compute the fitted intensity<br />
fitted         compute the fitted intensity at data points<br />
update         re-fit the model<br />
coef           extract the fitted coefficient vector \(\hat\theta\)<br />
vcov           variance-covariance matrix of \(\hat\theta\)<br />
anova          analysis of deviance<br />
logLik         log-likelihood value<br />
formula        extract the model formula<br />
terms          extract the terms in the model<br />
model.matrix   compute the design matrix<br />
For information on these methods, see print.ppm, summary.ppm, plot.ppm etc. The following<br />
commands also work on "ppm" objects:<br />
step           stepwise model selection<br />
drop1          one step model deletion<br />
AIC            Akaike Information Criterion<br />
> fit <- ppm(bei, ~x + y)<br />
> fit<br />
Nonstationary Poisson process<br />
Trend formula: ~x + y<br />
Fitted coefficients for trend formula:<br />
  (Intercept)             x             y<br />
-4.7245290274 -0.0008031288  0.0006496090<br />
> plot(fit, how = "image", se = FALSE)<br />
> predict(fit, type = "trend")<br />
Fitted trend<br />
real-valued pixel image<br />
50 x 50 pixel array (ny, nx)<br />
enclosing rectangle: [0, 1000] x [0, 500] metres<br />
> predict(fit, type = "cif", ngrid = 256)<br />
real-valued pixel image<br />
256 x 256 pixel array (ny, nx)<br />
enclosing rectangle: [0, 1000] x [0, 500] metres<br />
> coef(fit)<br />
(Intercept) x y<br />
-4.7245290274 -0.0008031288 0.0006496090<br />
> vcov(fit)<br />
(Intercept) x y<br />
(Intercept) 1.854091e-03 -1.491267e-06 -3.528289e-06<br />
x -1.491267e-06 3.437842e-09 1.208410e-14<br />
y -3.528289e-06 1.208410e-14 1.338955e-08<br />
> sqrt(diag(vcov(fit)))<br />
(Intercept) x y<br />
4.305915e-02 5.863311e-05 1.157132e-04<br />
> round(vcov(fit, what = "corr"), 2)<br />
(Intercept) x y<br />
(Intercept) 1.00 -0.59 -0.71<br />
x -0.59 1.00 0.00<br />
y -0.71 0.00 1.00<br />
This is the fitted model with intensity function<br />
\[ \lambda_\theta((x,y)) = \exp(\theta_0 + \theta_1 x + \theta_2 y) \tag{10} \]<br />
with the following estimates:<br />
i   \(\hat\theta_i\)   var(\(\hat\theta_i\))   standard deviation<br />
0   -4.724529       0.001854091    0.04305915<br />
1   -0.0008031288   3.437842e-09   5.863311e-05<br />
2   0.000649609     1.338955e-08   0.0001157132<br />
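The standard deviations and correlations in the printout follow directly from the variance-covariance matrix, and the same quantities give approximate confidence intervals. The sketch below re-enters the printed vcov(fit) values in base R; the 95% Wald intervals are an addition made here for illustration and assume asymptotic normality of the estimates.<br />

```r
# Estimates and variance-covariance matrix, copied from the printout above
theta <- c("(Intercept)" = -4.7245290274, x = -0.0008031288, y = 0.0006496090)
V <- matrix(c( 1.854091e-03, -1.491267e-06, -3.528289e-06,
              -1.491267e-06,  3.437842e-09,  1.208410e-14,
              -3.528289e-06,  1.208410e-14,  1.338955e-08),
            nrow = 3, byrow = TRUE,
            dimnames = list(names(theta), names(theta)))

se <- sqrt(diag(V))                  # standard deviations of the estimates
corr <- round(cov2cor(V), 2)         # correlation matrix, as in vcov(fit, what = "corr")

# Approximate 95% Wald confidence intervals (assumption: asymptotic normality)
ci <- cbind(lower = theta - qnorm(0.975) * se,
            upper = theta + qnorm(0.975) * se)
se
corr
ci
```

Neither interval for the coefficients of x and y contains zero, which agrees with the visible trend in the fitted intensity.<br />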
It is also possible to compute the standard error of the fitted intensity \(\lambda_{\hat\theta}(u)\) at each location<br />
u, as a pixel image. Use predict(fit, type="se") or plot(fit, se=TRUE).<br />
> SE <- predict(fit, type = "se")<br />
> plot(SE, main = "standard error of fitted intensity")<br />
[Figure: pixel image of the standard error of the fitted intensity]<br />
If the model formula involves transformations of the original covariates, then model.matrix(fit)<br />
gives the design matrix whose columns contain these transformed covariates, and model.images(fit)<br />
gives a list of pixel images of these transformed covariates.<br />
> fit <- ppm(bei, ~sqrt(slope) + x, covariates = list(slope = grad))<br />
> mo <- model.images(fit)<br />
> mo<br />
(Intercept) :<br />
real-valued pixel image<br />
100 x 100 pixel array (ny, nx)<br />
enclosing rectangle: [0, 1000] x [0, 500] metres<br />
sqrt(slope) :<br />
real-valued pixel image<br />
100 x 100 pixel array (ny, nx)<br />
enclosing rectangle: [0, 1000] x [0, 500] metres<br />
x :<br />
real-valued pixel image<br />
100 x 100 pixel array (ny, nx)<br />
enclosing rectangle: [0, 1000] x [0, 500] metres<br />
> plot(mo[[2]])<br />
[Figure: pixel image of mo[[2]], the transformed covariate sqrt(slope)]<br />
It is also possible to plot the ‘effect’ of a single covariate in the model. The command<br />
effectfun computes the intensity of the fitted model as a function of one of its covariates. This<br />
is chiefly useful if the model only has one covariate.<br />
> fit <- ppm(bei, ~slope, covariates = list(slope = grad))<br />
> plot(effectfun(fit, "slope"))<br />
[Figure: plot of effectfun(fit, "slope"), showing the fitted intensity lambda (vertical axis) against slope (horizontal axis)]<br />
13.4.2 Model selection<br />
Analysis of deviance for nested Poisson point process models is implemented in spatstat as<br />
anova.ppm. The first model should be a sub-model of the second.<br />
> fit <- ppm(bei, ~slope, covariates = list(slope = grad))<br />
> fitnull <- update(fit, ~1)<br />
> anova(fitnull, fit, test = "Chi")<br />
Analysis of Deviance Table<br />
Model 1: .mpl.Y ~ 1<br />
Model 2: .mpl.Y ~ slope<br />
  Resid. Df Resid. Dev Df Deviance P(>|Chi|)<br />
1     20507    18728.4<br />
2     20506    18346.1  1    382.3 4.018e-85<br />
This effectively performs the likelihood ratio test of the null hypothesis of a homogeneous<br />
Poisson process (CSR) against the alternative of an inhomogeneous Poisson process with intensity<br />
that is a loglinear function of the slope covariate (6). The p-value is extremely small,<br />
indicating rejection of CSR in favour of the alternative. (Please ignore the columns Resid. Df<br />
and Resid. Dev, which are artefacts of the discretisation. Only the deviance difference and the<br />
difference in degrees of freedom are valid.)<br />
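The likelihood ratio test can be reproduced by hand from the two quantities that are valid in the table. The sketch below uses base R only and takes the deviances as printed, so the result agrees with the printed p-value only up to rounding of the deviance.<br />

```r
dev_null <- 18728.4                  # residual deviance, Model 1 (as printed)
dev_alt  <- 18346.1                  # residual deviance, Model 2 (as printed)

lr_stat <- dev_null - dev_alt        # deviance difference = likelihood ratio statistic
df <- 1                              # one extra parameter (the slope coefficient)

# Upper tail of the chi-squared distribution with df degrees of freedom
p_value <- pchisq(lr_stat, df = df, lower.tail = FALSE)
lr_stat
p_value
```

Only the difference of the two deviances enters the calculation, which is why the individual Resid. Dev values need not be meaningful in themselves.<br />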
Note that standard Analysis of Deviance requires the null hypothesis to be a sub-model of the<br />
alternative. Unfortunately the model (8), in which intensity is proportional to slope, does not<br />
include the homogeneous Poisson process as a special case, so we cannot use analysis of deviance<br />
to test the null hypothesis of homogeneous Poisson against the alternative of an inhomogeneous<br />
Poisson with intensity (8).<br />
One possibility here is to use the Akaike Information Criterion AIC for model selection.<br />
> fitprop <- ppm(bei, ~offset(log(slope)), covariates = list(slope = grad))<br />
> fitnull <- ppm(bei, ~1)<br />
> AIC(fitprop)<br />
[1] 42496.65<br />
> AIC(fitnull)<br />
[1] 42763.92<br />
The smaller AIC favours the model (8), in which intensity is proportional to slope.<br />
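For the homogeneous Poisson model the AIC can be checked by hand, because its maximised log-likelihood has the closed form n log(n/|W|) − n. The sketch below assumes that bei contains 3604 points in the 1000 × 500 metre rectangle shown in the printouts; the value for the proportional model is simply copied from the output above.<br />

```r
n <- 3604                        # number of points in bei (assumed known)
area <- 1000 * 500               # window area in square metres
lambda_hat <- n / area           # maximum likelihood estimate of the intensity under CSR

# Poisson process log-likelihood at the MLE: sum of log-intensities minus integral of intensity
loglik_null <- n * log(lambda_hat) - lambda_hat * area
aic_null <- -2 * loglik_null + 2 * 1      # one fitted parameter

aic_prop <- 42496.65             # AIC(fitprop), as printed
aic_null
aic_null - aic_prop              # positive difference favours the proportional model
```

Both models have a single fitted parameter, so the AIC difference here is just twice the difference in maximised log-likelihood.<br />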
Automatic model selection can be performed using step. By default, this performs stepwise<br />
deletion. Starting from the fitted model, the procedure considers each term in the model, and<br />
determines whether the term should be deleted (according to AIC). The deletion giving the<br />
biggest improvement in AIC is carried out. This is applied recursively until no more terms can<br />
be deleted.<br />
> X <- rpoispp(100)<br />
> fit <- ppm(X, ~x + y)<br />
> step(fit)<br />
Start: AIC=-707.35<br />
~x + y<br />
Df AIC<br />
- y 1 -709.35<br />
- x 1 -707.83<br />
-707.35<br />
Step: AIC=-709.35<br />
~x<br />
Df AIC<br />
- x 1 -709.83<br />
-709.35<br />
Step: AIC=-709.83<br />
~1<br />
Stationary Poisson process<br />
Uniform intensity: 99<br />
13.5 Simulating the fitted model<br />
A fitted Poisson model can be simulated automatically using the function rmh.<br />
> X <- rmh(fit)<br />
> plot(X, main = "realisation of fitted model")<br />
[Figure: realisation of fitted model]<br />
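For intuition about what a realisation of a fitted Poisson model looks like, an inhomogeneous Poisson process can also be simulated directly by independent thinning. Note that this is a different technique from the Metropolis-Hastings algorithm used by rmh. The sketch below is base R only and, as an illustrative assumption, uses the coefficients printed in Section 13.4.1 for the ~x + y model on the 1000 × 500 metre window.<br />

```r
# Sketch: simulate an inhomogeneous Poisson process by independent thinning.
# NOTE: this is not the Metropolis-Hastings algorithm used by rmh.
set.seed(2008)

# Loglinear intensity with the coefficients printed for the ~x + y fit to bei
theta <- c(-4.7245290274, -0.0008031288, 0.0006496090)
lambda <- function(x, y) exp(theta[1] + theta[2] * x + theta[3] * y)

xrange <- c(0, 1000)             # window, in metres
yrange <- c(0, 500)
area <- diff(xrange) * diff(yrange)

# Upper bound: theta[2] < 0 and theta[3] > 0, so the maximum is at x = 0, y = 500
lambda_max <- lambda(xrange[1], yrange[2])

# Step 1: homogeneous Poisson process with intensity lambda_max
n_prop <- rpois(1, lambda_max * area)
px <- runif(n_prop, xrange[1], xrange[2])
py <- runif(n_prop, yrange[1], yrange[2])

# Step 2: keep each proposed point with probability lambda / lambda_max
keep <- runif(n_prop) < lambda(px, py) / lambda_max
sim <- data.frame(x = px[keep], y = py[keep])

# Expected number of points: integral of lambda over the window (closed form)
expected_n <- exp(theta[1]) *
  (exp(theta[2] * xrange[2]) - exp(theta[2] * xrange[1])) / theta[2] *
  (exp(theta[3] * yrange[2]) - exp(theta[3] * yrange[1])) / theta[3]

c(simulated = nrow(sim), expected = expected_n)
plot(sim$x, sim$y, pch = ".", asp = 1, xlab = "x", ylab = "y")
```

The expected number of simulated points is the integral of the fitted intensity, which for a maximum likelihood fit with an intercept is close to the number of data points.<br />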
14 Methods 4: checking a fitted Poisson model<br />
After fitting a point process model to a point pattern dataset, we should check that the model is a<br />
good fit (‘goodness-of-fit’), and that each component assumption of the model was appropriate<br />
(‘validation’). This section presents some techniques available for checking a fitted Poisson<br />
model.<br />
Model checking can be either ‘formal’ or ‘informal’. Formal techniques are based on detailed<br />
probabilistic assumptions about the data, and allow us to make probabilistic statements about<br />
the outcome. They include hypothesis tests (χ² tests, goodness-of-fit tests, Monte Carlo tests)<br />
and Bayesian model selection.<br />
In contrast, ‘informal’ tools do not impose assumptions on the data and their interpretation<br />
depends on human judgement. A typical example is the residual, defined for each observation by<br />
(residual) = (observed) - (fitted). If the model is a good fit, then the residuals should<br />
be ‘noise’, centred around zero.<br />
14.1 Goodness-of-fit<br />
A goodness-of-fit test is a formal test of the null hypothesis that the model is true, against the<br />
very general alternative that the model is not true.<br />
The χ² goodness-of-fit test based on quadrat counts can be applied to a fitted Poisson model,<br />
homogeneous or inhomogeneous. Under the null hypothesis, the quadrat counts are independent<br />
Poisson variables with different mean values, and the means are estimated by the fitted model.<br />
> data(bei)<br />
> fit <- ppm(bei, ~x)<br />
> M <- quadrat.test(fit, nx = 4, ny = 2)<br />
> M<br />
Chi-squared test of fitted model 'fit' using quadrat counts<br />
data: data from fit<br />
X-squared = 711.5036, df = 6, p-value < 2.2e-16<br />
If (as in this case) the formal goodness-of-fit test rejects the fitted model, we would then like<br />
to gain an informal impression of the type of departure from the model (i.e. in what way the<br />
data appear to depart from the predictions of the model) so that we may formulate a better<br />
model. To do this we can inspect the residual counts.<br />
> plot(bei, pch = ".")<br />
> plot(M, add = TRUE, cex = 1.5, col = "red")<br />
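[Figure: plot of bei with a 4 x 2 grid of quadrats; the values displayed in each quadrat are tabulated below]<br />
row      column   observed   expected   Pearson residual<br />
top      1        666        601        2.7<br />
top      2        677        478.5      9.1<br />
top      3        130        402.8      −14<br />
top      4        481        319.7      9<br />
bottom   1        544        601.6      −2.3<br />
bottom   2        165        477.9      −14<br />
bottom   3        643        402.3      12<br />
bottom   4        298        320.2      −1.2<br />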
The plot displays, for each quadrat, the observed number of points (top left), the predicted<br />
number of points according to the model (top right), and the Pearson residual (bottom) defined<br />
by<br />
\[ \text{Pearson residual} = \frac{(\text{observed}) - (\text{expected})}{\sqrt{\text{expected}}}. \]<br />
If the original data were Poisson, this transformation approximately standardises the residuals<br />
so that they have mean zero and variance 1 when the model is true. A Pearson residual of −14<br />
is a gross departure from the fitted model.<br />
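The Pearson residuals and the χ² statistic can be recomputed from the observed and expected counts displayed in the quadrat plot. The sketch below is base R only; the counts are copied from the plot (the expected counts are rounded there, so the statistic matches the printed value only approximately), and the degrees of freedom assume a fitted model with two parameters.<br />

```r
# Observed and expected quadrat counts, read off the quadrat plot (top row, then bottom row)
observed <- c(666, 677, 130, 481, 544, 165, 643, 298)
expected <- c(601, 478.5, 402.8, 319.7, 601.6, 477.9, 402.3, 320.2)

pearson <- (observed - expected) / sqrt(expected)   # Pearson residuals
signif(pearson, 2)

X2 <- sum(pearson^2)                 # chi-squared statistic
df <- length(observed) - 2           # assumption: 8 quadrats minus 2 fitted parameters
p_value <- pchisq(X2, df = df, lower.tail = FALSE)
c(X2 = X2, df = df)
```

The two residuals near −14 contribute more than half of the total statistic, which points to the quadrats where the model most overestimates the number of trees.<br />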
The Kolmogorov-Smirnov test can also be applied to a fitted Poisson model, with homogeneous<br />
or inhomogeneous intensity.<br />
> kstest(fit, function(x, y) {<br />
+ y<br />
+ })<br />
Spatial Kolmogorov-Smirnov test of inhomogeneous Poisson process<br />
data: covariate 'function(x, y) { y}' evaluated at points of 'bei'<br />
and transformed to uniform distribution under 'fit'<br />
D = 0.1019, p-value < 2.2e-16<br />
alternative hypothesis: two-sided<br />
This uses the method kstest.ppm for the generic function kstest.<br />
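The idea behind the test is easiest to see for CSR: if the covariate is the y coordinate and the window is a rectangle, then under the model the covariate values at the data points are uniformly distributed, and an ordinary Kolmogorov-Smirnov test applies. The sketch below uses base R with a simulated pattern (not the bei data); it illustrates the principle and is not the spatstat implementation. In more recent spatstat releases this test appears to be provided under the name cdf.test.<br />

```r
set.seed(14)

# Simulated CSR pattern in a 1000 x 500 window (illustrative, not the bei data)
n <- rpois(1, 200)
x <- runif(n, 0, 1000)
y <- runif(n, 0, 500)

# Under CSR, the covariate Z(x, y) = y evaluated at the points is uniform on [0, 500]
ks <- ks.test(y, "punif", min = 0, max = 500)
ks$statistic
ks$p.value
```

For an inhomogeneous model the same comparison is made after transforming the covariate values by their predicted distribution function under the fitted intensity.<br />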
14.2 Validation using residuals<br />
14.2.1 Residuals<br />
Residuals from the fitted model are an important diagnostic tool in other areas of applied<br />
statistics, but in spatial statistics they have only recently been developed ([39, 45], [44, pp.<br />
49–50], [6]).<br />
For a fitted Poisson process model, with fitted intensity \(\hat\lambda(u)\), the predicted number of points<br />
falling in any region B is \(\int_B \hat\lambda(u)\,du\). Hence the residual in each region \(B \subset \mathbb{R}^2\) is defined [6]<br />
to be the observed minus predicted number of points falling in B:<br />
\[ R(B) = n(x \cap B) - \int_B \hat\lambda(u)\,du \tag{11} \]<br />
where x is the observed point pattern, n(x ∩ B) the number of points of x in the region B, and<br />
\(\hat\lambda(u)\) is the intensity of the fitted model.<br />
These residuals are closely related to the residuals for quadrat counts that were used above.<br />
Taking the set B to be one of our quadrats, the ‘observed’ quadrat count is n(x ∩ B). The<br />
‘expected’ quadrat count is \(\hat\lambda\,\mathrm{area}(B)\) if the model is CSR, or more generally \(\int_B \hat\lambda(u)\,du\) if the<br />
model is an inhomogeneous Poisson process. Hence the ‘raw residual’ is observed − expected<br />
\(= n(x \cap B) - \int_B \hat\lambda(u)\,du\).<br />
14.2.2 Residual measure<br />
Equation (11) defines the total residual for any region B, large or small.<br />
Intuitively the residuals can be visualised as an electric charge, with unit positive charge at<br />
each data point, and a diffuse negative charge at all other locations u, with density \(\hat\lambda(u)\). If the<br />
model is true, then these charges should approximately cancel.<br />
To visualise this electric charge, one way is to plot the observed points and the<br />
fitted intensity function together:<br />
> data(bei)<br />
> fit <- ppm(bei, ~x)<br />
> plot(predict(fit))<br />
> plot(bei, add = TRUE, pch = "+")<br />
[Figure: image of the fitted intensity predict(fit), with the points of bei superimposed as '+' symbols]<br />
Each data point should be visualised as a charge of +1, while the colour image indicates a<br />
negative charge density. If the model is true then these positive and negative charges should<br />
even out to zero.<br />
14.2.3 Smoothed residuals<br />
A more useful way to visualise the residuals is to smooth them.<br />
> data(bei)<br />
> fitx <- ppm(bei, ~x)<br />
> diagnose.ppm(fitx, which = "smooth")<br />
[Figure: image plot titled "Smoothed raw residuals", with contour levels from −0.004 to 0.006]<br />
This is an image plot of the ‘smoothed residual field’<br />
\[ s(u) = \tilde\lambda(u) - \lambda^\dagger(u) \tag{12} \]<br />
where \(\tilde\lambda(u)\) is the nonparametric, kernel estimate of the intensity,<br />
\[ \tilde\lambda(u) = e(u) \sum_{i=1}^{n(x)} \kappa(u - x_i) \]<br />
while \(\lambda^\dagger(u)\) is a correspondingly-smoothed version of the parametric estimate of the intensity<br />
according to the fitted model,<br />
\[ \lambda^\dagger(u) = e(u) \int_W \kappa(u - v)\, \lambda_{\hat\theta}(v)\, dv. \]<br />
Here κ is the smoothing kernel and e(u) is the edge correction (2) on page 67. The difference<br />
(12) should be approximately zero if the model is true.<br />
In this example the smoothed residual image contains a visible trend, suggesting that the<br />
model is inappropriate.<br />
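The construction of the smoothed residual field can be illustrated in one dimension, where the integrals are available in closed form. The sketch below is base R only and rests on several assumptions made for illustration: a simulated CSR pattern on [0, 1], a fitted homogeneous intensity, a Gaussian kernel with an arbitrary bandwidth, and an edge correction e(u) equal to the reciprocal of the kernel mass inside the window. It is not the algorithm used by diagnose.ppm.<br />

```r
# One-dimensional analogue of the smoothed residual field (12); a sketch only.
set.seed(1)
W <- c(0, 1)                                  # window [0, 1]
x <- runif(rpois(1, 200), W[1], W[2])         # simulated CSR pattern
lambda_hat <- length(x) / diff(W)             # fitted homogeneous intensity
sigma <- 0.05                                 # kernel bandwidth (arbitrary choice)
u <- seq(W[1], W[2], length.out = 201)        # evaluation locations

# Mass of the Gaussian kernel centred at u that falls inside W.
# ASSUMPTION: the edge correction e(u) is the reciprocal of this mass.
mass <- pnorm(W[2], mean = u, sd = sigma) - pnorm(W[1], mean = u, sd = sigma)
e <- 1 / mass

# Kernel estimate of the intensity: e(u) * sum_i kappa(u - x_i)
lambda_tilde <- e * vapply(u, function(ui) sum(dnorm(ui - x, sd = sigma)), numeric(1))

# Smoothed fitted intensity: e(u) * integral over W of kappa(u - v) * lambda_hat dv
lambda_dagger <- e * lambda_hat * mass

s <- lambda_tilde - lambda_dagger             # smoothed residual field
plot(u, s, type = "l", xlab = "u", ylab = "s(u)")
abline(h = 0, lty = 2)
```

With this edge correction the smoothed version of a constant fitted intensity is again constant, so any systematic drift of s(u) away from zero reflects the data rather than the smoothing.<br />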
14.2.4 Lurking variable plot<br />
If there is a spatial covariate Z(u) that plays an important role in the analysis, it may be useful<br />
to display a lurking variable plot of the residuals against Z. This is a plot of C(z) = R(B(z))<br />
against z, where<br />
\[ B(z) = \{u \in W : Z(u) \le z\} \]<br />
is the region of space where the covariate value is less than or equal to z.<br />
> grad <- bei.extra$grad<br />
> lurking(fitx, grad, type = "raw")<br />
[Figure: lurking variable plot of cumulative raw residuals (vertical axis) against the covariate (horizontal axis)]<br />
Note that the lurking variable plot typically starts and ends at the horizontal axis, since (for<br />
any model with an intercept term) the total residual for the entire window W must equal zero.<br />
This is analogous to the fact that the residuals in linear regression sum to zero.<br />
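The cumulative residual C(z) can be computed directly from its definition on a pixel grid. The sketch below is base R only and uses a simulated CSR pattern in the unit square, a fitted homogeneous intensity, and a hypothetical covariate Z(x, y) = x² + y chosen purely for illustration; it is not the implementation of lurking.<br />

```r
set.seed(42)
n <- 150
px <- runif(n); py <- runif(n)            # simulated CSR pattern in the unit square
Z <- function(x, y) x^2 + y               # hypothetical covariate, for illustration only
lambda_hat <- n                           # fitted CSR intensity (window area is 1)

# Pixel grid approximating the window W
m <- 100
g <- expand.grid(x = (seq_len(m) - 0.5) / m, y = (seq_len(m) - 0.5) / m)
zg <- Z(g$x, g$y)                         # covariate at pixel centres
zp <- Z(px, py)                           # covariate at data points
pixel_area <- 1 / m^2

# C(z) = n(x in B(z)) - integral of lambda_hat over B(z), with B(z) = {u : Z(u) <= z}
zmax <- max(zg, zp)
zs <- c(seq(min(zg, zp), zmax, length.out = 199), zmax)   # final threshold covers all of W
C <- sapply(zs, function(z) sum(zp <= z) - lambda_hat * pixel_area * sum(zg <= z))

plot(zs, C, type = "l", xlab = "covariate", ylab = "cumulative raw residuals")
abline(h = 0, lty = 2)
```

At the largest threshold the region B(z) is the whole window, so the final value of C is the total residual, which is zero for this fitted model.<br />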
The plot also shows approximate 5% significance bands for the cumulative residual C(z),<br />
obtained from the asymptotic variance under the model.<br />
This plot indicates that the model is grossly inadequate; the fitted intensity function fails to<br />
capture the dependence of intensity on slope.<br />
It can be helpful to display the derivative C′(z), which often indicates which values of z are<br />
associated with a lack of fit.<br />
> lurk<strong>in</strong>g(fitx, grad, type = "raw", cumulative = FALSE)<br />
[Figure: marginal raw residuals plotted against the covariate]<br />
The derivative is estimated us<strong>in</strong>g a smooth<strong>in</strong>g spl<strong>in</strong>e <strong>and</strong> you may need to tweak the smooth<strong>in</strong>g<br />
parameters (argument spl<strong>in</strong>eargs) to get a useful plot. Also the package currently does<br />
not plot significance b<strong>and</strong>s for C ′ (z).<br />
14.2.5 Four-panel plot<br />
If there are no <strong>spatial</strong> covariates, use the comm<strong>and</strong> diagnose.ppm to plot the residuals:<br />
> data(japanesep<strong>in</strong>es)<br />
> fit diagnose.ppm(fit)<br />
[Figure: four-panel diagnostic plot from diagnose.ppm(fit), including the cumulative sum of raw residuals plotted against the x coordinate and against the y coordinate]<br />
This comb<strong>in</strong>ation <strong>of</strong> four plots has proved to be a useful quick <strong>in</strong>dication <strong>of</strong> departure from<br />
the trend <strong>in</strong> the model.<br />
The bottom right panel is an image <strong>of</strong> the smoothed residual field.<br />
The top left panel is a direct representation <strong>of</strong> the residual ‘charge’, with circles represent<strong>in</strong>g<br />
the data <strong>po<strong>in</strong>t</strong>s (positive residuals) <strong>and</strong> a colour scheme represent<strong>in</strong>g the fitted <strong>in</strong>tensity (negative<br />
residuals). However, it is <strong>of</strong>ten difficult to <strong>in</strong>terpret.<br />
The two other panels are lurking variable plots, each against one of the Cartesian coordinates. For<br />
example, the bottom left panel is a lurk<strong>in</strong>g variable plot for the x-coord<strong>in</strong>ate. Imag<strong>in</strong>e a vertical<br />
l<strong>in</strong>e which sweeps from left to right across the w<strong>in</strong>dow. The progressive total residual to the left<br />
<strong>of</strong> the l<strong>in</strong>e is plotted aga<strong>in</strong>st the position <strong>of</strong> the l<strong>in</strong>e.<br />
In this example, the lurk<strong>in</strong>g variable plot for the y coord<strong>in</strong>ate suggests a lack <strong>of</strong> fit at about<br />
y = 0.15, <strong>and</strong> the image <strong>of</strong> the smoothed residual field suggests an excess <strong>of</strong> positive residuals<br />
at about x = 0.8,y = 0.15, both <strong>in</strong>dicat<strong>in</strong>g that the model underestimates the true <strong>in</strong>tensity <strong>of</strong><br />
<strong>po<strong>in</strong>t</strong>s <strong>in</strong> this vic<strong>in</strong>ity.<br />
14.2.6 Caveats<br />
The residual plots described above are useful for detect<strong>in</strong>g misspecification <strong>of</strong> the trend <strong>in</strong> a fitted<br />
Poisson process model. They are not very useful for check<strong>in</strong>g the <strong>in</strong>dependence assumption, that<br />
is, for check<strong>in</strong>g the properties (PP3)–(PP4) <strong>of</strong> a Poisson process listed on page 72.<br />
Effective diagnostics <strong>of</strong> <strong>in</strong>dependence or dependence between <strong>po<strong>in</strong>t</strong>s <strong>in</strong>clude the K-function<br />
(section 16.4) <strong>and</strong> a Q–Q plot <strong>of</strong> the residuals (section 22.2.3).<br />
PART IV. INTERACTION<br />
Part IV <strong>of</strong> the workshop expla<strong>in</strong>s how to <strong>in</strong>vestigate dependence between the <strong>po<strong>in</strong>t</strong>s <strong>in</strong> a <strong>po<strong>in</strong>t</strong><br />
pattern.<br />
15 Simple models <strong>of</strong> non-Poisson <strong>patterns</strong><br />
A <strong>po<strong>in</strong>t</strong> process that is not Poisson can be said to exhibit ‘<strong>in</strong>teraction’ or dependence between<br />
the <strong>po<strong>in</strong>t</strong>s. It’s time to <strong>in</strong>troduce some models for such processes. This section covers simple<br />
models that are derived from the Poisson process, <strong>and</strong> still reta<strong>in</strong> some <strong>of</strong> the tractable features<br />
<strong>of</strong> the Poisson model.<br />
15.1 Poisson cluster processes<br />
In a Poisson cluster process, we beg<strong>in</strong> with a Poisson process Y <strong>of</strong> ‘parent’ <strong>po<strong>in</strong>t</strong>s. Each ‘parent’<br />
<strong>po<strong>in</strong>t</strong> yi ∈ Y then gives rise to a f<strong>in</strong>ite set Zi <strong>of</strong> ‘<strong>of</strong>fspr<strong>in</strong>g’ <strong>po<strong>in</strong>t</strong>s accord<strong>in</strong>g to some stochastic<br />
mechanism. The set compris<strong>in</strong>g all the <strong>of</strong>fspr<strong>in</strong>g <strong>po<strong>in</strong>t</strong>s forms a <strong>po<strong>in</strong>t</strong> process X. Only X is<br />
observed.<br />
[Figure: three panels showing the parents, the clusters, and the offspring]<br />
An example is the Matérn cluster process <strong>in</strong> which the parent <strong>po<strong>in</strong>t</strong>s come from a homogeneous<br />
Poisson process with <strong>in</strong>tensity κ, <strong>and</strong> each parent has a Poisson (µ) number <strong>of</strong> <strong>of</strong>fspr<strong>in</strong>g,<br />
<strong>in</strong>dependently <strong>and</strong> uniformly distributed <strong>in</strong> a disc <strong>of</strong> radius r centred around the parent.<br />
The Matérn cluster process can be generated <strong>in</strong> spatstat us<strong>in</strong>g the comm<strong>and</strong> rMatClust.<br />
[By convention, r<strong>and</strong>om data generators <strong>in</strong> R always have names beg<strong>in</strong>n<strong>in</strong>g with r.]<br />
> plot(rMatClust(kappa = 10, r = 0.1, mu = 5))<br />
[Figure: a realisation of rMatClust(kappa = 10, r = 0.1, mu = 5)]<br />
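To make the construction concrete, here is a minimal base R sketch of the Matérn cluster mechanism on the unit square. It is not the spatstat implementation: the helper name sim_matern_cluster and the choice to generate parents on a window enlarged by r (so that parents just outside the window can contribute offspring) are assumptions made for this illustration.

```r
set.seed(42)

# Hypothetical helper (not part of spatstat): Matern cluster process on the unit square
sim_matern_cluster <- function(kappa, r, mu) {
  # Parents: homogeneous Poisson process on the square enlarged by r on every side
  lo <- -r
  hi <- 1 + r
  n_parents <- rpois(1, kappa * (hi - lo)^2)
  px <- runif(n_parents, lo, hi)
  py <- runif(n_parents, lo, hi)

  # Each parent has a Poisson(mu) number of offspring
  n_off <- rpois(n_parents, mu)
  parent <- rep(seq_len(n_parents), times = n_off)

  # Offspring are uniform in the disc of radius r around the parent:
  # radius r * sqrt(U) with a uniform angle gives a uniform point in the disc
  rad <- r * sqrt(runif(length(parent)))
  ang <- runif(length(parent), 0, 2 * pi)
  x <- px[parent] + rad * cos(ang)
  y <- py[parent] + rad * sin(ang)

  # Only offspring that fall inside the window are observed
  keep <- x >= 0 & x <= 1 & y >= 0 & y <= 1
  data.frame(x = x[keep], y = y[keep],
             px = px[parent][keep], py = py[parent][keep])
}

X <- sim_matern_cluster(kappa = 10, r = 0.1, mu = 5)
```

The columns px and py record each offspring's parent only so that the construction can be checked; in a real data set the parents are unobserved. A call such as plot(X$x, X$y, asp = 1) displays the simulated pattern.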
Other Poisson cluster processes implemented <strong>in</strong> spatstat are<br />
• rThomas: the Thomas process, <strong>in</strong> which each cluster consists <strong>of</strong> a Poisson(µ) number <strong>of</strong><br />
random points, each having an isotropic Gaussian $N(0, \sigma^2 I)$ displacement from its parent.<br />
• rGaussPoisson: the Gauss-Poisson process <strong>in</strong> which each cluster is either a s<strong>in</strong>gle <strong>po<strong>in</strong>t</strong><br />
or a pair <strong>of</strong> <strong>po<strong>in</strong>t</strong>s.<br />
• rNeymanScott: the general Neyman-Scott cluster process <strong>in</strong> which the cluster mechanism<br />
is arbitrary.<br />
15.2 Cox processes<br />
A Cox <strong>po<strong>in</strong>t</strong> process is effectively a Poisson process with a r<strong>and</strong>om <strong>in</strong>tensity function. Let Λ(u)<br />
be a random function with non-negative values, defined at all locations $u \in \mathbb{R}^2$. Conditional on<br />
Λ, let X be a Poisson process with <strong>in</strong>tensity function Λ. Then X is a Cox process.<br />
A trivial example is the “mixed Poisson” process <strong>in</strong> which we generate a r<strong>and</strong>om variable Λ<br />
<strong>and</strong>, conditional on Λ, generate a uniform Poisson process with <strong>in</strong>tensity Λ. Follow<strong>in</strong>g are three<br />
different realisations <strong>of</strong> this process:<br />
> par(mfrow = c(1, 3))<br />
> for (i <strong>in</strong> 1:3) {<br />
+ lambda
Poisson process, the result<strong>in</strong>g process <strong>of</strong> reta<strong>in</strong>ed <strong>po<strong>in</strong>t</strong>s is Poisson. To get a non-Poisson process<br />
we need some k<strong>in</strong>d <strong>of</strong> dependent th<strong>in</strong>n<strong>in</strong>g mechanism.<br />
In Matérn’s Model I, a homogeneous Poisson process Y is first generated. Any <strong>po<strong>in</strong>t</strong> <strong>in</strong> Y<br />
that lies closer than a distance r from the nearest other <strong>po<strong>in</strong>t</strong> <strong>of</strong> Y, is deleted. Thus, pairs <strong>of</strong><br />
close neighbours annihilate each other.<br />
> plot(rMaternI(70, 0.05))<br />
[Figure: a realisation of rMaternI(70, 0.05)]<br />
In Matérn’s Model II, the <strong>po<strong>in</strong>t</strong>s <strong>of</strong> the homogeneous Poisson process Y are marked by ‘arrival<br />
times’ ti which are <strong>in</strong>dependent <strong>and</strong> uniformly distributed <strong>in</strong> [0,1]. Any <strong>po<strong>in</strong>t</strong> <strong>in</strong> Y that lies<br />
closer than distance r from another <strong>po<strong>in</strong>t</strong> that has an earlier arrival time, is deleted.<br />
> plot(rMaternII(70, 0.05))<br />
[Figure: a realisation of rMaternII(70, 0.05)]<br />
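The two deletion rules can be sketched in a few lines of base R. This is an illustration under stated assumptions, not the spatstat code: the helper name matern_thin is hypothetical, the proposal process Y is confined to the unit square, and no attempt is made to treat edge effects.

```r
set.seed(7)

# Hypothetical helper (not part of spatstat): Matern Model I or II on the unit square
matern_thin <- function(kappa, r, model = c("I", "II")) {
  model <- match.arg(model)

  # Proposal process Y: homogeneous Poisson with intensity kappa
  n <- rpois(1, kappa)
  x <- runif(n)
  y <- runif(n)
  if (n < 2) return(data.frame(x = x, y = y))

  # close[i, j] is TRUE when points i and j (i != j) are closer than r
  close <- as.matrix(dist(cbind(x, y))) < r
  diag(close) <- FALSE

  if (model == "I") {
    # Model I: delete every point that has any neighbour within distance r
    delete <- rowSums(close) > 0
  } else {
    # Model II: delete a point only if some neighbour within r arrived earlier
    arrival <- runif(n)
    earlier <- outer(arrival, arrival, function(ti, tj) tj < ti)
    delete <- rowSums(close & earlier) > 0
  }
  data.frame(x = x[!delete], y = y[!delete])
}

XI  <- matern_thin(70, 0.05, "I")
XII <- matern_thin(70, 0.05, "II")
```

In both models no two retained points are closer than r. For the same proposal pattern, Model II retains at least as many points as Model I, because the earliest arrival in each close pair is never deleted by that pair.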
15.4 Sequential models<br />
In a sequential model, we start with an empty w<strong>in</strong>dow, <strong>and</strong> the <strong>po<strong>in</strong>t</strong>s are placed <strong>in</strong>to the w<strong>in</strong>dow<br />
one-at-a-time, accord<strong>in</strong>g to some criterion.<br />
In Simple Sequential Inhibition, each new <strong>po<strong>in</strong>t</strong> is generated uniformly <strong>in</strong> the w<strong>in</strong>dow <strong>and</strong><br />
<strong>in</strong>dependently <strong>of</strong> preced<strong>in</strong>g <strong>po<strong>in</strong>t</strong>s. If the new <strong>po<strong>in</strong>t</strong> lies closer than r units from an exist<strong>in</strong>g<br />
<strong>po<strong>in</strong>t</strong>, then it is rejected <strong>and</strong> another r<strong>and</strong>om <strong>po<strong>in</strong>t</strong> is generated. The process term<strong>in</strong>ates when<br />
no further <strong>po<strong>in</strong>t</strong>s can be added.<br />
> plot(rSSI(0.05, 200))<br />
[Figure: a realisation of rSSI(0.05, 200)]<br />
Sequential <strong>po<strong>in</strong>t</strong> processes are the easiest way to generate highly ordered <strong>patterns</strong> with high<br />
<strong>in</strong>tensity.<br />
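Simple Sequential Inhibition is easy to sketch as a rejection loop. The base R code below is an illustration only: the helper name sim_ssi is hypothetical, the window is assumed to be the unit square, and "no further points can be added" is approximated by stopping after a fixed number of consecutive rejections (the giveup argument of this sketch) or once n points have been placed.

```r
set.seed(3)

# Hypothetical helper (not part of spatstat): Simple Sequential Inhibition
sim_ssi <- function(r, n, giveup = 1000) {
  x <- numeric(0)
  y <- numeric(0)
  fails <- 0
  while (length(x) < n && fails < giveup) {
    # Propose a new point uniformly in the window, independently of the past
    ux <- runif(1)
    uy <- runif(1)
    # Accept it only if it is at least r away from every existing point
    if (length(x) == 0 || all((x - ux)^2 + (y - uy)^2 >= r^2)) {
      x <- c(x, ux)
      y <- c(y, uy)
      fails <- 0
    } else {
      fails <- fails + 1
    }
  }
  data.frame(x = x, y = y)
}

X <- sim_ssi(r = 0.05, n = 200)
```

The result has at most n points, and every pair of points is separated by at least r.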
16 Methods 5: Distance methods for <strong>po<strong>in</strong>t</strong> <strong>patterns</strong><br />
Suppose that a <strong>po<strong>in</strong>t</strong> pattern appears to have constant <strong>in</strong>tensity, <strong>and</strong> we wish to assess whether<br />
the pattern is Poisson. The alternative is that the <strong>po<strong>in</strong>t</strong>s are dependent (they exhibit ‘<strong>in</strong>teraction’).<br />
Classical writers suggested a simple trichotomy between ‘<strong>in</strong>dependence’ (the Poisson process),<br />
‘regularity’ (where <strong>po<strong>in</strong>t</strong>s tend to avoid each other), <strong>and</strong> ‘cluster<strong>in</strong>g’ (where <strong>po<strong>in</strong>t</strong>s tend to be<br />
close together). [The concept <strong>of</strong> ‘cluster<strong>in</strong>g’ does not imply that the <strong>po<strong>in</strong>t</strong>s are organised <strong>in</strong>to<br />
identifiable ‘clusters’; merely that they are closer together than expected for a Poisson process.]<br />
16.1 Distances<br />
[Figure: three example patterns labelled independent, regular, and clustered]<br />
The classical techniques for <strong>in</strong>vestigat<strong>in</strong>g <strong>in</strong>ter<strong>po<strong>in</strong>t</strong> <strong>in</strong>teraction are distance methods, based on<br />
measur<strong>in</strong>g the distances between <strong>po<strong>in</strong>t</strong>s. Specifically we may consider<br />
• pairwise distances $s_{ij} = \|x_i - x_j\|$ between all distinct pairs of points $x_i$ and $x_j$ ($i \neq j$)<br />
in the pattern;<br />
• nearest neighbour distances $t_i = \min_{j \neq i} s_{ij}$, the distance from each point $x_i$ to its<br />
nearest neighbour;<br />
• empty space distances $d(u) = \min_i \|u - x_i\|$, the distance from a fixed reference location<br />
$u$ in the window to the nearest data point.<br />
If you need to compute these directly, they are available <strong>in</strong> spatstat us<strong>in</strong>g the functions<br />
pairdist, nndist <strong>and</strong> distmap respectively. If X is a <strong>po<strong>in</strong>t</strong> pattern object,<br />
• pairdist(X) returns the matrix <strong>of</strong> pairwise distances.<br />
• nndist(X) returns the vector <strong>of</strong> nearest neighbour distances.<br />
• distmap(X) returns a pixel image whose pixel values are the empty space distances to the<br />
pattern X measured from every pixel.<br />
> data(cells)<br />
> emp <- distmap(cells)<br />
> plot(emp, main = "Empty space distances")<br />
> plot(cells, add = TRUE)<br />
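For readers who want to see what these three quantities are without spatstat, the following base R sketch computes them directly from their definitions. The pattern is a hypothetical set of 30 uniform random points in the unit square, and the 50 x 50 grid of reference locations is an arbitrary choice.

```r
set.seed(11)

# Hypothetical pattern: 30 uniform random points in the unit square
n <- 30
x <- runif(n)
y <- runif(n)

# Pairwise distances s_ij (an n x n symmetric matrix with zero diagonal)
S <- as.matrix(dist(cbind(x, y)))

# Nearest neighbour distances t_i = min over j != i of s_ij
S_offdiag <- S
diag(S_offdiag) <- Inf
nn <- apply(S_offdiag, 1, min)

# Empty space distances d(u) = min_i ||u - x_i|| at the centres of a 50 x 50 grid
g <- (seq_len(50) - 0.5) / 50
grid <- expand.grid(u1 = g, u2 = g)
d <- apply(sqrt(outer(grid$u1, x, "-")^2 + outer(grid$u2, y, "-")^2), 1, min)
emp <- matrix(d, nrow = 50)   # emp[i, j] is the distance at location (g[i], g[j])
```

A call such as image(g, g, emp, asp = 1) followed by points(x, y) gives a picture comparable to the distance map plotted above.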
[Figure: Empty space distances for the cells data, shown as a pixel image with the points overlaid]<br />
Tip: Quite a useful exploratory tool is the Stienen diagram obta<strong>in</strong>ed by draw<strong>in</strong>g a<br />
circle around each data <strong>po<strong>in</strong>t</strong> <strong>of</strong> diameter equal to its nearest neighbour distance:<br />
> plot(X %mark% (nndist(X)/2), markscale = 1, ma<strong>in</strong> = "Stienen diagram")<br />
[Figure: Stienen diagram]<br />
16.2 Empty space distances<br />
It’s easiest to start by explaining the analysis of the empty space distances.<br />
The distance<br />
\[ d(u, x) = \min\{ \|u - x_i\| : x_i \in x \} \]<br />
from a fixed location $u \in \mathbb{R}^2$ to the nearest point in a point pattern $x$ is called the ‘empty<br />
space distance’ or ‘void distance’. It can be computed for all locations $u$ on a fine grid, using<br />
the spatstat function distmap as we saw above.<br />
16.2.1 Edge effects<br />
It is not easy to <strong>in</strong>terpret a histogram <strong>of</strong> the empty space distances. The empirical distribution <strong>of</strong><br />
the empty space distances depends on the geometry <strong>of</strong> the w<strong>in</strong>dow W as well as on characteristics<br />
<strong>of</strong> the <strong>po<strong>in</strong>t</strong> process X.<br />
Another view<strong>po<strong>in</strong>t</strong> is that the w<strong>in</strong>dow <strong>in</strong>troduces a sampl<strong>in</strong>g bias. Recall that under the<br />
‘st<strong>and</strong>ard model’ (Section 2.3) the <strong>po<strong>in</strong>t</strong> process X extends throughout 2-D space, but is observed<br />
only <strong>in</strong>side W. This leads to bias <strong>in</strong> the distance measurements. Conf<strong>in</strong><strong>in</strong>g observations to a<br />
w<strong>in</strong>dow W implies that the observed distance d(u,x) = d(u,X ∩ W) to the nearest data <strong>po<strong>in</strong>t</strong><br />
<strong>in</strong>side W, may be greater than the true distance d(u,X) to the nearest <strong>po<strong>in</strong>t</strong> <strong>of</strong> the complete<br />
<strong>po<strong>in</strong>t</strong> process X.<br />
16.2.2 Empty space function F<br />
[Figure: observed and true empty space distances]<br />
Ignor<strong>in</strong>g the edge problems for a moment, let us focus on the entire <strong>po<strong>in</strong>t</strong> process X.<br />
Assum<strong>in</strong>g X is stationary (statistically <strong>in</strong>variant under translations), we can def<strong>in</strong>e the cumulative<br />
distribution function <strong>of</strong> the empty space distance<br />
\[ F(r) = P\{ d(u, X) \le r \} \tag{13} \]<br />
where u is an arbitrary reference location. If the process is stationary then this def<strong>in</strong>ition does<br />
not depend on u.<br />
The empirical distribution function of the observed empty space distances on a grid of locations<br />
$u_j$, $j = 1, \ldots, m$,<br />
\[ F^{*}(r) = \frac{1}{m} \sum_{j} 1\{ d(u_j, x) \le r \} \tag{14} \]<br />
is a negatively biased estimator of $F(r)$, for reasons explained above.<br />
Corrections for this ‘edge effect bias’ are required. Many edge corrections are available.<br />
Typically they are weighted versions <strong>of</strong> the ecdf,<br />
\[ \hat F(r) = \sum_{j} e(u_j, r)\, 1\{ d(u_j, x) \le r \} \tag{15} \]<br />
where $e(u, r)$ is an edge correction weight designed so that $\hat F(r)$ is unbiased. These corrections<br />
are effectively forms of the Horvitz-Thompson estimator of survey sampling fame.<br />
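One concrete instance of such a weighted estimator is the border (reduced sample) correction, which uses only those grid locations lying at least r away from the window boundary, so that the disc of radius r around them is fully observed. The base R sketch below is an illustration under assumptions: the pattern is a hypothetical sample of 100 uniform points in the unit square, the grid is 50 x 50, and the weight is taken to be e(u_j, r) = 1{b_j >= r} / #{k : b_k >= r}, where b_j is the distance from u_j to the boundary.

```r
set.seed(5)

# Hypothetical pattern: 100 uniform random points in the unit square
x <- runif(100)
y <- runif(100)

# Grid of reference locations u_j and their empty space distances d(u_j, x)
g <- (seq_len(50) - 0.5) / 50
grid <- expand.grid(u1 = g, u2 = g)
d <- apply(sqrt(outer(grid$u1, x, "-")^2 + outer(grid$u2, y, "-")^2), 1, min)

# Distance from each grid location to the boundary of the unit square
b <- pmin(grid$u1, 1 - grid$u1, grid$u2, 1 - grid$u2)

# Uncorrected empirical distribution function, as in (14)
F_raw <- function(r) mean(d <= r)

# Border-corrected estimate: restrict to locations at least r from the boundary
F_rs <- function(r) {
  eligible <- b >= r
  sum(d <= r & eligible) / sum(eligible)
}

r <- seq(0, 0.1, by = 0.01)
est <- data.frame(r = r, raw = sapply(r, F_raw), rs = sapply(r, F_rs))
```

The price of this correction is that fewer and fewer locations remain eligible as r grows, so the estimate becomes more variable at large distances and need not be monotone in r.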
The edge effect problem can also be regarded as a form of censoring (analogous to right-censoring<br />
in survival data), as first pointed out by CSIRO researcher Geoff Laslett [33]. A<br />
counterpart <strong>of</strong> the Kaplan-Meier estimator is available. For further <strong>in</strong>formation see [7].<br />
Thus, assum<strong>in</strong>g that the <strong>po<strong>in</strong>t</strong> process is homogeneous, we are able to compute an unbiased<br />
<strong>and</strong> reasonably accurate estimate <strong>of</strong> the empty space function F def<strong>in</strong>ed by (13).<br />
To interpret this estimate, a useful benchmark is the Poisson process. Notice that $d(u, X) > r$<br />
if and only if there are no points of $X$ in the disc $b(u, r)$ of radius $r$ centred on $u$. For a<br />
homogeneous Poisson process of intensity $\lambda$, the number of points falling in $b(u, r)$ is Poisson<br />
with mean $\mu = \lambda\, \mathrm{area}(b(u, r)) = \lambda \pi r^2$, so the probability that there are no points in this region<br />
is $\exp(-\mu) = \exp(-\lambda \pi r^2)$. Hence for a Poisson process<br />
\[ F_{\mathrm{pois}}(r) = 1 - \exp(-\lambda \pi r^2). \tag{16} \]<br />
Typically we compare $\hat F(r)$ with the value of $F_{\mathrm{pois}}(r)$ obtained by plugging in the estimated<br />
intensity $\hat\lambda = n(x)/\mathrm{area}(W)$. Values $\hat F(r) > F_{\mathrm{pois}}(r)$ suggest that empty space distances in the<br />
point pattern are shorter than for a Poisson process, suggesting a regularly spaced pattern; while<br />
values $\hat F(r) < F_{\mathrm{pois}}(r)$ suggest a clustered pattern.<br />
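The derivation of (16) can be traced step by step in base R. In this sketch the number of points and the window area are hypothetical values chosen for illustration; the function name Fpois simply mirrors the notation of the text.

```r
# Poisson benchmark (16): probability that the disc b(u, r) contains at least one point
Fpois <- function(r, lambda) 1 - exp(-lambda * pi * r^2)

# Hypothetical data summary: 100 points observed in a window of area 1
n_points <- 100
area_W <- 1
lambda_hat <- n_points / area_W   # plug-in intensity estimate

r <- seq(0, 0.1, by = 0.025)

# Same quantity via the Poisson count in the disc: mean mu, P(no points) = dpois(0, mu)
mu <- lambda_hat * pi * r^2
p_empty <- dpois(0, mu)

cbind(r = r, Fpois = Fpois(r, lambda_hat), one_minus_p_empty = 1 - p_empty)
```

The last two columns agree, which is just the statement that d(u, X) > r exactly when the disc of radius r around u is empty.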
16.2.3 Implementation <strong>in</strong> spatstat<br />
The function Fest computes estimates of F(r) using several edge corrections, and the benchmark<br />
value for the Poisson process.<br />
> data(cells)<br />
> plot(cells)<br />
> Fc <- Fest(cells)<br />
> Fc<br />
Function value object (class ’fv’)<br />
for the function r -> F(r)<br />
Entries:<br />
id      label      description<br />
--      -----      -----------<br />
r       r          distance argument r<br />
theo    Fpois(r)   theoretical Poisson F(r)<br />
rs      Fbord(r)   border corrected estimate of F(r)<br />
km      Fkm(r)     Kaplan-Meier estimate of F(r)<br />
hazard  lambda(r)  Kaplan-Meier estimate of hazard function lambda(r)<br />
raw     Fraw(r)    uncorrected estimate of F(r)<br />
--------------------------------------<br />
Default plot formula:<br />
. ~ r<br />
Recommended range of argument r: [0, 0.085]<br />
[Figure: the cells point pattern]<br />
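Tip: Don’t use F as a variable name! In R, F is a predefined variable whose value is FALSE, and it is<br />
not protected: reassigning it silently changes the meaning of any code that uses F as an abbreviation for FALSE.<br />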
The value returned by Fest is an object <strong>of</strong> class "fv" (“function value table”). This is<br />
effectively a data frame with some extra <strong>in</strong>formation. The pr<strong>in</strong>tout for Fc <strong>in</strong>dicates that the<br />
columns <strong>in</strong> the data frame are named r, theo, rs, km, hazard <strong>and</strong> raw. The first column r<br />
conta<strong>in</strong>s a sequence <strong>of</strong> values <strong>of</strong> the function argument r. The next column theo conta<strong>in</strong>s the<br />
correspond<strong>in</strong>g values <strong>of</strong> F(r) for a homogeneous Poisson process. The columns rs, km <strong>and</strong> raw<br />
conta<strong>in</strong> different estimates <strong>of</strong> the empty space function F, namely the ‘reduced sample’ estimator,<br />
the Kaplan-Meier estimator, <strong>and</strong> the uncorrected empirical distribution function, respectively.<br />
The column hazard contains an estimate of the hazard rate of $F$, i.e. $h(r) = -(d/dr) \log(1 - F(r))$,<br />
a by-product of the Kaplan-Meier estimate.<br />
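As a sanity check on the sign convention, the hazard rate of the Poisson benchmark can be worked out in closed form: since log(1 − Fpois(r)) = −λπr², the hazard is h(r) = 2πλr, a straight line through the origin. The base R sketch below confirms this by numerical differentiation; the intensity value is a hypothetical choice.

```r
# Hypothetical intensity for the Poisson benchmark
lambda <- 100
F_pois <- function(r) 1 - exp(-lambda * pi * r^2)

r <- seq(0.01, 0.08, by = 0.01)

# Hazard rate h(r) = -(d/dr) log(1 - F(r)), approximated by a central difference
eps <- 1e-6
h_numeric <- -(log(1 - F_pois(r + eps)) - log(1 - F_pois(r - eps))) / (2 * eps)

# Closed form for the Poisson process
h_exact <- 2 * pi * lambda * r

cbind(r = r, h_numeric = h_numeric, h_exact = h_exact)
```

An estimated hazard that rises faster or slower than this line is another way of reading departures from the Poisson benchmark.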
> par(pty = "s")<br />
> plot(Fest(cells))<br />
lty col<br />
km 1 1<br />
rs 2 2<br />
theo 3 3<br />
[Figure: Fest(cells); estimates of F(r) plotted against r]<br />
This is a call to plot.fv. The pr<strong>in</strong>ted output is the return value from plot.fv, which<br />
expla<strong>in</strong>s the encod<strong>in</strong>g <strong>of</strong> the different function estimates us<strong>in</strong>g the R graphics parameters lty<br />
(l<strong>in</strong>e type) <strong>and</strong> col (l<strong>in</strong>e colour).<br />
You’ll notice that, by default, the uncorrected estimate raw <strong>and</strong> the hazard rate hazard were<br />
not plotted. The choice <strong>of</strong> estimates to be plotted, <strong>and</strong> the style <strong>in</strong> which they are plotted, are<br />
controlled by the second argument to plot.fv, which should be an R language formula <strong>in</strong>volv<strong>in</strong>g<br />
the identifier names r, theo, rs, km, hazard <strong>and</strong> raw. To plot the hazard rate aga<strong>in</strong>st r,<br />
> plot(Fest(cells), hazard ~ r, ma<strong>in</strong> = "Hazard rate <strong>of</strong> F")<br />
[Figure: Hazard rate of F; hazard plotted against r]<br />
To plot all the estimates <strong>of</strong> F(r), <strong>in</strong>clud<strong>in</strong>g the uncorrected estimate:<br />
> plot(Fest(cells), cb<strong>in</strong>d(km, rs, raw, theo) ~ r)<br />
[Figure: Fest(cells); km, rs, raw and theo plotted against r]<br />
Notice the use <strong>of</strong> cb<strong>in</strong>d to specify several different graphs on the same plot.<br />
To plot the estimates <strong>of</strong> F(r) aga<strong>in</strong>st the Poisson value, <strong>in</strong> the style <strong>of</strong> a P–P plot:<br />
> plot(Fest(cells), cb<strong>in</strong>d(km, rs, theo) ~ theo)<br />
[Figure: Fest(cells); km, rs and theo plotted against theo]<br />
(<strong>in</strong>clud<strong>in</strong>g theo on the left side here gives us the diagonal l<strong>in</strong>e).<br />
The symbol . st<strong>and</strong>s for ‘all recommended estimates <strong>of</strong> the function’. So an abbreviation<br />
for the last comm<strong>and</strong> is<br />
> plot(Fest(cells), . ~ theo)<br />
Transformations can be applied to these function values. For example, to subtract the<br />
theoretical Poisson value from the estimates,<br />
> plot(Fest(cells), . - theo ~ r)<br />
[Figure: Fest(cells); cbind(km, rs, theo) − theo plotted against r]<br />
To apply Fisher’s variance stabilising transformation $\varphi(\hat F(t)) = \sin^{-1}\bigl(\sqrt{\hat F(t)}\bigr)$,<br />
> plot(Fest(cells), as<strong>in</strong>(sqrt(.)) ~ r)<br />
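Why the arcsine square root stabilises variance is easiest to see for a plain binomial proportion. The base R simulation below is a heuristic illustration only: it treats the estimate as a proportion of m independent trials, which is an assumption that does not hold exactly for estimates of F (the distances at neighbouring grid locations are dependent). The values of m, the number of simulations and the probabilities p are hypothetical.

```r
set.seed(123)

m <- 400        # hypothetical number of independent trials per proportion
nsim <- 20000   # number of simulated proportions per value of p
p <- c(0.1, 0.3, 0.5, 0.8)

# Variance of the raw proportion depends strongly on p: roughly p (1 - p) / m
var_raw <- sapply(p, function(pp) var(rbinom(nsim, m, pp) / m))

# Variance after Fisher's transformation is roughly 1 / (4 m), whatever p is
var_fisher <- sapply(p, function(pp) var(asin(sqrt(rbinom(nsim, m, pp) / m))))

cbind(p = p, var_raw = var_raw, var_fisher = var_fisher, target = 1 / (4 * m))
```

On the transformed scale, deviations of the estimate from the Poisson curve can be judged against a roughly constant yardstick across the range of r.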
[Figure: Fest(cells); asin(sqrt(cbind(km, rs, theo))) plotted against r]<br />
16.3 Nearest neighbour distances<br />
For other types <strong>of</strong> distances we encounter similar problems. For the nearest neighbour distances<br />
$t_i = \min_{j \neq i} \|x_i - x_j\|$, again it is not easy to interpret a histogram of the observed distances.<br />
The empirical distribution <strong>of</strong> the nearest neighbour distances depends on the geometry <strong>of</strong> the<br />
w<strong>in</strong>dow W as well as on characteristics <strong>of</strong> the <strong>po<strong>in</strong>t</strong> process X. Conf<strong>in</strong><strong>in</strong>g observations to a<br />
w<strong>in</strong>dow W implies that the observed nearest-neighbour distances are larger, <strong>in</strong> general, than the<br />
‘true’ nearest neighbour distances <strong>of</strong> <strong>po<strong>in</strong>t</strong>s <strong>in</strong> the entire <strong>po<strong>in</strong>t</strong> process X. Corrections for this<br />
edge effect bias are required.<br />
16.3.1 G function<br />
Assum<strong>in</strong>g the <strong>po<strong>in</strong>t</strong> process X is stationary, we can def<strong>in</strong>e the cumulative distribution function<br />
<strong>of</strong> the nearest-neighbour distance for a typical <strong>po<strong>in</strong>t</strong> <strong>in</strong> the pattern,<br />
\[ G(r) = P\{ d(u, X \setminus \{u\}) \le r \mid u \in X \} \tag{17} \]<br />
where u is an arbitrary location, <strong>and</strong> d(u,X \ {u}) is the shortest distance from u to the <strong>po<strong>in</strong>t</strong><br />
pattern X exclud<strong>in</strong>g u itself. If the process is stationary then this def<strong>in</strong>ition does not depend on<br />
u.<br />
The empirical distribution function of the observed nearest-neighbour distances<br />
\[ G^{*}(r) = \frac{1}{n(x)} \sum_{i} 1\{ t_i \le r \} \tag{18} \]<br />
is a negatively biased estimator of $G(r)$, for reasons we explained above. Many edge corrections<br />
are available. Typically they are weighted versions of the ecdf,<br />
\[ \hat G(r) = \sum_{i} e(x_i, r)\, 1\{ t_i \le r \} \tag{19} \]<br />
where $e(x_i, r)$ is an edge correction weight designed so that $\hat G(r)$ is approximately unbiased. A<br />
counterpart of the Kaplan-Meier estimator is also available.<br />
For a homogeneous Poisson point process of intensity $\lambda$, the nearest-neighbour distance<br />
distribution function is known to be<br />
\[ G_{\mathrm{pois}}(r) = 1 - \exp(-\lambda \pi r^2). \tag{20} \]<br />
This is identical to the empty space function for the Poisson process. Intuitively, because <strong>po<strong>in</strong>t</strong>s<br />
<strong>of</strong> the Poisson process are <strong>in</strong>dependent <strong>of</strong> each other, the knowledge that u is a <strong>po<strong>in</strong>t</strong> <strong>of</strong> X does<br />
not affect any other <strong>po<strong>in</strong>t</strong>s <strong>of</strong> the process, hence G is equivalent to F.<br />
Interpretation <strong>of</strong> G(r) is the reverse <strong>of</strong> F(r). Values G(r) > Gpois(r) suggest that nearest<br />
neighbour distances <strong>in</strong> the <strong>po<strong>in</strong>t</strong> pattern are shorter than for a Poisson process, suggest<strong>in</strong>g a<br />
clustered pattern; while values G(r) < Gpois(r) suggest a regular (<strong>in</strong>hibited) pattern.<br />
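The direction of this comparison can be seen on an artificial regular pattern. The base R sketch below builds a hypothetical pattern by slightly jittering a 10 x 10 grid in the unit square, computes the uncorrected empirical distribution function of the nearest-neighbour distances (no edge correction is attempted), and compares it with the Poisson benchmark at the plug-in intensity.

```r
set.seed(9)

# Hypothetical regular pattern: a 10 x 10 grid with a small random jitter
centres <- expand.grid(x = (seq_len(10) - 0.5) / 10, y = (seq_len(10) - 0.5) / 10)
x <- centres$x + runif(100, -0.01, 0.01)
y <- centres$y + runif(100, -0.01, 0.01)

# Nearest-neighbour distances t_i
S <- as.matrix(dist(cbind(x, y)))
diag(S) <- Inf
t_nn <- apply(S, 1, min)

# Uncorrected empirical distribution function of the t_i
G_star <- function(r) mean(t_nn <= r)

# Poisson benchmark with the plug-in intensity n / area(W)
lambda_hat <- 100 / 1
G_pois <- function(r) 1 - exp(-lambda_hat * pi * r^2)

r <- c(0.02, 0.05, 0.07)
cbind(r = r, G_star = sapply(r, G_star), G_pois = G_pois(r))
```

Every nearest-neighbour distance in this pattern is at least 0.08, so the empirical function is zero at all three distances while the Poisson benchmark is not, which is the signature of regularity described above.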
The function Gest computes estimates <strong>of</strong> G(r) us<strong>in</strong>g several edge corrections, <strong>and</strong> the benchmark<br />
value for the Poisson process.<br />
> Gc <- Gest(cells)<br />
> Gc<br />
Function value object (class ’fv’)<br />
for the function r -> G(r)<br />
Entries:<br />
id      label      description<br />
--      -----      -----------<br />
r       r          distance argument r<br />
theo    Gpois(r)   theoretical Poisson G(r)<br />
rs      Gbord(r)   border corrected estimate of G(r)<br />
km      Gkm(r)     Kaplan-Meier estimate of G(r)<br />
hazard  lambda(r)  Kaplan-Meier estimate of hazard function lambda(r)<br />
raw     Graw(r)    uncorrected estimate of G(r)<br />
--------------------------------------<br />
Default plot formula:<br />
. ~ r<br />
Recommended range of argument r: [0, 0.15]<br />
> par(pty = "s")<br />
> plot(Gest(cells))<br />
[Figure: Gest(cells); estimates of G(r) plotted against r]<br />
The estimate <strong>of</strong> G(r) suggests strongly that the pattern is regular. Indeed, G(r) is zero for<br />
r ≤ 0.07 which <strong>in</strong>dicates that there are no nearest-neighbour distances shorter than 0.07.<br />
Common ways <strong>of</strong> plott<strong>in</strong>g G <strong>in</strong>clude:<br />
• G(r) and Gpois(r) plotted against r: plot(Gest(X))<br />
• G(r) − Gpois(r) plotted against r: plot(Gest(X), . - theo ~ r)<br />
• G(r) plotted against Gpois(r) in P–P style: plot(Gest(X), . ~ theo)<br />
and Fisher’s variance-stabilising transformation $\varphi(G(t)) = \sin^{-1}\bigl(\sqrt{G(t)}\bigr)$ applied to the P–P<br />
plot:<br />
> fisher <- function(x) { asin(sqrt(x)) }<br />
> plot(Gest(cells), fisher(.) ~ fisher(theo))<br />
[Figure: Gest(cells); fisher(cbind(km, rs, theo)) plotted against fisher(theo)]<br />
16.4 Pairwise distances and the K function<br />
The observed pairwise distances $s_{ij} = \|x_i - x_j\|$ in the data pattern x constitute a biased sample<br />
of pairwise distances in the point process, with a bias in favour of smaller distances. For example,<br />
we can never observe a pairwise distance greater than the diameter of the window.<br />
Ripley [40] defined the K-function for a stationary point process so that λK(r) is the expected<br />
number of other points of the process within a distance r of a typical point of the process.<br />
Formally<br />
$$K(r) = \frac{1}{\lambda}\, \mathbb{E}\left[\, n(X \cap b(u,r) \setminus \{u\}) \mid u \in X \,\right]. \tag{21}$$<br />
For a homogeneous Poisson process, intuitively, the knowledge that u is a point of X does<br />
not affect the other points of the process, so that X \ {u} is conditionally a Poisson process. The<br />
expected number of points falling in b(u,r) is $\lambda \pi r^2$. Thus for a homogeneous Poisson process<br />
$$K_{\text{pois}}(r) = \pi r^2$$<br />
regardless of the intensity.<br />
Numerous estimators of K have been proposed. Most of them are weighted and renormalised<br />
empirical distribution functions of the pairwise distances, of the general form<br />
$$\hat K(r) = \frac{1}{\hat\lambda^2\, \text{area}(W)} \sum_i \sum_{j \neq i} \mathbf{1}\{\|x_i - x_j\| \le r\}\; e(x_i, x_j; r) \tag{22–23}$$<br />
where e(u,v;r) is an edge correction weight. The choice of estimator does not seem to be very<br />
important, as long as some edge correction is applied.<br />
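As an illustration of this general form, the following base R sketch implements one particular choice of edge weight, the translation correction, for a rectangular window. The function name K_trans is hypothetical, the squared intensity is estimated by n(n − 1)/area², which is one common convention and an assumption here, and the code is not the spatstat implementation.<br />

```r
# Translation-corrected estimate of K on the rectangle [0, a] x [0, b].
# Sketch only: lambda^2 is estimated by n(n - 1) / area^2, one common choice.
K_trans <- function(x, y, r, a = 1, b = 1) {
  n    <- length(x)
  area <- a * b
  dx <- outer(x, x, "-")               # x_i - x_j for all ordered pairs
  dy <- outer(y, y, "-")
  d  <- sqrt(dx^2 + dy^2)              # pairwise distances
  # Edge weight e(x_i, x_j) = area(W) / area(W intersected with W shifted by x_i - x_j)
  e  <- area / ((a - abs(dx)) * (b - abs(dy)))
  off_diag <- row(d) != col(d)         # exclude the terms with i == j
  vapply(r, function(ri) {
    area / (n * (n - 1)) * sum(e[off_diag & d <= ri])
  }, numeric(1))
}

set.seed(42)
x <- runif(200)
y <- runif(200)                        # roughly CSR in the unit square
r <- c(0.05, 0.10, 0.15)
cbind(r, K_hat = K_trans(x, y, r), K_pois = pi * r^2)
```

For uniformly scattered points the two columns should be close, in line with the Poisson benchmark. In practice Kest computes this and other edge corrections for windows of general shape.<br />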
Again we usually compare the estimate $\hat K(r)$ with the Poisson K function. Values $\hat K(r) > \pi r^2$<br />
suggest clustering, while $\hat K(r) < \pi r^2$ suggests a regular pattern.<br />
In spatstat the function Kest computes several estimates of the K-function.<br />
> Gc <- Kest(cells)<br />
> Gc<br />
Function value object (class 'fv')<br />
for the function r -> K(r)<br />
Entries:<br />
id      label      description<br />
--      -----      -----------<br />
r       r          distance argument r<br />
theo    Kpois(r)   theoretical Poisson K(r)<br />
border  Kbord(r)   border-corrected estimate of K(r)<br />
trans   Ktrans(r)  translation-corrected estimate of K(r)<br />
iso     Kiso(r)    Ripley isotropic correction estimate of K(r)<br />
--------------------------------------<br />
Default plot formula:<br />
. ~ r<br />
Recommended range of argument r: [0, 0.25]<br />
> par(pty = "s")<br />
> plot(Kest(cells))<br />
[Figure: Kest(cells), estimates of K(r) plotted against r]<br />
In this case, the interpretation of all three summary statistics F, G and K is the same:<br />
emphatic evidence of a regular pattern. It is not always the case that these three summaries<br />
give equivalent messages.<br />
A commonly-used transformation of K is the L-function<br />
$$L(r) = \sqrt{\frac{K(r)}{\pi}}$$<br />
which transforms the Poisson K function to the straight line Lpois(r) = r, making visual assessment<br />
of the graph much easier. The square root transformation also approximately stabilises<br />
the variance of the estimator, making it easier to assess deviations.<br />
To compute the estimated L function, use Lest.<br />
> L <- Lest(cells)<br />
> plot(L, main = "L function")<br />
[Figure: "L function", estimates of L(r) plotted against r]<br />
Another related summary function is the pair correlation function<br />
$$g(r) = \frac{K'(r)}{2\pi r}$$<br />
where K′(r) is the derivative of K. The pair correlation is in some ways easier to interpret than<br />
either K or L, although it is more difficult to estimate. Roughly speaking, the pair correlation<br />
g(r) is the probability of observing a pair of points separated by a distance r, divided by the<br />
corresponding probability for a Poisson process. This is a non-centred correlation which may<br />
take any nonnegative value. The value g(r) = 1 corresponds to complete randomness; for the<br />
Poisson process the pair correlation is gpois(r) ≡ 1. For other processes, values g(r) > 1 suggest<br />
clustering or attraction at distance r, while values g(r) < 1 suggest inhibition or regularity.<br />
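The two transformations of K, the square root giving L and the scaled derivative giving g, can be checked numerically on the Poisson benchmark K(r) = πr². The sketch below uses base R only; the finite-difference derivative is an illustrative choice and is not how pcf estimates g from data.<br />

```r
r <- seq(0, 0.25, by = 0.01)
K_pois <- pi * r^2

# L-function: the square-root transform maps K_pois to the identity line
L_pois <- sqrt(K_pois / pi)

# Pair correlation by finite differences of K, evaluated at interval midpoints
r_mid  <- (head(r, -1) + tail(r, -1)) / 2
g_pois <- diff(K_pois) / diff(r) / (2 * pi * r_mid)

range(L_pois - r)   # essentially zero
range(g_pois)       # essentially one
```

Both results agree with the statements above: Lpois(r) = r and gpois(r) ≡ 1.<br />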
To compute the estimated pair correlation function, use pcf.<br />
> plot(pcf(cells))<br />
[Figure: pcf(cells), estimate of g(r) plotted against r]<br />
Here we have used the method pcf.ppp. This computes a standard kernel estimate which<br />
performs well except at very small values of r. So it is prudent not to read too much into the<br />
behaviour of the pcf close to r = 0.<br />
If you want to try another algebraic transformation of a summary function, the transformation<br />
can be computed using eval.fv. You can also plot algebraic transformations of a summary<br />
function using the 'plotting formula' argument to plot.fv. For example, if we have already<br />
computed the K function, we can plot the L function by<br />
> K <- Kest(cells)<br />
> plot(K, sqrt(./pi) ~ r)<br />
and compute the L function using eval.fv:<br />
> K <- Kest(cells)<br />
> L <- eval.fv(sqrt(K/pi))<br />
[...]<br />
For the J function, $J(r) = (1 - G(r))/(1 - F(r))$, values J(r) > 1 suggest regularity, and J(r) < 1 suggest clustering.<br />
An appealing property of the J function is that the superposition X• = X1 ∪ X2 of two<br />
independent point processes X1, X2 has J-function<br />
$$J(t) = \frac{\lambda_1}{\lambda_1 + \lambda_2}\, J_1(t) + \frac{\lambda_2}{\lambda_1 + \lambda_2}\, J_2(t)$$<br />
where J1, J2 are the J-functions of X1, X2 respectively and λ1, λ2 are their intensities.<br />
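A small worked example of the superposition formula, in base R. The function name J_superpose and all numerical values are hypothetical, chosen only to show the intensity-weighted averaging.<br />

```r
# J-function of the superposition of two independent processes:
# an intensity-weighted average of the component J-functions.
J_superpose <- function(J1, J2, lambda1, lambda2) {
  (lambda1 * J1 + lambda2 * J2) / (lambda1 + lambda2)
}

# Made-up values at one distance t: a regular component (J1 > 1)
# superposed on a Poisson component (J2 = 1), with intensities 30 and 70.
J_superpose(J1 = 1.5, J2 = 1, lambda1 = 30, lambda2 = 70)

# Superposing two Poisson processes (J1 = J2 = 1) gives J = 1 again
J_superpose(J1 = 1, J2 = 1, lambda1 = 30, lambda2 = 70)
```

The first call shows how adding a Poisson component pulls the J-function of a regular pattern towards 1 in proportion to the relative intensities.<br />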
The J function is computed by Jest.<br />
The convenient function allstats efficiently computes the F, G, J and K functions for a<br />
dataset. They can be plotted automatically.<br />
> plot(allstats(cells))<br />
[Figure: allstats(cells), four panels showing the F function, G function, J function and K function, each plotted against r]<br />
16.6 Manipulating and plotting summary functions<br />
As explained above, the summary function commands Fest, Gest, Kest, Lest, pcf etc. return<br />
a function value table (an object of class "fv"). This is a data frame (i.e. it also belongs to<br />
the class "data.frame") with some extra information. One column of the data frame contains<br />
values of the distance argument r, while the other columns contain different estimates of the<br />
value of the function, or the theoretical value of the function under CSR.<br />
The following operations are defined on this class:<br />
print.fv       print a summary description<br />
plot.fv        plot the function estimates<br />
as.data.frame  strip extra information (returns a data frame)<br />
$              extract one column (returns a numeric vector)<br />
[.fv           extract subset (returns an "fv" object)<br />
with.fv        perform calculations with specific columns of data frame<br />
eval.fv        perform calculation on all columns of data frame<br />
stieltjes      compute Stieltjes integral with respect to function<br />
To make life easier, there are several options for manipulating the function values.<br />
To manipulate or combine one or more columns of the data frame, it is typically easiest to<br />
use the command with.fv, which is a method for the generic command with. For example:<br />
> data(redwood)<br />
> K <- Kest(redwood)<br />
> y <- with(K, iso - theo)<br />
> x <- with(K, r)<br />
These commands extract values from the K function estimated for the redwood<br />
seedlings data. For this to work, we have to know that K contains columns named r, iso and<br />
theo.<br />
The general syntax is with(X, expr) where X is an "fv" object and expr can be any<br />
expression involving the names of columns of X. The expression can include functions, so long<br />
as they are capable of operating on numeric vectors. The expression can also involve the symbol<br />
. representing "all recommended estimates of the function". Thus:<br />
> L <- with(K, sqrt(./pi))<br />
> with(Kest(redwood), max(abs(iso - theo)))<br />
[1] 0.04945199<br />
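Because an "fv" object is also a data frame, the column-name idiom can be tried out on an ordinary data frame. The sketch below uses base R only; the data frame K is a stand-in, not a real "fv" object, and the iso values are made up purely for illustration. The special symbol . is a feature of with.fv and is not available here.<br />

```r
# Stand-in for an "fv" object: a plain data frame with columns r, iso, theo.
# The 'iso' values are invented for illustration only.
K <- data.frame(r = seq(0, 0.25, by = 0.05))
K$theo <- pi * K$r^2
K$iso  <- K$theo + c(0, 0.001, 0.004, 0.009, 0.012, 0.010)

# Same idiom as with.fv: the expression is evaluated using the column names
with(K, max(abs(iso - theo)))       # largest deviation from the Poisson value
with(K, r[which.max(iso - theo)])   # distance at which it occurs
```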
To plot a transformed function, you can also use the plot method. Its second argument is<br />
a formula in the R language. The left side of the formula represents what curve or curves will<br />
be plotted on the y axis, and the right side determines the x variable for the plot. Thus:<br />
> plot(K, sqrt(./pi) ~ r)<br />
plots the estimates of $L(r) = \sqrt{K(r)/\pi}$, by all the available edge correction methods, against<br />
r. The symbol . again signifies "all recommended estimates of the function". The left hand side<br />
of the formula may use the command cbind to indicate that several different curves should be<br />
plotted. For example, to plot only two curves, giving the isotropic correction estimate and the<br />
theoretical value of K(r):<br />
> plot(K, cbind(iso, theo) ~ r)<br />
The right-hand side can be any expression that evaluates to a numeric vector, and the left<br />
hand side is any expression that evaluates to a vector or matrix, of compatible dimensions.<br />
To manipulate or combine one or more "fv" objects, use eval.fv. Its argument is an<br />
expression containing the names of "fv" objects. For example<br />
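> K <- Kest(redwood)<br />
> L <- eval.fv(sqrt(K/pi))<br />
> K1 <- Kest(redwood)<br />
> K2 <- Kest(cells)<br />
> DK <- eval.fv(K1 - K2)<br />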
16.7 Caveats<br />
The use of summary functions for analysing point patterns has become established across wide<br />
areas of applied science, following Ripley's influential paper [40] and many subsequent textbooks<br />
[19, 21, 23, 47, 42, 43, 46] until quite recently.<br />
There is a tendency to apply them uncritically and exclusively. It's important to remember<br />
that<br />
1. the functions F, G and K are defined and estimated under the assumption that the point<br />
process is stationary (homogeneous).<br />
2. these summary functions do not completely characterise the process.<br />
3. if the process is not stationary, deviations between the empirical and theoretical functions<br />
(e.g. $\hat K$ and Kpois) are not necessarily evidence of interpoint interaction, since they may<br />
also be attributable to variations in intensity.<br />
For an example of caveat 2, here is a point process constructed by Baddeley and Silverman<br />
[10] which has the same K function as the homogeneous Poisson process:<br />
> par(mfrow = c(1, 2))<br />
> X <- rcell(nx = 15)<br />
> plot(X)<br />
> plot(Kest(X))<br />
[Figure: left panel, the simulated pattern X; right panel, Kest(X)]<br />
For an example of caveat 3, we generate an inhomogeneous Poisson pattern and apply the<br />
ordinary K function estimator. The result appears to show clustering, but this is an artefact of<br />
the spatial inhomogeneity.<br />
> par(mfrow = c(1, 2))<br />
> X <- rpoispp(function(x, y) { 300 * exp(-3 * x) })<br />
> plot(X)<br />
> plot(Kest(X))<br />
[Figure: left panel, the inhomogeneous Poisson pattern X; right panel, Kest(X)]<br />
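Caveat 3 can also be reproduced without spatstat. The base R sketch below simulates an inhomogeneous Poisson pattern on the unit square by thinning and applies a stationary K estimate with translation edge correction. The intensity 2000 exp(−3x), the helper name K_hat and the estimator of λ² are all assumptions made for this illustration; the intensity is deliberately large so that the comparison is stable.<br />

```r
set.seed(1)
# Inhomogeneous Poisson pattern on the unit square, generated by thinning.
# Assumed example intensity: lambda(x, y) = 2000 * exp(-3 * x)
lambda_max <- 2000
n0 <- rpois(1, lambda_max)
x0 <- runif(n0)
y0 <- runif(n0)
keep <- runif(n0) < exp(-3 * x0)   # retain each point with probability lambda / lambda_max
x <- x0[keep]
y <- y0[keep]

# Stationary K estimate at a single distance r (translation correction, unit square)
K_hat <- function(x, y, r) {
  n  <- length(x)
  dx <- outer(x, x, "-")
  dy <- outer(y, y, "-")
  d  <- sqrt(dx^2 + dy^2)
  w  <- 1 / ((1 - abs(dx)) * (1 - abs(dy)))   # translation edge weights
  off_diag <- row(d) != col(d)                # exclude i == j
  sum(w[off_diag & d <= r]) / (n * (n - 1))
}

r <- 0.1
c(K_hat = K_hat(x, y, r), K_pois = pi * r^2)
```

The estimate exceeds πr² even though the points are independent, so the apparent clustering reflects the varying intensity rather than any interaction between points.<br />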
17 Methods 6: simulation envelopes and goodness-of-fit tests<br />
Although summary statistics such as the K-function are intended primarily for exploratory<br />
purposes, it is also possible to use them as a basis for statistical inference.<br />
17.1 Envelopes and Monte Carlo tests<br />
17.1.1 Motivation<br />
In Section 16 we examined plots of the K-function to judge whether a point pattern dataset is<br />
completely random. The K-function estimated from the point pattern data, $\hat K(r)$, was compared<br />
graphically with the theoretical K-function for a completely random pattern, $K_{\text{pois}}(r) = \pi r^2$. In<br />
the toy examples, large discrepancies between $\hat K$ and Kpois were observed, indicating that the<br />
toy examples were not completely random patterns.<br />
However, because of random variability, we will never obtain perfect agreement between $\hat K$<br />
and Kpois, even with a completely random pattern. Try typing plot(Kest(rpoispp(50))) a<br />
few times to get an idea of the inherent variability.<br />
The following plot shows the K-function estimated from the cells dataset (thick line), and<br />
also the K-functions of 20 simulated realisations of CSR with the same intensity (thin lines).<br />
[Figure: K(r) against r for the cells data (thick line) and 20 simulated realisations of CSR (thin lines)]<br />
The next plot shows the upper and lower envelopes of the simulated K-functions, that is, the<br />
maximum and minimum values of $\hat K(r)$ for each value of r. The region between the envelopes<br />
is shaded.<br />
[Figure: K(r) against r for the cells data, with the region between the upper and lower envelopes shaded]<br />
Clearly, the K-function estimated from the cells data lies outside the typical range of values<br />
of the K-function for a completely random pattern.<br />
To conclude formally that there is a 'significant' difference between $\hat K$ and Kpois, we use<br />
the language of hypothesis testing. Our null hypothesis is that the data point pattern is a<br />
realisation of complete spatial randomness. The alternative hypothesis is that the data pattern<br />
is a realisation of another, unspecified point process.<br />
17.1.2 Monte Carlo tests<br />
A Monte Carlo test is a test based on simulations from the null hypothesis. The principle was<br />
originated independently by Barnard [12] and Dwass [27]. It was applied in spatial statistics<br />
by Ripley [40, 42] and Besag [15, 16]. See also [28]. Monte Carlo tests are a special case of<br />
randomisation tests which are commonly used in nonparametric statistics.<br />
Suppose the reference curve is the theoretical K function for CSR. Generate M independent<br />
simulations of CSR inside the study region W. Compute the estimated K functions for each of<br />
these realisations, say $\hat K^{(j)}(r)$ for j = 1, ..., M. Obtain the pointwise upper and lower envelopes<br />
of these simulated curves,<br />
$$L(r) = \min_j \hat K^{(j)}(r), \qquad U(r) = \max_j \hat K^{(j)}(r).$$<br />
For any fixed value of r, consider the probability that $\hat K(r)$ lies outside the envelope [L(r), U(r)]<br />
for the simulated curves. If the data came from a uniform Poisson process, then $\hat K(r)$ and<br />
$\hat K^{(1)}(r), \ldots, \hat K^{(M)}(r)$ are statistically equivalent and independent, so this probability is equal<br />
to 2/(M + 1) by symmetry. That is, the test which rejects the null hypothesis of a uniform<br />
Poisson process when $\hat K(r)$ lies outside [L(r), U(r)], has exact significance level α = 2/(M + 1).<br />
Instead of the pointwise maximum and minimum, one could use the pointwise order statistics<br />
(the pointwise kth largest and kth smallest values) giving a test of exact size α = 2k/(M + 1).<br />
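The envelope construction and its significance level can be sketched in base R. The function name pointwise_envelope is hypothetical, and the "simulated" and "observed" curves are made-up stand-ins (randomly or deterministically rescaled copies of πr²) rather than K estimates of real simulated patterns.<br />

```r
# Pointwise Monte Carlo envelopes from a matrix of simulated curves.
# 'sims' has one row per distance r and one column per simulation.
pointwise_envelope <- function(sims, k = 1) {
  M <- ncol(sims)
  sorted <- t(apply(sims, 1, sort))   # each row sorted in increasing order
  list(lo = sorted[, k],              # k-th smallest value at each r
       hi = sorted[, M - k + 1],      # k-th largest value at each r
       alpha = 2 * k / (M + 1))       # exact pointwise significance level
}

set.seed(1)
r <- seq(0, 0.25, by = 0.05)
# Hypothetical stand-ins for simulated K estimates: pi r^2 with random scaling
sims <- sapply(1:39, function(j) pi * r^2 * (1 + rnorm(1, sd = 0.1)))
env <- pointwise_envelope(sims)
env$alpha                             # 2 / 40

K_obs <- 0.5 * pi * r^2               # made-up "observed" curve for a regular pattern
outside <- K_obs < env$lo | K_obs > env$hi
cbind(r, lo = env$lo, obs = K_obs, hi = env$hi, outside)
```

With M = 39 and k = 1 the pointwise test has size 2/40 = 0.05; taking k = 2 with M = 79 would give the same size using the second smallest and second largest values.<br />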
17.1.3 Envelopes in spatstat<br />
In spatstat the function envelope computes the pointwise envelopes.<br />
> data(cells)<br />
> E <- envelope(cells, Kest, nsim = 39)<br />
> E<br />
Pointwise critical envelopes for K(r)<br />
Obtained from 39 simulations of CSR<br />
Significance level of pointwise Monte Carlo test: 2/40 = 0.05<br />
Data: cells<br />
Function value object (class 'fv')<br />
for the function r -> K(r)<br />
Entries:<br />
id    label    description<br />
--    -----    -----------<br />
r     r        distance argument r<br />
obs   obs(r)   function value for data pattern<br />
theo  theo(r)  theoretical value for CSR<br />
lo    lo(r)    lower pointwise envelope of simulations<br />
hi    hi(r)    upper pointwise envelope of simulations<br />
--------------------------------------<br />
Default plot formula:<br />
. ~ r<br />
Recommended range of argument r: [0, 0.25]<br />
> plot(E, main = "pointwise envelopes")<br />
[Figure: "pointwise envelopes", K(r) plotted against r]<br />
For example if r had been fixed at r = 0.10 we would have rejected the null hypothesis of<br />
CSR at the 5% level. The value M = 39 is the smallest to yield a two-sided test with significance<br />
level 5%.<br />
Tip: A common and dangerous mistake is to misinterpret the simulation envelopes<br />
as "confidence intervals" around $\hat K$. They cannot be interpreted as a measure of<br />
accuracy of the estimated K function! They are the critical values for a test of the<br />
hypothesis that $K(r) = \pi r^2$.<br />
The value returned by envelope is an object of class "fv" that can be manipulated in<br />
the usual way: you can plot it, transform it, extract columns, and so on (see Section 16.6).<br />
17.1.4 Simultaneous Monte Carlo test<br />
Note that the theory of the Monte Carlo test, as presented above, requires that r be fixed in<br />
advance. If we plot the envelope and check whether the empirical K function ever wanders<br />
outside the envelope, this is equivalent to choosing the value of r in a data-dependent way, and<br />
the true significance level is higher (less 'significant').<br />
To avoid this problem we can construct simultaneous critical bands which have the property<br />
that, under H0, the probability that $\hat K$ ever wanders outside the critical bands is exactly 5%.<br />
One simple way to achieve this is to compute, for each estimate $\hat K(r)$, its maximum deviation<br />
from the Poisson K function, $D = \max_r |\hat K(r) - K_{\text{pois}}(r)|$. This is computed for each of the M<br />
simulated datasets, and the maximum value Dmax obtained. Then the upper and lower limits<br />
are<br />
$$L(r) = \pi r^2 - D_{\max}, \qquad U(r) = \pi r^2 + D_{\max}.$$<br />
The estimated K function for the data transgresses these limits if and only if the D-value for<br />
the data exceeds Dmax. Under H0 this occurs with probability 1/(M + 1). Thus, a test of size<br />
5% is obtained by taking M = 19.<br />
> E <- envelope(cells, Kest, nsim = 19, global = TRUE)<br />
> plot(E, main = "global envelopes")<br />
[Figure: "global envelopes", K(r) plotted against r]<br />
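The simultaneous band based on the maximum deviation can be sketched in base R as follows. The function name global_envelope is hypothetical, and the simulated and observed curves are made-up, deterministic stand-ins chosen so that the numbers are easy to follow; they are not K estimates of real patterns.<br />

```r
# Simultaneous (global) envelope based on the maximum absolute deviation.
# 'sims' has one row per distance r and one column per simulation;
# 'theo' is the theoretical curve evaluated at the same distances.
global_envelope <- function(sims, theo) {
  M <- ncol(sims)
  D <- apply(abs(sims - theo), 2, max)   # maximum deviation D for each simulation
  Dmax <- max(D)
  list(lo = theo - Dmax, hi = theo + Dmax, Dmax = Dmax, alpha = 1 / (M + 1))
}

r <- seq(0, 0.25, by = 0.05)
theo <- pi * r^2
# Hypothetical simulated curves: the theoretical curve shifted up by 0.001 * j
sims <- sapply(1:19, function(j) theo + 0.001 * j)
env <- global_envelope(sims, theo)

K_obs <- 0.5 * theo                      # made-up observed curve
reject <- max(abs(K_obs - theo)) > env$Dmax
c(Dmax = env$Dmax, alpha = env$alpha, reject = reject)
```

The band has constant half-width Dmax at every r, which is why the global envelopes in the plot are much wider at small r than the pointwise ones.<br />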
A more powerful test is obtained if we (approximately) stabilise the variance, by using the<br />
L function in place of K.<br />
> E <- envelope(cells, Lest, nsim = 19, global = TRUE)<br />
> plot(E, main = "global envelopes of L(r)")<br />
[Figure: "global envelopes of L(r)", L(r) plotted against r]<br />
17.1.5 Envelopes for any fitted model<br />
In the explanation above, we assumed that the null hypothesis was CSR (complete spatial<br />
randomness, a uniform Poisson process). In fact the Monte Carlo testing rationale can be<br />
applied to any point process model serving as a null hypothesis. We simply have to generate<br />
simulated realisations from the null hypothesis, and compute the summary function for each<br />
simulated realisation.<br />
To simulate from a fitted point process model (object of class "ppm"), call the envelope<br />
function, giving the fitted model as the first argument of envelope. Then the simulated patterns<br />
will be generated according to this fitted model. The original data point pattern, to which the<br />
model was fitted, is stored in the fitted model object; the original data are extracted and the<br />
summary function for the data is also computed.<br />
The following code fits an inhomogeneous Poisson process to the Beilschmiedia pattern, then<br />
generates simulation envelopes of the L function by simulating from the fitted inhomogeneous<br />
Poisson model.<br />
> data(bei)<br />
> fit <- ppm(bei, ~elev + grad, covariates = bei.extra)<br />
> E <- envelope(fit, Lest, nsim = 19, global = TRUE)<br />
> plot(E, main = "envelope for inhomogeneous Poisson")<br />
[Figure: "envelope for inhomogeneous Poisson", L(r) plotted against r (metres)]<br />
17.1.6 Envelopes based on any simulation procedure<br />
Envelopes can also be computed using any user-specified procedure to generate the simulated<br />
realisations. This allows us to perform randomisation tests, for example.<br />
The simulation procedure should be encoded as an R expression, which will be evaluated<br />
each time a simulation is required. For example if we type<br />
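> sim <- expression(rpoispp(100))<br />
[...]<br />
> data(cells)<br />
> e <- expression(runifpoint(cells$n, cells$window))<br />
> E <- envelope(cells, Lest, nsim = 19, global = TRUE, simulate = e)<br />
> plot(E, main = "envelope with fixed n")<br />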
[Figure: "envelope with fixed n", L(r) plotted against r]<br />
17.1.7 Envelopes based on a set of point patterns<br />
Envelopes can also be computed from a user-supplied list of point patterns, instead of the<br />
simulated point patterns generated by a chosen simulation procedure.<br />
This improves efficiency and consistency if, for example, we are going to calculate the envelopes<br />
of several different summary statistics.<br />
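> data(cells)<br />
> SimPatList <- list()<br />
> for (i in 1:1000) SimPatList[[i]] <- runifpoint(cells$n, cells$window)<br />
> EK <- envelope(cells, Kest, nsim = 1000, simulate = SimPatList)<br />
> Ep <- envelope(cells, pcf, nsim = 1000, simulate = SimPatList)<br />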
18 Methods 7: model-fitting using summary statistics<br />
Summary statistics can also be used to fit models to data.<br />
In the 'method of moments' we estimate a parameter θ by solving<br />
$$\mathbb{E}_\theta[S(X)] = S(x)$$<br />
where S(x) is the observed value of a statistic S for our data x, and the left side is the theoretical<br />
mean of S for the model governed by parameter θ.<br />
The analogue for point process models is to fit the model by matching a summary statistic<br />
such as the K function to its theoretical value under the model.<br />
18.0.8 Cluster processes<br />
In a precious few cases, the K function of a point process is known exactly, as an analytic<br />
expression in terms of the model parameters. These happy cases include many Neyman-Scott<br />
cluster processes. For example, the K-function of the Thomas process with parameters θ =<br />
(κ, µ, σ) is<br />
$$K_\theta(r) = \pi r^2 + \frac{1}{\kappa}\left(1 - \exp\left(-\frac{r^2}{4\sigma^2}\right)\right). \tag{26}$$<br />
We can use this to fit a Thomas model to data. We determine the values of the parameters<br />
θ = (κ, µ, σ) to achieve the best match between Kθ(r) and the estimated K-function of the data,<br />
$\hat K(r)$. The best match is determined by minimising the discrepancy between the two functions<br />
over some range [a,b]:<br />
$$D(\theta) = \int_a^b \left| \hat K(r)^q - K_\theta(r)^q \right|^p \, dr \tag{27}$$<br />
where 0 ≤ a < b, and where p, q > 0 are indices. This method was originally advocated by Peter<br />
Diggle and collaborators, and is now known as the method of minimum contrast. See [23].<br />
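To make the minimum contrast idea concrete, here is a base R sketch that fits κ and σ of the Thomas K-function (26) by minimising a discretised version of the criterion (27). Several things are assumptions for illustration: the names K_thomas and contrast are hypothetical, the "observed" curve is a noise-free theoretical curve rather than an estimate from data, the integral is replaced by a Riemann sum, and optim with its default Nelder-Mead method stands in for whatever optimiser kppm uses. Note that (26) does not involve µ, so only κ and σ are determined by matching K.<br />

```r
# Theoretical K-function of the Thomas process, equation (26)
K_thomas <- function(r, kappa, sigma) {
  pi * r^2 + (1 - exp(-r^2 / (4 * sigma^2))) / kappa
}

# Contrast criterion (27), approximated by a Riemann sum on a grid of r values
contrast <- function(theta, r, K_obs, q = 1/4, p = 2) {
  kappa <- exp(theta[1])              # parameters are optimised on the log scale,
  sigma <- exp(theta[2])              # which keeps both of them positive
  dr <- r[2] - r[1]
  sum(abs(K_obs^q - K_thomas(r, kappa, sigma)^q)^p) * dr
}

r <- seq(0, 0.25, length.out = 101)
# Noise-free stand-in for an estimated K-function, with known parameters
K_obs <- K_thomas(r, kappa = 20, sigma = 0.05)

fit <- optim(log(c(10, 0.1)), contrast, r = r, K_obs = K_obs,
             control = list(maxit = 2000, reltol = 1e-10))
exp(fit$par)    # fitted (kappa, sigma); should be close to (20, 0.05)
```

The exponents q = 1/4 and p = 2 are the ones reported in the printed minimum contrast fits in these notes. With a real estimate of K in place of K_obs the minimum would not be zero, and the result would depend on the chosen range [a, b].<br />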
The command kppm fits cluster point process models by the method of minimum contrast.<br />
To fit the Thomas model to the redwood data:<br />
> data(redwood)<br />
> fit <- kppm(redwood, ~1, "Thomas")<br />
> fit<br />
Stationary cluster point process model<br />
Fitted to point pattern dataset 'redwood'<br />
Cluster model: Thomas process<br />
Fitted parameters:<br />
kappa       sigma      mu<br />
23.55389789 0.04704965 2.63226071<br />
> plot(fit)<br />
[Figure: "K-function" of the fitted model fit, K(r) plotted against r]<br />
The plot shows the theoretical K function of the fitted Thomas process (fit), three nonparametric<br />
estimates of the K function (iso, trans, border) and the Poisson K function<br />
(theo).<br />
At present, the only cluster process models that can be fitted using kppm are the Thomas<br />
process and the Matérn cluster process. To fit the Matérn cluster process to the redwood data,<br />
> fitM <- kppm(redwood, ~1, "MatClust")<br />
[...]<br />
> plot(simulate(fit, nsim = 4))<br />
The command simulate is generic; here we have used the method simulate.kppm.<br />
18.1 Other models with known K function<br />
Apart from cluster processes, there are certain other point process models for which the K-function<br />
is known as a function of the model parameters. Minimum contrast methods are also<br />
available for these models.<br />
One special case is the log-Gaussian Cox processes described in detail in [37]. To fit a<br />
log-Gaussian Cox process with exponential covariance function to the redwood data:<br />
> fit <- lgcp.estK(redwood, c(sigma2 = 0.1, alpha = 1))
> fit
Minimum contrast fit (object of class "minconfit")
Model: log-Gaussian Cox process
Fitted by matching theoretical K function to Kest(K)
Parameters fitted by minimum contrast ($par):
   sigma2     alpha
1.0485493 0.0997963
Derived parameters of log-Gaussian Cox process ($modelpar):
   sigma2     alpha        mu
1.0485493 0.0997963 3.6028597
Converged successfully after 145 iterations.
Domain of integration: [ 0 , 0.25 ]
Exponents: p= 2, q= 0.25
The second argument to lgcp.estK gives initial values for the model parameters $\sigma^2$ and $\alpha$.
The result of lgcp.estK is an object of class minconfit (representing a ‘minimum contrast
fit’). There are methods for printing and plotting the fit. Simulation of these models has not
yet been implemented in spatstat.
18.2 Generic algorithm for minimum contrast
The command mincontrast is a generic fitting algorithm for the method of minimum contrast.
It can be used in any context where the theoretical function can be computed exactly from the
model parameters. A basic call to mincontrast is:
> mincontrast(observed, theoretical, startpar)
where observed is an object of class "fv" containing the summary function calculated from
the data; theoretical is a function which returns the theoretical value of the summary function
for a given parameter value; and startpar is a vector of initial values of the model parameters.
For details, see the help file for mincontrast.
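To make the idea concrete, here is a small base-R sketch of a minimum contrast fit that does not call spatstat. It assumes the contrast criterion $D(\theta) = \int_0^R |K_{\rm obs}(r)^q - K_\theta(r)^q|^p \, dr$ with $p = 2$ and $q = 1/4$, approximates the integral by a Riemann sum, and minimises it with optim. The ‘observed’ curve is synthetic (the Thomas K function at known parameter values, standing in for an empirical estimate), and all helper names are ours.

```r
# Theoretical K function of the Thomas process (illustrative helper).
thomas_K <- function(r, par) {
  kappa <- par[[1]]; sigma <- par[[2]]
  pi * r^2 + (1 - exp(-r^2 / (4 * sigma^2))) / kappa
}

# Contrast between an observed curve and the theoretical curve,
# approximated by a Riemann sum over the equally spaced grid r.
contrast <- function(logpar, r, K_obs, p = 2, q = 0.25) {
  K_theo <- thomas_K(r, exp(logpar))   # optimise on the log scale so that par > 0
  sum(abs(K_obs^q - K_theo^q)^p) * (r[2] - r[1])
}

r        <- seq(0, 0.25, length.out = 257)
true_par <- c(kappa = 25, sigma = 0.05)
K_obs    <- thomas_K(r, true_par)      # synthetic stand-in for an empirical K estimate

start <- c(kappa = 10, sigma = 0.1)    # plays the role of startpar
opt <- optim(log(start), contrast, r = r, K_obs = K_obs,
             control = list(maxit = 2000))
est <- exp(opt$par)
print(est)
```

Because the synthetic curve is noise-free, the minimiser should land close to the true parameter values; with a real empirical K function the contrast would not reach zero.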
18.2.1 Monte Carlo
For the vast majority of point process models, the true K function $K_\theta(r)$ is not known analytically
in terms of the parameter $\theta$. In principle we could use Monte Carlo simulation to determine
an approximation to $K_\theta(r)$, for any given $\theta$, by generating a large number of simulated realisations
of the process with parameter $\theta$, computing the estimated K function for each realisation,
and taking the pointwise sample average. It is possible to do this in spatstat using the generic
algorithm mincontrast. Details are not given here as it is rather fiddly at present, and will
change soon.
19 Methods 8: adjusting for inhomogeneity
If a point pattern is known or suspected to be spatially inhomogeneous, then our statistical
analysis of the pattern should take account of this inhomogeneity.
19.1 Inhomogeneous K function
There is a modification of the K function that applies to inhomogeneous processes [2]. If $\lambda(u)$
is the true intensity function of the point process $X$, then the idea is that each point $x_i$ will be
weighted by $w_i = 1/\lambda(x_i)$.
The inhomogeneous K function is defined as
$$K_{\rm inhom}(r) = E\left[ \left. \frac{1}{\lambda(u)} \sum_{x_j \in X} \frac{1}{\lambda(x_j)} \, \mathbf{1}\{0 < \|u - x_j\| \le r\} \;\right|\; u \in X \right] \tag{28}$$
assuming that this does not depend on the location $u$. Thus, $\lambda(u) K_{\rm inhom}(r)$ is the expected total ‘weight’
of all random points within a distance $r$ of the point $u$, where the ‘weight’ of a point $x_i$ is $1/\lambda(x_i)$.
If the process is actually homogeneous, then $\lambda(u)$ is constant and $K_{\rm inhom}(r)$ reduces to the
usual K function (21).
It turns out that, for an inhomogeneous Poisson process with intensity function $\lambda(u)$, the
inhomogeneous K function is
$$K_{\rm inhom,\,pois}(r) = \pi r^2 \tag{29}$$
exactly as for the homogeneous case.
The standard estimators of K can be extended to the inhomogeneous K function:
$$\widehat{K}_{\rm inhom}(r) = \frac{1}{\mathrm{area}(W)} \sum_i \sum_{j \ne i} \frac{\mathbf{1}\{\|x_i - x_j\| \le r\}}{\widehat\lambda(x_i)\,\widehat\lambda(x_j)} \, e(x_i, x_j; r) \tag{30}$$
where $e(u, v; r)$ is an edge correction weight as before, and $\widehat\lambda(u)$ is an estimate of the intensity
function $\lambda(u)$.
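A stripped-down version of the estimator (30) can be written in a few lines of base R. This sketch ignores edge correction (it takes $e \equiv 1$), so it is biased downwards for larger $r$, and the function name kinhom_naive is ours; in practice one would call Kinhom.

```r
# Naive inhomogeneous K estimate: estimator (30) with edge weight e = 1.
# x, y   : coordinates of the data points
# lambda : intensity values at the data points (true or estimated)
# r      : vector of distances at which to evaluate the estimate
# area   : area of the window W
kinhom_naive <- function(x, y, lambda, r, area) {
  d <- as.matrix(dist(cbind(x, y)))   # pairwise distances
  diag(d) <- Inf                      # exclude the terms with i = j
  w <- 1 / outer(lambda, lambda)      # weights 1 / (lambda_i * lambda_j)
  sapply(r, function(rr) sum(w[d <= rr]) / area)
}

# Three points in the unit square with constant intensity 2:
# only the first two points lie within distance 0.2 of each other.
x <- c(0.1, 0.2, 0.8)
y <- c(0.1, 0.1, 0.8)
print(kinhom_naive(x, y, lambda = rep(2, 3), r = c(0.05, 0.2), area = 1))
# expected: 0 and 0.5 (two ordered pairs, each with weight 1/4)
```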
There remains the question of how to estimate the intensity function $\lambda(u)$. It is usually
advisable to obtain the intensity estimate $\widehat\lambda(u)$ by fitting a parametric model, to avoid overfitting.
Here is an example for the tropical rainforest data, using the covariate data to suggest a model
for the intensity.
> data(bei)
> fit <- ppm(bei, ~elev + grad, covariates = bei.extra)
> lam <- predict(fit, locations = bei)
> Ki <- Kinhom(bei, lam)
> plot(Ki, main = "Inhomogeneous K function")
[Figure: plot of Ki, titled "Inhomogeneous K function".]
The plot suggests that, even after accounting for dependence on altitude and slope, the trees
still appear to be clustered.
The intensity function $\lambda(u)$ could also be estimated by kernel smoothing the point pattern
data. However, notice that the estimator (30) of the inhomogeneous K function depends on
the estimated intensity values at the data points, $\widehat\lambda(x_i)$. If these are computed from the full
pattern, they are positively biased estimates of the true values $\lambda(x_i)$, because each data point
contributes to the kernel estimate at its own location. In order to avoid this bias, the value $\widehat\lambda(x_i)$
should be estimated by kernel smoothing of the point pattern with the point $x_i$ deleted. This
“leave-one-out” estimator is implemented in Kinhom and is invoked when the argument lambda is not given:
> Ki2 <- Kinhom(bei)
> plot(Ki2, main = "Kinhom using leave-one-out")
           lty col
bord.modif   1   1
border       2   2
theo         3   3
(The smoothing parameter $\sigma$ can also be controlled.)
The inhomogeneous analogue of the L function is defined by
$$L_{\rm inhom}(r) = \sqrt{\frac{K_{\rm inhom}(r)}{\pi}}.$$
This can be computed using Linhom. For an inhomogeneous Poisson process, $L_{\rm inhom}(r) \equiv r$.
The inhomogeneous analogue of the pair correlation function can be defined, similarly to the
homogeneous case, as
$$g_{\rm inhom}(r) = \frac{K'_{\rm inhom}(r)}{2 \pi r}.$$
It has the same interpretation, namely, that $g_{\rm inhom}(r)$ is the probability of observing a pair of
points at certain locations separated by a distance $r$, divided by the corresponding probability
for a Poisson process of the same (inhomogeneous) intensity.
The inhomogeneous pair correlation function is currently computed by calling Kinhom followed
by pcf.fv (which does numerical differentiation):
> g <- pcf(Kinhom(bei))
19.2 Inhomogeneous cluster models
The inhomogeneous Poisson process was described in Section 13.1. We can also introduce spatial
inhomogeneity into any of the non-Poisson models described in Section 15.
In the case of Poisson cluster processes (Section 15.1) we can introduce inhomogeneity in
either the parent process or the offspring processes.
To make the parents inhomogeneous, we simply generate the parent points from an inhomogeneous
Poisson process with some intensity function $\kappa(u)$.
To make the clusters inhomogeneous, we use a clever construction by Waagepetersen [49].
For a parent point at location $(x_0, y_0)$, the offspring are generated from a Poisson process with
intensity $\beta(x,y) = \mu(x,y) f(x - x_0, y - y_0)$, where $f(u,v)$ is either the Gaussian probability
density (for the Thomas process) or the uniform probability density in a disc (for the Matérn
cluster process), and $\mu(x,y)$ is the reference or modulating intensity. The number of offspring
from a given parent $(x_0, y_0)$ is a Poisson random variable with mean
$$B(x_0, y_0) = \iint \beta(x,y) \, dx \, dy = \iint f(x - x_0, y - y_0) \, \mu(x,y) \, dx \, dy.$$
The simulation algorithms rMatClust and rThomas allow these options. If the parent intensity
parameter kappa is given as a function(x,y) or a pixel image, then the parents are
Poisson with inhomogeneous intensity kappa. If the offspring mean parameter mu is given as a
function(x,y) or a pixel image, then this determines an inhomogeneous reference density for
the clusters.
> Z <- as.im(function(x, y) { 5 * exp(2 * x - 1) }, owin())
> plot(rMatClust(10, 0.05, Z))
[Figure: rMatClust(10, 0.05, Z)]
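The construction for inhomogeneous clusters can also be sketched in base R, here for the Thomas (Gaussian) variant on the unit square. The sketch relies on a thinning argument: if offspring are first generated with constant mean mu_max per parent and each offspring is then kept independently with probability $\mu(x,y)/\mu_{\max}$, the retained offspring of a parent at $(x_0,y_0)$ form a Poisson process with intensity $\mu(x,y) f(x - x_0, y - y_0)$. The function name, the margin of $4\sigma$ for parents outside the window, and the example reference intensity are all assumptions made for illustration; this is not how rThomas is implemented.

```r
# Sketch: Thomas process on the unit square with inhomogeneous cluster
# reference intensity mu(x, y), obtained by thinning a homogeneous Thomas process.
# mu_max must be an upper bound for mu(x, y) on the unit square.
sim_inhom_thomas <- function(kappa, sigma, mu, mu_max) {
  m  <- 4 * sigma                             # margin: parents may lie just outside W
  np <- rpois(1, kappa * (1 + 2 * m)^2)       # number of parents in the enlarged window
  px <- runif(np, -m, 1 + m)
  py <- runif(np, -m, 1 + m)
  nk <- rpois(np, mu_max)                     # offspring counts before thinning
  ox <- rep(px, nk) + rnorm(sum(nk), sd = sigma)
  oy <- rep(py, nk) + rnorm(sum(nk), sd = sigma)
  keep   <- runif(length(ox)) < mu(ox, oy) / mu_max   # retain with probability mu / mu_max
  inside <- ox >= 0 & ox <= 1 & oy >= 0 & oy <= 1     # clip to the unit square
  data.frame(x = ox[keep & inside], y = oy[keep & inside])
}

set.seed(1)
mu <- function(x, y) 5 * exp(2 * x - 1)       # illustrative reference intensity, increasing in x
X  <- sim_inhom_thomas(kappa = 10, sigma = 0.05, mu = mu, mu_max = 5 * exp(1))
print(nrow(X))
```

Clusters on the right-hand side of the square should contain more points on average, since the reference intensity increases with x.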
19.3 Fitting inhomogeneous models by minimum contrast
Minimum contrast methods can be applied to inhomogeneous point process models.
In principle we could fit any model (homogeneous or inhomogeneous) by the method of
minimum contrast using any summary statistic. However, the method works best when we
have an exact formula for the true value of the summary function for the model, expressed as a
function of the parameters of the model.
Waagepetersen [49] pointed out that, if we take a Thomas process or Matérn cluster process
(or in general a Neyman-Scott process) with homogeneous parent intensity $\kappa$ and inhomogeneous
cluster reference density $\mu(u)$, then the overall intensity of the process is
$$\lambda(u) = \kappa \, \mu(u)$$
and the inhomogeneous K function is the same as it would be if $\mu$ were constant.
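The intensity formula can be checked in one line. As a sketch, write $c$ for a parent location, assume the parents form a homogeneous Poisson process of intensity $\kappa$ on the whole plane, and use the offspring intensity $\mu(u) f(u - c)$ from Section 19.2, where $f$ is a probability density and therefore integrates to 1:

```latex
\lambda(u) \;=\; \int_{\mathbb{R}^2} \kappa \, \mu(u) \, f(u - c) \, dc
           \;=\; \kappa \, \mu(u) \int_{\mathbb{R}^2} f(u - c) \, dc
           \;=\; \kappa \, \mu(u).
```

The factor $\mu(u)$ comes out of the integral because it depends only on the offspring location $u$, not on the parent location $c$.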
Thus, we can fit a Thomas process or Matérn cluster process with inhomogeneous clusters
as follows:
1. estimate the inhomogeneous intensity $\lambda(u)$ of the process.
2. derive an estimate of the inhomogeneous K function.
3. use the method of minimum contrast to estimate the parent intensity $\kappa$ and the cluster
scale parameter (Gaussian standard deviation or disc radius), exactly as we would in the
homogeneous case.
The command kppm performs this algorithm using a parametric model for the trend:
> data(bei)
> fit <- kppm(bei, ~elev + grad, "Thomas", covariates = bei.extra)
> fit
Inhomogeneous cluster point process model
Fitted to point pattern dataset 'bei'
Trend formula: ~elev + grad
Fitted coefficients for trend formula:
(Intercept)        elev        grad
-8.55862210  0.02140987  5.84104065
Cluster model: Thomas process
Fitted parameters:
       kappa        sigma
0.0004290453 5.4110425537
In this example, kppm first estimates the intensity by using ppm to fit the Poisson model with
the same trend formula, ~elev + grad, and the same covariates.
Then predict.ppm is used to compute the predicted intensity at the data points, and this is
passed to Kinhom to calculate the inhomogeneous K function. The parameters of the Thomas
process are estimated from the inhomogeneous K function by minimum contrast.
The result of kppm can be printed, plotted and simulated as before.
20 Gibbs models
One way to construct a statistical model (in any field of statistics) is to write down its probability
density. Advantages of doing this are:
• the functional form of the density reflects its probabilistic properties.
• terms or factors in the density often have an interpretation as ‘components’ of the model.
• it is easy to introduce terms that represent the dependence of the model on covariates, etc.
This approach is useful provided the density can be written down, and provided the density
is tractable.
Spatial point process models that are constructed by writing down their probability densities
are called ‘Gibbs processes’. Good references on Gibbs point processes are [47, 20].
20.1 Probability densities
It is possible to define probability densities for spatial point processes that live inside a bounded
window $W$.
The probability density will be a function $f(x)$ defined for each finite configuration $x =
\{x_1, \ldots, x_n\}$ of points $x_i \in W$ for any $n \ge 0$. Notice that the number of points $n$ is not fixed,
and may be zero. Apart from this peculiarity, probability densities for point processes behave
much like probability densities in more familiar contexts.
That’s all you need to know for applications. If you’re interested in the mathematical
technicalities, read on; otherwise, skip to section 20.2.
A point process $X$ inside $W$ is defined to have probability density $f$ if and only if, for any
nonnegative integrable function $h$,
$$E[h(X)] = e^{-|W|} h(\emptyset) f(\emptyset) + e^{-|W|} \sum_{n=1}^{\infty} \frac{1}{n!} \int_W \cdots \int_W h(\{x_1, \ldots, x_n\}) \, f(\{x_1, \ldots, x_n\}) \, dx_1 \cdots dx_n \tag{31}$$
where $|W|$ denotes the area of $W$.
In particular, the probability that $X$ contains exactly $n$ points is
$$p_n = P\{n(X) = n\} = \frac{e^{-|W|}}{n!} \int_W \cdots \int_W f(\{x_1, \ldots, x_n\}) \, dx_1 \cdots dx_n$$
for $n \ge 1$, and $p_0 = P\{n(X) = 0\} = e^{-|W|} f(\emptyset)$. Given that there are exactly $n$ points, the
conditional joint density of the locations $x_1, \ldots, x_n$ is $f(\{x_1, \ldots, x_n\})/p_n$.
20.2 Poisson processes
The uniform Poisson process with intensity 1 has probability density $f(x) \equiv 1$.
The uniform Poisson process in $W$ with intensity $\lambda$ has probability density
$$f(x) = \alpha \, \lambda^{n(x)} \tag{32}$$
where $n(x)$ is the number of points in the configuration $x$, and the constant $\alpha$ is
$$\alpha = e^{(1 - \lambda)|W|}.$$
The inhomogeneous Poisson process in $W$ with intensity function $\lambda(u)$ has probability density
$$f(x) = \alpha \prod_{i=1}^{n(x)} \lambda(x_i) \tag{33}$$
where the constant $\alpha$ is
$$\alpha = \exp\left( \int_W (1 - \lambda(u)) \, du \right).$$
The densities (32) and (33) are products of terms associated with individual points $x_i$. This
reflects the conditional independence property (PP4) of the Poisson process.
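As a quick numerical check of (32), the formula for $p_n$ in Section 20.1 gives $p_n = e^{-|W|} \alpha \lambda^n |W|^n / n!$ for the uniform Poisson density, and this should coincide with the Poisson probabilities with mean $\lambda |W|$. A minimal base-R sketch, using arbitrary illustrative values of $\lambda$ and $|W|$:

```r
lambda <- 3.5      # intensity (illustrative value)
areaW  <- 2        # |W|, the area of the window (illustrative value)
n      <- 0:30

alpha <- exp((1 - lambda) * areaW)                        # normalising constant in (32)
p_n   <- exp(-areaW) * alpha * (lambda * areaW)^n / factorial(n)

# Agreement with the Poisson(lambda * |W|) probabilities
print(all.equal(p_n, dpois(n, lambda * areaW)))
```

The same calculation explains the form of $\alpha$: it is exactly the constant needed to turn $e^{-|W|}$ into $e^{-\lambda |W|}$.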
20.3 Pairwise interaction models
In order to construct spatial point processes which exhibit interpoint interaction (stochastic
dependence between points), we need to introduce terms in the density that depend on more
than one point. The simplest are pairwise interaction models, which have probability densities
of the form
$$f(x) = \alpha \left[ \prod_{i=1}^{n(x)} b(x_i) \right] \left[ \prod_{i < j} c(x_i, x_j) \right] \tag{34}$$
where $\alpha$ is a normalising constant, $b(u)$, $u \in W$ is the ‘first order’ term, and $c(u,v)$, $u, v \in W$
is the ‘second order’ or ‘pairwise interaction’ term. The pairwise interaction term introduces
dependence between points. The interaction function must be symmetric, $c(u,v) = c(v,u)$. In
principle we are free to choose any functions $b$ and $c$, provided the resulting density is integrable
(the right side of (31) should be finite when $h \equiv 1$).
20.3.1 Hard core process
If we take $b(u) \equiv \beta$ and
$$c(u,v) = \begin{cases} 1 & \text{if } \|u - v\| > r \\ 0 & \text{if } \|u - v\| \le r \end{cases}$$
where $\|u - v\|$ denotes the distance between $u$ and $v$, and $r > 0$ is a fixed distance, then the
density becomes
$$f(x) = \begin{cases} \alpha \beta^{n(x)} & \text{if } \|x_i - x_j\| > r \text{ for all } i \ne j \\ 0 & \text{otherwise.} \end{cases} \tag{35}$$
This is the density of the Poisson process of intensity $\beta$ in $W$ conditioned on the event that no
two points of the pattern lie closer than $r$ units apart. It is known as the (classical) hard core
process.
[Figure: Hard core process]
20.3.2 Strauss process
Generalising the hard core process, suppose we take $b(u) \equiv \beta$ and
$$c(u,v) = \begin{cases} 1 & \text{if } \|u - v\| > r \\ \gamma & \text{if } \|u - v\| \le r \end{cases} \tag{36}$$
where $\gamma$ is a parameter. Then the density becomes
$$f(x) = \alpha \, \beta^{n(x)} \gamma^{s(x)} \tag{37}$$
where $s(x)$ is the number of pairs of distinct points in $x$ that lie closer than $r$ units apart.
The parameter $\gamma$ controls the ‘strength’ of interaction between points. If $\gamma = 1$ the model
reduces to a Poisson process with intensity $\beta$. If $\gamma = 0$ the model is a hard core process. For
values $0 < \gamma < 1$, the process exhibits inhibition (negative association) between points.
[Figure: simulated realisations, panels Strauss($\gamma = 0.2$) and Strauss($\gamma = 0.7$)]
For $\gamma > 1$, the density (37) is not integrable. Hence the Strauss process is defined only for
$0 \le \gamma \le 1$ and is a model for inhibition between points. This is typical of most Gibbs models.
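The Strauss density is easy to evaluate up to the normalising constant $\alpha$, since it only requires counting close pairs. The following base-R sketch computes $s(x)$ and the unnormalised log-density $n(x) \log\beta + s(x) \log\gamma$; the function names and the small example pattern are ours.

```r
# s(x): number of unordered pairs of distinct points at distance <= r.
close_pairs <- function(x, y, r) {
  sum(dist(cbind(x, y)) <= r)
}

# Unnormalised Strauss log-density: n(x) * log(beta) + s(x) * log(gamma).
# With gamma = 0 (hard core) the result is -Inf whenever some pair is too close.
strauss_logdens <- function(x, y, beta, gamma, r) {
  s <- close_pairs(x, y, r)
  n <- length(x)
  n * log(beta) + if (s > 0) s * log(gamma) else 0
}

x <- c(0.10, 0.15, 0.60, 0.90)
y <- c(0.10, 0.10, 0.50, 0.90)
print(close_pairs(x, y, r = 0.1))                                # one close pair
print(strauss_logdens(x, y, beta = 100, gamma = 0.5, r = 0.1))
```

Each close pair multiplies the density by $\gamma \le 1$, so patterns with many close pairs are penalised.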
20.3.3 Other pairwise interaction models
Other pairwise interactions that are considered in spatstat include the Strauss-hard core interaction
(with hard core distance $h > 0$ and interaction distance $r > h$)
$$c(u,v) = \begin{cases} 0 & \text{if } \|u - v\| \le h \\ \gamma & \text{if } h < \|u - v\| \le r \\ 1 & \text{if } \|u - v\| > r, \end{cases}$$
the soft-core interaction (with scale $\sigma > 0$ and index $0 < \kappa < 1$)
$$c(u,v) = \exp\left\{ -\left( \frac{\sigma}{\|u - v\|} \right)^{2/\kappa} \right\},$$
the Diggle-Gates-Stibbard interaction (with interaction range $\rho$)
$$c(u,v) = \begin{cases} \sin^2\left( \dfrac{\pi \|u - v\|}{2\rho} \right) & \text{if } \|u - v\| \le \rho \\ 1 & \text{if } \|u - v\| > \rho, \end{cases}$$
the Diggle-Gratton interaction (with hard core distance $\delta$, interaction distance $\rho$ and index $\kappa$)
$$c(u,v) = \begin{cases} 0 & \text{if } \|u - v\| \le \delta \\ \left( \dfrac{\|u - v\| - \delta}{\rho - \delta} \right)^{\kappa} & \text{if } \delta < \|u - v\| \le \rho \\ 1 & \text{if } \|u - v\| > \rho, \end{cases}$$
and the general piecewise constant interaction in which $c(u,v)$ is a step function of $\|u - v\|$.
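Since each of these interactions depends only on the distance $d = \|u - v\|$, they are easy to tabulate. The base-R sketch below codes three of them (Strauss-hard core, Diggle-Gates-Stibbard and Diggle-Gratton) as functions of $d$; the function names are ours and are not spatstat identifiers.

```r
# Pairwise interaction functions c(d), where d = ||u - v||.
strauss_hardcore <- function(d, gamma, h, r) {
  ifelse(d <= h, 0, ifelse(d <= r, gamma, 1))
}
dgs <- function(d, rho) {
  ifelse(d <= rho, sin(pi * d / (2 * rho))^2, 1)
}
diggle_gratton <- function(d, delta, rho, kappa) {
  ifelse(d <= delta, 0,
         ifelse(d <= rho, ((d - delta) / (rho - delta))^kappa, 1))
}

d <- c(0.01, 0.05, 0.1, 0.2)
print(strauss_hardcore(d, gamma = 0.5, h = 0.02, r = 0.1))
print(dgs(d, rho = 0.1))
print(diggle_gratton(d, delta = 0.02, rho = 0.1, kappa = 2))
```

All three functions take values between 0 and 1 and equal 1 beyond the interaction range, so they describe inhibition at short distances and no interaction at long distances.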
[Figure: Piecewise constant interaction]
20.4 Higher-order interactions
There are some useful Gibbs point process models which exhibit interactions of higher order,
that is, in which the probability density has contributions from m-tuples of points, where $m > 2$.
One example is the area-interaction or Widom-Rowlinson process [11] with probability density
$$f(x) = \alpha \, \beta^{n(x)} \gamma^{-A(x)} \tag{38}$$
where $\alpha$ is the normalising constant, $\beta > 0$ is an intensity parameter, and $\gamma > 0$ is an interaction
parameter. Here $A(x)$ denotes the area of the region obtained by drawing a disc of radius $r$
centred at each point $x_i$, and taking the union of these discs. The value $\gamma = 1$ again corresponds
to a Poisson process, while $\gamma < 1$ produces a regular process and $\gamma > 1$ a clustered process.
This process has interactions of all orders. It can be used as a model for moderate regularity or
clustering.
20.5 Conditional intensity
The main tool for analysing a Gibbs point process is its conditional intensity $\lambda(u, X)$. Intuitively
this determines the conditional probability of finding a point of the process at the location $u$ given
complete information about the rest of the process. For formal definitions see [20]. Informally,
the conditional probability of finding a point of the process inside an infinitesimal neighbourhood
of the location $u$, given the complete point pattern at all other locations, is $\lambda(u, X) \, du$.
For point processes in a bounded window, the conditional intensity at a location $u$ given the
configuration $x$ is related to the probability density $f$ by
$$\lambda(u, x) = \frac{f(x \cup \{u\})}{f(x)} \tag{39}$$
(for $u \notin x$), the ratio of the probability densities for the configuration $x$ with and without the
point $u$ added.
The homogeneous Poisson process with intensity $\lambda$ has conditional intensity
$$\lambda(u, x) = \lambda$$
while the inhomogeneous Poisson process with intensity function $\lambda(u)$ has conditional intensity
$$\lambda(u, x) = \lambda(u).$$
The conditional intensity for a Poisson process does not depend on the configuration $x$, because
the points of a Poisson process are independent.
For the general pairwise interaction process (34) the conditional intensity is
$$\lambda(u, x) = b(u) \prod_{i=1}^{n(x)} c(u, x_i). \tag{40}$$
For the hard core process,
$$\lambda(u, x) = \begin{cases} \beta & \text{if } \|u - x_i\| > r \text{ for all } i \\ 0 & \text{otherwise} \end{cases} \tag{41}$$
which has the nice interpretation that a point $u$ is either ‘permitted’ or ‘not permitted’ depending
on whether it satisfies the hard core requirement.
For the Strauss process
$$\lambda(u, x) = \beta \gamma^{t(u, x)} \tag{42}$$
where $t(u, x) = s(x \cup \{u\}) - s(x)$ is the number of points of $x$ that lie within a distance $r$ of the
location $u$. For $\gamma < 1$, this has the interpretation that a random point is less likely to occur at
the location $u$ if there are many points in the neighbourhood.
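The relation between the density ratio (39) and the closed form (42) can be verified numerically for the Strauss process. In this base-R sketch the normalising constant $\alpha$ is omitted, since it cancels in the ratio; the function names and the example configuration are ours.

```r
# s(x): number of unordered pairs of points at distance <= r (p is an n x 2 matrix).
close_pairs <- function(p, r) if (nrow(p) < 2) 0 else sum(dist(p) <= r)

# Unnormalised Strauss density beta^n(x) * gamma^s(x).
strauss_dens <- function(p, beta, gamma, r) {
  beta^nrow(p) * gamma^close_pairs(p, r)
}

# Conditional intensity (42): beta * gamma^t(u, x),
# where t(u, x) counts the points of x within distance r of u.
strauss_cif <- function(u, p, beta, gamma, r) {
  t_ux <- sum(sqrt((p[, 1] - u[1])^2 + (p[, 2] - u[2])^2) <= r)
  beta * gamma^t_ux
}

p <- cbind(x = c(0.2, 0.3, 0.8), y = c(0.2, 0.2, 0.7))   # configuration x
u <- c(0.25, 0.25)                                       # candidate location u
cif   <- strauss_cif(u, p, beta = 50, gamma = 0.5, r = 0.15)
ratio <- strauss_dens(rbind(p, u), 50, 0.5, 0.15) / strauss_dens(p, 50, 0.5, 0.15)
print(c(cif = cif, ratio = ratio))   # both equal 50 * 0.5^2 = 12.5
```

Here $u$ has two neighbours within distance $r$, so adding it creates two new close pairs and the conditional intensity is $\beta\gamma^2$.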
20.6 Simulat<strong>in</strong>g Gibbs models 137<br />
Strauss<br />
+<br />
For the area-<strong>in</strong>teraction process,<br />
λ(u,x) = βγ −B(u,x)<br />
area−<strong>in</strong>teraction<br />
where B(u,x) = A(x ∪ {u}) − A(x) is the area <strong>of</strong> that part <strong>of</strong> the disc <strong>of</strong> radius r centred on u<br />
that is not covered by discs <strong>of</strong> radius r centred at the other <strong>po<strong>in</strong>t</strong>s xi ∈ x. If the <strong>po<strong>in</strong>t</strong>s represent<br />
trees or plants, we may imag<strong>in</strong>e that each tree takes nutrients <strong>and</strong> water from the soil <strong>in</strong>side a<br />
circle <strong>of</strong> radius r. Then we may <strong>in</strong>terpret B(u,x) as the area <strong>of</strong> the ‘unclaimed zone’ where a<br />
new plant at location u would be able to draw nutrients <strong>and</strong> water without competition from<br />
other plants. For γ < 1 we can <strong>in</strong>terpret (43) as say<strong>in</strong>g that a r<strong>and</strong>om <strong>po<strong>in</strong>t</strong> is less likely to<br />
occur when the unclaimed area is small.<br />
The conditional <strong>in</strong>tensity <strong>of</strong> a <strong>po<strong>in</strong>t</strong> process determ<strong>in</strong>es the probability density, through (39).<br />
Hence we can use the conditional <strong>in</strong>tensity to def<strong>in</strong>e a <strong>po<strong>in</strong>t</strong> process. The conditional <strong>in</strong>tensity<br />
is the preferred modell<strong>in</strong>g tool for Gibbs processes: it has a direct <strong>in</strong>terpretation, <strong>and</strong> it is easier<br />
to h<strong>and</strong>le than the probability density.<br />
20.6 Simulat<strong>in</strong>g Gibbs models<br />
Gibbs models can be simulated by Markov cha<strong>in</strong> Monte Carlo algorithms. Indeed, MCMC<br />
algorithms were <strong>in</strong>vented to simulate Gibbs processes [36, 41].<br />
In brief, these algorithms simulate a Markov cha<strong>in</strong> whose states are <strong>po<strong>in</strong>t</strong> <strong>patterns</strong>. The cha<strong>in</strong><br />
is designed so that its equilibrium distribution is the distribution <strong>of</strong> the <strong>po<strong>in</strong>t</strong> process we want<br />
to simulate. If the cha<strong>in</strong> were run for an <strong>in</strong>f<strong>in</strong>ite time, the state would converge <strong>in</strong> distribution<br />
to the desired <strong>po<strong>in</strong>t</strong> process. In practice the cha<strong>in</strong> is run for a long f<strong>in</strong>ite time. Further details<br />
are beyond the scope <strong>of</strong> this workshop; consult [37, 38] for more <strong>in</strong>formation.<br />
Currently spatstat <strong>of</strong>fers the function rmh which simulates Gibbs processes us<strong>in</strong>g the<br />
Metropolis-Hast<strong>in</strong>gs algorithm.<br />
> rmh(model, start, control)<br />
• model determ<strong>in</strong>es the <strong>po<strong>in</strong>t</strong> process model to be simulated (see help(rmhmodel)).<br />
• start determ<strong>in</strong>es the <strong>in</strong>itial state <strong>of</strong> the Markov cha<strong>in</strong> (see help(rmhstart)).<br />
• control specifies control parameters for runn<strong>in</strong>g the Markov cha<strong>in</strong>, such as the number<br />
<strong>of</strong> iteration steps (see help(rmhcontrol)).<br />
In the simplest uses of rmh, the three arguments are lists of parameter values. To generate a<br />
simulated realisation of the Strauss process with parameters β = 2, γ = 0.7, r = 0.7 in a square<br />
of side 10,<br />
> mo <- list(cif = "strauss", par = list(beta = 2, gamma = 0.7, r = 0.7), w = square(10))<br />
> X <- rmh(model = mo, start = list(n.start = ...), control = list(nrep = ...))<br />
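A runnable version of this call might look as follows. The model parameters are those quoted in the text; the initial number of points (n.start = 80) and the number of iterations (nrep = 5000) are assumed values chosen only for illustration, and the sketch assumes a current version of spatstat is installed.<br />

```r
library(spatstat)

# Strauss model with the parameter values quoted in the text.
mo <- list(cif = "strauss",
           par = list(beta = 2, gamma = 0.7, r = 0.7),
           w = square(10))

# Assumed settings (not from the notes): 80 initial points, 5000 iterations.
X <- rmh(model = mo,
         start = list(n.start = 80),
         control = list(nrep = 5000),
         verbose = FALSE)

npoints(X)  # number of points in the simulated pattern
```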
21 Methods 9: fitt<strong>in</strong>g Gibbs models<br />
21.1 Maximum pseudolikelihood<br />
Maximum likelihood estimation is <strong>in</strong>tractable for most <strong>po<strong>in</strong>t</strong> process models. At the very least<br />
it requires Monte Carlo simulation to evaluate the likelihood (or the score <strong>and</strong> the Fisher <strong>in</strong>formation).<br />
A workable alternative, at least for <strong>in</strong>vestigative purposes, is to maximise the log pseudolikelihood<br />
\[ \log \mathrm{PL}(\theta; x) = \sum_i \log \lambda(x_i; x) - \int_W \lambda(u, x) \, du. \qquad (44) \]<br />
You may recognise this as being very similar to the likelihood (4) of the Poisson process. In<br />
general it is not a likelihood, but the analogue of the score equation<br />
\[ \frac{\partial}{\partial \theta} \log \mathrm{PL}(\theta) = 0 \]<br />
is an unbiased estimating equation. Thus the maximum pseudolikelihood estimator is asymptotically<br />
unbiased, consistent and asymptotically normal under appropriate conditions.<br />
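To make the definition concrete, here is a rough base R sketch that evaluates the log pseudolikelihood of a Strauss model on the unit square, approximating the integral by a midpoint rule on a grid. The function names count_close and log_pl_strauss and the grid size m are assumptions made for this example; ppm uses a more refined quadrature scheme.<br />

```r
# Number of points of pts within distance r of location u.
count_close <- function(u, pts, r) {
  if (nrow(pts) == 0) return(0)
  sum(sqrt((pts[, 1] - u[1])^2 + (pts[, 2] - u[2])^2) <= r)
}

# Log pseudolikelihood of a Strauss model on the unit square, with the
# integral approximated on an m x m grid of cell centres.
log_pl_strauss <- function(pts, beta, gamma, r, m = 50) {
  n <- nrow(pts)
  # Data term: lambda(x_i; x) is evaluated using the other points only.
  t_data <- vapply(seq_len(n), function(i)
    count_close(pts[i, ], pts[-i, , drop = FALSE], r), numeric(1))
  data_term <- sum(log(beta) + t_data * log(gamma))
  # Integral term: midpoint rule over the unit square.
  g <- (seq_len(m) - 0.5) / m
  grid <- as.matrix(expand.grid(g, g))
  t_grid <- apply(grid, 1, count_close, pts = pts, r = r)
  integral <- sum(beta * gamma^t_grid) / m^2
  data_term - integral
}

set.seed(42)
pts <- cbind(runif(30), runif(30))   # toy data, not a real dataset
log_pl_strauss(pts, beta = 30, gamma = 1, r = 0.1)    # Poisson case
log_pl_strauss(pts, beta = 30, gamma = 0.5, r = 0.1)  # inhibitory case
```

With γ = 1 the interaction vanishes and the value reduces to n log β − β|W|, the Poisson log-likelihood, which gives a simple check on the code.<br />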
The ma<strong>in</strong> advantage <strong>of</strong> maximum pseudolikelihood is that, at least for popular Gibbs models,<br />
the conditional <strong>in</strong>tensity λ(u,x) is easily computable, so that the pseudolikelihood is easy to<br />
compute <strong>and</strong> to maximise. The ma<strong>in</strong> disadvantage is the bias <strong>and</strong> <strong>in</strong>efficiency <strong>of</strong> maximum<br />
pseudolikelihood <strong>in</strong> small samples.<br />
More computationally-<strong>in</strong>tensive estimation procedures typically use the maximum pseudolikelihood<br />
estimate as their <strong>in</strong>itial guess. We are implement<strong>in</strong>g such procedures <strong>in</strong> spatstat as<br />
well.<br />
21.2 Fitt<strong>in</strong>g Gibbs models <strong>in</strong> spatstat<br />
We have already met the function ppm for fitt<strong>in</strong>g Poisson <strong>po<strong>in</strong>t</strong> process models. In fact this<br />
function will fit a wide class <strong>of</strong> Gibbs models.<br />
ppm conta<strong>in</strong>s an implementation <strong>of</strong> the algorithm <strong>of</strong> Baddeley <strong>and</strong> Turner [3] for maximum<br />
pseudolikelihood (which extends the Berman-Turner device for Poisson processes to a general<br />
Gibbs process). The conditional <strong>in</strong>tensity <strong>of</strong> the model, λθ(u,x), must be logl<strong>in</strong>ear <strong>in</strong> the<br />
parameters θ:<br />
log λθ(u,x) = θ · S(u,x), (45)<br />
generalis<strong>in</strong>g (5), where S(u,x) is a real-valued or vector-valued function <strong>of</strong> location u <strong>and</strong> configuration<br />
x. Parameters θ appear<strong>in</strong>g <strong>in</strong> the logl<strong>in</strong>ear form (45) are called ‘regular’ parameters, <strong>and</strong><br />
all other parameters are ‘irregular’ parameters. For example, the Strauss process conditional<br />
<strong>in</strong>tensity (42) can be recast as<br />
\[ \log \lambda(u,x) = \log \beta + (\log \gamma) \, t(u,x) \]<br />
so that θ = (log β,log γ) are regular parameters, but the <strong>in</strong>teraction distance r is an irregular<br />
parameter (technically called a ‘bloody nuisance parameter’).<br />
In spatstat we split the conditional intensity into first-order and higher-order terms:<br />
\[ \log \lambda_\theta(u,x) = \eta \cdot S(u) + \varphi \cdot V(u,x). \qquad (46) \]<br />
The ‘first order term’ S(u) describes spatial inhomogeneity and/or covariate effects. The ‘higher<br />
order term’ V(u,x) describes interpoint interaction.<br />
The model with conditional intensity (46) is fitted by calling ppm in the form<br />
ppm(X, ~ terms, V)<br />
The first argument X is the <strong>po<strong>in</strong>t</strong> pattern dataset. The second argument ~terms is a model<br />
formula, specify<strong>in</strong>g the first order term S(u) <strong>in</strong> (46), <strong>in</strong> the manner described <strong>in</strong> Section 13.<br />
Thus the first order term S(u) <strong>in</strong> (46) may take very general forms.<br />
The third argument V is an object <strong>of</strong> the special class "<strong>in</strong>teract" which describes the<br />
<strong>in</strong>ter<strong>po<strong>in</strong>t</strong> <strong>in</strong>teraction term V (u,x) <strong>in</strong> (46). It may be compared to the ‘family’ argument<br />
which determ<strong>in</strong>es the distribution <strong>of</strong> the responses <strong>in</strong> a l<strong>in</strong>ear model or generalised l<strong>in</strong>ear model.<br />
Only a limited number <strong>of</strong> canned <strong>in</strong>teractions are available <strong>in</strong> spatstat, because they must be<br />
constructed carefully to ensure that the <strong>po<strong>in</strong>t</strong> process exists.<br />
To fit the Strauss process to the cells data us<strong>in</strong>g ppm,<br />
> data(cells)<br />
> ppm(cells, ~1, Strauss(r = 0.1))<br />
Stationary Strauss process<br />
First order term:<br />
beta<br />
762.6005<br />
Interaction: Strauss process<br />
<strong>in</strong>teraction distance: 0.1<br />
Fitted <strong>in</strong>teraction parameter gamma: 0.008<br />
Relevant coefficients:<br />
Interaction<br />
-4.825006<br />
Here Strauss is a special function that creates an ‘<strong>in</strong>teraction’ object (class "<strong>in</strong>teract")<br />
describ<strong>in</strong>g the <strong>in</strong>teraction structure <strong>of</strong> the Strauss process. Notice that we had to specify the<br />
value <strong>of</strong> the irregular parameter r (more about that later).<br />
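The printout can be related back to the loglinear form: since the regular parameters are θ = (log β, log γ), the coefficient labelled ‘Interaction’ should be log γ. A quick check in base R, using the numbers copied from the printout above:<br />

```r
# 'Interaction' coefficient from the printout of ppm(cells, ~1, Strauss(r = 0.1)).
coef_strauss <- -4.825006

# Back-transform from the canonical scale to the interaction parameter.
gamma_strauss <- exp(coef_strauss)
gamma_strauss   # agrees with the reported gamma of 0.008 after rounding
```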
To fit the inhomogeneous Strauss process with conditional intensity<br />
\[ \lambda(u,x) = b(u) \, \gamma^{t(u,x)} \]<br />
where, say, b(u) is loglinear in the Cartesian coordinates,<br />
\[ \log b((x,y)) = \beta_0 + \beta_1 x + \beta_2 y, \]<br />
we simply type<br />
> ppm(cells, ~x + y, Strauss(r = 0.1))<br />
Nonstationary Strauss process<br />
Trend formula: ~x + y<br />
Fitted coefficients for trend formula:<br />
(Intercept) x y<br />
6.2922384 0.5269869 0.1576416<br />
Interaction: Strauss process<br />
<strong>in</strong>teraction distance: 0.1<br />
Fitted <strong>in</strong>teraction parameter gamma: 0.0082<br />
Relevant coefficients:<br />
Interaction<br />
-4.805565<br />
To fit an <strong>in</strong>homogeneous Strauss process with log-quadratic first order term,<br />
> ppm(cells, ~polynom(x, y, 2), Strauss(r = 0.1))<br />
Nonstationary Strauss process<br />
Trend formula: ~polynom(x, y, 2)<br />
Fitted coefficients for trend formula:<br />
(Intercept) polynom(x, y, 2)[x] polynom(x, y, 2)[y]<br />
5.9747220 -0.9375707 3.4732733<br />
polynom(x, y, 2)[x^2] polynom(x, y, 2)[x.y] polynom(x, y, 2)[y^2]<br />
1.4970947 -0.1838987 -3.3696109<br />
Interaction: Strauss process<br />
<strong>in</strong>teraction distance: 0.1<br />
Fitted <strong>in</strong>teraction parameter gamma: 0.0081<br />
Relevant coefficients:<br />
Interaction<br />
-4.812711<br />
21.3 Inter<strong>po<strong>in</strong>t</strong> <strong>in</strong>teractions<br />
Instead of Strauss we may use any of the following functions to create an interaction:<br />
Poisson() the Poisson point process (the default)<br />
Strauss() the Strauss process<br />
StraussHard() the Strauss/hard core point process<br />
Softcore() pairwise interaction, soft core potential<br />
PairPiece() pairwise interaction, piecewise constant<br />
DiggleGratton() Diggle-Gratton potential<br />
LennardJones() Lennard-Jones potential<br />
AreaInter() area-interaction process<br />
Geyer() Geyer’s saturation process<br />
BadGey() hybrid Geyer saturation process<br />
SatPiece() multiscale saturation process<br />
OrdThresh() Ord process, threshold potential<br />
Pairwise() pairwise interaction, user-supplied potential<br />
Saturated() general saturated model, user-supplied potential<br />
Ord() Ord model, user-supplied potential<br />
(There are two additional ones for multitype point processes, described in section 27.3.2.)<br />
The area-<strong>in</strong>teraction model <strong>and</strong> the Geyer saturation model are quite h<strong>and</strong>y, as they can be<br />
used to model both cluster<strong>in</strong>g <strong>and</strong> regularity.<br />
> data(redwood)<br />
> ppm(redwood, ~1, Geyer(r = 0.07, sat = 2))<br />
Stationary Geyer saturation process<br />
First order term:<br />
beta<br />
12.39488<br />
Interaction: Geyer saturation process<br />
<strong>in</strong>teraction distance: 0.07<br />
saturation parameter: 2<br />
Fitted <strong>in</strong>teraction parameter gamma: 2.9004<br />
Relevant coefficients:<br />
Interaction<br />
1.064845<br />
> ppm(redwood, ~1, AreaInter(r = 0.03))<br />
Stationary Area-<strong>in</strong>teraction process<br />
First order term:<br />
beta<br />
36.6887<br />
Interaction: Area-<strong>in</strong>teraction process<br />
disc radius: 0.03<br />
Fitted <strong>in</strong>teraction parameter eta: 15.6968<br />
Relevant coefficients:<br />
Interaction<br />
2.753459<br />
The printout for the area-interaction model uses the “scale-free” parameter eta defined by<br />
\[ \eta = \gamma^{\pi r^2} \]<br />
where γ and r are the parameters appearing in the definition (43). Values of η greater than 1<br />
suggest clustering.<br />
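A short calculation suggests why the scale-free parameter is convenient. Taking the fitted values from the printout above, the sketch below recovers log γ from η; the variable names are chosen for this example only.<br />

```r
eta <- 15.6968   # fitted eta from the printout
r <- 0.03        # disc radius

# On the log scale, log(eta) = pi * r^2 * log(gamma).
log(eta)                             # close to the 'Interaction' coefficient 2.753459
log_gamma <- log(eta) / (pi * r^2)
log_gamma                            # very large, because pi * r^2 is tiny
is.infinite(exp(log_gamma))          # gamma itself overflows double precision
```

Here γ is too large to represent as a double, whereas η remains on a readable scale.<br />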
For more detailed explanation <strong>of</strong> modell<strong>in</strong>g, see [5].<br />
21.4 Fitted <strong>po<strong>in</strong>t</strong> process models<br />
The result <strong>of</strong> the ppm call is an object <strong>of</strong> class "ppm" (‘<strong>po<strong>in</strong>t</strong> process model’). This is very closely<br />
analogous to a fitted l<strong>in</strong>ear model (lm) or fitted generalised l<strong>in</strong>ear model (glm).<br />
Standard R operations that are defined for fitted point process models (i.e. that have methods<br />
for the class "ppm") include:<br />
print print basic information<br />
summary print detailed summary information<br />
plot plot the fitted (conditional) intensity<br />
predict fitted (conditional) intensity<br />
fitted fitted (conditional) intensity at data points<br />
update re-fit the model<br />
coef extract the fitted coefficient vector θ<br />
vcov variance-covariance matrix of θ<br />
anova analysis of deviance<br />
logLik evaluate log-pseudolikelihood<br />
model.matrix extract design matrix<br />
formula extract trend formula of model<br />
terms extract terms in model formula<br />
(the methods for anova and vcov are only available for Poisson models). The following<br />
functions are also available:<br />
step stepwise model selection<br />
drop1 one step backward in model selection<br />
model.images compute images of canonical covariates in model<br />
effectfun fitted intensity as function of one covariate<br />
Plotting a fitted model generates a series of image and contour plots of<br />
• the fitted first order term \(\exp(\hat\eta \cdot S(u))\)<br />
• the fitted conditional intensity \(\lambda_{\hat\theta}(u,x)\) evaluated for the data pattern x<br />
For Poisson models, the two plots are equivalent, and give the fitted intensity function.<br />
> fit <- ppm(...)<br />
> par(mfrow = c(1, 2))<br />
> plot(fit, how = "image", ngrid = 256)<br />
[Figure: image plots of the fitted trend and the fitted cif]<br />
For non-Poisson models, it is also possible to extract and plot the interpoint interaction<br />
function, using fitin.<br />
> model <- ppm(...)<br />
> f <- fitin(model)<br />
> plot(f)<br />
[Figure: fitted pairwise interaction plotted against distance]<br />
21.5 Simulation from fitted models<br />
A fitted Gibbs model can also be simulated automatically using rmh.<br />
> fit <- ppm(...)<br />
> Xsim <- rmh(fit)<br />
> plot(Xsim, main = "Simulation from fitted Strauss model")<br />
[Figure: simulated point pattern, titled "Simulation from fitted Strauss model"]<br />
The envelope command will also generate simulation envelopes for a fitted model.<br />
> plot(envelope(fit, nsim = 39))<br />
[Figure: simulation envelopes of K(r) from envelope(fit, nsim = 39), plotted against r (one unit = 0.1 metres)]<br />
21.6 Deal<strong>in</strong>g with nuisance parameters<br />
Irregular parameters, such as the <strong>in</strong>teraction radius r <strong>in</strong> the Strauss process, cannot be estimated<br />
directly us<strong>in</strong>g ppm. Indeed the statistical theory for estimat<strong>in</strong>g such parameters is unclear.<br />
For some special cases, a maximum likelihood estimator <strong>of</strong> the nuisance parameter is available.<br />
For example, for the ‘hard core process’ (Strauss process with <strong>in</strong>teraction parameter γ = 0)<br />
with <strong>in</strong>teraction radius r, the maximum likelihood estimator is the m<strong>in</strong>imum nearest-neighbour<br />
distance. Thus the follow<strong>in</strong>g is a reasonable approach to the cells dataset:<br />
> rhat <- min(nndist(cells))<br />
> ppm(cells, ~1, Strauss(r = rhat))<br />
Stationary Strauss process<br />
First order term:<br />
beta<br />
301.0949<br />
Interaction: Strauss process<br />
<strong>in</strong>teraction distance: 0.0836293018068393<br />
Fitted <strong>in</strong>teraction parameter gamma: 0<br />
Relevant coefficients:<br />
Interaction<br />
-20.77031<br />
The analogue <strong>of</strong> pr<strong>of</strong>ile likelihood, pr<strong>of</strong>ile pseudolikelihood, provides a general solution which<br />
may or may not perform well. If θ = (φ,η) where φ denotes the nuisance parameters <strong>and</strong> η the<br />
regular parameters, def<strong>in</strong>e the pr<strong>of</strong>ile log pseudolikelihood by<br />
\[ \mathrm{PLP}(\phi, x) = \max_{\eta} \log \mathrm{PL}((\phi,\eta); x). \]<br />
The right hand side can be computed, for each fixed value of φ, by the algorithm ppm. Then we<br />
just have to maximise PLP(φ) over φ. This is done by the command profilepl:<br />
> data(simdat)<br />
> df <- data.frame(r = seq(0.05, 2, by = ...))<br />
> pfit <- profilepl(df, Strauss, simdat, ~1)<br />
> pfit<br />
Pr<strong>of</strong>ile log pseudolikelihood values<br />
for model: ppm(simdat, ~1, <strong>in</strong>teraction = Strauss)<br />
fitted with rbord= 2<br />
Interaction: Strauss<br />
with irregular parameter ’r’ <strong>in</strong> [0.05, 2]<br />
Optimum value <strong>of</strong> irregular parameter: r = 0.275<br />
The result is an object of class profilepl containing the profile log pseudolikelihood function,<br />
the optimised value of the irregular parameter r, and the final fitted model. To plot the profile<br />
log pseudolikelihood,<br />
> plot(pfit)<br />
[Figure: profile log pseudolikelihood (log PL) plotted against r for the model ppm(simdat, ~1, interaction = Strauss)]<br />
To extract the f<strong>in</strong>al fitted model,<br />
> pfit$fit<br />
Stationary Strauss process<br />
First order term:<br />
beta<br />
2.583110<br />
Interaction: Strauss process<br />
<strong>in</strong>teraction distance: 0.275<br />
Fitted <strong>in</strong>teraction parameter gamma: 0.5631<br />
Relevant coefficients:<br />
Interaction<br />
-0.5743608<br />
There is a summary method for these objects as well.<br />
21.7 Improvements over maximum pseudolikelihood<br />
Maximum pseudolikelihood is quick <strong>and</strong> dirty. There are statistically more efficient alternatives,<br />
but they are computationally <strong>in</strong>tensive.<br />
Currently we have implemented the easiest of these alternatives, the Huang-Ogata [30] one-step<br />
approximation to maximum likelihood. Starting from the maximum pseudolikelihood estimate<br />
\(\hat\theta_{PL}\), we simulate M independent realisations of the model with parameters \(\hat\theta_{PL}\), evaluate<br />
the canonical sufficient statistics, and use them to form estimates of the score and Fisher information<br />
at \(\theta = \hat\theta_{PL}\). Then we take one Newton-Raphson step, updating the value of θ. The<br />
rationale is that the log-likelihood is approximately quadratic in a neighbourhood of the maximum<br />
pseudolikelihood estimator, so that one Newton-Raphson step is almost enough.<br />
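In symbols, the one-step update can be sketched as follows. This relies on the model being an exponential family in θ, so that the score is the observed canonical statistic minus its expectation and the Fisher information is its variance. The notation \(T\) for the canonical sufficient statistic, \(X^{(1)}, \ldots, X^{(M)}\) for the simulated realisations and \(\bar T\) for their average is introduced here for illustration, and the exact normalisation used in spatstat may differ.<br />

```latex
\[
\hat\theta_{HO} = \hat\theta_{PL} + \hat I^{-1} \hat U,
\qquad
\hat U = T(x) - \bar T,
\qquad
\bar T = \frac{1}{M} \sum_{m=1}^{M} T(X^{(m)}),
\]
\[
\hat I = \frac{1}{M-1} \sum_{m=1}^{M}
  \bigl( T(X^{(m)}) - \bar T \bigr) \bigl( T(X^{(m)}) - \bar T \bigr)^{\top}.
\]
```

Here \(\hat U\) estimates the score and \(\hat I\) estimates the Fisher information at \(\theta = \hat\theta_{PL}\).<br />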
To use the Huang-Ogata method <strong>in</strong>stead <strong>of</strong> maximum pseudolikelihood, add the argument<br />
method="ho".<br />
> fit <- ppm(simdat, ~1, Strauss(r = 0.275), method = "ho")<br />
> fit<br />
Stationary Strauss process<br />
First order term:<br />
beta<br />
2.515173<br />
Interaction: Strauss process<br />
<strong>in</strong>teraction distance: 0.275<br />
Fitted <strong>in</strong>teraction parameter gamma: 0.676<br />
Relevant coefficients:<br />
Interaction<br />
-0.3916353<br />
> vcov(fit)<br />
[,1] [,2]<br />
[1,] 0.01058897 -0.01236823<br />
[2,] -0.01236823 0.03235970<br />
For models fitted by Huang-Ogata, the variance-covariance matrix returned by vcov is computed<br />
from the simulations.<br />
22 Methods 10: validation <strong>of</strong> fitted Gibbs models<br />
Goodness-<strong>of</strong>-fit test<strong>in</strong>g <strong>and</strong> model validation for Poisson models were described <strong>in</strong> Section 14.<br />
Check<strong>in</strong>g a fitted Gibbs <strong>po<strong>in</strong>t</strong> process model is more difficult. There is little theory available to<br />
support goodness-<strong>of</strong>-fit tests <strong>and</strong> the like.<br />
As an example, consider the follow<strong>in</strong>g data:<br />
> data(residualspaper)<br />
> X <- ...<br />
> plot(X)<br />
[Figure: the point pattern X]<br />
We fit a Strauss process model with a log-quadratic intensity term:<br />
> fit <- ppm(X, ~polynom(x, y, 2), Strauss(r = ...))<br />
22.1 Goodness-of-fit testing for Gibbs processes<br />
> plot(envelope(fit, Lest, nsim = 19, global = TRUE))<br />
[Figure: global envelopes of L(r) against r from envelope(fit, Lest, nsim = 19, global = TRUE)]<br />
Let’s subtract the theoretical Poisson value L(r) = r to get a more readable plot:<br />
> plot(envelope(fit, Lest, nsim = 19, global = TRUE), . - r ~ r)<br />
[Figure: the same envelopes with r subtracted, showing cbind(obs, mmean, hi, lo) − r against r]<br />
This is fairly consistent with a Strauss process.<br />
22.2 Residuals for Gibbs processes<br />
22.2.1 Def<strong>in</strong>ition<br />
Residuals for a general Gibbs model were defined only recently [6, 1]. The total residual in a<br />
region \(B \subset \mathbb{R}^2\) is defined as<br />
\[ R(B) = n(x \cap B) - \int_B \lambda(u,x) \, du \qquad (47) \]<br />
where aga<strong>in</strong> n(x ∩ B) is the observed number <strong>of</strong> <strong>po<strong>in</strong>t</strong>s <strong>in</strong> the region B, <strong>and</strong> λ(u,x) is the<br />
conditional <strong>in</strong>tensity <strong>of</strong> the fitted model, evaluated for the data <strong>po<strong>in</strong>t</strong> pattern x. If the fitted<br />
model is correct, the residuals have mean zero.<br />
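As a rough illustration of (47), the following base R sketch computes the total raw residual in a rectangle B, approximating the integral by a midpoint rule. The function raw_residual, its arguments and the toy data are assumptions made for this example; in practice the residuals would be computed by spatstat from a fitted model.<br />

```r
# Raw residual R(B) = n(x in B) - integral over B of lambda(u, x) du.
# pts: two-column matrix of points; cif(u, pts): conditional intensity;
# B = c(xmin, xmax, ymin, ymax); m: grid resolution for the integral.
raw_residual <- function(pts, cif, B, m = 40) {
  inside <- pts[, 1] >= B[1] & pts[, 1] <= B[2] &
            pts[, 2] >= B[3] & pts[, 2] <= B[4]
  gx <- B[1] + (seq_len(m) - 0.5) * (B[2] - B[1]) / m
  gy <- B[3] + (seq_len(m) - 0.5) * (B[4] - B[3]) / m
  grid <- as.matrix(expand.grid(gx, gy))
  cell_area <- (B[2] - B[1]) * (B[4] - B[3]) / m^2
  integral <- sum(apply(grid, 1, cif, pts = pts)) * cell_area
  sum(inside) - integral
}

set.seed(7)
pts <- cbind(runif(50), runif(50))   # toy pattern in the unit square

# Uniform Poisson fit on the unit square: intensity n / |W| = n.
poisson_cif <- function(u, pts) nrow(pts)

raw_residual(pts, poisson_cif, B = c(0, 1, 0, 1))     # whole window
raw_residual(pts, poisson_cif, B = c(0, 0.5, 0, 1))   # left half
```

For the uniform Poisson fit the residual over the whole window is zero by construction, while the residual over a subregion shows whether that subregion holds more or fewer points than the model predicts.<br />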
This def<strong>in</strong>ition is similar to the def<strong>in</strong>ition <strong>of</strong> residuals for Poisson processes (Section 14.2)<br />
except that the <strong>in</strong>tensity λ(u) <strong>of</strong> the fitted Poisson process has been replaced by the conditional<br />
<strong>in</strong>tensity λ(u,x) <strong>of</strong> the fitted Gibbs process evaluated for the data <strong>po<strong>in</strong>t</strong> pattern x.<br />
22.2.2 Residual plots<br />
Residuals for Gibbs processes can be plotted us<strong>in</strong>g the same techniques as <strong>in</strong> Section 14.2. Here<br />
is the four-panel plot:<br />
> diagnose.ppm(fit, type = "Pearson")<br />
[Figure: four-panel residual plot; the lurking variable panels show the cumulative sum of Pearson residuals against the x and y coordinates]<br />
At the time <strong>of</strong> writ<strong>in</strong>g, spatstat does not yet display 2σ significance b<strong>and</strong>s for the lurk<strong>in</strong>g<br />
variable plots when the fitted model is not Poisson. The <strong>in</strong>terpretation <strong>of</strong> the lurk<strong>in</strong>g variable<br />
plots is a little more difficult without the significance b<strong>and</strong>s. One tends to place a little more<br />
emphasis on the smoothed residual field. The Pearson residuals should be approximately st<strong>and</strong>ardised,<br />
so that values which are much greater than 2 (<strong>in</strong> absolute value) suggest a lack <strong>of</strong><br />
fit.<br />
The four-panel plot above suggests that the model is a reasonable fit.<br />
22.2.3 Q–Q plots<br />
As we noted <strong>in</strong> Section 14.2.6, the four-panel residual plot <strong>and</strong> the lurk<strong>in</strong>g variable plot are<br />
useful for detect<strong>in</strong>g misspecification <strong>of</strong> the trend <strong>in</strong> a fitted model. They are not very useful for<br />
check<strong>in</strong>g misspecification <strong>of</strong> the <strong>in</strong>teraction <strong>in</strong> a fitted model.<br />
An extreme example is provided by the cells dataset. The residual plots for a uniform<br />
Poisson process fitted to the cells data suggest that this is a good model:<br />
> data(cells)<br />
> fitPois <- ppm(cells, ~1)<br />
> diagnose.ppm(fitPois)<br />
[Figure: four-panel residual plot for fitPois; the lurking variable panels show the cumulative sum of raw residuals against the x and y coordinates]<br />
However, the K-function shows that the cells dataset is clearly not a Poisson pattern, but<br />
has strong <strong>in</strong>hibition:<br />
> par(mfrow = c(1, 2))<br />
> plot(cells)<br />
> plot(Kest(cells))<br />
> par(mfrow = c(1, 1))<br />
[Figure: left, the cells point pattern; right, Kest(cells), showing K(r) against r]<br />
Interaction between <strong>po<strong>in</strong>t</strong>s <strong>in</strong> a <strong>po<strong>in</strong>t</strong> process corresponds roughly to the distribution <strong>of</strong> the<br />
responses <strong>in</strong> logl<strong>in</strong>ear regression. To validate the <strong>in</strong>teraction terms <strong>in</strong> a <strong>po<strong>in</strong>t</strong> process model, we<br />
should plot the distribution <strong>of</strong> the residuals. The appropriate tool is a Q–Q plot.<br />
> qqplot.ppm(fitPois, nsim = 39)<br />
[Figure: Q–Q plot of the raw residuals for fitPois, showing data quantile against mean quantile of simulations]<br />
This shows a Q–Q plot <strong>of</strong> the smoothed residuals for a uniform Poisson model fitted to the<br />
cells data, with <strong>po<strong>in</strong>t</strong>wise 5% critical envelopes from simulations <strong>of</strong> the fitted model. This<br />
<strong>in</strong>dicates that the uniform Poisson model is grossly <strong>in</strong>appropriate for the cells data.<br />
Return<strong>in</strong>g to the model we fitted at the start <strong>of</strong> this chapter:<br />
> qqplot.ppm(fit, nsim = 39)<br />
[Figure: Q–Q plot of the raw residuals from qqplot.ppm(fit, nsim = 39), showing data quantile against mean quantile of simulations]<br />
This shows a Q–Q plot <strong>of</strong> the smoothed residuals, with <strong>po<strong>in</strong>t</strong>wise 5% critical envelopes from<br />
simulations <strong>of</strong> the fitted model. This suggests that the Strauss model is reasonable.<br />
These validation techniques generalise <strong>and</strong> unify many exist<strong>in</strong>g exploratory methods. For<br />
particular models <strong>of</strong> <strong>in</strong>ter<strong>po<strong>in</strong>t</strong> <strong>in</strong>teraction, the Q–Q plot is closely related to the summary<br />
functions F, G <strong>and</strong> K. See [6].<br />
PART V. MARKED POINT PATTERNS<br />
Part V <strong>of</strong> the workshop deals with marked <strong>po<strong>in</strong>t</strong> <strong>patterns</strong>.<br />
23 Marked <strong>po<strong>in</strong>t</strong> <strong>patterns</strong><br />
23.1 Marked <strong>po<strong>in</strong>t</strong> <strong>patterns</strong><br />
Each <strong>po<strong>in</strong>t</strong> <strong>in</strong> a <strong>spatial</strong> <strong>po<strong>in</strong>t</strong> pattern may carry additional <strong>in</strong>formation called a ‘mark’. For<br />
example, <strong>po<strong>in</strong>t</strong>s which are classified <strong>in</strong>to two or more different types (on/<strong>of</strong>f, case/control, species,<br />
colour, etc) may be regarded as marked <strong>po<strong>in</strong>t</strong>s, with a mark which identifies which type they<br />
are. Data record<strong>in</strong>g the locations <strong>and</strong> heights <strong>of</strong> trees <strong>in</strong> a forest can be regarded as a marked<br />
<strong>po<strong>in</strong>t</strong> pattern where the mark attached to a tree’s location is the tree height.<br />
In our current implementation, the mark attached to each <strong>po<strong>in</strong>t</strong> must be a s<strong>in</strong>gle value (which<br />
may be numeric, character, complex, logical, or factor). Many <strong>of</strong> the functions <strong>in</strong> spatstat<br />
h<strong>and</strong>le marked <strong>po<strong>in</strong>t</strong> <strong>patterns</strong> <strong>in</strong> which the mark attached to each <strong>po<strong>in</strong>t</strong> is either<br />
• a cont<strong>in</strong>uous variate or “real number”. An example is the Longleaf P<strong>in</strong>es dataset<br />
(longleaf) <strong>in</strong> which each tree is marked with its diameter at breast height. The marks<br />
component must be a numeric vector such that marks[i] is the mark value associated<br />
with the ith <strong>po<strong>in</strong>t</strong>. We say the <strong>po<strong>in</strong>t</strong> pattern has cont<strong>in</strong>uous marks.<br />
• a categorical variate. An example is the Amacr<strong>in</strong>e Cells dataset (amacr<strong>in</strong>e) <strong>in</strong> which<br />
each cell is identified as either “on” or “<strong>of</strong>f”. Such <strong>po<strong>in</strong>t</strong> <strong>patterns</strong> may be regarded as<br />
consist<strong>in</strong>g <strong>of</strong> <strong>po<strong>in</strong>t</strong>s <strong>of</strong> different “types”. The marks component must be a factor such<br />
that marks[i] is the label or type <strong>of</strong> the ith <strong>po<strong>in</strong>t</strong>. We call this a multitype <strong>po<strong>in</strong>t</strong> pattern<br />
<strong>and</strong> the levels <strong>of</strong> the factor are the possible types.<br />
[Figure: the longleaf and amacrine point patterns]<br />
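The two kinds of marks can be illustrated with a small constructed example. The coordinates and mark values below are made up for illustration, and the sketch assumes spatstat is installed.<br />

```r
library(spatstat)

# Toy coordinates in the unit square (made up for illustration).
x <- c(0.1, 0.4, 0.7, 0.9)
y <- c(0.2, 0.8, 0.5, 0.3)

# Multitype pattern: the marks are a factor whose levels are the types.
X <- ppp(x, y, window = square(1),
         marks = factor(c("on", "off", "on", "off")))

# Pattern with continuous marks: the marks are a numeric vector.
Y <- ppp(x, y, window = square(1), marks = c(12.5, 30.1, 8.4, 22.0))

is.multitype(X)    # TRUE: the marks are a factor
levels(marks(X))   # the possible types
is.multitype(Y)    # FALSE: the marks are numeric
marks(Y)[2]        # mark value attached to the second point
```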
Note that, <strong>in</strong> some other packages, a <strong>po<strong>in</strong>t</strong> pattern dataset consist<strong>in</strong>g <strong>of</strong> <strong>po<strong>in</strong>t</strong>s <strong>of</strong> two different<br />
types (A <strong>and</strong> B say) is represented by two datasets, one represent<strong>in</strong>g the <strong>po<strong>in</strong>t</strong>s <strong>of</strong> type A <strong>and</strong><br />
another conta<strong>in</strong><strong>in</strong>g the <strong>po<strong>in</strong>t</strong>s <strong>of</strong> type B. In spatstat we take a different approach, <strong>in</strong> which<br />
all the <strong>po<strong>in</strong>t</strong>s are collected together <strong>in</strong> one <strong>po<strong>in</strong>t</strong> pattern, <strong>and</strong> the <strong>po<strong>in</strong>t</strong>s are then labelled by<br />
the type to which they belong. An advantage <strong>of</strong> this approach is that it is easy to deal with<br />
multitype <strong>po<strong>in</strong>t</strong> <strong>patterns</strong> with more than 2 types. For example the classic Lans<strong>in</strong>g Woods dataset<br />
represents the positions <strong>of</strong> trees <strong>of</strong> 6 different species. This is available <strong>in</strong> spatstat as a s<strong>in</strong>gle<br />
dataset, a marked <strong>po<strong>in</strong>t</strong> pattern, with the marks hav<strong>in</strong>g 6 levels.<br />
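The single-dataset representation can be sketched in base R, without spatstat. The coordinates and the names typeA, typeB and both below are invented for illustration; in spatstat the labels would be stored as the marks of a point pattern object rather than as a data frame column.<br />

```r
# Hypothetical coordinates for two types of points (values invented for illustration)
typeA <- data.frame(x = c(0.1, 0.4, 0.7), y = c(0.2, 0.9, 0.5))
typeB <- data.frame(x = c(0.3, 0.8), y = c(0.6, 0.1))

# One dataset: all points collected together, labelled by a factor giving the type
both <- rbind(typeA, typeB)
both$type <- factor(rep(c("A", "B"), times = c(nrow(typeA), nrow(typeB))))

# The two-dataset representation can always be recovered by splitting on the label
separate <- split(both[, c("x", "y")], both$type)

print(table(both$type))
```

Adding a third type only adds a level to the factor; nothing else in the representation changes.<br />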
23.2 Formulation<br />
A mark variable may be <strong>in</strong>terpreted as an additional coord<strong>in</strong>ate for the <strong>po<strong>in</strong>t</strong>: for example<br />
a <strong>po<strong>in</strong>t</strong> process <strong>of</strong> earthquake epicentre locations (longitude, latitude), with marks giv<strong>in</strong>g the<br />
occurrence time <strong>of</strong> each earthquake, can alternatively be viewed as a <strong>po<strong>in</strong>t</strong> process <strong>in</strong> space-time<br />
with coord<strong>in</strong>ates (longitude, latitude, time).<br />
A marked <strong>po<strong>in</strong>t</strong> process <strong>of</strong> <strong>po<strong>in</strong>t</strong>s <strong>in</strong> space S with marks belong<strong>in</strong>g to a set M is mathematically<br />
def<strong>in</strong>ed as a <strong>po<strong>in</strong>t</strong> process <strong>in</strong> the cartesian product S × M. The space M <strong>of</strong> possible marks<br />
may be ‘anyth<strong>in</strong>g’. In current applications, typically the mark is either a categorical variable<br />
(so that the <strong>po<strong>in</strong>t</strong>s are grouped <strong>in</strong>to ‘types’) or a real number. Multivariate marks consist<strong>in</strong>g <strong>of</strong><br />
several such variables are also common.<br />
A marked <strong>po<strong>in</strong>t</strong> pattern is an unordered set<br />
$$y = \{(x_1, m_1), \ldots, (x_n, m_n)\}, \qquad x_i \in W, \; m_i \in M$$<br />
where xi are the locations <strong>and</strong> mi are the correspond<strong>in</strong>g marks.<br />
23.3 Methodological issues<br />
23.3.1 Should the data be treated as a marked <strong>po<strong>in</strong>t</strong> process?<br />
In a marked <strong>po<strong>in</strong>t</strong> process the <strong>po<strong>in</strong>t</strong>s are r<strong>and</strong>om. Treat<strong>in</strong>g the data as a <strong>po<strong>in</strong>t</strong> process is<br />
<strong>in</strong>appropriate if the locations are fixed, or if the locations are not part <strong>of</strong> the ‘response’.<br />
Example 16 Today’s maximum temperatures at 25 Australian cities are displayed on a map.<br />
This is not a <strong>po<strong>in</strong>t</strong> process <strong>in</strong> any useful sense. The cities are fixed locations. The temperatures<br />
are observations <strong>of</strong> a <strong>spatial</strong> variable at a fixed set <strong>of</strong> locations. See the R packages sp,<br />
spdep, spgwr for suitable methods.<br />
Example 17 A m<strong>in</strong>eral exploration dataset records the map coord<strong>in</strong>ates where 15 core samples<br />
were drilled, <strong>and</strong> for each core sample, the assayed concentration <strong>of</strong> iron <strong>in</strong> the sample.<br />
This should not be treated as a <strong>po<strong>in</strong>t</strong> process. The core sample locations were chosen by a<br />
geologist, <strong>and</strong> are part <strong>of</strong> the experimental design. The ma<strong>in</strong> <strong>in</strong>terest is <strong>in</strong> the iron concentration<br />
at these locations. This should probably be analysed as a geostatistical dataset. See the R<br />
packages geoR, geoRglm for suitable methods.<br />
23.3.2 Jo<strong>in</strong>t vs. conditional analysis<br />
There are more choices for analysis (<strong>and</strong> more traps) when marks are present. Schematically, if<br />
we write X for the <strong>po<strong>in</strong>t</strong>s <strong>and</strong> M for the marks, then a statistical model for the marked <strong>po<strong>in</strong>t</strong><br />
pattern could be formulated <strong>in</strong> several ways:<br />
• [X] [M|X] — ‘conditional on locations’ — <strong>po<strong>in</strong>t</strong>s X are first generated accord<strong>in</strong>g to a<br />
<strong>spatial</strong> <strong>po<strong>in</strong>t</strong> process, then marks M are ‘assigned’ to the <strong>po<strong>in</strong>t</strong>s by a r<strong>and</strong>om mechanism<br />
[M|X];<br />
• [M] [X|M] — ‘conditional on marks’ or ‘split by marks’ — marks M are first generated<br />
accord<strong>in</strong>g to some r<strong>and</strong>om mechanism [M], then they are placed at certa<strong>in</strong> locations X by<br />
<strong>po<strong>in</strong>t</strong> process(es) [X|M];<br />
• [X,M] — ‘jo<strong>in</strong>t’ — marked <strong>po<strong>in</strong>t</strong>s are generated accord<strong>in</strong>g to a marked <strong>po<strong>in</strong>t</strong> process.<br />
These approaches typically lead to different stochastic models <strong>and</strong> have different <strong>in</strong>ferential<br />
<strong>in</strong>terpretations. Correspond<strong>in</strong>gly, there are different null hypotheses that can be tested:<br />
• r<strong>and</strong>om labell<strong>in</strong>g: given the locations X, the marks are conditionally <strong>in</strong>dependent <strong>and</strong><br />
identically distributed;<br />
• <strong>in</strong>dependence <strong>of</strong> components: the sub-processes Xm <strong>of</strong> <strong>po<strong>in</strong>t</strong>s <strong>of</strong> each mark m, are <strong>in</strong>dependent<br />
<strong>po<strong>in</strong>t</strong> processes;<br />
• complete <strong>spatial</strong> r<strong>and</strong>omness <strong>and</strong> <strong>in</strong>dependence (CSRI): the locations X are a uniform<br />
Poisson <strong>po<strong>in</strong>t</strong> process, <strong>and</strong> the marks are <strong>in</strong>dependent <strong>and</strong> identically distributed. (This<br />
implies both r<strong>and</strong>om labell<strong>in</strong>g <strong>and</strong> <strong>in</strong>dependence <strong>of</strong> components).<br />
These null hypotheses are not equivalent. In particular, random labelling and independence of<br />
components are different properties. For<br />
example, take a <strong>po<strong>in</strong>t</strong> process X where nearest neighbour distances are always larger than a<br />
threshold r, <strong>and</strong> attach r<strong>and</strong>om marks to the <strong>po<strong>in</strong>t</strong>s. The result<strong>in</strong>g marked <strong>po<strong>in</strong>t</strong> process cannot<br />
be generated us<strong>in</strong>g the <strong>in</strong>dependence construction, because if <strong>po<strong>in</strong>t</strong>s with different marks are<br />
<strong>in</strong>dependent, they can come arbitrarily close to one another.<br />
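The counterexample above can be checked by simulation. The following is a rough base R sketch, not a spatstat workflow. It assumes a unit square window, a hard-core distance r = 0.05, and a fixed number of points per pattern (a binomial stand-in for a Poisson process); the helper crossdist_xy is a hypothetical name introduced here.<br />

```r
set.seed(1)
r <- 0.05   # hard-core distance (assumed value)
n <- 100    # number of points in each simulated pattern (assumed value)

# Euclidean distances between every row of a and every row of b
crossdist_xy <- function(a, b) {
  sqrt(outer(a[, 1], b[, 1], "-")^2 + outer(a[, 2], b[, 2], "-")^2)
}

# (1) Hard-core pattern by simple sequential inhibition in the unit square:
#     a proposal is kept only if it is farther than r from every point kept so far.
pts <- matrix(numeric(0), ncol = 2)
while (nrow(pts) < n) {
  p <- matrix(runif(2), ncol = 2)
  if (nrow(pts) == 0 || all(crossdist_xy(pts, p) > r)) {
    pts <- rbind(pts, p)
  }
}
# Random labelling: the marks are i.i.d. given the locations
lab <- sample(c("A", "B"), n, replace = TRUE)
min_labelled <- min(crossdist_xy(pts[lab == "A", , drop = FALSE],
                                 pts[lab == "B", , drop = FALSE]))

# (2) Independence of components: two independent uniform patterns
A <- cbind(runif(n), runif(n))
B <- cbind(runif(n), runif(n))
min_indep <- min(crossdist_xy(A, B))

cat("labelled hard-core pattern, closest A-B pair farther than r:", min_labelled > r, "\n")
cat("independent components, closest A-B pair farther than r:", min_indep > r, "\n")
```

In the randomly labelled hard-core pattern the closest pair of points of different types is always farther apart than r, by construction. With independent components nothing prevents closer pairs, and with this many points they are almost certain to occur.<br />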
Example 18 (Ant nests data) Two species <strong>of</strong> ants build nests <strong>in</strong> a desert. We want to <strong>in</strong>vestigate<br />
ecological <strong>in</strong>teraction between the species, <strong>and</strong> between different nests <strong>of</strong> the same species.<br />
The locations <strong>of</strong> all nests are mapped, <strong>and</strong> marked by the species.<br />
These data can be analysed as a marked <strong>po<strong>in</strong>t</strong> process consist<strong>in</strong>g <strong>of</strong> two different types <strong>of</strong><br />
<strong>po<strong>in</strong>t</strong>s. The ‘mark’ attached to each <strong>po<strong>in</strong>t</strong> is its species (a categorical variable). The most<br />
natural k<strong>in</strong>d <strong>of</strong> modell<strong>in</strong>g <strong>and</strong> analysis is either jo<strong>in</strong>t [X,M] or split by species [M] [X|M]. We<br />
could also treat one <strong>of</strong> the species as a covariate <strong>and</strong> analyse the other species conditional on it.<br />
Example 19 Trees <strong>in</strong> an orchard are exam<strong>in</strong>ed <strong>and</strong> their disease status (<strong>in</strong>fected/not <strong>in</strong>fected)<br />
is recorded. We are <strong>in</strong>terested <strong>in</strong> the <strong>spatial</strong> characteristics <strong>of</strong> the disease, such as contagion<br />
between neighbour<strong>in</strong>g trees.<br />
These data probably should not be treated as a <strong>po<strong>in</strong>t</strong> process. The response is ‘disease<br />
status’. We can th<strong>in</strong>k <strong>of</strong> disease status as a label applied to the trees after their locations have<br />
been determ<strong>in</strong>ed. S<strong>in</strong>ce we are <strong>in</strong>terested <strong>in</strong> the <strong>spatial</strong> correlation <strong>of</strong> disease status, the tree<br />
locations are effectively fixed covariate values. It would probably be best to treat these data<br />
as a discrete r<strong>and</strong>om field (<strong>of</strong> disease status values) observed at a f<strong>in</strong>ite known set <strong>of</strong> sites (the<br />
trees).<br />
23.3.3 Grey areas<br />
There are some ‘grey areas’ which permit several alternative choices <strong>of</strong> analysis. It could be<br />
appropriate either to analyse the locations <strong>and</strong> marks jo<strong>in</strong>tly (denoted [X,M]), or to analyse<br />
the marks conditional on the locations ([M|X]) or to analyse the locations given the marks<br />
([X|M]).<br />
One grey area occurs when the locations are r<strong>and</strong>om, but may be ancillary for the parameters<br />
<strong>of</strong> <strong>in</strong>terest.<br />
Example 20 Case-control study <strong>of</strong> cancer [22, 26]. The domicile locations <strong>of</strong> all new cases<br />
<strong>of</strong> a rare cancer are mapped. To allow for <strong>spatial</strong> variation <strong>in</strong> the density <strong>of</strong> the susceptible<br />
population, domicile locations are recorded for a r<strong>and</strong>om sample <strong>of</strong> (matched) controls.<br />
This can be analysed either as a marked <strong>po<strong>in</strong>t</strong> pattern (where the mark is the case/control<br />
label) or, by condition<strong>in</strong>g on locations, as a r<strong>and</strong>om field <strong>of</strong> case/control values attached to the<br />
known domicile locations.<br />
[Figure: the Chorley-Ribble data]<br />
24 H<strong>and</strong>l<strong>in</strong>g marked <strong>po<strong>in</strong>t</strong> pattern data<br />
This section expla<strong>in</strong>s how to create a marked <strong>po<strong>in</strong>t</strong> pattern dataset <strong>in</strong> spatstat, <strong>and</strong> how to<br />
manipulate it.<br />
24.1 Creat<strong>in</strong>g datasets<br />
In spatstat version 1, each <strong>po<strong>in</strong>t</strong> <strong>in</strong> a <strong>po<strong>in</strong>t</strong> pattern can be marked with a s<strong>in</strong>gle value (i.e.<br />
one mark value per <strong>po<strong>in</strong>t</strong>). The marks are stored <strong>in</strong> a vector, <strong>of</strong> the same length as the number<br />
<strong>of</strong> <strong>po<strong>in</strong>t</strong>s. The marks can be <strong>of</strong> any atomic type: numeric, <strong>in</strong>teger, character, factor, logical or<br />
complex.<br />
A marked <strong>po<strong>in</strong>t</strong> pattern dataset can be created us<strong>in</strong>g any <strong>of</strong> the follow<strong>in</strong>g tools:<br />
ppp create <strong>po<strong>in</strong>t</strong> pattern dataset<br />
as.ppp convert other data to <strong>po<strong>in</strong>t</strong> pattern<br />
superimpose comb<strong>in</strong>e several <strong>po<strong>in</strong>t</strong> <strong>patterns</strong> <strong>in</strong>to a marked <strong>po<strong>in</strong>t</strong> pattern<br />
marks extract marks from a <strong>po<strong>in</strong>t</strong> pattern<br />
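marks<- change the marks of a point pattern<br />
A marked point pattern can be created directly with ppp, using a call of the form<br />
ppp(x, y, ..., marks = m)<br />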
where x, y <strong>and</strong> m are vectors <strong>of</strong> equal length conta<strong>in</strong><strong>in</strong>g the (x,y) coord<strong>in</strong>ates <strong>and</strong> the correspond<strong>in</strong>g<br />
mark values, <strong>and</strong> ... are arguments that determ<strong>in</strong>e the w<strong>in</strong>dow for the <strong>po<strong>in</strong>t</strong> pattern.<br />
Tip: If the marks are <strong>in</strong>tended to be a categorical variable (represent<strong>in</strong>g the types<br />
<strong>in</strong> a multitype <strong>po<strong>in</strong>t</strong> pattern),<br />
• ensure that m is stored as a factor <strong>in</strong> R.<br />
• when the <strong>po<strong>in</strong>t</strong> pattern X has been created, check that it is multitype us<strong>in</strong>g<br />
is.multitype(X).<br />
• check that the factor levels are as you intended, using levels(m) or levels(marks(X))<br />
where X is the marked <strong>po<strong>in</strong>t</strong> pattern. If the factor levels are character str<strong>in</strong>gs,<br />
they will be sorted <strong>in</strong>to alphabetical order by default.<br />
• be careful when perform<strong>in</strong>g equality/<strong>in</strong>equality comparisons <strong>in</strong>volv<strong>in</strong>g a factor.<br />
Particular danger occurs when the factor levels are str<strong>in</strong>gs that represent<br />
<strong>in</strong>tegers.<br />
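The last two points of the tip can be seen in a small base R example. The mark values here are invented for illustration.<br />

```r
# A factor whose levels are strings that look like integers
m <- factor(c("10", "2", "33", "2"))

levels(m)                              # "10" "2" "33": sorted as strings, not as numbers
codes  <- as.integer(m)                # 1 2 3 2: internal level codes, NOT the numbers
values <- as.integer(as.character(m))  # 10 2 33 2: the numbers that the labels represent

# Comparing against the label (a string) is safe
is_two <- (m == "2")
```

Converting a factor directly with as.integer or as.numeric returns the level codes, so any numerical comparison should go through as.character first.<br />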
The command as.ppp will convert data in another format (for example, a 2-column or 3-column<br />
matrix or data frame) to a point pattern object of class "ppp". The third column of a<br />
matrix or data frame will be interpreted as containing the marks.<br />
> mydata <- ...<br />
> as.ppp(mydata, square(1))<br />
marked planar <strong>po<strong>in</strong>t</strong> pattern: 10 <strong>po<strong>in</strong>t</strong>s<br />
multitype, with levels = a b c<br />
w<strong>in</strong>dow: rectangle = [0, 1] x [0, 1] units<br />
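As an illustration of this kind of conversion, here is a minimal sketch using a made-up data frame. The name mydf and its contents are hypothetical and are not the data used above. The marks column is created explicitly as a factor so that the result is multitype.<br />

```r
library(spatstat)

# Hypothetical data frame: columns x, y, then a factor column of marks
mydf <- data.frame(
  x = seq(0.05, 0.95, length.out = 10),
  y = rep(c(0.25, 0.75), times = 5),
  m = factor(rep(c("a", "b", "c"), length.out = 10))
)

# Convert to a point pattern in the unit square; the third column becomes the marks
P <- as.ppp(mydf, square(1))

is.multitype(P)     # TRUE when the marks are a factor
levels(marks(P))    # the possible types
```

Checking is.multitype and the factor levels immediately after conversion follows the tip given earlier in this section.<br />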
If <strong>po<strong>in</strong>t</strong> pattern data are stored <strong>in</strong> a text file, the comm<strong>and</strong> scanpp will read the data <strong>and</strong><br />
create a <strong>po<strong>in</strong>t</strong> pattern object <strong>of</strong> class "ppp". The argument multitype=TRUE will ensure that<br />
the mark values are <strong>in</strong>terpreted as a factor.<br />
> X <- ...<br />
24.2 Inspecting a marked point pattern<br />
> amacrine<br />
marked planar <strong>po<strong>in</strong>t</strong> pattern: 294 <strong>po<strong>in</strong>t</strong>s<br />
multitype, with levels = <strong>of</strong>f on<br />
w<strong>in</strong>dow: rectangle = [0, 1.6012] x [0, 1] units (one unit = 662 microns)<br />
> summary(amacr<strong>in</strong>e)<br />
Marked planar <strong>po<strong>in</strong>t</strong> pattern: 294 <strong>po<strong>in</strong>t</strong>s<br />
Average <strong>in</strong>tensity 184 <strong>po<strong>in</strong>t</strong>s per square unit (one unit = 662 microns)<br />
Multitype:<br />
frequency proportion <strong>in</strong>tensity<br />
<strong>of</strong>f 142 0.483 88.7<br />
on 152 0.517 94.9<br />
W<strong>in</strong>dow: rectangle = [0, 1.6012] x [0, 1] units<br />
W<strong>in</strong>dow area = 1.60121 square units<br />
Unit <strong>of</strong> length: 662 microns<br />
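The entries of the multitype table are simple functions of the counts and the window area. The base R sketch below reproduces them from the frequencies and area printed above; the object names are introduced here for illustration only.<br />

```r
# Frequencies and window area taken from the summary output above
counts <- c(off = 142, on = 152)
area   <- 1.60121

tab <- data.frame(
  frequency  = counts,
  proportion = counts / sum(counts),  # share of each type among all points
  intensity  = counts / area          # points of each type per square unit
)
print(round(tab, 3))

# Average intensity of the whole pattern, ignoring marks
overall <- sum(counts) / area
```

Rounded to three significant figures these values agree with the printed summary.<br />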
> plot(amacr<strong>in</strong>e)<br />
[Figure: plot of amacrine; legend: off, on]<br />
You can also convert a marked <strong>po<strong>in</strong>t</strong> pattern <strong>in</strong>to a data frame for closer <strong>in</strong>spection <strong>of</strong> the<br />
coord<strong>in</strong>ates <strong>and</strong> mark values:<br />
> as.data.frame(amacr<strong>in</strong>e)<br />
x y marks<br />
1 0.0224 0.0243 on<br />
2 0.0243 0.1028 on<br />
3 0.1626 0.1477 on<br />
........<br />
The marks can be extracted us<strong>in</strong>g the function marks:<br />
> data(longleaf)<br />
> m <- marks(longleaf)<br />
24.3 Manipulat<strong>in</strong>g data<br />
24.3.1 Manipulat<strong>in</strong>g marks<br />
The follow<strong>in</strong>g tools can manipulate the marks <strong>in</strong> a <strong>po<strong>in</strong>t</strong> pattern:<br />
marks extract marks<br />
marks<- change marks<br />
> data(lansing)<br />
> d <- ...<br />
> a <- ...<br />
> marks(lansing) <- ...<br />
> data(amacrine)<br />
> Y <- ...<br />
> plot(split(amacrine))<br />
[Figure: split(amacrine), panels: off, on]<br />
24.3.3 Cutt<strong>in</strong>g the numerical scale <strong>in</strong>to b<strong>and</strong>s<br />
For a <strong>po<strong>in</strong>t</strong> pattern with numeric marks, the marks can be converted to a factor, us<strong>in</strong>g a method<br />
for the generic function cut. The user specifies a series <strong>of</strong> cut-<strong>po<strong>in</strong>t</strong>s on the numerical scale; all<br />
mark values between two cut-<strong>po<strong>in</strong>t</strong>s are given the same label.<br />
For example, the Longleaf P<strong>in</strong>es data are the locations <strong>of</strong> trees marked with their diameter<br />
at breast height, dbh, <strong>in</strong> centimetres. By convention we def<strong>in</strong>e “adult” trees to be those with<br />
dbh greater than 30 centimetres. To obta<strong>in</strong> the bivariate <strong>po<strong>in</strong>t</strong> pattern <strong>of</strong> adult <strong>and</strong> juvenile<br />
trees,<br />
> data(longleaf)<br />
> longleaf<br />
marked planar <strong>po<strong>in</strong>t</strong> pattern: 584 <strong>po<strong>in</strong>t</strong>s<br />
marks are numeric, <strong>of</strong> type ’double’<br />
w<strong>in</strong>dow: rectangle = [0, 200] x [0, 200] metres<br />
> X <- cut(longleaf, breaks = c(0, 30, 100), labels = c("juvenile", "adult"))<br />
> X<br />
marked planar <strong>po<strong>in</strong>t</strong> pattern: 584 <strong>po<strong>in</strong>t</strong>s<br />
multitype, with levels = juvenile adult<br />
w<strong>in</strong>dow: rectangle = [0, 200] x [0, 200] metres<br />
> par(mfrow = c(1, 2))<br />
> plot(longleaf)<br />
> plot(X, main = "cut(longleaf)")<br />
> par(mfrow = c(1, 1))<br />
[Figure: two panels, "longleaf" (legend for marks from 0 to 80) and "cut(longleaf)" (legend: juvenile, adult)]<br />
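The behaviour of the cut-points can be checked on a plain numeric vector using the base R function cut. The diameters below are hypothetical, and the infinite upper break is a choice made for this sketch.<br />

```r
# Hypothetical diameters at breast height, in centimetres
dbh <- c(12.5, 30, 30.1, 75)

# Intervals are right-closed by default: (0, 30] is "juvenile", (30, Inf] is "adult"
stage <- cut(dbh, breaks = c(0, 30, Inf), labels = c("juvenile", "adult"))

print(data.frame(dbh, stage))
```

A tree with dbh exactly 30 falls in the first band, which agrees with defining adults as trees with dbh greater than 30.<br />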
25 Methods 11: exploratory tools for marked <strong>po<strong>in</strong>t</strong> <strong>patterns</strong><br />
This section covers some tools for exploratory data analysis <strong>of</strong> marked <strong>po<strong>in</strong>t</strong> <strong>patterns</strong>. Most <strong>of</strong><br />
the tools have been developed for the special case <strong>of</strong> multitype <strong>po<strong>in</strong>t</strong> <strong>patterns</strong> (i.e. where the<br />
marks are categorical).<br />
25.1 Intensity<br />
The Lans<strong>in</strong>g Woods data give the locations <strong>of</strong> 6 species <strong>of</strong> trees <strong>in</strong> a forest <strong>in</strong> Michigan. Elementary<br />
estimates <strong>of</strong> the frequency distribution <strong>of</strong> species, <strong>and</strong> the <strong>in</strong>tensity <strong>of</strong> each species, are<br />
available from summary.ppp.<br />
> data(lans<strong>in</strong>g)<br />
> summary(lans<strong>in</strong>g)<br />
Marked planar <strong>po<strong>in</strong>t</strong> pattern: 2251 <strong>po<strong>in</strong>t</strong>s<br />
Average <strong>in</strong>tensity 2250 <strong>po<strong>in</strong>t</strong>s per square unit (one unit = 924 feet)<br />
*Pattern conta<strong>in</strong>s duplicated <strong>po<strong>in</strong>t</strong>s*<br />
Multitype:<br />
frequency proportion <strong>in</strong>tensity<br />
blackoak 135 0.0600 135<br />
hickory 703 0.3120 703<br />
maple 514 0.2280 514<br />
misc 105 0.0466 105<br />
redoak 346 0.1540 346<br />
whiteoak 448 0.1990 448<br />
W<strong>in</strong>dow: rectangle = [0, 1] x [0, 1] units<br />
W<strong>in</strong>dow area = 1 square unit<br />
Unit <strong>of</strong> length: 924 feet<br />
It’s sensible to exam<strong>in</strong>e the sub-<strong>patterns</strong> <strong>of</strong> different types separately, us<strong>in</strong>g split.ppp.<br />
> plot(split(lans<strong>in</strong>g))<br />
[Figure: split(lansing), panels: blackoak, hickory, maple, misc, redoak, whiteoak]<br />
It would be useful to compute <strong>and</strong> plot a separate estimate <strong>of</strong> <strong>in</strong>tensity for each type <strong>of</strong> tree.<br />
This is possible us<strong>in</strong>g the functions density.splitppp <strong>and</strong> plot.list<strong>of</strong>. They are <strong>in</strong>voked<br />
simply by typ<strong>in</strong>g<br />
> plot(density(split(lans<strong>in</strong>g)), ribbon = FALSE)<br />
[Figure: density(split(lansing)), panels: blackoak, hickory, maple, misc, redoak, whiteoak]<br />
The relative proportions <strong>of</strong> <strong>in</strong>tensity can then be computed us<strong>in</strong>g eval.im:<br />
> Y <- density(split(lansing))<br />
> attach(Y)<br />
> pBlackoak <- eval.im(blackoak/(blackoak + hickory + maple + misc + redoak + whiteoak))<br />
> plot(pBlackoak)<br />
> detach(Y)<br />
[Figure: pBlackoak, scale ticks at 0.05, 0.1, 0.15]<br />
Parametric estimates <strong>of</strong> <strong>in</strong>tensity can be obta<strong>in</strong>ed us<strong>in</strong>g ppm, fitt<strong>in</strong>g a Poisson model with<br />
an <strong>in</strong>tensity function that may depend on location <strong>and</strong>/or on the marks. See below.<br />
25.2 Numeric marks: distribution <strong>and</strong> trend<br />
For a <strong>po<strong>in</strong>t</strong> pattern with marks that are numeric (real numbers or <strong>in</strong>tegers) or logical values,<br />
the mark values can be extracted us<strong>in</strong>g the marks function <strong>and</strong> <strong>in</strong>spected us<strong>in</strong>g the histogram<br />
or kernel density estimate:<br />
> data(longleaf)<br />
> hist(marks(longleaf))<br />
[Figure: histogram of marks(longleaf)]<br />
To assess spatial trend in the marks, one way is to form a kernel regression smoother. The<br />
smoothed mark value at location u ∈ R² is<br />
$$m(u) = \frac{\sum_i m_i \, \kappa(u - x_i)}{\sum_i \kappa(u - x_i)}$$<br />
where κ is the smoothing kernel, and mi is the mark value at data point xi. This is computed<br />
by smooth.ppp:<br />
> plot(smooth.ppp(longleaf))<br />
[Figure: smooth.ppp(longleaf), scale ticks from 15 to 40]<br />
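To make the smoother concrete, here is a minimal base R sketch that evaluates the ratio at a single location u. It assumes an isotropic Gaussian kernel with standard deviation sigma. The function name kernel_smooth_marks and the data are hypothetical, and no claim is made that spatstat uses the same kernel, bandwidth or edge handling.<br />

```r
kernel_smooth_marks <- function(u, x, y, m, sigma) {
  # Gaussian kernel weights; the normalising constant cancels in the ratio
  w <- exp(-((u[1] - x)^2 + (u[2] - y)^2) / (2 * sigma^2))
  sum(m * w) / sum(w)
}

# Hypothetical data: four points at the corners of a square, with numeric marks
x <- c(0.2, 0.8, 0.2, 0.8)
y <- c(0.2, 0.2, 0.8, 0.8)
m <- c(10, 20, 30, 40)

# At the centre all four points are equally distant, so the result is the plain mean
centre <- kernel_smooth_marks(c(0.5, 0.5), x, y, m, sigma = 0.3)

# With a narrow kernel, the value at a data point is dominated by that point's mark
near_first <- kernel_smooth_marks(c(0.2, 0.2), x, y, m, sigma = 0.05)
```

The smoothed value is a weighted average of the marks, so it always lies between the smallest and largest mark.<br />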
You can also use cut.ppp followed by split.ppp to look for <strong>spatial</strong> <strong>in</strong>homogeneity <strong>of</strong> the<br />
marks:<br />
> data(spruces)<br />
> plot(split(cut(spruces, breaks = 3)))<br />
[Figure: split(cut(spruces, breaks = 3)), panels: (0.16,0.23], (0.23,0.3], (0.3,0.37]]<br />
25.3 Simple summaries <strong>of</strong> neighbour<strong>in</strong>g marks<br />
We are <strong>of</strong>ten <strong>in</strong>terested <strong>in</strong> the marks that are attached to the close neighbours <strong>of</strong> a typical <strong>po<strong>in</strong>t</strong>.<br />
For a multitype <strong>po<strong>in</strong>t</strong> pattern, the function marktable compiles a cont<strong>in</strong>gency table <strong>of</strong> the<br />
marks <strong>of</strong> all <strong>po<strong>in</strong>t</strong>s with<strong>in</strong> a given radius <strong>of</strong> each data <strong>po<strong>in</strong>t</strong>:<br />
> data(amacr<strong>in</strong>e)<br />
> M <- marktable(amacrine, R = 0.1)<br />
> M[1:10, ]<br />
mark<br />
<strong>po<strong>in</strong>t</strong> <strong>of</strong>f on<br />
1 1 1<br />
2 2 2<br />
3 4 3<br />
4 3 1<br />
5 4 1<br />
6 2 3<br />
7 3 2<br />
8 1 1<br />
9 3 1<br />
10 3 2<br />
More general summaries <strong>of</strong> the marks <strong>of</strong> neighbours can be obta<strong>in</strong>ed us<strong>in</strong>g the function<br />
markstat. For example, to compute the average diameter <strong>of</strong> the 5 closest neighbours <strong>of</strong> each<br />
tree <strong>in</strong> the Longleaf P<strong>in</strong>es dataset,<br />
> md <- markstat(longleaf, mean, N = 5)<br />
> md[1:10]<br />
[1] 43.40 43.40 48.58 21.70 48.38 53.32 40.28 29.82 24.92 21.70<br />
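The two kinds of neighbour summary can be imitated in base R from a matrix of pairwise distances. This is only a sketch of the idea on simulated data: the variable names, the radius and the decision to exclude each point from its own neighbourhood are assumptions made here, not a description of how marktable and markstat are implemented.<br />

```r
set.seed(42)
n <- 50
x <- runif(n)
y <- runif(n)
type <- factor(sample(c("off", "on"), n, replace = TRUE), levels = c("off", "on"))
dbh  <- runif(n, min = 5, max = 60)   # hypothetical numeric marks

# Pairwise distances; a point is not counted as its own neighbour
D <- as.matrix(dist(cbind(x, y)))
diag(D) <- Inf

# Contingency table: for each point, how many neighbours of each type lie within the radius
radius <- 0.15
tab <- t(vapply(seq_len(n),
                function(i) as.integer(table(type[D[i, ] <= radius])),
                integer(nlevels(type))))
colnames(tab) <- levels(type)

# Mean mark of the k nearest neighbours of each point
k <- 5
md <- apply(D, 1, function(d) mean(dbh[order(d)[1:k]]))

print(tab[1:5, ])
print(round(md[1:5], 2))
```

Any other summary, such as the maximum or the variance, could replace mean in the last step.<br />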
25.4 Summary functions<br />
The summary functions F, G, J <strong>and</strong> K (<strong>and</strong> other functions derived from K, such as L <strong>and</strong> the<br />
pair correlation function) have been extended to multitype <strong>po<strong>in</strong>t</strong> <strong>patterns</strong>.<br />
25.4.1 A pair <strong>of</strong> types<br />
Assume the multitype <strong>po<strong>in</strong>t</strong> process X is stationary. Let Xj denote the sub-pattern <strong>of</strong> <strong>po<strong>in</strong>t</strong>s <strong>of</strong><br />
type j, with <strong>in</strong>tensity λj. Then for any pair <strong>of</strong> types i <strong>and</strong> j,<br />
• Fj(r) is the empty space function for Xj.<br />
• Gij(r) is the distribution function <strong>of</strong> the distance from a <strong>po<strong>in</strong>t</strong> <strong>of</strong> type i to the nearest<br />
<strong>po<strong>in</strong>t</strong> <strong>of</strong> type j<br />
• Kij(r) is 1/λj times the expected number <strong>of</strong> <strong>po<strong>in</strong>t</strong>s <strong>of</strong> type j with<strong>in</strong> a distance r <strong>of</strong> a<br />
typical <strong>po<strong>in</strong>t</strong> <strong>of</strong> type i.<br />
• Lij(r) is the corresponding L-function<br />
$$L_{ij}(r) = \sqrt{\frac{K_{ij}(r)}{\pi}}.$$<br />
• gij(r) is the corresponding analogue of the pair correlation function<br />
$$g_{ij}(r) = \frac{K'_{ij}(r)}{2 \pi r}$$<br />
where K′ij(r) is the derivative of Kij.<br />
• Jij is defined as<br />
$$J_{ij}(r) = \frac{1 - G_{ij}(r)}{1 - F_j(r)}.$$<br />
The functions Gij,Kij,Lij,gij,Jij are called “cross-type” or “i-to-j” summary functions. They<br />
are computed <strong>in</strong> spatstat by Gcross, Kcross, Lcross, pcfcross <strong>and</strong> Jcross respectively.<br />
> data(amacr<strong>in</strong>e)<br />
> amacr<strong>in</strong>e<br />
> plot(Gcross(amacr<strong>in</strong>e, "on", "<strong>of</strong>f"))<br />
[Figure: Gcross(amacrine, "on", "off"), plotted against r (one unit = 662 microns)]<br />
The <strong>in</strong>terpretation <strong>of</strong> the cross-type summary functions is similar, but not identical, to that<br />
<strong>of</strong> the orig<strong>in</strong>al functions F, G, K etc:<br />
• if Xj is a uniform Poisson process (CSR), then Fj(r) = 1 − exp(−λj π r²).<br />
• if Xj is a uniform Poisson process (CSR) and is independent of Xi, then Gij(r) = 1 − exp(−λj π r²).<br />
• if Xi and Xj are independent, then Kij(r) = π r² and so Lij(r) = r and gij(r) = 1.<br />
• if Xi and Xj are independent, then Jij(r) = 1.<br />
Here ‘<strong>in</strong>dependent’ means that the two <strong>po<strong>in</strong>t</strong> processes are probabilistically <strong>in</strong>dependent.<br />
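The benchmark for Gij can be explored with a small simulation in base R. The sketch below makes several simplifying assumptions: the window is the unit square, the two patterns have fixed numbers of points (binomial stand-ins for Poisson processes), and no edge correction is applied, so the empirical curve is only a rough estimate. All names and values are introduced here for illustration.<br />

```r
set.seed(2008)
lambda_i <- 150   # number of type i points in the unit square (assumed value)
lambda_j <- 200   # number of type j points in the unit square (assumed value)

# Two independent uniform patterns
Xi <- cbind(runif(lambda_i), runif(lambda_i))
Xj <- cbind(runif(lambda_j), runif(lambda_j))

# Distance from each type i point to its nearest type j point
d <- sqrt(outer(Xi[, 1], Xj[, 1], "-")^2 + outer(Xi[, 2], Xj[, 2], "-")^2)
nn_ij <- apply(d, 1, min)

# Uncorrected empirical estimate of Gij, and the benchmark 1 - exp(-lambda_j * pi * r^2)
G_emp  <- ecdf(nn_ij)
G_theo <- function(r) 1 - exp(-lambda_j * pi * r^2)

r <- seq(0, 0.08, by = 0.02)
print(round(cbind(r = r, empirical = G_emp(r), theoretical = G_theo(r)), 3))
```

In practice Gcross should be preferred, since it provides edge-corrected estimates (the km and rs curves in the plots) alongside the theoretical curve.<br />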
25.4.2 All pairs <strong>of</strong> types<br />
The comm<strong>and</strong> alltypes enables the user to compute the cross-type summary functions between<br />
all pairs <strong>of</strong> types simultaneously. For example, to compute Gij(r) for all i <strong>and</strong> j <strong>in</strong> the amacr<strong>in</strong>e<br />
cells data, we would use alltypes(amacr<strong>in</strong>e, "G"). The result is automatically displayed as<br />
an array <strong>of</strong> plot panels.<br />
> plot(alltypes(amacr<strong>in</strong>e, "G"))<br />
[Figure: array of G functions for amacrine, rows and columns indexed by type (off, on); each panel shows the km, rs and theo curves against r (one unit = 662 microns)]<br />
The result <strong>of</strong> alltypes is a ‘function array’ (object <strong>of</strong> class "fasp") which can be <strong>in</strong>dexed<br />
by row <strong>and</strong> column subscripts. If the <strong>po<strong>in</strong>t</strong> pattern has a large number <strong>of</strong> possible types, you<br />
can compute the array <strong>of</strong> all possible pairwise G functions, then use the subscript operator to<br />
<strong>in</strong>spect a subset <strong>of</strong> the array.<br />
> data(lans<strong>in</strong>g)<br />
> a <- alltypes(lansing, "G")<br />
> plot(a[2:3, 2:3])<br />
[Figure: array of G functions for lansing, restricted to the types hickory and maple; each panel shows the km, rs and theo curves against r (one unit = 924 feet)]<br />
25.4.3 One type to any type<br />
Also defined are the “i-to-any” summaries<br />
• Gi•(r), the distribution function of the distance from a point of type i to the nearest other<br />
point of any type;<br />
• Ki•(r) is 1/λ times the expected number of points of any type within a distance r of a<br />
typical point of type i. Here λ = Σj λj is the intensity of the entire process X.<br />
• Li•(r) is the corresponding L-function<br />
$$L_{i\bullet}(r) = \sqrt{\frac{K_{i\bullet}(r)}{\pi}}.$$<br />
• Ji• defined by<br />
$$J_{i\bullet}(r) = \frac{1 - G_{i\bullet}(r)}{1 - F(r)}.$$<br />
These are computed by Gdot, Kdot, Ldot and Jdot respectively, or using alltypes.<br />
> plot(Gdot(amacr<strong>in</strong>e, "on"))<br />
[Figure: plot of Gdot(amacrine, "on"): Gdot["on"](r) against r (one unit = 662 microns).]<br />
> plot(alltypes(amacr<strong>in</strong>e, "Gdot"))<br />
[Figure: array of Gdot functions for amacrine. Panels: off, on; curves km, rs, theo plotted against r (one unit = 662 microns).]<br />
25.4.4 Plott<strong>in</strong>g <strong>and</strong> manipulat<strong>in</strong>g function arrays<br />
A function array (object <strong>of</strong> class "fasp") can be pr<strong>in</strong>ted <strong>and</strong> plotted us<strong>in</strong>g methods for this<br />
class. It can also be manipulated <strong>in</strong> various ways.<br />
The plot method is similar to plot.fv <strong>and</strong> allows the function values to be transformed:<br />
> aG <- alltypes(amacrine, "G")<br />
> fisher <- function(x) { asin(sqrt(x)) }<br />
> plot(aG, fisher(.) ~ fisher(theo))<br />
[Figure: array of G functions for amacrine. Panels: rows and columns off, on; fisher(cbind(km, rs, theo)) plotted against fisher(theo).]<br />
As mentioned above, the function array can be <strong>in</strong>dexed by array subscripts.<br />
> data(lans<strong>in</strong>g)<br />
> a b aGfish
> markcorr(X, f)<br />
where X is a <strong>po<strong>in</strong>t</strong> pattern <strong>and</strong> f is an R language function. For example, for the amacr<strong>in</strong>e<br />
data, the natural function f is f(m1,m2) = 1 {m1 = m2} which we encode as<br />
> eqfun <- function(m1, m2) { m1 == m2 }<br />
> M <- markcorr(amacrine, eqfun)<br />
> plot(M)<br />
[Figure: plot of M: m(r) against r (one unit = 662 microns).]<br />
25.6 R<strong>and</strong>omisation tests<br />
Simulation envelopes <strong>of</strong> summary functions can be used to test various null hypotheses for<br />
marked <strong>po<strong>in</strong>t</strong> <strong>patterns</strong>.<br />
25.6.1 Poisson null<br />
The null hypothesis <strong>of</strong> a homogeneous Poisson marked <strong>po<strong>in</strong>t</strong> process can be tested by direct<br />
simulation, us<strong>in</strong>g envelope as before. For example, us<strong>in</strong>g the cross-type K function as the test<br />
statistic,<br />
> data(amacrine)<br />
> E <- envelope(amacrine, Kcross, i = "on", j = "off")<br />
> plot(E, main = "test of marked Poisson model")<br />
[Figure: test of marked Poisson model. Envelope of Kcross["on", "off"](r) against r (one unit = 662 microns).]<br />
Notice that the arguments i and j here do not match any of the formal arguments of<br />
envelope, so they are passed to Kcross. This has the effect of calling Kcross(X, i="on", j="off")<br />
for each of the simulated point patterns X. Each simulated pattern is generated by the homogeneous<br />
Poisson point process with intensities estimated from the dataset amacrine.<br />
25.6.2 Independence <strong>of</strong> components<br />
It’s also possible to test other null hypotheses by a r<strong>and</strong>omisation test. We discussed two popular<br />
null hypotheses:<br />
• r<strong>and</strong>om labell<strong>in</strong>g: given the locations X, the marks are conditionally <strong>in</strong>dependent <strong>and</strong><br />
identically distributed;<br />
• <strong>in</strong>dependence <strong>of</strong> components: the sub-processes Xm <strong>of</strong> <strong>po<strong>in</strong>t</strong>s <strong>of</strong> each mark m, are <strong>in</strong>dependent<br />
<strong>po<strong>in</strong>t</strong> processes.<br />
In a r<strong>and</strong>omisation test <strong>of</strong> the <strong>in</strong>dependence-<strong>of</strong>-components hypothesis, the simulated <strong>patterns</strong><br />
X are generated from the dataset by splitt<strong>in</strong>g the data <strong>in</strong>to sub-<strong>patterns</strong> <strong>of</strong> <strong>po<strong>in</strong>t</strong>s <strong>of</strong> one<br />
type, <strong>and</strong> r<strong>and</strong>omly shift<strong>in</strong>g these sub-<strong>patterns</strong>, <strong>in</strong>dependently <strong>of</strong> each other. The shift<strong>in</strong>g is<br />
performed by rshift:<br />
> E <- envelope(amacrine, Kcross, i = "on", j = "off",<br />
+     simulate = expression(rshift(amacrine)))<br />
> plot(E, main = "test of independent components")<br />
[Figure: test of independent components. Envelope of Kcross["on", "off"](r) against r (one unit = 662 microns).]<br />
The <strong>in</strong>dependence-<strong>of</strong>-components hypothesis seems to be accepted <strong>in</strong> this example.<br />
Under the independence hypothesis,<br />
\[ K_{ij}(r) = \pi r^2, \qquad G_{ij}(r) = F_j(r), \qquad J_{ij}(r) \equiv 1, \]<br />
while the “i-to-any” functions have complicated values. Thus, we would normally use Kij or Jij<br />
to construct a test statistic for <strong>in</strong>dependence <strong>of</strong> components.<br />
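The random shifting used in this test can be pictured in base R, without spatstat. The following is only a sketch under stated assumptions: a unit-square window, toroidal (wrap-around) edge handling, made-up coordinates, and a hypothetical helper named torus_shift. The rshift function itself offers further options.<br />

```r
# Hypothetical helper: shift one sub-pattern by a single random vector,
# wrapping points around the unit square (toroidal edge treatment).
torus_shift <- function(x, y) {
  dx <- runif(1)               # one shift vector for the whole sub-pattern
  dy <- runif(1)
  list(x = (x + dx) %% 1, y = (y + dy) %% 1)
}

set.seed(1)
# Toy multitype pattern with marks "on" and "off" (made-up data)
x <- runif(20)
y <- runif(20)
m <- factor(rep(c("on", "off"), each = 10))

# Shift the points of each type independently of the other type
xs <- x
ys <- y
for (lev in levels(m)) {
  idx <- m == lev
  s <- torus_shift(x[idx], y[idx])
  xs[idx] <- s$x
  ys[idx] <- s$y
}
```

Within each type the relative positions of the points are unchanged (modulo wrap-around), so the within-type structure is preserved while any dependence between the types is destroyed.<br />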
25.6.3 R<strong>and</strong>om labell<strong>in</strong>g<br />
In a r<strong>and</strong>omisation test <strong>of</strong> the r<strong>and</strong>om labell<strong>in</strong>g null hypothesis, the simulated <strong>patterns</strong> X are<br />
generated from the dataset by hold<strong>in</strong>g the <strong>po<strong>in</strong>t</strong> locations fixed, <strong>and</strong> r<strong>and</strong>omly resampl<strong>in</strong>g the<br />
marks, either with replacement (<strong>in</strong>dependent r<strong>and</strong>om sampl<strong>in</strong>g) or without replacement (r<strong>and</strong>omly<br />
permut<strong>in</strong>g the marks). The resampl<strong>in</strong>g operation is performed by rlabel.<br />
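The two resampling schemes can be illustrated in base R on a toy vector of marks. This is a sketch with made-up data, not the internals of rlabel; the variable names are illustrative only.<br />

```r
set.seed(2)
# Toy marks attached to seven fixed locations (made-up data)
marks_obs <- factor(c("on", "on", "off", "off", "off", "on", "off"))

# Without replacement: a random permutation of the marks,
# so the number of points of each type is preserved
marks_perm <- sample(marks_obs)

# With replacement: i.i.d. draws from the empirical mark distribution,
# so the number of points of each type may change
marks_iid <- sample(marks_obs, replace = TRUE)

table(marks_obs)
table(marks_perm)
table(marks_iid)
```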
Under r<strong>and</strong>om labell<strong>in</strong>g,<br />
Ji•(r) = J(r)<br />
Ki•(r) = K(r)<br />
Gi•(r) = G(r)<br />
(where G,K,J are the summary functions for the <strong>po<strong>in</strong>t</strong> process without marks) while the other,<br />
cross-type functions have complicated values. Thus, we would normally use someth<strong>in</strong>g like<br />
Ki•(r) − K(r) to construct a test statistic for r<strong>and</strong>om labell<strong>in</strong>g.<br />
To do this, cook up a little function to evaluate Ji•(r) − J(r):<br />
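> Jdif <- function(X, ..., i) {<br />
+     Jidot <- Jdot(X, ..., i = i)<br />
+     J <- Jest(X, ...)<br />
+     dif <- eval.fv(Jidot - J)<br />
+     return(dif)<br />
+ }<br />
> E <- envelope(amacrine, Jdif, i = "on", simulate = expression(rlabel(amacrine)))<br />
> plot(E, main = "test of random labelling")<br />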
[Figure: test of random labelling. Envelope of Jdot["on"](r) − J(r) against r (one unit = 662 microns).]<br />
The r<strong>and</strong>om labell<strong>in</strong>g hypothesis also seems to be accepted.<br />
25.6.4 Arrays <strong>of</strong> envelopes<br />
To compute a simulation envelope for the function Kij for each pair <strong>of</strong> types i <strong>and</strong> j, use<br />
alltypes with the argument envelope=TRUE.<br />
> aE <- alltypes(amacrine, "Kcross", envelope = TRUE)<br />
> plot(aE, sqrt(./pi) - r ~ r, ylab = "L(r)-r")<br />
[Figure: array of envelopes of Kcross functions for amacrine. Panels: rows and columns off, on; L(r)−r plotted against r (one unit = 662 microns).]<br />
26 Methods 12: multitype Poisson models<br />
This section covers multitype Poisson process models: basic properties, simulation, <strong>and</strong> fitt<strong>in</strong>g<br />
models to data.<br />
26.1 Theory<br />
26.1.1 Complete <strong>spatial</strong> r<strong>and</strong>omness <strong>and</strong> <strong>in</strong>dependence<br />
A uniform Poisson marked point process in ℝ² with marks in M can be defined in the following<br />
equivalent ways.<br />
• r<strong>and</strong>omly marked Poisson process (Poisson [X], iid [M|X]): a Poisson <strong>po<strong>in</strong>t</strong> process <strong>of</strong><br />
locations X with <strong>in</strong>tensity β is first generated. Then each <strong>po<strong>in</strong>t</strong> xi is labelled with a<br />
r<strong>and</strong>om mark mi, <strong>in</strong>dependently <strong>of</strong> other <strong>po<strong>in</strong>t</strong>s, with distribution P {Mi = m} = pm for<br />
m ∈ M.<br />
• superposition <strong>of</strong> <strong>in</strong>dependent Poisson processes (iid [M], Poisson [X|M]): for each possible<br />
mark m ∈ M, a Poisson process Xm is generated, with <strong>in</strong>tensity βm. The <strong>po<strong>in</strong>t</strong>s <strong>of</strong> Xm<br />
are tagged with the mark m. Then the processes Xm with different marks m ∈ M are<br />
superimposed, to yield a marked <strong>po<strong>in</strong>t</strong> process.<br />
• Poisson marked point process (jointly Poisson [X,M]): a Poisson process on ℝ² × M is<br />
generated, with intensity function λ(u,m) = βm at location u and mark m.<br />
These constructions are equivalent when βm = pmβ. See the lovely book by K<strong>in</strong>gman [32].<br />
S<strong>in</strong>ce the established term CSR (‘complete <strong>spatial</strong> r<strong>and</strong>omness’) is used to refer to the uniform<br />
Poisson <strong>po<strong>in</strong>t</strong> process, I propose that the uniform marked Poisson <strong>po<strong>in</strong>t</strong> process should be called<br />
‘complete <strong>spatial</strong> r<strong>and</strong>omness <strong>and</strong> <strong>in</strong>dependence’ (CSRI).<br />
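The first two constructions can be compared in base R. The sketch below assumes a unit-square window and made-up values for β and pm; it simulates one realisation by each route, with βm = pm β, and the names X1, X2 are illustrative only.<br />

```r
set.seed(3)
beta <- 200                 # total intensity (assumed value)
p <- c(A = 0.7, B = 0.3)    # mark probabilities p_m (assumed values)
area <- 1                   # area of the unit-square window

# Construction 1: Poisson locations, then i.i.d. random marks
n <- rpois(1, beta * area)
X1 <- data.frame(x = runif(n), y = runif(n),
                 m = factor(sample(names(p), n, replace = TRUE, prob = p),
                            levels = names(p)))

# Construction 2: superposition of independent Poisson processes,
# one per mark, with intensities beta_m = p_m * beta
beta_m <- p * beta
parts <- lapply(names(beta_m), function(k) {
  nk <- rpois(1, beta_m[[k]] * area)
  data.frame(x = runif(nk), y = runif(nk), m = rep(k, nk))
})
X2 <- do.call(rbind, parts)
X2$m <- factor(X2$m, levels = names(p))

table(X1$m)   # counts of each type under construction 1
table(X2$m)   # counts of each type under construction 2
```

The two realisations differ, but over repeated simulation the counts of each type have the same Poisson distributions with means βm|W|.<br />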
26.1.2 Inhomogeneous Poisson marked <strong>po<strong>in</strong>t</strong> processes<br />
An inhomogeneous Poisson marked point process Y with ‘joint’ intensity λ(u,m) for locations u<br />
and mark values m is simply defined as an inhomogeneous Poisson point process on ℝ² × M<br />
with intensity function λ(u,m).<br />
Let’s restrict attention to the case of categorical marks, where M is finite. Then the process<br />
Y has the following properties:<br />
• The locations X, obtained by removing the marks, constitute an inhomogeneous Poisson<br />
process in ℝ² with intensity function<br />
\[ \beta(u) = \sum_{m} \lambda(u,m). \]<br />
• Conditional on the locations X, the marks attached to the <strong>po<strong>in</strong>t</strong>s are <strong>in</strong>dependent. For a<br />
<strong>po<strong>in</strong>t</strong> xi the conditional distribution <strong>of</strong> the mark mi is P{Mi = m} = λ(xi,m)/β(xi).<br />
• The sub-process Xm <strong>of</strong> <strong>po<strong>in</strong>t</strong>s with mark m, is an <strong>in</strong>homogeneous Poisson <strong>po<strong>in</strong>t</strong> process<br />
with <strong>in</strong>tensity βm(u) = λ(u,m).<br />
• The sub-processes Xm <strong>of</strong> <strong>po<strong>in</strong>t</strong>s with different marks m are <strong>in</strong>dependent processes.<br />
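As a small worked example of the first two properties, the sketch below takes an assumed joint intensity λ(u,m) for two types "A" and "B" (the functional form is invented for illustration) and computes β(u) and the conditional mark probabilities in base R.<br />

```r
# Assumed joint intensity lambda(u, m) on the unit square; m is a single type
lambda <- function(x, y, m) {
  if (m == "A") 100 * exp(-2 * x) else 100 * exp(-2 * (1 - x))
}
types <- c("A", "B")

# Intensity of the unmarked locations: beta(u) = sum over m of lambda(u, m)
beta <- function(x, y) {
  Reduce(`+`, lapply(types, function(m) lambda(x, y, m)))
}

# Conditional probability that a point at (x, y) carries mark m
pmark <- function(x, y, m) lambda(x, y, m) / beta(x, y)

pmark(0.5, 0.5, "A")   # the two types are equally likely at x = 0.5
pmark(0.1, 0.5, "A")   # type A is favoured near x = 0
```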
26.2 Simulation<br />
Realisations <strong>of</strong> Poisson marked <strong>po<strong>in</strong>t</strong> processes can be generated us<strong>in</strong>g rmpoispp. The first<br />
argument <strong>of</strong> this comm<strong>and</strong> specifies the <strong>in</strong>tensity or <strong>in</strong>tensity function λ(u,m). It can be a<br />
constant, a vector <strong>of</strong> constants, or an R function.<br />
> par(mfrow = c(1, 2))<br />
> Xunif <- rmpoispp(100, types = c("A", "B"))<br />
> plot(Xunif, main = "CSRI, intensity A=100, B=100")<br />
> Xunif <- rmpoispp(c(100, 20), types = c("A", "B"))<br />
> plot(Xunif, main = "CSRI, intensity A=100, B=20")<br />
> par(mfrow = c(1, 1))<br />
[Figure: two simulated patterns, titled “CSRI, intensity A=100, B=100” and “CSRI, intensity A=100, B=20”.]<br />
> X1 lamb X2 par(mfrow = c(1, 2))<br />
> plot(X1, ma<strong>in</strong> = "")<br />
> plot(X2, ma<strong>in</strong> = "")<br />
> par(mfrow = c(1, 1))<br />
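When the intensity is given as an R function, it takes arguments (x, y, m). The sketch below is an illustration only: the intensity function lam is invented for this example, spatstat is assumed to be installed, and the default window (the unit square) is used.<br />

```r
library(spatstat)   # assumes the spatstat package is installed

# Hypothetical intensity function lambda(u, m): type A is concentrated
# near x = 0 and type B near x = 1. Written with arithmetic on (m == ...)
# so that it works whether m is a single value or a vector.
lam <- function(x, y, m) {
  300 * exp(-3 * x) * (m == "A") + 300 * exp(-3 * (1 - x)) * (m == "B")
}

X <- rmpoispp(lam, types = c("A", "B"))   # simulated multitype pattern
plot(split(X), main = "")                 # one panel for each type
```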
26.3 Fitt<strong>in</strong>g Poisson models<br />
Poisson marked <strong>po<strong>in</strong>t</strong> process models may be fitted to <strong>po<strong>in</strong>t</strong> pattern data us<strong>in</strong>g ppm. Currently<br />
the methods are only available for multitype <strong>po<strong>in</strong>t</strong> processes (categorical marks).<br />
26.3.1 Probability densities<br />
Let W ⊂ ℝ² be the study region, and M the (finite) set of possible marks. Then a marked point<br />
pattern is a set<br />
y = {(x1,m1),... ,(xn,mn)}, xi ∈ W, mi ∈ M, n ≥ 0<br />
<strong>of</strong> pairs (xi,mi) <strong>of</strong> locations xi with marks mi. It can be viewed as a <strong>po<strong>in</strong>t</strong> pattern <strong>in</strong> the<br />
Cartesian product W × M.<br />
The probability density <strong>of</strong> a marked <strong>po<strong>in</strong>t</strong> process is a function f(y) def<strong>in</strong>ed for all marked<br />
<strong>po<strong>in</strong>t</strong> <strong>patterns</strong> y <strong>in</strong>clud<strong>in</strong>g the empty pattern ∅.<br />
The process with probability density f(y) ≡ 1 is the uniform Poisson marked <strong>po<strong>in</strong>t</strong> process<br />
with <strong>in</strong>tensity 1 for each mark. That is, for this model, the sub-process <strong>of</strong> <strong>po<strong>in</strong>t</strong>s with mark<br />
mi = m is a uniform Poisson process with <strong>in</strong>tensity 1. If the marks are removed, we obta<strong>in</strong> a<br />
Poisson <strong>po<strong>in</strong>t</strong> process with <strong>in</strong>tensity equal to |M|, the number <strong>of</strong> possible types.<br />
The uniform Poisson marked point process with intensity λ(u,m) = βm has probability<br />
density<br />
\[ f(y) = \exp\Big( \sum_{m \in M} (1 - \beta_m) |W| \Big) \prod_{i=1}^{n(y)} \beta_{m_i} = \exp\Big( \sum_{m \in M} (1 - \beta_m) |W| \Big) \prod_{m \in M} \beta_m^{n_m(y)} \]<br />
where nm(y) is the number of points in y having mark value m.<br />
The inhomogeneous Poisson marked point process with intensity function λ(u,m), at location<br />
u ∈ W and mark m ∈ M, has probability density<br />
\[ f(y) = \exp\Big( \sum_{m \in M} \int_W (1 - \lambda(u,m)) \, du \Big) \prod_{i=1}^{n(y)} \lambda(x_i, m_i). \qquad (48) \]<br />
26.3.2 Maximum likelihood<br />
For the multitype Poisson process with <strong>in</strong>tensity function λ(u,m) at location u ∈ W <strong>and</strong> mark<br />
m ∈ M, the loglikelihood is, up to a constant,<br />
\[ \log L = \sum_{i=1}^{n} \log \lambda(x_i, m_i) - \sum_{m \in M} \int_W \lambda(u,m) \, du, \qquad (49) \]<br />
where mi is the mark attached to data <strong>po<strong>in</strong>t</strong> xi. This is formally equivalent to the loglikelihood<br />
<strong>of</strong> a Poisson logl<strong>in</strong>ear regression, so the Berman-Turner algorithm can aga<strong>in</strong> be used to maximise<br />
the loglikelihood.<br />
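For constant intensities λ(u,m) = βm the loglikelihood reduces to the sum over m of nm log βm − βm|W|, which is maximised at nm/|W|. The base R sketch below checks this numerically on made-up counts; it uses the general-purpose optimiser optim purely for illustration and is not the Berman-Turner algorithm.<br />

```r
# Toy data: number of points of each type, and the window area (assumed values)
n_m <- c(A = 40, B = 15, C = 25)
areaW <- 2

# Loglikelihood for constant intensities beta_m, parametrised by
# theta = log(beta) so that the intensities stay positive
loglik <- function(theta) sum(n_m * theta) - areaW * sum(exp(theta))
score  <- function(theta) n_m - areaW * exp(theta)   # gradient of loglik

# Maximise the loglikelihood (optim minimises, so pass the negatives)
fit <- optim(rep(0, 3),
             fn = function(th) -loglik(th),
             gr = function(th) -score(th),
             method = "BFGS")
beta_hat <- exp(fit$par)

beta_hat        # numerical maximum likelihood estimates
n_m / areaW     # closed form: number of points of each type per unit area
```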
26.3.3 Model-fitt<strong>in</strong>g <strong>in</strong> spatstat<br />
Poisson marked <strong>po<strong>in</strong>t</strong> process models are fitted to data us<strong>in</strong>g ppm.<br />
The trend formula <strong>in</strong> the call to ppm may <strong>in</strong>volve the reserved name marks as a variable.<br />
This refers to the marks <strong>of</strong> the <strong>po<strong>in</strong>t</strong>s. S<strong>in</strong>ce the marks are categorical, marks is treated as a<br />
factor variable for modell<strong>in</strong>g purposes.<br />
To fit the homogeneous multitype Poisson process (CSRI), equation (50), we call<br />
> ppm(X, ~marks)<br />
The formula ~marks <strong>in</strong>dicates that the trend depends only on the marks, <strong>and</strong> not on <strong>spatial</strong><br />
location; s<strong>in</strong>ce marks is a factor, the trend has a separate constant value for each level <strong>of</strong> marks.<br />
This is the model (50).<br />
Note that if we had typed<br />
> ppm(X, ~1)<br />
this would have fitted the special case <strong>of</strong> CSRI where the <strong>in</strong>tensities βm are equal, βm ≡ α say,<br />
for all possible marks. That model is only appropriate if we believe that all mark values are<br />
equally likely.<br />
For the Lans<strong>in</strong>g Woods data, the m<strong>in</strong>imal model that makes sense is (50), so we call<br />
> ppm(lans<strong>in</strong>g, ~marks)<br />
Stationary multitype Poisson process<br />
Possible marks:<br />
blackoak hickory maple misc redoak whiteoak<br />
Trend formula: ~marks<br />
Intensities:<br />
beta_blackoak beta_hickory beta_maple beta_misc beta_redoak<br />
135 703 514 105 346<br />
beta_whiteoak<br />
448<br />
S<strong>in</strong>ce lans<strong>in</strong>g is a multitype <strong>po<strong>in</strong>t</strong> pattern (its marks are categorical), the variable marks <strong>in</strong><br />
the formula is a factor. The model has one parameter/coefficient for each level <strong>of</strong> the factor, i.e.<br />
one coefficient for each type <strong>of</strong> <strong>po<strong>in</strong>t</strong>. In other words, this is the homogeneous Poisson marked<br />
<strong>po<strong>in</strong>t</strong> process with <strong>in</strong>tensity βm for <strong>po<strong>in</strong>t</strong>s <strong>of</strong> mark m.<br />
You’ll notice that the parameter estimates βm coincide with those obtained from summary.ppp<br />
above. That is a consequence <strong>of</strong> the fact that the maximum likelihood estimates (obta<strong>in</strong>ed by<br />
ppm) are also the method-<strong>of</strong>-moments estimates (obta<strong>in</strong>ed by summary.ppp).<br />
A more complicated example is<br />
> ppm(lans<strong>in</strong>g, ~marks + x)<br />
Nonstationary multitype Poisson process<br />
Possible marks:<br />
blackoak hickory maple misc redoak whiteoak<br />
Trend formula: ~marks + x<br />
Fitted coefficients for trend formula:<br />
(Intercept) markshickory marksmaple marksmisc marksredoak<br />
4.94294727 1.65008211 1.33694849 -0.25131442 0.94116400<br />
markswhiteoak x<br />
1.19951845 -0.07581624<br />
This is the marked Poisson process whose <strong>in</strong>tensity function λ((x,y,m)) at location (x,y)<br />
<strong>and</strong> mark m satisfies<br />
log λ((x,y,m)) = αm + βx<br />
where α1,... ,α6 <strong>and</strong> β are parameters. The <strong>in</strong>tensity is logl<strong>in</strong>ear <strong>in</strong> x, with a different <strong>in</strong>tercept<br />
for each mark, but the same slope (“parallel logl<strong>in</strong>ear regression”). In the pr<strong>in</strong>tout above, the<br />
fitted slope parameter β is β̂ = −0.07581624. As discussed in Section 13.3 on page 82, the fitted<br />
coefficients αm for the categorical mark are <strong>in</strong>terpreted <strong>in</strong> the light <strong>of</strong> the ‘contrasts’ <strong>in</strong> force.<br />
The default is the treatment contrasts, <strong>and</strong> the first level <strong>of</strong> the mark is blackoak, so <strong>in</strong> this<br />
case the fitted coefficient for m=blackoak is 4.942947, while the fitted coefficient for m=hickory<br />
is 4.942947 + 1.650082 = 6.593029 <strong>and</strong> so on.<br />
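The same arithmetic can be done for every level at once. The sketch below copies the fitted coefficients from the printout above into a plain vector (it does not call ppm) and assumes the default treatment contrasts; the helper lam is illustrative only.<br />

```r
# Fitted coefficients copied from the printout of ppm(lansing, ~marks + x)
co <- c("(Intercept)" = 4.94294727, markshickory = 1.65008211,
        marksmaple = 1.33694849, marksmisc = -0.25131442,
        marksredoak = 0.94116400, markswhiteoak = 1.19951845,
        x = -0.07581624)

# Under treatment contrasts the baseline level (blackoak) has alpha equal to
# the intercept; every other level adds its own coefficient to the intercept.
alpha <- c(blackoak = co[["(Intercept)"]],
           co[["(Intercept)"]] + co[2:6])
names(alpha) <- sub("^marks", "", names(alpha))
alpha

# Fitted intensity at horizontal coordinate x for a point of type m
lam <- function(x, m) exp(alpha[[m]] + co[["x"]] * x)
lam(0.5, "hickory")
```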
> ppm(lans<strong>in</strong>g, ~marks * x)<br />
Nonstationary multitype Poisson process<br />
Possible marks:<br />
blackoak hickory maple misc redoak whiteoak<br />
Trend formula: ~marks * x<br />
Fitted coefficients for trend formula:<br />
(Intercept) markshickory marksmaple marksmisc marksredoak<br />
5.2378062 1.4424915 0.6795604 -0.8482907 0.6916392<br />
markswhiteoak x markshickory:x marksmaple:x marksmisc:x<br />
1.0901772 -0.7063987 0.4511157 1.3243326 1.2138278<br />
marksredoak:x markswhiteoak:x<br />
0.5380413 0.2421379<br />
The symbol * here is an ‘<strong>in</strong>teraction’ <strong>in</strong> the usual sense for l<strong>in</strong>ear models. The fitted model<br />
is the marked Poisson process with<br />
log λ((x,y,m)) = αm + βmx<br />
where α1,... ,α6 <strong>and</strong> β1,...,β6 are parameters. The <strong>in</strong>tensity is logl<strong>in</strong>ear <strong>in</strong> x with a different<br />
slope <strong>and</strong> <strong>in</strong>tercept for each mark.<br />
The result <strong>of</strong> ppm is aga<strong>in</strong> an object <strong>of</strong> class "ppm" represent<strong>in</strong>g a fitted <strong>po<strong>in</strong>t</strong> process model.<br />
To plot the fitted <strong>in</strong>tensity <strong>and</strong> conditional <strong>in</strong>tensity <strong>of</strong> the fitted model, use plot.ppm. For a<br />
multitype <strong>po<strong>in</strong>t</strong> process you will get a separate plot for each possible mark value.<br />
More complicated examples are:<br />
> ppm(lans<strong>in</strong>g, ~marks * polynom(x, y, 2))<br />
> ppm(lans<strong>in</strong>g, ~marks * harmonic(x, y, 2))<br />
27 Methods 13: Gibbs models for multitype <strong>po<strong>in</strong>t</strong> <strong>patterns</strong><br />
Gibbs <strong>po<strong>in</strong>t</strong> process models (section 20) are also available for marked <strong>po<strong>in</strong>t</strong> processes, <strong>and</strong> can<br />
be fitted to data us<strong>in</strong>g ppm. Currently the methods are only implemented for multitype <strong>po<strong>in</strong>t</strong><br />
processes (categorical marks), so we restrict attention to this case.<br />
27.1 Gibbs models<br />
Much <strong>of</strong> the theory <strong>of</strong> Gibbs models described <strong>in</strong> Section 20 carries over immediately to multitype<br />
<strong>po<strong>in</strong>t</strong> processes.<br />
27.1.1 Conditional <strong>in</strong>tensity<br />
The conditional <strong>in</strong>tensity λ(u,X) <strong>of</strong> an (unmarked) <strong>po<strong>in</strong>t</strong> process X at a location u was def<strong>in</strong>ed<br />
<strong>in</strong> section 20.5. Roughly speak<strong>in</strong>g λ(u,x)du is the conditional probability <strong>of</strong> f<strong>in</strong>d<strong>in</strong>g a <strong>po<strong>in</strong>t</strong><br />
near u, given that the rest <strong>of</strong> the <strong>po<strong>in</strong>t</strong> process X co<strong>in</strong>cides with x.<br />
For a marked point process Y the conditional intensity is a function λ((u,m),Y) giving a<br />
value at a location u for each possible mark m. For a finite set of marks M, we can interpret<br />
λ((u,m),y)du as the conditional probability of finding a point with mark m near u, given the rest<br />
of the marked point process.<br />
The conditional intensity is related to the probability density f(y) by<br />
\[ \lambda((u,m), y) = \frac{f(y \cup \{(u,m)\})}{f(y)} \]<br />
for (u,m) ∉ y.<br />
For Poisson processes, the conditional <strong>in</strong>tensity λ((u,m),y) co<strong>in</strong>cides with the <strong>in</strong>tensity<br />
function λ(u,m) <strong>and</strong> does not depend on the configuration y. For example, the homogeneous<br />
Poisson multitype <strong>po<strong>in</strong>t</strong> process or “CSRI” (Section 26.1.1) has conditional <strong>in</strong>tensity<br />
λ((u,m),y) = βm<br />
where βm ≥ 0 are constants which can be <strong>in</strong>terpreted <strong>in</strong> several equivalent ways (section 20.5).<br />
The sub-process consisting of points of type m only is Poisson with intensity βm. The process<br />
obtained by ignoring the types, and combining all the points, is Poisson with intensity $\beta = \sum_m \beta_m$.<br />
The marks attached to the points are i.i.d. with distribution pm = βm/β.<br />
27.1.2 Pairwise <strong>in</strong>teractions<br />
A multitype pairwise interaction process is a Gibbs process with probability density of the form<br />
\[ f(y) = \alpha \Big[ \prod_{i=1}^{n(y)} b_{m_i}(x_i) \Big] \Big[ \prod_{i < j} c_{m_i,m_j}(x_i, x_j) \Big] \qquad (51) \]<br />
where $b_m(u)$, m ∈ M, are functions determining the ‘first order trend’ for points of each type,<br />
and $c_{m,m'}(u,v)$, m, m′ ∈ M, are functions determining the interaction between a pair of points of<br />
given types m and m′. The interaction functions must be symmetric, $c_{m,m'}(u,v) = c_{m,m'}(v,u)$<br />
and $c_{m,m'} \equiv c_{m',m}$. The conditional intensity is<br />
\[ \lambda((u,m); y) = b_m(u) \prod_{i=1}^{n(y)} c_{m,m_i}(u, x_i). \qquad (52) \]<br />
27.1.3 Pairwise <strong>in</strong>teractions not depend<strong>in</strong>g on marks<br />
The simplest examples of multitype pairwise interaction processes are those in which the interaction<br />
term $c_{m,m'}(u,v)$ does not depend on the marks m, m′. For example, we can take any of<br />
the interaction functions c(u,v) described in section 20.3 and use it to construct a marked point<br />
process.<br />
Such processes can be constructed equivalently as follows [8]:<br />
• an unmarked Gibbs process is generated with first order term $b(u) = \sum_{m \in M} b_m(u)$ and<br />
pairwise interaction c(u,v).<br />
• each point xi of this unmarked process is labelled with a mark mi with probability distribution<br />
$P\{m_i = m\} = b_m(x_i)/b(x_i)$, independent of other points.<br />
If additionally the <strong>in</strong>tensity functions are constant, bm(u) ≡ βm, then such a <strong>po<strong>in</strong>t</strong> process<br />
has the r<strong>and</strong>om labell<strong>in</strong>g property.<br />
27.1.4 Mark-dependent pairwise <strong>in</strong>teractions<br />
Various complex k<strong>in</strong>ds <strong>of</strong> behaviour can be created by postulat<strong>in</strong>g a pairwise <strong>in</strong>teraction that<br />
does depend on the marks.<br />
A simple example is the multitype hard core process, in which $\beta_m(u) \equiv \beta$ and<br />
$$c_{m,m'}(u,v) = \begin{cases} 1 & \text{if } \|u - v\| > r_{m,m'} \\ 0 & \text{if } \|u - v\| \le r_{m,m'} \end{cases} \qquad (53)$$<br />
where $r_{m,m'} = r_{m',m} > 0$ is the hard core distance for type $m$ with type $m'$. In this process, two<br />
points of types $m$ and $m'$ respectively can never come closer than the distance $r_{m,m'}$.<br />
By setting $r_{m,m'} = 0$ for a particular pair of marks $m, m'$ we effectively remove the interaction<br />
term between points of these types. If there are only two types, say $M = \{1,2\}$,<br />
then setting $r_{1,2} = 0$ implies that the sub-processes $X_1$ and $X_2$, consisting of points of types<br />
1 and 2 respectively, are independent point processes. In other words the process satisfies the<br />
independence-of-components property.<br />
The multitype Strauss process has pairwise interaction term<br />
$$c_{m,m'}(u,v) = \begin{cases} 1 & \text{if } \|u - v\| > r_{m,m'} \\ \gamma_{m,m'} & \text{if } \|u - v\| \le r_{m,m'} \end{cases} \qquad (54)$$<br />
where $r_{m,m'} > 0$ are interaction radii as above, and $\gamma_{m,m'} \ge 0$ are interaction parameters.<br />
In contrast to the unmarked Strauss process, which is well-defined only when its interaction<br />
parameter $\gamma$ is between 0 and 1, the multitype Strauss process allows some of the interaction<br />
parameters $\gamma_{m,m'}$ to exceed 1 for $m \neq m'$, provided one of the relevant types has a hard core<br />
($\gamma_{m,m} = 0$ or $\gamma_{m',m'} = 0$).<br />
If there are only two types, say $M = \{1,2\}$, then setting $\gamma_{1,2} = 1$ implies that the sub-processes<br />
$X_1$ and $X_2$, consisting of points of types 1 and 2 respectively, are independent Strauss<br />
processes.<br />
The multitype Strauss-hard core process has pairwise interaction term<br />
$$c_{m,m'}(u,v) = \begin{cases} 0 & \text{if } \|u - v\| < h_{m,m'} \\ \gamma_{m,m'} & \text{if } h_{m,m'} \le \|u - v\| \le r_{m,m'} \\ 1 & \text{if } \|u - v\| > r_{m,m'} \end{cases} \qquad (55)$$<br />
where $r_{m,m'} > 0$ are interaction distances and $\gamma_{m,m'} \ge 0$ are interaction parameters as above,<br />
and $h_{m,m'}$ are hard core distances satisfying $h_{m,m'} = h_{m',m}$ and $0 < h_{m,m'} < r_{m,m'}$.<br />
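The three interaction terms (53)–(55) differ only in what happens inside the interaction range, which a few lines of base R can make concrete. The sketch below evaluates the Strauss-hard core term (55) for a single pair of points. The function name pair_interaction, the type labels and all numerical values are invented for illustration and are not part of spatstat.<br />

```r
# Pairwise interaction c_{m,m'}(u,v) of the multitype Strauss-hard core model (55).
# pair_interaction, h, r and gam are hypothetical names used only for this sketch.
pair_interaction <- function(d, m1, m2, h, r, gam) {
  # d: distance ||u - v||; m1, m2: type labels indexing the parameter matrices
  if (d < h[m1, m2]) {
    0                  # closer than the hard core distance: forbidden
  } else if (d <= r[m1, m2]) {
    gam[m1, m2]        # within the interaction range
  } else {
    1                  # beyond the interaction range: no interaction
  }
}

types <- c("a", "b")
# Symmetric parameter matrices with made-up values, 0 < h < r for every pair
h   <- matrix(c(1, 2, 2, 1), 2, 2, dimnames = list(types, types))
r   <- matrix(c(3, 5, 5, 3), 2, 2, dimnames = list(types, types))
gam <- matrix(c(0.5, 1.2, 1.2, 0.5), 2, 2, dimnames = list(types, types))

c_core  <- pair_interaction(0.5, "a", "a", h, r, gam)  # inside the hard core
c_range <- pair_interaction(4.0, "a", "b", h, r, gam)  # between h and r
c_far   <- pair_interaction(6.0, "a", "b", h, r, gam)  # beyond r
```

The example takes $\gamma_{a,b} = 1.2 > 1$, which (55) permits since it only requires $\gamma_{m,m'} \ge 0$.<br />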
27.2 Pseudolikelihood for multitype Gibbs processes<br />
Models can be fitted by maximum pseudolikelihood. For a multitype Gibbs point process with<br />
conditional intensity $\lambda((u,m);y)$, the log pseudolikelihood is<br />
$$\log \mathrm{PL} = \sum_{i=1}^{n(y)} \log \lambda((x_i,m_i);y) - \sum_{m \in M} \int_W \lambda((u,m);y)\,du. \qquad (56)$$<br />
The pseudolikelihood can be maximised using an extension of the Berman-Turner device [3].<br />
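As a sanity check on (56), consider the simplest special case, a multitype Poisson process with constant conditional intensity $\lambda((u,m);y) = \beta_m$. Then (56) reduces to $\sum_m n_m \log \beta_m - |W| \sum_m \beta_m$, where $n_m$ is the number of points of type $m$ and $|W|$ is the area of the window, and it is maximised at $\beta_m = n_m / |W|$. The base R sketch below confirms this numerically with optim. It does not implement the Berman-Turner device, and the counts and window area are made up.<br />

```r
# Log pseudolikelihood (56) in the special case lambda((u,m); y) = beta_m.
# Parameterised by log(beta) so that the optimiser cannot leave beta > 0.
log_pl <- function(log_beta, counts, area) {
  beta <- exp(log_beta)
  sum(counts * log_beta) - area * sum(beta)
}

counts <- c(off = 70, on = 65)  # hypothetical number of points of each type
area   <- 1000                  # hypothetical window area |W|

# fnscale = -1 turns optim's minimiser into a maximiser
fit <- optim(par = log(c(0.01, 0.01)), fn = log_pl,
             counts = counts, area = area,
             method = "BFGS", control = list(fnscale = -1))

beta_hat    <- exp(fit$par)     # numerical maximiser
beta_closed <- counts / area    # closed-form maximiser n_m / |W|
print(cbind(numerical = beta_hat, closed_form = beta_closed))
```

For models with interaction the integral in (56) has no closed form, which is why a quadrature scheme such as the Berman-Turner device is needed in practice.<br />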
27.3 Fitting Gibbs models to multitype data<br />
Marked point process models may be fitted to point pattern data using ppm. Currently the<br />
methods are only available for multitype point processes (categorical marks).<br />
27.3.1 Interactions not depending on marks<br />
The model-fitting function ppm expects an argument interaction that specifies the interpoint<br />
interaction structure of the point process. The default is ‘no interaction’, corresponding to a<br />
Poisson process.<br />
On page 141 there is a list of interpoint interactions for modelling unmarked point patterns.<br />
These interactions can also be used, without modification, to fit models to multitype point<br />
patterns.<br />
For example<br />
> ppm(lansing, ~marks, Strauss(0.07))<br />
fits a multitype version of the Strauss process (section 20.3.2) in which the conditional intensity<br />
is<br />
$$\lambda((u,m),y) = \beta_m \gamma^{t(u,y)}. \qquad (57)$$<br />
Here $\beta_m$ are constants which account for the unequal abundance of the different species of tree.<br />
The other quantities are the same as in (42). The interaction between two trees is assumed to be<br />
the same for all species, and is controlled by the interaction parameter $\gamma$ and interaction radius<br />
$r = 0.07$. For example, this includes the case $\gamma = 0$ where no two trees (whatever species they<br />
belong to) come closer than 0.07 units apart, a ‘multitype hard core process’.<br />
27.3.2 Interactions depending on marks<br />
There are two additional interpoint interactions defined in spatstat for multitype point patterns:<br />
MultiStrauss the multitype Strauss process<br />
MultiStraussHard multitype hybrid hard core / Strauss process<br />
In these models, the interaction between two points depends on the types of the points as<br />
well as their separation.<br />
In the multitype Strauss process (54), for each pair of types $i$ and $j$ there is an interaction<br />
radius $r_{ij}$ and interaction parameter $\gamma_{ij}$. In simple terms, each pair of points, with marks $i$ and $j$<br />
say, contributes an interaction term $\gamma_{ij}$ if the distance between them is less than the interaction<br />
distance $r_{ij}$. These parameters must satisfy $r_{ij} = r_{ji}$ and $\gamma_{ij} = \gamma_{ji}$. The conditional intensity is<br />
$$\lambda((u,i),y) = \beta_i \prod_j \gamma_{i,j}^{t_{i,j}(u,y)} \qquad (58)$$<br />
where $t_{i,j}(u,y)$ is the number of points in $y$, with mark equal to $j$, lying within a distance $r_{i,j}$<br />
of the location $u$.<br />
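To make (58) concrete, the following base R sketch evaluates the conditional intensity directly from its definition. It is an illustration under invented values (the function name cond_intensity, the three points and the parameter values are all assumptions), not the code used by ppm.<br />

```r
# Conditional intensity (58): lambda((u,i), y) = beta_i * prod_j gamma_ij ^ t_ij(u, y)
# cond_intensity and its arguments are hypothetical names used only for this sketch.
cond_intensity <- function(u, i, pts, beta, gam, r) {
  # pts: data frame with columns x, y (coordinates) and mark (type label)
  d <- sqrt((pts$x - u[1])^2 + (pts$y - u[2])^2)
  value <- beta[[i]]
  for (j in names(beta)) {
    t_ij <- sum(pts$mark == j & d <= r[i, j])  # type-j points within r_ij of u
    value <- value * gam[i, j]^t_ij            # R evaluates 0^0 as 1
  }
  value
}

types <- c("off", "on")
beta  <- c(off = 2, on = 3)                                   # made-up values
r     <- matrix(c(30, 60, 60, 30), 2, 2, dimnames = list(types, types))
gam   <- matrix(c(0, 0.8, 0.8, 0), 2, 2, dimnames = list(types, types))

pts <- data.frame(x = c(0, 50, 120), y = c(0, 0, 0),
                  mark = c("off", "on", "on"))

# One "on" point lies within 60 units of u and no "off" point within 30 units
lam_off <- cond_intensity(c(40, 0), "off", pts, beta, gam, r)
# An "on" point lies within 30 units of u and gamma_on,on = 0 (hard core)
lam_on  <- cond_intensity(c(40, 0), "on", pts, beta, gam, r)
```

With these values lam_off equals $2 \times 0.8 = 1.6$, while lam_on is zero because a new point of type on at $u$ would violate the hard core between points of that type.<br />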
To fit the stationary multitype Strauss process to the dataset betacells, we must specify<br />
the matrix of interaction radii $r_{ij}$:<br />
> data(betacells)<br />
> r <- matrix(c(30, 60, 60, 30), nrow = 2, ncol = 2)<br />
> ppm(betacells, ~1, MultiStrauss(c("off", "on"), r), rbord = 60)<br />
Stationary Multitype Strauss process<br />
Possible marks:<br />
off on<br />
First order terms:<br />
beta_off beta_on<br />
0.0001373652 0.0001373652<br />
Interaction: Pairwise interaction family<br />
Interaction: Multitype Strauss process<br />
2 types of points<br />
Possible types:<br />
[1] "off" "on"<br />
Interaction radii:<br />
off on<br />
off 30 60<br />
on 60 30<br />
Fitted interaction parameters gamma_ij:<br />
off on<br />
off 0.0000 0.8303<br />
on 0.8303 0.0000<br />
Relevant coefficients:<br />
markoffxoff markoffxon markonxon<br />
-17.2378706 -0.1860184 -17.2138383<br />
To fit a nonstationary multitype Strauss process with log-cubic polynomial trend:<br />
> ppm(betacells, ~polynom(x, y, 3), MultiStrauss(c("off", "on"),<br />
+ r), rbord = 60)<br />
For more detailed explanation and examples of modelling and the interpretation of model<br />
formulae for point processes, see [5].<br />
27.3.3 Plotting the fitted interaction<br />
The fitted pairwise interaction in a point process model can be plotted using fitin. The value<br />
returned by fitin is a function array (class "fasp").<br />
> model <- ppm(betacells, ~1, MultiStrauss(c("off", "on"), r), rbord = 60)<br />
> plot(fitin(model))<br />
[Figure: Fitted pairwise interactions. A 2 × 2 array of panels indexed by the types off and on, each plotting the pairwise interaction (vertical axis, 0 to 1) against distance (horizontal axis, 0 to 60).]<br />
28 Line segment data<br />
spatstat also has some facilities for handling spatial patterns of line segments.<br />
For example, the copper dataset in spatstat contains a line segment pattern copper$Lines that records<br />
the locations of geological faults in a survey region.<br />
> data(copper)<br />
> L <- copper$Lines<br />
> L <- rotate(L, pi/2)<br />
> plot(L)<br />
[Figure: plot of the line segment pattern L.]<br />
A spatial pattern of line segments is represented by an object of class "psp". It consists of<br />
a list of line segments (given by the coordinates of their two endpoints), and a window in which<br />
the line segments were observed. The line segments may also carry marks.<br />
Objects of class "psp" can be created by the function psp or obtained by converting other<br />
data using the function as.psp.<br />
Capabilities available for this class include:<br />
[.psp subset operator (also performs clipping)<br />
marks.psp extract marks<br />
endpoints.psp extract endpoints of line segments<br />
midpoints.psp compute midpoints of line segments<br />
lengths.psp compute lengths of line segments<br />
angles.psp compute angles of orientation for line segments<br />
rotate.psp rotate a line segment pattern<br />
shift.psp shift a line segment pattern<br />
affine.psp apply affine transformation<br />
pairdist.psp distances between all pairs of line segments in a pattern<br />
crossdist.psp distances between line segments in two patterns<br />
nndist.psp nearest-neighbour distances between line segments<br />
density.psp kernel-smoothed intensity image<br />
crossing.psp find intersection points between line segments of two patterns<br />
selfcrossing.psp find intersection points between line segments within a pattern<br />
unitname.psp determine units of length<br />
rescale.psp change units of length<br />
rshift.psp apply random shift to each line segment<br />
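For intuition about what some of these methods compute, the following base R sketch derives lengths, midpoints and orientation angles directly from endpoint coordinates. The data frame seg and its values are made up, and the angle convention (anticlockwise from the x axis, reduced modulo $\pi$ so that direction is ignored) is an assumption; consult help(angles.psp) for the convention actually used by spatstat.<br />

```r
# Three hypothetical segments, one row each, running from (x0, y0) to (x1, y1)
seg <- data.frame(x0 = c(0, 1, 2), y0 = c(0, 1, 0),
                  x1 = c(3, 1, 0), y1 = c(4, 3, 2))

dx <- seg$x1 - seg$x0
dy <- seg$y1 - seg$y0

len <- sqrt(dx^2 + dy^2)                  # segment lengths
mid <- cbind(x = (seg$x0 + seg$x1) / 2,   # segment midpoints
             y = (seg$y0 + seg$y1) / 2)
ang <- atan2(dy, dx) %% pi                # undirected orientation in [0, pi)

print(data.frame(length = len, mid_x = mid[, "x"], mid_y = mid[, "y"],
                 angle = ang))
```

Quartiles of vectors such as len and ang are the kind of quantities reported under "Lengths" and "Angles (radians)" by summary for a line segment pattern.<br />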
There are also the usual methods<br />
plot.psp plot a line segment pattern<br />
print.psp print information on a line segment pattern<br />
summary.psp compute summary of a line segment pattern<br />
> summary(L)<br />
146 line segments<br />
Lengths:<br />
Min. 1st Qu. Median Mean 3rd Qu. Max.<br />
0.09242 6.61400 12.18000 15.02000 19.95000 65.48000<br />
Total length: 2192.57251480451 km<br />
Length per unit area: 0.196937548404655<br />
Angles (radians):<br />
Min. 1st Qu. Median Mean 3rd Qu. Max.<br />
0.008107 0.549500 1.747000 1.378000 2.113000 2.912000<br />
Window: polygonal boundary<br />
single connected closed polygon with 4 vertices<br />
enclosing rectangle: [-158.23, -0.19] x [-0.335, 70.11] km<br />
Window area = 11133.3 square km<br />
Unit of length: 1 km<br />
> plot(distmap(L))<br />
> plot(L, add = TRUE)<br />
[Figure: distmap(L), the distance map of the line segment pattern (scale 0 to 14), with the segments of L overlaid.]<br />
29 Further information on spatstat<br />
Help files<br />
For information on a particular command in spatstat, consult the online help file by typing<br />
help(command). The help files are detailed and extensive. The complete manual is over 500<br />
pages.<br />
For examples of the use of a particular command, read the examples section in the help file,<br />
or type example(command) to see the examples executed.<br />
Quick reference<br />
Type help(spatstat) for a quick-reference overview of all the functions available in the package.<br />
For a demonstration of many of the capabilities of spatstat, type demo(spatstat).<br />
For a visual display of all the datasets supplied in spatstat, type demo(data).<br />
Website<br />
The website www.spatstat.org contains information on recent updates to the package, frequently asked<br />
questions, bug fixes, literature and other developments.<br />
Modelling<br />
For examples of fitting point process models, see [5].<br />
Citation<br />
If you use spatstat in a research publication, it would be much appreciated if you could cite<br />
the paper [4], or mention spatstat in the acknowledgements.<br />
In doing so, you will help us to justify the expenditure of time and effort on maintaining and<br />
developing the package.<br />
Citation details are also available in the package by typing citation(package="spatstat").<br />
Queries and requests<br />
If you have difficulty in getting the package to do what you want, or if you have a suggestion for<br />
additional features that could be added, please contact the package authors, adrian@maths.uwa.edu.au<br />
and r.turner@auckland.ac.nz, or email the R special interest group in spatial and geographical<br />
statistics, r-sig-geo@stat.math.ethz.ch.<br />
References<br />
[1] A. Baddeley, J. Møller, and A.G. Pakes. Properties of residuals for spatial point processes. Annals of the Institute of Statistical Mathematics, 60:627–649, 2008.<br />
[2] A. Baddeley, J. Møller, and R. Waagepetersen. Non- and semiparametric estimation of interaction in inhomogeneous point patterns. Statistica Neerlandica, 54(3):329–350, November 2000.<br />
[3] A. Baddeley and R. Turner. Practical maximum pseudolikelihood for spatial point patterns (with discussion). Australian and New Zealand Journal of Statistics, 42(3):283–322, 2000.<br />
[4] A. Baddeley and R. Turner. Spatstat: an R package for analyzing spatial point patterns. Journal of Statistical Software, 12(6):1–42, 2005. URL: www.jstatsoft.org, ISSN: 1548-7660.<br />
[5] A. Baddeley and R. Turner. Modelling spatial point patterns in R. In A. Baddeley, P. Gregori, J. Mateu, R. Stoica, and D. Stoyan, editors, Case Studies in Spatial Point Pattern Modelling, number 185 in Lecture Notes in Statistics, pages 23–74. Springer-Verlag, New York, 2006. ISBN: 0-387-28311-0.<br />
[6] A. Baddeley, R. Turner, J. Møller, and M. Hazelton. Residual analysis for spatial point processes (with discussion). Journal of the Royal Statistical Society, series B, 67(5):617–666, 2005.<br />
[7] A.J. Baddeley. Spatial sampling and censoring. In O.E. Barndorff-Nielsen, W.S. Kendall, and M.N.M. van Lieshout, editors, Stochastic Geometry: Likelihood and Computation, chapter 2, pages 37–78. Chapman and Hall, London, 1998.<br />
[8] A.J. Baddeley and J. Møller. Nearest-neighbour Markov point processes and random sets. International Statistical Review, 57:89–121, 1989.<br />
[9] A.J. Baddeley, R.A. Moyeed, C.V. Howard, and A. Boyde. Analysis of a three-dimensional point pattern with replication. Applied Statistics, 42(4):641–668, 1993.<br />
[10] A.J. Baddeley and B.W. Silverman. A cautionary example on the use of second-order methods for analyzing point patterns. Biometrics, 40:1089–1094, 1984.<br />
[11] A.J. Baddeley and M.N.M. van Lieshout. Area-interaction point processes. Annals of the Institute of Statistical Mathematics, 47:601–619, 1995.<br />
[12] G. Barnard. Contribution to discussion of “The spectral analysis of point processes” by M.S. Bartlett. Journal of the Royal Statistical Society, series B, 25:294, 1963.<br />
[13] M. Bell and G. Grunwald. Mixed models for the analysis of replicated spatial point patterns. Biostatistics, 5:633–648, 2004.<br />
[14] M. Berman and T.R. Turner. Approximating point process likelihoods with GLIM. Applied Statistics, 41:31–38, 1992.<br />
[15] J. Besag and P.J. Diggle. Simple Monte Carlo tests for spatial pattern. Applied Statistics, 26:327–333, 1977.<br />
[16] J.E. Besag and P. Clifford. Generalized Monte Carlo significance tests. Biometrika, 76:633–642, 1989.<br />
[17] R. Bivand, E.J. Pebesma, and V. Gómez-Rubio. Applied spatial data analysis with R. Springer, 2008.<br />
[18] D.R. Brillinger. Comparative aspects of the study of ordinary time series and of point processes. In P.R. Krishnaiah, editor, Developments in Statistics, pages 33–133. Academic Press, 1978.<br />
[19] N.A.C. Cressie. Statistics for Spatial Data. John Wiley and Sons, New York, 1991.<br />
[20] D.J. Daley and D. Vere-Jones. An Introduction to the Theory of Point Processes. Springer Verlag, New York, 1988.<br />
[21] P.J. Diggle. Statistical analysis of spatial point patterns. Academic Press, London, 1983.<br />
[22] P.J. Diggle. A point process modelling approach to raised incidence of a rare phenomenon in the vicinity of a prespecified point. Journal of the Royal Statistical Society, series A, 153:349–362, 1990.<br />
[23] P.J. Diggle. Statistical Analysis of Spatial Point Patterns. Arnold, second edition, 2003.<br />
[24] P.J. Diggle, N. Lange, and F.M. Benes. Analysis of variance for replicated spatial point patterns in clinical neuroanatomy. Journal of the American Statistical Association, 86:618–625, 1991.<br />
[25] P.J. Diggle, J. Mateu, and H.E. Clough. A comparison between parametric and nonparametric approaches to the analysis of replicated spatial point patterns. Advances in Applied Probability (SGSA), 32:331–343, 2000.<br />
[26] P.J. Diggle and B. Rowlingson. A conditional approach to point process modelling of elevated risk. Journal of the Royal Statistical Society, series A (Statistics in Society), 157(3):433–440, 1994.<br />
[27] M. Dwass. Modified randomization tests for nonparametric hypotheses. Annals of Mathematical Statistics, 28:181–187, 1957.<br />
[28] A.C.A. Hope. A simplified Monte Carlo significance test procedure. Journal of the Royal Statistical Society, series B, 30:582–598, 1968.<br />
[29] C.V. Howard, S. Reid, A.J. Baddeley, and A. Boyde. Unbiased estimation of particle density in the tandem-scanning reflected light microscope. Journal of Microscopy, 138:203–212, 1985.<br />
[30] F. Huang and Y. Ogata. Improvements of the maximum pseudo-likelihood estimators in various spatial statistical models. Journal of Computational and Graphical Statistics, 8(3):510–530, 1999.<br />
[31] J. Illian, A. Penttinen, H. Stoyan, and D. Stoyan. Statistical analysis and modelling of spatial point patterns. Wiley, 2008.<br />
[32] J.F.C. Kingman. Poisson Processes. Oxford University Press, 1993.<br />
[33] G.M. Laslett. Censoring and edge effects in areal and line transect sampling of rock joint traces. Mathematical Geology, 14:125–140, 1982.<br />
[34] P.A.W. Lewis. Recent results in the statistical analysis of univariate point processes. In P.A.W. Lewis, editor, Stochastic point processes, pages 1–54. Wiley, New York, 1972.<br />
[35] J.K. Lindsey. The analysis of stochastic processes using GLIM. Springer, Berlin, 1992.<br />
[36] N. Metropolis, A.W. Rosenbluth, M.N. Rosenbluth, A.H. Teller, and E. Teller. Equation of state calculations by fast computing machines. Journal of Chemical Physics, 21:1087–1092, 1953.<br />
[37] J. Møller and R.P. Waagepetersen. Statistical Inference and Simulation for Spatial Point Processes. Chapman and Hall/CRC, Boca Raton, 2003.<br />
[38] J. Møller and R.P. Waagepetersen. Modern statistics for spatial point processes. Research Report R-2006-12, Department of Mathematical Sciences, Aalborg University, April 2006. Submitted for publication.<br />
[39] Y. Ogata. Statistical models for earthquake occurrences and residual analysis for point processes. Journal of the American Statistical Association, 83:9–27, 1988.<br />
[40] B.D. Ripley. Modelling spatial patterns (with discussion). Journal of the Royal Statistical Society, series B, 39:172–212, 1977.<br />
[41] B.D. Ripley. Simulating spatial patterns: dependent samples from a multivariate density. Applied Statistics, 28:109–112, 1979.<br />
[42] B.D. Ripley. Spatial Statistics. John Wiley and Sons, New York, 1981.<br />
[43] B.D. Ripley. Statistical Inference for Spatial Processes. Cambridge University Press, 1988.<br />
[44] A. Särkkä. Pseudo-likelihood approach for pair potential estimation of Gibbs processes. Number 22 in Jyväskylä Studies in Computer Science, Economics and Statistics. University of Jyväskylä, 1993.<br />
[45] D. Stoyan and P. Grabarnik. Second-order characteristics for stochastic structures connected with Gibbs point processes. Mathematische Nachrichten, 151:95–100, 1991.<br />
[46] D. Stoyan and H. Stoyan. Fractals, Random Shapes and Point Fields. John Wiley and Sons, Chichester, 1995.<br />
[47] M.N.M. van Lieshout. Markov Point Processes and their Applications. Imperial College Press, 2000.<br />
[48] M.N.M. van Lieshout and A.J. Baddeley. A nonparametric measure of spatial interaction in point patterns. Statistica Neerlandica, 50:344–361, 1996.<br />
[49] R. Waagepetersen. An estimating function approach to inference for inhomogeneous Neyman-Scott processes. Submitted for publication, 2006.<br />
Index<br />
analysis of deviance, 88<br />
area-interaction process, 135<br />
binary mask, 28, 41<br />
circular windows, 39<br />
classes, 27<br />
in R, 27<br />
in spatstat, 27<br />
clickppp, 25<br />
cluster models<br />
fitting, 125, 130<br />
inhomogeneous, 130<br />
fitting, 130<br />
complete spatial randomness, 72<br />
and independence, 157, 179<br />
definition, 72<br />
Kolmogorov-Smirnov test, 75<br />
quadrat counting test, 74<br />
conditional intensity, 136<br />
for marked point processes, 185<br />
contrasts, 82, 183<br />
covariate effects, 9<br />
covariates, 7, 16, 82<br />
in ppm, 82<br />
Cox process, 99<br />
CSRI, 157, 179<br />
conditional intensity, 185<br />
fitting to data, 182<br />
simulating, 180<br />
data entry, 33<br />
at the terminal, 34<br />
basic, 33, 34<br />
checking, 36<br />
from file, 34<br />
GIS formats, 42<br />
marked point patterns, 160<br />
marks, 35<br />
point-and-click, 25<br />
datasets<br />
inspecting, 20<br />
provided in spatstat, 25<br />
dispatching, 27<br />
distance methods, 102<br />
distances<br />
empty space, 102, 103<br />
nearest neighbour, 102, 109<br />
pairwise, 102, 111<br />
distmap, 102<br />
edge effects, 104<br />
empty space distances, 102, 103<br />
empty space function, 104<br />
envelopes, 119<br />
and Monte Carlo tests, 119<br />
for any fitted model, 123<br />
for any simulation procedure, 124<br />
in spatstat, 120<br />
of summary functions, 119<br />
exploratory data analysis, 21<br />
for marked point patterns, 165<br />
fitted model, 142<br />
goodness-of-fit, 90, 148<br />
interpretation of coefficients, 82<br />
methods for, 84<br />
residuals, 91, 150<br />
simulation of, 89<br />
fitting models<br />
by Huang-Ogata method, 147<br />
kppm, 125, 130<br />
maximum pseudolikelihood, 139<br />
to marked point patterns, 182, 187<br />
via summary statistics, 125<br />
fv, 32<br />
geometrical transformations, 50<br />
Gibbs models, 132<br />
area-interaction, 135<br />
Diggle-Gates-Stibbard, 135<br />
Diggle-Gratton, 135<br />
fitting, 139<br />
by Huang-Ogata method, 147<br />
maximum pseudolikelihood, 139<br />
ppm, 139<br />
fitting to marked point patterns, 187<br />
goodness-of-fit, 148<br />
hard core process, 133<br />
in spatstat, 141<br />
infinite order interaction, 135<br />
multitype, 185<br />
maximum pseudolikelihood, 187<br />
multitype pairwise interaction, 185<br />
INDEX 197<br />
pairwise <strong>in</strong>teraction, 135<br />
residuals, 150<br />
simulation, 137<br />
simulation <strong>of</strong> fitted model, 144<br />
s<strong>of</strong>t core, 135<br />
Strauss process, 134<br />
Strauss-hard core, 135<br />
GIS formats, 42<br />
goodness-<strong>of</strong>-fit, 90<br />
for fitted Gibbs model, 148<br />
for Poisson models, 90<br />
hard core process, 133<br />
multitype, 186<br />
Huang-Ogata method, 147<br />
im, 27, 54<br />
images, 54<br />
computing with, 59<br />
creating, 54<br />
from raw data, 54<br />
exploratory inspection of, 58<br />
extracting subset, 58<br />
plotting, 56<br />
returned by a function, 55<br />
independence of components, 157, 176<br />
intensity<br />
function, 67<br />
kernel estimator, 67<br />
homogeneous, 66<br />
inhomogeneous, 67<br />
investigation of, 66<br />
measure, 67<br />
of marked point process, 165<br />
interaction, 8, 11<br />
distance methods, 102<br />
in spatstat, 141<br />
multitype, 185, 187<br />
in spatstat, 187<br />
plotting a fitted interaction, 189<br />
Q–Q plot, 153<br />
simple models, 98<br />
summary functions, 102<br />
K function, 22, 111<br />
for multitype point pattern, 169<br />
inhomogeneous, 128<br />
kernel estimator of intensity, 67, 68<br />
kernel smoothing of marks, 167<br />
Kolmogorov-Smirnov test<br />
of CSR, 75<br />
of inhomogeneous Poisson, 91<br />
kppm, 125, 130<br />
line segments, 190<br />
lurking variable plot, 93<br />
maptools package, 42<br />
mark correlation function, 174<br />
marked point patterns<br />
cutting marks into bands, 163<br />
data entry, 160<br />
exploratory data analysis, 165<br />
exploring marks, 167<br />
inspecting, 161<br />
joint and conditional analysis, 157<br />
manipulating, 163<br />
methodological issues, 157<br />
model-fitting, 182, 187<br />
probabilistic formulation, 156<br />
randomisation tests, 157<br />
separating into types, 163<br />
summary functions, 169<br />
marked point process<br />
intensity, 165<br />
marks, 6, 15, 156<br />
categorical, 35<br />
data entry, 33, 35<br />
exploratory data analysis, 167<br />
manipulating, 163<br />
operations on, 49<br />
smoothing, 167<br />
spatial trend in, 167<br />
versus covariates, 15<br />
markstat, 169<br />
marktable, 168<br />
Matern cluster process, 98<br />
maximum likelihood, 79<br />
maximum pseudolikelihood, 139, 187<br />
for multitype Gibbs models, 187<br />
improvements over, 147<br />
methods, 27<br />
default method, 29<br />
dispatch, 27<br />
minimum contrast, 125<br />
model validation, 90, 148<br />
Monte Carlo test, 119<br />
pointwise, 120<br />
simultaneous, 121<br />
multitype hard core process, 186<br />
multitype point pattern, 10, 11, 22, 35<br />
multitype point patterns<br />
separating into types, 163<br />
summary functions, 169<br />
multitype Strauss process, 186<br />
nearest neighbour distances, 102, 109<br />
nndist, 102<br />
nuisance parameters, 145<br />
owin, 27, 39<br />
pairdist, 102<br />
pairwise distances, 102, 111<br />
pairwise interaction process, 133<br />
point pattern, 6<br />
marked, 156<br />
marks, 6, 15<br />
multitype, 10, 11<br />
needs window, 47<br />
point process model for, 13<br />
standard model, 14<br />
point process, 13<br />
point process models<br />
area-interaction, 135<br />
Diggle-Gates-Stibbard, 135<br />
Diggle-Gratton, 135<br />
Gibbs, 132<br />
hard core, 133<br />
infinite order interaction, 135<br />
pairwise interaction, 133, 135<br />
soft core, 135<br />
Strauss, 134<br />
Strauss-hard core, 135<br />
Poisson cluster processes, 98<br />
Poisson models<br />
fitt<strong>in</strong>g, 80<br />
goodness-of-fit, 90<br />
homogeneous, 72<br />
inhomogeneous, 79<br />
log-likelihood, 80<br />
marked, 179<br />
maximum likelihood, 79<br />
residuals, 91<br />
Poisson point process<br />
homogeneous<br />
definition, 72<br />
simulation, 72<br />
inhomogeneous<br />
definition, 79<br />
fitting, 80<br />
likelihood, 80<br />
motivation, 79<br />
simulation, 79<br />
Poisson-derived models, 98<br />
polygonal windows, 28, 40<br />
ppm, 84, 142<br />
marked Gibbs point process models, 187<br />
marked Poisson point process models, 182<br />
methods for, 84<br />
ppp, 27<br />
combining several, 53<br />
extracting subset, 48<br />
format, 46<br />
geometrical transformations, 50<br />
in arbitrary window, 44<br />
manipulating, 46<br />
needs window, 47<br />
operations on, 47<br />
random perturbations, 50<br />
ways to make, 37<br />
probability density, 132<br />
profile pseudolikelihood, 145<br />
pseudolikelihood, 139<br />
profile pseudolikelihood, 145<br />
quadrat counting, 21, 67<br />
quadrat counting test<br />
of CSR, 74<br />
quadrat test<br />
of inhomogeneous Poisson, 90<br />
R, 17<br />
contributed packages, 18<br />
for spatial data formats, 42<br />
for spatial statistics, 18<br />
where to get, 17<br />
random labelling, 157, 177<br />
random perturbations, 50<br />
random thinning, 79<br />
randomisation tests, 157, 175<br />
for marked point patterns, 175<br />
rectangular windows, 28, 39<br />
residuals, 91, 150<br />
for fitted Gibbs model, 150<br />
for Poisson models, 91<br />
lurking variable plot, 93<br />
Q–Q plot, 151<br />
smoothed residual field, 93<br />
return value, 30<br />
rpoispp, 72, 79<br />
runifpoint, 73<br />
sequential models, 100<br />
shapefiles, 42<br />
shapefiles package, 42<br />
simulation<br />
of fitted Gibbs model, 144<br />
of fitted Poisson model, 89<br />
smoothed residual field, 93<br />
sp package, 42<br />
spatstat, 19, 192<br />
citing, 19<br />
getting started, 19<br />
installing, 19<br />
split, 24<br />
standard model, 14<br />
Strauss process, 134<br />
fitting to data, 140<br />
multitype, 186<br />
summary functions, 102<br />
and Monte Carlo tests, 119<br />
critique, 117<br />
edge effects, 104<br />
envelopes, 119<br />
F, 104<br />
for multitype point patterns, 169<br />
G, 109<br />
inference using, 119<br />
inhomogeneous K, 128<br />
J, 114<br />
K, 111<br />
L, 112<br />
mark correlation, 174<br />
model-fitting with, 125<br />
pair correlation, 112<br />
tests<br />
χ² quadrat counting, 74<br />
Kolmogorov-Smirnov, 75, 91<br />
Monte Carlo, 119<br />
thinning, 99<br />
Thomas process, 98<br />
tips, 27, 31, 36, 48, 103, 106, 121, 160<br />
treatment contrasts, 82<br />
unitname, 37<br />
units of length, 37<br />
validation, 90, 148<br />
windows, 39<br />
binary mask, 28, 41<br />
circular, 39<br />
GIS formats, 42<br />
needed in any point pattern, 47<br />
operations on, 44<br />
polygonal, 28, 40<br />
rectangular, 28, 39<br />
returned by functions, 43<br />
χ² quadrat counting test, 74<br />