considering autocorrelation in predictive models - Department of ...
Introduction
and cluster homogeneity (in terms of autocorrelation) at the same time, during the phase of adding a new
node to the predictive clustering tree.
The network (spatial and relational) setting that we address in this work uses both
the descriptive information (attributes) and the network structure during training, whereas in the testing
phase we use only the descriptive information and disregard the network structure. More specifically,
in the training phase, we assume that all examples are labeled and that the given network is complete.
In the testing phase, all testing examples are unlabeled and the network is not given. A key property of
our approach is that the network is not required in the testing phase, where we only
need the descriptive information. This can be very beneficial when predictions need to be made for
examples whose connections to other examples are not known or need to be confirmed. The more
common setting, where a network with some nodes labeled and some nodes unlabeled is given, can be
easily mapped to our setting: we can use the nodes with labels and the projection of the network on these
nodes for training, and only the unlabeled nodes, without network information, in the testing phase.
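The mapping from a partially labeled network to this setting can be sketched as follows. This is a minimal illustration, not the dissertation's implementation: the function name `split_partially_labeled` and the dictionary-based data layout are assumptions introduced here, and the "projection of the network" is taken to be the subgraph induced by the labeled nodes.

```python
def split_partially_labeled(edges, labels, attributes):
    """Map a partially labeled network to the train/test setting.

    edges:      iterable of (u, v) node pairs (undirected)
    labels:     dict node -> target value, for labeled nodes only
    attributes: dict node -> descriptive attribute vector, for all nodes
    (All names and the data layout are hypothetical.)
    """
    labeled = set(labels)
    # Training network: keep only edges whose endpoints are both labeled,
    # i.e. the projection (induced subgraph) on the labeled nodes.
    train_edges = [(u, v) for u, v in edges if u in labeled and v in labeled]
    train = {
        "attributes": {n: attributes[n] for n in labeled},
        "labels": dict(labels),
        "edges": train_edges,
    }
    # Testing data: unlabeled nodes with attributes only, no edges at all.
    test = {n: a for n, a in attributes.items() if n not in labeled}
    return train, test


edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "d")]
labels = {"a": 1.0, "b": 2.0, "d": 0.5}          # node "c" is unlabeled
attributes = {"a": [0.1], "b": [0.4], "c": [0.9], "d": [0.2]}
train, test = split_partially_labeled(edges, labels, attributes)
```

In this toy input, every edge touching the unlabeled node `c` is dropped from the training network, and `c` reaches the testing phase with its attributes alone.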
This network setting is very different from the existing approaches to network classification and
regression, where the descriptive information is typically tightly connected to the network structure.
The connections (edges in the network) between the data in the training/testing set are predefined for
a particular instance and are used to generate the descriptive information associated with the nodes of
the network (see, for example, Steinhaeuser et al., 2011). Therefore, in order to predict the value of
the response variable(s), besides the descriptive information, one needs the connections (edges in the
network) to related/similar entities. Our setting is also very different from what is typically done in network
analysis. There, the general focus is on exploring the structure of a network by calculating its
properties (e.g., the degrees of the nodes, the connectedness within the network, scalability, robustness,
etc.). The network properties are then fitted into an already existing mathematical (theoretical) network
(graph) model (Steinhaeuser et al., 2011).
From the predictive perspective, according to the tests in the tree, it is possible to associate an observation
(a test node of a network) with a cluster. The predictive model associated with the cluster can then
be used to predict its response value (or response values, in the case of multi-target tasks). From the
descriptive perspective, the tree models obtained by the proposed algorithm give us a hierarchical
view of the network, where clusters can be employed to design a federation of hierarchically
arranged networks.
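The predictive use of the tree can be illustrated with a small sketch. It assumes, purely for illustration, binary tests of the form `x[feature] <= threshold` and a constant prototype vector as the predictive model stored in each leaf cluster. The `Node` class and `predict` function are hypothetical names, not part of any PCT implementation.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    # Internal node: holds a test "x[feature] <= threshold".
    feature: Optional[int] = None
    threshold: Optional[float] = None
    left: Optional["Node"] = None    # branch taken when the test succeeds
    right: Optional["Node"] = None   # branch taken when the test fails
    # Leaf node: holds the cluster's prediction (assumed constant prototype,
    # one value per target, so multi-target tasks are covered).
    prediction: Optional[Sequence[float]] = None


def predict(node, x):
    """Route attribute vector x down the tree and return the leaf prediction."""
    while node.prediction is None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.prediction


tree = Node(
    feature=0, threshold=0.5,
    left=Node(prediction=[1.0, 10.0]),
    right=Node(
        feature=1, threshold=2.0,
        left=Node(prediction=[3.0, 30.0]),
        right=Node(prediction=[5.0, 50.0]),
    ),
)

# Only descriptive attributes are needed here; no network is consulted.
y = predict(tree, [0.7, 1.5])
```

Note that `predict` takes only the attribute vector of the test node, which matches the setting in which the network is unavailable in the testing phase.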
A hierarchical view of the network can be useful, for instance, in wireless sensor networks, where a
hierarchical structure is one of the possible ways to reduce the communication cost between the nodes
(Li et al., 2007). Moreover, it is possible to browse the generated clusters at different levels of the hierarchy,
where each cluster can naturally capture different effects of the autocorrelation phenomenon
on different portions of the network: at higher levels of the tree, clusters can capture autocorrelation
phenomena that are spread all over the network, while at lower levels of the tree, clusters
can be expected to capture local effects of autocorrelation. This gives us a way to consider non-stationary
autocorrelation.
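To make the contrast between network-wide and local autocorrelation concrete, the sketch below evaluates one autocorrelation statistic on a whole toy network and on a single cluster of it. The passage above does not say which measure is meant, so the choice of Moran's I, the helper names `morans_i` and `restrict`, and the toy data are all assumptions made for illustration.

```python
def morans_i(values, weights):
    """Moran's I for node values under pairwise weights.

    values:  dict node -> numeric value
    weights: dict (i, j) -> w_ij; pairs with an endpoint outside `values`
             are ignored, so the same weights can be reused per cluster.
    """
    mean = sum(values.values()) / len(values)
    dev = {node: v - mean for node, v in values.items()}
    pairs = [(i, j, w) for (i, j), w in weights.items() if i in dev and j in dev]
    w_sum = sum(w for _, _, w in pairs)
    denom = sum(d * d for d in dev.values())
    if w_sum == 0 or denom == 0:
        return float("nan")  # undefined without links or without variance
    num = sum(w * dev[i] * dev[j] for i, j, w in pairs)
    return (len(values) / w_sum) * num / denom


def restrict(values, cluster):
    """Keep only the values of the nodes that belong to one cluster."""
    return {n: values[n] for n in cluster}


values = {"a": 1.0, "b": 2.0, "c": 3.0, "d": 4.0}
weights = {}
for u, v in [("a", "b"), ("b", "c"), ("c", "d")]:
    weights[(u, v)] = 1.0   # symmetric unit weights on a path a-b-c-d
    weights[(v, u)] = 1.0

global_i = morans_i(values, weights)                        # whole network
local_i = morans_i(restrict(values, ["a", "b"]), weights)   # one cluster
```

The two statistics differ because each is centered on the mean of its own node set; this is the kind of location-dependent behavior that a hierarchy of clusters lets one inspect level by level.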
1.3 Contributions
The research presented in this dissertation extends the PCT framework towards learning from autocorrelated
data. We address important aspects of the problem of learning predictive models in the case when
the examples in the data are not i.i.d., such as the definition of autocorrelation measures for a variety of
learning tasks that we consider, the definition of autocorrelation-based heuristics, the development of algorithms
that use such heuristics for learning predictive models, as well as their experimental evaluation.
In our broad overview, we consider four different types of autocorrelation: spatial, temporal, spatio-