Project Proposal (PDF) - Oxford Brookes University
Project Proposal (PDF) - Oxford Brookes University
Project Proposal (PDF) - Oxford Brookes University
You also want an ePaper? Increase the reach of your titles
YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.
FP7-ICT-2011-9 STREP proposal<br />
18/01/12 v1 [Dynact]<br />
multiple actions performed by the same person.<br />
Breakthrough #3: presence of covariance factors and recognition "in the wild". In order to<br />
bring action recognition out of our research labs, and towards extensive deployment in the real world, the<br />
issue of the manifold nuisance factors that affect human motion has to be somehow addressed. Again, we<br />
propose two parallel research lines to tackle this problem, one based on imprecise-probabilistic graphical<br />
models, classified in a flexible, adaptive way by learning classifiers from the actual distribution of the data in<br />
the training set; the other, founded on flexible spatio-temporal discriminative models.<br />
Breakthrough #4: inherent variability and overfitting due to the limited size of training sets. In<br />
a machine learning framework, the ability to classify (in our case, to recognize actions or activities) is<br />
acquired from a training set of (labelled or unlabeled) examples. Unfortunately, collecting (and labelling) a<br />
large training set is a long and costly procedure: fully processed training sets for which ground truth is<br />
provided (think action labels and the corresponding bounding boxes in the video sequence) are a scarce<br />
resource. On the other hand a massive amount of unlabeled data is theoretically available, but in practise<br />
cannot be used for supervised learning of motions. We propose to cope with the overfitting issue by<br />
introducing a more robust, flexible class of "imprecise-probabilistic" graphical models which allow for<br />
uncertainty in the determination of the probabilities which characterize the model.<br />
Breakthrough #5: moving from elementary actions to complex activities. Finally, in order to<br />
learn and recognize more complex, sophisticated activities, we need to incorporate some sort of description<br />
of the activity itself: in this project we propose to do so by both developing new spatio-temporal<br />
discriminative models, and by moving to a more flexible class of "imprecise" stochastic generative models.<br />
1.3.2 Overview of the S/T methodology and project articulation into Work Packages<br />
To achieve such a goal, we articulate our project into three parallel lines of research, which amount to the<br />
development of novel, ground breaking methodologies in three distinct scientific fields: 1 – structured<br />
discriminative models, i.e., collections of discriminative models able to describe the spatio-temporal<br />
structure of a motion; 2 – a fully fledged theory of imprecise-probabilistic graphical models, capable of<br />
retaining the good features of traditional graphical models, while allowing more flexibility in the modelling<br />
the inherent variability of actions, especially considering the limited size of any realistic training set; 3 – a<br />
novel theory of classification for both precise and imprecise graphical models, able to cope with the large<br />
number of nuisance factors and push towards action recognition “in the wild”.<br />
The project is split into six interdependent work packages (WPs), one for each of the three methodological<br />
developments, plus one dedicated to the necessary analysis of novel algorithms for the extraction of suitable<br />
features from video/range sequences, one explicitly devoted to the integration and validation in various<br />
scenarios of the algorithms developed in the three research pipelines, and one for project administration.<br />
The goal of WP1 is the development of a theory of imprecise-probabilistic graphical models, in<br />
particular learning and inference techniques which generalize classical methods for models based on sharp,<br />
“precise” assessments. The first step is the identification of a reliable technique to learn imprecise models<br />
from incomplete data, via the generalisation of the classical Expectation-Maximization algorithm to the<br />
imprecise-probabilistic framework, and its specialisation to imprecise hidden Markov models (iHMMs).<br />
Later on these tools will be extended to general imprecise-probabilistic graphical models. This concerns both<br />
the case of directed models, for which the class of “credal networks” has already been formalised and<br />
studied, and the case of Markov Random Fields (MRFs), for which an extension to imprecise probabilities<br />
represents a truly groundbreaking research challenge. This will give more flexibility to the description of the<br />
motion's dynamics (e.g., by relaxing the Markovian assumption to support more complex topologies).<br />
The aim of WP2 is to develop a theory of classification for both imprecise-probabilistic and classical<br />
generative models, able to tackle the issue with nuisance factors. To classify a new instance of an action we<br />
need to compare the corresponding generative model with those already stored in a labelled training set. This<br />
requires the development of dissimilarity measures for generative models, both precise and imprecise. In<br />
particular we will investigate techniques for learning the most appropriate distance measure given a training<br />
set. In the second part of WP2, these measures will be applied to classification proper, by exploring the use<br />
of innovative frameworks such as structured learning and compressed sensing.<br />
The aim of WP3 is to develop a novel class of discriminative models able to explicitly capture the<br />
spatio-temporal structure of human motions. We can decompose this task into two parts. First, we need to<br />
identify the “most discriminative” (in the sense that they contribute the most to discriminating the presence<br />
of an action of a certain class) parts of an action/activity: we propose to use Multiple Instance Learning<br />
(MIL) techniques to learn such action parts from a weakly labelled training set. The second task is to arrange<br />
these parts in a coherent spatio-temporal structure. We will build on recent advanced in the field of 2D object<br />
<strong>Proposal</strong> Part B: page [12] of [67]