28.07.2013 Views

Project Proposal (PDF) - Oxford Brookes University

Project Proposal (PDF) - Oxford Brookes University

Project Proposal (PDF) - Oxford Brookes University

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

FP7-ICT-2011-9 STREP proposal<br />

18/01/12 v1 [Dynact]<br />

multiple actions performed by the same person.<br />

Breakthrough #3: presence of covariance factors and recognition "in the wild". In order to<br />

bring action recognition out of our research labs, and towards extensive deployment in the real world, the<br />

issue of the manifold nuisance factors that affect human motion has to be somehow addressed. Again, we<br />

propose two parallel research lines to tackle this problem, one based on imprecise-probabilistic graphical<br />

models, classified in a flexible, adaptive way by learning classifiers from the actual distribution of the data in<br />

the training set; the other, founded on flexible spatio-temporal discriminative models.<br />

Breakthrough #4: inherent variability and overfitting due to the limited size of training sets. In<br />

a machine learning framework, the ability to classify (in our case, to recognize actions or activities) is<br />

acquired from a training set of (labelled or unlabeled) examples. Unfortunately, collecting (and labelling) a<br />

large training set is a long and costly procedure: fully processed training sets for which ground truth is<br />

provided (think action labels and the corresponding bounding boxes in the video sequence) are a scarce<br />

resource. On the other hand a massive amount of unlabeled data is theoretically available, but in practise<br />

cannot be used for supervised learning of motions. We propose to cope with the overfitting issue by<br />

introducing a more robust, flexible class of "imprecise-probabilistic" graphical models which allow for<br />

uncertainty in the determination of the probabilities which characterize the model.<br />

Breakthrough #5: moving from elementary actions to complex activities. Finally, in order to<br />

learn and recognize more complex, sophisticated activities, we need to incorporate some sort of description<br />

of the activity itself: in this project we propose to do so by both developing new spatio-temporal<br />

discriminative models, and by moving to a more flexible class of "imprecise" stochastic generative models.<br />

1.3.2 Overview of the S/T methodology and project articulation into Work Packages<br />

To achieve such a goal, we articulate our project into three parallel lines of research, which amount to the<br />

development of novel, ground breaking methodologies in three distinct scientific fields: 1 – structured<br />

discriminative models, i.e., collections of discriminative models able to describe the spatio-temporal<br />

structure of a motion; 2 – a fully fledged theory of imprecise-probabilistic graphical models, capable of<br />

retaining the good features of traditional graphical models, while allowing more flexibility in the modelling<br />

the inherent variability of actions, especially considering the limited size of any realistic training set; 3 – a<br />

novel theory of classification for both precise and imprecise graphical models, able to cope with the large<br />

number of nuisance factors and push towards action recognition “in the wild”.<br />

The project is split into six interdependent work packages (WPs), one for each of the three methodological<br />

developments, plus one dedicated to the necessary analysis of novel algorithms for the extraction of suitable<br />

features from video/range sequences, one explicitly devoted to the integration and validation in various<br />

scenarios of the algorithms developed in the three research pipelines, and one for project administration.<br />

The goal of WP1 is the development of a theory of imprecise-probabilistic graphical models, in<br />

particular learning and inference techniques which generalize classical methods for models based on sharp,<br />

“precise” assessments. The first step is the identification of a reliable technique to learn imprecise models<br />

from incomplete data, via the generalisation of the classical Expectation-Maximization algorithm to the<br />

imprecise-probabilistic framework, and its specialisation to imprecise hidden Markov models (iHMMs).<br />

Later on these tools will be extended to general imprecise-probabilistic graphical models. This concerns both<br />

the case of directed models, for which the class of “credal networks” has already been formalised and<br />

studied, and the case of Markov Random Fields (MRFs), for which an extension to imprecise probabilities<br />

represents a truly groundbreaking research challenge. This will give more flexibility to the description of the<br />

motion's dynamics (e.g., by relaxing the Markovian assumption to support more complex topologies).<br />

The aim of WP2 is to develop a theory of classification for both imprecise-probabilistic and classical<br />

generative models, able to tackle the issue with nuisance factors. To classify a new instance of an action we<br />

need to compare the corresponding generative model with those already stored in a labelled training set. This<br />

requires the development of dissimilarity measures for generative models, both precise and imprecise. In<br />

particular we will investigate techniques for learning the most appropriate distance measure given a training<br />

set. In the second part of WP2, these measures will be applied to classification proper, by exploring the use<br />

of innovative frameworks such as structured learning and compressed sensing.<br />

The aim of WP3 is to develop a novel class of discriminative models able to explicitly capture the<br />

spatio-temporal structure of human motions. We can decompose this task into two parts. First, we need to<br />

identify the “most discriminative” (in the sense that they contribute the most to discriminating the presence<br />

of an action of a certain class) parts of an action/activity: we propose to use Multiple Instance Learning<br />

(MIL) techniques to learn such action parts from a weakly labelled training set. The second task is to arrange<br />

these parts in a coherent spatio-temporal structure. We will build on recent advanced in the field of 2D object<br />

<strong>Proposal</strong> Part B: page [12] of [67]

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!