
FP7-ICT-2011-9 STREP proposal
18/01/12 v1 [Dynact]

Work package description

Work package number: 3
Start date or starting event: Month 1
Work package title: Dynamical discriminative modeling
Activity type [14]: RTD
Participant number: 1
Participant short name: OBU
Person-months per participant: 39

Objectives

The goal of this WP is to develop a novel framework of discriminative models able to explicitly capture the spatio-temporal structure of human motions. This involves:
– identifying the most discriminative parts of an action or activity via Multiple Instance Learning (MIL) from a weakly labeled training set;
– learning and classifying pictorial structure models representing constellations of elementary, most discriminative parts.

Description of work

The work can be articulated into the following tasks:

Task 3.1 – Multiple instance learning of discriminative action parts. Let an action part be defined as a Bag of Features (BoF) model bounded in a space-time cube. The task here is to learn models for a set of the most discriminative action parts, given a training set of video sequences in which an action class is assigned to each video clip, under the assumption that exactly one action occurs per clip. This is a weakly labeled scenario: it is known that a positive example of the action exists within the clip, but the exact location of the action is unknown. The learning task can then be cast in a Multiple Instance Learning (MIL) framework, in which the training set consists of a set of “bags” (the training sequences), each containing a number of BoF models (in our case, SVM classifiers learnt for each sub-volume of the spatio-temporal sequence), together with the corresponding ground-truth class labels.
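As a concrete illustration, the sketch below instantiates the MIL step with an mi-SVM-style alternating scheme: instance labels within each bag are repeatedly re-estimated under the constraint that every positive bag contains at least one positive instance. The proposal does not prescribe a particular MIL algorithm, so the use of scikit-learn's LinearSVC, the function name `mi_svm`, and the assumed data layout are all illustrative assumptions.

```python
# A minimal mi-SVM-style sketch (one standard MIL instantiation), assuming:
# each bag is a video clip represented by BoF histograms of its space-time
# sub-volumes, and scikit-learn's LinearSVC is the instance-level classifier.
import numpy as np
from sklearn.svm import LinearSVC

def mi_svm(bags, bag_labels, n_iters=10, C=1.0):
    """bags: list of (n_instances_i, n_features) arrays, one per clip.
    bag_labels: array of clip-level labels in {-1, +1}.
    Returns an instance-level SVM whose top-scoring instances inside the
    positive bags are the candidate discriminative action parts."""
    X = np.vstack(bags)
    offsets = np.cumsum([0] + [len(b) for b in bags])
    # Initialize every instance with its bag's label.
    y = np.concatenate([np.full(len(b), l) for b, l in zip(bags, bag_labels)])
    for _ in range(n_iters):
        clf = LinearSVC(C=C).fit(X, y)
        scores = clf.decision_function(X)
        y_new = np.where(scores >= 0, 1, -1)
        for i, label in enumerate(bag_labels):
            s, e = offsets[i], offsets[i + 1]
            if label == -1:
                y_new[s:e] = -1        # negative bags contain no positives
            elif not (y_new[s:e] == 1).any():
                # MIL constraint: each positive bag holds >= 1 positive instance.
                y_new[s + np.argmax(scores[s:e])] = 1
        if np.array_equal(y_new, y):   # instance labels converged
            break
        y = y_new
    return clf
```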

Task 3.2 – Learning and classifying structured discriminative models of actions. Once the most discriminative action parts have been learnt by applying MIL to the “bags” corresponding to the given training spatio-temporal sequences, we can construct tree-like ensembles of action parts which can later be used both to localize and to efficiently classify complex activities. A cost function can be defined in terms of both the appearance models of the individual parts and the relative positions between pairs of action parts; its maximization yields the best action configuration.
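One conventional way to write such a cost, following the pictorial structures formulation referenced in the objectives (the proposal gives no explicit formula, so the exact form below is an assumption):

```latex
% A plausible form of the cost (assumed; the proposal states no explicit formula).
L^\ast \;=\; \arg\max_{L=(l_1,\dots,l_n)}
  \Bigg[\; \sum_{i=1}^{n} m_i(l_i) \;-\; \sum_{(i,j)\in E} d_{ij}(l_i,\,l_j) \;\Bigg]
```

where m_i(l_i) is the appearance score of part i placed at space-time location l_i, d_ij penalizes deviations from the learnt relative placement of parts i and j, and E is the edge set of the tree of action parts.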

The problem can be efficiently solved by dynamic programming.
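When the ensemble of parts forms a tree, the maximization decomposes and can be solved exactly by max-sum message passing from the leaves to the root. The sketch below is a minimal illustration under assumed inputs (discrete candidate locations per part, a parent/children tree, and a pairwise deformation cost `d`); none of these names come from the proposal.

```python
# A minimal sketch of exact inference over a tree of action parts by dynamic
# programming (max-sum message passing), under the assumed inputs below.
import numpy as np

def best_configuration(m, children, d, root=0):
    """m: dict part -> (n_locations,) array of appearance scores.
    children: dict part -> list of child parts (tree rooted at `root`).
    d: function (child, parent, child_loc, parent_loc) -> pairwise cost.
    Returns (best total score, dict part -> chosen location index)."""
    best_child_loc = {}
    parent = {c: p for p, cs in children.items() for c in cs}

    def upward(i):
        # Best achievable score of the subtree rooted at part i,
        # as a function of where i (or, via the table, its parent) is placed.
        score = m[i].copy()
        for c in children.get(i, []):
            score = score + upward(c)          # accumulate children's messages
        if i == root:
            return score
        n_par = len(m[parent[i]])
        table = np.empty((n_par, len(score)))
        for b in range(n_par):                 # each candidate parent location
            for a in range(len(score)):        # each candidate location of i
                table[b, a] = score[a] - d(i, parent[i], a, b)
        best_child_loc[i] = table.argmax(axis=1)  # best loc of i per parent loc
        return table.max(axis=1)

    root_score = upward(root)
    loc = {root: int(root_score.argmax())}

    def downward(i):
        # Backtrack the argmax choices from the root to the leaves.
        for c in children.get(i, []):
            loc[c] = int(best_child_loc[c][loc[i]])
            downward(c)

    downward(root)
    return float(root_score.max()), loc
```

For n parts with at most h candidate locations each, this runs in O(n·h²) time, which is what makes joint localization and classification over a tree of parts tractable.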

Deliverables

D3.1 A scientific report on the Multiple Instance Learning framework for the detection of the most discriminative action parts (month 16).
D3.2 The efficient algorithmic implementation of the MIL framework as a software prototype (month 18).
D3.3 A scientific report detailing the approach based on the optimization of constellations of elementary actions for the localization and classification of complex activities (month 24).
D3.4 The efficient implementation of our novel framework for “structural” discriminative action modelling, localization and classification as a software prototype (month 24).

[14] Please indicate one activity per work package: RTD = Research and technological development; DEM = Demonstration; MGT = Management of the consortium.

