28.07.2013 Views

Project Proposal (PDF) - Oxford Brookes University

Project Proposal (PDF) - Oxford Brookes University

Project Proposal (PDF) - Oxford Brookes University

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

FP7-ICT-2011-9 STREP proposal<br />

18/01/12 v1 [Dynact]<br />

vision group at Microsoft Research Cambridge UK. He was also involved in the creation of the startup<br />

company 2d3 (http://www.2d3.com/), part of the <strong>Oxford</strong> Metrics Group (OMG), whose first<br />

product, “boujou”, is still used by special effects studios all over the world and has been used on the<br />

special effects of almost every major feature film in the last five years, including the “Harry Potter”<br />

and “Lord of the Rings” series. Yotta (http://www.yottadcl.com/) is another OMG company<br />

currently funding Ph.D. studentships at the <strong>Oxford</strong> <strong>Brookes</strong> vision group. In the course of its<br />

cooperation with Armasuisse (the R&D agency of the Swiss army) IDSIA developed a number of<br />

intelligent systems currently used by the Swiss air force to enforce no-fly zones as well as automatic<br />

reasoning tools for social networking in the music business, developed by a Swiss start-up company<br />

(www.stagend.com) Many other examples of successful industry collaborations and exploitation<br />

agreements could be cited.<br />

The new datasets acquired in the course of the project will likely have great impact as new<br />

benchmarks for the whole community.<br />

Actions to undertake: the consortium will exploit its already considerable academic and industrial<br />

links to maximise the dissemination of the results of the project in terms of theoretical advances,<br />

public domain code, gathered datasets. We will both build on the existing network by engaging<br />

companies on a personal level, and work on building new strategic contacts via partecipation to<br />

dedicated events organised by the EU and other bodies. We envisage likely exploitations of our<br />

results on both the national and European level, for instance (in the UK) by setting up Knowledge<br />

Transfer Partnerships (KTPs) with companies expressing an interest.<br />

Again, many more details on the measures we plan to enact to maximise the expected impacts can be found<br />

in Section 3.2.<br />

3.1.2 External factors affecting the potential impacts<br />

In principle, we see three possible sources of threat to the concrete realization of these impacts: the lack of<br />

empirical support to the scientific hypothesis at the foundation of this project; the lack of data on which to<br />

extensively validate the results in order to bring them close to real-world deployment; an unfriendly societal<br />

and industrial response to the research conducted here.<br />

None of these consitute, in our view, significant problems.<br />

Scientific hypothesis.<br />

Obviously, the potential impacts listed above can be reached only in the case of a successful termination of<br />

the project. The overall, underlying assumption is that encoding dynamics is important to achieve robust<br />

action recognition in both the discriminative and the generative approach to the problem, and that in the<br />

latter case imprecise-probabilistic graphical models are suitable to cope with the sources of nuisance, the<br />

inherent variability and the overfitting due to the limited size of the training sets that characterise the<br />

problem. Given our already considerable and growing expertise in all these matters, however, evidence does<br />

support this key assumption.<br />

As argued in Section 1, on new datasets with 51 action categories bag-of-features models achieve<br />

classification results of just above 20%, suggesting that these approaches do not fully represent the<br />

complexities of human actions. Kuehne al. have shown that this drop in performance with respect to previous<br />

datasets is most likely due to the increased number of action categories. The most recent literature [104] has<br />

shown that, while in the UCF Sports dataset [109] human static joint locations alone are sufficient to classify<br />

sports actions with an accuracy of over 98%, the same experiment on HMDB51 performs quite poorly<br />

(35%). This indicates that it may be necessary to take more into account action dynamics, in accordance with<br />

psychophysical experiments in which both motion and shape are critical for visual recognition of human<br />

actions [110, 111].<br />

Availability of data.<br />

The chance that suitable data are not avaible to thoroughly validate our methodologies is neutered by both<br />

the availability of a number of datasets of significant size and complexity (though limited to some extent as<br />

discuss for WP4), and that of proprietary equipment which will allow us to acquire testbeds in multiple<br />

modalities, with a focus on all the aspects we feel current datasets still neglect: complex activities, range<br />

data, large number of action categories, etcetera.<br />

<strong>Proposal</strong> Part B: page [53] of [67]

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!