Project Proposal (PDF) - Oxford Brookes University
Project Proposal (PDF) - Oxford Brookes University
Project Proposal (PDF) - Oxford Brookes University
You also want an ePaper? Increase the reach of your titles
YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.
FP7-ICT-2011-9 STREP proposal<br />
18/01/12 v1 [Dynact]<br />
vision group at Microsoft Research Cambridge UK. He was also involved in the creation of the startup<br />
company 2d3 (http://www.2d3.com/), part of the <strong>Oxford</strong> Metrics Group (OMG), whose first<br />
product, “boujou”, is still used by special effects studios all over the world and has been used on the<br />
special effects of almost every major feature film in the last five years, including the “Harry Potter”<br />
and “Lord of the Rings” series. Yotta (http://www.yottadcl.com/) is another OMG company<br />
currently funding Ph.D. studentships at the <strong>Oxford</strong> <strong>Brookes</strong> vision group. In the course of its<br />
cooperation with Armasuisse (the R&D agency of the Swiss army) IDSIA developed a number of<br />
intelligent systems currently used by the Swiss air force to enforce no-fly zones as well as automatic<br />
reasoning tools for social networking in the music business, developed by a Swiss start-up company<br />
(www.stagend.com) Many other examples of successful industry collaborations and exploitation<br />
agreements could be cited.<br />
The new datasets acquired in the course of the project will likely have great impact as new<br />
benchmarks for the whole community.<br />
Actions to undertake: the consortium will exploit its already considerable academic and industrial<br />
links to maximise the dissemination of the results of the project in terms of theoretical advances,<br />
public domain code, gathered datasets. We will both build on the existing network by engaging<br />
companies on a personal level, and work on building new strategic contacts via partecipation to<br />
dedicated events organised by the EU and other bodies. We envisage likely exploitations of our<br />
results on both the national and European level, for instance (in the UK) by setting up Knowledge<br />
Transfer Partnerships (KTPs) with companies expressing an interest.<br />
Again, many more details on the measures we plan to enact to maximise the expected impacts can be found<br />
in Section 3.2.<br />
3.1.2 External factors affecting the potential impacts<br />
In principle, we see three possible sources of threat to the concrete realization of these impacts: the lack of<br />
empirical support to the scientific hypothesis at the foundation of this project; the lack of data on which to<br />
extensively validate the results in order to bring them close to real-world deployment; an unfriendly societal<br />
and industrial response to the research conducted here.<br />
None of these consitute, in our view, significant problems.<br />
Scientific hypothesis.<br />
Obviously, the potential impacts listed above can be reached only in the case of a successful termination of<br />
the project. The overall, underlying assumption is that encoding dynamics is important to achieve robust<br />
action recognition in both the discriminative and the generative approach to the problem, and that in the<br />
latter case imprecise-probabilistic graphical models are suitable to cope with the sources of nuisance, the<br />
inherent variability and the overfitting due to the limited size of the training sets that characterise the<br />
problem. Given our already considerable and growing expertise in all these matters, however, evidence does<br />
support this key assumption.<br />
As argued in Section 1, on new datasets with 51 action categories bag-of-features models achieve<br />
classification results of just above 20%, suggesting that these approaches do not fully represent the<br />
complexities of human actions. Kuehne al. have shown that this drop in performance with respect to previous<br />
datasets is most likely due to the increased number of action categories. The most recent literature [104] has<br />
shown that, while in the UCF Sports dataset [109] human static joint locations alone are sufficient to classify<br />
sports actions with an accuracy of over 98%, the same experiment on HMDB51 performs quite poorly<br />
(35%). This indicates that it may be necessary to take more into account action dynamics, in accordance with<br />
psychophysical experiments in which both motion and shape are critical for visual recognition of human<br />
actions [110, 111].<br />
Availability of data.<br />
The chance that suitable data are not avaible to thoroughly validate our methodologies is neutered by both<br />
the availability of a number of datasets of significant size and complexity (though limited to some extent as<br />
discuss for WP4), and that of proprietary equipment which will allow us to acquire testbeds in multiple<br />
modalities, with a focus on all the aspects we feel current datasets still neglect: complex activities, range<br />
data, large number of action categories, etcetera.<br />
<strong>Proposal</strong> Part B: page [53] of [67]