Mr Kiran Varanasi / Dr Fabio Cuzzolin Applicant Career Summary


Title: 

First Name: 

Surname: 

Other Names: 

Honours: 

Address: 

Town: 

Postcode: 

Country: 

Nationality: 

Email Address: 

Telephone (work): 

Fax: 

Abstract: 

Mr 

Kiran 

Varanasi 

655 Avenue de l'Europe 

Montbonnot 

St Ismier 

38334 

France 

Nationality 

Indian 

Applicant Career Summary 

INRIA Grenoble Rhone Alpes 

vakibs@gmail.com 

+33-616574493 

Created: Friday, January 15, 2010 15:09 [Approved] 

Newton International Fellowships - 2010 

As cameras and depth sensors get cheaper, they are deployed ubiquitously

and capture images and video in volumes far beyond what humans can view.

These must be processed by computer algorithms that analyse and

summarize the information for human beings. Several challenges remain in

developing such algorithms, especially in analysing motion and understanding 

scene dynamics. 

When the kind of objects being observed is known beforehand, a

particular template model can be chosen and fitted to the observed image (e.g.,

Microsoft's Project Natal). However, such an approach cannot handle unknown

scenes, for example when multiple actors are present in an outdoor environment.

Unsupervised learning is necessary in that case. We propose to use "manifold 

learning", a statistical framework, to achieve this. This has to be extended to 

partial recognition, where objects in the scene are not segmented from the

background. Applications include cinematic motion capture and automatic

surveillance. 


Statement of 

qualifications and 

career: 

Field of Specialisation: 

Publications: 

Subject: 

Present Research: 

Present Position: 

Present Employer: 

INRIA Grenoble Rhone Alpes

Qualification                                                                        Date
Ph.D Fellow, INRIA Rhône Alpes, Grenoble, France                                     26/11/2006 - 25/05/2010
Research Assistant, Center for Visual Information Technology, IIIT Hyderabad, India  01/04/2004 - 31/10/2006
Principal Mentor (Instructor for computer graphics course), MSIT Program, India      01/08/2004 - 31/03/2005
Research Associate, ISRI, Carnegie Mellon University, USA                            01/06/2003 - 31/12/2003

Computer Vision, Spatio-temporal modeling of dynamic scenes, 3D features for 

matching and recognition, Tracking from images 

"Temporal surface tracking via mesh evolution" 

Kiran Varanasi, Andrei Zaharescu, Edmond Boyer, Radu Horaud 

European Conference on Computer Vision (ECCV), Marseille, October 2008 

(Oral presentation: 5% acceptance rate)

"Surface feature detection and description with application to mesh matching" 

Andrei Zaharescu, Edmond Boyer, Kiran Varanasi, Radu Horaud 

IEEE Conference on Computer Vision and Pattern Recognition (CVPR),

Miami, June 2009 

"A document space model for automated text classification based on frequency 

distribution across categories" 

Kiran Varanasi, Chaitanya Kamisetty, Sushma Bendre, Rajeev Sangal, Akshar 

Bharati 

International Conference on Natural Language Processing (ICON), Mumbai, 

December 2002 

NIF Group 05: Information communication technology (ICT) / Computer Vision - 

ICT 

Through synchronized cameras in an indoor setting, a multi-view video is captured 

of different actors performing diverse tasks and interacting with each other. 3D 

mesh models of the scene are built independently at each frame, by extracting

silhouettes. These meshes are not topologically or geometrically coherent, due to

artefacts in silhouette extraction and to occlusions. I have worked on obtaining a

coherent spatio-temporal model of the observed scene, which connects these 

meshes. 

I worked on dense 3D motion estimation, which is a particularly hard problem 

when no assumptions can be made about the scene, such as that it keeps a single topology

over time. I demonstrated results on complex sequences, e.g., a dancer wearing a

loose-fitting robe with a feather boa belt, a juggler juggling clubs, and two children

playing ball with each other. No additional information was used, apart from

that captured by the cameras. I achieved this through a mesh evolution framework

which accounts for topological changes. 

Later, I worked on computing 3D features directly on meshes, which can be used 

for initializing global motion estimation. 

Finally, I worked on a temporally coherent segmentation scheme to automatically 

extract body-parts that move rigidly over time. I demonstrated results on similar 

complex sequences involving multiple actors and background clutter (recent work 

submitted for publication). 

Thus, spatio-temporal modelling of arbitrary scenes is achieved at various scales. 

Doctoral Student 



Present Department: 

Present Position Start 

Date: 

Present Position End 

Date: 

PhD Awarded Date: 

PhD Institution: 

PhD Country: 

Previous Support 

Description: 

Where did you hear of 

this scheme?: 

26/11/2006 

25/05/2010 

Co-Applicant Career Summary 


Perception Group (Computer Science Department) 

30/06/2010 

INRIA Grenoble Rhône Alpes and Laboratoire Jean Kuntzmann, Université 

Grenoble 

France 

Employed by INRIA Grenoble. Supported by the INTERACT project of 

PERCEPTION group (European Grant) 

Other 

Co-Applicant Personal Details 

Title: 

First Name: 

Surname: 

Other Names: 

Honours: 

Address Line 1: 





Town: 

Postcode: 

Country: 

Nationality: 

Email Address: 

Telephone (work): 

Fax: 

Dr 

Fabio 

Cuzzolin 

Ph.D. 

Department of Computing 

Oxford Brookes University 

Wheatley campus 

Wheatley 

OX33 1HX 

United Kingdom 

Nationality 

Italian 

fabio.cuzzolin@brookes.ac.uk 

01865 484526 

01865 484545 



Statement of 

qualifications and 

career: 

Field of Specialisation: 

Publications: 

Subject: 

Present Research: 

Present Position: 

Present Employer: 

Present Department: 

Lecturer - Early Career Fellow

Oxford Brookes University

Department of Computing

Qualification                                                      Date
Lecturer, Oxford Brookes University                                01/09/2008 - to date
Marie Curie Fellow, INRIA Rhone-Alpes, Grenoble, France            04/09/2006 - 03/09/2008
Post-doctoral researcher, University of California at Los Angeles  01/10/2004 - 09/04/2006
Fixed-term assistant professor, Politecnico di Milano, Italy       01/01/2003 - 31/12/2004
Post-doctoral researcher, University of Padua, Italy               01/06/2001 - 31/05/2003

Dr Cuzzolin’s research interests include machine and manifold learning, computer 

vision and human motion analysis, the theory of belief functions and imprecise 

probabilities. 

Fabio Cuzzolin, Multilinear modeling for robust identity recognition from gait, in 

“Behavioral Biometrics for Human Identification: Intelligent Applications”, Liang

Wang and Xin Geng (Eds.), IGI Publishing, 2009 

Fabio Cuzzolin, A geometric approach to the theory of evidence, IEEE 

Transactions on Systems, Man, and Cybernetics part C, 38(4), pages 522-534, 

July 2008 

Fabio Cuzzolin, Diana Mateus, David Knossow, Edmond Boyer, and Radu 

Horaud, Coherent Laplacian protrusion segmentation, Proceedings of CVPR'08, 

Anchorage, Alaska; 

Diana Mateus, Radu Horaud, David Knossow, Fabio Cuzzolin, and Edmond 

Boyer, Articulated Shape Matching Using Laplacian Eigenfunctions and 

Unsupervised Point Registration, Proceedings of CVPR'08, Anchorage, Alaska; 

Fabio Cuzzolin, Using Bilinear Models for View-invariant Action and Identity 

Recognition, Proceedings of 

CVPR'06, pp. 1701-1708, New York, June 18-22 2006 


ICT 

Dr Cuzzolin’s research interests include machine learning, computer vision and 

imprecise probabilities. 

He is first or single author of some 50 publications (including 9 journal papers, plus 6 under

review), some of which received awards. He collaborates with several journals in

both computer vision and probability, and has served on the program committees of

some 15 international conferences.

Dr Cuzzolin is a prominent expert in the field of random sets. He formulated a 

geometric approach to uncertainty in which probabilities, possibilities and belief 

functions can all be represented as points of a Cartesian space and analyzed

there. He studied how to approximate random sets with probabilities, and

proposed novel formulations of the theory of belief functions. 

Within computer vision, his work focused on human motion analysis and action 

recognition. He proposed the use of multilinear models for identity recognition 

from gait, and explored spectral motion capture techniques for unsupervised 3D 

segmentation and matching. 

Dr Cuzzolin is finalizing collaborations with IDSIA, Switzerland for a STREP on 

imprecise Markov chains for gesture recognition, and with INRIA, Pompeu Fabra 

and Technion on a Future and Emerging Technology (FET) EU proposal on large 

scale manifold learning. He is discussing a collaborative project on uncertainty 

theory at UK level with U. Bristol and Durham’s Dept of Statistics. He is also 

exploring the opportunity of a European Network of Excellence in the same field.



Present Position Start 

Date: 

Present Position End 

Date: 

PhD Awarded Date: 

Proposal 

Current Funding 

Description: 

Previous Support 

Description: 

Subject: 

Project Title: 

Start Date: 

End Date: 

Research proposal: 

01/09/2008 

31/08/2013 

19/02/2001 


Dr Cuzzolin is currently applying for his EPSRC First Grant and is finalizing a FET

(Future and Emerging Technology) EU FP7 project with INRIA Rhone-Alpes,

U. Pompeu Fabra (Barcelona), and Technion (Haifa) as partners. He is also preparing

a Leverhulme research proposal, and will submit a European Research Council

Starting Grant application before October 2010. However, no funding is currently available.

Dr Cuzzolin received Oxford Brookes' central university funding in 2008

(some £3,600 overall).


ICT 

Manifold learning for motion analysis and recognition 

01/01/2011 

31/12/2012 

In the course of this project, we will explore the problem of recognizing objects 

and their motion in an unknown environment filled with background clutter. One 

good example is the recognition of human activities in an outdoor environment, 

from a single image or from a set of images, obtained from cameras or depth 

sensors. In this case, we aim to recognize the human beings in the scene, their 

body postures and their motion over time. The problem is complicated by the

facts that people need not be entirely visible to the camera, that they might

wear loose clothes which cause further occlusions, and that they might interact 

with each other using various unknown objects. 

In contrast, if it is known that a person directly faces the sensor, or at least that

he/she is entirely within the sensor's field of view, a known articulated model of

human beings can be fitted to the observed image and the pose of the body

computed. Certain commercial applications under development (e.g.,

Microsoft's Project Natal) use cameras and depth sensors in

this way, to compute body poses and provide novel modes of interaction for users

playing computer games. The variations in human body shapes and their poses

are learnt into a statistical model, using a vast database of actors captured in a

pre-defined set of poses. Apart from commercial ventures, there have also been

various research publications that studied this problem in the recent past. Diverse

solutions have been proposed to define and represent such statistical models, and

to compute the closest fit for an observation.

One such solution was proposed by Dr. Cuzzolin et al., through manifold learning.

The observation is transformed into a new space using adaptive spectral

embeddings that are isometrically invariant (i.e., that are invariant to articulated

motions). Articulated motion can be estimated as a simple global rotation in this

embedding space. The strength of this solution is its generality, as it can be used for

motion estimation of an unknown object, not necessarily a human being. 

However, its weakness is that the object needs to be segmented properly from the 

background, which is difficult to achieve in a natural setting, or when multiple 

actors are present. 
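
To illustrate the invariance property mentioned above, here is a minimal sketch. It is not the published method of Dr. Cuzzolin et al.; it only assumes a generic graph-Laplacian embedding, with edge weights taken (as an assumption of this sketch) to be inverse edge lengths. The helper name `laplacian_embedding` and the toy "arm" made of 10 vertices are hypothetical. Bending the arm at its elbow preserves all edge lengths, so the embedding should not change.

```python
import numpy as np


def laplacian_embedding(points, edges, dim=2):
    """Spectral embedding of a mesh graph from its weighted graph Laplacian.

    Edge weights are inverse edge lengths (an assumption of this sketch).
    """
    n = len(points)
    W = np.zeros((n, n))
    for i, j in edges:
        W[i, j] = W[j, i] = 1.0 / np.linalg.norm(points[i] - points[j])
    L = np.diag(W.sum(axis=1)) - W
    eigvals, eigvecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    # Skip the first (constant) eigenvector, which carries no shape information.
    return eigvals[1:dim + 1], eigvecs[:, 1:dim + 1]


# Toy "arm": a chain of 10 vertices, first straight, then bent by 90 degrees
# at vertex 5. The bend is an articulated motion: edge lengths are unchanged.
n = 10
edges = [(i, i + 1) for i in range(n - 1)]
straight = np.array([[float(i), 0.0, 0.0] for i in range(n)])
bent = straight.copy()
bent[6:] = [[5.0, float(i - 5), 0.0] for i in range(6, n)]

vals_straight, emb_straight = laplacian_embedding(straight, edges)
vals_bent, emb_bent = laplacian_embedding(bent, edges)

# Eigenvectors are defined up to sign, so compare absolute values.
same_spectrum = np.allclose(vals_straight, vals_bent)
same_embedding = np.allclose(np.abs(emb_straight), np.abs(emb_bent))
print(same_spectrum, same_embedding)
```

In this toy case the two poses share the same connectivity and edge lengths, so both comparisons are expected to hold. A real mesh would need a more careful choice of weights and a way to resolve eigenvector sign and ordering ambiguities.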




During my Ph.D, I worked on the problem of motion estimation without

segmenting an object from background clutter. I demonstrated results on scenes 

involving multiple actors interacting with each other. To achieve this, I used a 

mesh evolution framework that handles topological changes. I proposed methods 

for computing local 3D features on objects, and algorithms for matching shapes 

that rely on such local features. Such methods can handle partial shape matching, 

and work even when the object is not entirely visible to the sensor. 

In the current project, we aim to combine the strengths of the two approaches

developed by the applicant and co-applicant and fit them into a statistical learning

framework. The first challenge that we face is the huge dimensionality of the input

data. Even though adaptive spectral embeddings translate this input into a new

space of much lower dimensionality, they first have to be learnt by training. When

the dimensionality of the input data is prohibitively large (as in our case), the

learning takes too much time and needs too much memory. We would like

to attack this problem by using the idea of "compressed sensing" that has recently

become popular in the signal processing community. We shall code the input data

using random projections that are linear, sparsity-oriented and metric-preserving.

This vastly reduces the dimensionality of the feature space and makes adaptive 

embeddings tractable. 
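
As a rough illustration of what "metric-preserving" means here, the sketch below projects high-dimensional vectors through a random linear map and checks how much pairwise distances change. It assumes a dense Gaussian projection matrix, which is only one possible choice and not necessarily the sparsity-oriented coding we intend to use. The function names, the dimensions (2000 reduced to 500) and the synthetic data are all hypothetical.

```python
import numpy as np


def random_projection(X, k, seed=0):
    """Project the rows of X to k dimensions with a Gaussian random matrix.

    Scaling by 1/sqrt(k) keeps squared lengths unchanged in expectation.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    R = rng.standard_normal((d, k)) / np.sqrt(k)
    return X @ R


def pairwise_distances(X):
    """Euclidean distance between every pair of rows of X."""
    sq = (X ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.sqrt(np.maximum(d2, 0.0))  # clip tiny negative round-off


# Synthetic stand-in for high-dimensional input data: 20 samples, 2000 dims.
rng = np.random.default_rng(1)
X = rng.standard_normal((20, 2000))
Y = random_projection(X, k=500)

D_high = pairwise_distances(X)
D_low = pairwise_distances(Y)
off_diagonal = ~np.eye(len(X), dtype=bool)
ratio = D_low[off_diagonal] / D_high[off_diagonal]
max_distortion = np.abs(ratio - 1.0).max()
print(Y.shape, max_distortion)
```

With these settings the distances are distorted only slightly, while the data shrink to a quarter of their original dimensionality. An embedding learnt on `Y` instead of `X` would therefore see nearly the same geometry at a lower cost.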

Adaptive spectral embeddings can handle global matching, but not partial 

matching. Our second challenge is to fit these embeddings into a partial matching 

framework. As in my Ph.D work, we would like to derive feature descriptors

on these embeddings that are local in scope, and which can be used for matching 

shapes amidst background clutter. Ideally, we should be able to learn body-parts 

of the shapes, with their distinctive shape signatures. 

The third and final challenge that we will have to address is that of unsupervised 

learning over a large database of example shapes and poses. This learning will 

further improve the descriptiveness of our features. For example, statistical 

learning can restrict the lower arm of a human body to have a joint angle not more 

than 180 degrees with the upper arm. We seek to learn the shape and motion 

statistics of arbitrary shapes, not just human beings. For example, we can learn 

the statistics of men riding bicycles, or of children playing with their pets. 

The potential applications of the proposed work are numerous. A straightforward 

application, for instance, is kinematic motion capture in outdoor environments, 

crucial in the entertainment industry, such as in film production and gaming. A 

second natural application is automatic visual surveillance, especially in crowded 

and cluttered scenes where the suspect is on the run. An additional example 

concerns the assistance of elderly people, through monitoring for signs of ageing 

and weakness in their limb movements. A fourth application could be in sports 

analysis, where the movements of a sports-person are analysed for their 

correctness. Owing to its general-purpose nature, our approach can be

deployed amidst multiple people performing diverse tasks, and in outdoor

environments. This opens doors for several new applications, to a degree that is 

not possible with today's technology. 

Apart from practical applications, we also expect theoretical breakthroughs that

would be of interest to various other fields within the broad scope of pervasive

computing. Bringing together the two fields of manifold learning and compressed

sensing fits the broader research goal of Dr. Cuzzolin, as outlined by his 

collaboration with other European centers of research. The task of analysing 

motion (and human activities in particular) fits well with the long-term goals of the 

Vision Group of Oxford Brookes University, which has produced several world-class

publications in this field. I would like to benefit from the strong collaborations 

that tie the group in Oxford Brookes to other centres of excellence in the world, 

particularly to the Visual Geometry group in the University of Oxford and to the 



Accompanying 

dependants: 

Previous contact: 

Potential applications: 

Comply with Policy on 

use of Animals: 

Proficient in reading, 

writing & speaking 

English: 

Benefits to 

individuals/institutions 

: 

Benefits to UK: 

none 

Not applicable 


Perception Group at INRIA Grenoble (where I am pursuing my Ph.D at the 

moment). The scope for collaboration is much higher at the later stages of the 

project, when we devise applications in motion capture, or in visual surveillance. 

Here, it is worth mentioning the highly successful industrial partnerships of

the Vision group in Oxford Brookes. For example, its collaboration with the

motion capture company Vicon was awarded a prestigious Knowledge Transfer

Partnership (KTP) in 2009.

I got to know Dr. Fabio Cuzzolin over 2 years at INRIA Grenoble in France, where

we participated together in seminars and lab meetings. At INRIA, Dr.

Cuzzolin and I worked on similar problems (in 3D motion estimation), but

from different perspectives. This gave us the opportunity to critique each other's

ideas, and also to identify the individual strengths of our approaches. I wish to

capitalize on that exchange, and collaborate more closely with Dr. Cuzzolin.

Excellent reading, writing and speaking skills in English. Received all

professional education in English for the last 10 years. TOEFL score: 287/300.

Images and videos from visual sensors provide a wealth of information, potentially 

allowing more natural forms of interaction between humans and machines. This 

data, however, is bulky and expensive to store or transmit: an efficient but faithful

representation could enable applications currently ruled out by the scale of

the data involved. Streamlining such interactions through cutting-edge large-scale

manifold learning will have an enormous impact on productivity and, as a

consequence, on economic growth. People will enjoy the effect of distributed

computing on their lives, in the form of automated assistance to the elderly, or 

new home entertainment products not linked to inconvenient motion estimation 

devices. Companies active in entertainment, surveillance, and biometrics are all 

poised to receive clear benefits from such developments. Individuals' health will 

also benefit from advanced techniques of diagnosis based on semi-automatic 

recognition of diseases from 3D medical data. 

Clear benefits to the UK in terms of public security will come from the impact of 

the project on areas such as automatic surveillance. Biometrics such as face or 

iris recognition suffer from major limitations: they cannot be used at a distance, 

and require user cooperation, making them impractical in real-world scenarios.

Methods based on gait analysis can be built on top of the proposed algorithm to

design semi-automatic alert systems with the potential to significantly improve the

country's level of security. 

Besides, the scope of the techniques developed within the project is not limited to 

motion analysis or medical imaging. For instance, support to decision making and 

customization in business is another important area in which intelligent 

management of large amounts of data can have an impact. Using real time 

detailed data, products can be customized for individual clients, allowing 

companies to offer tailored services and boosting UK competitiveness. 



Benefits to Overseas 

Country: 

Multidisciplinary 

Proposal: 

Financial Details 

Financial Details: 

Start Date: 

Duration (Years): 

Justification: 


The community’s awareness of the need for new, radical ways of learning from

and making decisions based on large-scale data is growing. The Lisbon Strategy for

growth and jobs explicitly mentions the need for Europe, and France in particular,

to remain open to the most recent developments in ICT research. Failure to lead the

mounting tide in large scale data management and learning, an issue at the root 

of most components of the connected society of the near future, would 

compromise the continent’s competitiveness and put serious obstacles before the 

realization of such scenarios. 

The potential of non-traditional manifold learning techniques to enable efficient

communication between different agents/sensors is apparent. It makes realistic scenarios in

which self-organizing systems (such as, for instance, self-organizing traffic lights)

adapt autonomously to changing requirements, reducing the need for human

intervention or even centrally directed autonomous planning.

Year    Payment type          Justification  Amount Requested
Year 1  Travel International                         2,000.00
Year 1  Other                                            0.00
Year 1  Subsistence                                 24,000.00
Year 1  Research Costs                               8,000.00
Year 2  Travel International                             0.00
Year 2  Other                                            0.00
Year 2  Subsistence                                 24,000.00
Year 2  Research Costs                               8,000.00
Total                                               66,000.00

01/01/2011 

2 

£4000 per year for major international conferences (2 sums of £2000 per year)

£2000 per year for travel for visits, seminars and collaborations in the UK/Europe

£2000 per year for equipment (e.g., cameras)
