Outline Proposal - Oxford Brookes University
Request 69009
To quantify the performance of each algorithm, we report accuracy (Acc), mean
average precision (mAP), and mean F1 score (mF1).
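The three measures can be illustrated with a minimal sketch. It assumes the usual definitions, which the proposal does not spell out: accuracy as the fraction of correctly labelled instances, mF1 as the unweighted mean of per-class F1 scores, and mAP as the unweighted mean of per-class average precision over score-ranked instances. All function names and the toy labels and scores are hypothetical, chosen only for illustration.

```python
from typing import Dict, Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of instances whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_for_class(y_true: Sequence[int], y_pred: Sequence[int], cls: int) -> float:
    """F1 = 2TP / (2TP + FP + FN), treating `cls` as the positive class."""
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
    fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mean_f1(y_true: Sequence[int], y_pred: Sequence[int], classes: Sequence[int]) -> float:
    """Unweighted mean of the per-class F1 scores (assumed meaning of mF1)."""
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


def average_precision(is_positive: Sequence[bool], scores: Sequence[float]) -> float:
    """Mean of the precision values measured at the rank of each positive."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits, precisions = 0, []
    for rank, i in enumerate(order, start=1):
        if is_positive[i]:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(
    y_true: Sequence[int], class_scores: Dict[int, Sequence[float]], classes: Sequence[int]
) -> float:
    """Unweighted mean of per-class AP; class_scores[c][i] scores instance i for class c."""
    aps = [average_precision([t == c for t in y_true], class_scores[c]) for c in classes]
    return sum(aps) / len(aps)


# Toy data (hypothetical): four clips, two action classes.
classes = [0, 1]
y_true = [0, 0, 1, 1]
class_scores = {0: [0.9, 0.4, 0.2, 0.1], 1: [0.1, 0.75, 0.8, 0.7]}
# Predicted label = class with the highest score for each clip.
y_pred = [max(classes, key=lambda c: class_scores[c][i]) for i in range(len(y_true))]

acc = accuracy(y_true, y_pred)
mf1 = mean_f1(y_true, y_pred, classes)
m_ap = mean_average_precision(y_true, class_scores, classes)
print(f"Acc={acc:.3f}  mF1={mf1:.3f}  mAP={m_ap:.3f}")
```

Acc and mF1 depend only on the hard label decisions, whereas mAP depends on how the classifier ranks the instances by score, so the two kinds of measure can move independently.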
(2) Recognition performance achieved so far:
Figure 4. Left: preliminary localisation results on a Hollywood2 video [45]. The colour of
each box (subvolume) indicates its positive rank score for belonging to the action class (red =
high). In actioncliptest00058, a woman gets out of her car roughly around the middle of the
video, as indicated by the detected subvolumes. Right: performance of MIL discriminative
modelling (Step 3) with Dense Trajectory Features on the most common datasets,
compared to the traditional BoF baseline. Even with traditional features, learning the most
discriminative action parts via MIL substantially improves performance on challenging testbeds.
Figure 5. Performance of BoF global models with Fisher representation (Step 2) on the most
common datasets, compared to the state of the art. Note how accuracy and average precision
(recognition rate) improve dramatically with respect to previous approaches.
(3) Latency to recognize specific human activity (how many seconds after the occurrence of
specific human activities can the activities be recognized by the algorithm):
Approximately 2 seconds for recognition on the KTH dataset. Because features are computed
from video volumes, a frames-per-second figure is not very meaningful for our approach; in any
case, the frame rate in all sequences is around 30 fps.
Computing the classification scores for 60,000 test video instances (each 1000-dimensional) on
the KTH dataset takes 0.5 seconds on a standard laptop. This does not include feature
computation and representation times, which can vary widely depending on the choice of
features, representation, classification method, and PC hardware.
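A timing measurement of this kind could be set up roughly as follows. This is only a sketch under stated assumptions: it assumes a linear scoring function (w . x + b), which the proposal does not specify, and it uses random numbers as stand-ins for real video descriptors. The names `linear_scores` and `time_scoring` are hypothetical. Only the scoring step is timed, mirroring the exclusion of feature computation from the reported figure.

```python
import random
import time
from typing import List, Tuple


def linear_scores(instances: List[List[float]], weights: List[float], bias: float) -> List[float]:
    """Score each instance as w . x + b (assumed linear classifier)."""
    return [sum(w * x for w, x in zip(weights, inst)) + bias for inst in instances]


def time_scoring(n_instances: int, dim: int, seed: int = 0) -> Tuple[List[float], float]:
    """Build random stand-in descriptors, then time only the scoring step."""
    rng = random.Random(seed)
    # Data generation stands in for feature computation and is not timed.
    instances = [[rng.random() for _ in range(dim)] for _ in range(n_instances)]
    weights = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    start = time.perf_counter()
    scores = linear_scores(instances, weights, bias=0.0)
    elapsed = time.perf_counter() - start
    return scores, elapsed


# Small sizes keep this pure-Python sketch quick; the setting described in
# the text corresponds to n_instances=60_000 and dim=1000.
scores, elapsed = time_scoring(n_instances=200, dim=100)
print(f"scored {len(scores)} instances in {elapsed:.4f} s")
```

Pure Python is used here only to keep the sketch self-contained; a vectorised numerical library would be much faster, so timings from this sketch are not comparable with the 0.5-second figure.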
(4) Possibility to predict the occurrence of specific human activities (please select a
relevant one)
Possible<br />