
Abstract book (pdf) - ICPR 2010


Huang, Qingming, Chinese Acad. of Sciences
Jiang, Shuqiang, Chinese Acad. of Sciences
Tian, Qi, Univ. of Texas at San Antonio

Human action recognition has been well studied recently, but recognizing the activities of groups of more than three persons remains a challenging task. In this paper, we propose a motion-trajectory-based method to classify human group activities. Gaussian Processes are introduced to represent human motion trajectories from a probabilistic perspective, handling the variability of people's activities within a group. To capture the relationships among persons in group activities, three discriminative descriptors are designed: the Individual, Dual, and Unitized Group Activity Patterns. We adopt the Bag of Words approach to handle the unbalanced number of persons across different activities. Experiments conducted on a human group-activity video database show that our approach outperforms the state of the art.
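The Bag of Words pooling step described above can be sketched as follows: however many per-person descriptors a group yields, each is quantized against a codebook and the results are pooled into one fixed-length histogram. This is a minimal sketch under assumptions — the codebook and descriptors below are placeholders, not the paper's Gaussian Process trajectory representations.

```python
import numpy as np

def bow_histogram(descriptors, codebook):
    """Quantize each per-person descriptor to its nearest codeword and
    pool into a normalized histogram, so groups with different numbers
    of persons all map to vectors of the same length."""
    # Pairwise Euclidean distances: shape (n_descriptors, n_codewords)
    dists = np.linalg.norm(descriptors[:, None, :] - codebook[None, :, :], axis=2)
    words = dists.argmin(axis=1)           # nearest codeword per descriptor
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()               # normalize away the group size
```

Because the histogram is normalized, a two-person and a five-person activity produce directly comparable feature vectors.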

13:30-16:30, Paper WeBCT9.27

Extracting Captions in Complex Background from Videos

Liu, Xiaoqian, Chinese Acad. of Sciences
Wang, Weiqiang, Chinese Acad. of Sciences

Captions in videos play a significant role in automatically understanding and indexing video content, since much semantic information is associated with them. This paper presents an effective approach to extracting captions from videos, in which multiple categories of features (edge, color, stroke, etc.) are utilized and the spatio-temporal characteristics of captions are considered. First, our method exploits the distribution of gradient directions to temporally decompose a video into a sequence of clips, so that each clip contains at most one caption, which makes the subsequent extraction more efficient and accurate. For each clip, edge and corner information is then utilized to locate text regions. Next, text pixels are extracted based on the assumption that text pixels within a text region have a homogeneous color and dominate the region relative to non-text pixels of differing colors. Finally, the segmentation results are further refined. Encouraging experimental results on 2565 characters have preliminarily validated our approach.
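The temporal decomposition step could be approximated as follows: compute a gradient-direction histogram per frame and cut wherever the distribution jumps between consecutive frames. This is a toy sketch, not the paper's exact criterion — the bin count, distance measure, and threshold are all assumptions.

```python
import numpy as np

def gradient_dir_hist(frame, bins=8):
    """Normalized histogram of gradient directions for one grayscale frame."""
    gy, gx = np.gradient(frame.astype(float))
    angles = np.arctan2(gy, gx)  # directions in [-pi, pi]
    hist, _ = np.histogram(angles, bins=bins, range=(-np.pi, np.pi))
    hist = hist.astype(float)
    return hist / max(hist.sum(), 1.0)

def split_into_clips(frames, thresh=0.5):
    """Cut wherever the gradient-direction distribution jumps between
    consecutive frames, so each resulting clip has one stable layout."""
    hists = [gradient_dir_hist(f) for f in frames]
    cuts = [0]
    for i in range(1, len(hists)):
        if np.abs(hists[i] - hists[i - 1]).sum() > thresh:  # L1 distance
            cuts.append(i)
    return [frames[a:b] for a, b in zip(cuts, cuts[1:] + [len(frames)])]
```

A caption appearing or disappearing changes many edge directions at once, which is what this per-frame comparison is meant to catch.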

13:30-16:30, Paper WeBCT9.28

Keyframe-Guided Automatic Non-Linear Video Editing

Rajgopalan, Vaishnavi, Concordia Univ.
Ranganathan, Ananth, Honda Res. Inst. USA
Rajagopalan, Ramgopal, Res. in Motion
Mudur, Sudhir, Concordia Univ.

We describe a system for generating coherent movies from a collection of unedited videos. The generation process is guided by one or more input keyframes, which determine the content of the generated video. The basic mechanism is similarity analysis using the histogram intersection function, applied to spatial pyramid histograms computed on the video frames in the collection using dense SIFT features. A two-directional greedy path-finding algorithm is used to select and arrange frames from the collection while maintaining visual similarity, coherence, and continuity. Our system demonstrates promising results on large video collections and is a first step towards increased automation in non-linear video editing.
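The core mechanism can be sketched in one direction: histogram intersection as the similarity, plus a greedy chain that starts from the frame closest to the keyframe. This is a simplification under stated assumptions — the actual system searches in two directions over spatial pyramid histograms of dense SIFT features, while the toy histograms here are placeholders.

```python
import numpy as np

def histogram_intersection(h1, h2):
    """Similarity between two histograms: sum of elementwise minima."""
    return float(np.minimum(h1, h2).sum())

def greedy_path(key_hist, frame_hists, length):
    """Start from the frame most similar to the keyframe, then repeatedly
    append the remaining frame most similar to the last pick, so visual
    continuity is preserved step by step."""
    remaining = dict(enumerate(frame_hists))
    current = max(remaining,
                  key=lambda i: histogram_intersection(key_hist, remaining[i]))
    path = [current]
    while len(path) < length and len(remaining) > 1:
        prev = remaining.pop(current)
        current = max(remaining,
                      key=lambda i: histogram_intersection(prev, remaining[i]))
        path.append(current)
    return path
```

Greedy chaining keeps adjacent frames similar even when no single frame matches the keyframe perfectly, which is the continuity property the abstract emphasizes.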

13:30-16:30, Paper WeBCT9.29

Images in News

Sankaranarayanan, Jagan, Univ. of Maryland
Samet, Hanan, Univ. of Maryland

A system, called NewsStand, is introduced that automatically extracts images from news articles. The system takes RSS feeds of news articles and applies an online clustering algorithm so that articles belonging to the same news topic are associated with the same cluster. Using the feature vector associated with a cluster, the images from the news articles that form the cluster are extracted. First, the caption text associated with each image embedded in a news article is determined by analyzing the structure of the article's HTML page. If the caption and the cluster's feature vector are found to contain keywords in common, the image is added to an image repository. Additional meta-information is then associated with each image, such as its caption, the cluster features, and the names of people in the news article. A very large repository containing more than 983k images from 12 million news articles was built using this approach; it also contains more than 86.8 million keywords associated with the images. The key contribution of this work is that it combines clustering and natural language processing to automatically create a large corpus of news images with good-quality tags and meta-information, so that interesting vision tasks can be performed on it.
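The caption-versus-cluster matching test can be approximated as keyword-set overlap. The tokenization and stopword list below are assumptions for illustration, not the paper's NLP pipeline.

```python
def share_keyword(caption, cluster_terms,
                  stopwords=frozenset({"the", "a", "of", "in", "and", "to"})):
    """Keep an image only if its caption shares a non-trivial keyword
    with the cluster's feature terms."""
    # Lowercase, strip trailing punctuation, and drop common stopwords.
    caption_words = {w.strip(".,;:").lower() for w in caption.split()} - stopwords
    return bool(caption_words & {t.lower() for t in cluster_terms})
```

This overlap check is what filters out images (ads, logos, unrelated thumbnails) whose captions have nothing to do with the cluster's topic.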

