Abstract book (pdf) - ICPR 2010
Huang, Qingming, Chinese Acad. of Sciences
Jiang, Shuqiang, Chinese Acad. of Sciences
Tian, Qi, Univ. of Texas at San Antonio
Human action recognition has been well studied recently, but recognizing the activities of more than three persons remains
a challenging task. In this paper, we propose a motion-trajectory-based method to classify human group activities. Gaussian
Processes are introduced to represent human motion trajectories from a probabilistic perspective and to handle the variability
of people's activities in groups. To capture the relationships among persons in group activities, three discriminative descriptors
are designed: the Individual, Dual, and Unitized Group Activity Patterns. We adopt the Bag of Words approach
to handle the differing number of persons across activities. Experiments are conducted on the
human group-activity video database, and the results show that our approach outperforms the state of the art.
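
As an illustration of why a Bag of Words representation copes with a varying number of persons, the following minimal sketch maps any number of descriptor vectors to a fixed-length histogram. The function name `bag_of_words`, the toy codebook, and the nearest-codeword assignment by squared Euclidean distance are assumptions made for this example; the abstract does not say how the codebook is built or how descriptors are assigned.

```python
def bag_of_words(descriptors, codebook):
    """Map a variable-size set of descriptors to a fixed-length histogram.

    descriptors: list of equal-length numeric tuples (hypothetical activity
                 descriptors, one or more per person or pair of persons).
    codebook:    list of codeword vectors (assumed to be learned beforehand).
    """
    counts = [0] * len(codebook)
    for d in descriptors:
        # Assign the descriptor to its nearest codeword (squared Euclidean).
        nearest = min(
            range(len(codebook)),
            key=lambda k: sum((x - y) ** 2 for x, y in zip(d, codebook[k])),
        )
        counts[nearest] += 1
    total = sum(counts)
    # Normalise so that groups of different sizes become comparable.
    return [c / total for c in counts] if total else [0.0] * len(codebook)


# Two groups of different sizes yield vectors of the same length.
codebook = [(0.0, 0.0), (10.0, 10.0)]
small_group = bag_of_words([(1.0, 1.0)], codebook)
large_group = bag_of_words([(1.0, 1.0), (9.0, 9.0), (11.0, 10.0)], codebook)
```

Whatever the group size, the output has one entry per codeword, so a standard classifier can be trained on it.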
13:30-16:30, Paper WeBCT9.27
Extracting Captions in Complex Background from Videos
Liu, Xiaoqian, Chinese Acad. of Sciences
Wang, Weiqiang, Chinese Acad. of Sciences
Captions in videos play a significant role in automatically understanding and indexing video content, since much semantic
information is associated with them. This paper presents an effective approach to extracting captions from videos, in which
several categories of features (edge, color, stroke, etc.) are utilized and the spatio-temporal characteristics of
captions are considered. First, our method exploits the distribution of gradient directions to decompose a video temporally into a sequence
of clips, so that each clip contains at most one caption, which makes the subsequent extraction computation
more efficient and accurate. For each clip, edge and corner information is then utilized to locate text regions. Further,
text pixels are extracted based on the assumption that the text pixels in a text region always have homogeneous color and
outnumber the non-text pixels, which have different colors. Finally, the segmentation results are further
refined. Encouraging experimental results on 2565 characters provide preliminary validation of our approach.
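
The color assumption in the text-pixel step can be sketched as follows: quantize the colors of a located text region, take the most populated color bin as the text color, and mark the pixels that fall into it. This is only a sketch of that assumption, not the authors' implementation; the function name `dominant_color_mask`, the bin width `step`, and the list-of-rows image layout are choices made for this example.

```python
from collections import Counter


def dominant_color_mask(region, step=32):
    """Mark the pixels that belong to the most frequent quantized color.

    region: list of rows, each row a list of (r, g, b) tuples in 0..255.
    step:   width of each quantization bin per channel (an assumed value).
    """
    def color_bin(pixel):
        return tuple(channel // step for channel in pixel)

    counts = Counter(color_bin(p) for row in region for p in row)
    top_bin, _ = counts.most_common(1)[0]
    # True where the pixel shares the dominant (assumed text) color bin.
    return [[color_bin(p) == top_bin for p in row] for row in region]


# Toy text region: near-white pixels dominate a mixed background.
region = [
    [(250, 250, 250), (255, 255, 255), (10, 20, 30)],
    [(252, 251, 250), (100, 0, 0), (249, 250, 255)],
]
mask = dominant_color_mask(region)
```

A real system would apply this only inside regions already located by the edge and corner analysis, and would still need the refinement step the abstract mentions.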
13:30-16:30, Paper WeBCT9.28
Keyframe-Guided Automatic Non-Linear Video Editing<br />
Rajgopalan, Vaishnavi, Concordia Univ.<br />
Ranganathan, Ananth, Honda Res. Inst. USA<br />
Rajagopalan, Ramgopal, Res. in Motion<br />
Mudur, Sudhir, Concordia Univ.<br />
We describe a system for generating coherent movies from a collection of unedited videos. The generation process is<br />
guided by one or more input keyframes, which determine the content of the generated video. The basic mechanism involves<br />
similarity analysis using the histogram intersection function. The function is applied to spatial pyramid histograms computed<br />
on the video frames in the collection using Dense SIFT features. A two-directional greedy path finding algorithm is<br />
used to select and arrange frames from the collection while maintaining visual similarity, coherence, and continuity. Our<br />
system demonstrates promising results on large video collections and is a first step towards increased automation in nonlinear<br />
video editing.<br />
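
The histogram intersection function named above is the sum of the bin-wise minima of two histograms. The sketch below applies it to plain lists of bin values; the spatial pyramid weighting and the Dense SIFT feature extraction are not shown, and the helper `most_similar_frame` is a hypothetical name introduced only for illustration.

```python
def histogram_intersection(h1, h2):
    """Return the sum of bin-wise minima of two equal-length histograms."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(a, b) for a, b in zip(h1, h2))


def most_similar_frame(query_hist, frame_hists):
    """Return the index of the frame histogram that best matches the query."""
    return max(
        range(len(frame_hists)),
        key=lambda i: histogram_intersection(query_hist, frame_hists[i]),
    )


# Toy normalised histograms standing in for a keyframe and two frames.
keyframe = [0.5, 0.5, 0.0]
frames = [[0.0, 0.0, 1.0], [0.25, 0.25, 0.5]]
score = histogram_intersection(keyframe, frames[1])
best = most_similar_frame(keyframe, frames)
```

For histograms normalised to sum to one, the score lies between 0 (no overlap) and 1 (identical), which makes it convenient as a frame-to-frame similarity in a greedy selection loop.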
13:30-16:30, Paper WeBCT9.29<br />
Images in News<br />
Sankaranarayanan, Jagan, Univ. of Maryland<br />
Samet, Hanan, Univ. of Maryland<br />
A system, called News Stand, is introduced that automatically extracts images from news articles. The system takes RSS feeds of news
articles and applies an online clustering algorithm so that articles belonging to the same news topic are associated with the same cluster.
Using the feature vector associated with the cluster, the images from the news articles that form the cluster are extracted. First, the caption text
associated with each of the images embedded in a news article is determined by analyzing the structure of the article's
HTML page. If the caption and the feature vector of the cluster are found to contain keywords in common, then the image is added to an image
repository. Additional meta-information is then associated with each image, such as the caption, cluster features, names of people in the news
article, etc. A very large repository containing more than 983k images from 12 million news articles was built using this approach. This
repository also contained more than 86.8 million keywords associated with the images. The key contribution of this work is that it combines
clustering and natural language processing tasks to automatically create a large corpus of news images with good-quality tags or meta-information,
so that interesting vision tasks can be performed on it.
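
The image acceptance rule described above (keep an image when its caption shares keywords with the cluster's feature vector) can be sketched as a set intersection. The tokenizer, the `min_shared` threshold, and the function names are assumptions made for this example; the abstract does not specify how keywords are extracted or how many must match.

```python
import re


def keywords(text):
    """Lower-case alphanumeric tokens of a caption (a simplistic tokenizer)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def caption_matches_cluster(caption, cluster_terms, min_shared=1):
    """Return True if the caption shares enough keywords with the cluster.

    cluster_terms: iterable of terms from the cluster's feature vector.
    min_shared:    assumed threshold; the source only says "in common".
    """
    shared = keywords(caption) & {term.lower() for term in cluster_terms}
    return len(shared) >= min_shared


# Hypothetical cluster terms and captions.
cluster_terms = ["Jakarta", "flood"]
accepted = caption_matches_cluster("Flooding in Jakarta on Monday", cluster_terms)
rejected = caption_matches_cluster("A stock photo of a keyboard", cluster_terms)
```

Note that "Flooding" does not match "flood" here because no stemming is applied; a production system would likely normalise terms first.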