Abstract book (pdf) - ICPR 2010

We report a performance evaluation of our automatic feature discovery method on the publicly available Gisette dataset: a set of 29 features discovered by our method ranks 129th among all 411 current entries on the validation set. Our approach is a greedy forward selection algorithm guided by error clusters. The algorithm finds error clusters in the current feature space, then projects one tight cluster into the null space of the feature mapping, where a new feature that helps to classify these errors can be discovered. This method assumes a "data-rich" problem domain and works well when a large amount of labeled data is available. The result on the Gisette dataset shows that our method is competitive with many current feature selection algorithms. We also provide analytical results showing that our method is guaranteed to lower the error rate on Gaussian distributions and that our approach may outperform the standard Linear Discriminant Analysis (LDA) method in some cases.
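The following sketch illustrates the error-cluster idea under simplifying assumptions: a linear feature mapping W, k-means to find error clusters, and the dominant null-space direction of the tightest cluster taken as the new feature. The classifier, clustering, and all names here are illustrative choices, not the authors' exact algorithm.

```python
# Hypothetical sketch of error-cluster-guided feature discovery (not the
# paper's implementation): cluster the misclassified points, project the
# tightest cluster into the null space of the current feature mapping,
# and return the dominant direction there as a candidate new feature.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

def discover_feature(X, y, W):
    """X: (n, d) raw data; y: (n,) labels; W: (d, k) current feature mapping."""
    clf = LogisticRegression(max_iter=1000).fit(X @ W, y)
    errors = X[clf.predict(X @ W) != y]
    if len(errors) < 2:
        return None
    # Find error clusters in the current feature space; keep the tightest one.
    km = KMeans(n_clusters=min(5, len(errors)), n_init=10).fit(errors @ W)
    tightest = np.argmin([errors[km.labels_ == c].std()
                          for c in range(km.n_clusters)])
    cluster = errors[km.labels_ == tightest]
    # Project the cluster onto the orthogonal complement of range(W), i.e.
    # the directions the current features cannot see.
    q, _ = np.linalg.qr(W)                     # orthonormal basis of range(W)
    null_proj = cluster - (cluster @ q) @ q.T
    _, _, vt = np.linalg.svd(null_proj, full_matrices=False)
    return vt[0]                               # candidate new feature direction (d,)
```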

15:00-17:10, Paper MoBT9.56

Optimized Entropy-Constrained Vector Quantization of Lossy Vector Map Compression

Chen, Minjie, Univ. of Eastern Finland
Xu, Mantao, Carestream Health Corp. Shanghai, China
Fränti, Pasi, Univ. of Eastern Finland

Quantization plays an important part in lossy vector map compression, for which existing solutions are based on either a fixed-size open-loop codebook or simple uniform quantization. In this paper, we propose an entropy-constrained vector quantization that optimizes both the structure and the size of the codebook simultaneously using a closed-loop approach. In order to lower the distortion to a desirable level, we exploit a two-level design strategy, where the vector quantization codebook is designed only for the most common vectors and the remaining (outlier) vectors are coded by uniform quantization.

15:00-17:10, Paper MoBT9.57

Nonnegative Embeddings and Projections for Dimensionality Reduction and Information Visualization

Zafeiriou, Stefanos, Imperial Coll. London
Laskaris, Nikolaos, AiiA-Lab, AUTH

In this paper, we propose novel algorithms for low-dimensional nonnegative embedding of vectorial and/or relational data, as well as nonnegative projections for dimensionality reduction. We start by introducing a novel algorithm for Metric Multidimensional Scaling (MMS). We then propose algorithms for Nonnegative Locally Linear Embedding (NLLE) and Nonnegative Laplacian Eigenmaps (NLE). By reformulating the problems of MMS, NLLE and NLE for finding projections, we propose algorithms for Nonnegative Principal Component Analysis (NPCA), Nonnegative Orthogonal Neighbourhood Preserving Projections (NONPP) and Nonnegative Orthogonal Locality Preserving Projections (NOLPP). We present preliminary results of the proposed methods on data visualization.
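As a rough illustration of nonnegative embedding, the projected-gradient loop below minimizes a metric-MDS stress while clipping the embedding to the nonnegative orthant; this generic update rule is an assumption for illustration, not the authors' algorithm.

```python
# Sketch: nonnegative metric MDS by projected gradient descent. Minimizes
# sum_{i,j} (||Y_i - Y_j|| - D_ij)^2 subject to Y >= 0 elementwise.
import numpy as np

def nonnegative_mds(D, dim=2, lr=1e-3, iters=2000, seed=0):
    """D: (n, n) symmetric dissimilarity matrix with zero diagonal."""
    n = D.shape[0]
    rng = np.random.default_rng(seed)
    Y = rng.random((n, dim))                   # nonnegative initialization
    for _ in range(iters):
        diff = Y[:, None, :] - Y[None, :, :]   # (n, n, dim) pairwise differences
        dist = np.linalg.norm(diff, axis=2) + 1e-9
        # Gradient of the stress with respect to each embedding point.
        grad = 4 * np.sum(((dist - D) / dist)[:, :, None] * diff, axis=1)
        Y = np.maximum(Y - lr * grad, 0.0)     # project onto nonnegative orthant
    return Y
```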

15:00-17:10, Paper MoBT9.58

Unsupervised Learning from Linked Documents

Guo, Zhen, SUNY at Binghamton
Zhu, Shenghuo, NEC Lab.
Chi, Yun, NEC Lab.
Zhang, Zhongfei, State Univ. of New York, Binghamton
Gong, Yihong, NEC Lab. America, Inc.

Documents in many corpora, such as digital libraries and webpages, contain both content and link information. In traditional topic models, which play an important role in unsupervised learning, the link information is either ignored entirely or treated as a feature similar to content. We believe that neither approach is capable of accurately capturing the relations represented by links. To address this limitation of traditional topic models, in this paper we propose a citation-topic (CT) model that explicitly considers the document relations represented by links. In the CT model, instead of being treated as yet another feature, links are used to form the structure of the generative model. As a result, in the CT model a given document is modeled as a mixture of a set of topic distributions, each of which is borrowed (cited) from a document that is related to the given document. We apply the CT model to several document collections, and the experimental comparisons against state-of-the-art approaches demonstrate very promising performance.
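The generative structure described above can be sketched as follows; the sampling scheme and all names are illustrative assumptions, not the published CT model specification.

```python
# Hypothetical sketch of the citation-topic generative idea: each word of a
# document is drawn by first picking a linked (cited) document, then a topic
# from that document's mixture, then a word from that topic.
import numpy as np

def generate_document(doc_id, links, topic_dists, topic_word, n_words, rng):
    """links[doc_id]: list of cited document ids (including doc_id itself);
    topic_dists: (n_docs, n_topics); topic_word: (n_topics, vocab_size)."""
    words = []
    for _ in range(n_words):
        cited = rng.choice(links[doc_id])                      # borrow from a link
        z = rng.choice(len(topic_word), p=topic_dists[cited])  # topic from its mixture
        w = rng.choice(topic_word.shape[1], p=topic_word[z])   # word from the topic
        words.append(w)
    return words
```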
