06.02.2013 Views

Abstract book (pdf) - ICPR 2010


clean conditions and are most robust to noise. ZCPA performance is shown to vary widely with filter bank configuration and frame length. The ZCPA performance is poor in clean conditions but is the least affected by white noise. PNCC is shown to be the most promising new feature set for robust ASR in recent years.

13:30-16:30, Paper ThBCT9.17

Sparse Representation for Speaker Identification

Naseem, Imran, The Univ. of Western Australia

Togneri, Roberto, The Univ. of Western Australia

Bennamoun, Mohammed, The Univ. of Western Australia

We address the closed-set problem of speaker identification by presenting a novel sparse representation classification algorithm. We propose to develop an overcomplete dictionary using the GMM mean supervector kernel for all the training utterances. A given test utterance corresponds to only a small fraction of the whole training database. We therefore propose to represent a given test utterance as a linear combination of all the training utterances, thereby generating a naturally sparse representation. Using this sparsity, the unknown vector of coefficients is computed via ℓ1-minimization, which also yields the sparsest solution [12]. Ideally, the vector of coefficients so obtained has nonzero entries representing the class index of the given test utterance. Experiments have been conducted on the standard TIMIT [14] database, and a comparison with state-of-the-art speaker identification algorithms yields a favorable performance index for the proposed algorithm.
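The classification idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the GMM mean supervectors are replaced by toy 3-dimensional vectors, the ℓ1-minimization is approximated by a Lasso problem solved with ISTA (an assumed stand-in for whatever solver the paper uses), and the decision rule (smallest class-wise reconstruction residual) is the one commonly used in sparse representation classification. The names `ista_lasso` and `src_classify` and the values of `lam` and `n_iter` are hypothetical.

```python
import numpy as np

def ista_lasso(A, y, lam=0.01, n_iter=500):
    """Approximately solve min_x 0.5*||A x - y||^2 + lam*||x||_1 with ISTA."""
    # Step size 1/L, where L is the largest eigenvalue of A^T A.
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = x - step * (A.T @ (A @ x - y))                        # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold
    return x

def src_classify(A, labels, y, lam=0.01):
    """Return the label whose training columns best reconstruct y.

    A      : (dim, n_train) dictionary, one training vector per column
    labels : class label of each column
    y      : (dim,) test vector
    """
    A = A / np.linalg.norm(A, axis=0)   # unit-norm dictionary atoms
    y = y / np.linalg.norm(y)
    x = ista_lasso(A, y, lam)
    labels = np.asarray(labels)
    residuals = {}
    for c in np.unique(labels):
        x_c = np.where(labels == c, x, 0.0)   # keep only coefficients of class c
        residuals[int(c)] = float(np.linalg.norm(y - A @ x_c))
    return min(residuals, key=residuals.get)

# Toy data (assumption): two "speakers", two training vectors each.
A = np.array([[1.0, 0.1, 0.0], [1.0, 0.0, 0.1],
              [0.1, 1.0, 0.0], [0.0, 1.0, 0.1]]).T
labels = [0, 0, 1, 1]
y = np.array([0.05, 1.0, 0.05])   # lies between the two class-1 vectors
print(src_classify(A, labels, y))
```

In this toy case the coefficient vector concentrates on the columns of the second class, which is the "naturally sparse" behaviour the abstract relies on.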

13:30-16:30, Paper ThBCT9.18

Latency in Speech Feature Analysis for Telepresence State Coding

O’Gorman, Lawrence, Alcatel-Lucent Bell Lab.

For video conferencing, network bandwidth and screen real-estate constraints limit the number of user channels. We propose an intermediate transmission mode that transmits only at events, which are detected as changes in both audio and video from the short-term signal average. Our objective in this paper is to determine the latency until the audio portion of a single telepresence channel stabilizes. It is from this stable signal that we detect events. We describe a recursive filter approach for feature determination and experiments on the Switchboard telephone call database. Results show a latency to stable signal of up to 10 seconds, although events can be detected much more quickly.
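The notion of latency until a recursively filtered signal stabilizes can be made concrete with a short sketch. It assumes a first-order recursive (exponential) average, which the abstract does not specify; the function names, the smoothing factor `alpha`, and the tolerance `tol` are hypothetical.

```python
def recursive_average(samples, alpha, initial=0.0):
    """First-order recursive filter: avg <- avg + alpha * (x - avg)."""
    avg = initial
    out = []
    for x in samples:
        avg += alpha * (x - avg)
        out.append(avg)
    return out

def settling_index(values, target, tol):
    """First index from which every value stays within tol of target, else None."""
    last_bad = -1
    for i, v in enumerate(values):
        if abs(v - target) > tol:
            last_bad = i
    idx = last_bad + 1
    return idx if idx < len(values) else None

# A constant feature value of 1.0 observed for 10 frames, filter started at 0.0.
smoothed = recursive_average([1.0] * 10, alpha=0.5)
print(settling_index(smoothed, target=1.0, tol=0.05))  # -> 4
```

Dividing the returned frame index by the frame rate gives a latency in seconds. A smaller `alpha` yields a smoother average but a longer latency, which is the trade-off such an analysis has to quantify.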

13:30-16:30, Paper ThBCT9.19

Automatically Detecting Peaks in Terahertz Time-Domain Spectroscopy

Stephani, Henrike, Fraunhofer ITWM

Jonuscheit, Joachim, Fraunhofer IPM

Robiné, Christoph, Fraunhofer IPM

Heise, Bettina, JKU

To classify spectroscopic measurements, it is necessary to have comparable methods of evaluation. In Terahertz (THz) time-domain spectroscopy, a new technology, neither the presentation of the data nor the peak detection is standardized yet. We propose a procedure for automatic peak extraction in THz spectra of chemical compounds. After preprocessing in the time domain, we use a variance-based algorithm to determine the valid frequency region. We furthermore propose a baseline correction using simulated THz spectra. We illustrate how this procedure works using hyperspectral THz measurements of six chemical compounds. Subsequently, we propose to use unsupervised classification on the processed data to robustly detect the characteristic peaks of a compound.
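As a generic illustration of the baseline-correction and peak-picking steps, the following sketch subtracts a given baseline and reports local maxima above a height threshold. It is not the procedure of the paper: the variance-based region selection, the simulated-spectrum baseline, and the unsupervised classification are not reproduced, and the function name `find_peaks_after_baseline`, the threshold `min_height`, and the toy numbers are assumptions.

```python
def find_peaks_after_baseline(spectrum, baseline, min_height):
    """Return indices of local maxima of (spectrum - baseline) that reach min_height."""
    corrected = [s - b for s, b in zip(spectrum, baseline)]
    peaks = []
    for i in range(1, len(corrected) - 1):
        is_local_max = corrected[i] > corrected[i - 1] and corrected[i] >= corrected[i + 1]
        if is_local_max and corrected[i] >= min_height:
            peaks.append(i)
    return peaks

# Toy absorption spectrum on a slowly rising baseline (made-up numbers).
spectrum = [1.0, 1.2, 3.0, 1.4, 1.5, 1.6, 4.0, 1.8]
baseline = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7]
print(find_peaks_after_baseline(spectrum, baseline, min_height=1.0))  # -> [2, 6]
```

Without the baseline subtraction, a fixed height threshold would treat the rising background as signal, which is one reason a baseline correction precedes the peak detection.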

13:30-16:30, Paper ThBCT9.20

Iwasawa Decomposition and Computational Riemannian Geometry

Lenz, Reiner, Linköping Univ.

Mochizuki, Rika, Nippon Telegraph and Telephone Corp.

Chao, Jinhui, Chuo Univ.

We investigate several topics related to manifold techniques for signal processing. At the most general level, we consider manifolds with a Riemannian geometry. These manifolds are characterized by their inner products on the tangent spaces. We describe the connection between the symmetric positive-definite matrices defining these inner products and the Cartan and Iwasawa decompositions of the general linear matrix groups. This decomposition gives rise to the decomposition

