Mathematics in Independent Component Analysis

Chapter 17. IEEE SPL 13(2):96-99, 2006


Median-based clustering for underdetermined blind signal processing

Fabian J. Theis, Member, IEEE, Carlos G. Puntonet, Member, IEEE, Elmar W. Lang

Manuscript received xxx; revised xxx. Some preliminary results were reported at the conferences ESANN 2002, SIP 2002 and ICA 2003. FT and EL are with the Institute of Biophysics, University of Regensburg, 93040 Regensburg, Germany (phone: +49 941 943 2924, fax: +49 941 943 2479, e-mail: fabian@theis.name), and CP is with the Dep. Arquitectura y Tecnología de Computadores, Universidad de Granada, 18071 Granada, Spain.

Abstract—In underdetermined blind source separation, more sources are to be extracted from fewer observed mixtures, with both the sources and the mixing matrix unknown. k-means-style clustering algorithms are commonly used to do this algorithmically given sufficiently sparse sources, but in any case other than deterministic sources this lacks theoretical justification. After establishing that mean-based algorithms converge to wrong solutions in practice, we propose a median-based clustering scheme. Theoretical justification as well as algorithmic realizations (both online and batch) are given and illustrated by some examples.

[Figure 1 appears here: panel (a) shows the mixture scatter in the (x1, x2)-plane with the mixing angle α and the receptive field F(α); panel (b) plots the estimation errors ∆(α) of the mean- and the median-based estimates for α from 0 to 0.8, with errors between 0 and 0.1.]

Fig. 1. Mean- versus median-based clustering. We consider the mixture x of two independent gamma-distributed sources (γ = 0.5, 10^5 samples) using a mixing matrix A with columns inclined by α and (π/2 − α) respectively. (a) circle histogram for α = 0.4: the mixture density for α = 0.4 after projection onto the circle. (b) comparison of mean and median: for α ∈ [0, π/4), the error when estimating A by the mean and the median of the projected density in the receptive field F(α) = (−π/4, π/4) of the known column a1 of A. The former is the k-means convergence criterion.

EDICS Category: SAS-ICAB

BLIND source separation (BSS), mainly based on the assumption of independent sources, is currently the topic of many researchers [1], [2]. Given an observed m-dimensional mixture random vector x, which allows an unknown decomposition x = As, the goal is to identify the mixing matrix A and the unknown n-dimensional source random vector s. Commonly, first A is identified, and only then are the sources recovered. We will therefore denote the former task by blind mixing model recovery (BMMR), and the latter (with known A) by blind source recovery (BSR).

In the difficult case of underdetermined or overcomplete BSS, where fewer mixtures than sources are observed (m < n), BSR is non-trivial, see Section II. However, our main focus lies on the usually more elaborate matrix recovery. Assuming statistically independent sources with existing variance and at most one Gaussian component, it is well known that A is determined uniquely by x [3]. However, how to do this algorithmically is far from obvious, and although quite a few algorithms have been proposed recently [4]–[6], performance is still limited. The most commonly used overcomplete algorithms rely on sparse sources (after possible sparsification by preprocessing), which can be identified by clustering, usually by k-means or some extension [5], [6]. But apart from the fact that theoretical justifications have not been found, mean-based clustering only identifies the correct A if the data density approaches a delta distribution. In Fig. 1, we illustrate the deficiency of mean-based clustering; we get an error of up to 5° per mixing angle, which is rather substantial considering the sparse density and the simple, complete case of m = n = 2. Moreover, the figure indicates that median-based clustering performs much better. Indeed, mean-based clustering does not possess any equivariance property (performance independent of A). In the following we propose a novel median-based clustering method and prove its equivariance (Lemma 1.2) and convergence. For brevity, the proofs are given for the case of arbitrary n but m = 2, although they can be readily extended to higher sensor signal dimensions. Corresponding algorithms are proposed and experimentally validated.
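As a concrete illustration of the mean/median gap discussed above, the following Python sketch reruns a version of the experiment from the caption of Fig. 1. The symmetrized gamma sampling, the angle folding, and the sweep over α are our own assumptions for this sketch, so the printed errors will only qualitatively match the figure.

```python
import numpy as np

rng = np.random.default_rng(0)

def mixing_matrix(alpha):
    """Unit-norm columns inclined by alpha and (pi/2 - alpha)."""
    return np.array([[np.cos(alpha), np.cos(np.pi / 2 - alpha)],
                     [np.sin(alpha), np.sin(np.pi / 2 - alpha)]])

def angle_errors(alpha, n_samples=10**5, gamma_shape=0.5):
    # Symmetric sparse sources: gamma magnitudes with random signs.
    s = rng.gamma(gamma_shape, size=(2, n_samples))
    s *= rng.choice([-1.0, 1.0], size=s.shape)
    x = mixing_matrix(alpha) @ s                          # observed mixtures x = As
    theta = np.arctan2(x[1], x[0])                        # project onto the unit circle
    theta = np.mod(theta + np.pi / 4, np.pi) - np.pi / 4  # identify antipodal points
    rf = theta[np.abs(theta) < np.pi / 4]                 # receptive field F(alpha) of a1
    return abs(np.mean(rf) - alpha), abs(np.median(rf) - alpha)

for alpha in np.linspace(0.05, 0.75, 8):
    e_mean, e_median = angle_errors(alpha)
    print(f"alpha={alpha:.2f}  mean: {np.degrees(e_mean):4.2f} deg"
          f"  median: {np.degrees(e_median):4.2f} deg")
```

The mean error here is exactly the bias of the k-means convergence criterion within F(α), while the median error stays markedly smaller away from α = π/4.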

I. GEOMETRIC MATRIX RECOVERY<br />

Without loss of generality we assume that A has pairwise linearly independent columns, and m ≤ n. BMMR tries to identify A in x = As given x, where s is assumed to be statistically independent. Obviously, this can only be done up to equivalence [3], where B is said to be equivalent to A, B ∼ A, if B can be written as B = APL with an invertible diagonal matrix L (scaling matrix) and an invertible matrix P with unit vectors in each row (permutation matrix). Hence we may assume the columns ai of A to have unit norm.
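Because recovery is only defined up to this equivalence, any numerical experiment needs a test for B ∼ A. The following is a minimal Python sketch, assuming pairwise linearly independent columns and a greedy matching by absolute inner product; the function name and tolerance are ours, not from the paper.

```python
import numpy as np

def equivalent(A, B, tol=1e-8):
    """Test B ~ A, i.e. B = A P L with P a permutation and L invertible diagonal."""
    An = A / np.linalg.norm(A, axis=0)      # equivalence allows any nonzero column
    Bn = B / np.linalg.norm(B, axis=0)      # scaling, so compare unit-norm columns
    used = set()
    for j in range(Bn.shape[1]):
        overlaps = np.abs(An.T @ Bn[:, j])  # |cos| of the angle to each column of A
        k = int(np.argmax(overlaps))
        if k in used or overlaps[k] < 1 - tol:
            return False                    # no fresh, parallel partner column found
        used.add(k)
    return True
```

For example, equivalent(A, A[:, ::-1] * [2.0, -3.0]) is True, since reversing and rescaling the columns is exactly a permutation P followed by a scaling L.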

For geometric matrix recovery, we use a generalization [7] of the geometric ICA algorithm [8]. Let s be an independent n-dimensional, Lebesgue-continuous random vector with density ps describing the sources. As s is independent, ps factorizes into ps(s1, . . . , sn) = ps1(s1) · · · psn(sn) with the one-dimensional marginal source density functions psi. We assume symmetric sources, i.e. psi(s) = psi(−s) for s ∈ R and i ∈ [1 : n] := {1, . . . , n}; in particular E(s) = 0.

The geometric BMMR algorithm for symmetric distributions goes as follows [7]: Pick 2n starting vectors w1, w′1, . . . , wn, w′n on the unit sphere S^(m−1) ⊂ R^m such that the wi are pairwise linearly independent and w′i = −wi. Often, these wi are called neurons because they resemble the neurons used in clustering algorithms and in Kohonen's self-organizing maps. Furthermore, fix a learning rate η : N → R.
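The excerpt breaks off before the neuron update itself, so the following Python sketch is only a plausible online realization of the setup just described, not the update rule of [7]: the winner-take-all criterion, the sign handling of the antipodal pairs, and the learning-rate schedule are all our assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def geometric_bmmr(x, n_sources, eta=lambda t: 1.0 / (1.0 + 0.01 * t)):
    """Hypothetical online sketch: n antipodal neuron pairs on S^(m-1)
    track the column directions of A from the mixtures x (shape m x T)."""
    m, T = x.shape
    w = rng.standard_normal((n_sources, m))
    w /= np.linalg.norm(w, axis=1, keepdims=True)  # store w_i; w'_i = -w_i implicitly
    for t in range(T):
        norm = np.linalg.norm(x[:, t])
        if norm == 0:
            continue                               # skip degenerate samples
        y = x[:, t] / norm                         # project the sample onto the sphere
        scores = w @ y                             # signed overlap with each pair
        i = int(np.argmax(np.abs(scores)))         # winner among {w_i, -w_i}
        y_eff = np.sign(scores[i]) * y             # flip sample toward the stored neuron
        w[i] += eta(t) * (y_eff - w[i])            # move winner toward the sample
        w[i] /= np.linalg.norm(w[i])               # renormalize onto S^(m-1)
    return w.T                                     # columns estimate a_i up to sign/order
```

Storing only one neuron of each antipodal pair and flipping the sample's sign exploits the assumed source symmetry psi(s) = psi(−s).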

