

1.2 Uniqueness issues in independent component analysis

Application of ICA to BSS tacitly assumes that the data follow the model (1.1), i.e. that x(t) admits a decomposition into independent sources, and we want to find this decomposition. But neither the mixing function f nor the source signals s(t) are known, so we should expect this problem to have many solutions. Indeed, the order of the sources cannot be recovered (the speakers at the cocktail party do not have numbers), so there is always an inherent permutation indeterminacy. Moreover, the strength of each source cannot be extracted from this model alone, because f and s(t) can interchange so-called scaling factors. In other words, not knowing the power of each speaker at the cocktail party, we can only extract his speech, but not the volume: he could also be standing further away from the microphones, but shouting instead of speaking.
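In the linear case f = A, these two indeterminacies can be written out explicitly (a brief sketch added here for illustration; the diagonal matrix Λ and the permutation matrix P are not part of the original notation): for any invertible diagonal matrix Λ and any permutation matrix P,

\[
  x(t) \;=\; A\,s(t) \;=\; \bigl(A\,\Lambda^{-1} P^{-1}\bigr)\,\bigl(P\,\Lambda\, s(t)\bigr),
\]

and P Λ s(t) is again an independent random vector, so the rescaled and permuted pair solves the same separation problem. Hence A and s(t) can at best be recovered up to these two transformations.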

One of the key questions in ICA-based source separation is therefore: are there any other indeterminacies? Without fully answering this question, ICA algorithms cannot be applied to BSS, simply because we would not have any clue how to relate the resulting sources to the original ones. But apparently the set of indeterminacies cannot be very large; after all, at a cocktail party we ourselves are able to distinguish between the various speakers.

1.2.1 Linear case

In 1994, Comon was able to answer this question (Comon, 1994) in the linear case where f = A by reducing it to the Darmois-Skitovitch theorem (Darmois, 1953; Skitovitch, 1953, 1954). Essentially, he showed that if the sources contain at most one Gaussian component, the indeterminacies of the above model are only scaling and permutation. This positive answer more or less made the field popular; from then on, the number of papers published in this field each year increased considerably. However, it may be argued that Comon's proof has two shortcomings: because it relies on the rather difficult-to-prove old theorem by the two statisticians, the central idea of why there are no further indeterminacies is not at all obvious, and hence few attempts have been made to extend it to more general situations. Furthermore, no algorithm can be extracted from the proof, because it is non-constructive.

In (Theis, 2004a), we took a somewhat different approach. Instead of using Comon's idea of minimal mutual information, we reformulated the condition of source independence in a different way: in simple terms, a two-dimensional source vector s is independent if its density p_s factorizes into the product of two one-dimensional densities p_{s_1} and p_{s_2}. But this is the case only if ln p_s is a sum of one-dimensional functions, each depending on a different variable. Hence taking the partial derivative with respect to s_1 and then with respect to s_2 always yields zero. In other words, the Hessian H_{ln p_s} of the logarithmic density of the sources is diagonal; this is what we denoted by p_s being a 'separated function' in Theis (2004a), see chapter 2. Using only this property, we were able to prove Comon's uniqueness theorem (Theis, 2004a, theorem 2) without having to resort to the Darmois-Skitovitch theorem. Here Gl(n) denotes the group of invertible (n × n)-matrices.
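Spelled out for the two-dimensional case from above (a short added derivation, using only the quantities already defined): if p_s factorizes, then

\[
  \ln p_s(s_1, s_2) \;=\; \ln p_{s_1}(s_1) + \ln p_{s_2}(s_2)
  \qquad\Longrightarrow\qquad
  \frac{\partial^2 \ln p_s}{\partial s_1\, \partial s_2} \;=\; 0,
\]

so the only entries of H_{ln p_s} that can be non-zero are the diagonal ones, which is exactly the 'separated function' condition.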

Theorem 1.2.1 (Separability of linear BSS). Let A ∈ Gl(n; R) and let s be an independent random vector. Assume that s has at most one Gaussian component and that the covariance of s exists. Then As is independent if and only if A is the product of a scaling and a permutation matrix.
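The separability statement can also be checked numerically (a minimal sketch added here, not part of the original text; it assumes NumPy and scikit-learn are available and uses uniformly distributed, hence non-Gaussian, sources together with an arbitrarily chosen mixing matrix A): after a successful separation, the product of the estimated unmixing matrix and A should be close to a scaled permutation matrix.

import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)

# Two independent, non-Gaussian (uniform) sources; shape (n_samples, 2).
n = 20000
s = rng.uniform(-1.0, 1.0, size=(n, 2))

# An arbitrary invertible mixing matrix (the linear case f = A).
A = np.array([[1.0, 0.5],
              [0.3, 2.0]])
x = s @ A.T  # observed mixtures, x(t) = A s(t)

# Estimate an unmixing matrix W with FastICA.
ica = FastICA(n_components=2, whiten="unit-variance", random_state=0)
ica.fit(x)
W = ica.components_

# By Theorem 1.2.1, W @ A should be close to a scaled permutation matrix:
# exactly one dominant entry in each row and each column.
print(np.round(W @ A, 2))

Which permutation and which scaling are recovered differs from run to run; this is precisely the indeterminacy described above.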
