Mathematics in Independent Component Analysis

Chapter 10. IEEE TNN 16(4):992–996, 2005


are satisfied (see Theorem 1). When the sources are locally very sparse (see condition i) of Theorem 2), the matrix identification algorithm is much simpler. We used this simpler form for separation of mixtures of images. After a sparsification transformation (which is in fact an appropriate wavelet transformation), the algorithm works perfectly in the complete case. We demonstrate the effectiveness of our general matrix identification algorithm and the source recovery algorithm in the underdetermined case for 7 artificially created sparse source signals, such that the source matrix $S$ has at most 2 nonzero elements in each column, mixed with a randomly generated $3 \times 7$ matrix. For a comparison, we present a recovery using $\ell_1$-norm minimization [3], [4], which gives signals that are far from the original ones. This implies that the conditions which ensure equivalence of $\ell_1$-norm and $\ell_0$-norm minimization [4, Theorem 7] are generally not satisfied for randomly generated matrices. Note that $\ell_1$-norm minimization gives solutions which have at most $m$ nonzeros [3], [4]. Another connection with [4] is the fact that our algorithm for source recovery works "with probability one," i.e., for almost all data vectors $x$ (in the measure sense) such that the system $x = As$ has a sparse solution with less than $m$ nonzero elements, this solution is unique, while in [4] the authors proved that for all data vectors $x$ such that the system $x = As$ has a sparse solution with less than $\mathrm{Spark}(A)/2$ nonzero elements, this solution is unique. Note that $\mathrm{Spark}(A) \le m + 1$, where $\mathrm{Spark}(A)$ is the smallest number of linearly dependent columns of $A$.
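To make the $\ell_1$-norm baseline concrete, here is a minimal sketch (not the authors' code; it assumes NumPy and SciPy are available, and the variable names are illustrative) that solves $\min \|s\|_1$ subject to $As = x$ as a linear program for a randomly generated $3 \times 7$ matrix. Basic feasible solutions of this LP have at most $m$ nonzeros, in line with the remark above, and for a random $A$ the minimizer need not coincide with the true sparse source.

```python
# Sketch: l1-norm recovery of s from x = A s via linear programming.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
m, n = 3, 7
A = rng.standard_normal((m, n))       # randomly generated 3 x 7 mixing matrix
s_true = np.zeros(n)
s_true[[1, 5]] = [1.0, -2.0]          # m - 1 = 2 nonzero entries, as in the experiment
x = A @ s_true

# min ||s||_1  s.t.  A s = x, written with s = s_pos - s_neg, s_pos, s_neg >= 0
c = np.ones(2 * n)
A_eq = np.hstack([A, -A])
res = linprog(c, A_eq=A_eq, b_eq=x, bounds=(0, None))
s_l1 = res.x[:n] - res.x[n:]
print("true s:", s_true)
print("l1  s :", np.round(s_l1, 3))   # may differ from s_true for random A
```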

II. BLIND SOURCE SEPARATION

In this section, we develop a method for completely solving the BSS problem if the following assumptions are satisfied (a brute-force check of A1 is sketched after the list):

A1) the mixing matrix $A \in \mathbb{R}^{m \times n}$ has the property that any square $m \times m$ submatrix of it is nonsingular;

A2) each column of the source matrix $S$ has at most $m - 1$ nonzero elements;

A3) the sources are sufficiently richly represented in the following sense: for any index set of $n - m + 1$ elements $I = \{i_1, \dots, i_{n-m+1}\} \subset \{1, \dots, n\}$, there exist at least $m$ column vectors of the matrix $S$ such that each of them has zero elements in places with indexes in $I$, and each $m - 1$ of them are linearly independent.
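As referenced above, condition A1) can be verified directly for small $n$. The following is a minimal sketch (assuming NumPy; the helper name `satisfies_A1` is ours, not the paper's):

```python
# Sketch: brute-force test of condition A1) -- every m x m submatrix nonsingular.
import itertools
import numpy as np

def satisfies_A1(A, tol=1e-10):
    m, n = A.shape
    for cols in itertools.combinations(range(n), m):
        if abs(np.linalg.det(A[:, list(cols)])) <= tol:
            return False          # found a (numerically) singular m x m submatrix
    return True
```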

A. Matrix Identification

We describe conditions in the sparse BSS problem under which we can identify the mixing matrix uniquely up to permutation and scaling of the columns. We give two types of such conditions. The first corresponds to the least sparse case in which such identification is possible. Further, we consider the sparsest case (for a small number of samples), as in this case the algorithm is much simpler.

1) General Case—Full Identifiability:

Theorem 1 (Identifiability Conditions—General Case): Assume that in the representation $X = AS$ the matrix $A$ satisfies condition A1), the matrix $S$ satisfies conditions A2) and A3), and only the matrix $X$ is known. Then the mixing matrix $A$ is identifiable uniquely up to permutation and scaling of the columns.

Proof: It is clear that any column $a_i$ of the mixing matrix lies in the intersection of all $\binom{n-1}{m-2}$ hyperplanes generated by those columns of $A$ in which $a_i$ participates.

We will show that these hyperplanes can be obtained from the columns of the data $X$ under the conditions of the theorem. Let $\mathcal{J}$ be the set of all subsets of $\{1, \dots, n\}$ containing $m - 1$ elements, and let $J \in \mathcal{J}$. Note that $\mathcal{J}$ consists of $\binom{n}{m-1}$ elements. We will show that the hyperplane (denoted by $H_J$) generated by the columns of $A$ with indexes from $J$ can be obtained from some columns of $X$. By A2) and A3), there exist $m$ indexes $\{t_k\}_{k=1}^{m} \subset \{1, \dots, N\}$ such that any $m - 1$ vector columns of $\{S(:, t_k)\}_{k=1}^{m}$ form a basis of the $(m-1)$-dimensional coordinate subspace of $\mathbb{R}^n$ with zero coordinates given by $\{1, \dots, n\} \setminus J$. Because of the mixing model, vectors of the form

$$x_k = \sum_{j \in J} S(j, t_k)\, a_j, \qquad k = 1, \dots, m$$

belong to the data matrix $X$. Now, by condition A1) it follows that any $m - 1$ of the vectors $\{x_k\}_{k=1}^{m}$ are linearly independent, which implies that they will span the same hyperplane $H_J$. By A1) and the above, it follows that we can cluster the columns of $X$ uniquely in $\binom{n}{m-1}$ groups $H_i$, $i = 1, \dots, \binom{n}{m-1}$, such that each group $H_i$ contains at least $m$ elements and its elements span one hyperplane $H_{J_i}$ for some $J_i \in \mathcal{J}$. Now we cluster the hyperplanes obtained in such a way in the smallest number of groups such that the intersection of all hyperplanes in each group gives a single one-dimensional (1-D) subspace. It is clear that such a 1-D subspace will contain one column of the mixing matrix, that the number of these groups is $n$, and that each group consists of $\binom{n-1}{m-2}$ hyperplanes.
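As a consistency check on these counts (our addition, not in the original text): each hyperplane $H_J$ is generated by $m - 1$ columns of $A$, so it appears in exactly $m - 1$ of the $n$ groups, and indeed

$$n \binom{n-1}{m-2} = (m-1)\binom{n}{m-1},$$

e.g., for the experiment above with $n = 7$ and $m = 3$: $7 \cdot \binom{6}{1} = 42 = 2 \cdot \binom{7}{2}$.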

The proof of this theorem gives the idea for the matrix identification algorithm.

Algorithm for Identification of the Mixing Matrix:

1) Cluster the columns of $X$ in $\binom{n}{m-1}$ groups $H_i$, $i = 1, \dots, \binom{n}{m-1}$, such that the span of the elements of each group $H_i$ produces one hyperplane and these hyperplanes are different.

2) Cluster the normal vectors to these hyperplanes in the smallest number of groups $G_j$, $j = 1, \dots, n$ (which gives the number of sources $n$), such that the normal vectors to the hyperplanes in each group $G_j$ lie in a new hyperplane $\hat{H}_j$.

3) Calculate the normal vectors $\hat{a}_j$ to each hyperplane $\hat{H}_j$, $j = 1, \dots, n$. Note that the 1-D subspace spanned by $\hat{a}_j$ is the intersection of all hyperplanes in $G_j$. The matrix $\hat{A}$ with columns $\hat{a}_j$ is an estimation of the mixing matrix (up to permutation and scaling of the columns).
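A hedged sketch of step 1) follows, under the assumption that an exhaustive search over $(m-1)$-element column subsets is affordable (small $N$) and that exact incidence can be tested up to a tolerance; this is one possible realization, not the authors' implementation, and `find_hyperplanes` is our name. Steps 2) and 3) then amount to running the same hyperplane grouping once more on the normal vectors returned here.

```python
# Sketch of step 1): group columns of X by the hyperplane they span.
import itertools
import numpy as np

def find_hyperplanes(X, tol=1e-8):
    m, N = X.shape
    found = []                                    # list of (unit normal, member indices)
    for idx in itertools.combinations(range(N), m - 1):
        B = X[:, list(idx)]
        if np.linalg.matrix_rank(B, tol) < m - 1:
            continue                              # subset does not span a hyperplane
        u = np.linalg.svd(B)[0][:, -1]            # unit normal of span(B)
        members = np.flatnonzero(np.abs(u @ X) < tol)
        if len(members) < m:
            continue                              # A3 guarantees >= m columns per group
        if all(abs(u @ v) < 1 - tol for v, _ in found):
            found.append((u, members))            # new hyperplane (normal not seen yet)
    return found
```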

2) Degenerate Case—Sparse Instances:

Theorem 2 (Identifiability Conditions—Locally Very Sparse Representation): Assume that the number of sources is unknown and the following hold:

i) for each index $i = 1, \dots, n$ there are at least two columns of $S$, $S(:, j_1)$ and $S(:, j_2)$, which have nonzero elements only in position $i$ (so each source is uniquely present at least twice);

ii) $X(:, j) \neq c\,X(:, k)$ for any $c \in \mathbb{R}$, any $j = 1, \dots, N$ and any $k = 1, \dots, N$, $k \neq j$, for which $S(:, k)$ has more than one nonzero element.

Then the number of sources and the matrix $A$ are identifiable uniquely up to permutation and scaling.

Proof: We cluster in groups all nonzero normalized column vectors of $X$ such that each group consists of vectors which differ only by sign. From conditions i) and ii), it follows that the number of the groups containing more than one element is precisely the number of sources $n$, and that each such group will represent a normalized column of $A$ (up to sign).

In the following, we include an algorithm for identification of the mixing matrix based on Theorem 2.

Algorithm for Identification of the Mixing Matrix in the Very Sparse Case:

1) Remove all zero columns of $X$ (if any) and obtain a matrix $X_1 \in \mathbb{R}^{m \times N_1}$.
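The remaining steps of the listing are cut off on this page, but the proof of Theorem 2 describes the whole procedure. Here is a minimal sketch following that proof (our code, not the paper's listing; assumes NumPy, and `identify_very_sparse` is our name): normalize the nonzero columns, cluster those that agree up to sign, and keep the clusters with more than one member as estimated columns of $A$.

```python
# Sketch: mixing-matrix identification in the very sparse case (Theorem 2 proof).
import numpy as np

def identify_very_sparse(X, tol=1e-8):
    # normalized nonzero columns (step 1: zero columns are dropped)
    cols = [x / np.linalg.norm(x) for x in X.T if np.linalg.norm(x) > tol]
    clusters = []                                 # list of [representative, count]
    for x in cols:
        for rep in clusters:
            # same group if equal up to sign
            if min(np.linalg.norm(x - rep[0]), np.linalg.norm(x + rep[0])) < tol:
                rep[1] += 1
                break
        else:
            clusters.append([x, 1])
    # groups with more than one member correspond to sources (condition i))
    return np.column_stack([r for r, cnt in clusters if cnt > 1])
```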
