Multichannel Extensions of Non-Negative Matrix ... - IEEE Xplore


Fig. 4. Variations of NMF presented in this paper (panel labels: preprocessed observation, factorization model, distance/divergence, probability distribution, element-wise). Items that correspond to standard single-channel NMF are shown in blue. Three distances/divergences are discussed, and their corresponding probability distributions are presented. Items that correspond to multichannel extensions of NMF are shown in red. Two of the distances/divergences are extended to the multichannel case.

Fig. 5. Illustrative example of multichannel NMF: $I = 6$, $J = 10$, $K = 2$, $M = 2$. Non-negative values are shown in gray and complex values are shown in red.

A. Formulation (IS divergence)

Let $M$ be the number of microphones, and $\tilde{x} = [\tilde{x}_1, \ldots, \tilde{x}_M]^T \in \mathbb{C}^M$ be a complex-valued vector for a time-frequency slot, with $\tilde{x}_m$ being the STFT coefficient at the $m$-th microphone. Let $\tilde{x}_{ij}$ be such a vector at frequency bin $i$ and time frame $j$. Now, let us introduce a multivariate complex Gaussian distribution $N_c$ that extends (12):

$$
N_c(\tilde{x}_{ij} \mid 0, \hat{X}_{ij}) \propto \frac{1}{\det \hat{X}_{ij}} \exp\left(-\tilde{x}_{ij}^H \hat{X}_{ij}^{-1} \tilde{x}_{ij}\right), \tag{14}
$$

where $\hat{X}_{ij}$ is an $M \times M$ covariance matrix that should be Hermitian positive definite. Let $X_{ij} = \tilde{x}_{ij} \tilde{x}_{ij}^H$, or

$$
X = \tilde{x}\tilde{x}^H =
\begin{bmatrix}
|\tilde{x}_1|^2 & \cdots & \tilde{x}_1 \tilde{x}_M^* \\
\vdots & \ddots & \vdots \\
\tilde{x}_M \tilde{x}_1^* & \cdots & |\tilde{x}_M|^2
\end{bmatrix} \tag{15}
$$

be the outer product of a complex-valued vector. We then define the multichannel IS divergence similarly to (13):

$$
\begin{aligned}
d_{IS}(X_{ij}, \hat{X}_{ij})
&= \log N_c(\tilde{x}_{ij} \mid 0, X_{ij}) - \log N_c(\tilde{x}_{ij} \mid 0, \hat{X}_{ij}) \\
&= \left[-\log\det X_{ij} - \mathrm{tr}(X_{ij} X_{ij}^{-1})\right] - \left[-\log\det \hat{X}_{ij} - \mathrm{tr}(X_{ij} \hat{X}_{ij}^{-1})\right] \\
&= \mathrm{tr}(X_{ij} \hat{X}_{ij}^{-1}) - \log\det X_{ij} \hat{X}_{ij}^{-1} - M,
\end{aligned} \tag{16}
$$

where $\mathrm{tr}(X) = \sum_{m=1}^{M} x_{mm}$ is the trace of a square matrix $X$.

We assume that the source locations are time-invariant in a source separation task (see Fig. 2). Therefore, we introduce a matrix $H_{ik}$ that models the spatial property of the $k$-th NMF basis at frequency bin $i$. The matrix is of size $M \times M$ to match the size of $\hat{X}_{ij}$. Also, the matrix $H_{ik}$ is Hermitian positive semidefinite, so that it is non-negative in a multichannel sense. Then, we model $\hat{X}_{ij}$ with a sum-of-product form

$$
\hat{X}_{ij} = \sum_{k=1}^{K} H_{ik}\, t_{ik}\, v_{kj}, \tag{17}
$$

where $t_{ik}$ and $v_{kj}$ are non-negative scalars as in the single-channel case. To resolve the scaling ambiguity between $H_{ik}$ and $t_{ik}$, let $H_{ik}$ have unit trace, $\mathrm{tr}(H_{ik}) = 1$.

In matrix-wise notation, let $X$ and $H$ be $I \times J$ and $I \times K$ hierarchical matrices whose elements are $M \times M$ matrices, i.e., $[X]_{ij} = X_{ij}$ and $[H]_{ik} = H_{ik}$. Figure 5 provides an illustrative example in which multichannel NMF factorizes a hierarchically structured matrix $X$ into the product of $H \circ T$ and $V$, where $\circ$ represents the Hadamard product, i.e., $[H \circ T]_{ik} = H_{ik} t_{ik}$. The multichannel NMF is formulated to minimize the total multichannel divergence, similar to (3):

$$
D_*(X, \{T, V, H\}) = \sum_{i=1}^{I} \sum_{j=1}^{J} d_*(X_{ij}, \hat{X}_{ij}), \tag{18}
$$

where $d_*$ is the element-wise multichannel divergence, such as (16) in the IS divergence case.

B. Formulation (Squared Euclidean distance)

In this subsection, we consider a multichannel extension of Euclidean NMF. Thanks to the versatility of a univariate complex Gaussian distribution $N_c$, we can model the preprocessed observations $X$, $[X]_{ij} = X_{ij}$, as

$$
p(X \mid T, V, H) = \prod_{i=1}^{I} \prod_{j=1}^{J} \prod_{m=1}^{M} \prod_{n=1}^{M} N_c\left([X_{ij}]_{mn} \mid [\hat{X}_{ij}]_{mn}, 1\right)
\propto \prod_{i=1}^{I} \prod_{j=1}^{J} \exp\left(-\|X_{ij} - \hat{X}_{ij}\|_F^2\right), \tag{19}
$$

where $\|B\|_F^2 = \sum_{m=1}^{M} \sum_{n=1}^{M} |b_{mn}|^2$ is the squared Frobenius norm of matrix $B$. Maximizing the log of the likelihood (19) is equivalent to minimizing the distance (18) with the element-wise multichannel distance

$$
d_{Eu}(X_{ij}, \hat{X}_{ij}) = \|X_{ij} - \hat{X}_{ij}\|_F^2. \tag{20}
$$

Therefore, multichannel Euclidean NMF is formulated as minimizing (18) with (20).

When applying standard single-channel Euclidean NMF, it is typical that the absolute value $x_{ij} = |\tilde{x}_{ij}|$ in (1) is employed rather than the squared value, to prevent some observations from being unnecessarily enhanced. In the same sense, an amplitude square-rooted version of the outer product (15) would be useful in the multichannel Euclidean NMF:

$$
X =
\begin{bmatrix}
|x_1| & \cdots & |x_1 x_M|^{1/2}\, \mathrm{sign}(x_1 x_M^*) \\
\vdots & \ddots & \vdots \\
|x_M x_1|^{1/2}\, \mathrm{sign}(x_M x_1^*) & \cdots & |x_M|
\end{bmatrix}, \tag{21}
$$

where $\mathrm{sign}(x) = x / |x|$.

C. Algorithm (Multiplicative Update Rules)

As shown in the next two subsections, the following multiplicative update rules are derived to minimize the total distance/divergence (18) with (16) or (20). These update rules reduce to their single-channel counterparts (9) and (7) if $M = 1$, $X_{ij} = x_{ij}$ and $H_{ik} = 1$. Therefore, sets of these updates constitute multichannel extensions of NMF.

IS-NMF (IS divergence)

$$
t_{ik} \leftarrow t_{ik} \sqrt{\frac{\sum_j v_{kj}\, \mathrm{tr}(\hat{X}_{ij}^{-1} X_{ij} \hat{X}_{ij}^{-1} H_{ik})}{\sum_j v_{kj}\, \mathrm{tr}(\hat{X}_{ij}^{-1} H_{ik})}} \tag{22}
$$

$$
v_{kj} \leftarrow v_{kj} \sqrt{\frac{\sum_i t_{ik}\, \mathrm{tr}(\hat{X}_{ij}^{-1} X_{ij} \hat{X}_{ij}^{-1} H_{ik})}{\sum_i t_{ik}\, \mathrm{tr}(\hat{X}_{ij}^{-1} H_{ik})}} \tag{23}
$$

To update $H_{ik}$, we solve an algebraic Riccati equation (see Appendix I)

$$
H_{ik} A H_{ik} = B \tag{24}
$$

with

$$
A = \sum_j v_{kj} \hat{X}_{ij}^{-1}, \qquad
B = H'_{ik} \left( \sum_j v_{kj} \hat{X}_{ij}^{-1} X_{ij} \hat{X}_{ij}^{-1} \right) H'_{ik},
$$

where $H'_{ik}$ is the target matrix before the update.

EU-NMF (Squared Euclidean distance)

$$
t_{ik} \leftarrow t_{ik} \frac{\sum_j v_{kj}\, \mathrm{tr}(X_{ij} H_{ik})}{\sum_j v_{kj}\, \mathrm{tr}(\hat{X}_{ij} H_{ik})} \tag{25}
$$

$$
v_{kj} \leftarrow v_{kj} \frac{\sum_i t_{ik}\, \mathrm{tr}(X_{ij} H_{ik})}{\sum_i t_{ik}\, \mathrm{tr}(\hat{X}_{ij} H_{ik})} \tag{26}
$$

$$
H_{ik} \leftarrow H_{ik} \left( \sum_j v_{kj} \hat{X}_{ij} \right)^{-1} \left( \sum_j v_{kj} X_{ij} \right). \tag{27}
$$

Post-processing is needed to make $H_{ik}$ Hermitian and positive semidefinite. This can be accomplished by $H_{ik} \leftarrow \frac{1}{2}(H_{ik} + H_{ik}^H)$, then performing the eigenvalue decomposition $H_{ik} = U D U^H$, setting all the negative elements of $D$ to zero, and updating $H_{ik} \leftarrow U D U^H$ with the new $D$. We confirmed empirically that the update (27) followed by the post-processing always decreases the squared Euclidean distance. However, we have not yet found a theoretical guarantee.

For both the IS and Euclidean cases, unit-trace normalization $H_{ik} \leftarrow H_{ik} / \mathrm{tr}(H_{ik})$ should follow.

D. Derivation of algorithm (IS-NMF)

This subsection explains the derivation of the multiplicative update rules (22)-(24) for the IS divergence.
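As an illustration of the update rules (22) and (23) referred to above, the following NumPy sketch applies them to random data with the sizes of Fig. 5 ($I = 6$, $J = 10$, $K = 2$, $M = 2$). It is a minimal sketch and not the authors' implementation: the function names (`model`, `is_cost`, `update_T`, `update_V`), the array layout, and the random initialization are assumptions made for this example, and $H_{ik}$ is held fixed, so the Riccati update (24) is not covered.

```python
import numpy as np


def model(T, V, H):
    """X_hat[i, j] = sum_k H[i, k] * T[i, k] * V[k, j], the model (17)."""
    return np.einsum("ik,kj,ikmn->ijmn", T, V, H)


def is_cost(X, X_hat):
    """Sum over (i, j) of tr(X X_hat^{-1}) + log det X_hat (constants dropped)."""
    trace_term = np.einsum("ijmn,ijnm->", X, np.linalg.inv(X_hat)).real
    _, logdet = np.linalg.slogdet(X_hat)
    return trace_term + logdet.sum()


def _traces(X, T, V, H):
    """tr(X_hat^{-1} X X_hat^{-1} H_ik) and tr(X_hat^{-1} H_ik), shape (I, J, K)."""
    inv = np.linalg.inv(model(T, V, H))
    G = inv @ X @ inv
    num = np.einsum("ijmn,iknm->ijk", G, H).real
    den = np.einsum("ijmn,iknm->ijk", inv, H).real
    return num, den


def update_T(X, T, V, H):
    """Multiplicative update (22): weight the traces by v_kj and sum over j."""
    num, den = _traces(X, T, V, H)
    ratio = np.einsum("kj,ijk->ik", V, num) / np.einsum("kj,ijk->ik", V, den)
    return T * np.sqrt(ratio)


def update_V(X, T, V, H):
    """Multiplicative update (23): weight the traces by t_ik and sum over i."""
    num, den = _traces(X, T, V, H)
    ratio = np.einsum("ik,ijk->kj", T, num) / np.einsum("ik,ijk->kj", T, den)
    return V * np.sqrt(ratio)


# Toy data (assumed for illustration only).
rng = np.random.default_rng(0)
I, J, K, M = 6, 10, 2, 2
x = rng.standard_normal((I, J, M)) + 1j * rng.standard_normal((I, J, M))
X = np.einsum("ijm,ijn->ijmn", x, x.conj())  # outer products, as in (15)

A = rng.standard_normal((I, K, M, M)) + 1j * rng.standard_normal((I, K, M, M))
H = A @ A.conj().swapaxes(-1, -2) + 0.1 * np.eye(M)  # Hermitian positive definite
H /= np.trace(H, axis1=-2, axis2=-1).real[..., None, None]  # unit trace
T = rng.random((I, K)) + 0.1
V = rng.random((K, J)) + 0.1

costs = [is_cost(X, model(T, V, H))]
for _ in range(20):
    T = update_T(X, T, V, H)
    costs.append(is_cost(X, model(T, V, H)))
    V = update_V(X, T, V, H)
    costs.append(is_cost(X, model(T, V, H)))

print(costs[0], costs[-1])
```

The list `costs` records the IS objective after every half-step, so one can inspect whether it is non-increasing on this toy problem. Recomputing $\hat{X}_{ij}$ before each of the two updates is a design choice of the sketch.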
For a given observation $X$, the total distance (18) together with (16) can be written as

$$
f(T, V, H) = \sum_{i,j} \left[ \mathrm{tr}(X_{ij} \hat{X}_{ij}^{-1}) + \log\det \hat{X}_{ij} \right], \tag{28}
$$

where constant terms are omitted. To minimize this function $f(T, V, H)$, we follow the optimization scheme of majorization [21], [22], in which an auxiliary (majorization) function is used. Let us define an auxiliary function

$$
f^+(T, V, H, R, U) = \sum_{i,j} \left[ \sum_k \frac{\mathrm{tr}(X_{ij} R_{ijk}^H H_{ik}^{-1} R_{ijk})}{t_{ik} v_{kj}} + \log\det U_{ij} + \frac{\det \hat{X}_{ij} - \det U_{ij}}{\det U_{ij}} \right], \tag{29}
$$

where $R_{ijk}$ and $U_{ij}$ are auxiliary variables that satisfy positive definiteness, $\sum_k R_{ijk} = I$ with $I$ being an identity matrix of size $M$, and $U_{ij} = U_{ij}^H$ (Hermitian). It can be verified that the auxiliary function $f^+$ has two properties:

1) $f(T, V, H) \le f^+(T, V, H, R, U)$
2) $f(T, V, H) = \min_{R, U} f^+(T, V, H, R, U)$

and the equality $f = f^+$ is satisfied when

$$
R_{ijk} = t_{ik} v_{kj} H_{ik} \hat{X}_{ij}^{-1}, \qquad U_{ij} = \hat{X}_{ij} \tag{30}
$$

(see Appendix II for the proof).

The function $f$ is indirectly minimized by repeating the following two steps:

1) Minimizing $f^+$ with respect to $R$ and $U$ by (30), which makes $f(T, V, H) = f^+(T, V, H, R, U)$.
2) Minimizing $f^+$ with respect to $T$, $V$ or $H$, which also minimizes $f$.

For the second step, we calculate the partial derivatives of $f^+$ w.r.t. the variables $T$, $V$ and $H$. Setting these derivatives to zero, we have the following equations.

$$
t_{ik}^2 \leftarrow \frac{\sum_j \frac{1}{v_{kj}}\, \mathrm{tr}(R_{ijk}^H H_{ik}^{-1} R_{ijk} X_{ij})}{\sum_j \frac{\det \hat{X}_{ij}}{\det U_{ij}}\, v_{kj}\, \mathrm{tr}(\hat{X}_{ij}^{-1} H_{ik})} \tag{31}
$$

$$
v_{kj}^2 \leftarrow \frac{\sum_i \frac{1}{t_{ik}}\, \mathrm{tr}(R_{ijk}^H H_{ik}^{-1} R_{ijk} X_{ij})}{\sum_i \frac{\det \hat{X}_{ij}}{\det U_{ij}}\, t_{ik}\, \mathrm{tr}(\hat{X}_{ij}^{-1} H_{ik})} \tag{32}
$$

$$
H_{ik} \left( t_{ik} \sum_j \hat{X}_{ij}^{-1} v_{kj} \right) H_{ik} = \sum_j \frac{R_{ijk} X_{ij} R_{ijk}^H}{t_{ik} v_{kj}} \tag{33}
$$

By substituting (30) into these equations, we obtain the multiplicative update rules (22)-(24).

E. Derivation of algorithm (EU-NMF)

The EU-NMF updates (25)-(27) can be derived in a similar manner. For a given observation $X$, the total distance (18)
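The two majorization properties stated in Subsection D, $f \le f^+$ with equality under the choice (30), can be checked numerically for a single time-frequency slot $(i, j)$. The sketch below is an illustration under assumptions, not part of the paper: the helper names (`random_hpd`, `logdet`, `f`, `f_plus`), the sizes $M = 2$, $K = 3$, and the random data are all hypothetical. The perturbed $R_k$ keep only the constraint $\sum_k R_k = I$ and are not forced to be positive definite.

```python
import numpy as np

rng = np.random.default_rng(0)
M, K = 2, 3  # assumed sizes for the toy check


def random_hpd(m):
    """Random Hermitian positive definite m x m matrix."""
    a = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))
    return a @ a.conj().T + 0.1 * np.eye(m)


def logdet(A):
    """log |det A|; for Hermitian positive definite A this is log det A."""
    return np.linalg.slogdet(A)[1]


# One (i, j) slot: unit-trace H_k, positive t_k and v_k, rank-one observation X.
H = [random_hpd(M) for _ in range(K)]
H = [h / np.trace(h).real for h in H]
t = rng.random(K) + 0.1
v = rng.random(K) + 0.1
x = rng.standard_normal(M) + 1j * rng.standard_normal(M)
X = np.outer(x, x.conj())
X_hat = sum(t[k] * v[k] * H[k] for k in range(K))  # model (17)


def f(X, X_hat):
    """Single-slot term of the objective: tr(X X_hat^{-1}) + log det X_hat."""
    return np.trace(X @ np.linalg.inv(X_hat)).real + logdet(X_hat)


def f_plus(X, X_hat, R, U):
    """Single-slot term of the auxiliary function with auxiliary R_k and U."""
    quad = sum(
        np.trace(X @ R[k].conj().T @ np.linalg.inv(H[k]) @ R[k]).real
        / (t[k] * v[k])
        for k in range(K)
    )
    det_ratio = np.exp(logdet(X_hat) - logdet(U))  # det X_hat / det U
    return quad + logdet(U) + det_ratio - 1.0


# Auxiliary variables chosen as in (30).
X_hat_inv = np.linalg.inv(X_hat)
R_opt = [t[k] * v[k] * H[k] @ X_hat_inv for k in range(K)]

# Perturbed auxiliary variables: the perturbations sum to zero over k,
# so the constraint sum_k R_k = I is preserved.
E = rng.standard_normal((K, M, M)) + 1j * rng.standard_normal((K, M, M))
E -= E.mean(axis=0)
R_pert = [R_opt[k] + 0.3 * E[k] for k in range(K)]
U_pert = random_hpd(M)

f_val = f(X, X_hat)
f_plus_opt = f_plus(X, X_hat, R_opt, X_hat)
f_plus_pert = f_plus(X, X_hat, R_pert, U_pert)
print(f_val, f_plus_opt, f_plus_pert)
```

If the properties hold, `f_plus_opt` should agree with `f_val` up to rounding, and `f_plus_pert` should be no smaller than `f_val`.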
