11.03.2015 Views

MIMO Detection for High-Order QAM Based on a ... - IEEE Xplore

MIMO Detection for High-Order QAM Based on a ... - IEEE Xplore

MIMO Detection for High-Order QAM Based on a ... - IEEE Xplore

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

<strong>IEEE</strong> TRANSACTIONS ON INFORMATION THEORY, VOL. 57, NO. 8, AUGUST 2011 4973<br />

<str<strong>on</strong>g>MIMO</str<strong>on</strong>g> <str<strong>on</strong>g>Detecti<strong>on</strong></str<strong>on</strong>g> <str<strong>on</strong>g>for</str<strong>on</strong>g> <str<strong>on</strong>g>High</str<strong>on</strong>g>-<str<strong>on</strong>g>Order</str<strong>on</strong>g> <str<strong>on</strong>g>QAM</str<strong>on</strong>g> <str<strong>on</strong>g>Based</str<strong>on</strong>g> <strong>on</strong> a<br />

Gaussian Tree Approximati<strong>on</strong><br />

Jacob Goldberger and Amir Leshem, Senior Member, <strong>IEEE</strong><br />

Abstract—This paper proposes a new detecti<strong>on</strong> algorithm <str<strong>on</strong>g>for</str<strong>on</strong>g><br />

<str<strong>on</strong>g>MIMO</str<strong>on</strong>g> communicati<strong>on</strong> systems employing high-order <str<strong>on</strong>g>QAM</str<strong>on</strong>g> c<strong>on</strong>stellati<strong>on</strong>s.<br />

The factor graph that corresp<strong>on</strong>ds to this problem is<br />

very loopy; in fact, it is a complete graph. Hence, a straight<str<strong>on</strong>g>for</str<strong>on</strong>g>ward<br />

applicati<strong>on</strong> of the Belief Propagati<strong>on</strong> (BP) algorithm yields very<br />

poor results. Our algorithm is based <strong>on</strong> an optimal tree approximati<strong>on</strong><br />

of the Gaussian density of the unc<strong>on</strong>strained linear system.<br />

The finite-set c<strong>on</strong>straint is then applied to obtain a cycle-free discrete<br />

distributi<strong>on</strong>. Simulati<strong>on</strong> results show that even though the<br />

approximati<strong>on</strong> is not directly applied to the exact discrete distributi<strong>on</strong>,<br />

applying the BP algorithm to the cycle-free factor graph<br />

outper<str<strong>on</strong>g>for</str<strong>on</strong>g>ms current methods in terms of both per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance and<br />

complexity. The improved per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance of the proposed algorithm<br />

is dem<strong>on</strong>strated <strong>on</strong> the problem of <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> detecti<strong>on</strong>.<br />

Index Terms—<str<strong>on</strong>g>High</str<strong>on</strong>g>-order <str<strong>on</strong>g>QAM</str<strong>on</strong>g>, integer least squares, <str<strong>on</strong>g>MIMO</str<strong>on</strong>g><br />

communicati<strong>on</strong> systems, <str<strong>on</strong>g>MIMO</str<strong>on</strong>g>-OFDM systems.<br />

Here, noise is modeled by the random vector which is independent<br />

of and whose comp<strong>on</strong>ents are assumed to be i.i.d. according<br />

to a complex Gaussian distributi<strong>on</strong> with mean zero and<br />

with known variance . The matrix comprises i.i.d.<br />

elements drawn from a complex normal distributi<strong>on</strong> of unit variance.<br />

The <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> detecti<strong>on</strong> problem c<strong>on</strong>sists of finding the unknown<br />

transmitted vector given and . The task, there<str<strong>on</strong>g>for</str<strong>on</strong>g>e,<br />

boils down to solving a linear system in which the unknowns<br />

are c<strong>on</strong>strained to a discrete finite set. It is c<strong>on</strong>venient to re<str<strong>on</strong>g>for</str<strong>on</strong>g>mulate<br />

the complex-valued model into a real-valued <strong>on</strong>e. It can<br />

be translated into an equivalent double-size real-valued representati<strong>on</strong><br />

that is obtained by c<strong>on</strong>sidering the real and imaginary<br />

parts separately<br />

I. INTRODUCTION<br />

F<br />

INDING A linear least squares fit to data is a well-known<br />

problem, with applicati<strong>on</strong>s in almost every field of<br />

science. When there are no restricti<strong>on</strong>s <strong>on</strong> the variables, the<br />

problem has a closed-<str<strong>on</strong>g>for</str<strong>on</strong>g>m soluti<strong>on</strong>. In many cases, a priori<br />

knowledge <strong>on</strong> the values of the variables is available. One<br />

example is the existence of priors, which leads to Bayesian<br />

estimators. Another example of great interest in a variety of<br />

areas is when the variables are c<strong>on</strong>strained to a discrete finite<br />

set. This problem has many diverse applicati<strong>on</strong>s such as the<br />

decoding of multi-input-multi-output (<str<strong>on</strong>g>MIMO</str<strong>on</strong>g>) digital communicati<strong>on</strong><br />

systems. In c<strong>on</strong>trast to the c<strong>on</strong>tinuous linear least<br />

squares problem, this problem is known to be NP hard [1].<br />

We c<strong>on</strong>sider a <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> communicati<strong>on</strong> system with transmit<br />

antennas and receive antennas. The tap gain from transmit antenna<br />

to receive antenna is denoted by . In each use of the<br />

<str<strong>on</strong>g>MIMO</str<strong>on</strong>g> channel a vector<br />

is independently<br />

selected from a finite set of complex numbers according to<br />

the data to be transmitted, so that . We further assume<br />

that in each use of the <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> channel, is uni<str<strong>on</strong>g>for</str<strong>on</strong>g>mly sampled<br />

from . The received vector is given by<br />

Manuscript received January 20, 2010; revised April 13, 2011; accepted April<br />

23, 2011. Date of current versi<strong>on</strong> July 29, 2011. The material in this paper was<br />

presented in part at the Neural In<str<strong>on</strong>g>for</str<strong>on</strong>g>mati<strong>on</strong> Processing Systems C<strong>on</strong>ference,<br />

Vancouver, BC, Canada, Dec. 2009.<br />

J. Goldberger is with the School of Engineering, Bar-Ilan University, 52900,<br />

Ramat-Gan, Israel (e-mail: goldbej@eng.biu.ac.il).<br />

A. Leshem is with the School of Engineering, Bar-Ilan University, 52900,<br />

Ramat-Gan, Israel (e-mail: leshema@eng.biu.ac.il).<br />

Communicated by P. O. V<strong>on</strong>tobel, Associate Editor <str<strong>on</strong>g>for</str<strong>on</strong>g> Coding Techniques.<br />

Digital Object Identifier 10.1109/TIT.2011.2159037<br />

(1)<br />

Hence we assume hereafter that has real values without any<br />

loss of generality. The maximum likelihood (ML) soluti<strong>on</strong> is<br />

However, going over all the vectors is practically unfeasible<br />

when either or are large.<br />

A simple suboptimal soluti<strong>on</strong>, known as the zero-<str<strong>on</strong>g>for</str<strong>on</strong>g>cing algorithm,<br />

is based <strong>on</strong> a linear decisi<strong>on</strong> that ignores the finite-set<br />

c<strong>on</strong>straint<br />

and then, neglecting the correlati<strong>on</strong> between the symbols,<br />

finding the closest point in <str<strong>on</strong>g>for</str<strong>on</strong>g> each symbol independently<br />

This scheme per<str<strong>on</strong>g>for</str<strong>on</strong>g>ms poorly due to its inability to handle illc<strong>on</strong>diti<strong>on</strong>ed<br />

realizati<strong>on</strong>s of the matrix (if then is<br />

even singular). Somewhat better per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance can be obtained<br />

by using a minimum mean square error (MMSE) Bayesian estimati<strong>on</strong><br />

<str<strong>on</strong>g>for</str<strong>on</strong>g> the c<strong>on</strong>tinuous linear system. Let be the mean<br />

symbol energy. We can partially incorporate the in<str<strong>on</strong>g>for</str<strong>on</strong>g>mati<strong>on</strong><br />

that by using the prior Gaussian distributi<strong>on</strong><br />

. The MMSE estimati<strong>on</strong> becomes<br />

and then the finite-set soluti<strong>on</strong> is obtained by finding the<br />

closest lattice point in each comp<strong>on</strong>ent independently. A vast<br />

improvement over the linear approaches described above can<br />

(2)<br />

(3)<br />

(4)<br />

(5)<br />

0018-9448/$26.00 © 2011 <strong>IEEE</strong>


4974 <strong>IEEE</strong> TRANSACTIONS ON INFORMATION THEORY, VOL. 57, NO. 8, AUGUST 2011<br />

be achieved by the MMSE with Successive Interference Cancelati<strong>on</strong><br />

(MMSE-SIC) algorithm that is based <strong>on</strong> sequential<br />

decoding with optimal ordering [2]. These linear type algorithms<br />

can also easily provide probabilistic (soft-decisi<strong>on</strong>)<br />

estimates <str<strong>on</strong>g>for</str<strong>on</strong>g> each symbol. However, there is still a significant<br />

gap between the detecti<strong>on</strong> per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance of the MMSE-SIC<br />

algorithm and the per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance of the ML detector.<br />

Many alternative methods have been proposed to approach the<br />

ML detecti<strong>on</strong> per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance. The sphere decoding (SD) algorithm<br />

finds the exact ML soluti<strong>on</strong> by searching the nearest lattice point<br />

[1], [3]–[5]. Although SD reduces computati<strong>on</strong>al complexity<br />

compared to the exhaustive search of the ML soluti<strong>on</strong>, sphere<br />

decoding is not feasible <str<strong>on</strong>g>for</str<strong>on</strong>g> high-order <str<strong>on</strong>g>QAM</str<strong>on</strong>g> c<strong>on</strong>stellati<strong>on</strong>s.<br />

While SD has been empirically found to be computati<strong>on</strong>ally very<br />

fast <str<strong>on</strong>g>for</str<strong>on</strong>g> small to moderate problem sizes (say, <str<strong>on</strong>g>for</str<strong>on</strong>g> <str<strong>on</strong>g>for</str<strong>on</strong>g><br />

16-<str<strong>on</strong>g>QAM</str<strong>on</strong>g>), the sphere decoding complexity would be prohibitive<br />

<str<strong>on</strong>g>for</str<strong>on</strong>g> large , higher order <str<strong>on</strong>g>QAM</str<strong>on</strong>g> and/or low SNRs [6]. Another<br />

family of <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> decoding algorithms is based <strong>on</strong> semidefinite<br />

relaxati<strong>on</strong> (e.g., [7]–[9]). Although the theoretical computati<strong>on</strong>al<br />

complexity of semidefinite relaxati<strong>on</strong> is a low degree polynomial,<br />

in practice the running time is very high. Thus, there is still a<br />

need <str<strong>on</strong>g>for</str<strong>on</strong>g> low-complexity detecti<strong>on</strong> algorithms that per<str<strong>on</strong>g>for</str<strong>on</strong>g>m well.<br />

This study attempts to solve the <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> decoding problem<br />

using the Belief Propagati<strong>on</strong> (BP) paradigm. It is well-known<br />

(see e.g., [10]) that a straight<str<strong>on</strong>g>for</str<strong>on</strong>g>ward implementati<strong>on</strong> of the BP<br />

algorithm to the <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> detecti<strong>on</strong> problem yields very poor results<br />

since there are a large number of short cycles in the underlying<br />

factor graph. In this study we introduce a novel approach<br />

to utilize the BP paradigm <str<strong>on</strong>g>for</str<strong>on</strong>g> <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> detecti<strong>on</strong>. The proposed<br />

variant of the BP algorithm is both computati<strong>on</strong>ally efficient and<br />

achieves improved results. The paper proceeds as follows. In<br />

Secti<strong>on</strong> II we discuss previous attempts to apply variants of the<br />

BP algorithm to the <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> decoding problem. The proposed<br />

algorithm which we dub “The Gaussian-Tree-Approximati<strong>on</strong><br />

(GTA) Algorithm” is described in Secti<strong>on</strong> III. Experimental results<br />

are presented in Secti<strong>on</strong> IV.<br />

II. THE LOOPY BELIEF PROPAGATION APPROACH<br />

Given the c<strong>on</strong>strained linear system<br />

, and a uni<str<strong>on</strong>g>for</str<strong>on</strong>g>m<br />

prior distributi<strong>on</strong> <strong>on</strong> over a finite set of points , the<br />

posterior probability functi<strong>on</strong> of the discrete random vector<br />

given is<br />

The notati<strong>on</strong> stands <str<strong>on</strong>g>for</str<strong>on</strong>g> equality up to a normalizati<strong>on</strong> c<strong>on</strong>stant.<br />

Observing that is a quadratic expressi<strong>on</strong>, it can<br />

be easily verified that can be factorized into a product of<br />

two- and single-variable potentials<br />

such that<br />

(6)<br />

(7)<br />

(8)<br />

Fig. 1. The MRF undirected graphical model corresp<strong>on</strong>ds to the <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> detecti<strong>on</strong><br />

problem with n =7.<br />

where is the th column of the matrix . Since the obtained<br />

factors are simply functi<strong>on</strong>s of pairs of random variables, we obtain<br />

a Markov Random Field (MRF) representati<strong>on</strong> [11]. In the<br />

<str<strong>on</strong>g>MIMO</str<strong>on</strong>g> applicati<strong>on</strong> the (known) matrix is randomly selected<br />

and there<str<strong>on</strong>g>for</str<strong>on</strong>g>e the MRF graph is usually a completely c<strong>on</strong>nected<br />

graph (see an MRF graph illustrati<strong>on</strong> in Fig. 1).<br />

The Belief Propagati<strong>on</strong> (BP) algorithm aims to solve inference<br />

problems by propagating in<str<strong>on</strong>g>for</str<strong>on</strong>g>mati<strong>on</strong> throughout this MRF<br />

via a series of messages sent between neighboring nodes (see<br />

[12] <str<strong>on</strong>g>for</str<strong>on</strong>g> an excellent tutorial <strong>on</strong> BP). In the sum-product variant<br />

of the BP algorithm applied to the MRF (7), the message from<br />

to is<br />

(9)<br />

In each iterati<strong>on</strong> messages are passed al<strong>on</strong>g all the graph edges<br />

in both edge directi<strong>on</strong>s. In every iterati<strong>on</strong>, an estimate of the<br />

posterior marginal distributi<strong>on</strong> (“belief”) <str<strong>on</strong>g>for</str<strong>on</strong>g> each variable can<br />

be computed by multiplying together all the incoming messages<br />

from all the other nodes<br />

(10)<br />

A variant of the sum-product algorithm is the max-product algorithm<br />

in which the summati<strong>on</strong> in (9) is replaced by a maximizati<strong>on</strong><br />

over all the symbols in . In a cycle-free MRF graph<br />

the sum-product algorithm always c<strong>on</strong>verges to the exact marginal<br />

probabilities (which in the case of <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> detecti<strong>on</strong> corresp<strong>on</strong>ds<br />

to a soft decisi<strong>on</strong> of each symbol, i.e., ). In a cyclefree<br />

MRF graph the max-product variant of the BP algorithm<br />

always c<strong>on</strong>verges to the most likely c<strong>on</strong>figurati<strong>on</strong> [13] (which<br />

corresp<strong>on</strong>ds to ML decoding in our case). For cycle-free graphs,<br />

BP is essentially a distributed variant of dynamic programming.<br />

The BP message update equati<strong>on</strong>s <strong>on</strong>ly involve passing messages<br />

between neighboring nodes. Computati<strong>on</strong>ally, it is thus straight<str<strong>on</strong>g>for</str<strong>on</strong>g>ward<br />

to apply the same local message updates in graphs with<br />

cycles. In most such models, however, this loopy BP algorithm<br />

will not compute exact marginal distributi<strong>on</strong>s; hence, there is almost<br />

no theoretical justificati<strong>on</strong> <str<strong>on</strong>g>for</str<strong>on</strong>g> applying the BP algorithm<br />

(<strong>on</strong>e excepti<strong>on</strong> is that, <str<strong>on</strong>g>for</str<strong>on</strong>g> Gaussian graphs, if BP c<strong>on</strong>verges, then<br />

the means are correct [14]). However, the BP algorithm applied to<br />

loopy graphs has been found to have outstanding empirical success<br />

in many applicati<strong>on</strong>s, e.g., in decoding of LDPC codes [15].<br />

The per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance of BP in this applicati<strong>on</strong> may be attributed to<br />

the sparsity of the graphs. The cycles in the graph are l<strong>on</strong>g, hence


GOLDBERGER AND LESHEM: <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> DETECTION FOR HIGH-ORDER <str<strong>on</strong>g>QAM</str<strong>on</strong>g> BASED ON A GAUSSIAN TREE APPROXIMATION 4975<br />

Fig. 2. Decoding results <str<strong>on</strong>g>for</str<strong>on</strong>g> 8 2 8 BPSK real valued system, A = f01; 1g.<br />

the graph has tree-like properties, so that messages are approximately<br />

independent and inference may be per<str<strong>on</strong>g>for</str<strong>on</strong>g>med as though<br />

the graph was cycle-free. The BP algorithm has also been used<br />

successfully in image processing and computer visi<strong>on</strong> (e.g., [16])<br />

where the image is represented by a grid-structured MRF that is<br />

based <strong>on</strong> local c<strong>on</strong>necti<strong>on</strong>s between neighboring nodes.<br />

However, when the graph is not sparse, and is not based <strong>on</strong><br />

local grid c<strong>on</strong>necti<strong>on</strong>s, loopy BP almost always fails to c<strong>on</strong>verge.<br />

Unlike the sparse graphs of LDPC codes, or grid graphs in<br />

computer visi<strong>on</strong> applicati<strong>on</strong>s, the MRF graphs of <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> channels<br />

are completely c<strong>on</strong>nected graphs and there<str<strong>on</strong>g>for</str<strong>on</strong>g>e the associated<br />

detecti<strong>on</strong> per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance is poor. This has prevented the BP<br />

from being an asset <str<strong>on</strong>g>for</str<strong>on</strong>g> the <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> problem. Fig. 2 shows per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance<br />

curves <str<strong>on</strong>g>for</str<strong>on</strong>g> a BPSK <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> system based <strong>on</strong> an 8 8<br />

matrix and<br />

(see Secti<strong>on</strong> IV <str<strong>on</strong>g>for</str<strong>on</strong>g> a detailed descripti<strong>on</strong><br />

of the simulati<strong>on</strong> setup). As can be seen in Fig. 2, the BP<br />

decoder based <strong>on</strong> the MRF representati<strong>on</strong> (7) yields very poor<br />

results. Standard techniques to stabilize the BP iterati<strong>on</strong>s such<br />

as damping the message updates [17] do not help here. Even applying<br />

more advanced versi<strong>on</strong>s of BP (e.g., Generalized BP and<br />

Expectati<strong>on</strong> Propagati<strong>on</strong> [18]) to inference problems <strong>on</strong> complete<br />

MRF graphs yields poor results. The problem here is not<br />

in the optimizati<strong>on</strong> method but in the cost functi<strong>on</strong> that needs to<br />

be modified to yield a good approximate soluti<strong>on</strong> (see e.g., [19]).<br />

Shental et al. [10] analyzed the case where the matrix is<br />

relatively sparse (and has a grid structure) (see Fig. 3). They<br />

showed that even under this restricted assumpti<strong>on</strong> the BP still<br />

does not per<str<strong>on</strong>g>for</str<strong>on</strong>g>m well. As an alternative method they proposed<br />

the generalized belief propagati<strong>on</strong> (GBP) algorithm that does<br />

work well <strong>on</strong> the sparse matrix if the algorithm regi<strong>on</strong>s are carefully<br />

chosen. There are situati<strong>on</strong>s where the sparsity assumpti<strong>on</strong><br />

makes sense (e.g., 2-D intersymbol interference (ISI) channels).<br />

However, in the <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> channel model we assume that<br />

the channel matrix elements are i.i.d. and Gaussian; hence we<br />

cannot assume that the channel matrix is sparse.<br />

There is yet another way to write the <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> detecti<strong>on</strong><br />

problem (6) as a factor graph [20]. It can be easily verified that<br />

(11)<br />

Fig. 3. The MRF grid model corresp<strong>on</strong>ds to 5 2 5 2-D intersymbol interference<br />

(ISI) channels.<br />

Fig. 4. Factor graph representati<strong>on</strong> of the <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> detecti<strong>on</strong> system. The variables<br />

of the factor graph x ; ...;x corresp<strong>on</strong>d to the transmit antennas and<br />

the factors y ; ...;y corresp<strong>on</strong>d to the receive antennas.<br />

with is the th row of . Each factor corresp<strong>on</strong>ds to a single<br />

linear equati<strong>on</strong> and there<str<strong>on</strong>g>for</str<strong>on</strong>g>e each factor is a functi<strong>on</strong> of all the<br />

unknown variables. An illustrati<strong>on</strong> of the resulting factor graph<br />

is shown in Fig. 4. The variables of the factor graph corresp<strong>on</strong>d<br />

to the transmit antennas and the factors corresp<strong>on</strong>d to the receive<br />

antennas. Note that the factor graph representati<strong>on</strong> of (11)<br />

is a completely c<strong>on</strong>nected bipartite graph. Applying the BP algorithm<br />

<strong>on</strong> the factor graph (11) yields the following messages.<br />

to the vari-<br />

The BP message from the factor associated with<br />

able is<br />

and the BP message from variable to factor is<br />

(12)<br />

(13)<br />

A major drawback of applying BP to the factor graph (11) is<br />

the complexity of computing the factor-to-variable BP messages<br />

(12), which is exp<strong>on</strong>ential in the number of variables<br />

[21]. Several recent studies suggested an efficiently computed<br />

approximati<strong>on</strong> of the factor-to-variable BP messages based <strong>on</strong><br />

a Gaussian approximati<strong>on</strong>. The Gaussian approximati<strong>on</strong> can<br />

be justified <str<strong>on</strong>g>for</str<strong>on</strong>g> large values of by the central limit theorem


4976 <strong>IEEE</strong> TRANSACTIONS ON INFORMATION THEORY, VOL. 57, NO. 8, AUGUST 2011<br />

(see, e.g., [22]–[24]). While these algorithms were originally<br />

described <str<strong>on</strong>g>for</str<strong>on</strong>g> BPSK c<strong>on</strong>stellati<strong>on</strong>s, they can be generalized to<br />

arbitrary <str<strong>on</strong>g>QAM</str<strong>on</strong>g> c<strong>on</strong>stellati<strong>on</strong>s.<br />

In recent years there were also several attempts to apply BP<br />

<str<strong>on</strong>g>for</str<strong>on</strong>g> densely c<strong>on</strong>nected Gaussian graphs. For example c<strong>on</strong>sider<br />

the <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> detecti<strong>on</strong> problem (6) without the finite-set c<strong>on</strong>straint<br />

<strong>on</strong> the unknown variable (see, e.g., [25]). The Gaussian<br />

MRF model, however, is much easier. In that case the computati<strong>on</strong>al<br />

complexity of finding the exact soluti<strong>on</strong> is a low-degree<br />

polynomial in the number of variables. The BP algorithm can<br />

be utilized to further reduce the complexity.<br />

III. THE GAUSSIAN TREE APPROXIMATION ALGORITHM<br />

Our approach is based <strong>on</strong> an approximati<strong>on</strong> of the exact probability<br />

functi<strong>on</strong><br />

where is a Gaussian density with mean and covariance<br />

matrix . can be viewed as a posterior distributi<strong>on</strong><br />

of assuming a n<strong>on</strong>in<str<strong>on</strong>g>for</str<strong>on</strong>g>mative prior. Note that (15) is<br />

<strong>on</strong>ly defined if , otherwise the matrix is singular and<br />

there<str<strong>on</strong>g>for</str<strong>on</strong>g>e is undefined. Now, instead of marginalizing the true<br />

distributi<strong>on</strong> , which is an NP hard problem, we approximate<br />

it by the product of the marginals of the Gaussian density<br />

(16)<br />

At this stage we apply the finite-set c<strong>on</strong>straint. From the<br />

Gaussian approximati<strong>on</strong> (16) we can extract a discrete approximati<strong>on</strong><br />

(14)<br />

that enables a successful implementati<strong>on</strong> of the Belief Propagati<strong>on</strong><br />

paradigm. Since the BP algorithm is optimal <strong>on</strong> c<strong>on</strong>nected<br />

cycle-free factor graphs, i.e., <strong>on</strong> trees, a reas<strong>on</strong>able approach is<br />

finding an optimal tree approximati<strong>on</strong> of the exact distributi<strong>on</strong><br />

(14). Chow and Liu [26] proposed a method to find a tree<br />

approximati<strong>on</strong> of a given distributi<strong>on</strong> that has the minimal<br />

Kullback-Leibler (KL) divergence to the true distributi<strong>on</strong>.<br />

They showed that the optimal tree can be found efficiently<br />

by searching <str<strong>on</strong>g>for</str<strong>on</strong>g> the maximum-weight spanning tree in the<br />

weighted complete graph <strong>on</strong> vertices where the weight of<br />

an edge is given by the mutual in<str<strong>on</strong>g>for</str<strong>on</strong>g>mati<strong>on</strong> of the two random<br />

variables associated with the edge’s endpoints. The problem is<br />

that the Chow-Liu algorithm is based <strong>on</strong> the two-dimensi<strong>on</strong>al<br />

marginal distributi<strong>on</strong>s. However, finding the marginal distributi<strong>on</strong><br />

of the probability functi<strong>on</strong> (14) is, un<str<strong>on</strong>g>for</str<strong>on</strong>g>tunately, NP hard<br />

and it is (equivalent to) our final target.<br />

To overcome this obstacle, our approach is based <strong>on</strong> applying<br />

the Chow-Liu algorithm <strong>on</strong> the distributi<strong>on</strong> corresp<strong>on</strong>ding to the<br />

unc<strong>on</strong>strained linear system. This distributi<strong>on</strong> is Gaussian and<br />

there<str<strong>on</strong>g>for</str<strong>on</strong>g>e it is straight<str<strong>on</strong>g>for</str<strong>on</strong>g>ward in this case to compute the two-dimensi<strong>on</strong>al<br />

marginal distributi<strong>on</strong>s. Given the Gaussian tree approximati<strong>on</strong>,<br />

the next step of our approach is to apply the finite-set<br />

c<strong>on</strong>straint and utilize the Gaussian tree distributi<strong>on</strong> to<br />

<str<strong>on</strong>g>for</str<strong>on</strong>g>m a discrete loop free approximati<strong>on</strong> of which can be<br />

efficiently globally maximized using the BP algorithm. To motivate<br />

this approach we first show that the zero-<str<strong>on</strong>g>for</str<strong>on</strong>g>cing decoding<br />

algorithm (4) can be derived by utilizing a Gaussian approximati<strong>on</strong><br />

which is a simplified versi<strong>on</strong> of our Gaussian tree approximati<strong>on</strong>.<br />

Let be the least-squares estimator (3)<br />

and<br />

be its covariance matrix. It can be easily<br />

verified that (14) can be written as<br />

(17)<br />

Since this joint probability functi<strong>on</strong> is obtained as a product of<br />

marginal probabilities, we can decode each variable separately<br />

(18)<br />

Taking the most likely symbol we obtain the suboptimal zero<str<strong>on</strong>g>for</str<strong>on</strong>g>cing<br />

soluti<strong>on</strong> (4).<br />

Motivated by the simple product-of-marginals approximati<strong>on</strong><br />

described above, we suggest approximating the discrete distributi<strong>on</strong><br />

via a tree-based approximati<strong>on</strong> of the Gaussian<br />

distributi<strong>on</strong> . Although the Chow-Liu algorithm was<br />

originally stated <str<strong>on</strong>g>for</str<strong>on</strong>g> discrete distributi<strong>on</strong>s, <strong>on</strong>e can verify that it<br />

also applies <str<strong>on</strong>g>for</str<strong>on</strong>g> the Gaussian case. For the sake of completeness<br />

we give a detailed derivati<strong>on</strong> below.<br />

A. Finding the Optimal Gaussian Tree Approximati<strong>on</strong><br />

We represent an -node tree graph by the cycle-free parent<br />

relati<strong>on</strong>s such that is the parent node of .To<br />

simplify notati<strong>on</strong> we do not separately describe the root node.<br />

The parent of the root is implicitly assumed to be the empty<br />

set. A distributi<strong>on</strong><br />

is described by a tree<br />

if it can be written as<br />

. We start with a<br />

<str<strong>on</strong>g>for</str<strong>on</strong>g>mula <str<strong>on</strong>g>for</str<strong>on</strong>g> the KL divergence between a Gaussian distributi<strong>on</strong><br />

and a distributi<strong>on</strong> defined <strong>on</strong> the same space that is<br />

represented by a tree graphical model.<br />

Theorem 1: Let<br />

be a multivariate<br />

Gaussian distributi<strong>on</strong> and let<br />

be another<br />

distributi<strong>on</strong> that is represented by a cycle-free graphical<br />

model (tree). The KL divergence between and is<br />

(15)<br />

(19)


GOLDBERGER AND LESHEM: <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> DETECTION FOR HIGH-ORDER <str<strong>on</strong>g>QAM</str<strong>on</strong>g> BASED ON A GAUSSIAN TREE APPROXIMATION 4977<br />

such that is the mutual in<str<strong>on</strong>g>for</str<strong>on</strong>g>mati<strong>on</strong> and is the differential<br />

entropy, based <strong>on</strong> the distributi<strong>on</strong> .<br />

Proof: The definiti<strong>on</strong> of the KL divergence implies<br />

From (19) it can be easily seen that if we fix a tree graph<br />

, the tree distributi<strong>on</strong> whose KL divergence to is<br />

minimal is<br />

. In other words, the best<br />

tree approximati<strong>on</strong> of is c<strong>on</strong>structed from the c<strong>on</strong>diti<strong>on</strong>al<br />

distributi<strong>on</strong>s of . For that tree approximati<strong>on</strong> we have<br />

(20)<br />

Moreover, since in (20) and do not depend <strong>on</strong> the<br />

tree structure, the tree topology that best approximates<br />

is the <strong>on</strong>e that maximizes the sum<br />

(21)<br />

A spanning tree of a c<strong>on</strong>nected graph is a subgraph that c<strong>on</strong>tains<br />

all the vertices and is a tree. Now suppose the edges of the graph<br />

have weights. The weight of a spanning tree is simply the sum of<br />

weights of its edges. Equati<strong>on</strong> (21) reveals that the problem of<br />

finding the best tree approximati<strong>on</strong> of the Gaussian distributi<strong>on</strong><br />

can be reduced to the well-known problem of finding the<br />

maximum spanning tree of the weighted -node graph where<br />

the weight of the - edge is the mutual in<str<strong>on</strong>g>for</str<strong>on</strong>g>mati<strong>on</strong> between<br />

and [26]. It can be easily verified that the mutual in<str<strong>on</strong>g>for</str<strong>on</strong>g>mati<strong>on</strong><br />

between two r.v. and that are jointly Gaussian is<br />

(22)<br />

where is the correlati<strong>on</strong> coefficient between and .<br />

There are several algorithms to find a minimum spanning tree.<br />

They all utilize a greedy approach. In this work we use the Prim<br />

algorithm [27], which is efficient and very simple to implement.<br />

The Prim algorithm begins with some vertex in a given graph,<br />

defining the initial set of vertices . Then, in each iterati<strong>on</strong>,<br />

we choose the edge with minimal weight am<strong>on</strong>g all the edges<br />

, where is outside of and is in . Then vertex<br />

is brought in to . This process is repeated until a spanning<br />

tree is <str<strong>on</strong>g>for</str<strong>on</strong>g>med. We can use a heap to remember, <str<strong>on</strong>g>for</str<strong>on</strong>g> each vertex,<br />

the lowest-weight edge c<strong>on</strong>necting the current subtree with<br />

that vertex. The complexity of Prim’s algorithm, <str<strong>on</strong>g>for</str<strong>on</strong>g> finding the<br />

minimum-weight spanning tree of a -vertex graph is .<br />

We note in passing that the Prim algorithm <strong>on</strong>ly relies <strong>on</strong> the<br />

order of the weights and not <strong>on</strong> their exact values. Hence, applying<br />

a m<strong>on</strong>ot<strong>on</strong>ically increasing functi<strong>on</strong> <strong>on</strong> the graph weights<br />

does not change the topology of the optimal tree. To find the optimal<br />

Gaussian tree approximati<strong>on</strong> we can, there<str<strong>on</strong>g>for</str<strong>on</strong>g>e, use the<br />

weights instead of . The optimal<br />

Gaussian tree is, there<str<strong>on</strong>g>for</str<strong>on</strong>g>e, the <strong>on</strong>e that maximizes the sum<br />

of the square correlati<strong>on</strong> coefficients between adjacent nodes.<br />

To summarize, the algorithm that finds the best Gaussian tree<br />

approximati<strong>on</strong> is as follows. Define as the root. Then find<br />

the edge c<strong>on</strong>necting a vertex in the tree to a vertex outside the<br />

tree, such that the corresp<strong>on</strong>ding square correlati<strong>on</strong> coefficient<br />

is maximal and add the edge to the tree. C<strong>on</strong>tinue this procedure<br />

until a spanning tree is obtained.<br />

B. Applying BP <strong>on</strong> the Tree Approximati<strong>on</strong><br />

Let be the optimal Chow-Liu tree approximati<strong>on</strong> of<br />

(15). We can assume, without loss of generality, that<br />

is rooted at . is a cycle-free Gaussian distributi<strong>on</strong><br />

<strong>on</strong> , i.e.,<br />

(23)<br />

where is the “parent” of the th node in the optimal tree.<br />

The Chow-Liu algorithm guarantees that is the optimal<br />

Gaussian tree approximati<strong>on</strong> of in the sense that the<br />

KL divergence is minimal.<br />

Given the Gaussian tree approximati<strong>on</strong>, the next step of our<br />

approach is to apply the finite-set c<strong>on</strong>straint to <str<strong>on</strong>g>for</str<strong>on</strong>g>m a discrete<br />

loop free approximati<strong>on</strong> of which can be efficiently globally<br />

maximized using the BP algorithm. Our approximati<strong>on</strong> approach<br />

is, there<str<strong>on</strong>g>for</str<strong>on</strong>g>e, based <strong>on</strong> replacing the true distributi<strong>on</strong><br />

with the following approximati<strong>on</strong>:<br />

(24)<br />

The probability functi<strong>on</strong> is a cycle-free factor graph.<br />

Hence the BP algorithm can be applied to find its most likely<br />

c<strong>on</strong>figurati<strong>on</strong>. We next derive the messages of the BP algorithm<br />

that is applied <strong>on</strong> . An optimal BP schedule, when applied<br />

to a tree, requires passing a message <strong>on</strong>ce in each directi<strong>on</strong> of<br />

each edge [20]. The BP messages are first sent from leaf variables<br />

“downward” to the root. The computati<strong>on</strong> begins at the<br />

leaves of the graph. Each leaf variable node sends a message to<br />

its parent. Each vertex waits <str<strong>on</strong>g>for</str<strong>on</strong>g> messages from all of its children<br />

be<str<strong>on</strong>g>for</str<strong>on</strong>g>e computing the message to be sent to its parent. The<br />

“downward” BP message from a variable to its parent variable<br />

is computed based <strong>on</strong> all the messages received<br />

from its children<br />

(25)


4978 <strong>IEEE</strong> TRANSACTIONS ON INFORMATION THEORY, VOL. 57, NO. 8, AUGUST 2011<br />

If<br />

is a leaf node in the tree then the message is simply<br />

(26)<br />

<str<strong>on</strong>g>for</str<strong>on</strong>g> decoding <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> systems presented in the experiment secti<strong>on</strong>.<br />

The max-product is more computati<strong>on</strong>ally efficient since<br />

the BP messages can be entirely computed in the log-domain.<br />

The “downward” computati<strong>on</strong> terminates at the root node.<br />

Next, BP messages are sent “upward” back to the leaves. The<br />

computati<strong>on</strong> begins at the root of the graph. Each vertex waits<br />

<str<strong>on</strong>g>for</str<strong>on</strong>g> a message from its parent be<str<strong>on</strong>g>for</str<strong>on</strong>g>e computing the messages to<br />

be sent to each of its children. The “upward” BP message from<br />

a parent variable to its child variable is computed based<br />

<strong>on</strong> the “upward” message received from its parent<br />

and from “downward” messages that received from all the<br />

siblings of<br />

C. An MMSE Versi<strong>on</strong> of A Tree Approximati<strong>on</strong><br />

The MMSE Bayesian approach (5) is known to be better<br />

than the zero-<str<strong>on</strong>g>for</str<strong>on</strong>g>cing soluti<strong>on</strong> (4). In MMSE we partially incorporate<br />

the in<str<strong>on</strong>g>for</str<strong>on</strong>g>mati<strong>on</strong> that by using the prior Gaussian<br />

distributi<strong>on</strong><br />

. In a similar way we can c<strong>on</strong>sider a<br />

Bayesian versi<strong>on</strong> of the proposed Gaussian tree approximati<strong>on</strong>.<br />

We can partially incorporate the in<str<strong>on</strong>g>for</str<strong>on</strong>g>mati<strong>on</strong> that<br />

by using the prior Gaussian distributi<strong>on</strong><br />

such<br />

that<br />

. This yields the posterior Gaussian<br />

distributi<strong>on</strong><br />

If<br />

is the root of the tree then the message is simply<br />

(27)<br />

(28)<br />

After the downward-upward message passing procedure is<br />

completed, we can compute the “belief” at each variable which<br />

is the product of all the messages sent to the variable from its<br />

parent and from its children (if there are any)<br />

In the case<br />

follows:<br />

(29)<br />

is the root node, the “belief” is computed as<br />

(30)<br />

Since the approximated distributi<strong>on</strong> (24) is cycle-free,<br />

the general Belief Propagati<strong>on</strong> theory guarantees that (the normalized)<br />

belief vector is exactly the marginal distributi<strong>on</strong><br />

of the approximated distributi<strong>on</strong> (24). To obtain<br />

a hard-decisi<strong>on</strong> decoding we choose the symbol whose posterior<br />

probability is maximal<br />

(31)<br />

Above we described the sum-product versi<strong>on</strong> of the BP algorithm<br />

that computes the marginal probabilities .A<br />

variant of the sum-product algorithm is the max-product algorithm<br />

in which the summati<strong>on</strong> in (25)–(28) is replaced by<br />

a maximizati<strong>on</strong> over all the symbols in . The max-product<br />

algorithm finds the most likely pattern of the approximati<strong>on</strong><br />

. We did not observe any significant per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance difference<br />

using either the sum-product or the max-product variants<br />

(32)<br />

such that<br />

and<br />

. The MMSE method is obtained by approximating<br />

the unc<strong>on</strong>strained posterior distributi<strong>on</strong><br />

by<br />

a product of marginals. In our approach we use, instead, the best<br />

cycle-free Gaussian approximati<strong>on</strong>. We can apply the Chow-Liu<br />

tree approximati<strong>on</strong> <strong>on</strong> the Gaussian distributi<strong>on</strong> (32) to obtain<br />

a “Bayesian” Gaussian tree approximati<strong>on</strong> <str<strong>on</strong>g>for</str<strong>on</strong>g> . In this<br />

way we partially use the finite-set c<strong>on</strong>straint when we search<br />

<str<strong>on</strong>g>for</str<strong>on</strong>g> the best tree approximati<strong>on</strong> of the true discrete distributi<strong>on</strong><br />

. This is likely to yield a better approximati<strong>on</strong> of the discrete<br />

distributi<strong>on</strong> than the tree distributi<strong>on</strong> which is based<br />

<strong>on</strong> the unc<strong>on</strong>strained distributi<strong>on</strong> . Note that if there<br />

are more transmit antennas than receive antennas, i.e., ,<br />

the linear system<br />

is underdetermined and there<str<strong>on</strong>g>for</str<strong>on</strong>g>e<br />

the ZF soluti<strong>on</strong> does not exist and we cannot apply the<br />

Gaussian tree approximati<strong>on</strong>. However, the MMSE soluti<strong>on</strong> and<br />

the MMSE versi<strong>on</strong> of the Gaussian tree approximati<strong>on</strong> are still<br />

valid.<br />

To summarize, our soluti<strong>on</strong> to the <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> decoding problem<br />

is based <strong>on</strong> applying BP <strong>on</strong> a discrete versi<strong>on</strong> of the Gaussian<br />

tree approximati<strong>on</strong> of the Bayesian versi<strong>on</strong> of the c<strong>on</strong>tinuous<br />

least-square soluti<strong>on</strong>. We dub this method “The Gaussian-Tree-<br />

Approximati<strong>on</strong> (GTA) Algorithm.” The GTA algorithm is summarized<br />

in Fig. 5. We next compute the complexity of the GTA<br />

algorithm. The complexity of computing the covariance matrix<br />

is (if we can utilize<br />

the matrix inversi<strong>on</strong> lemma <str<strong>on</strong>g>for</str<strong>on</strong>g> the matrix inversi<strong>on</strong>). The<br />

complexity of the Chow-Liu algorithm (based <strong>on</strong> Prim’s algorithm<br />

<str<strong>on</strong>g>for</str<strong>on</strong>g> finding the minimum spanning tree) is and the<br />

complexity of the BP algorithm is (regardless of the<br />

number of receive antennas).<br />

IV. EXPERIMENTAL RESULTS<br />

In this secti<strong>on</strong> we provide simulati<strong>on</strong> results <str<strong>on</strong>g>for</str<strong>on</strong>g> the GTA<br />

algorithm over various <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> systems. The channel matrix<br />

comprised i.i.d. elements drawn from a zero-mean complex


GOLDBERGER AND LESHEM: <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> DETECTION FOR HIGH-ORDER <str<strong>on</strong>g>QAM</str<strong>on</strong>g> BASED ON A GAUSSIAN TREE APPROXIMATION 4979<br />

Fig. 6. Comparis<strong>on</strong> of various detectors in 12 2 12 system, 16-<str<strong>on</strong>g>QAM</str<strong>on</strong>g> symbols.<br />

Fig. 5. The Gaussian Tree Approximati<strong>on</strong> (GTA) Algorithm.<br />

normal distributi<strong>on</strong> of unit variance. We used 500 000 realizati<strong>on</strong>s<br />

of the channel matrix and each matrix was used <strong>on</strong>ce <str<strong>on</strong>g>for</str<strong>on</strong>g><br />

sending a message. The per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance of the proposed algorithm<br />

is shown as a functi<strong>on</strong> of the variance of the additive noise .<br />

The signal-to-noise ratio (SNR) is defined as<br />

where ( is the number of variables, is the<br />

variance of the Gaussian additive noise, and is the mean<br />

symbol energy). We compared the per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance of the GTA<br />

method to the MMSE-SIC algorithm with optimal ordering<br />

<str<strong>on</strong>g>for</str<strong>on</strong>g> the successive interference cancelati<strong>on</strong> [2], and to the<br />

Schnorr-Euchner variant of sphere decoding (SD-SE) with<br />

infinite radius [3], [28]. We used sorting of the channel matrix<br />

using the SQRD algorithm [29] and regularizati<strong>on</strong> [30], which<br />

substantially reduces the computati<strong>on</strong>al complexity.<br />

We also implemented the SDR detector suggested by<br />

Sidiropoulos and Luo [8]. Recently it was shown [9] that <str<strong>on</strong>g>for</str<strong>on</strong>g><br />

the cases of 16-<str<strong>on</strong>g>QAM</str<strong>on</strong>g> and 64-<str<strong>on</strong>g>QAM</str<strong>on</strong>g> this SDR based method is<br />

equivalent to the SDR based detecti<strong>on</strong> method suggested by<br />

Wiesel, Eldar, and Shamai [7]. The SDRs were solved using<br />

the CSDP package [31]. In the SDR Gaussian randomizati<strong>on</strong><br />

step, 100 independent randomizati<strong>on</strong>s were implemented. All<br />

the <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> detecti<strong>on</strong> algorithms were implemented in C <str<strong>on</strong>g>for</str<strong>on</strong>g><br />

efficiency.<br />

Fig. 6 shows <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> detecti<strong>on</strong> per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance <str<strong>on</strong>g>for</str<strong>on</strong>g> a 12 12<br />

<str<strong>on</strong>g>MIMO</str<strong>on</strong>g> system using 16-<str<strong>on</strong>g>QAM</str<strong>on</strong>g>. The methods that are shown are<br />

MMSE-SIC, GTA, SDR, and sphere-decoding. In this case the<br />

SDR outper<str<strong>on</strong>g>for</str<strong>on</strong>g>ms both the MMSE-SIC and the GTA methods.<br />

However, in this case it is still feasible to compute the optimal<br />

maximum-likelihood algorithm using the sphere decoding<br />

algorithm.<br />

The SD-SE is the favorite detecti<strong>on</strong> method when the problem<br />

size is small or moderate. In this case the SD-SE can always<br />

yield the exact ML soluti<strong>on</strong> at acceptable computati<strong>on</strong>al cost.<br />

In large size problems, however, there is still a need <str<strong>on</strong>g>for</str<strong>on</strong>g> good<br />

approximati<strong>on</strong> methods. Fig. 7 shows the SER versus SNR and<br />

worst case executi<strong>on</strong> time versus SNR <str<strong>on</strong>g>for</str<strong>on</strong>g> the 12 12 system<br />

using 64-<str<strong>on</strong>g>QAM</str<strong>on</strong>g>. Fig. 8 shows the same experimental results <str<strong>on</strong>g>for</str<strong>on</strong>g><br />

the 16 16 system using 64-<str<strong>on</strong>g>QAM</str<strong>on</strong>g> (time was measured <strong>on</strong> a<br />

dual quad-core Intel Xe<strong>on</strong> 2.33 GHz). To assess the computati<strong>on</strong>al<br />

complexity we used a measure of the worst case rather<br />

than the average case since in <strong>on</strong>line applicati<strong>on</strong>s we have to<br />

decode within a specified time. The choice between executi<strong>on</strong><br />

time or number of floating point operati<strong>on</strong>s is debatable. The<br />

differences in running time between the methods we implemented<br />

was in orders of magnitude and running time is easier<br />

to appreciate.<br />

As can be seen from Figs. 7 and 8, the per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance of the<br />

GTA algorithm in high SNR is significantly better than the<br />

MMSE-SIC. The computati<strong>on</strong>al complexity of GTA is comparable<br />

to MMSE-SIC and it is much better than SDR. Note<br />

that the per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance of the SDR method [8] in these 64-<str<strong>on</strong>g>QAM</str<strong>on</strong>g><br />

cases is worse than that of MMSE-SIC and the computati<strong>on</strong>al<br />

complexity is much higher. From the derivati<strong>on</strong> of the SDR<br />

method it can be seen that the relaxati<strong>on</strong> becomes more crude<br />

as at higher c<strong>on</strong>stellati<strong>on</strong>s.<br />

We next show the per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance of several variants of the GTA<br />

algorithm. The GTA algorithm differs from the ZF, MMSE, and<br />

MMSE-SIC algorithms in several ways. The first difference is<br />

that the GTA algorithm utilizes a Markovian approximati<strong>on</strong> of<br />

instead of an approximati<strong>on</strong> based <strong>on</strong> a product of<br />

independent densities. The sec<strong>on</strong>d aspect is the use of an optimal<br />

tree. To clarify the c<strong>on</strong>tributi<strong>on</strong> of each comp<strong>on</strong>ent we modified<br />

the GTA algorithm by replacing the Chow-Liu optimal tree by<br />

the tree<br />

. We call this method the<br />

“Line-Tree.” As can be seen from Fig. 9, using the optimal tree<br />

is crucial to obtain improved results. Fig. 9 also shows the results<br />

of the n<strong>on</strong>-Bayesian variant of the GTA algorithm. As can be<br />

seen, the Bayesian versi<strong>on</strong> yields better results. Fig. 9 shows<br />

the symbol error rate (SER) versus SNR <str<strong>on</strong>g>for</str<strong>on</strong>g> a 20 20, ,<br />

real <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> system. The per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance of the GTA method and<br />

its variants was compared to the MMSE and the MMSE-SIC<br />

algorithms.


4980 <strong>IEEE</strong> TRANSACTIONS ON INFORMATION THEORY, VOL. 57, NO. 8, AUGUST 2011<br />

Fig. 7. 12 2 12 system, 64-<str<strong>on</strong>g>QAM</str<strong>on</strong>g> symbols. (a) SER versus SNR. (b) Max sec<strong>on</strong>ds <str<strong>on</strong>g>for</str<strong>on</strong>g> decoded symbol vector versus SNR. Note that the graphs of GTA and<br />

MMSE-SIC overlap.<br />

Fig. 8. 16 2 16 system, 64-<str<strong>on</strong>g>QAM</str<strong>on</strong>g> symbols. (a) SER versus SNR. (b) Max sec<strong>on</strong>ds <str<strong>on</strong>g>for</str<strong>on</strong>g> decoded symbol vector versus SNR.<br />

Fig. 9. Comparative results of MMSE, MMSE-SIC and variants of the GTA<br />

<str<strong>on</strong>g>for</str<strong>on</strong>g> a 20 2 20 real system, A = f61; 63g.<br />

V. CONCLUSION<br />

We proposed a novel <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> detecti<strong>on</strong> technique based <strong>on</strong><br />

the principle of a tree approximati<strong>on</strong> of the Gaussian distributi<strong>on</strong><br />

that corresp<strong>on</strong>ds to the c<strong>on</strong>tinuous linear problem. The<br />

proposed method outper<str<strong>on</strong>g>for</str<strong>on</strong>g>ms previously suggested <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> decoding<br />

algorithms in high-order <str<strong>on</strong>g>QAM</str<strong>on</strong>g> c<strong>on</strong>stellati<strong>on</strong>, as dem<strong>on</strong>strated<br />

in simulati<strong>on</strong>s. Finding the best tree <str<strong>on</strong>g>for</str<strong>on</strong>g> the integer c<strong>on</strong>strained<br />

linear problem is NP hard (unless in which<br />

the -node graph is a tree and there<str<strong>on</strong>g>for</str<strong>on</strong>g>e the Gaussian tree approximati<strong>on</strong><br />

is exact). Although the proposed method yields<br />

improved results, the tree approximati<strong>on</strong> we applied may not<br />

be the best <strong>on</strong>e. It is left <str<strong>on</strong>g>for</str<strong>on</strong>g> future research to search <str<strong>on</strong>g>for</str<strong>on</strong>g> a<br />

better discrete tree approximati<strong>on</strong> <str<strong>on</strong>g>for</str<strong>on</strong>g> the c<strong>on</strong>strained linear least<br />

squares problem. This paper dealt with a tree approximati<strong>on</strong> approach,<br />

more complicated approximati<strong>on</strong>s such as multiparent<br />

trees could improve per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance and can potentially provide a<br />

smooth per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance-complexity trade-off (see, e.g., [24], [32]<br />

<str<strong>on</strong>g>for</str<strong>on</strong>g> other methods that allow an arbitrary tradeoff of speed versus<br />

accuracy). Another aspect of the GTA algorithm that is worthwhile<br />

to be further investigated is the dependency of the per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance<br />

<strong>on</strong> the channel matrix probabilistic model. In the experiments,<br />

the channel matrix was sampled from a standard<br />

normal distributi<strong>on</strong>, leading to a rather dense system of equati<strong>on</strong>s.<br />

This is a standard simulati<strong>on</strong> modeling of a <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> communicati<strong>on</strong><br />

system. The per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance of the GTA algorithm,<br />

however, can be different <str<strong>on</strong>g>for</str<strong>on</strong>g> more sparse matrices and <str<strong>on</strong>g>for</str<strong>on</strong>g> binary<br />

matrices that occur in CDMA systems. We have shown


GOLDBERGER AND LESHEM: <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> DETECTION FOR HIGH-ORDER <str<strong>on</strong>g>QAM</str<strong>on</strong>g> BASED ON A GAUSSIAN TREE APPROXIMATION 4981<br />

that <str<strong>on</strong>g>for</str<strong>on</strong>g> high-order <str<strong>on</strong>g>QAM</str<strong>on</strong>g> the GTA method outper<str<strong>on</strong>g>for</str<strong>on</strong>g>ms previously<br />

suggested <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> decoding methods such as MMSE-SIC<br />

and SDR. As <str<strong>on</strong>g>for</str<strong>on</strong>g> potential future work, it would be interesting to<br />

compare our Gaussian tree approximati<strong>on</strong> BP algorithm to other<br />

BP Gaussian approximati<strong>on</strong> methods that are based <strong>on</strong> the factor<br />

graph representati<strong>on</strong> (11), e.g., the decoding algorithm proposed<br />

by Kabashima [22].<br />

While the method provides excellent per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance, it is<br />

worthwhile to menti<strong>on</strong> that the method provides estimates<br />

of the a posteriori probabilities <str<strong>on</strong>g>for</str<strong>on</strong>g> each variable that can<br />

be used to improve per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance. This is d<strong>on</strong>e by applying a<br />

Schnorr-Euchner sphere decoding algorithm [3], where we<br />

order the symbols according to their a posteriori probabilities.<br />

This can lead to very close to optimal per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance with highly<br />

reduced complexity. This is due to the fact that by utilizing the<br />

a posteriori probabilities, we have a much higher probability<br />

of finding the true soluti<strong>on</strong> during the first search, there<str<strong>on</strong>g>for</str<strong>on</strong>g>e<br />

significantly reducing the search radius.<br />

There are several important applicati<strong>on</strong>s of the proposed technique.<br />

We comment here <strong>on</strong> combining it into communicati<strong>on</strong><br />

systems with coding and interleaving. It is useful both <str<strong>on</strong>g>for</str<strong>on</strong>g> single<br />

carrier and OFDM systems. It can serve as a <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> decoder <str<strong>on</strong>g>for</str<strong>on</strong>g><br />

wireless communicati<strong>on</strong> systems. Using the a posteriori probability<br />

distributi<strong>on</strong> of the symbols we can easily estimate the a<br />

posteriori probability and the likelihood ratio <str<strong>on</strong>g>for</str<strong>on</strong>g> the bits. The<br />

technique can be combined with <str<strong>on</strong>g>MIMO</str<strong>on</strong>g>-OFDM system with bit<br />

interleaved coded modulati<strong>on</strong> or with trellis coded modulati<strong>on</strong><br />

by joint coding over all frequency t<strong>on</strong>es of the OFDM system,<br />

and running the decoder <str<strong>on</strong>g>for</str<strong>on</strong>g> each t<strong>on</strong>e independently.<br />

In this paper we focused <strong>on</strong> the <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> detecti<strong>on</strong> problem.<br />

The proposed method, however, can be applied to solve more<br />

general c<strong>on</strong>strained linear least squares problems which is an<br />

important issue in many fields. A main c<strong>on</strong>cept in the GTA<br />

model is the interplay between discrete and Gaussian models.<br />

Such hybrid ideas can be c<strong>on</strong>sidered also <str<strong>on</strong>g>for</str<strong>on</strong>g> discrete inference<br />

problems other than least-squares. One example is the work<br />

of Opper and Winther [33] who applied an iterative algorithm<br />

using a model which is seen as discrete and Gaussian in turn to<br />

address Ising model inference problems. In their approach they<br />

approximated intractable probabilistic models by replacing an<br />

average over the original intractable distributi<strong>on</strong> with a tractable<br />

<strong>on</strong>e. Our method can be viewed as a special case of their tree expectati<strong>on</strong>-c<strong>on</strong>sistent<br />

approximati<strong>on</strong> algorithm [33], applied to<br />

the <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> detecti<strong>on</strong> problem.<br />

REFERENCES<br />

[1] U. Fincke and M. Pohst, “Improved methods <str<strong>on</strong>g>for</str<strong>on</strong>g> calculating vectors<br />

of short length in a lattice, including a complexity analysis,” Math.<br />

Comput., vol. 44, no. 170, pp. 463–471, 1985.<br />

[2] G. D. Golden, G. J. Foschini, R. A. Valenzuela, and P. W. Wolniansky,<br />

“<str<strong>on</strong>g>Detecti<strong>on</strong></str<strong>on</strong>g> algorithm and initial laboratory results using V-BLAST<br />

space-time communicati<strong>on</strong> architecture,” Electr<strong>on</strong>. Let., vol. 35, no. 1,<br />

pp. 14–16, 1999.<br />

[3] C. Schnorr and M. Euchner, “Lattice basis reducti<strong>on</strong>: Improved practical<br />

algorithms and solving subset sum problems,” J. Math. Program.,<br />

vol. 66, no. 1–3, pp. 181–199, 1994.<br />

[4] J. Boutros, N. Gresset, L. Brunel, and M. Fossorier, “Soft-input softoutput<br />

lattice sphere decoder <str<strong>on</strong>g>for</str<strong>on</strong>g> linear channels,” in Proc. <strong>IEEE</strong> Global<br />

Commun. C<strong>on</strong>f., Dec. 2003, vol. 3, pp. 1583–1587.<br />

[5] C. Studer, A. Burg, and H. Bölcskei, “Soft-output sphere decoding:<br />

Algorithms and VLSI implementati<strong>on</strong>,” <strong>IEEE</strong> J. Sel. Areas Commun.,<br />

vol. 26, no. 2, pp. 290–300, 2008.<br />

[6] J. Jalden and B. Ottersten, “On the complexity of sphere decoding in<br />

digital communicati<strong>on</strong>s,” <strong>IEEE</strong> Trans. Signal Process., vol. 53, no. 4,<br />

pp. 1474–1484, 2005.<br />

[7] A. Wiesel, Y. C. Eldar, and S. Shamai, “Semidefinite relaxati<strong>on</strong> <str<strong>on</strong>g>for</str<strong>on</strong>g><br />

detecti<strong>on</strong> of 16-<str<strong>on</strong>g>QAM</str<strong>on</strong>g> signaling in <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> channels,” <strong>IEEE</strong> Signal<br />

Process. Lett., vol. 12, no. 9, pp. 653–656, 2005.<br />

[8] N. Sidiropoulos and Z. Luo, “A semidefinite relaxati<strong>on</strong> approach to<br />

<str<strong>on</strong>g>MIMO</str<strong>on</strong>g> detecti<strong>on</strong> <str<strong>on</strong>g>for</str<strong>on</strong>g> high-order <str<strong>on</strong>g>QAM</str<strong>on</strong>g> c<strong>on</strong>stellati<strong>on</strong>s,” <strong>IEEE</strong> Signal<br />

Process. Lett., vol. 13, no. 9, pp. 525–528, 2006.<br />

[9] W. Ma, C. Su, J. Jalden, T. Chang, and C. Chi, “The equivalence of<br />

semidefinite relaxati<strong>on</strong> <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> detectors <str<strong>on</strong>g>for</str<strong>on</strong>g> higher-order <str<strong>on</strong>g>QAM</str<strong>on</strong>g>,” <strong>IEEE</strong><br />

J. Sel. Topics Signal Process., vol. 3, no. 6, pp. 1038–1052, 2009.<br />

[10] O. Shental, N. Shental, S. Shamai (Shitz), I. Kanter, A. Weiss, and<br />

Y. Weiss, “Discrete-input two-dimensi<strong>on</strong>al Gaussian channels with<br />

memory: Estimati<strong>on</strong> and in<str<strong>on</strong>g>for</str<strong>on</strong>g>mati<strong>on</strong> rates via graphical models and<br />

statistical mechanics,” <strong>IEEE</strong> Trans. Inf. Theory, vol. 54, no. 4, pp.<br />

1500–1513, Apr. 2008.<br />

[11] J. S. Yedidia, W. T. Freeman, and Y. Weiss, “Understanding belief<br />

propagati<strong>on</strong> and its generalizati<strong>on</strong>s,” in Proc. Int. Joint C<strong>on</strong>f. Artif. Intell.,<br />

Aug. 2001, pp. 239–236.<br />

[12] E. Sudderth and W. Freeman, “Signal and image processing with belief<br />

propagati<strong>on</strong>,” <strong>IEEE</strong> Signal Process. Mag., vol. 25, no. 2, pp. 114–141,<br />

2008.<br />

[13] J. Pearl, Probabilistic Reas<strong>on</strong>ing in Intelligent Systems. San Mateo,<br />

CA: Morgan Kaufman, 1988.<br />

[14] Y. Weiss and W. Freeman, “Correctness of belief propagati<strong>on</strong> in<br />

Gaussian graphical models of arbitrary topology,” Neural Comput.,<br />

vol. 13, no. 10, pp. 2173–2200, 2001.<br />

[15] R. G. Gallager, Low Density Parity Check Codes. Cambridge, MA:<br />

MIT Press, 1963.<br />

[16] P. F. Felzenszwalb and D. P. Huttenlocher, “Efficient belief propagati<strong>on</strong><br />

<str<strong>on</strong>g>for</str<strong>on</strong>g> early visi<strong>on</strong>,” Int. J. Comput. Vis., vol. 70, no. 1, pp. 41–54, 2006.<br />

[17] K. Murphy, Y. Weiss, and M. Jordan, “Loopy belief propagati<strong>on</strong> <str<strong>on</strong>g>for</str<strong>on</strong>g><br />

approximate inference: An empirical study,” in Proc. Uncertainty Artif.<br />

Intell., Jul. 1999, pp. 467–475.<br />

[18] T. Minka and Y. Qi, “Tree-structured approximati<strong>on</strong>s by expectati<strong>on</strong><br />

propagati<strong>on</strong>,” in Proc. Neural Inf. Process. Syst., Vancouver, BC,<br />

Canada, Dec. 2004.<br />

[19] J. Goldberger and A. Leshem, “Pseudo prior belief propagati<strong>on</strong> <str<strong>on</strong>g>for</str<strong>on</strong>g><br />

densely c<strong>on</strong>nected discrete graphs,” in Proc. <strong>IEEE</strong> Inf. Theory Workshop,<br />

Jan. 2010, pp. 1–5.<br />

[20] F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, “Factor graphs and<br />

the sum-product algorithm,” <strong>IEEE</strong> Trans. Inf. Theory, vol. 47, no. 2,<br />

pp. 498–519, Feb. 2001.<br />

[21] M. Kaynak, T. Duman, and E. Kurtas, “Belief propagati<strong>on</strong> over<br />

SISO/<str<strong>on</strong>g>MIMO</str<strong>on</strong>g> frequency selective fading channels,” <strong>IEEE</strong> Trans.<br />

Wireless Commun, vol. 6, no. 6, pp. 2001–2005, Jun. 2007.<br />

[22] Y. Kabashima, “A CDMA multiuser detecti<strong>on</strong> algorithm <strong>on</strong> the basis<br />

of belief propagati<strong>on</strong>,” J. Phys. A, vol. 36, pp. 11 111–11 121, 2003.<br />

[23] J. Neirotti and D. Saad, “Improved message passing <str<strong>on</strong>g>for</str<strong>on</strong>g> inference in<br />

densely c<strong>on</strong>nected systems,” Europhys. Lett., vol. 71, pp. 866–872,<br />

2005.<br />

[24] J. Hu and T. M. Duman, “Graph-based detector <str<strong>on</strong>g>for</str<strong>on</strong>g> BLAST architecture,”<br />

in Proc. <strong>IEEE</strong> Int. C<strong>on</strong>f. Commun., Jun. 2007, pp. 1018–1023.<br />

[25] D. Bicks<strong>on</strong>, O. Shental, P. H. Siegel, J. K. Wolf, and D. Dolev,<br />

“Gaussian belief propagati<strong>on</strong> based multiuser detecti<strong>on</strong>,” in Proc.<br />

<strong>IEEE</strong> Int. Symp. Inf. Theory, Jul. 2008, pp. 1878–1882.<br />

[26] C. K. Chow and C. N. Liu, “Approximating discrete probability distributi<strong>on</strong>s<br />

with dependence trees,” <strong>IEEE</strong> Trans. Inf. Theory, vol. IT-14,<br />

no. 3, pp. 462–467, May 1968.<br />

[27] R. C. Prim, “Shortest c<strong>on</strong>necti<strong>on</strong> networks and some generalizati<strong>on</strong>s,”<br />

J. Bell Syst. Tech., pp. 1389–1401, 1957.<br />

[28] E. Agrell, T. Erikss<strong>on</strong>, A. Vardy, and K. Zeger, “Closest point search in<br />

lattices,” <strong>IEEE</strong> Trans. Inf. Theory, vol. 48, no. 8, pp. 2201–2214, Aug.<br />

2002.<br />

[29] D. Wübben, R. Böhnke, J. Rinas, V. Kühn, and K. Kammeyer, “Efficient<br />

algorithm <str<strong>on</strong>g>for</str<strong>on</strong>g> decoding layered space-time codes,” IEE Electr<strong>on</strong>.<br />

Lett., vol. 37, pp. 1348–1350, 2001.<br />

[30] D. Wübben, R. Böhnke, V. Kühn, and K. Kammeyer, “MMSE extensi<strong>on</strong><br />

of V-BLAST based <strong>on</strong> sorted QR decompositi<strong>on</strong>,” in Proc. <strong>IEEE</strong><br />

Veh. Technol. C<strong>on</strong>f., Oct. 2003, vol. 1, pp. 508–512.<br />

[31] B. Borchers, “A C library <str<strong>on</strong>g>for</str<strong>on</strong>g> semidefinite programming,” Optim.<br />

Methods Softw., vol. 11, pp. 613–623, 1999.<br />

[32] P. H. Tan and L. K. Rasmussen, “Multiuser detecti<strong>on</strong> in CDMA—A<br />

comparis<strong>on</strong> of relaxati<strong>on</strong>s, exact, and heuristic search methods,” <strong>IEEE</strong><br />

Trans. Wireless Commun., vol. 3, no. 5, pp. 1802–1809, Sep. 2004.<br />

[33] M. Opper and O. Winther, “Expectati<strong>on</strong> c<strong>on</strong>sistent approximate inference,”<br />

J. Mach. Learn. Res., vol. 6, no. 12, pp. 2177–2204, 2005.


4982 <strong>IEEE</strong> TRANSACTIONS ON INFORMATION THEORY, VOL. 57, NO. 8, AUGUST 2011<br />

Jacob Goldberger received the B.Sc. degree (cum laude) in mathematics and<br />

computer science from the Bar-Ilan University, Ramat Gan, Israel, the M.Sc.<br />

degree (cum laude) in mathematics, and the Ph.D. degree in Engineering, from<br />

the Tel-Aviv University, Tel-Aviv, Israel. In 2004 he joined the School of Electrical<br />

and Computer Engineering, Bar-Ilan University, Ramat Gan, Israel, where<br />

he is currently a faculty member. His main research interests include machine<br />

learning, in<str<strong>on</strong>g>for</str<strong>on</strong>g>mati<strong>on</strong> theory, computer visi<strong>on</strong>, medical imaging, and speech<br />

recogniti<strong>on</strong>.<br />

Amir Leshem (M’98-SM’06) received the B.Sc. degree (cum laude) in mathematics<br />

and physics, the M.Sc. degree (cum laude) in mathematics, and the Ph.D.<br />

degree in mathematics, all from the Hebrew University, Jerusalem, Israel. He<br />

is <strong>on</strong>e of the founders of the School of Electrical and Computer Engineering,<br />

Bar-Ilan University, Ramat Gan, Israel, where he is currently an Associate Professor<br />

and Head of the signal processing track. His main research interests include<br />

multichannel communicati<strong>on</strong>, applicati<strong>on</strong>s of game theory to communicati<strong>on</strong>,<br />

array and statistical signal processing with applicati<strong>on</strong>s to sensor arrays<br />

and networks, wireless communicati<strong>on</strong>s, radio-astr<strong>on</strong>omy, and brain research,<br />

set theory, logic, and foundati<strong>on</strong>s of mathematics.

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!