MIMO Detection for High-Order QAM Based on a ... - IEEE Xplore
MIMO Detection for High-Order QAM Based on a ... - IEEE Xplore
MIMO Detection for High-Order QAM Based on a ... - IEEE Xplore
You also want an ePaper? Increase the reach of your titles
YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.
<strong>IEEE</strong> TRANSACTIONS ON INFORMATION THEORY, VOL. 57, NO. 8, AUGUST 2011 4973<br />
<str<strong>on</strong>g>MIMO</str<strong>on</strong>g> <str<strong>on</strong>g>Detecti<strong>on</strong></str<strong>on</strong>g> <str<strong>on</strong>g>for</str<strong>on</strong>g> <str<strong>on</strong>g>High</str<strong>on</strong>g>-<str<strong>on</strong>g>Order</str<strong>on</strong>g> <str<strong>on</strong>g>QAM</str<strong>on</strong>g> <str<strong>on</strong>g>Based</str<strong>on</strong>g> <strong>on</strong> a<br />
Gaussian Tree Approximati<strong>on</strong><br />
Jacob Goldberger and Amir Leshem, Senior Member, <strong>IEEE</strong><br />
Abstract—This paper proposes a new detecti<strong>on</strong> algorithm <str<strong>on</strong>g>for</str<strong>on</strong>g><br />
<str<strong>on</strong>g>MIMO</str<strong>on</strong>g> communicati<strong>on</strong> systems employing high-order <str<strong>on</strong>g>QAM</str<strong>on</strong>g> c<strong>on</strong>stellati<strong>on</strong>s.<br />
The factor graph that corresp<strong>on</strong>ds to this problem is<br />
very loopy; in fact, it is a complete graph. Hence, a straight<str<strong>on</strong>g>for</str<strong>on</strong>g>ward<br />
applicati<strong>on</strong> of the Belief Propagati<strong>on</strong> (BP) algorithm yields very<br />
poor results. Our algorithm is based <strong>on</strong> an optimal tree approximati<strong>on</strong><br />
of the Gaussian density of the unc<strong>on</strong>strained linear system.<br />
The finite-set c<strong>on</strong>straint is then applied to obtain a cycle-free discrete<br />
distributi<strong>on</strong>. Simulati<strong>on</strong> results show that even though the<br />
approximati<strong>on</strong> is not directly applied to the exact discrete distributi<strong>on</strong>,<br />
applying the BP algorithm to the cycle-free factor graph<br />
outper<str<strong>on</strong>g>for</str<strong>on</strong>g>ms current methods in terms of both per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance and<br />
complexity. The improved per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance of the proposed algorithm<br />
is dem<strong>on</strong>strated <strong>on</strong> the problem of <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> detecti<strong>on</strong>.<br />
Index Terms—<str<strong>on</strong>g>High</str<strong>on</strong>g>-order <str<strong>on</strong>g>QAM</str<strong>on</strong>g>, integer least squares, <str<strong>on</strong>g>MIMO</str<strong>on</strong>g><br />
communicati<strong>on</strong> systems, <str<strong>on</strong>g>MIMO</str<strong>on</strong>g>-OFDM systems.<br />
Here, noise is modeled by the random vector which is independent<br />
of and whose comp<strong>on</strong>ents are assumed to be i.i.d. according<br />
to a complex Gaussian distributi<strong>on</strong> with mean zero and<br />
with known variance . The matrix comprises i.i.d.<br />
elements drawn from a complex normal distributi<strong>on</strong> of unit variance.<br />
The <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> detecti<strong>on</strong> problem c<strong>on</strong>sists of finding the unknown<br />
transmitted vector given and . The task, there<str<strong>on</strong>g>for</str<strong>on</strong>g>e,<br />
boils down to solving a linear system in which the unknowns<br />
are c<strong>on</strong>strained to a discrete finite set. It is c<strong>on</strong>venient to re<str<strong>on</strong>g>for</str<strong>on</strong>g>mulate<br />
the complex-valued model into a real-valued <strong>on</strong>e. It can<br />
be translated into an equivalent double-size real-valued representati<strong>on</strong><br />
that is obtained by c<strong>on</strong>sidering the real and imaginary<br />
parts separately<br />
I. INTRODUCTION<br />
F<br />
INDING A linear least squares fit to data is a well-known<br />
problem, with applicati<strong>on</strong>s in almost every field of<br />
science. When there are no restricti<strong>on</strong>s <strong>on</strong> the variables, the<br />
problem has a closed-<str<strong>on</strong>g>for</str<strong>on</strong>g>m soluti<strong>on</strong>. In many cases, a priori<br />
knowledge <strong>on</strong> the values of the variables is available. One<br />
example is the existence of priors, which leads to Bayesian<br />
estimators. Another example of great interest in a variety of<br />
areas is when the variables are c<strong>on</strong>strained to a discrete finite<br />
set. This problem has many diverse applicati<strong>on</strong>s such as the<br />
decoding of multi-input-multi-output (<str<strong>on</strong>g>MIMO</str<strong>on</strong>g>) digital communicati<strong>on</strong><br />
systems. In c<strong>on</strong>trast to the c<strong>on</strong>tinuous linear least<br />
squares problem, this problem is known to be NP hard [1].<br />
We c<strong>on</strong>sider a <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> communicati<strong>on</strong> system with transmit<br />
antennas and receive antennas. The tap gain from transmit antenna<br />
to receive antenna is denoted by . In each use of the<br />
<str<strong>on</strong>g>MIMO</str<strong>on</strong>g> channel a vector<br />
is independently<br />
selected from a finite set of complex numbers according to<br />
the data to be transmitted, so that . We further assume<br />
that in each use of the <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> channel, is uni<str<strong>on</strong>g>for</str<strong>on</strong>g>mly sampled<br />
from . The received vector is given by<br />
Manuscript received January 20, 2010; revised April 13, 2011; accepted April<br />
23, 2011. Date of current versi<strong>on</strong> July 29, 2011. The material in this paper was<br />
presented in part at the Neural In<str<strong>on</strong>g>for</str<strong>on</strong>g>mati<strong>on</strong> Processing Systems C<strong>on</strong>ference,<br />
Vancouver, BC, Canada, Dec. 2009.<br />
J. Goldberger is with the School of Engineering, Bar-Ilan University, 52900,<br />
Ramat-Gan, Israel (e-mail: goldbej@eng.biu.ac.il).<br />
A. Leshem is with the School of Engineering, Bar-Ilan University, 52900,<br />
Ramat-Gan, Israel (e-mail: leshema@eng.biu.ac.il).<br />
Communicated by P. O. V<strong>on</strong>tobel, Associate Editor <str<strong>on</strong>g>for</str<strong>on</strong>g> Coding Techniques.<br />
Digital Object Identifier 10.1109/TIT.2011.2159037<br />
(1)<br />
Hence we assume hereafter that has real values without any<br />
loss of generality. The maximum likelihood (ML) soluti<strong>on</strong> is<br />
However, going over all the vectors is practically unfeasible<br />
when either or are large.<br />
A simple suboptimal soluti<strong>on</strong>, known as the zero-<str<strong>on</strong>g>for</str<strong>on</strong>g>cing algorithm,<br />
is based <strong>on</strong> a linear decisi<strong>on</strong> that ignores the finite-set<br />
c<strong>on</strong>straint<br />
and then, neglecting the correlati<strong>on</strong> between the symbols,<br />
finding the closest point in <str<strong>on</strong>g>for</str<strong>on</strong>g> each symbol independently<br />
This scheme per<str<strong>on</strong>g>for</str<strong>on</strong>g>ms poorly due to its inability to handle illc<strong>on</strong>diti<strong>on</strong>ed<br />
realizati<strong>on</strong>s of the matrix (if then is<br />
even singular). Somewhat better per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance can be obtained<br />
by using a minimum mean square error (MMSE) Bayesian estimati<strong>on</strong><br />
<str<strong>on</strong>g>for</str<strong>on</strong>g> the c<strong>on</strong>tinuous linear system. Let be the mean<br />
symbol energy. We can partially incorporate the in<str<strong>on</strong>g>for</str<strong>on</strong>g>mati<strong>on</strong><br />
that by using the prior Gaussian distributi<strong>on</strong><br />
. The MMSE estimati<strong>on</strong> becomes<br />
and then the finite-set soluti<strong>on</strong> is obtained by finding the<br />
closest lattice point in each comp<strong>on</strong>ent independently. A vast<br />
improvement over the linear approaches described above can<br />
(2)<br />
(3)<br />
(4)<br />
(5)<br />
0018-9448/$26.00 © 2011 <strong>IEEE</strong>
4974 <strong>IEEE</strong> TRANSACTIONS ON INFORMATION THEORY, VOL. 57, NO. 8, AUGUST 2011<br />
be achieved by the MMSE with Successive Interference Cancelati<strong>on</strong><br />
(MMSE-SIC) algorithm that is based <strong>on</strong> sequential<br />
decoding with optimal ordering [2]. These linear type algorithms<br />
can also easily provide probabilistic (soft-decisi<strong>on</strong>)<br />
estimates <str<strong>on</strong>g>for</str<strong>on</strong>g> each symbol. However, there is still a significant<br />
gap between the detecti<strong>on</strong> per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance of the MMSE-SIC<br />
algorithm and the per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance of the ML detector.<br />
Many alternative methods have been proposed to approach the<br />
ML detecti<strong>on</strong> per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance. The sphere decoding (SD) algorithm<br />
finds the exact ML soluti<strong>on</strong> by searching the nearest lattice point<br />
[1], [3]–[5]. Although SD reduces computati<strong>on</strong>al complexity<br />
compared to the exhaustive search of the ML soluti<strong>on</strong>, sphere<br />
decoding is not feasible <str<strong>on</strong>g>for</str<strong>on</strong>g> high-order <str<strong>on</strong>g>QAM</str<strong>on</strong>g> c<strong>on</strong>stellati<strong>on</strong>s.<br />
While SD has been empirically found to be computati<strong>on</strong>ally very<br />
fast <str<strong>on</strong>g>for</str<strong>on</strong>g> small to moderate problem sizes (say, <str<strong>on</strong>g>for</str<strong>on</strong>g> <str<strong>on</strong>g>for</str<strong>on</strong>g><br />
16-<str<strong>on</strong>g>QAM</str<strong>on</strong>g>), the sphere decoding complexity would be prohibitive<br />
<str<strong>on</strong>g>for</str<strong>on</strong>g> large , higher order <str<strong>on</strong>g>QAM</str<strong>on</strong>g> and/or low SNRs [6]. Another<br />
family of <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> decoding algorithms is based <strong>on</strong> semidefinite<br />
relaxati<strong>on</strong> (e.g., [7]–[9]). Although the theoretical computati<strong>on</strong>al<br />
complexity of semidefinite relaxati<strong>on</strong> is a low degree polynomial,<br />
in practice the running time is very high. Thus, there is still a<br />
need <str<strong>on</strong>g>for</str<strong>on</strong>g> low-complexity detecti<strong>on</strong> algorithms that per<str<strong>on</strong>g>for</str<strong>on</strong>g>m well.<br />
This study attempts to solve the <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> decoding problem<br />
using the Belief Propagati<strong>on</strong> (BP) paradigm. It is well-known<br />
(see e.g., [10]) that a straight<str<strong>on</strong>g>for</str<strong>on</strong>g>ward implementati<strong>on</strong> of the BP<br />
algorithm to the <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> detecti<strong>on</strong> problem yields very poor results<br />
since there are a large number of short cycles in the underlying<br />
factor graph. In this study we introduce a novel approach<br />
to utilize the BP paradigm <str<strong>on</strong>g>for</str<strong>on</strong>g> <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> detecti<strong>on</strong>. The proposed<br />
variant of the BP algorithm is both computati<strong>on</strong>ally efficient and<br />
achieves improved results. The paper proceeds as follows. In<br />
Secti<strong>on</strong> II we discuss previous attempts to apply variants of the<br />
BP algorithm to the <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> decoding problem. The proposed<br />
algorithm which we dub “The Gaussian-Tree-Approximati<strong>on</strong><br />
(GTA) Algorithm” is described in Secti<strong>on</strong> III. Experimental results<br />
are presented in Secti<strong>on</strong> IV.<br />
II. THE LOOPY BELIEF PROPAGATION APPROACH<br />
Given the c<strong>on</strong>strained linear system<br />
, and a uni<str<strong>on</strong>g>for</str<strong>on</strong>g>m<br />
prior distributi<strong>on</strong> <strong>on</strong> over a finite set of points , the<br />
posterior probability functi<strong>on</strong> of the discrete random vector<br />
given is<br />
The notati<strong>on</strong> stands <str<strong>on</strong>g>for</str<strong>on</strong>g> equality up to a normalizati<strong>on</strong> c<strong>on</strong>stant.<br />
Observing that is a quadratic expressi<strong>on</strong>, it can<br />
be easily verified that can be factorized into a product of<br />
two- and single-variable potentials<br />
such that<br />
(6)<br />
(7)<br />
(8)<br />
Fig. 1. The MRF undirected graphical model corresp<strong>on</strong>ds to the <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> detecti<strong>on</strong><br />
problem with n =7.<br />
where is the th column of the matrix . Since the obtained<br />
factors are simply functi<strong>on</strong>s of pairs of random variables, we obtain<br />
a Markov Random Field (MRF) representati<strong>on</strong> [11]. In the<br />
<str<strong>on</strong>g>MIMO</str<strong>on</strong>g> applicati<strong>on</strong> the (known) matrix is randomly selected<br />
and there<str<strong>on</strong>g>for</str<strong>on</strong>g>e the MRF graph is usually a completely c<strong>on</strong>nected<br />
graph (see an MRF graph illustrati<strong>on</strong> in Fig. 1).<br />
The Belief Propagati<strong>on</strong> (BP) algorithm aims to solve inference<br />
problems by propagating in<str<strong>on</strong>g>for</str<strong>on</strong>g>mati<strong>on</strong> throughout this MRF<br />
via a series of messages sent between neighboring nodes (see<br />
[12] <str<strong>on</strong>g>for</str<strong>on</strong>g> an excellent tutorial <strong>on</strong> BP). In the sum-product variant<br />
of the BP algorithm applied to the MRF (7), the message from<br />
to is<br />
(9)<br />
In each iterati<strong>on</strong> messages are passed al<strong>on</strong>g all the graph edges<br />
in both edge directi<strong>on</strong>s. In every iterati<strong>on</strong>, an estimate of the<br />
posterior marginal distributi<strong>on</strong> (“belief”) <str<strong>on</strong>g>for</str<strong>on</strong>g> each variable can<br />
be computed by multiplying together all the incoming messages<br />
from all the other nodes<br />
(10)<br />
A variant of the sum-product algorithm is the max-product algorithm<br />
in which the summati<strong>on</strong> in (9) is replaced by a maximizati<strong>on</strong><br />
over all the symbols in . In a cycle-free MRF graph<br />
the sum-product algorithm always c<strong>on</strong>verges to the exact marginal<br />
probabilities (which in the case of <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> detecti<strong>on</strong> corresp<strong>on</strong>ds<br />
to a soft decisi<strong>on</strong> of each symbol, i.e., ). In a cyclefree<br />
MRF graph the max-product variant of the BP algorithm<br />
always c<strong>on</strong>verges to the most likely c<strong>on</strong>figurati<strong>on</strong> [13] (which<br />
corresp<strong>on</strong>ds to ML decoding in our case). For cycle-free graphs,<br />
BP is essentially a distributed variant of dynamic programming.<br />
The BP message update equati<strong>on</strong>s <strong>on</strong>ly involve passing messages<br />
between neighboring nodes. Computati<strong>on</strong>ally, it is thus straight<str<strong>on</strong>g>for</str<strong>on</strong>g>ward<br />
to apply the same local message updates in graphs with<br />
cycles. In most such models, however, this loopy BP algorithm<br />
will not compute exact marginal distributi<strong>on</strong>s; hence, there is almost<br />
no theoretical justificati<strong>on</strong> <str<strong>on</strong>g>for</str<strong>on</strong>g> applying the BP algorithm<br />
(<strong>on</strong>e excepti<strong>on</strong> is that, <str<strong>on</strong>g>for</str<strong>on</strong>g> Gaussian graphs, if BP c<strong>on</strong>verges, then<br />
the means are correct [14]). However, the BP algorithm applied to<br />
loopy graphs has been found to have outstanding empirical success<br />
in many applicati<strong>on</strong>s, e.g., in decoding of LDPC codes [15].<br />
The per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance of BP in this applicati<strong>on</strong> may be attributed to<br />
the sparsity of the graphs. The cycles in the graph are l<strong>on</strong>g, hence
GOLDBERGER AND LESHEM: <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> DETECTION FOR HIGH-ORDER <str<strong>on</strong>g>QAM</str<strong>on</strong>g> BASED ON A GAUSSIAN TREE APPROXIMATION 4975<br />
Fig. 2. Decoding results <str<strong>on</strong>g>for</str<strong>on</strong>g> 8 2 8 BPSK real valued system, A = f01; 1g.<br />
the graph has tree-like properties, so that messages are approximately<br />
independent and inference may be per<str<strong>on</strong>g>for</str<strong>on</strong>g>med as though<br />
the graph was cycle-free. The BP algorithm has also been used<br />
successfully in image processing and computer visi<strong>on</strong> (e.g., [16])<br />
where the image is represented by a grid-structured MRF that is<br />
based <strong>on</strong> local c<strong>on</strong>necti<strong>on</strong>s between neighboring nodes.<br />
However, when the graph is not sparse, and is not based <strong>on</strong><br />
local grid c<strong>on</strong>necti<strong>on</strong>s, loopy BP almost always fails to c<strong>on</strong>verge.<br />
Unlike the sparse graphs of LDPC codes, or grid graphs in<br />
computer visi<strong>on</strong> applicati<strong>on</strong>s, the MRF graphs of <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> channels<br />
are completely c<strong>on</strong>nected graphs and there<str<strong>on</strong>g>for</str<strong>on</strong>g>e the associated<br />
detecti<strong>on</strong> per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance is poor. This has prevented the BP<br />
from being an asset <str<strong>on</strong>g>for</str<strong>on</strong>g> the <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> problem. Fig. 2 shows per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance<br />
curves <str<strong>on</strong>g>for</str<strong>on</strong>g> a BPSK <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> system based <strong>on</strong> an 8 8<br />
matrix and<br />
(see Secti<strong>on</strong> IV <str<strong>on</strong>g>for</str<strong>on</strong>g> a detailed descripti<strong>on</strong><br />
of the simulati<strong>on</strong> setup). As can be seen in Fig. 2, the BP<br />
decoder based <strong>on</strong> the MRF representati<strong>on</strong> (7) yields very poor<br />
results. Standard techniques to stabilize the BP iterati<strong>on</strong>s such<br />
as damping the message updates [17] do not help here. Even applying<br />
more advanced versi<strong>on</strong>s of BP (e.g., Generalized BP and<br />
Expectati<strong>on</strong> Propagati<strong>on</strong> [18]) to inference problems <strong>on</strong> complete<br />
MRF graphs yields poor results. The problem here is not<br />
in the optimizati<strong>on</strong> method but in the cost functi<strong>on</strong> that needs to<br />
be modified to yield a good approximate soluti<strong>on</strong> (see e.g., [19]).<br />
Shental et al. [10] analyzed the case where the matrix is<br />
relatively sparse (and has a grid structure) (see Fig. 3). They<br />
showed that even under this restricted assumpti<strong>on</strong> the BP still<br />
does not per<str<strong>on</strong>g>for</str<strong>on</strong>g>m well. As an alternative method they proposed<br />
the generalized belief propagati<strong>on</strong> (GBP) algorithm that does<br />
work well <strong>on</strong> the sparse matrix if the algorithm regi<strong>on</strong>s are carefully<br />
chosen. There are situati<strong>on</strong>s where the sparsity assumpti<strong>on</strong><br />
makes sense (e.g., 2-D intersymbol interference (ISI) channels).<br />
However, in the <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> channel model we assume that<br />
the channel matrix elements are i.i.d. and Gaussian; hence we<br />
cannot assume that the channel matrix is sparse.<br />
There is yet another way to write the <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> detecti<strong>on</strong><br />
problem (6) as a factor graph [20]. It can be easily verified that<br />
(11)<br />
Fig. 3. The MRF grid model corresp<strong>on</strong>ds to 5 2 5 2-D intersymbol interference<br />
(ISI) channels.<br />
Fig. 4. Factor graph representati<strong>on</strong> of the <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> detecti<strong>on</strong> system. The variables<br />
of the factor graph x ; ...;x corresp<strong>on</strong>d to the transmit antennas and<br />
the factors y ; ...;y corresp<strong>on</strong>d to the receive antennas.<br />
with is the th row of . Each factor corresp<strong>on</strong>ds to a single<br />
linear equati<strong>on</strong> and there<str<strong>on</strong>g>for</str<strong>on</strong>g>e each factor is a functi<strong>on</strong> of all the<br />
unknown variables. An illustrati<strong>on</strong> of the resulting factor graph<br />
is shown in Fig. 4. The variables of the factor graph corresp<strong>on</strong>d<br />
to the transmit antennas and the factors corresp<strong>on</strong>d to the receive<br />
antennas. Note that the factor graph representati<strong>on</strong> of (11)<br />
is a completely c<strong>on</strong>nected bipartite graph. Applying the BP algorithm<br />
<strong>on</strong> the factor graph (11) yields the following messages.<br />
to the vari-<br />
The BP message from the factor associated with<br />
able is<br />
and the BP message from variable to factor is<br />
(12)<br />
(13)<br />
A major drawback of applying BP to the factor graph (11) is<br />
the complexity of computing the factor-to-variable BP messages<br />
(12), which is exp<strong>on</strong>ential in the number of variables<br />
[21]. Several recent studies suggested an efficiently computed<br />
approximati<strong>on</strong> of the factor-to-variable BP messages based <strong>on</strong><br />
a Gaussian approximati<strong>on</strong>. The Gaussian approximati<strong>on</strong> can<br />
be justified <str<strong>on</strong>g>for</str<strong>on</strong>g> large values of by the central limit theorem
4976 <strong>IEEE</strong> TRANSACTIONS ON INFORMATION THEORY, VOL. 57, NO. 8, AUGUST 2011<br />
(see, e.g., [22]–[24]). While these algorithms were originally<br />
described <str<strong>on</strong>g>for</str<strong>on</strong>g> BPSK c<strong>on</strong>stellati<strong>on</strong>s, they can be generalized to<br />
arbitrary <str<strong>on</strong>g>QAM</str<strong>on</strong>g> c<strong>on</strong>stellati<strong>on</strong>s.<br />
In recent years there were also several attempts to apply BP<br />
<str<strong>on</strong>g>for</str<strong>on</strong>g> densely c<strong>on</strong>nected Gaussian graphs. For example c<strong>on</strong>sider<br />
the <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> detecti<strong>on</strong> problem (6) without the finite-set c<strong>on</strong>straint<br />
<strong>on</strong> the unknown variable (see, e.g., [25]). The Gaussian<br />
MRF model, however, is much easier. In that case the computati<strong>on</strong>al<br />
complexity of finding the exact soluti<strong>on</strong> is a low-degree<br />
polynomial in the number of variables. The BP algorithm can<br />
be utilized to further reduce the complexity.<br />
III. THE GAUSSIAN TREE APPROXIMATION ALGORITHM<br />
Our approach is based <strong>on</strong> an approximati<strong>on</strong> of the exact probability<br />
functi<strong>on</strong><br />
where is a Gaussian density with mean and covariance<br />
matrix . can be viewed as a posterior distributi<strong>on</strong><br />
of assuming a n<strong>on</strong>in<str<strong>on</strong>g>for</str<strong>on</strong>g>mative prior. Note that (15) is<br />
<strong>on</strong>ly defined if , otherwise the matrix is singular and<br />
there<str<strong>on</strong>g>for</str<strong>on</strong>g>e is undefined. Now, instead of marginalizing the true<br />
distributi<strong>on</strong> , which is an NP hard problem, we approximate<br />
it by the product of the marginals of the Gaussian density<br />
(16)<br />
At this stage we apply the finite-set c<strong>on</strong>straint. From the<br />
Gaussian approximati<strong>on</strong> (16) we can extract a discrete approximati<strong>on</strong><br />
(14)<br />
that enables a successful implementati<strong>on</strong> of the Belief Propagati<strong>on</strong><br />
paradigm. Since the BP algorithm is optimal <strong>on</strong> c<strong>on</strong>nected<br />
cycle-free factor graphs, i.e., <strong>on</strong> trees, a reas<strong>on</strong>able approach is<br />
finding an optimal tree approximati<strong>on</strong> of the exact distributi<strong>on</strong><br />
(14). Chow and Liu [26] proposed a method to find a tree<br />
approximati<strong>on</strong> of a given distributi<strong>on</strong> that has the minimal<br />
Kullback-Leibler (KL) divergence to the true distributi<strong>on</strong>.<br />
They showed that the optimal tree can be found efficiently<br />
by searching <str<strong>on</strong>g>for</str<strong>on</strong>g> the maximum-weight spanning tree in the<br />
weighted complete graph <strong>on</strong> vertices where the weight of<br />
an edge is given by the mutual in<str<strong>on</strong>g>for</str<strong>on</strong>g>mati<strong>on</strong> of the two random<br />
variables associated with the edge’s endpoints. The problem is<br />
that the Chow-Liu algorithm is based <strong>on</strong> the two-dimensi<strong>on</strong>al<br />
marginal distributi<strong>on</strong>s. However, finding the marginal distributi<strong>on</strong><br />
of the probability functi<strong>on</strong> (14) is, un<str<strong>on</strong>g>for</str<strong>on</strong>g>tunately, NP hard<br />
and it is (equivalent to) our final target.<br />
To overcome this obstacle, our approach is based <strong>on</strong> applying<br />
the Chow-Liu algorithm <strong>on</strong> the distributi<strong>on</strong> corresp<strong>on</strong>ding to the<br />
unc<strong>on</strong>strained linear system. This distributi<strong>on</strong> is Gaussian and<br />
there<str<strong>on</strong>g>for</str<strong>on</strong>g>e it is straight<str<strong>on</strong>g>for</str<strong>on</strong>g>ward in this case to compute the two-dimensi<strong>on</strong>al<br />
marginal distributi<strong>on</strong>s. Given the Gaussian tree approximati<strong>on</strong>,<br />
the next step of our approach is to apply the finite-set<br />
c<strong>on</strong>straint and utilize the Gaussian tree distributi<strong>on</strong> to<br />
<str<strong>on</strong>g>for</str<strong>on</strong>g>m a discrete loop free approximati<strong>on</strong> of which can be<br />
efficiently globally maximized using the BP algorithm. To motivate<br />
this approach we first show that the zero-<str<strong>on</strong>g>for</str<strong>on</strong>g>cing decoding<br />
algorithm (4) can be derived by utilizing a Gaussian approximati<strong>on</strong><br />
which is a simplified versi<strong>on</strong> of our Gaussian tree approximati<strong>on</strong>.<br />
Let be the least-squares estimator (3)<br />
and<br />
be its covariance matrix. It can be easily<br />
verified that (14) can be written as<br />
(17)<br />
Since this joint probability functi<strong>on</strong> is obtained as a product of<br />
marginal probabilities, we can decode each variable separately<br />
(18)<br />
Taking the most likely symbol we obtain the suboptimal zero<str<strong>on</strong>g>for</str<strong>on</strong>g>cing<br />
soluti<strong>on</strong> (4).<br />
Motivated by the simple product-of-marginals approximati<strong>on</strong><br />
described above, we suggest approximating the discrete distributi<strong>on</strong><br />
via a tree-based approximati<strong>on</strong> of the Gaussian<br />
distributi<strong>on</strong> . Although the Chow-Liu algorithm was<br />
originally stated <str<strong>on</strong>g>for</str<strong>on</strong>g> discrete distributi<strong>on</strong>s, <strong>on</strong>e can verify that it<br />
also applies <str<strong>on</strong>g>for</str<strong>on</strong>g> the Gaussian case. For the sake of completeness<br />
we give a detailed derivati<strong>on</strong> below.<br />
A. Finding the Optimal Gaussian Tree Approximati<strong>on</strong><br />
We represent an -node tree graph by the cycle-free parent<br />
relati<strong>on</strong>s such that is the parent node of .To<br />
simplify notati<strong>on</strong> we do not separately describe the root node.<br />
The parent of the root is implicitly assumed to be the empty<br />
set. A distributi<strong>on</strong><br />
is described by a tree<br />
if it can be written as<br />
. We start with a<br />
<str<strong>on</strong>g>for</str<strong>on</strong>g>mula <str<strong>on</strong>g>for</str<strong>on</strong>g> the KL divergence between a Gaussian distributi<strong>on</strong><br />
and a distributi<strong>on</strong> defined <strong>on</strong> the same space that is<br />
represented by a tree graphical model.<br />
Theorem 1: Let<br />
be a multivariate<br />
Gaussian distributi<strong>on</strong> and let<br />
be another<br />
distributi<strong>on</strong> that is represented by a cycle-free graphical<br />
model (tree). The KL divergence between and is<br />
(15)<br />
(19)
GOLDBERGER AND LESHEM: <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> DETECTION FOR HIGH-ORDER <str<strong>on</strong>g>QAM</str<strong>on</strong>g> BASED ON A GAUSSIAN TREE APPROXIMATION 4977<br />
such that is the mutual in<str<strong>on</strong>g>for</str<strong>on</strong>g>mati<strong>on</strong> and is the differential<br />
entropy, based <strong>on</strong> the distributi<strong>on</strong> .<br />
Proof: The definiti<strong>on</strong> of the KL divergence implies<br />
From (19) it can be easily seen that if we fix a tree graph<br />
, the tree distributi<strong>on</strong> whose KL divergence to is<br />
minimal is<br />
. In other words, the best<br />
tree approximati<strong>on</strong> of is c<strong>on</strong>structed from the c<strong>on</strong>diti<strong>on</strong>al<br />
distributi<strong>on</strong>s of . For that tree approximati<strong>on</strong> we have<br />
(20)<br />
Moreover, since in (20) and do not depend <strong>on</strong> the<br />
tree structure, the tree topology that best approximates<br />
is the <strong>on</strong>e that maximizes the sum<br />
(21)<br />
A spanning tree of a c<strong>on</strong>nected graph is a subgraph that c<strong>on</strong>tains<br />
all the vertices and is a tree. Now suppose the edges of the graph<br />
have weights. The weight of a spanning tree is simply the sum of<br />
weights of its edges. Equati<strong>on</strong> (21) reveals that the problem of<br />
finding the best tree approximati<strong>on</strong> of the Gaussian distributi<strong>on</strong><br />
can be reduced to the well-known problem of finding the<br />
maximum spanning tree of the weighted -node graph where<br />
the weight of the - edge is the mutual in<str<strong>on</strong>g>for</str<strong>on</strong>g>mati<strong>on</strong> between<br />
and [26]. It can be easily verified that the mutual in<str<strong>on</strong>g>for</str<strong>on</strong>g>mati<strong>on</strong><br />
between two r.v. and that are jointly Gaussian is<br />
(22)<br />
where is the correlati<strong>on</strong> coefficient between and .<br />
There are several algorithms to find a minimum spanning tree.<br />
They all utilize a greedy approach. In this work we use the Prim<br />
algorithm [27], which is efficient and very simple to implement.<br />
The Prim algorithm begins with some vertex in a given graph,<br />
defining the initial set of vertices . Then, in each iterati<strong>on</strong>,<br />
we choose the edge with minimal weight am<strong>on</strong>g all the edges<br />
, where is outside of and is in . Then vertex<br />
is brought in to . This process is repeated until a spanning<br />
tree is <str<strong>on</strong>g>for</str<strong>on</strong>g>med. We can use a heap to remember, <str<strong>on</strong>g>for</str<strong>on</strong>g> each vertex,<br />
the lowest-weight edge c<strong>on</strong>necting the current subtree with<br />
that vertex. The complexity of Prim’s algorithm, <str<strong>on</strong>g>for</str<strong>on</strong>g> finding the<br />
minimum-weight spanning tree of a -vertex graph is .<br />
We note in passing that the Prim algorithm <strong>on</strong>ly relies <strong>on</strong> the<br />
order of the weights and not <strong>on</strong> their exact values. Hence, applying<br />
a m<strong>on</strong>ot<strong>on</strong>ically increasing functi<strong>on</strong> <strong>on</strong> the graph weights<br />
does not change the topology of the optimal tree. To find the optimal<br />
Gaussian tree approximati<strong>on</strong> we can, there<str<strong>on</strong>g>for</str<strong>on</strong>g>e, use the<br />
weights instead of . The optimal<br />
Gaussian tree is, there<str<strong>on</strong>g>for</str<strong>on</strong>g>e, the <strong>on</strong>e that maximizes the sum<br />
of the square correlati<strong>on</strong> coefficients between adjacent nodes.<br />
To summarize, the algorithm that finds the best Gaussian tree<br />
approximati<strong>on</strong> is as follows. Define as the root. Then find<br />
the edge c<strong>on</strong>necting a vertex in the tree to a vertex outside the<br />
tree, such that the corresp<strong>on</strong>ding square correlati<strong>on</strong> coefficient<br />
is maximal and add the edge to the tree. C<strong>on</strong>tinue this procedure<br />
until a spanning tree is obtained.<br />
B. Applying BP <strong>on</strong> the Tree Approximati<strong>on</strong><br />
Let be the optimal Chow-Liu tree approximati<strong>on</strong> of<br />
(15). We can assume, without loss of generality, that<br />
is rooted at . is a cycle-free Gaussian distributi<strong>on</strong><br />
<strong>on</strong> , i.e.,<br />
(23)<br />
where is the “parent” of the th node in the optimal tree.<br />
The Chow-Liu algorithm guarantees that is the optimal<br />
Gaussian tree approximati<strong>on</strong> of in the sense that the<br />
KL divergence is minimal.<br />
Given the Gaussian tree approximati<strong>on</strong>, the next step of our<br />
approach is to apply the finite-set c<strong>on</strong>straint to <str<strong>on</strong>g>for</str<strong>on</strong>g>m a discrete<br />
loop free approximati<strong>on</strong> of which can be efficiently globally<br />
maximized using the BP algorithm. Our approximati<strong>on</strong> approach<br />
is, there<str<strong>on</strong>g>for</str<strong>on</strong>g>e, based <strong>on</strong> replacing the true distributi<strong>on</strong><br />
with the following approximati<strong>on</strong>:<br />
(24)<br />
The probability functi<strong>on</strong> is a cycle-free factor graph.<br />
Hence the BP algorithm can be applied to find its most likely<br />
c<strong>on</strong>figurati<strong>on</strong>. We next derive the messages of the BP algorithm<br />
that is applied <strong>on</strong> . An optimal BP schedule, when applied<br />
to a tree, requires passing a message <strong>on</strong>ce in each directi<strong>on</strong> of<br />
each edge [20]. The BP messages are first sent from leaf variables<br />
“downward” to the root. The computati<strong>on</strong> begins at the<br />
leaves of the graph. Each leaf variable node sends a message to<br />
its parent. Each vertex waits <str<strong>on</strong>g>for</str<strong>on</strong>g> messages from all of its children<br />
be<str<strong>on</strong>g>for</str<strong>on</strong>g>e computing the message to be sent to its parent. The<br />
“downward” BP message from a variable to its parent variable<br />
is computed based <strong>on</strong> all the messages received<br />
from its children<br />
(25)
4978 <strong>IEEE</strong> TRANSACTIONS ON INFORMATION THEORY, VOL. 57, NO. 8, AUGUST 2011<br />
If<br />
is a leaf node in the tree then the message is simply<br />
(26)<br />
<str<strong>on</strong>g>for</str<strong>on</strong>g> decoding <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> systems presented in the experiment secti<strong>on</strong>.<br />
The max-product is more computati<strong>on</strong>ally efficient since<br />
the BP messages can be entirely computed in the log-domain.<br />
The “downward” computati<strong>on</strong> terminates at the root node.<br />
Next, BP messages are sent “upward” back to the leaves. The<br />
computati<strong>on</strong> begins at the root of the graph. Each vertex waits<br />
<str<strong>on</strong>g>for</str<strong>on</strong>g> a message from its parent be<str<strong>on</strong>g>for</str<strong>on</strong>g>e computing the messages to<br />
be sent to each of its children. The “upward” BP message from<br />
a parent variable to its child variable is computed based<br />
<strong>on</strong> the “upward” message received from its parent<br />
and from “downward” messages that received from all the<br />
siblings of<br />
C. An MMSE Versi<strong>on</strong> of A Tree Approximati<strong>on</strong><br />
The MMSE Bayesian approach (5) is known to be better<br />
than the zero-<str<strong>on</strong>g>for</str<strong>on</strong>g>cing soluti<strong>on</strong> (4). In MMSE we partially incorporate<br />
the in<str<strong>on</strong>g>for</str<strong>on</strong>g>mati<strong>on</strong> that by using the prior Gaussian<br />
distributi<strong>on</strong><br />
. In a similar way we can c<strong>on</strong>sider a<br />
Bayesian versi<strong>on</strong> of the proposed Gaussian tree approximati<strong>on</strong>.<br />
We can partially incorporate the in<str<strong>on</strong>g>for</str<strong>on</strong>g>mati<strong>on</strong> that<br />
by using the prior Gaussian distributi<strong>on</strong><br />
such<br />
that<br />
. This yields the posterior Gaussian<br />
distributi<strong>on</strong><br />
If<br />
is the root of the tree then the message is simply<br />
(27)<br />
(28)<br />
After the downward-upward message passing procedure is<br />
completed, we can compute the “belief” at each variable which<br />
is the product of all the messages sent to the variable from its<br />
parent and from its children (if there are any)<br />
In the case<br />
follows:<br />
(29)<br />
is the root node, the “belief” is computed as<br />
(30)<br />
Since the approximated distributi<strong>on</strong> (24) is cycle-free,<br />
the general Belief Propagati<strong>on</strong> theory guarantees that (the normalized)<br />
belief vector is exactly the marginal distributi<strong>on</strong><br />
of the approximated distributi<strong>on</strong> (24). To obtain<br />
a hard-decisi<strong>on</strong> decoding we choose the symbol whose posterior<br />
probability is maximal<br />
(31)<br />
Above we described the sum-product versi<strong>on</strong> of the BP algorithm<br />
that computes the marginal probabilities .A<br />
variant of the sum-product algorithm is the max-product algorithm<br />
in which the summati<strong>on</strong> in (25)–(28) is replaced by<br />
a maximizati<strong>on</strong> over all the symbols in . The max-product<br />
algorithm finds the most likely pattern of the approximati<strong>on</strong><br />
. We did not observe any significant per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance difference<br />
using either the sum-product or the max-product variants<br />
(32)<br />
such that<br />
and<br />
. The MMSE method is obtained by approximating<br />
the unc<strong>on</strong>strained posterior distributi<strong>on</strong><br />
by<br />
a product of marginals. In our approach we use, instead, the best<br />
cycle-free Gaussian approximati<strong>on</strong>. We can apply the Chow-Liu<br />
tree approximati<strong>on</strong> <strong>on</strong> the Gaussian distributi<strong>on</strong> (32) to obtain<br />
a “Bayesian” Gaussian tree approximati<strong>on</strong> <str<strong>on</strong>g>for</str<strong>on</strong>g> . In this<br />
way we partially use the finite-set c<strong>on</strong>straint when we search<br />
<str<strong>on</strong>g>for</str<strong>on</strong>g> the best tree approximati<strong>on</strong> of the true discrete distributi<strong>on</strong><br />
. This is likely to yield a better approximati<strong>on</strong> of the discrete<br />
distributi<strong>on</strong> than the tree distributi<strong>on</strong> which is based<br />
<strong>on</strong> the unc<strong>on</strong>strained distributi<strong>on</strong> . Note that if there<br />
are more transmit antennas than receive antennas, i.e., ,<br />
the linear system<br />
is underdetermined and there<str<strong>on</strong>g>for</str<strong>on</strong>g>e<br />
the ZF soluti<strong>on</strong> does not exist and we cannot apply the<br />
Gaussian tree approximati<strong>on</strong>. However, the MMSE soluti<strong>on</strong> and<br />
the MMSE versi<strong>on</strong> of the Gaussian tree approximati<strong>on</strong> are still<br />
valid.<br />
To summarize, our soluti<strong>on</strong> to the <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> decoding problem<br />
is based <strong>on</strong> applying BP <strong>on</strong> a discrete versi<strong>on</strong> of the Gaussian<br />
tree approximati<strong>on</strong> of the Bayesian versi<strong>on</strong> of the c<strong>on</strong>tinuous<br />
least-square soluti<strong>on</strong>. We dub this method “The Gaussian-Tree-<br />
Approximati<strong>on</strong> (GTA) Algorithm.” The GTA algorithm is summarized<br />
in Fig. 5. We next compute the complexity of the GTA<br />
algorithm. The complexity of computing the covariance matrix<br />
is (if we can utilize<br />
the matrix inversi<strong>on</strong> lemma <str<strong>on</strong>g>for</str<strong>on</strong>g> the matrix inversi<strong>on</strong>). The<br />
complexity of the Chow-Liu algorithm (based <strong>on</strong> Prim’s algorithm<br />
<str<strong>on</strong>g>for</str<strong>on</strong>g> finding the minimum spanning tree) is and the<br />
complexity of the BP algorithm is (regardless of the<br />
number of receive antennas).<br />
IV. EXPERIMENTAL RESULTS<br />
In this secti<strong>on</strong> we provide simulati<strong>on</strong> results <str<strong>on</strong>g>for</str<strong>on</strong>g> the GTA<br />
algorithm over various <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> systems. The channel matrix<br />
comprised i.i.d. elements drawn from a zero-mean complex
GOLDBERGER AND LESHEM: <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> DETECTION FOR HIGH-ORDER <str<strong>on</strong>g>QAM</str<strong>on</strong>g> BASED ON A GAUSSIAN TREE APPROXIMATION 4979<br />
Fig. 6. Comparis<strong>on</strong> of various detectors in 12 2 12 system, 16-<str<strong>on</strong>g>QAM</str<strong>on</strong>g> symbols.<br />
Fig. 5. The Gaussian Tree Approximati<strong>on</strong> (GTA) Algorithm.<br />
normal distributi<strong>on</strong> of unit variance. We used 500 000 realizati<strong>on</strong>s<br />
of the channel matrix and each matrix was used <strong>on</strong>ce <str<strong>on</strong>g>for</str<strong>on</strong>g><br />
sending a message. The per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance of the proposed algorithm<br />
is shown as a functi<strong>on</strong> of the variance of the additive noise .<br />
The signal-to-noise ratio (SNR) is defined as<br />
where ( is the number of variables, is the<br />
variance of the Gaussian additive noise, and is the mean<br />
symbol energy). We compared the per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance of the GTA<br />
method to the MMSE-SIC algorithm with optimal ordering<br />
<str<strong>on</strong>g>for</str<strong>on</strong>g> the successive interference cancelati<strong>on</strong> [2], and to the<br />
Schnorr-Euchner variant of sphere decoding (SD-SE) with<br />
infinite radius [3], [28]. We used sorting of the channel matrix<br />
using the SQRD algorithm [29] and regularizati<strong>on</strong> [30], which<br />
substantially reduces the computati<strong>on</strong>al complexity.<br />
We also implemented the SDR detector suggested by<br />
Sidiropoulos and Luo [8]. Recently it was shown [9] that <str<strong>on</strong>g>for</str<strong>on</strong>g><br />
the cases of 16-<str<strong>on</strong>g>QAM</str<strong>on</strong>g> and 64-<str<strong>on</strong>g>QAM</str<strong>on</strong>g> this SDR based method is<br />
equivalent to the SDR based detecti<strong>on</strong> method suggested by<br />
Wiesel, Eldar, and Shamai [7]. The SDRs were solved using<br />
the CSDP package [31]. In the SDR Gaussian randomizati<strong>on</strong><br />
step, 100 independent randomizati<strong>on</strong>s were implemented. All<br />
the <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> detecti<strong>on</strong> algorithms were implemented in C <str<strong>on</strong>g>for</str<strong>on</strong>g><br />
efficiency.<br />
Fig. 6 shows <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> detecti<strong>on</strong> per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance <str<strong>on</strong>g>for</str<strong>on</strong>g> a 12 12<br />
<str<strong>on</strong>g>MIMO</str<strong>on</strong>g> system using 16-<str<strong>on</strong>g>QAM</str<strong>on</strong>g>. The methods that are shown are<br />
MMSE-SIC, GTA, SDR, and sphere-decoding. In this case the<br />
SDR outper<str<strong>on</strong>g>for</str<strong>on</strong>g>ms both the MMSE-SIC and the GTA methods.<br />
However, in this case it is still feasible to compute the optimal<br />
maximum-likelihood algorithm using the sphere decoding<br />
algorithm.<br />
The SD-SE is the favorite detecti<strong>on</strong> method when the problem<br />
size is small or moderate. In this case the SD-SE can always<br />
yield the exact ML soluti<strong>on</strong> at acceptable computati<strong>on</strong>al cost.<br />
In large size problems, however, there is still a need <str<strong>on</strong>g>for</str<strong>on</strong>g> good<br />
approximati<strong>on</strong> methods. Fig. 7 shows the SER versus SNR and<br />
worst case executi<strong>on</strong> time versus SNR <str<strong>on</strong>g>for</str<strong>on</strong>g> the 12 12 system<br />
using 64-<str<strong>on</strong>g>QAM</str<strong>on</strong>g>. Fig. 8 shows the same experimental results <str<strong>on</strong>g>for</str<strong>on</strong>g><br />
the 16 16 system using 64-<str<strong>on</strong>g>QAM</str<strong>on</strong>g> (time was measured <strong>on</strong> a<br />
dual quad-core Intel Xe<strong>on</strong> 2.33 GHz). To assess the computati<strong>on</strong>al<br />
complexity we used a measure of the worst case rather<br />
than the average case since in <strong>on</strong>line applicati<strong>on</strong>s we have to<br />
decode within a specified time. The choice between executi<strong>on</strong><br />
time or number of floating point operati<strong>on</strong>s is debatable. The<br />
differences in running time between the methods we implemented<br />
was in orders of magnitude and running time is easier<br />
to appreciate.<br />
As can be seen from Figs. 7 and 8, the per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance of the<br />
GTA algorithm in high SNR is significantly better than the<br />
MMSE-SIC. The computati<strong>on</strong>al complexity of GTA is comparable<br />
to MMSE-SIC and it is much better than SDR. Note<br />
that the per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance of the SDR method [8] in these 64-<str<strong>on</strong>g>QAM</str<strong>on</strong>g><br />
cases is worse than that of MMSE-SIC and the computati<strong>on</strong>al<br />
complexity is much higher. From the derivati<strong>on</strong> of the SDR<br />
method it can be seen that the relaxati<strong>on</strong> becomes more crude<br />
as at higher c<strong>on</strong>stellati<strong>on</strong>s.<br />
We next show the per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance of several variants of the GTA<br />
algorithm. The GTA algorithm differs from the ZF, MMSE, and<br />
MMSE-SIC algorithms in several ways. The first difference is<br />
that the GTA algorithm utilizes a Markovian approximati<strong>on</strong> of<br />
instead of an approximati<strong>on</strong> based <strong>on</strong> a product of<br />
independent densities. The sec<strong>on</strong>d aspect is the use of an optimal<br />
tree. To clarify the c<strong>on</strong>tributi<strong>on</strong> of each comp<strong>on</strong>ent we modified<br />
the GTA algorithm by replacing the Chow-Liu optimal tree by<br />
the tree<br />
. We call this method the<br />
“Line-Tree.” As can be seen from Fig. 9, using the optimal tree<br />
is crucial to obtain improved results. Fig. 9 also shows the results<br />
of the n<strong>on</strong>-Bayesian variant of the GTA algorithm. As can be<br />
seen, the Bayesian versi<strong>on</strong> yields better results. Fig. 9 shows<br />
the symbol error rate (SER) versus SNR <str<strong>on</strong>g>for</str<strong>on</strong>g> a 20 20, ,<br />
real <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> system. The per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance of the GTA method and<br />
its variants was compared to the MMSE and the MMSE-SIC<br />
algorithms.
4980 <strong>IEEE</strong> TRANSACTIONS ON INFORMATION THEORY, VOL. 57, NO. 8, AUGUST 2011<br />
Fig. 7. 12 2 12 system, 64-<str<strong>on</strong>g>QAM</str<strong>on</strong>g> symbols. (a) SER versus SNR. (b) Max sec<strong>on</strong>ds <str<strong>on</strong>g>for</str<strong>on</strong>g> decoded symbol vector versus SNR. Note that the graphs of GTA and<br />
MMSE-SIC overlap.<br />
Fig. 8. 16 2 16 system, 64-<str<strong>on</strong>g>QAM</str<strong>on</strong>g> symbols. (a) SER versus SNR. (b) Max sec<strong>on</strong>ds <str<strong>on</strong>g>for</str<strong>on</strong>g> decoded symbol vector versus SNR.<br />
Fig. 9. Comparative results of MMSE, MMSE-SIC and variants of the GTA<br />
<str<strong>on</strong>g>for</str<strong>on</strong>g> a 20 2 20 real system, A = f61; 63g.<br />
V. CONCLUSION<br />
We proposed a novel <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> detecti<strong>on</strong> technique based <strong>on</strong><br />
the principle of a tree approximati<strong>on</strong> of the Gaussian distributi<strong>on</strong><br />
that corresp<strong>on</strong>ds to the c<strong>on</strong>tinuous linear problem. The<br />
proposed method outper<str<strong>on</strong>g>for</str<strong>on</strong>g>ms previously suggested <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> decoding<br />
algorithms in high-order <str<strong>on</strong>g>QAM</str<strong>on</strong>g> c<strong>on</strong>stellati<strong>on</strong>, as dem<strong>on</strong>strated<br />
in simulati<strong>on</strong>s. Finding the best tree <str<strong>on</strong>g>for</str<strong>on</strong>g> the integer c<strong>on</strong>strained<br />
linear problem is NP hard (unless in which<br />
the -node graph is a tree and there<str<strong>on</strong>g>for</str<strong>on</strong>g>e the Gaussian tree approximati<strong>on</strong><br />
is exact). Although the proposed method yields<br />
improved results, the tree approximati<strong>on</strong> we applied may not<br />
be the best <strong>on</strong>e. It is left <str<strong>on</strong>g>for</str<strong>on</strong>g> future research to search <str<strong>on</strong>g>for</str<strong>on</strong>g> a<br />
better discrete tree approximati<strong>on</strong> <str<strong>on</strong>g>for</str<strong>on</strong>g> the c<strong>on</strong>strained linear least<br />
squares problem. This paper dealt with a tree approximati<strong>on</strong> approach,<br />
more complicated approximati<strong>on</strong>s such as multiparent<br />
trees could improve per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance and can potentially provide a<br />
smooth per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance-complexity trade-off (see, e.g., [24], [32]<br />
<str<strong>on</strong>g>for</str<strong>on</strong>g> other methods that allow an arbitrary tradeoff of speed versus<br />
accuracy). Another aspect of the GTA algorithm that is worthwhile<br />
to be further investigated is the dependency of the per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance<br />
<strong>on</strong> the channel matrix probabilistic model. In the experiments,<br />
the channel matrix was sampled from a standard<br />
normal distributi<strong>on</strong>, leading to a rather dense system of equati<strong>on</strong>s.<br />
This is a standard simulati<strong>on</strong> modeling of a <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> communicati<strong>on</strong><br />
system. The per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance of the GTA algorithm,<br />
however, can be different <str<strong>on</strong>g>for</str<strong>on</strong>g> more sparse matrices and <str<strong>on</strong>g>for</str<strong>on</strong>g> binary<br />
matrices that occur in CDMA systems. We have shown
GOLDBERGER AND LESHEM: <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> DETECTION FOR HIGH-ORDER <str<strong>on</strong>g>QAM</str<strong>on</strong>g> BASED ON A GAUSSIAN TREE APPROXIMATION 4981<br />
that <str<strong>on</strong>g>for</str<strong>on</strong>g> high-order <str<strong>on</strong>g>QAM</str<strong>on</strong>g> the GTA method outper<str<strong>on</strong>g>for</str<strong>on</strong>g>ms previously<br />
suggested <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> decoding methods such as MMSE-SIC<br />
and SDR. As <str<strong>on</strong>g>for</str<strong>on</strong>g> potential future work, it would be interesting to<br />
compare our Gaussian tree approximati<strong>on</strong> BP algorithm to other<br />
BP Gaussian approximati<strong>on</strong> methods that are based <strong>on</strong> the factor<br />
graph representati<strong>on</strong> (11), e.g., the decoding algorithm proposed<br />
by Kabashima [22].<br />
While the method provides excellent per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance, it is<br />
worthwhile to menti<strong>on</strong> that the method provides estimates<br />
of the a posteriori probabilities <str<strong>on</strong>g>for</str<strong>on</strong>g> each variable that can<br />
be used to improve per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance. This is d<strong>on</strong>e by applying a<br />
Schnorr-Euchner sphere decoding algorithm [3], where we<br />
order the symbols according to their a posteriori probabilities.<br />
This can lead to very close to optimal per<str<strong>on</strong>g>for</str<strong>on</strong>g>mance with highly<br />
reduced complexity. This is due to the fact that by utilizing the<br />
a posteriori probabilities, we have a much higher probability<br />
of finding the true soluti<strong>on</strong> during the first search, there<str<strong>on</strong>g>for</str<strong>on</strong>g>e<br />
significantly reducing the search radius.<br />
There are several important applicati<strong>on</strong>s of the proposed technique.<br />
We comment here <strong>on</strong> combining it into communicati<strong>on</strong><br />
systems with coding and interleaving. It is useful both <str<strong>on</strong>g>for</str<strong>on</strong>g> single<br />
carrier and OFDM systems. It can serve as a <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> decoder <str<strong>on</strong>g>for</str<strong>on</strong>g><br />
wireless communicati<strong>on</strong> systems. Using the a posteriori probability<br />
distributi<strong>on</strong> of the symbols we can easily estimate the a<br />
posteriori probability and the likelihood ratio <str<strong>on</strong>g>for</str<strong>on</strong>g> the bits. The<br />
technique can be combined with <str<strong>on</strong>g>MIMO</str<strong>on</strong>g>-OFDM system with bit<br />
interleaved coded modulati<strong>on</strong> or with trellis coded modulati<strong>on</strong><br />
by joint coding over all frequency t<strong>on</strong>es of the OFDM system,<br />
and running the decoder <str<strong>on</strong>g>for</str<strong>on</strong>g> each t<strong>on</strong>e independently.<br />
In this paper we focused <strong>on</strong> the <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> detecti<strong>on</strong> problem.<br />
The proposed method, however, can be applied to solve more<br />
general c<strong>on</strong>strained linear least squares problems which is an<br />
important issue in many fields. A main c<strong>on</strong>cept in the GTA<br />
model is the interplay between discrete and Gaussian models.<br />
Such hybrid ideas can be c<strong>on</strong>sidered also <str<strong>on</strong>g>for</str<strong>on</strong>g> discrete inference<br />
problems other than least-squares. One example is the work<br />
of Opper and Winther [33] who applied an iterative algorithm<br />
using a model which is seen as discrete and Gaussian in turn to<br />
address Ising model inference problems. In their approach they<br />
approximated intractable probabilistic models by replacing an<br />
average over the original intractable distributi<strong>on</strong> with a tractable<br />
<strong>on</strong>e. Our method can be viewed as a special case of their tree expectati<strong>on</strong>-c<strong>on</strong>sistent<br />
approximati<strong>on</strong> algorithm [33], applied to<br />
the <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> detecti<strong>on</strong> problem.<br />
REFERENCES<br />
[1] U. Fincke and M. Pohst, “Improved methods <str<strong>on</strong>g>for</str<strong>on</strong>g> calculating vectors<br />
of short length in a lattice, including a complexity analysis,” Math.<br />
Comput., vol. 44, no. 170, pp. 463–471, 1985.<br />
[2] G. D. Golden, G. J. Foschini, R. A. Valenzuela, and P. W. Wolniansky,<br />
“<str<strong>on</strong>g>Detecti<strong>on</strong></str<strong>on</strong>g> algorithm and initial laboratory results using V-BLAST<br />
space-time communicati<strong>on</strong> architecture,” Electr<strong>on</strong>. Let., vol. 35, no. 1,<br />
pp. 14–16, 1999.<br />
[3] C. Schnorr and M. Euchner, “Lattice basis reducti<strong>on</strong>: Improved practical<br />
algorithms and solving subset sum problems,” J. Math. Program.,<br />
vol. 66, no. 1–3, pp. 181–199, 1994.<br />
[4] J. Boutros, N. Gresset, L. Brunel, and M. Fossorier, “Soft-input softoutput<br />
lattice sphere decoder <str<strong>on</strong>g>for</str<strong>on</strong>g> linear channels,” in Proc. <strong>IEEE</strong> Global<br />
Commun. C<strong>on</strong>f., Dec. 2003, vol. 3, pp. 1583–1587.<br />
[5] C. Studer, A. Burg, and H. Bölcskei, “Soft-output sphere decoding:<br />
Algorithms and VLSI implementati<strong>on</strong>,” <strong>IEEE</strong> J. Sel. Areas Commun.,<br />
vol. 26, no. 2, pp. 290–300, 2008.<br />
[6] J. Jalden and B. Ottersten, “On the complexity of sphere decoding in<br />
digital communicati<strong>on</strong>s,” <strong>IEEE</strong> Trans. Signal Process., vol. 53, no. 4,<br />
pp. 1474–1484, 2005.<br />
[7] A. Wiesel, Y. C. Eldar, and S. Shamai, “Semidefinite relaxati<strong>on</strong> <str<strong>on</strong>g>for</str<strong>on</strong>g><br />
detecti<strong>on</strong> of 16-<str<strong>on</strong>g>QAM</str<strong>on</strong>g> signaling in <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> channels,” <strong>IEEE</strong> Signal<br />
Process. Lett., vol. 12, no. 9, pp. 653–656, 2005.<br />
[8] N. Sidiropoulos and Z. Luo, “A semidefinite relaxati<strong>on</strong> approach to<br />
<str<strong>on</strong>g>MIMO</str<strong>on</strong>g> detecti<strong>on</strong> <str<strong>on</strong>g>for</str<strong>on</strong>g> high-order <str<strong>on</strong>g>QAM</str<strong>on</strong>g> c<strong>on</strong>stellati<strong>on</strong>s,” <strong>IEEE</strong> Signal<br />
Process. Lett., vol. 13, no. 9, pp. 525–528, 2006.<br />
[9] W. Ma, C. Su, J. Jalden, T. Chang, and C. Chi, “The equivalence of<br />
semidefinite relaxati<strong>on</strong> <str<strong>on</strong>g>MIMO</str<strong>on</strong>g> detectors <str<strong>on</strong>g>for</str<strong>on</strong>g> higher-order <str<strong>on</strong>g>QAM</str<strong>on</strong>g>,” <strong>IEEE</strong><br />
J. Sel. Topics Signal Process., vol. 3, no. 6, pp. 1038–1052, 2009.<br />
[10] O. Shental, N. Shental, S. Shamai (Shitz), I. Kanter, A. Weiss, and<br />
Y. Weiss, “Discrete-input two-dimensi<strong>on</strong>al Gaussian channels with<br />
memory: Estimati<strong>on</strong> and in<str<strong>on</strong>g>for</str<strong>on</strong>g>mati<strong>on</strong> rates via graphical models and<br />
statistical mechanics,” <strong>IEEE</strong> Trans. Inf. Theory, vol. 54, no. 4, pp.<br />
1500–1513, Apr. 2008.<br />
[11] J. S. Yedidia, W. T. Freeman, and Y. Weiss, “Understanding belief<br />
propagati<strong>on</strong> and its generalizati<strong>on</strong>s,” in Proc. Int. Joint C<strong>on</strong>f. Artif. Intell.,<br />
Aug. 2001, pp. 239–236.<br />
[12] E. Sudderth and W. Freeman, “Signal and image processing with belief<br />
propagati<strong>on</strong>,” <strong>IEEE</strong> Signal Process. Mag., vol. 25, no. 2, pp. 114–141,<br />
2008.<br />
[13] J. Pearl, Probabilistic Reas<strong>on</strong>ing in Intelligent Systems. San Mateo,<br />
CA: Morgan Kaufman, 1988.<br />
[14] Y. Weiss and W. Freeman, “Correctness of belief propagati<strong>on</strong> in<br />
Gaussian graphical models of arbitrary topology,” Neural Comput.,<br />
vol. 13, no. 10, pp. 2173–2200, 2001.<br />
[15] R. G. Gallager, Low Density Parity Check Codes. Cambridge, MA:<br />
MIT Press, 1963.<br />
[16] P. F. Felzenszwalb and D. P. Huttenlocher, “Efficient belief propagati<strong>on</strong><br />
<str<strong>on</strong>g>for</str<strong>on</strong>g> early visi<strong>on</strong>,” Int. J. Comput. Vis., vol. 70, no. 1, pp. 41–54, 2006.<br />
[17] K. Murphy, Y. Weiss, and M. Jordan, “Loopy belief propagati<strong>on</strong> <str<strong>on</strong>g>for</str<strong>on</strong>g><br />
approximate inference: An empirical study,” in Proc. Uncertainty Artif.<br />
Intell., Jul. 1999, pp. 467–475.<br />
[18] T. Minka and Y. Qi, “Tree-structured approximati<strong>on</strong>s by expectati<strong>on</strong><br />
propagati<strong>on</strong>,” in Proc. Neural Inf. Process. Syst., Vancouver, BC,<br />
Canada, Dec. 2004.<br />
[19] J. Goldberger and A. Leshem, “Pseudo prior belief propagati<strong>on</strong> <str<strong>on</strong>g>for</str<strong>on</strong>g><br />
densely c<strong>on</strong>nected discrete graphs,” in Proc. <strong>IEEE</strong> Inf. Theory Workshop,<br />
Jan. 2010, pp. 1–5.<br />
[20] F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, “Factor graphs and<br />
the sum-product algorithm,” <strong>IEEE</strong> Trans. Inf. Theory, vol. 47, no. 2,<br />
pp. 498–519, Feb. 2001.<br />
[21] M. Kaynak, T. Duman, and E. Kurtas, “Belief propagati<strong>on</strong> over<br />
SISO/<str<strong>on</strong>g>MIMO</str<strong>on</strong>g> frequency selective fading channels,” <strong>IEEE</strong> Trans.<br />
Wireless Commun, vol. 6, no. 6, pp. 2001–2005, Jun. 2007.<br />
[22] Y. Kabashima, “A CDMA multiuser detecti<strong>on</strong> algorithm <strong>on</strong> the basis<br />
of belief propagati<strong>on</strong>,” J. Phys. A, vol. 36, pp. 11 111–11 121, 2003.<br />
[23] J. Neirotti and D. Saad, “Improved message passing <str<strong>on</strong>g>for</str<strong>on</strong>g> inference in<br />
densely c<strong>on</strong>nected systems,” Europhys. Lett., vol. 71, pp. 866–872,<br />
2005.<br />
[24] J. Hu and T. M. Duman, “Graph-based detector <str<strong>on</strong>g>for</str<strong>on</strong>g> BLAST architecture,”<br />
in Proc. <strong>IEEE</strong> Int. C<strong>on</strong>f. Commun., Jun. 2007, pp. 1018–1023.<br />
[25] D. Bicks<strong>on</strong>, O. Shental, P. H. Siegel, J. K. Wolf, and D. Dolev,<br />
“Gaussian belief propagati<strong>on</strong> based multiuser detecti<strong>on</strong>,” in Proc.<br />
<strong>IEEE</strong> Int. Symp. Inf. Theory, Jul. 2008, pp. 1878–1882.<br />
[26] C. K. Chow and C. N. Liu, “Approximating discrete probability distributi<strong>on</strong>s<br />
with dependence trees,” <strong>IEEE</strong> Trans. Inf. Theory, vol. IT-14,<br />
no. 3, pp. 462–467, May 1968.<br />
[27] R. C. Prim, “Shortest c<strong>on</strong>necti<strong>on</strong> networks and some generalizati<strong>on</strong>s,”<br />
J. Bell Syst. Tech., pp. 1389–1401, 1957.<br />
[28] E. Agrell, T. Erikss<strong>on</strong>, A. Vardy, and K. Zeger, “Closest point search in<br />
lattices,” <strong>IEEE</strong> Trans. Inf. Theory, vol. 48, no. 8, pp. 2201–2214, Aug.<br />
2002.<br />
[29] D. Wübben, R. Böhnke, J. Rinas, V. Kühn, and K. Kammeyer, “Efficient<br />
algorithm <str<strong>on</strong>g>for</str<strong>on</strong>g> decoding layered space-time codes,” IEE Electr<strong>on</strong>.<br />
Lett., vol. 37, pp. 1348–1350, 2001.<br />
[30] D. Wübben, R. Böhnke, V. Kühn, and K. Kammeyer, “MMSE extensi<strong>on</strong><br />
of V-BLAST based <strong>on</strong> sorted QR decompositi<strong>on</strong>,” in Proc. <strong>IEEE</strong><br />
Veh. Technol. C<strong>on</strong>f., Oct. 2003, vol. 1, pp. 508–512.<br />
[31] B. Borchers, “A C library <str<strong>on</strong>g>for</str<strong>on</strong>g> semidefinite programming,” Optim.<br />
Methods Softw., vol. 11, pp. 613–623, 1999.<br />
[32] P. H. Tan and L. K. Rasmussen, “Multiuser detecti<strong>on</strong> in CDMA—A<br />
comparis<strong>on</strong> of relaxati<strong>on</strong>s, exact, and heuristic search methods,” <strong>IEEE</strong><br />
Trans. Wireless Commun., vol. 3, no. 5, pp. 1802–1809, Sep. 2004.<br />
[33] M. Opper and O. Winther, “Expectati<strong>on</strong> c<strong>on</strong>sistent approximate inference,”<br />
J. Mach. Learn. Res., vol. 6, no. 12, pp. 2177–2204, 2005.
4982 <strong>IEEE</strong> TRANSACTIONS ON INFORMATION THEORY, VOL. 57, NO. 8, AUGUST 2011<br />
Jacob Goldberger received the B.Sc. degree (cum laude) in mathematics and<br />
computer science from the Bar-Ilan University, Ramat Gan, Israel, the M.Sc.<br />
degree (cum laude) in mathematics, and the Ph.D. degree in Engineering, from<br />
the Tel-Aviv University, Tel-Aviv, Israel. In 2004 he joined the School of Electrical<br />
and Computer Engineering, Bar-Ilan University, Ramat Gan, Israel, where<br />
he is currently a faculty member. His main research interests include machine<br />
learning, in<str<strong>on</strong>g>for</str<strong>on</strong>g>mati<strong>on</strong> theory, computer visi<strong>on</strong>, medical imaging, and speech<br />
recogniti<strong>on</strong>.<br />
Amir Leshem (M’98-SM’06) received the B.Sc. degree (cum laude) in mathematics<br />
and physics, the M.Sc. degree (cum laude) in mathematics, and the Ph.D.<br />
degree in mathematics, all from the Hebrew University, Jerusalem, Israel. He<br />
is <strong>on</strong>e of the founders of the School of Electrical and Computer Engineering,<br />
Bar-Ilan University, Ramat Gan, Israel, where he is currently an Associate Professor<br />
and Head of the signal processing track. His main research interests include<br />
multichannel communicati<strong>on</strong>, applicati<strong>on</strong>s of game theory to communicati<strong>on</strong>,<br />
array and statistical signal processing with applicati<strong>on</strong>s to sensor arrays<br />
and networks, wireless communicati<strong>on</strong>s, radio-astr<strong>on</strong>omy, and brain research,<br />
set theory, logic, and foundati<strong>on</strong>s of mathematics.