
Transform coding techniques for lossy<br />

hyperspectral data compression<br />

Barbara Penna, Member, IEEE, Tammam Tillo, Member, IEEE,<br />

Enrico Magli, Member, IEEE, Gabriella Olmo, Member, IEEE<br />

Abstract<br />

Transform-based lossy compression has a huge potential for hyperspectral data reduction. Hyperspectral<br />

data are three-dimensional, and the nature of their correlation is different in each dimension.<br />

This calls for a careful design of the 3D transform to be used for compression.<br />

In this paper we investigate the transform design and rate allocation stage for lossy compression<br />

of hyperspectral data. Firstly, we select a set of 3D transforms, obtained by combining in various ways<br />

wavelets, wavelet packets, the discrete cosine transform, and the Karhunen-Loève transform (KLT),<br />

and evaluate the coding efficiency of these combinations. Secondly, we propose a low-complexity<br />

version of the KLT, in which complexity and performance can be balanced in a scalable way, allowing<br />

one to design the transform that better matches a specific application. Thirdly, we integrate this, as<br />

well as other existing transforms, in the framework of Part 2 of the JPEG 2000 standard, taking<br />

advantage of the high coding efficiency of JPEG 2000, and exploiting the interoperability of an<br />

international standard.<br />

We report experimental results on AVIRIS scenes. It is shown that the scheme based on the<br />

proposed low-complexity KLT significantly outperforms previous schemes as to rate-distortion performance.<br />

We also carry out some experiments in order to evaluate the effect of lossy compression on<br />

image classification using the spectral angle mapper method. It turns out that classification accuracy<br />

is reasonably correlated with the mean squared error; therefore, the proposed scheme exhibits considerably<br />

better performance than the state-of-the-art also in terms of this more application-related<br />

quality assessment.<br />

Index Terms<br />

Lossy compression; hyperspectral data; wavelets; wavelet packets; KLT; DCT; JPEG 2000;<br />

SPIHT 3D.<br />

The authors are with CERCOM (Center for Multimedia Radio Communications), Dip. di Elettronica, Politecnico<br />

di Torino, Corso Duca degli Abruzzi 24 - 10129 Torino - Italy - Ph.: +39-011-5644195 - FAX: +39-011-5644099<br />

- E-mail: barbara.penna(tammam.tillo,enrico.magli,gabriella.olmo)@polito.it. Corresponding<br />

author: Enrico Magli.


IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING (SUBMITTED DEC. 2005) 1<br />


I. INTRODUCTION<br />

Hyperspectral imaging amounts to collecting the energy reflected or emitted by ground targets at<br />

a typically very high number of wavelengths, resulting in a data cube containing tens to hundreds of<br />

bands. These data have become increasingly popular, since they enable plenty of new applications,<br />

including detection and identification of surface and atmospheric constituents, analysis of soil type,<br />

agriculture and forest monitoring, environmental studies and military surveillance. The data are usually<br />

acquired by a remote platform (a satellite or an aircraft), and then downlinked to a ground station.<br />

Due to the huge size of the datasets, compression is necessary to match the available transmission<br />

bandwidth.<br />

In the past, scientific data have been almost exclusively compressed by means of lossless methods,<br />

in order to preserve their full quality. However, more recently, there has been an increasing interest<br />

in their lossy compression. In fact, two of the most recent satellites, SPOT 4 and IKONOS, employ<br />

on-board lossy compression prior to downlinking the data to ground stations. As lossy compression<br />

allows for higher scene acquisition rates, several lossy algorithms have been designed for multispectral<br />

and hyperspectral images.<br />

Many of these techniques are based on decorrelating transforms, in order to exploit spatial and<br />

inter-band (i.e., spectral) correlation, followed by a quantization stage and an entropy coder. Examples<br />

include the JPEG 2000 standard [1], and set partitioning methods such as SPIHT and its variations<br />

(SPIHT-2D, SPIHT-3D, SPECK). Some authors have also proposed to employ the 3D Discrete Wavelet<br />

Transform (DWT) [2], [3], [4] and the 3D Discrete Cosine Transform (DCT) [5], [6].<br />

Moreover, several methods that treat spectral and spatial redundancy differently have been investigated.<br />

A popular approach involves the combination of a one-dimensional spectral decorrelator such<br />

as the Karhunen-Loève Transform (KLT), the DWT, or the DCT, followed by JPEG 2000 employed as<br />

spatial decorrelator, rate allocator, and entropy coder (see e.g. [7], [8]); SPIHT has also been used for<br />

the same purpose [9]. In [10] and [11] a 3D version of SPIHT and a low-complexity image encoder<br />

with similar features (Set Partitioned Embedded BloCK, SPECK) are proposed for hyperspectral<br />

image compression, exploiting the wavelet packet transform.<br />

Although there has been a large amount of work on 3D coders, the proposed techniques have<br />

often been tested under different conditions and using different datasets; this makes it difficult to evaluate



the best combination of spatial and spectral transforms for a given application. On a related note,<br />

regardless of the fact that the KLT is the optimal transform in the coding gain sense, its practical<br />

application has been somewhat limited because of its complexity and of the fact that the transform<br />

is signal-adaptive; however, a few recent works have rediscovered the KLT and attempted to exploit<br />

its superior decorrelation capabilities [12].<br />

In [13] vector quantization and spectral KLT are employed to exploit the correlation between<br />

multispectral bands. In [14], an efficient adaptive KLT algorithm for multispectral image compression<br />

is presented. The proposed technique exploits an adaptive algorithm to continuously adjust eigenvalues<br />

and eigenvectors when input image data are received sequentially.<br />

In [15], an integer implementation of the KLT followed by JPEG 2000 applied to each transformed<br />

component is presented. In [16] a hybrid 3D wavelet transform is used, employing JPEG 2000<br />

as spatial decorrelator, with full 3D post-compression rate-distortion optimization; this technique<br />

significantly outperforms state-of-the-art schemes.<br />

The highest-performance existing schemes take advantage of the high coding efficiency of the<br />

KLT as spectral decorrelator, and of JPEG 2000 as spatial decorrelator and entropy coder. However,<br />

a few issues are still open. Firstly, although the KLT is the optimal decorrelating transform, its<br />

complexity is very high, due to the need to estimate covariance matrices, solve eigenvector problems,<br />

and compute matrix-vector products. Secondly, most schemes employ JPEG 2000 separately on<br />

each band, allocating the same rate to each of them; this approach is obviously suboptimal, since the<br />

spectral transform unbalances the energy in different bands, and this can be exploited by differentiating<br />

the rate allocation.<br />

Although lossy compression allows for much higher compression ratios than lossless compression,<br />

it introduces degradation in the data. Therefore, as lossy compression has become more popular,<br />

researchers have started to investigate the quality issues associated with such information losses. Two<br />

important questions are 1) whether the metrics based on the mean-squared error (MSE) can adequately<br />

capture the effects of this degradation in typical remote sensing applications, and 2) whether other<br />

simple metrics exist, which are better than MSE at capturing these effects. Recently, a comprehensive<br />

investigation of this problem has been reported in [17], where the authors consider a set of quality<br />

metrics and a set of image degradations, including lossy compression. The sensitivity of each metric<br />

to each degradation is evaluated on hyperspectral data, and some general conclusions are drawn. It<br />

is shown that, taking e.g. spectral angle mapper (SAM) classification [18] as reference application,<br />

MSE turns out to be a reasonably good metric; however, more than one metric should be used if an<br />

accurate characterization of the degradation is required.<br />

This paper attempts to solve the transform coding problems outlined above, and builds on the



state-of-the-art by providing the following contributions. First of all, we report the results of a<br />

comprehensive experiment aimed at comparing the coding efficiency of several combinations of<br />

spatial and spectral transforms. The experiment is carried out in the framework of lossy compression<br />

of hyperspectral data, since these data have become very popular, and are more amenable to spectral<br />

decorrelation than multispectral data; in particular, a few AVIRIS scenes have been used to evaluate<br />

the selected transforms. We consider different transforms such as DCT, rectangular, square and hybrid<br />

wavelet and wavelet packet transforms [16], KLT, and various spatial/spectral combinations of these<br />

transforms; the evaluation procedure is designed so as to simulate global 3D rate allocation, as opposed<br />

to assigning the same rate to each transformed band. Secondly, we propose a new low-complexity<br />

version of the KLT, which provides performance similar to the full-featured KLT, with significantly<br />

lower computational complexity. Thirdly, we integrate the low-complexity KLT, as well as a few of<br />

the best existing transforms, into a practical compression scheme based on the JPEG 2000 standard.<br />

This choice allows us to define a compression scheme combining the flexibility and interoperability<br />

of an international standard with the high coding efficiency of JPEG 2000 and the KLT. In particular,<br />

the resulting scheme is compliant with the multicomponent transformation extension defined in Part<br />

2 of JPEG 2000 [19], and significantly outperforms the best existing lossy compression techniques.<br />

The comparison is carried out using MSE-based metrics, as well as investigating the effect of lossy<br />

compression on SAM classification.<br />

This paper is organized as follows. In Sect. II we analyze various decorrelating transforms. In<br />

Sect. III we outline a transform evaluation procedure and provide evaluation results on hyperspectral<br />

data. In Sect. IV we define the proposed KLT-based algorithm, whereas in Sect. V we report its<br />

performance evaluation. Finally, in Sect. VI we draw some conclusions.<br />

II. 3D TRANSFORM CODING STUDY<br />

Transform coding techniques are very attractive for image coding thanks to their good energy<br />

compaction characteristics. Since hyperspectral images exhibit both spatial and spectral redundancy,<br />

a 3D transform is a natural approach. To choose an efficient coding scheme, a comprehensive study<br />

with different transforms has been carried out. The transforms employed in the development of the<br />

proposed compression schemes are described in the following. Only separable extensions to multiple<br />

dimensions have been considered for simplicity.<br />

A. Karhunen-Loève Transform<br />

The KLT is the optimal block-based transform (in a statistical sense) for data compression, because<br />

it approximates a signal in the transform domain using the smallest number of coefficients, minimizing



the MSE between the reconstructed and original image. Defining the covariance matrix $C_X$ of a<br />

random column vector $X$ with mean value $\mu_X$ as<br />

$$C_X = E[(X - \mu_X)(X - \mu_X)^T] \qquad (1)$$<br />

the KLT transform matrix $V$ is obtained by aligning columnwise the eigenvectors of $C_X$. It can<br />

be shown that the transformed random vector $Y = V^T X$ has uncorrelated components, i.e.<br />

$C_Y = V^T C_X V$ is a diagonal matrix.<br />

Although the KLT is provably optimal, it has a few drawbacks. First, the transform matrix V is<br />

obtained as the solution of a numerically intensive eigenvector problem. Moreover, the transform is<br />

signal-adaptive; hence it has to be recomputed for each input vector, and it has to be transmitted<br />

along with the compressed data, thus causing a significant overhead.<br />

B. Discrete Cosine Transform<br />

The DCT is a popular technique for converting a signal into elementary frequency components. In<br />

particular, the DCT represents the input signal as a linear combination of weighted basis functions<br />

that are related to its frequency components. It is widely used in image compression, and is known<br />

to be close to optimal in terms of its energy compaction capabilities for Gauss-Markov processes;<br />

moreover, a number of fast algorithms have been developed to speed up the computation of this<br />

transform. A description of the DCT and its applications in signal coding can be found in several<br />

books, e.g. [20].<br />

C. Discrete Wavelet Transform<br />

The DWT [20] is widely used in many signal processing fields, thanks to its ability to accurately<br />

describe both small-scale and large-scale components of a signal. The DWT is based on the principle<br />

that efficient decorrelation can be achieved by splitting the data into two half-rate subsequences,<br />

carrying information respectively on the approximation and detail of the original signal, or equivalently<br />

on the low- and high-frequency half-bands of its spectrum. Since most of the signal energy of<br />

real-world signals is typically concentrated in the lowpass frequencies, this process splits the signal into a<br />

highly significant part and a less significant part, leading to good energy compaction. The procedure can<br />

be iterated on the lowpass subsequence by means of the filter bank configuration shown in Fig. 1.<br />
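The two-channel split and its iteration on the lowpass branch can be sketched as follows. This is only an illustration under stated assumptions: it uses the orthonormal Haar filter pair rather than the (9,7) filters adopted later in the paper, it requires an input whose length is divisible by 2^levels, and the function names are hypothetical.<br />

```python
import math

def haar_analysis(x):
    """Split x into half-rate lowpass (approximation) and highpass (detail) sequences."""
    s = math.sqrt(2.0)
    pairs = list(zip(x[0::2], x[1::2]))
    low = [(a + b) / s for a, b in pairs]
    high = [(a - b) / s for a, b in pairs]
    return low, high

def dwt(x, levels):
    """Iterate the two-channel split on the lowpass branch only.

    Returns [approx_L, detail_L, ..., detail_1].
    Assumes len(x) is divisible by 2**levels.
    """
    approx = [float(v) for v in x]
    details = []
    for _ in range(levels):
        approx, high = haar_analysis(approx)
        details.append(high)
    return [approx] + details[::-1]

def energy(band):
    """Sum of squared coefficients in one subband."""
    return sum(c * c for c in band)

signal = list(range(16))  # smooth ramp: most energy sits at low frequencies
bands = dwt(signal, levels=3)
fraction_in_approx = energy(bands[0]) / sum(energy(b) for b in bands)
```

For this smooth test signal, most of the energy ends up in the coarsest approximation band, which is the energy compaction effect described above.<br />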

D. Discrete Wavelet Packet Transform<br />

The discrete wavelet packet transform (DWPT) is a generalization of the DWT that offers a richer<br />

range of options for signal analysis. The DWPT is implemented by a filter bank in which also the<br />

highpass outputs are allowed to be further split into approximation and detail. If both the lowpass and<br />

highpass sequences are always split, the system is said to be a complete wavelet packet transform.<br />

However, it is not necessary for the transform to be complete; for any given input signal, there exists<br />

an optimal choice of highpass and lowpass iterations that captures most of the input signal correlation,<br />

which is known as best basis wavelet packet transform. Given an appropriate cost function, a search<br />

algorithm adaptively selects the best basis for a given signal.<br />

Fig. 1. Implementation of wavelet transforms by means of a filter bank scheme with lowpass and highpass filters denoted<br />

as $H_l(z)$ and $H_h(z)$ respectively. (a) DWT and (b) DWPT. The circles denote subsampling by a factor of two.<br />

Different cost functions can be employed, e.g. entropy, minimum distortion, minimum number of<br />

coefficients above a certain threshold [21], or rate-distortion optimization [22]. Our purpose is to<br />

select the best 3D transforms in terms of energy compaction; hence, the cost function in [22] is<br />

not suitable because it explicitly takes quantization into account, while our analysis aims at being<br />

independent of the specific quantization scheme employed. We have found that the coding gain,<br />

which is a performance measure of transform efficiency [20], is a very good and theoretically sound<br />

objective function for seeking the best decomposition tree. Assuming a DWPT with l decomposition



levels, the coding gain $G_{(l-1)\to l}$ is defined as<br />

$$G_{(l-1)\to l} = \frac{\sigma_{l-1}^2}{\left( \prod_{i=1}^{N} (\sigma_i^l)^2 \right)^{1/N}}$$<br />

where $\sigma_{l-1}$ is the standard deviation of the transformed coefficients in a subband at level $l-1$, and<br />

$\sigma_i^l$ are the standard deviations of the $i = 1, \ldots, N$ subbands stemming from a further decomposition;<br />

note that the denominator is the geometric mean of the subband variances. For example, $N$ is equal<br />

to two for a one-dimensional transform, and to four and eight for a 2D and 3D transform respectively.<br />

The energy distribution of the transformed subbands is supposed to be highly unbalanced, i.e. a small<br />

percentage of the subbands concentrates a high percentage of the total energy. The more unbalanced<br />

is the distribution, the lower is the geometric mean. If $G_{(l-1)\to l} > 1$ the decomposition at level $l$ is<br />

retained, otherwise the current subband at level $l-1$ is kept.<br />

This procedure has a very intuitive interpretation in terms of average distortion after reconstruction.<br />

Given a total coding rate, high-resolution quantization theory can be applied to the bit-allocation<br />

problem among the eight subbands at level $l$. This yields [23] that the optimal average distortion<br />

for a Gaussian source is $D = \gamma \left( \prod_{i=1}^{N} \sigma_i^2 \right)^{1/N} 2^{-2R}$, with $\gamma$ a given constant. Moreover, the distortion<br />

if the decomposition at level $l-1$ is retained is proportional to $\sigma_{l-1}^2$. Therefore, the coding gain is<br />

simply the ratio between the distortions respectively obtained by keeping the current representation<br />

and performing an additional decomposition step. As a consequence, under the high-rate and Gaussian<br />

assumptions, maximizing the coding gain amounts to picking the branch of the decomposition tree<br />

that minimizes the average distortion of the reconstructed data. It is worth noting that, since this<br />

rate-distortion model embeds the quantizer effect in the constant $\gamma$, using the coding gain allows<br />

one to obtain the optimal decomposition tree taking quantization into account, but without making<br />

any explicit assumption on the employed quantizer. This is very important, as we are interested in<br />

comparing only the transforms and not the quantizers, although in a practical compression algorithm<br />

the transform will have to be followed by a quantizer.<br />
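The split decision based on the coding gain can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names and the numeric variances are hypothetical, and all variances are assumed to be strictly positive.<br />

```python
import math

def coding_gain(parent_variance, child_variances):
    """Parent subband variance divided by the geometric mean of the child variances.

    Assumes every variance is strictly positive.
    """
    n = len(child_variances)
    # Geometric mean computed in the log domain for numerical robustness.
    log_geo_mean = sum(math.log(v) for v in child_variances) / n
    return parent_variance / math.exp(log_geo_mean)

def keep_decomposition(parent_variance, child_variances):
    """Retain the further decomposition only if the coding gain exceeds 1."""
    return coding_gain(parent_variance, child_variances) > 1.0

# Hypothetical 1D split (N = 2) with energy concentrated in one child subband.
unbalanced = coding_gain(1.0, [1.9, 0.1])
# Hypothetical split with balanced children, i.e. no energy compaction.
balanced = coding_gain(1.0, [1.0, 1.0])
```

In the unbalanced case the geometric mean drops well below the parent variance, so the gain exceeds 1 and the split is retained; in the balanced case the gain equals 1 and the parent subband is kept.<br />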

E. Multidimensional extensions of wavelet transforms<br />

When the DWT and DWPT have to be applied to a 3D data set, multidimensional extensions of<br />

either transform are required. In the following we will consider three possible extensions, which<br />

are referred to as square, rectangular and hybrid transforms. Our description refers to the DWT; the<br />

generalization to DWPT is straightforward.<br />

A square 2D transform is such that first one decomposition level is computed in all dimensions.<br />

Then, the (multidimensional) approximation subband is considered, and a new iteration is applied to<br />

it.



A rectangular 2D transform is such that first the complete 1D wavelet transform (i.e., all<br />

decomposition levels) is computed in one dimension, and then the complete transform is applied<br />

to the second dimension.<br />

In 3D, a square transform is obtained by first computing one decomposition level in all dimensions,<br />

and then iterating on the LLL cube. Conversely, the rectangular transform is obtained by first applying<br />

the complete transform along the first dimension, then along the second one, and finally along the<br />

third one.<br />

In 3D, hybrid transforms can also be obtained as in [16] by first applying the complete transform<br />

in one dimension, and then taking a 2D square transform in the other two dimensions. The obtained<br />

transform is referred to as 3D hybrid rectangular/square DWT.<br />
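One way to compare the three 3D extensions is to count the subbands each one produces. The sketch below assumes the same number of dyadic decomposition levels along every dimension, which the text does not require; the function name is hypothetical.<br />

```python
def subband_counts(levels):
    """Number of subbands produced by each 3D DWT extension.

    Assumes `levels` dyadic decomposition levels along every dimension.
    """
    # Square: each level splits the current LLL cube into 8 sub-cubes,
    # 7 of which are final and 1 (LLL) is split again at the next level.
    square = 7 * levels + 1
    # Rectangular: a complete 1D transform yields (levels + 1) bands per axis.
    rectangular = (levels + 1) ** 3
    # Hybrid: complete 1D transform on one axis, 2D square transform
    # (3 * levels + 1 subbands) on the other two.
    hybrid = (levels + 1) * (3 * levels + 1)
    return {"square": square, "rectangular": rectangular, "hybrid": hybrid}

counts = subband_counts(3)
```

With three levels, the hybrid transform produces more subbands than the square one and fewer than the fully rectangular one.<br />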

F. 3D transforms selected for evaluation<br />

The previously described one-dimensional transforms have been combined in various ways to obtain<br />

3D transforms for hyperspectral data. The most relevant combinations are reported in the following.<br />

As for filter selection in the DWT and DWPT, the (9,7) biorthogonal wavelet filter pair has been<br />

used throughout this work; this filter is known to provide excellent compression performance, and<br />

has been selected for inclusion in the JPEG 2000 standard.<br />

1) 3D square DWT: This method is based on the wavelet transform applied in all three dimensions<br />

simultaneously. In particular, one level of wavelet decomposition is applied along each of the three<br />

dimensions. This procedure is repeated on the obtained LLL cube, as opposed to the rectangular<br />

transform described in Sect. II-F.3. As an example, Fig. 2 shows in a pictorial way the subbands<br />

obtained by performing three levels of 3D-decomposition on a data cube as described above. The<br />

obtained decomposition has cubic subbands in 3D.<br />

Fig. 2. Subbands obtained by performing three levels of 3D square DWT on a data cube.<br />



2) 3D square DWPT: In the 3D square DWPT, one level of wavelet packet decomposition is applied<br />

along the three dimensions; then, all the sub-cubes obtained by the wavelet packet decomposition may<br />

be further split so as to minimize an appropriate cost function. This procedure is repeated iteratively<br />

on each obtained cube for a given number of decomposition levels.<br />

3) Hybrid rectangular/square 3D transform: As hyperspectral data carry a lot of information in the<br />

spectral dimension, it is interesting to think of transforms that operate differently in the spectral and<br />

spatial directions, in order to match the different nature of those correlations. One way to obtain such a<br />

multidimensional transform is the 3D hybrid rectangular/square wavelet transform. Fig. 3 depicts<br />

a hybrid 3D wavelet transform obtained with three decomposition levels as described. As can be<br />

seen, the subbands generated by this transform are rectangles in 2D and parallelepipeds in 3D. The<br />

number of subbands is higher than that of the classical square transform; in particular the frequency<br />

partitioning is such that high horizontal and low vertical frequencies lie in subbands where the basis<br />

functions are short and long in the horizontal and vertical dimensions respectively. The obtained<br />

frequency tessellation is finer, and has more radial symmetry than the square transform.<br />

Fig. 3. Hybrid rectangular/square 3D DWT subband decomposition of a data cube.<br />

4) Hybrid spectral wavelet packet and spatial square 2D wavelet transform: In this method a<br />

DWPT is first applied in the spectral dimension, while a 2D square DWT is applied in the spatial<br />

dimension. In other words, this is equivalent to the hybrid rectangular/square 3D wavelet, in which<br />

the spectral DWT is replaced by a spectral DWPT. The cost function for DWPT is minimized in<br />

the third dimension considering the whole cube obtained by the 1D packet decomposition, and not<br />

the single 1D vectors. In fact, a separate optimization for each single vector could yield different<br />

bases. In this case, the transformed data cube would present discontinuities, which would penalize the<br />

performance of the next spatial 2D DWT stage. Another advantage is the reduced overhead, since a



single spectral decomposition tree has to be transmitted, instead of a separate tree for each spectral<br />

vector.<br />

Best basis selection using the coding gain yields the decomposition represented in Fig. 4.<br />

Consistently with the notion that spectral vectors have a significant information content, the obtained<br />

decomposition is finer than the classical dyadic wavelet tree in the high-frequency portion of the<br />

spectrum, and almost resembles a Fourier transform.<br />

Fig. 4. Best basis for the spectral DWPT (decomposition tree with lowpass and highpass branches labeled L and H).<br />

5) Hybrid spectral wavelet packet transform and spatial wavelet packet transform: In this method<br />

a wavelet packet decomposition is applied separately in the spectral and spatial dimensions. The cost<br />

function is minimized in the third dimension considering the cube obtained by the 1D DWPT; then,<br />

a 2D DWPT is evaluated on each single band.<br />

6) Hybrid spectral discrete wavelet transform and spatial wavelet packet transform: In this method<br />

a DWT is applied in the spectral dimension, while a 2D DWPT follows in the spatial dimension.<br />

7) Spectral discrete cosine transform and spatial discrete wavelet transform: This method applies<br />

a one-dimensional DCT to each spectral vector, and a 2D square DWT on each single band<br />

of the obtained transformed cube.<br />

8) Spectral KLT and spatial DWT: This method applies the KLT in the spectral dimension followed<br />

by the 2D square DWT along the spatial dimensions. In order to evaluate the transform matrix which<br />

optimally decorrelates the spectral dimension, we estimate the covariance matrix of the hyperspectral<br />

data cube assuming that each spectral vector, containing the radiance of a pixel at a given spatial<br />

location in all the bands, is a realization of the random process that has to be decorrelated.



In particular, given the hyperspectral data cube, i.e. B bands containing M lines and N samples,<br />

we form the $M \times N$ column vectors $X_{ij} = [x_{ij}^1, x_{ij}^2, \ldots, x_{ij}^B]^T$, for $i = 1, \ldots, M$ and $j = 1, \ldots, N$,<br />

where $x_{ij}^k$ is the pixel with spatial coordinates $(i, j)$ in band $k$. We employ the sample mean vector<br />

$M_x = [m_1^x, m_2^x, \ldots, m_B^x]^T$, with $m_k^x = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} x_{ij}^k$, as estimate of the ensemble averages<br />

of each band.<br />

For each spectral vector, we estimate its covariance matrix using one single realization, i.e.<br />

C X,i,j = (X ij − M x ) T (X ij − M x ). Finally, we compute the average covariance matrix C X =<br />

1<br />

∑ M<br />

∑ N<br />

MN i=1 j=1 C X,i,j.<br />

We solve the eigenvector problem <strong>for</strong> the symmetric matrix C X , obtaining the eigenvalues λ i and<br />

eigenvectors u i that satisfy C X u i = λ i u i . The KLT kernel is a unitary matrix V , whose columns<br />

are the eigenvectors u i arranged in descending order of eigenvalue magnitude. This matrix is used to<br />

trans<strong>for</strong>m each spectral vector, after subtracting its mean value, as Y ij = V T (X ij − M x ).<br />
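The steps above can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the authors' implementation: the function names `spectral_klt` and `inverse_spectral_klt` and the (M, N, B) array layout are assumptions, and the per-pixel covariance is taken as the outer product of the mean-removed spectral vector, so that the averaged matrix is B × B.<br />

```python
import numpy as np

def spectral_klt(cube):
    """Spectral KLT of a cube with shape (M, N, B) (assumed layout).

    Returns the coefficients Y (same shape), the kernel V (B x B),
    the mean vector M_x (B,) and the eigenvalues in descending order.
    """
    M, N, B = cube.shape
    X = cube.reshape(M * N, B).astype(np.float64)  # one spectral vector per row
    mean = X.mean(axis=0)                          # sample mean of each band
    Xc = X - mean
    C = Xc.T @ Xc / (M * N)                        # average of the MN outer products
    eigvals, eigvecs = np.linalg.eigh(C)           # symmetric input, ascending order
    order = np.argsort(eigvals)[::-1]              # reorder to descending eigenvalues
    V = eigvecs[:, order]
    Y = Xc @ V                                     # row-wise form of Y_ij = V^T (X_ij - M_x)
    return Y.reshape(M, N, B), V, mean, eigvals[order]

def inverse_spectral_klt(Y, V, mean):
    """Invert the transform: X_ij = V Y_ij + M_x (V is orthonormal)."""
    M, N, B = Y.shape
    X = Y.reshape(M * N, B) @ V.T + mean
    return X.reshape(M, N, B)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for a data cube: 8 bands built from 3 latent sources.
    sources = rng.normal(size=(16 * 16, 3))
    mixing = rng.normal(size=(3, 8))
    cube = (sources @ mixing + 0.01 * rng.normal(size=(256, 8))).reshape(16, 16, 8)
    Y, V, mean, eigvals = spectral_klt(cube)
    print("eigenvalues:", np.round(eigvals, 4))
```

On the synthetic cube most of the energy ends up in the first three components, which is the behaviour the KLT is meant to provide on strongly correlated bands.<br />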

The complexity of the decorrelation transform is the sum of three contributions. The first one is the<br />

evaluation of the covariance matrix ($O(B^2 MN)$); the second one is the solution of the eigenvector<br />

problem ($O(B^3)$, [24]); the third one is the computation of the transform coefficients ($O(B^2 MN)$). It<br />

can be observed that M, N and B are of the same order of magnitude, hence the second term is<br />

negligible with respect to the first and third. The overall computational complexity is very high,<br />

and has so far limited the use of the KLT in practical applications.<br />
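To give a feeling for the three terms, the following back-of-the-envelope sketch plugs in the cropped scene size used in the evaluation (M = N = 256, B = 224). The counts ignore all constant factors, so they indicate orders of magnitude rather than exact operation counts; the function name `klt_cost_terms` is hypothetical.<br />

```python
def klt_cost_terms(M, N, B):
    """Leading-order cost of the three KLT stages, constants ignored."""
    return {
        "covariance": B**2 * M * N,   # O(B^2 MN)
        "eigenproblem": B**3,         # O(B^3)
        "transform": B**2 * M * N,    # O(B^2 MN)
    }

terms = klt_cost_terms(M=256, N=256, B=224)
for name, ops in terms.items():
    print(f"{name:>12}: {ops:.3e}")

# The eigenproblem term is smaller than the other two by a factor MN / B.
print("ratio:", terms["covariance"] / terms["eigenproblem"])
```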

III. RESULTS OF TRANSFORM EVALUATION<br />

A. Evaluation procedure<br />

All the transforms described in Sect. II-F have been compared in terms of their energy compaction<br />

capability, which is a measure of the fraction of signal energy contained in a given number of transform<br />

coefficients. It is a very important property in image compression, since it provides an estimate of<br />

the effect of quantization in a practical coding scheme. The energy compaction property is evaluated<br />

for each transform by performing the following steps:<br />

• computing the 3D transform on a few hyperspectral scenes;<br />

• zeroing out a given percentage of transform coefficients, taken as those with the smallest magnitude;<br />

• computing the inverse 3D transform;<br />

• computing the peak signal-to-noise ratio (PSNR) with respect to the original image.<br />

Of particular importance is the fact that the coefficients to be zeroed out are selected from<br />

the complete 3D set of transform coefficients, and not on a band-by-band basis. This is<br />

akin to performing a 3D rate-distortion optimization, which is known to provide significantly better<br />

results than band-by-band optimization [16].<br />
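The evaluation procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the experiments: the helper names are hypothetical, the transform is passed in as a forward/inverse pair of callables, and the demo uses an orthonormal DCT along the spectral axis on synthetic data purely as a stand-in for the 3D transforms compared in the paper.<br />

```python
import numpy as np

def zero_smallest(coeffs, fraction):
    """Zero the given fraction of coefficients with the smallest magnitude,
    selected over the whole 3D array rather than band by band."""
    flat = coeffs.ravel().copy()
    n_zero = int(round(fraction * flat.size))
    if n_zero > 0:
        idx = np.argpartition(np.abs(flat), n_zero - 1)[:n_zero]
        flat[idx] = 0.0
    return flat.reshape(coeffs.shape)

def psnr(original, reconstructed, peak):
    """Peak signal-to-noise ratio in dB."""
    mse = np.mean((original.astype(np.float64) - reconstructed) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def energy_compaction_psnr(cube, forward, inverse, fraction, peak):
    """Transform, discard the smallest coefficients, invert, measure PSNR."""
    kept = zero_smallest(forward(cube), fraction)
    return psnr(cube, inverse(kept), peak)

def dct_matrix(n):
    """Orthonormal DCT-II matrix, used here only as a stand-in spectral transform."""
    k = np.arange(n)[:, None]
    t = np.arange(n)[None, :]
    D = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * t + 1) * k / (2 * n))
    D[0, :] /= np.sqrt(2.0)
    return D

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cube = np.cumsum(rng.normal(size=(8, 8, 16)), axis=2)  # smooth along the bands
    D = dct_matrix(16)
    forward = lambda x: x @ D.T   # DCT of every spectral vector
    inverse = lambda y: y @ D     # D is orthonormal, so its inverse is D^T
    for frac in (0.5, 0.9):
        value = energy_compaction_psnr(cube, forward, inverse, frac,
                                       peak=np.abs(cube).max())
        print(f"{100 * frac:.0f}% zeroed: {value:.2f} dB")
```

Because the threshold is applied to the flattened array, a band with many large coefficients keeps more of them than a band with few, which is the 3D selection the text describes.<br />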

The performance evaluation is carried out on 16-bit radiance AVIRIS data cubes. AVIRIS scenes<br />

have 224 bands and a resolution of 614 × 512 pixels, but each scene has been cropped to 256 × 256 × 224<br />

pixels. Scene 4 of Cuprite and scene 3 of Jasper Ridge have been employed; for brevity, we only<br />

report the set of results for Cuprite.<br />

B. Energy compaction results<br />

For clarity, in Tab. I we summarize the acronyms used in the figures to identify the eight 3D<br />

transforms that have been evaluated.<br />

TABLE I<br />

ACRONYMS OF THE EVALUATED TRANSFORMS<br />

Acronym | Sect.<br />

DWT3D | II.E.1<br />

DWP3D | II.E.2<br />

DWT1D2D | II.E.3<br />

DWP1D-DWT2D | II.E.4<br />

DWP1D-DWP2D | II.E.5<br />

DWT1D-DWP2D | II.E.6<br />

DCT1D-DWT2D | II.E.7<br />

KLT1D-DWT2D | II.E.8<br />

We anticipate that, not surprisingly, the spectral correlation plays a crucial role<br />

for compression, since the transforms that are better able to capture this correlation are those that<br />

rank best for compression. It is already known for lossless compression (see e.g. [25]) that large<br />

bit-rate reductions can be achieved by employing an efficient model of the spectral correlation. As<br />

will be seen, exploiting this correlation in the lossy case calls for the use of separate spectral and<br />

spatial transforms.<br />

In Fig. 5 we compare the DWT3D and DWP3D transforms. Neither transform is computed<br />

separately in the spectral dimension. As can be seen, the rate-distortion curve of the DWP3D transform<br />

is significantly better than that of the DWT3D. This is due to the fact that the 3D square wavelet<br />

transform is isotropic in all three dimensions. This may not be the most appropriate correlation model<br />

of a hyperspectral dataset, since the subband decomposition in the spectral dimension is not as fine as<br />

it could be. Hence, because of the rather rough tessellation of 3D frequency space, the DWT3D turns<br />

out to have poor performance as to energy compaction of spectral vectors. The DWP3D transform<br />

performs much better, since its ability to adaptively select the frequency tessellation allows it to<br />

refine the signal description along the spectral dimension, and hence to exploit the spectral<br />

correlation much better.<br />

The obtained decomposition tree is depicted in Fig. 6. It is possible to observe that both low and high<br />

spatial frequency components have been finely decomposed in the low spectral frequency components.<br />

Moreover, almost all the low spatial frequencies (along both the pixels and lines direction) have been<br />

finely decomposed. On the other hand, none of the high frequency components along the three dimensions<br />

has been further decomposed, as in the case of the 3D square DWT.<br />

Fig. 5. Performance comparison of different transforms: DWT3D vs DWP3D. Axes: PSNR (dB) versus % of transform coefficients set to zero.<br />

Fig. 7 compares the performance of wavelet and wavelet packet transforms computed separately in<br />

the spectral direction. Namely, we first compute a full 1D DWT or DWPT in the spectral direction,<br />

followed by a square 2D DWT or DWPT. The following remarks can be made. Spatial decorrelation is<br />

performed more effectively by the DWT than by the DWPT. This is somewhat counterintuitive, since<br />

one would expect the optimized decomposition to provide better results. The explanation is that in our<br />

evaluation procedure we zero out the least significant coefficients taken from the complete 3D set of<br />

transform coefficients, whereas the DWPT is optimized separately in the spectral and spatial<br />

directions. Thus, our procedure closely simulates a 3D rate allocation, which would work better with<br />

a three-dimensionally optimized transform. As an example, the 2D DWPT does not decompose the<br />

bands in the water absorption region because they contain little information; however, those bands<br />

contain many high-valued coefficients that have to be retained, at the expense of other coefficients<br />

that are discarded. Therefore, the mismatch between the 3D coefficient selection and the separate<br />

optimization of the DWPT makes it useless, and even disadvantageous, to perform best basis selection.<br />

Fig. 6. Sub-cubes obtained by the 3D wavelet packet decomposition minimizing the cost function based on the coding gain (axes: lines, pixels, bands; the LLL sub-cube is marked).<br />

Comparing Fig. 7 and Fig. 5, it can be seen that the performance of the DWT1D2D transform is<br />

very similar to that of the DWP3D transform, but with a significantly reduced computational effort,<br />

since it is not necessary to compute the optimal basis. In fact, this transform has been selected in<br />

[16] for its favorable trade-off between performance and complexity.<br />

Fig. 7. Performance comparison of different transforms, with a separable transform in the spectral direction. Curves: DWT1D2D, DWP1D-DWT2D, DWT1D-DWP2D, DWP1D2D; axes: PSNR (dB) versus % of transformed coefficients set to zero.<br />

Continuing the study of spectrally separable transforms, Fig. 8 compares the DWT1D2D, DCT1D-DWT2D,<br />

and KLT1D-DWT2D transforms. Following the results described above, these transforms<br />

have been selected in order to compare different spectral decorrelators, using the 2D DWT for spatial<br />

decorrelation because of its effectiveness. Not surprisingly, the KLT turns out to be the best transform;<br />

since we are employing the same transform for all spectral vectors, the overhead of describing the<br />

transform matrix in the compressed file is negligible. The performance gain of the KLT1D-DWT2D<br />

with respect to the DWT1D2D is about 2 dB at high quality levels, and significantly more at low bit-rates.<br />

However, as outlined in Sect. II-F.8, the KLT1D-DWT2D requires the estimation (and averaging)<br />

of as many covariance matrices as samples per band, followed by the solution of the eigenvector<br />

problem. As expected, the DCT performs almost always worse than the DWT, except for a range in<br />

the very low bit-rate region.<br />

In summary, this analysis shows that the schemes with highest performance are based on hybrid<br />

rectangular/square transforms. Among these schemes, the 2D DWT should be preferred for spatial<br />

decorrelation. As far as spectral decorrelation is concerned, the 1D DWT provides good performance<br />

with limited complexity. The KLT achieves significantly better performance at all bit-rates, and<br />

especially in the low bit-rate region. It should be noticed that this KLT employs a single transform<br />

matrix for all spectral vectors. This approach is effective because, along the spectral dimension, the<br />

signal depends almost exclusively on pixel land cover; since only a few land covers are typically<br />

present in an image, the KLT works better than the other transforms. In the spatial dimension the<br />

signal depends on the scene geometry, which is less predictable, with many discontinuities near region<br />

boundaries. In this case, due to the high degree of nonstationarity, the single KLT becomes far from<br />

optimal, and the DWT works better; the use of the optimal KLT would require the solution of multiple<br />

eigenvector problems, which is not realistic in a practical scenario.<br />

IV. PROPOSED LOW-COMPLEXITY KLT<br />

As has been seen, the spectral KLT can provide performance gains in excess of 2 dB with respect<br />

to the wavelet transform using a single “average” covariance matrix. However, although this is a<br />

somewhat simplified version of the transform, because it assumes that the spectral vectors are samples<br />

of a stationary signal, its complexity is still high for real-time applications, for the reasons outlined<br />

in Sect. II-F.8. In the following we propose a low-complexity version of the KLT that alleviates<br />

this problem with virtually no performance loss with respect to the full-complexity transform. In<br />

particular, in Sect. IV-A we define the low-complexity one-dimensional KLT; in Sect. IV-B we define<br />

our proposed 3D transform based on the low-complexity KLT, and evaluate its energy compaction<br />

capability following the same procedure used with the other transforms; in Sect. IV-C we provide a<br />

brief overview of JPEG 2000, and in Sect. IV-D we describe the integration of the proposed transform<br />

within Part 2 of JPEG 2000 [26].<br />

Fig. 8. Performance comparison of different transforms, with a separable transform in the spectral direction: KLT, DWT and DCT. Curves: KLT1D-DWT2D, DWT1D2D, DCT1D-DWT2D; axes: PSNR (dB) versus % of transformed coefficients set to zero.<br />

A. One-dimensional transform<br />

The KLT applies principal components analysis to the spectral dimension evaluating the average<br />

correlation matrix over all spectral vectors. For an AVIRIS scene, this amounts to computing and<br />

averaging over 300000 such matrices. To simplify this process, we note that convergence of the<br />

estimation process may be achieved using fewer matrices.<br />

Using the notation defined in Sect. II-F.8, in the proposed low-complexity transform the<br />

processing is carried out not on the complete set of spectral vectors, but rather on a subset of vectors<br />

selected at random. Hence, the sample mean vector is defined as $M'_x = [m'_{x_1}, m'_{x_2}, \ldots, m'_{x_B}]$, where<br />

$m'_{x_k} = \frac{1}{M'N'} \sum_{i \in I} \sum_{j \in J} x^k_{ij}$, and I and J are sets containing respectively M′ and N′ different<br />

indexes picked at random in the intervals [1, M] and [1, N], with $M' \le M$ and $N' \le N$. This<br />

process is also depicted in Fig. 9, where the different sets of spectral vectors are highlighted.<br />



Fig. 9. Computation of the covariance matrix for the full-complexity and low-complexity KLT. The diagram shows the KLT transform matrix applied to the correlated components to give the decorrelated components, and marks the spectral vectors employed in the evaluation of the covariance matrix in the low-complexity and in the full-complexity method.<br />

The covariance matrix is obtained as $C'_X = \frac{1}{M'N'} \sum_{i \in I} \sum_{j \in J} (X_{ij} - M'_x)^T (X_{ij} - M'_x)$. It is<br />

used to form the eigenvector problem $C'_X u'_i = \lambda'_i u'_i$, where $u'_i$ are the eigenvectors associated with the<br />

eigenvalues $\lambda'_i$. Aligning the eigenvectors columnwise we obtain the low-complexity KLT matrix $V'$.<br />

The transformed vector is computed as $Y_{ij} = (V')^T (X_{ij} - M'_x)$. We also denote as $\rho = \frac{M'N'}{MN}$ the<br />

fraction of spectral vectors employed to evaluate the covariance matrix. Obviously, the smaller<br />

ρ is, the lower the complexity of this KLT.<br />
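A minimal NumPy sketch of this subsampled estimate is given below. It is an illustration under stated assumptions rather than the authors' code: the function name `low_complexity_klt`, the `seed` parameter and the (M, N, B) array layout are hypothetical, and the covariance is computed as an average of outer products so that it is a B × B matrix.<br />

```python
import numpy as np

def low_complexity_klt(cube, m_sub, n_sub, seed=0):
    """Estimate the KLT kernel from the spectral vectors at M' x N' random
    line/sample indexes, then transform every spectral vector of the cube."""
    M, N, B = cube.shape                            # assumed (lines, samples, bands)
    rng = np.random.default_rng(seed)
    I = rng.choice(M, size=m_sub, replace=False)    # M' distinct line indexes
    J = rng.choice(N, size=n_sub, replace=False)    # N' distinct sample indexes
    subset = cube[np.ix_(I, J)].reshape(-1, B).astype(np.float64)
    mean = subset.mean(axis=0)                      # M'_x
    Sc = subset - mean
    C = Sc.T @ Sc / (m_sub * n_sub)                 # C'_X, a B x B matrix
    eigvals, eigvecs = np.linalg.eigh(C)
    V = eigvecs[:, np.argsort(eigvals)[::-1]]       # V', columns by descending eigenvalue
    Y = (cube.reshape(-1, B) - mean) @ V            # Y_ij = (V')^T (X_ij - M'_x)
    rho = (m_sub * n_sub) / (M * N)
    return Y.reshape(M, N, B), V, mean, rho

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cube = rng.normal(size=(64, 64, 8)) @ rng.normal(size=(8, 8))
    Y, V, mean, rho = low_complexity_klt(cube, m_sub=16, n_sub=16)
    print("rho =", rho)
```

Only the covariance estimate uses the subset; the final matrix product still runs over all MN spectral vectors, so the saving is confined to the first stage.<br />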

The complexity of the first stage of the new transform, i.e. the evaluation of the covariance matrix,<br />

becomes $O(\rho B^2 MN)$, i.e. it is scaled by a factor ρ, as the number of covariance matrices to be<br />

computed decreases linearly with ρ; the other two terms remain unchanged. Therefore, the proposed<br />

scheme is able to significantly reduce the complexity of the first stage, while no advantage is achieved<br />

in the third stage, because the resulting transform matrix does not exhibit any specific structure that<br />

can be exploited to reduce its complexity. Some numerical results on the complexity of the complete<br />

JPEG 2000 based algorithm are given in Sect. V-A.<br />

Fig. 10 shows an example of the covariance matrix of all the spectral vectors. This matrix is clearly<br />

symmetric, and is depicted as an image with 256 grey levels. The tone is proportional to the absolute<br />

value of the correlation. The elements of the main diagonal have very high values, as do a large<br />

number of their neighbors near the main diagonal, since a high degree of inter-band correlation is present. It<br />

can be seen that some elements of the matrix are close to zero, because some pairs of bands are poorly<br />

correlated. This behavior is particularly evident for the bands around 160, characterized by the water<br />

absorption region.<br />

Fig. 10. Example of correlation matrix of the spectral vectors, depicted as an image with 256 grey levels (both axes: band index). The tone is<br />

proportional to the absolute value of the correlation. The elements of the main diagonal have very high values, as also a<br />

large number of the main diagonal neighbors do, since a high degree of inter-band correlation is present.<br />

B. 3D transform evaluation: spectral low-complexity KLT and spatial 2D DWT<br />

In order to obtain a 3D transform that can be applied to a hyperspectral data cube, we employ<br />

the proposed low-complexity KLT as a spectral decorrelator, followed by the 2D DWT for spatial<br />

decorrelation. In other terms, this transform is equivalent to the KLT1D-DWT2D, which turned out<br />

to be the highest performance transform, but employs the low-complexity version of the KLT.<br />

We have evaluated the performance of this transform, following the procedure outlined in Sect.<br />

III-A, for several values of ρ, in order to understand how many spectral vectors are actually needed<br />

to obtain convergence in the estimate of the covariance matrix, and hence optimal performance. The<br />

results are reported in Fig. 11. As can be seen, taking ρ = 0.1 we obtain a negligible performance loss<br />

with respect to the full-complexity KLT; in particular, setting to zero 50% and 96% of the transform<br />

coefficients yields a PSNR loss of 0.21 and 0.16 dB respectively. Taking ρ = 0.01 the loss is still<br />

very small, with an even larger computational saving; in this case, the PSNR decreases by 0.89 and<br />

1.12 dB when 50% and 96% of the transform coefficients are respectively set to zero.<br />



Fig. 11. Performance of the 3D transform employing the low-complexity KLT, for several values of ρ. Curves: KLT1D-DWT2D with ρ = 1, 0.1, 0.01, 0.001 and 0.0001, and DWT1D2D; axes: PSNR (dB) versus % of transformed coefficients set to zero.<br />

C. Overview of JPEG 2000<br />

The architecture of the JPEG 2000 core coding system (Part 1) is based on transform coding. An<br />

image may be divided into several sub-images (tiles), to reduce memory and computing requirements;<br />

in the following we disregard color transformations, as they are of no particular interest in the<br />

hyperspectral image scenario. A biorthogonal discrete wavelet transform is first applied to each tile,<br />

whose output is a series of versions of the tile at different resolution levels (subbands); then, the<br />

transform coefficients are quantized, independently for each subband, with an embedded dead-zone<br />

quantizer. Each subband of the wavelet decomposition is divided into rectangular blocks (code-blocks),<br />

which are independently encoded with the EBCOT (Embedded Block Coding with Optimized<br />

Truncation) entropy coding engine; EBCOT is based on a bit-plane approach, context modeling<br />

and arithmetic coding. The bit stream output by EBCOT is organized by the rate allocator into a<br />

sequence of layers, each layer containing contributions from each code-block; the block truncation<br />

points associated with each layer are optimized in the rate-distortion sense. The final JPEG 2000<br />

codestream consists of a main header, followed by one or more sections corresponding to individual<br />

tiles. Each tile comprises a tile header and a layered representation of the included code-blocks,<br />

organized into packets. In order to form a progressive bitstream, the layers are formed and ordered<br />

in such a way that the most important information is placed at the beginning of the bitstream. The<br />

JPEG 2000 decoder performs exactly the same steps (except for rate allocation), in reverse order:<br />

syntax parsing, code-block decoding by EBCOT, inverse quantization, inverse wavelet transform, and<br />

tile mosaicking.<br />
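As a small illustration of the quantization step mentioned above, the sketch below implements a scalar dead-zone quantizer of the usual sign-magnitude form, in which the bin around zero is twice as wide as the others. The function names and the reconstruction offset `r = 0.5` are assumptions made for this example; the encoder actually used in the experiments is not reproduced here.<br />

```python
import math

def deadzone_quantize(y, step):
    """Index q = sign(y) * floor(|y| / step); the zero bin is twice as wide."""
    q = math.floor(abs(y) / step)
    return q if y >= 0 else -q

def deadzone_dequantize(q, step, r=0.5):
    """Reconstruct inside the bin; r in [0, 1) is a decoder-side choice."""
    if q == 0:
        return 0.0
    return math.copysign((abs(q) + r) * step, q)

for y in (3.7, -3.7, 0.9, -0.9):
    q = deadzone_quantize(y, step=1.0)
    print(y, "->", q, "->", deadzone_dequantize(q, step=1.0))
```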

Part 2 of the standard provides specific tools that can be applied to hyperspectral images. In<br />

particular, the multicomponent transformation feature allows for spectral decorrelation by means of an<br />

external transform, followed by the application of JPEG 2000 to a whole block of decorrelated bands;<br />

the bands are separately decorrelated in the spatial directions by means of the 2D wavelet transform,<br />

whereas the rate allocation is optimized across the whole block. Since JPEG 2000 standardizes the<br />

decoder, Part 2 provides the syntax (i.e. the MCC, MCT, and MCO marker segments) to embed<br />

into the codestream the inverse spectral transform that must be carried out after performing JPEG<br />

2000 decoding of each component. Three types of spectral transformations are supported, namely<br />

i) array-based transformations (i.e., those that can be described by a set of linear equations in the<br />

input coefficients, e.g. the DCT or the KLT); ii) dependency transformations (i.e., those of the causal<br />

predictive type, like causal DPCM); iii) wavelet transforms. For each class, reversible and irreversible<br />

modes are foreseen. Irreversible transforms are specified for example by storing the transform matrix<br />

coefficients in floating-point format in the relevant marker segments within the codestream. Reversible<br />

transforms are defined as a set of single element linear transformations and rounding operations; this<br />

structure can accommodate lifting-based integer implementations of classical transforms such as DCT<br />

and wavelets (see e.g. [27], [28]).<br />

D. Integration of low-complexity KLT within JPEG 2000<br />

The proposed technique employs a hybrid 3D transform; it first applies the low-complexity KLT<br />

as multicomponent extension to JPEG 2000, and then the JPEG 2000 2D DWT, rate allocation and<br />

entropy coding to the spectrally transformed bands. Three decomposition levels are performed for<br />

the 2D spatial transform, employing the (9,7) filter. The inverse KLT transform matrix is written<br />

in an MCT marker segment in the compressed file. Notably, the post-compression rate-distortion<br />

optimization is operated on the complete 3D set of transformed coefficients, ensuring optimal<br />

performance.<br />

On a related note, a very desirable feature of a compression system for remote sensing images<br />

is the ability to generate quicklook images without having to fully decode the compressed file. In a<br />

typical scenario, a user would download a low spatial resolution false-color quicklook of the scene.<br />

To do so, full spectral decorrelation is necessary in order to extract the three false-color bands, and<br />

then reduced resolution decoding of each band has to be carried out. This procedure is impractical,<br />

because it requires performing the full spectral decorrelation to extract a few channels. Moreover, it<br />

is not compliant with the JPEG 2000 standard, which requires that the spatial inverse transforms<br />

are performed before the spectral one. On the other hand, JPEG 2000 Part 2 provides an interesting<br />

feature, in that, through suitable marker segments, it is possible to specify different transformations<br />

for selected groups of bands [26]. For example, the three bands to be used to generate false-color<br />

quicklooks can be skipped by the spectral decorrelator and compressed in intraband mode; this yields<br />

a slight performance loss, but allows increased flexibility in the access to selected portions of the data.<br />

This procedure can be extended to the proposed scheme, where the bands to be used to generate the<br />

quicklooks could be removed from the spectral vectors in the computation of the covariance matrix,<br />

and then of the transform coefficients. However, this goes beyond the scope of the present paper, and<br />

is left for further work.<br />

V. EXPERIMENTAL RESULTS<br />

The proposed scheme, based on the low-complexity KLT and JPEG 2000, has been compared<br />

with other state-of-the-art lossy compression schemes. First, in Sect. V-A we evaluate the complexity<br />

of the low- and full-complexity KLT; this allows us to assess the actual computational advantage in a<br />

realistic compression setting. Then, in Sect. V-B we compare the compression performance of various<br />

algorithms. Finally, in Sect. V-C we report the results of some experiments aimed at evaluating the<br />

effect of lossy compression on remote sensing applications, and specifically on SAM classification.<br />

In the results described in this section, we have employed a set of AVIRIS radiance data using 256<br />

lines with 512 pixels and all bands, unless otherwise noted. The AVIRIS sensor is a representative<br />

hyperspectral one, and the data are publicly available on the Internet at aviris.jpl.nasa.gov;<br />

since these data are widely used in the literature, comparisons with other techniques are facilitated. In<br />

particular, the Cuprite, Jasper Ridge and Moffett Field scenes have been used. PSNR has been used<br />

as quality metric for lossy compression (see footnote 1). JPEG 2000 has been run without error resilience options,<br />

and no quality layers have been formed.<br />

A. Complexity<br />

Fig. 12 shows the compression performance and the computation time of the proposed algorithm as<br />

a function of ρ, on the Cuprite scene. The computation times have been measured on a Pentium IV PC<br />

at 3 GHz, and refer only to the evaluation of the covariance matrix. As can be seen, the performance<br />

loss is very smooth as ρ decreases, allowing one to select the best performance-complexity tradeoff<br />

for a given application. As had been noted in the previous experiment, the values ρ = 0.1 and<br />

ρ = 0.01 yield a very small loss, and can be used as starting point for a fine optimization. These<br />

values provide a complexity reduction of about 20 and 100 times with respect to the full-complexity<br />

transform.<br />

1 Since the data have only 15 significant bits in modulus, $2^{15} - 1$ has been used as peak value for PSNR computation.<br />



Fig. 12. Top: Performance of the low-complexity KLT as a function of ρ (PSNR in dB, ρ on a logarithmic axis). The curve refers to an encoding rate of 1 bpp.<br />

Bottom: computation time (in seconds, logarithmic axis) for the evaluation of the covariance matrix, as a function of ρ. The results refer to the Cuprite<br />

scene.<br />

Clearly, the covariance matrix evaluation is only one source of complexity, since the solution<br />

to the eigenvector problem, the computation of transform coefficients, as well as the quantization,<br />

entropy coding and rate allocation, have to be taken into account. Tab. II compares the end-to-end<br />

computation time for the full-complexity KLT, the low-complexity KLT with ρ equal to 0.1 and<br />

0.01, and the technique proposed in [16], which employs the DWT1D2D transform, using JPEG<br />

2000 for the spatial wavelet transform, quantization, entropy coding and rate allocation; the time<br />

spent in the covariance matrix evaluation has also been reported. As can be seen, using ρ = 0.1<br />

and ρ = 0.01 yields an end-to-end computational saving of 2.57 and 3.01 times respectively, with a<br />

minor performance loss. As ρ decreases the computation time tends to settle on an asymptotic value<br />

which is larger than the value of the DWT1D2D. This is due to the fact that the solution to the<br />

eigenvector problem, and especially the computation of transform coefficients, are more demanding<br />

than the spectral DWT. This is to be expected, because the transform coefficients are computed as<br />

a full matrix-vector product, since the KLT matrix does not exhibit any structure that can be exploited<br />

to reduce the number of operations. However, the low-complexity KLT with ρ = 0.01 is only about<br />

40% more complex than the DWT1D2D, and provides a significant performance gain, as will be seen<br />

in the following.<br />



TABLE II
COMPLEXITY COMPARISON OF END-TO-END COMPRESSION ALGORITHMS. RUNNING TIMES ARE EXPRESSED IN SECONDS.

Algorithm                     Total time    C_X (covariance matrix evaluation)
full-complexity KLT           816.12        557.35
low-compl. KLT, ρ = 0.1       318.01        59.58
low-compl. KLT, ρ = 0.01      270.57        8.73
DWT1D2D                       195.86        n.a.
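The end-to-end ratios discussed in the text follow directly from the total times in Tab. II. The short sketch below recomputes them; the dictionary keys and variable names are chosen only for this example.

```python
# Total running times in seconds, copied from Tab. II.
total_time = {
    "full_klt": 816.12,
    "low_klt_rho_0.1": 318.01,
    "low_klt_rho_0.01": 270.57,
    "dwt1d2d": 195.86,
}

# End-to-end saving of the low-complexity KLT over the full-complexity KLT.
speedup_rho_01 = total_time["full_klt"] / total_time["low_klt_rho_0.1"]
speedup_rho_001 = total_time["full_klt"] / total_time["low_klt_rho_0.01"]

# Extra cost of the low-complexity KLT (rho = 0.01) relative to DWT1D2D.
overhead_vs_dwt = total_time["low_klt_rho_0.01"] / total_time["dwt1d2d"] - 1.0

print(f"saving, rho = 0.1 : {speedup_rho_01:.2f}x")
print(f"saving, rho = 0.01: {speedup_rho_001:.2f}x")
print(f"overhead vs DWT1D2D: {100 * overhead_vs_dwt:.0f}%")
```

The ratios come out close to the savings quoted in the text, and the overhead with respect to DWT1D2D is a little under 40%.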

B. Compression performance

The compression performance of the proposed scheme has been compared with that of other state-of-the-art schemes. The results are shown in Fig. 13 for the Cuprite scene. The following algorithms are compared: 1) the proposed scheme with the low-complexity KLT (ρ = 0.01); 2) the scheme with the full-complexity KLT; 3) the DWT1D2D scheme employing JPEG 2000 and 3D rate-distortion optimization, as proposed in [16]; 4) the 3D-SPIHT scheme proposed in [10], [29].

As expected, the performance of the low-complexity KLT is very close to that of the full-complexity transform, with a maximum loss of 0.27 dB at high bit-rates. The performance gap between the DWT1D2D transform and 3D-SPIHT had already been noticed in [16], where it was pointed out that the hybrid square/rectangular 3D wavelet transform performs significantly better than the 3D square transform, mainly thanks to the finer frequency tessellation, which is a better match to the high information content of the spectral vectors. It should be noted that, with respect to the technique in [16], the proposed KLT-based scheme achieves a significant PSNR gain, ranging between 2.5 and 6.7 dB. The gain with respect to 3D-SPIHT is even larger, and reflects the transform evaluation results in Sect. III, where it has been observed that the 3D square DWT is not able to effectively capture all the correlations of a hyperspectral dataset.

Similar results have been achieved for other scenes. In particular, in Fig. 14 we report performance results for the Jasper Ridge scene. The gain with respect to the technique in [16] is between 5 and 8.1 dB, and even larger gains are achieved with respect to 3D-SPIHT.

In order to compare the performance of the proposed KLT-based technique with the most up-to-date compression technology, a comparison with SPECK [30], [11], [31] has been carried out. The results in Tab. III are worked out in the same conditions as [30], [11], [31], i.e. using only 512 samples per line of the AVIRIS reflectance data set, 512 lines, and coding all 224 bands as a whole. The proposed scheme with full-complexity KLT, low-complexity KLT (ρ = 0.01), the DWT1D2D scheme in [16], and SPECK are compared; note that, in the table, results are given in terms of signal-to-noise ratio (SNR) rather than PSNR.

Fig. 13. Performance evaluation of the proposed JPEG 2000 based technique: rate-distortion curve (PSNR in dB versus rate in bpp) for the Cuprite scene. Dashed: full-complexity KLT. Solid: low-complexity KLT, ρ = 0.01. Dotted+star: DWT1D2D as proposed in [16]. Solid+star: 3D-SPIHT.

Consistently with the results reported above, also on the reflectance data the performance loss of the low-complexity KLT with respect to the full-complexity one does not exceed 0.5 dB. The low-complexity KLT has a gain of about 7 dB with respect to the scheme in [16] employing the hybrid rectangular/square DWT, even though the performance gap decreases at high bit-rates. The proposed scheme exhibits a significant gain also with respect to SPECK; the SNR gain is even more remarkable, ranging from 5 to more than 10 dB. This gain is mainly due to two factors. The first is the improved coding efficiency of the KLT with respect to the spectral DWPT employed in SPECK. The second is the 3D post-compression rate-distortion optimization, which is more flexible in selecting the portions of the 3D set of transform coefficients that contribute more significantly to the reconstructed image quality.
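The figures quoted above can be checked against the Jasper Ridge scene 1 rows of Tab. III. The sketch below simply subtracts the tabulated SNR values; the variable names are chosen for this example only.

```python
# SNR values in dB for Jasper Ridge scene 1, copied from Tab. III.
rates_bpp = [0.1, 0.2, 0.5, 1, 2, 3, 4]
klt_full = [27.97, 34.02, 40.77, 45.50, 51.24, 56.86, 57.77]
klt_low = [27.90, 33.85, 40.69, 45.35, 50.99, 56.59, 57.84]
dwt1d2d = [22.31, 26.55, 34.31, 40.41, 47.31, 52.56, 57.56]
speck = [19.70, 23.66, 31.75, 38.55, 46.00, 48.59, 52.36]

# Loss of the low-complexity KLT with respect to the full-complexity KLT.
loss_low_vs_full = [f - l for f, l in zip(klt_full, klt_low)]
# Gain of the low-complexity KLT over the two wavelet-based schemes.
gain_vs_dwt = [l - d for l, d in zip(klt_low, dwt1d2d)]
gain_vs_speck = [l - s for l, s in zip(klt_low, speck)]

for rate, loss, g_dwt, g_speck in zip(rates_bpp, loss_low_vs_full, gain_vs_dwt, gain_vs_speck):
    print(f"{rate:>4} bpp  loss {loss:+.2f} dB  vs DWT1D2D {g_dwt:+.2f} dB  vs SPECK {g_speck:+.2f} dB")
```

For this scene the loss of the low-complexity KLT peaks at 0.27 dB, and its gain over 3D-SPECK lies between 4.99 and 10.19 dB, in line with the ranges stated in the text.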

C. Impact on image exploitation

As is well known, quality metrics such as PSNR, which are based on the MSE, measure the fidelity of the reconstructed image with respect to the original image. However, a higher PSNR does not necessarily yield a higher quality of a lossy-compressed remote sensing image for a given application. In fact, some artifacts, e.g. tiling, which may have little effect on PSNR, could heavily bias the analysis results of the reconstructed images. Therefore, it is necessary to validate the compression results also from the remote sensing application standpoint. Although this is an important research topic, there is no widely accepted protocol to evaluate remote sensing image quality in a general way; this is partly caused by the large number of existing remote sensing applications, which makes it difficult to work out a reasonable set of quality metrics.

Fig. 14. Performance evaluation of the proposed JPEG 2000 based technique: rate-distortion curve (PSNR in dB versus rate in bpp) for the Jasper Ridge scene. Dashed: full-complexity KLT. Solid: low-complexity KLT, ρ = 0.01. Dotted+star: DWT1D2D as proposed in [16]. Solid+star: 3D-SPIHT.

In [17] a study of various quality metrics has been carried out. It is shown that MSE is reasonably good at capturing the effect of lossy compression on SAM classification, although more than one metric is needed to accurately analyze the quality degradation. A similar approach has been followed in [30], [11], [31], where SAM classification is used as a benchmark application to evaluate the impact of lossy compression. In this paper, we also adopt the SAM classification method; SAM permits rapid mapping by calculating the spectral similarity between the image spectral vectors and reference vectors. These reference vectors can either be taken from laboratory or field measurements or extracted directly from the image. SAM measures the spectral similarity by calculating the angle between the two spectral vectors, treating them as vectors in a B-dimensional space; small angles between the two vectors indicate high similarity, and large angles indicate low similarity. It computes the arccosine of the normalized dot product between the test vector t and a reference vector r with the following equation:

\[
\arccos\left( \frac{\sum_{i=1}^{B} t_i r_i}{\left( \sum_{i=1}^{B} t_i^2 \right)^{1/2} \left( \sum_{i=1}^{B} r_i^2 \right)^{1/2}} \right) \tag{2}
\]

where B is the number of bands of the hyperspectral image cube, t_i are the components of the test vector, and r_i those of the reference vector.

TABLE III
COMPARISON BETWEEN THE PROPOSED TECHNIQUE AND SPECK - SNR (DB).

Coding scheme / Rate (bpp)              0.1     0.2     0.5     1       2       3       4

Jasper Ridge - scene 1
Proposed scheme (KLT)                   27.97   34.02   40.77   45.50   51.24   56.86   57.77
Proposed scheme (low-complexity KLT)    27.90   33.85   40.69   45.35   50.99   56.59   57.84
DWT1D2D [16]                            22.31   26.55   34.31   40.41   47.31   52.56   57.56
3D-SPECK                                19.70   23.66   31.75   38.55   46.00   48.59   52.36

Moffett Field - scene 1
Proposed scheme (KLT)                   22.17   32.54   42.17   47.46   53.45   58.92   61.04
Proposed scheme (low-complexity KLT)    22.19   32.56   42.08   47.40   53.27   58.73   61.10
DWT1D2D [16]                            17.24   22.34   32.22   40.85   48.76   53.98   59.25
3D-SPECK                                16.67   21.52   29.91   38.60   47.18   51.27   55.57

Moffett Field - scene 3
Proposed scheme (KLT)                   16.12   25.34   36.32   42.87   49.75   55.08   56.92
Proposed scheme (low-complexity KLT)    16.10   25.32   36.21   42.83   49.62   54.89   57.05
DWT1D2D [16]                            12.86   17.91   27.53   36.37   45.09   50.75   55.87
3D-SPECK                                12.60   17.98   26.99   35.37   40.10   46.71   50.79
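The spectral angle of Eq. (2) can be written in a few lines. The following is a minimal sketch in plain Python; the function name `spectral_angle` and the handling of zero-norm vectors are choices made for this example.

```python
import math


def spectral_angle(t, r):
    """Angle in radians between test vector t and reference vector r, as in Eq. (2)."""
    if len(t) != len(r):
        raise ValueError("vectors must have the same number of bands")
    dot = sum(ti * ri for ti, ri in zip(t, r))
    norm_t = math.sqrt(sum(ti * ti for ti in t))
    norm_r = math.sqrt(sum(ri * ri for ri in r))
    if norm_t == 0.0 or norm_r == 0.0:
        raise ValueError("the angle is undefined for an all-zero vector")
    # Clamp to [-1, 1] so rounding errors cannot push acos out of its domain.
    cosine = max(-1.0, min(1.0, dot / (norm_t * norm_r)))
    return math.acos(cosine)


# Hypothetical 3-band vectors: scaling a spectrum leaves the angle unchanged.
print(spectral_angle([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
print(spectral_angle([1.0, 0.0], [0.0, 1.0]))
```

Because the angle depends only on the direction of the vectors, a uniform scaling of a spectrum (for instance due to illumination) does not change its similarity to the reference.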

Following the procedure in [30], [11], [31], we have selected an area in scene 1 of Jasper Ridge, and have applied the k-means clustering method [32], [33] to evaluate the centroids of three clusters, namely asphalt, water and vegetation. Subsequently, we have employed these three centroids as reference vectors in the SAM method.

In Tab. IV we report the classification performance in terms of the percentage of pixels assigned to the same cluster in the reconstructed image as in the original one. It can be noticed that the performance closely reflects the results discussed above in terms of PSNR, in that, in the vast majority of cases, a higher PSNR results in a smaller classification error; this is consistent with the results in [17], and confirms that MSE-based metrics are reasonably good indicators of the performance degradation caused by compression artifacts. These results, although not meant to be exhaustive, indicate that for this application the proposed scheme yields improved performance also in terms of classification results, making the proposed technique very competitive in terms of complexity, compression performance, and remote sensing image quality.
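The agreement figure reported in Tab. IV can be sketched as follows: each pixel is assigned to the reference vector with the smallest spectral angle, and the percentage of pixels that keep the same label after compression is counted. The two-band reference vectors and pixels below are hypothetical toy values, and the function names are chosen for this example; in the experiments the references are the k-means centroids.

```python
import math


def spectral_angle(t, r):
    """Spectral angle in radians between two vectors with non-zero norm."""
    dot = sum(ti * ri for ti, ri in zip(t, r))
    norms = math.sqrt(sum(x * x for x in t)) * math.sqrt(sum(x * x for x in r))
    return math.acos(max(-1.0, min(1.0, dot / norms)))


def classify(pixel, references):
    """Return the label of the reference vector closest in angle to the pixel."""
    return min(references, key=lambda label: spectral_angle(pixel, references[label]))


def agreement_percentage(original, reconstructed, references):
    """Percentage of pixels assigned to the same class before and after compression."""
    same = sum(
        classify(o, references) == classify(c, references)
        for o, c in zip(original, reconstructed)
    )
    return 100.0 * same / len(original)


# Hypothetical reference vectors (stand-ins for the three cluster centroids).
references = {"asphalt": [1.0, 0.0], "water": [1.0, 1.0], "vegetation": [0.0, 1.0]}
original = [[2.0, 0.1], [1.0, 0.9], [0.1, 3.0], [1.0, 0.5]]
reconstructed = [[2.0, 0.2], [1.0, 1.1], [0.2, 3.0], [1.0, 0.3]]
print(agreement_percentage(original, reconstructed, references))
```

In this toy case the last pixel moves from the "water" class to the "asphalt" class after the perturbation, so three pixels out of four keep their label.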

In Fig. 16 one can see the results of the classification procedure applied to the original image, to the reconstructed image with full-complexity KLT at a rate of 1 bpp, and to the reconstructed image with low-complexity KLT (ρ = 0.01) at a rate of 1 bpp. The original image is displayed in Fig. 15. It can be observed that the thematic maps obtained from the compressed images are almost identical to the reference map.

TABLE IV
COMPARISON BETWEEN THE PROPOSED TECHNIQUE AND SPECK, IN TERMS OF THE PERCENTAGE OF PIXELS THAT ARE ASSIGNED TO THE SAME CLUSTER IN THE ORIGINAL AND THE RECONSTRUCTED IMAGES.

Coding scheme / Rate (bpp)        0.1     0.2     0.5     1       2       3       4

Jasper Ridge - scene 1
Full-complexity KLT               98.98   99.73   99.93   99.97   99.98   99.99   99.99
Low-complexity KLT, ρ = 0.1       98.96   99.74   99.93   99.97   99.98   99.99   99.99
Low-complexity KLT, ρ = 0.01      98.98   99.73   99.93   99.97   99.98   99.99   99.99
DWT1D2D                           98.74   99.36   99.84   99.95   99.98   99.99   99.99
3D-SPIHT                          97.64   98.69   99.66   99.90   99.95   99.98   99.99

Fig. 15. Jasper Ridge - scene 1 (reflectance).

VI. CONCLUSIONS

In this paper we have carried out an extensive study of 3D transforms for lossy compression of hyperspectral data. It has been found that, among wavelet-based transforms, a hybrid rectangular/square transform is highly suitable, and achieves performance similar to wavelet packets.



Fig. 16. Classification result respectively on the original image (left), on the compressed image with full-complexity KLT at 1 bpp (center), on the compressed image with low-complexity KLT (ρ = 0.01) at 1 bpp (right). The number of clusters is equal to 3.

The best spectral transform has turned out to be the KLT. In order to make this transform computationally feasible, we have proposed a low-complexity version with comparable performance. The degree of computational saving and the related performance loss can be tuned to the specific needs of each application.

The low-complexity KLT, along with a hybrid wavelet-based scheme, has been integrated into a JPEG 2000 Part 2 compliant scheme. Tests have been carried out on AVIRIS data, and comparisons have been performed with respect to 3D-SPIHT and SPECK. The proposed KLT-based scheme achieves significant performance gains with respect to the hybrid schemes and 3D-SPIHT; it outperforms SPECK by 5 to 10 dB in PSNR. An end-to-end complexity reduction of about three times can be achieved using the low-complexity KLT, with a minor performance loss (about 0.5 dB). This transform is only about 40% more complex than 3D wavelets, but has significantly better performance.

A quality assessment of compressed images has also been carried out by evaluating the effects of several lossy compression schemes on the results of SAM classification. It turns out that, for this application, PSNR is a good indicator of classification performance, so that the proposed scheme is still the highest-performance one by a large margin.

REFERENCES

[1] D.S. Taubman and M.W. Marcellin, JPEG2000: Image Compression Fundamentals, Standards, and Practice, Kluwer, 2001.
[2] S. Lim, K. Sohn, and C. Lee, “Compression for hyperspectral images using three dimensional wavelet transform,” in Proc. of IGARSS - IEEE International Geoscience and Remote Sensing Symposium, Sydney, Australia, 2001.
[3] Y. Tseng, H. Shih, and P. Hsu, “Hyperspectral image compression using three-dimensional wavelet transformation,” in Proceedings of the 21st Asian Conference on Remote Sensing (ACRS), Taipei, Taiwan, 2000.



[4] A. Kaarna and J. Parkkinen, “Comparison of compression methods for multispectral images,” in Proc. of NORSIG - Nordic Signal Processing Symposium, Kolmarden, Sweden, 2000, vol. 2, pp. 251–254.
[5] G.P. Abousleman, M.W. Marcellin, and B.R. Hunt, “Compression of hyperspectral imagery using the 3-D DCT and hybrid DPCM-DCT,” IEEE Transactions on Geoscience and Remote Sensing, vol. 33, no. 1, pp. 26–34, Jan. 1995.
[6] D. Markman and D. Malah, “Hyperspectral image coding using 3D transforms,” in Proc. of ICIP - IEEE International Conference on Image Processing, Thessaloniki, Greece, 2001.
[7] M.D. Pal, C.M. Brislawn, and S.R. Brumby, “Feature extraction from hyperspectral images compressed using the JPEG-2000 standard,” in Proc. of SSIAI - IEEE Southwest Symposium on Image Analysis and Interpretation, Santa Fe, New Mexico, 2002.
[8] M.D. Pal, C.M. Brislawn, and S.P. Brumby, “Feature extraction from hyperspectral images compressed using the JPEG-2000 standard,” in Proc. of SSIAI - Southwest Symposium on Image Analysis and Interpretation, Santa Fe, New Mexico, 2002.
[9] S. Lim, K.H. Sohn, and C. Lee, “Principal component analysis for compression of hyperspectral images,” in Proc. of IGARSS - IEEE International Geoscience and Remote Sensing Symposium, Sydney, Australia, 2001.
[10] X. Tang, C. Sungdae, and W.A. Pearlman, “3D set partitioning coding methods in hyperspectral image compression,” in Proc. of ICIP - IEEE International Conference on Image Processing, Barcelona, Spain, 2003.
[11] X. Tang and W.A. Pearlman, “Three-dimensional wavelet-based compression of hyperspectral images,” in Hyperspectral Data Compression. Kluwer Academic Publishers, 2005.
[12] J.A. Sagri, A.G. Tescher, and J.T. Reagan, “Practical transform coding of multispectral imagery,” IEEE Signal Processing Magazine, pp. 32–43, Jan. 1995.
[13] P.L. Dragotti, G. Poggi, and A.R.P. Ragozini, “Compression of multispectral images by three-dimensional SPIHT algorithm,” IEEE Transactions on Geoscience and Remote Sensing, vol. 38, no. 1, pp. 416–428, Jan. 2000.
[14] L. Chang, C. Cheng, and T. Chen, “An efficient adaptive KLT for multispectral image compression,” in Proceedings of 4th IEEE Southwest Symposium on Image Analysis and Interpretation, Austin, TX, 2000.
[15] P. Hao and Q. Shi, “Reversible integer KLT for progressive-to-lossless compression of multiple component images,” in Proc. of IEEE International Conference on Image Processing, Barcelona, Spain, 2003.
[16] B. Penna, T. Tillo, E. Magli, and G. Olmo, “Progressive 3D coding of hyperspectral images based on JPEG 2000,” IEEE Geoscience and Remote Sensing Letters, to appear Jan. 2006.
[17] E. Christophe, D. Léger, and C. Mailhes, “Quality criteria benchmark for hyperspectral imagery,” IEEE Transactions on Geoscience and Remote Sensing, vol. 43, no. 9, pp. 2103–2114, Sept. 2005.
[18] V. Guralnik and G. Karypis, “A scalable algorithm for clustering protein sequences,” in Workshop on Data Mining in Bioinformatics, 2001.
[19] JPEG 2000 Part 2 - Extensions, Document ISO/IEC 15444-2.
[20] M. Vetterli and J. Kovacevic, Wavelets and Subband Coding, Prentice Hall, 1995.
[21] R.R. Coifman and M.V. Wickerhauser, “Entropy-based algorithms for best basis selection,” IEEE Transactions on Information Theory, vol. 38, no. 2, pp. 713–718, Mar. 1992.
[22] K. Ramchandran and M. Vetterli, “Best wavelet packet bases in a rate-distortion sense,” IEEE Transactions on Image Processing, vol. 2, no. 2, pp. 160–175, Apr. 1993.
[23] V.K. Goyal, “Theoretical foundations of transform coding,” IEEE Signal Processing Magazine, vol. 18, no. 5, pp. 9–21, 2001.
[24] I.S. Dhillon, A New O(N^2) Algorithm for the Symmetric Tridiagonal Eigenvalue/Eigenvector Problem, Ph.D. Thesis, University of California, Berkeley, 1997.



[25] X. Wu and N. Memon, “Context-based lossless interband compression - extending CALIC,” IEEE Transactions on Image Processing, vol. 9, no. 6, pp. 994–1001, June 2000.
[26] JPEG 2000 Part 2 - Extensions, Document ISO/IEC 15444-2.
[27] T.D. Tran, “The binDCT: fast multiplierless approximation of the DCT,” IEEE Signal Processing Letters, vol. 7, no. 6, pp. 141–144, June 2000.
[28] P. Hao and Q. Shi, “Matrix factorizations for reversible integer mapping,” IEEE Transactions on Signal Processing, vol. 49, no. 10, pp. 2314–2324, Oct. 2001.
[29] X. Tang, C. Sungdae, and W.A. Pearlman, “Comparison of 3D set partitioning methods in hyperspectral image compression featuring an improved 3D-SPIHT,” in Proceedings of the IEEE Data Compression Conference (DCC), 2003.
[30] X. Tang, W.A. Pearlman, and J.W. Modestino, “Hyperspectral image compression using three-dimensional wavelet coding: A lossy-to-lossless solution,” submitted to IEEE Transactions on Geoscience and Remote Sensing, available at http://www.cipr.rpi.edu/ pearlman/, 2004.
[31] X. Tang and W.A. Pearlman, “Lossy-to-lossless block-based compression of hyperspectral volumetric data,” in Proc. of IEEE International Conference on Image Processing, 2004.
[32] F.A. Kruse, A.B. Lefkoff, J.B. Boardman, K.B. Heidebrecht, A.T. Shapiro, P.J. Barloon, and A.F.H. Goetz, “The spectral image processing system (SIPS) - interactive visualization and analysis of imaging spectrometer data,” Remote Sensing of Environment, vol. 44, pp. 145–163, 1993.
[33] J.W. Boardman, F.A. Kruse, and R.O. Green, “Mapping target signatures via partial unmixing of AVIRIS data,” in Fifth JPL Airborne Earth Science Workshop, JPL Publication, 1995, pp. 23–26.
