Modified Fisher's Linear Discriminant Analysis for ... - IEEE Xplore

504 IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, VOL. 4, NO. 4, OCTOBER 2007

II. MFLDA

Let the total scatter matrix $S_T$ be defined as

$$S_T = \sum_{i=1}^{n} (r_i - \mu)(r_i - \mu)^T \qquad (4)$$

and it can be related to $S_W$ and $S_B$ by [1]

$$S_T = S_W + S_B. \qquad (5)$$

So the maximization of (3) is equivalent to maximizing

$$q' = \frac{w^T S_B w}{w^T S_T w}. \qquad (6)$$

Following the same idea as FLDA, the solution is given by the eigenvectors of the generalized eigenproblem $S_B w = \lambda S_T w$.
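As a concrete check of the decomposition in (5) and of the generalized eigenproblem above, here is a minimal NumPy sketch on synthetic two-class data; the toy dimensions, noise levels, and variable names are illustrative assumptions, not from the letter:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic two-class data in d = 5 bands, separated along band 1 (illustrative)
n_per, d = 100, 5
X1 = rng.normal(0.0, 1.0, (n_per, d)) + np.array([3.0, 0, 0, 0, 0])
X2 = rng.normal(0.0, 1.0, (n_per, d)) - np.array([3.0, 0, 0, 0, 0])
X = np.vstack([X1, X2])
mu = X.mean(axis=0)

# Total scatter S_T as in (4)
S_T = (X - mu).T @ (X - mu)

# Within-class and between-class scatter
S_W = sum((Xc - Xc.mean(0)).T @ (Xc - Xc.mean(0)) for Xc in (X1, X2))
S_B = sum(len(Xc) * np.outer(Xc.mean(0) - mu, Xc.mean(0) - mu)
          for Xc in (X1, X2))

# The decomposition (5): S_T = S_W + S_B holds up to roundoff
print(np.allclose(S_T, S_W + S_B))  # True

# Generalized eigenproblem S_B w = lambda S_T w, via eig of S_T^{-1} S_B
eigvals, eigvecs = np.linalg.eig(np.linalg.solve(S_T, S_B))
w = eigvecs[:, np.argmax(eigvals.real)].real
# The leading eigenvector recovers the discriminant direction (band 1 here)
```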

When the only available information is the class signatures $\{s_1, s_2, \ldots, s_p\}$, they can be treated as class means, i.e., $M = [\mu_1\ \mu_2 \cdots \mu_p] \approx [s_1\ s_2 \cdots s_p]$. The $S_B$ in (2) becomes

$$\hat{S}_B = \sum_{j=1}^{p} (s_j - \hat{\mu})(s_j - \hat{\mu})^T \qquad (7)$$

where $\hat{\mu} = (1/p)\sum_{i=1}^{p} s_i$ is the mean of the class signatures. $S_T$ in (4) can be replaced by the data covariance matrix $\Sigma$, i.e.,

$$\hat{S}_T = \Sigma = \sum_{i=1}^{N} (r_i - \tilde{\mu})(r_i - \tilde{\mu})^T \qquad (8)$$

where $\tilde{\mu} = (1/N)\sum_{i=1}^{N} r_i$ is the sample mean of the entire data set with $N$ pixels. Then, the solution is given by the eigenvectors of the generalized eigenproblem $\hat{S}_B w = \lambda \Sigma w$, or equivalently of $\Sigma^{-1}\hat{S}_B$.
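The construction of $\hat{S}_B$ from signatures alone, of $\Sigma$ from all pixels, and of the $\Sigma^{-1}\hat{S}_B$ eigenvectors can be sketched as follows; the signatures, pixel counts, and noise level are synthetic assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
d, p, N = 6, 3, 500

# Hypothetical class signatures, treated as class means: M ~ [s_1 ... s_p]
S = rng.normal(0.0, 1.0, (d, p))
labels = rng.integers(0, p, N)
R = S[:, labels].T + 0.1 * rng.normal(0.0, 1.0, (N, d))  # pixels r_i

# (7): between-class scatter estimated from the signatures alone
mu_hat = S.mean(axis=1)
Sc = S - mu_hat[:, None]
S_B_hat = Sc @ Sc.T

# (8): data scatter from all N pixels
mu_tilde = R.mean(axis=0)
Sigma = (R - mu_tilde).T @ (R - mu_tilde)

# Eigenvectors of Sigma^{-1} S_B_hat; only p - 1 eigenvalues are nonzero,
# so the transformed data are (p - 1)-dimensional
eigvals, eigvecs = np.linalg.eig(np.linalg.solve(Sigma, S_B_hat))
order = np.argsort(eigvals.real)[::-1]
W = eigvecs[:, order[: p - 1]].real   # MFLDA transform, (p - 1) columns

Y = (R - mu_tilde) @ W                # projected, (p - 1)-dimensional data
```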

Regardless of the actual classes present in the data, replacing $S_T$ with $\Sigma$ represents one extreme case, in which all the pixels are separated into the classes they belong to and selected as samples. Using $\hat{S}_B$ as $S_B$ represents the other extreme case, in which there is only one sample in each class. So the incurred discrepancy comes from two factors: only one sample (i.e., the class signature) for each of the $p$ classes is used to estimate $S_B$, and all the pixels are used to estimate $S_T$, with the implicit assumption that the pixels are put into all the existing classes, including unknown background classes (i.e., the actual number of classes $p_T$ may be greater than $p$). In the experiments, it will be shown that the term $\Sigma^{-1}$ is very effective in background suppression.

Since the rank of $\hat{S}_B$ is the same as that of $S_B$, which is $(p - 1)$, the dimensionality of the MFLDA-transformed data is $(p - 1)$, as in FLDA. After the data are projected onto this $(p - 1)$-dimensional space, an algorithm is needed for tasks such as classification or detection. A less powerful distance-based classifier such as the Spectral Angle Mapper (SAM) can be applied, or a more powerful filter, such as the target-constrained interference-minimized filter (TCIMF), may be used [6].
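For the distance-based option, SAM simply assigns a pixel to the signature with the smallest spectral angle. A minimal sketch, where the signatures and pixel values are made up for illustration:

```python
import numpy as np

def spectral_angle(x, s):
    """Spectral Angle Mapper: angle (radians) between pixel x and signature s."""
    cos = np.dot(x, s) / (np.linalg.norm(x) * np.linalg.norm(s))
    return np.arccos(np.clip(cos, -1.0, 1.0))

# Hypothetical signatures (rows) and one pixel to classify
signatures = np.array([[1.0, 0.2, 0.1],
                       [0.1, 1.0, 0.3]])
pixel = np.array([0.9, 0.25, 0.12])

angles = [spectral_angle(pixel, s) for s in signatures]
label = int(np.argmin(angles))  # smallest angle wins -> class 0 here
```

Because SAM depends only on direction, it is insensitive to per-pixel scaling (e.g., illumination), which is why it is a common baseline after a dimensionality-reducing transform.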

III. RELATIONSHIP BETWEEN LDA-BASED APPROACHES

A. Relationship Between FLDA and CFLDA

The CFLDA in [5] imposed a constraint to align the class centers along different directions [4], i.e.,

$$w_l^T \mu_j = \delta_{lj}, \quad \text{for } 1 \le l, j \le p. \qquad (9)$$

This also means that the $j$th transform vector $w_j$ is for the $j$th class, so the CFLDA-transformed data are actually classification maps. It can be shown that when the constraint is satisfied, $w^T S_B w$ is a constant. Thus, the constrained problem is to minimize $w^T S_W w$ in (3) while satisfying the constraint in (9). Using the Lagrange multiplier approach, it was shown that the desired transform matrix $W$, including all $p$ transform vectors, is

$$W_{\mathrm{CFLDA}} = S_W^{-1} M \left( M^T S_W^{-1} M \right)^{-1}. \qquad (10)$$

Obviously, the implementation of CFLDA requires knowledge of the training samples of each class to compute $S_W$.
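The closed form in (10) can be checked numerically: for any positive-definite $S_W$ and full-rank $M$, the resulting $W$ satisfies the constraint (9) exactly. A small NumPy sketch with synthetic inputs (the dimensions and matrices are illustrative, not from the letter):

```python
import numpy as np

rng = np.random.default_rng(2)
d, p = 5, 3
M = rng.normal(0.0, 1.0, (d, p))   # class means [mu_1 ... mu_p]
A = rng.normal(0.0, 1.0, (d, d))
S_W = A @ A.T + np.eye(d)          # synthetic positive-definite within-class scatter

# (10): W_CFLDA = S_W^{-1} M (M^T S_W^{-1} M)^{-1}
SwinvM = np.linalg.solve(S_W, M)
W = SwinvM @ np.linalg.inv(M.T @ SwinvM)

# Constraint (9): w_l^T mu_j = delta_lj, i.e., W^T M = I
print(np.allclose(W.T @ M, np.eye(p)))  # True
```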

B. Relationship Between CFLDA, CLDA, and MFLDA

Following the same idea as FLDA in maximizing class separability, the CLDA in [2] and [3] imposed the same constraint as in (9), that different classes are aligned along different directions. To make the constrained problem easier to solve, it employed the ratio of within-class and between-class distances instead of the Rayleigh quotient [4]. It was proved that the transformed within-class distance is a constant when the constraint in (9) is satisfied. It also used the data covariance matrix $\Sigma$ as a substitute for $S_T$, as in MFLDA. It was proved that the transform matrix $W$ is equivalent to [3]

$$W_{\mathrm{CLDA}} = \Sigma^{-1} M \left( M^T \Sigma^{-1} M \right)^{-1}. \qquad (11)$$

Equation (11) is similar to (10) except that $S_W$ is replaced with $\Sigma$. Therefore, CLDA does not require the training samples for each class; it needs only the class signatures. Similar to CFLDA, CLDA was designed for classification, so the classification maps are obtained right after the transform.
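Analogously, (11) needs only the signature matrix $M$ and the data covariance $\Sigma$, and the columns of the transformed data then serve directly as per-class classification maps. A sketch on synthetic data (all quantities are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)
d, p, N = 5, 3, 1000
M = rng.normal(0.0, 1.0, (d, p))   # class signatures [s_1 ... s_p]
R = M[:, rng.integers(0, p, N)].T + 0.2 * rng.normal(0.0, 1.0, (N, d))

Sigma = np.cov(R, rowvar=False)    # data covariance replaces S_T

# (11): W_CLDA = Sigma^{-1} M (M^T Sigma^{-1} M)^{-1}
SinvM = np.linalg.solve(Sigma, M)
W = SinvM @ np.linalg.inv(M.T @ SinvM)

scores = R @ W   # column j is the classification map for class j
```

Note that the constraint $W^T M = I$ holds by the same algebra as for (10), whichever symmetric positive-definite matrix takes the place of $S_W$.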

C. Use of Σ and S_W

Both CFLDA and CLDA apply the constraint in (9), resulting in the similar operators in (10) and (11), with the difference that CLDA uses $\Sigma$ while CFLDA uses $S_W$. So CLDA does not require training samples, which is the same as in MFLDA. There is another benefit of using $\Sigma$. As mentioned earlier, the true number of classes present in an image scene $p_T$ is greater than $p$ due to the difficulty of exhausting all the present classes, in particular the background classes. In the ideal case, when all the pixels in an image scene are put into the $p_T$ classes, $S_T = \Sigma$. Therefore, using $\Sigma$ in LDA-based approaches represents the best situation for $S_T$, which means
