
ON ANGLES BETWEEN SUBSPACES OF INNER PRODUCT SPACES

HENDRA GUNAWAN AND OKI NESWAN

Abstract. We discuss the notion of angles between two subspaces of an inner product space, as introduced by Risteski and Trenčevski [11] and Gunawan, Neswan and Setya-Budhi [7], and study its connection with the so-called canonical angles. Our approach uses only elementary calculus and linear algebra.

1. Introduction

The notion of angles between two subspaces of a Euclidean space $\mathbb{R}^n$ has attracted many researchers since the 1950's (see, for instance, [1, 2, 3]). These angles are known to statisticians and numerical analysts as canonical or principal angles. A few results on canonical angles can be found in, for example, [4, 8, 10, 12]. Recently, Risteski and Trenčevski [11] introduced an alternative definition of angles between two subspaces of $\mathbb{R}^n$ and explained their connection with canonical angles.

Let $U = \mathrm{span}\{u_1, \dots, u_p\}$ and $V = \mathrm{span}\{v_1, \dots, v_q\}$ be $p$- and $q$-dimensional subspaces of $\mathbb{R}^n$, where $\{u_1, \dots, u_p\}$ and $\{v_1, \dots, v_q\}$ are orthonormal, with $1 \le p \le q$. Then, as is suggested by Risteski and Trenčevski [11], one can define the angle $\theta$ between the two subspaces $U$ and $V$ by

$$\cos^2\theta := \det(M^T M)$$

where $M := [\langle u_i, v_k\rangle]^T$ is a $q \times p$ matrix, $M^T$ is its transpose, and $\langle\cdot,\cdot\rangle$ denotes the usual inner product on $\mathbb{R}^n$. (Actually, Risteski and Trenčevski gave a more general formula for $\cos\theta$, but their formula is erroneous and is corrected by Gunawan, Neswan and Setya-Budhi in [7].)
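As an illustration of the definition above, the following Python sketch evaluates $\cos^2\theta = \det(M^T M)$ for a line and a plane in $\mathbb{R}^3$. The helper names (`inner`, `gram_of_M`, `det`, `cos_squared_angle`) and the sample vectors are assumptions made for this example and do not come from [7] or [11]; both bases are assumed to be orthonormal, as in the text.

```python
import math

def inner(x, y):
    """Usual inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def gram_of_M(us, vs):
    """Return the p x p matrix M^T M, where M = [<u_i, v_k>]^T is q x p."""
    c = [[inner(u, v) for v in vs] for u in us]  # c[i][k] = <u_i, v_k>
    p, q = len(us), len(vs)
    return [[sum(c[i][k] * c[j][k] for k in range(q)) for j in range(p)]
            for i in range(p)]

def det(a):
    """Determinant by Laplace expansion along the first row (fine for small p)."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j] * det([row[:j] + row[j + 1:] for row in a[1:]])
               for j in range(len(a)))

def cos_squared_angle(us, vs):
    """cos^2(theta) = det(M^T M) for orthonormal lists us (p vectors) and vs (q vectors)."""
    return det(gram_of_M(us, vs))

# Hypothetical data: U is a line tilted 45 degrees out of the plane V = span{e1, e2}.
s = 1 / math.sqrt(2)
U = [(s, 0.0, s)]
V = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
c2 = cos_squared_angle(U, V)
theta_deg = math.degrees(math.acos(math.sqrt(c2)))
```

Here `c2` is approximately 0.5, so `theta_deg` is approximately 45, which agrees with the elementary angle between this line and this plane.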

2000 Mathematics Subject Classification. 15A03, 51N20, 15A21, 65F30.

Key words and phrases. canonical angles, angle between subspaces.

The research is supported by QUE-Project V (2003) Math-ITB.



Now let $\theta_1 \le \theta_2 \le \cdots \le \theta_p$ be the canonical angles between $U$ and $V$, which are defined recursively by

$$\cos\theta_1 := \max_{u \in U,\ \|u\|=1}\ \max_{v \in V,\ \|v\|=1} \langle u, v\rangle = \langle u_1, v_1\rangle$$

$$\cos\theta_{i+1} := \max_{u \in U_i,\ \|u\|=1}\ \max_{v \in V_i,\ \|v\|=1} \langle u, v\rangle = \langle u_{i+1}, v_{i+1}\rangle$$

where $U_i$ is the orthogonal complement of $u_i$ relative to $U_{i-1}$ and $V_i$ is the orthogonal complement of $v_i$ relative to $V_{i-1}$ (with $U_0 = U$ and $V_0 = V$). Then we have

Theorem 1 (Risteski & Trenčevski). $\cos^2\theta = \prod_{i=1}^{p} \cos^2\theta_i$.

The proof of Theorem 1 uses matrix theoretic arguments (involving the notions of orthogonal, positive definite, symmetric, and diagonalizable matrices). In this note, we present an alternative approach towards Theorem 1. For the case $p = 2$, we reprove Theorem 1 by using only elementary calculus. For the general case, we prove a statement similar to Theorem 1 by using elementary calculus and linear algebra. As in [7], we replace $\mathbb{R}^n$ by an inner product space of arbitrary dimension. Related results may be found in [9].

2. Main results

In this section, let $(X, \langle\cdot,\cdot\rangle)$ be an inner product space of dimension 2 or higher (possibly infinite), and let $U = \mathrm{span}\{u_1, \dots, u_p\}$ and $V = \mathrm{span}\{v_1, \dots, v_q\}$ be two subspaces of $X$ with $1 \le p \le q < \infty$. Assuming that $\{u_1, \dots, u_p\}$ and $\{v_1, \dots, v_q\}$ are orthonormal, let $\theta$ be the angle between $U$ and $V$, given by

$$\cos^2\theta := \det(M^T M)$$

with $M := [\langle u_i, v_k\rangle]^T$ being a $q \times p$ matrix.

2.1. The case p = 2. For $p = 2$, we have the following reformulation of Theorem 1.

Fact 2. $$\cos\theta = \max_{u \in U,\ \|u\|=1}\ \max_{v \in V,\ \|v\|=1} \langle u, v\rangle \cdot \min_{u \in U,\ \|u\|=1}\ \max_{v \in V,\ \|v\|=1} \langle u, v\rangle.$$

Proof. First note that, for each $u \in X$, $\langle u, v\rangle$ is maximized on $\{v \in V : \|v\| = 1\}$ at $v = \dfrac{\mathrm{proj}_V u}{\|\mathrm{proj}_V u\|}$ and

$$\max_{v \in V,\ \|v\|=1} \langle u, v\rangle = \Big\langle u, \frac{\mathrm{proj}_V u}{\|\mathrm{proj}_V u\|}\Big\rangle = \|\mathrm{proj}_V u\|.$$

Here of course $\mathrm{proj}_V u = \sum_{k=1}^{q} \langle u, v_k\rangle v_k$, denoting the orthogonal projection of $u$ on $V$. Now consider the function $f(u) := \|\mathrm{proj}_V u\|$, $u \in U$, $\|u\| = 1$. To find its extreme values, we examine the function $g := f^2$, given by

$$g(u) = \|\mathrm{proj}_V u\|^2 = \sum_{k=1}^{q} \langle u, v_k\rangle^2, \quad u \in U,\ \|u\| = 1.$$

Writing $u = (\cos t)u_1 + (\sin t)u_2$, we can view $g$ as a function of $t \in [0, \pi]$, given by

$$g(t) = \sum_{k=1}^{q} \big[\cos t\,\langle u_1, v_k\rangle + \sin t\,\langle u_2, v_k\rangle\big]^2.$$

Expanding the series and using trigonometric identities, we can rewrite $g$ as

$$g(t) = \frac{1}{2}\Big[C + \sqrt{A^2 + B^2}\,\cos(2t - \alpha)\Big]$$

where

$$A = \sum_{k=1}^{q} \big[\langle u_1, v_k\rangle^2 - \langle u_2, v_k\rangle^2\big], \quad B = 2\sum_{k=1}^{q} \langle u_1, v_k\rangle\langle u_2, v_k\rangle, \quad C = \sum_{k=1}^{q} \big[\langle u_1, v_k\rangle^2 + \langle u_2, v_k\rangle^2\big],$$

and $\tan\alpha = B/A$. From this expression, we see that $g$ has the maximum value $m := \dfrac{C + \sqrt{A^2 + B^2}}{2}$ and the minimum value $n := \dfrac{C - \sqrt{A^2 + B^2}}{2}$. (Note particularly that $n$ is nonnegative.) Accordingly the extreme values of $f$ are

$$\max_{u \in U,\ \|u\|=1}\ \max_{v \in V,\ \|v\|=1} \langle u, v\rangle = \sqrt{m} \quad\text{and}\quad \min_{u \in U,\ \|u\|=1}\ \max_{v \in V,\ \|v\|=1} \langle u, v\rangle = \sqrt{n}.$$

Now we observe that

$$mn = \frac{C^2 - (A^2 + B^2)}{4} = \sum_{k=1}^{q} \langle u_1, v_k\rangle^2 \sum_{k=1}^{q} \langle u_2, v_k\rangle^2 - \Big[\sum_{k=1}^{q} \langle u_1, v_k\rangle\langle u_2, v_k\rangle\Big]^2 = \det(M^T M) = \cos^2\theta.$$

This completes the proof. □
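The identity $mn = \det(M^T M)$ in the proof above can be checked numerically. The Python sketch below computes $A$, $B$, $C$, $m$ and $n$ from the formulas in the proof and compares $m$ and $n$ with a brute-force scan of $g(t)$ over $[0, \pi]$. The function names and the sample vectors (two planes in $\mathbb{R}^3$ meeting at 60 degrees) are assumptions chosen for illustration only.

```python
import math

def inner(x, y):
    return sum(a * b for a, b in zip(x, y))

def extreme_values(u1, u2, vs):
    """Return (m, n, det(M^T M)) for U = span{u1, u2}, with A, B, C as in the proof."""
    a = [inner(u1, v) for v in vs]  # <u1, v_k>
    b = [inner(u2, v) for v in vs]  # <u2, v_k>
    A = sum(x * x - y * y for x, y in zip(a, b))
    B = 2 * sum(x * y for x, y in zip(a, b))
    C = sum(x * x + y * y for x, y in zip(a, b))
    root = math.hypot(A, B)  # sqrt(A^2 + B^2)
    m, n = (C + root) / 2, (C - root) / 2
    det_mtm = (sum(x * x for x in a) * sum(y * y for y in b)
               - sum(x * y for x, y in zip(a, b)) ** 2)
    return m, n, det_mtm

def g(t, u1, u2, vs):
    """g(t) = ||proj_V u||^2 for u = cos(t) u1 + sin(t) u2."""
    return sum((math.cos(t) * inner(u1, v) + math.sin(t) * inner(u2, v)) ** 2
               for v in vs)

# Hypothetical data: two planes in R^3 meeting at a dihedral angle of 60 degrees.
u1, u2 = (1.0, 0.0, 0.0), (0.0, 0.5, math.sqrt(3) / 2)
vs = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
m, n, det_mtm = extreme_values(u1, u2, vs)
samples = [g(math.pi * k / 2000, u1, u2, vs) for k in range(2001)]
```

For this data the closed-form values are $m = 1$ and $n = 0.25$, so $\cos\theta = \sqrt{mn} = 0.5$, the cosine of the 60 degree angle between the two planes; the sampled maximum and minimum of $g$ agree with $m$ and $n$ up to the grid resolution.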

Geometrically, the value of $\cos\theta$, where $\theta$ is the angle between $U = \mathrm{span}\{u_1, u_2\}$ and $V = \mathrm{span}\{v_1, \dots, v_q\}$ with $2 \le q < \infty$, represents the ratio between the area of the parallelogram spanned by $\mathrm{proj}_V u_1$ and $\mathrm{proj}_V u_2$ and the area of the parallelogram spanned by $u_1$ and $u_2$. The area of the parallelogram spanned by two vectors $x$ and $y$ in $X$ is given by $\big[\|x\|^2\|y\|^2 - \langle x, y\rangle^2\big]^{1/2}$ (see [6]). In particular, when $\{u_1, u_2\}$ and $\{v_1, \dots, v_q\}$ are orthonormal, the area of the parallelogram spanned by $\mathrm{proj}_V u_1$ and $\mathrm{proj}_V u_2$ is

$$\Big[\sum_{k=1}^{q} \langle u_1, v_k\rangle^2 \sum_{k=1}^{q} \langle u_2, v_k\rangle^2 - \Big(\sum_{k=1}^{q} \langle u_1, v_k\rangle\langle u_2, v_k\rangle\Big)^2\Big]^{1/2},$$

while the area of the parallelogram spanned by $u_1$ and $u_2$ is 1.

Our result here says that, in statistical terminology, the value of $\cos\theta$ is the product of the maximum and the minimum correlations between $u \in U$ and its best predictor $\mathrm{proj}_V u \in V$. The maximum correlation is of course the cosine of the first canonical angle, while the minimum correlation is the cosine of the second canonical angle. The fact that the cosine of the last canonical angle is in general the minimum correlation between $u \in U$ and $\mathrm{proj}_V u \in V$ is justified in [10].

2.2. The general case. The main ideas of the proof of Theorem 1 are that the matrix $M^T M$ is diagonalizable, so that its determinant is equal to the product of its eigenvalues, and that the eigenvalues are the squares of the cosines of the canonical angles. The following statement is slightly less sharp than Theorem 1, but much easier to prove (in the sense that no sophisticated arguments are needed).

Theorem 3. $\cos\theta$ is equal to the product of all critical values of $f(u) := \max_{v \in V,\ \|v\|=1} \langle u, v\rangle$, $u \in U$, $\|u\| = 1$.

Proof. For each $u \in X$, $\langle u, v\rangle$ is maximized on $\{v \in V : \|v\| = 1\}$ at $v = \dfrac{\mathrm{proj}_V u}{\|\mathrm{proj}_V u\|}$ and $\max_{v \in V,\ \|v\|=1} \langle u, v\rangle = \|\mathrm{proj}_V u\|$. Now consider the function $g := f^2$, given by

$$g(u) = \|\mathrm{proj}_V u\|^2, \quad u \in U,\ \|u\| = 1.$$

Writing $u = \sum_{i=1}^{p} t_i u_i$, we have

$$\mathrm{proj}_V u = \sum_{k=1}^{q} \langle u, v_k\rangle v_k = \sum_{k=1}^{q} \sum_{i=1}^{p} t_i \langle u_i, v_k\rangle v_k,$$

and so we can view $g$ as $g(t_1, \dots, t_p)$, given by

$$g(t_1, \dots, t_p) := \Big\|\sum_{k=1}^{q} \sum_{i=1}^{p} t_i \langle u_i, v_k\rangle v_k\Big\|^2 = \sum_{k=1}^{q} \Big[\sum_{i=1}^{p} t_i \langle u_i, v_k\rangle\Big]^2,$$

where $\sum_{i=1}^{p} t_i^2 = 1$. In terms of the matrix $M = [\langle u_i, v_k\rangle]^T$, one may write

$$g(t) = t^T M^T M t = |Mt|_q^2, \quad |t|_p = 1,$$

where $t = [t_1 \cdots t_p]^T \in \mathbb{R}^p$ and $|\cdot|_p$ denotes the usual norm in $\mathbb{R}^p$.



To find its critical values, we use the Lagrange method, working with the function

$$h_\lambda(t) := |Mt|_q^2 + \lambda\big(1 - |t|_p^2\big), \quad t \in \mathbb{R}^p.$$

For the critical values, we must have

$$\nabla h_\lambda(t) = 2M^T M t - 2\lambda t = 0,$$

whence

$$M^T M t = \lambda t. \tag{2.1}$$

Multiplying both sides by $t^T$ on the left, we obtain

$$\lambda = t^T M^T M t = g(t),$$

because $t^T t = |t|^2 = 1$. Hence any Lagrange multiplier $\lambda$ is a critical value of $g$. But (2.1) also tells us that $\lambda$ must be an eigenvalue of $M^T M$. From elementary linear algebra, we know that $\lambda$ must satisfy

$$\det(M^T M - \lambda I) = 0,$$

which has in general $p$ roots, $\lambda_1 \ge \cdots \ge \lambda_p \ge 0$, say. Since $M^T M$ is diagonalizable, we conclude that

$$\det(M^T M) = \prod_{i=1}^{p} \lambda_i.$$

Since $\cos^2\theta = \det(M^T M)$ and the critical values of $f = \sqrt{g}$ are the numbers $\sqrt{\lambda_i}$, this proves the assertion. □
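To make the statement of Theorem 3 concrete, the following Python sketch computes the eigenvalues of a small symmetric matrix standing in for $M^T M$ and checks that the product of their square roots squares to the determinant. The eigenvalues are obtained with a cyclic Jacobi rotation method, which is a standard numerical technique substituted here for illustration and is not the argument used in the proof; the matrix `G` and all function names are hypothetical.

```python
import math

def det(a):
    """Determinant by Laplace expansion along the first row (fine for small p)."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j] * det([row[:j] + row[j + 1:] for row in a[1:]])
               for j in range(len(a)))

def jacobi_eigenvalues(sym, sweeps=50, tol=1e-24):
    """Eigenvalues of a real symmetric matrix via cyclic Jacobi rotations."""
    a = [list(row) for row in sym]
    n = len(a)
    for _ in range(sweeps):
        # Stop once the sum of squared off-diagonal entries is negligible.
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Angle chosen so that the rotated matrix has a zero (p, q) entry.
                phi = 0.5 * math.atan2(2 * a[p][q], a[p][p] - a[q][q])
                cs, sn = math.cos(phi), math.sin(phi)
                for k in range(n):  # update columns p and q: A <- A J
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = cs * akp + sn * akq, -sn * akp + cs * akq
                for k in range(n):  # update rows p and q: A <- J^T A
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = cs * apk + sn * aqk, -sn * apk + cs * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)

# Hypothetical M^T M for p = 2 (symmetric, eigenvalues in [0, 1]).
G = [[0.68, -0.32], [-0.32, 0.68]]
lams = jacobi_eigenvalues(G)                        # critical values of g
critical_values = [math.sqrt(max(lam, 0.0)) for lam in lams]  # critical values of f
cos_theta = math.prod(critical_values)
```

For this `G` the eigenvalues are approximately 1 and 0.36, so the critical values of $f$ are approximately 1 and 0.6 and `cos_theta` is approximately 0.6, whose square matches `det(G)`.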

A geometric interpretation of the value of $\cos\theta$ is provided below.

Fact 4. $\cos\theta$ is equal to the volume of the $p$-dimensional parallelepiped spanned by $\mathrm{proj}_V u_i$, $i = 1, \dots, p$.

Proof. Since we are assuming that $\{v_1, \dots, v_q\}$ is orthonormal, we have $\mathrm{proj}_V u_i = \sum_{k=1}^{q} \langle u_i, v_k\rangle v_k$, for each $i = 1, \dots, p$. Thus, for $i, j = 1, \dots, p$, we obtain

$$\langle \mathrm{proj}_V u_i, \mathrm{proj}_V u_j\rangle = \sum_{k=1}^{q} \langle u_i, v_k\rangle\langle u_j, v_k\rangle,$$

whence

$$\det[\langle \mathrm{proj}_V u_i, \mathrm{proj}_V u_j\rangle] = \det(M^T M) = \cos^2\theta.$$

But $\det[\langle \mathrm{proj}_V u_i, \mathrm{proj}_V u_j\rangle]$ is the square of the volume of the $p$-dimensional parallelepiped spanned by $\mathrm{proj}_V u_i$, $i = 1, \dots, p$ (see [6]). □
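The equality between the Gram determinant of the projected vectors and $\det(M^T M)$ can be verified on a small example. The Python sketch below projects an orthonormal pair in $\mathbb{R}^4$ onto a three-dimensional subspace, forms both matrices explicitly and compares their determinants. The vectors and helper names (`proj`, `gram`, `gram_of_M`) are assumptions introduced only for this check.

```python
import math

def inner(x, y):
    return sum(a * b for a, b in zip(x, y))

def det(a):
    """Determinant by Laplace expansion along the first row (fine for small p)."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j] * det([row[:j] + row[j + 1:] for row in a[1:]])
               for j in range(len(a)))

def proj(u, vs):
    """Orthogonal projection of u on span(vs), assuming vs is orthonormal."""
    coeffs = [inner(u, v) for v in vs]
    return [sum(c * v[j] for c, v in zip(coeffs, vs)) for j in range(len(u))]

def gram(xs):
    """Gram matrix [<x_i, x_j>] of a list of vectors."""
    return [[inner(x, y) for y in xs] for x in xs]

def gram_of_M(us, vs):
    """M^T M with entries sum_k <u_i, v_k><u_j, v_k>."""
    c = [[inner(u, v) for v in vs] for u in us]
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in c] for ci in c]

# Hypothetical orthonormal bases in R^4 (p = 2, q = 3).
r = 1 / math.sqrt(2)
vs = [(r, r, 0.0, 0.0), (r, -r, 0.0, 0.0), (0.0, 0.0, 1.0, 0.0)]
us = [(0.6, 0.0, 0.0, 0.8), (0.0, 1.0, 0.0, 0.0)]

vol_sq = det(gram([proj(u, vs) for u in us]))  # squared volume of projected parallelepiped
det_mtm = det(gram_of_M(us, vs))               # cos^2(theta)
```

In this example both determinants are approximately 0.36, so the projected parallelogram has area approximately 0.6, which is the value of $\cos\theta$.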



3. Concluding remarks

Statisticians use canonical angles as measures of dependency of one set of random variables on another. The drawback is that finding these angles between two given subspaces is rather involved (see, for example, [2, 4]). Many researchers often use only the first canonical angle for estimation purposes (see, for instance, [5]). Geometrically, however, the first canonical angle is not a good measurement for approximation (in $\mathbb{R}^3$, for instance, the first canonical angle between two arbitrary two-dimensional subspaces is 0 even when the two subspaces do not coincide).
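The remark about $\mathbb{R}^3$ can be illustrated with two distinct planes through the origin, which always share a line. The Python sketch below uses the closed-form eigenvalues of the $2 \times 2$ matrix $M^T M$ (whose eigenvalues are the squared cosines of the canonical angles, as recalled in Section 2.2) to obtain both canonical cosines. The function name and the sample planes are assumptions made for this example.

```python
import math

def inner(x, y):
    return sum(a * b for a, b in zip(x, y))

def canonical_cosines_p2(u1, u2, vs):
    """Cosines of the two canonical angles between span{u1, u2} and span(vs).

    Both bases are assumed orthonormal; the eigenvalues of the 2 x 2 matrix
    M^T M are computed from its trace and determinant.
    """
    a = [inner(u1, v) for v in vs]
    b = [inner(u2, v) for v in vs]
    g11 = sum(x * x for x in a)
    g22 = sum(y * y for y in b)
    g12 = sum(x * y for x, y in zip(a, b))
    tr, dt = g11 + g22, g11 * g22 - g12 * g12
    root = math.sqrt(max(tr * tr - 4 * dt, 0.0))
    lam1, lam2 = (tr + root) / 2, (tr - root) / 2
    # Clamp against rounding so that the cosines stay in [0, 1].
    return math.sqrt(min(lam1, 1.0)), math.sqrt(max(lam2, 0.0))

# Hypothetical planes in R^3 sharing the e1 axis and meeting at 60 degrees.
u1, u2 = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
vs = [(1.0, 0.0, 0.0), (0.0, 0.5, math.sqrt(3) / 2)]
c1, c2 = canonical_cosines_p2(u1, u2, vs)
first_angle = math.acos(c1)   # first canonical angle
geometric_cos = c1 * c2       # product of the canonical cosines
```

Here `first_angle` is 0 although the planes differ, whereas `geometric_cos` is approximately 0.5 and therefore detects the 60 degree tilt between them.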

The work of Risteski and Trenčevski [11] suggests that if we multiply the cosines of the canonical angles (instead of computing each of them separately), we get the cosine of some value $\theta$ that can be considered as the ‘geometrical’ angle between the two given subspaces. In general, as is shown in [7], the value of $\cos\theta$ represents the ratio between the volume of the parallelepiped spanned by the projection of the basis vectors of the lower dimension subspace on the higher dimension subspace and the volume of the parallelepiped spanned by the basis vectors of the lower dimension subspace. (Thus, the notion of angles between two subspaces developed here can be thought of as a generalization of the notion of angles in trigonometry taught at schools.) What is particularly nice here is that we have an explicit formula for $\cos\theta$ in terms of the basis vectors of the two subspaces.

References

[1] T.W. Anderson, An Introduction to Multivariate Statistical Analysis, John Wiley & Sons, Inc., New York, 1958.

[2] Å. Björck and G.H. Golub, “Numerical methods for computing angles between linear subspaces”, Math. Comp. 27 (1973), 579–594.

[3] C. Davis and W. Kahan, “The rotation of eigenvectors by a perturbation. III”, SIAM J. Numer. Anal. 7 (1970), 1–46.

[4] Z. Drmač, “On principal angles between subspaces of Euclidean space”, SIAM J. Matrix Anal. Appl. 22 (2000), 173–194 (electronic).

[5] S. Fedorov, “Angle between subspaces of analytic and antianalytic functions in weighted L2 space on a boundary of a multiply connected domain” in Operator Theory, System Theory and Related Topics, Beer-Sheva/Rehovot, 1997, 229–256.

[6] F.R. Gantmacher, The Theory of Matrices, Vol. I, Chelsea Publ. Co., New York (1960), 247–256.

[7] H. Gunawan, O. Neswan and W. Setya-Budhi, “A formula for angles between two subspaces of inner product spaces”, submitted.

[8] A.V. Knyazev and M.E. Argentati, “Principal angles between subspaces in an A-based scalar product: algorithms and perturbation estimates”, SIAM J. Sci. Comput. 23 (2002), 2008–2040.

[9] J. Miao and A. Ben-Israel, “Product cosines of angles between subspaces”, Linear Algebra Appl. 237/238 (1996), 71–81.

[10] V. Rakočević and H.K. Wimmer, “A variational characterization of canonical angles between subspaces”, J. Geom. 78 (2003), 122–124.

[11] I.B. Risteski and K.G. Trenčevski, “Principal values and principal subspaces of two subspaces of vector spaces with inner product”, Beiträge Algebra Geom. 42 (2001), 289–300.

[12] H.K. Wimmer, “Canonical angles of unitary spaces and perturbations of direct complements”, Linear Algebra Appl. 287 (1999), 373–379.

Department of Mathematics, Bandung Institute of Technology, Bandung 40132, Indonesia

E-mail address: hgunawan, oneswan@dns.math.itb.ac.id
