08.10.2016 Views

Foundations of Data Science


and has coefficient $(-1)^{n-1}(a_{11} + a_{22} + \cdots + a_{nn})$. Now

$$(-1)^n \prod_{i=1}^{n} (\lambda - \lambda_i) = (-1)^n (\lambda - \lambda_1)(\lambda - \lambda_2) \cdots (\lambda - \lambda_n) = (-1)^n \left( \lambda^n - (\lambda_1 + \lambda_2 + \cdots + \lambda_n)\lambda^{n-1} + \cdots \right)$$

Therefore, equating coefficients, $\lambda_1 + \lambda_2 + \cdots + \lambda_n = a_{11} + a_{22} + \cdots + a_{nn} = \mathrm{tr}(A)$.
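The coefficient comparison can be checked numerically. The sketch below is only an illustration: the matrix `A` is a made-up example, and it relies on NumPy's `np.poly`, which for a square matrix returns the coefficients of $\det(\lambda I - A)$, so a factor $(-1)^n$ is applied to match the sign convention $\det(A - \lambda I)$ used here.

```python
import numpy as np

# Hypothetical example matrix (not from the text); any square matrix works.
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [1.0, 0.0, 4.0]])
n = A.shape[0]

# Coefficients of det(lambda*I - A), highest power first,
# so coeffs[1] multiplies lambda^(n-1).
coeffs = np.poly(A)

# det(A - lambda*I) = (-1)^n * det(lambda*I - A): rescale to the text's convention.
coeff_text = (-1) ** n * coeffs[1]

# Eigenvalues may be complex for a non-symmetric matrix; their sum is still real.
eigenvalues = np.linalg.eigvals(A)
trace_A = np.trace(A)

print("lambda^(n-1) coefficient:", coeff_text)
print("(-1)^(n-1) * tr(A):      ", (-1) ** (n - 1) * trace_A)
print("sum of eigenvalues:      ", eigenvalues.sum().real)
```

A non-symmetric matrix is used on purpose, since the characteristic polynomial argument does not need symmetry.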

Note that in general $(\mathrm{tr}(A))^2 \neq \mathrm{tr}(A^2)$. For example, $A = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}$ has trace 3, while $A^2 = \begin{pmatrix} 1 & 0 \\ 0 & 4 \end{pmatrix}$ has trace $5 \neq 9$. However, $\mathrm{tr}(A^2) = \lambda_1^2 + \lambda_2^2 + \cdots + \lambda_n^2$. To see this, observe that $A^2 = (V^T D V)^2 = V^T D^2 V$, since $V V^T = I$. Thus, the eigenvalues of $A^2$ are the squares of the eigenvalues of $A$.
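Both statements can be confirmed with a few lines of NumPy. This is a minimal sketch: the first part uses the $2 \times 2$ example above, while the random symmetric matrix `S` and its seed are assumptions introduced only for the check.

```python
import numpy as np

# The 2x2 example: (tr A)^2 differs from tr(A^2).
A = np.array([[1.0, 0.0],
              [0.0, 2.0]])
trace_squared = np.trace(A) ** 2      # (1 + 2)^2
trace_of_square = np.trace(A @ A)     # 1 + 4

# A random symmetric matrix (illustrative; seed chosen arbitrarily).
rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
S = (M + M.T) / 2

# eigvalsh is meant for symmetric matrices and returns real eigenvalues.
eigs = np.linalg.eigvalsh(S)
sum_of_squares = np.sum(eigs ** 2)
trace_S2 = np.trace(S @ S)

print(trace_squared, trace_of_square)
print(sum_of_squares, trace_S2)
```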

Alternative proof that $\mathrm{tr}(A) = \lambda_1 + \lambda_2 + \cdots + \lambda_n$. Suppose the spectral decomposition of $A$ is $A = P D P^T$. We have

$$\mathrm{tr}(A) = \mathrm{tr}(P D P^T) = \mathrm{tr}(D P^T P) = \mathrm{tr}(D) = \lambda_1 + \lambda_2 + \cdots + \lambda_n.$$

Lemma 12.27 If $A$ is an $n \times m$ matrix and $B$ is an $m \times n$ matrix, then $\mathrm{tr}(AB) = \mathrm{tr}(BA)$.
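Lemma 12.27 can be spot-checked numerically for rectangular matrices, where $AB$ is $n \times n$ and $BA$ is $m \times m$. The sizes `n = 3`, `m = 5` and the random entries below are arbitrary choices for illustration, not values from the text.

```python
import numpy as np

rng = np.random.default_rng(1)   # arbitrary seed
n, m = 3, 5                      # illustrative sizes
A = rng.standard_normal((n, m))
B = rng.standard_normal((m, n))

tr_AB = np.trace(A @ B)          # trace of an n x n matrix
tr_BA = np.trace(B @ A)          # trace of an m x m matrix

# The same quantity as an explicit double sum over a_ij * b_ji.
double_sum = sum(A[i, j] * B[j, i] for i in range(n) for j in range(m))

print(tr_AB, tr_BA, double_sum)
```

The double sum runs over $i = 1, \ldots, n$ and $j = 1, \ldots, m$, which is why the two traces agree even though the two products have different sizes.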

Proof:

$$\mathrm{tr}(AB) = \sum_{i=1}^{n} \sum_{j=1}^{m} a_{ij} b_{ji} = \sum_{j=1}^{m} \sum_{i=1}^{n} b_{ji} a_{ij} = \mathrm{tr}(BA)$$

Pseudo inverse

Let $A$ be an $n \times m$ rank $r$ matrix and let $A = U \Sigma V^T$ be the singular value decomposition of $A$. Let $\Sigma' = \mathrm{diag}\left(\frac{1}{\sigma_1}, \ldots, \frac{1}{\sigma_r}, 0, \ldots, 0\right)$, where $\sigma_1, \ldots, \sigma_r$ are the nonzero singular values of $A$. Then $A' = V \Sigma' U^T$ is the pseudo inverse of $A$. It minimizes $\|AX - I\|_F$ over all $X$; the minimizer is unique when $A$ has full column rank, and otherwise $A'$ is the minimizer of smallest Frobenius norm.
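The construction translates directly into code. The following is a minimal sketch, not a definitive implementation: the function name `pseudo_inverse`, the relative tolerance `rel_tol` used to decide which singular values count as nonzero, and the rank-1 test matrix are all assumptions made for illustration. NumPy's own `np.linalg.pinv` serves as a cross-check.

```python
import numpy as np

def pseudo_inverse(A, rel_tol=1e-10):
    """Build V * Sigma' * U^T from the SVD of A (illustrative helper)."""
    # Compact SVD: A = U @ diag(s) @ Vt.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Invert only singular values treated as nonzero; leave the rest at 0.
    inv_s = np.zeros_like(s)
    keep = s > rel_tol * s.max()
    inv_s[keep] = 1.0 / s[keep]
    return Vt.T @ np.diag(inv_s) @ U.T

# Hypothetical 3 x 2 matrix of rank 1 (second column is twice the first).
A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])

A_pinv = pseudo_inverse(A)
reference = np.linalg.pinv(A, rcond=1e-10)

# Frobenius norm of A X - I at X = A'.
residual = np.linalg.norm(A @ A_pinv - np.eye(A.shape[0]), "fro")

print(A_pinv)
print(np.allclose(A_pinv, reference), residual)
```

A rank-deficient matrix is chosen deliberately so that the zero entries of $\Sigma'$ actually come into play; with a full-rank matrix every singular value would be inverted.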

Second eigenvector<br />

Suppose the eigenvalues of a matrix are $\lambda_1 \geq \lambda_2 \geq \cdots$. The second eigenvalue, $\lambda_2$, plays an important role for matrices representing graphs. It may be the case that $|\lambda_n| > |\lambda_2|$.
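A small example shows how $|\lambda_n| > |\lambda_2|$ can occur. The sketch below uses the adjacency matrix of a 6-cycle, a regular graph of degree 2; this graph is chosen here for illustration and does not come from the text. Because the 6-cycle is bipartite, its smallest adjacency eigenvalue is $-2$.

```python
import numpy as np

# Adjacency matrix of the cycle on 6 vertices (illustrative choice).
n = 6
A = np.zeros((n, n))
for i in range(n):
    j = (i + 1) % n
    A[i, j] = A[j, i] = 1.0

# Real eigenvalues of the symmetric matrix, sorted in decreasing order.
eigs = np.sort(np.linalg.eigvalsh(A))[::-1]
lambda_1, lambda_2, lambda_n = eigs[0], eigs[1], eigs[-1]

print(np.round(eigs, 6))
print(abs(lambda_n) > abs(lambda_2))
```

Here $\lambda_1$ equals the degree, and the smallest eigenvalue is larger in absolute value than $\lambda_2$.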

Why is the second eigenvalue so important? Consider partitioning the vertices of a regular degree $d$ graph $G = (V, E)$ into two blocks of equal size so as to minimize the number of edges between the two blocks. Assign value $+1$ to the vertices in one block and $-1$ to the vertices in the other block. Let $x$ be the vector whose components are the $\pm 1$ values assigned to the vertices. If two vertices, $i$ and $j$, are in the same block, then $x_i$ and $x_j$ are both $+1$ or both $-1$ and $(x_i - x_j)^2 = 0$. If vertices $i$ and $j$ are in different blocks, then $(x_i - x_j)^2 = 4$. Thus, partitioning the vertices into two blocks so as to minimize the edges
