08.10.2016 Views

Foundations of Data Science


Figure 3.2: The SVD decomposition of an n × d matrix, $A = U D V^T$, where $A$ is $n \times d$, $U$ is $n \times r$, $D$ is $r \times r$, and $V^T$ is $r \times d$.

the singular values of $A$. For any matrix $A$, the sequence of singular values is unique, and if the singular values are all distinct, then the sequence of singular vectors is unique up to signs. However, when some set of singular values are equal, the corresponding singular vectors span some subspace. Any set of orthonormal vectors spanning this subspace can be used as the singular vectors.
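The two uniqueness statements above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the matrices `A` and `B` and all variable names are illustrative and do not come from the text. It flips the sign of one singular vector pair, and then uses a multiple of the identity (whose singular values are all equal) to show that a different orthonormal basis also gives a valid decomposition.

```python
import numpy as np

rng = np.random.default_rng(0)

# Case 1: sign ambiguity. Negating u_1 and v_1 together leaves A unchanged.
A = rng.standard_normal((6, 4))
U, s, Vt = np.linalg.svd(A, full_matrices=False)
U_flipped, Vt_flipped = U.copy(), Vt.copy()
U_flipped[:, 0] *= -1
Vt_flipped[0, :] *= -1
sign_flip_ok = np.allclose(U_flipped @ np.diag(s) @ Vt_flipped, A)

# Case 2: equal singular values. B = 2I has all singular values equal to 2,
# so any orthonormal basis Q of the whole space can serve as singular vectors.
B = 2.0 * np.eye(3)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))  # Q has orthonormal columns
rotated_ok = np.allclose(Q @ (2.0 * np.eye(3)) @ Q.T, B)

print(sign_flip_ok, rotated_ok)
```

In the second case both `U = V = I` and `U = V = Q` reproduce `B`, which is the freedom the text describes for a subspace belonging to equal singular values.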

3.5 Best Rank-k Approximations<br />

Let $A$ be an $n \times d$ matrix and think of the rows of $A$ as $n$ points in $d$-dimensional space. Let

$$A = \sum_{i=1}^{r} \sigma_i u_i v_i^T$$

be the SVD of $A$. For $k \in \{1, 2, \ldots, r\}$, let

$$A_k = \sum_{i=1}^{k} \sigma_i u_i v_i^T$$

be the sum truncated after $k$ terms. It is clear that $A_k$ has rank $k$. We show that $A_k$ is the best rank-$k$ approximation to $A$, where error is measured in the Frobenius norm. Geometrically, this says that $v_1, \ldots, v_k$ define the $k$-dimensional space minimizing the sum of squared distances of the points to the space. To see why, we need the following lemma.
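Before turning to the lemma, the truncated sum can be tried out numerically. The sketch below assumes NumPy; the helper `rank_k_approximation`, the random test matrix, and the comparison matrix `B` are hypothetical names introduced for illustration only. It builds $A_k$ from the first $k$ singular triples, confirms its rank, and compares its Frobenius error against an arbitrary matrix of rank at most $k$.

```python
import numpy as np

def rank_k_approximation(A, k):
    """Return the sum of the first k terms sigma_i * u_i * v_i^T of the SVD of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Scale the first k columns of U by the singular values, then multiply by V_k^T.
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

rng = np.random.default_rng(1)
A = rng.standard_normal((8, 5))
k = 2

A_k = rank_k_approximation(A, k)
singular_values = np.linalg.svd(A, compute_uv=False)

rank_of_A_k = np.linalg.matrix_rank(A_k)
frobenius_error = np.linalg.norm(A - A_k, "fro")
# The squared Frobenius error equals the sum of the discarded sigma_i^2.
expected_error = np.sqrt(np.sum(singular_values[k:] ** 2))

# An arbitrary matrix of rank at most k, for comparison.
B = rng.standard_normal((8, k)) @ rng.standard_normal((k, 5))
random_error = np.linalg.norm(A - B, "fro")

print(rank_of_A_k, frobenius_error, expected_error, random_error)
```

A single random competitor does not prove optimality, of course; it only illustrates the claim that no matrix of rank at most $k$ comes closer to $A$ than $A_k$ in the Frobenius norm.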

Lemma 3.5 The rows of $A_k$ are the projections of the rows of $A$ onto the subspace $V_k$ spanned by the first $k$ singular vectors of $A$.
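As a quick numerical sanity check of the lemma's statement (not a substitute for the proof), the sketch below assumes NumPy and uses a random matrix; `V_basis` is a hypothetical name for the $d \times k$ matrix whose columns are $v_1, \ldots, v_k$. Projecting every row $a$ of $A$ onto the span of these vectors amounts to computing $a \, V_{\text{basis}} V_{\text{basis}}^T$, which can be done for all rows at once.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((7, 4))
k = 2

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# A_k: the SVD sum truncated after k terms.
A_k = (U[:, :k] * s[:k]) @ Vt[:k, :]

# Columns of V_basis are the first k right singular vectors v_1, ..., v_k.
V_basis = Vt[:k, :].T

# Row i of this product is sum_j (a_i . v_j) v_j^T, the projection of row a_i.
projected_rows = A @ V_basis @ V_basis.T

lemma_holds = np.allclose(projected_rows, A_k)
print(lemma_holds)
```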

Proof: Let $a$ be an arbitrary row vector. Since the $v_i$ are orthonormal, the projection of the vector $a$ onto $V_k$ is given by $\sum_{i=1}^{k} (a \cdot v_i) v_i^T$. Thus, the matrix whose rows are
