
CHAPTER II

DIMENSION

In the present chapter we investigate interrelations between the notions of linear independence, spanning sets, bases and dimensions, which are central to the basic theory of linear algebra. Among other things, we establish the crucial identity relating the dimensions of V, T(V) and ker T for a linear transformation T defined on V.

§1. Linear Independence

1.1. Linear independence (or its opposite, linear dependence) is by far the most important concept in describing relations among vectors. A thorough understanding is a "must" for working with linear algebra. The reader is advised to study this concept very carefully, and to contemplate its meaning as many times as possible.

Consider a finite set of vectors v1, v2, . . . , vr in a linear space V. By a linear relation among these vectors we mean an identity of the form

a1v1 + a2v2 + · · · + arvr = 0,    (1.1.1)

where a1, a2, . . . , ar are some scalars. We give a quick example: the linear relation 2v1 + v2 + (−1)v3 + 0 v4 = 0 holds for v1 = (1, 1), v2 = (1, 0), v3 = (3, 2) and v4 = (5, 7), as we can check. (1.1.1) is certainly true when all ak (1 ≤ k ≤ r) are zeros:

0 v1 + 0 v2 + · · · + 0 vr = 0    (1.1.2)

holds trivially. Let us call (1.1.2) the trivial linear relation among v1, v2, . . . , vr. If the linear relation given as (1.1.1) is nontrivial, that is, it is not of the form (1.1.2), then at least one of the scalars a1, a2, . . . , ar is nonzero, that is, ak ≠ 0 for some k.

Definition. We say that vectors v1, v2, . . . , vr are linearly dependent if there is a nontrivial linear relation among them, that is, a1v1 + a2v2 + · · · + arvr = 0 holds for some a1, a2, . . . , ar, not all equal to zero.

To make our language flexible, we also say a set of vectors v1, v2, . . . , vr is linearly dependent if the vectors v1, v2, . . . , vr are linearly dependent.



In order to prove that a given set of vectors v1, v2, . . . , vr is linearly dependent, we may write down a1v1 + a2v2 + · · · + arvr = 0 and treat it as an equation with a1, a2, . . . , ar as unknowns and try to find a nontrivial solution of it. To give a quick example, let us check if the vectors v1 = (i, 1), v2 = (1, −i) in C^2 are linearly dependent. We begin by writing down a1v1 + a2v2 = 0, or a1(i, 1) + a2(1, −i) = (0, 0), which gives ia1 + a2 = 0 and a1 − ia2 = 0. This system of equations in a1, a2 has a nontrivial solution, such as a1 = i and a2 = 1. So the given vectors are linearly dependent.
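For readers who like to double-check such computations by machine, here is a minimal sketch using SymPy (our choice of tool; the text itself works by hand). A nontrivial null-space vector of the matrix whose columns are v1 and v2 is exactly a nontrivial linear relation among them.

```python
# A quick computational check of the example above (a sketch using SymPy,
# which is an assumption -- the text itself does everything by hand).
from sympy import Matrix, I

# Put v1 = (i, 1) and v2 = (1, -i) as the columns of a matrix.
M = Matrix([[I, 1],
            [1, -I]])

# A nontrivial vector in the null space gives a nontrivial linear relation
# a1*v1 + a2*v2 = 0, i.e. the columns are linearly dependent.
print(M.nullspace())   # [Matrix([[I], [1]])] -> a1 = i, a2 = 1 (up to scale)
```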

The opposite of linear dependence is linear independence. A set of vectors is linearly independent if this set is not linearly dependent. Recall that v1, v2, . . . , vr are linearly dependent if there is a nontrivial linear relation among them. So "vectors v1, v2, . . . , vr are linearly independent" means that there is no nontrivial linear relation among them; in other words, the only possible linear relation among them is the trivial one. Thus, if we have a linear relation a1v1 + a2v2 + · · · + arvr = 0 among linearly independent vectors v1, v2, . . . , vr, then all ak (1 ≤ k ≤ r) must be zeroes. Thus we have arrived at

Definition. We say that vectors v1, v2, . . . , vr are linearly independent if any linear relation among them, say

a1v1 + a2v2 + · · · + arvr = 0,

is necessarily trivial, that is, a1 = a2 = · · · = ar = 0.

To prove that given vectors v1, v2, . . . , vr are linearly independent, we almost always start with something like

"Suppose we have a1v1 + a2v2 + · · · + arvr = 0."

After that, we try to figure out why a1, a2, . . . , ar all must be equal to zero. In the next subsection, we give a few examples to see how it works.

1.2. First we give a quick example to show the steps. We are asked to prove that the vectors v1 = (1, 0), v2 = (1, 2) in R^2 are linearly independent. We begin with the sentence "Suppose that a1v1 + a2v2 = 0". Then we proceed by rewriting a1v1 + a2v2 = 0 as a1(1, 0) + a2(1, 2) = (0, 0), or (a1 + a2, 2a2) = (0, 0), which gives us a1 + a2 = 0 and 2a2 = 0, from which we deduce a1 = a2 = 0. Here we emphasize that the first sentence in our proof must be correct, no matter how hard or how easy the given problem is.

Example 1.2.1. Prove that vectors u and v in a vector space V are linearly independent if u + v and u − v are linearly independent.



♠ Aside. What is supposed to be proved? Well, the linear independence of u and v. So we should start with something like "Assume au + bv = 0". Then we try to derive a = 0 and b = 0. A common error is to start with "Assume a(u + v) + b(u − v) = 0", which results in receiving no credit for the rest of the work. ♠

Solution. Assume au + bv = 0. Let p = u + v and q = u − v. Then, by assumption, p and q are linearly independent. Notice that 2u = p + q and 2v = p − q. Hence

0 = 2(au + bv) = a(2u) + b(2v) = a(p + q) + b(p − q) = (a + b)p + (a − b)q.

Since p, q are linearly independent, a + b = 0 and a − b = 0. So a = b = 0. Done.

Example 1.2.2. Let T be a linear operator on V and let v be in V. Assume that T^3 v = 0 but T^2 v ≠ 0. Prove that v, Tv, T^2 v are linearly independent.

Solution. Assume

a1 v + a2 Tv + a3 T^2 v = 0.    (1.2.1)

Applying T to this identity, we have T(a1 v + a2 Tv + a3 T^2 v) = T0 = 0. Rewrite the last identity as a1 Tv + a2 T^2 v + a3 T^3 v = 0. Since T^3 v = 0, we have

a1 Tv + a2 T^2 v = 0.    (1.2.2)

Applying T to the last identity again, we have a1 T^2 v + a2 T^3 v = 0, or a1 T^2 v = 0. Since T^2 v ≠ 0, we must have a1 = 0. Thus (1.2.2) becomes a2 T^2 v = 0. Since T^2 v ≠ 0, we must have a2 = 0. Now (1.2.1) becomes a3 T^2 v = 0. Since T^2 v ≠ 0, we must have a3 = 0. We have shown that a1 = a2 = a3 = 0. Hence v, Tv, T^2 v are linearly independent.

Remark. More generally, if T^{m+1} v = 0 but T^m v ≠ 0, then the vectors

vm = v, vm−1 = Tv, vm−2 = T^2 v, . . . , v1 = T^{m−1} v, v0 = T^m v

are linearly independent. This remark will be referred to in the future, and the labeling of the vectors here is compatible with the discussion there.
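A small numerical illustration of Example 1.2.2 may also help. The matrix N below is our own concrete choice of such an operator (a nilpotent shift matrix with N^3 = 0 and N^2 v ≠ 0); it is not taken from the text.

```python
# A numerical illustration of Example 1.2.2 (a sketch; the matrix N is our
# own choice of operator, not from the text).
import numpy as np

N = np.array([[0., 1., 0.],
              [0., 0., 1.],
              [0., 0., 0.]])   # shift matrix: N^3 = 0, N^2 != 0
v = np.array([0., 0., 1.])     # N^2 v = e1 != 0, N^3 v = 0

vectors = np.column_stack([v, N @ v, N @ N @ v])
# Rank 3 means v, Nv, N^2 v are linearly independent, as the example proves.
print(np.linalg.matrix_rank(vectors))   # 3
```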

Example 1.2.3. Let T be a linear operator on a vector space V and let u and v be nonzero vectors in V such that Tu = 2u and Tv = 3v. Prove that u, v are linearly independent.

Solution. Assume

au + bv = 0.    (1.2.3)



Applying T to this identity, we have T(au + bv) = 0, or aTu + bTv = 0. From Tu = 2u and Tv = 3v we get

2au + 3bv = 0.    (1.2.4)

Multiplying (1.2.3) by 2 and then subtracting (1.2.4), we get bv = 0. Since v is nonzero, we must have b = 0. Thus (1.2.3) becomes au = 0. Since u is nonzero, we must have a = 0. We have shown a = b = 0. Hence u, v are linearly independent.

Example 1.2.4. Let c1, c2, . . . , cn, cn+1 be n + 1 distinct complex numbers and consider the vectors vk = (c_1^k, c_2^k, . . . , c_n^k, c_{n+1}^k) in C^{n+1} for k = 0, 1, 2, . . . , n:

v0 = (1, 1, 1, . . . , 1, 1)
v1 = (c1, c2, . . . , cn, cn+1)
v2 = (c_1^2, c_2^2, . . . , c_n^2, c_{n+1}^2)
. . .
vn = (c_1^n, c_2^n, . . . , c_n^n, c_{n+1}^n)

Prove that v0, v1, v2, . . . , vn are linearly independent.

Proof: Assume a0 v0 + a1 v1 + · · · + an vn = 0. For each j with 1 ≤ j ≤ n + 1, write out the jth component of this vector identity:

a0 + a1 c_j + a2 c_j^2 + · · · + an c_j^n = 0.

This shows that the polynomial a0 + a1 x + a2 x^2 + · · · + an x^n has n + 1 roots c1, c2, . . . , cn+1. But a polynomial of degree at most n cannot have n + 1 roots, unless that polynomial is the zero polynomial. Therefore a0, a1, . . . , an must all vanish. This shows v0, v1, v2, . . . , vn are linearly independent.
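Example 1.2.4 is essentially the statement that a Vandermonde matrix built on distinct nodes has full rank. A quick sketch in SymPy, with concrete numbers of our own choosing:

```python
# Example 1.2.4 in miniature (a sketch): for distinct nodes c_1, ..., c_{n+1},
# the vectors v_k = (c_1^k, ..., c_{n+1}^k) are the rows of a Vandermonde-type
# matrix, which has full rank exactly when the nodes are distinct.
from sympy import Matrix

c = [0, 1, -1, 2]                                     # n + 1 = 4 distinct numbers
V = Matrix([[cj**k for cj in c] for k in range(4)])   # row k is v_k

print(V.rank())   # 4 -> v_0, v_1, v_2, v_3 are linearly independent
```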

1.3. Given a finite set of vectors v1, v2, . . . , vr in V, we can construct a subspace containing these vectors by collecting all linear combinations of these vectors. Here, by a linear combination of v1, v2, . . . , vr we mean a vector of the form

α1v1 + α2v2 + · · · + αrvr    (1.3.1)

for certain scalars α1, α2, . . . , αr. A quick example: 4 + 2x − x^2 is a linear combination of 2 + 3x and 4x + x^2 because 4 + 2x − x^2 = 2(2 + 3x) + (−1)(4x + x^2), which can be easily checked. (For the moment let us not worry about how to figure out this identity.) One can think of a linear combination of the form (1.3.1) as a "combo soup" with vk as its kth ingredient and αk as the amount of this ingredient (1 ≤ k ≤ r).
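In fact, figuring out such an identity is just a matter of solving a small linear system for the coefficients. A sketch, assuming SymPy; the matrix columns below are the coefficient vectors of 2 + 3x and 4x + x^2 in the basis 1, x, x^2.

```python
# How one might "figure out this identity" (the text defers this): expressing
# 4 + 2x - x^2 in terms of 2 + 3x and 4x + x^2 is a linear system in the
# unknown coefficients. A sketch using SymPy's solver.
from sympy import Matrix, linsolve, symbols

a, b = symbols('a b')
# Columns: coefficient vectors of 2 + 3x and 4x + x^2 in the basis 1, x, x^2;
# the right-hand side is the coefficient vector of 4 + 2x - x^2.
A = Matrix([[2, 0],
            [3, 4],
            [0, 1]])
target = Matrix([4, 2, -1])

print(linsolve((A, target), a, b))   # {(2, -1)} -> 2*(2+3x) + (-1)*(4x+x^2)
```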



The set L of all linear combinations of given vectors v1, v2, . . . , vr is indeed a subspace of V. To prove this, let us take two such linear combinations, say

v = α1v1 + · · · + αrvr and w = β1v1 + · · · + βrvr.

We have to show that αv + βw is also in L, where α and β are arbitrary scalars. Now

αv + βw = (αα1 + ββ1)v1 + · · · + (ααr + ββr)vr,

which is again a linear combination of the given vectors. Hence αv + βw ∈ L. This furnishes the proof that L is a subspace. This subspace is called the (linear) span of the set {v1, v2, . . . , vr}, or the subspace spanned by v1, v2, . . . , vr. You may write

L = span{v1, v2, . . . , vr}.

Thus, we say that vectors v1, v2, . . . , vr span the whole space V, or that they form a spanning set, if V = span{v1, v2, . . . , vr}, that is, every vector in V is a linear combination of v1, v2, . . . , vr. A vector space V is said to be finite dimensional, and we write dim V < ∞, if it is spanned by a finite set of vectors; otherwise it is said to be infinite dimensional and we write dim V = ∞.

A subspace of V can be described as a nonempty subset W of V closed under taking linear combinations, that is, linear combinations of vectors in W are also vectors in W. The span of vectors v1, v2, . . . , vr in a vector space V can be described as the smallest subspace containing v1, v2, . . . , vr.

Recall from subsection 3.5 of Chapter I that, given a linear operator T on a vector space V, a subspace M of V is said to be invariant for T if T(M) ⊆ M, that is, every vector v in M is sent by T to a vector (namely Tv) in M. In case M is spanned by v1, v2, . . . , vr, that is, M = span{v1, v2, . . . , vr}, to show that T(M) ⊆ M it is enough to check that Tvk is in M for k = 1, 2, . . . , r. Indeed, if v is in M, then we can write v = a1v1 + a2v2 + · · · + arvr for some scalars a1, a2, . . . , ar and hence

Tv = a1 Tv1 + a2 Tv2 + · · · + ar Tvr

is in M because it is a linear combination of the vectors Tv1, Tv2, . . . , Tvr, which are in M.

Example 1.3.1. Let F(R) be the space of differentiable functions of one variable x. Define operators D and Ta (here a is any real number) by putting D(f(x)) = f′(x) and Ta(f(x)) = f(x + a). Here D is a differential operator and Ta is a translation operator.



They are among the most important operators used in differential equations and difference equations. We can check that each of the following (finite dimensional) subspaces is invariant for both D and Ta:

V1 = span{1, x, x^2}, V2 = span{e^{kx}, xe^{kx}, x^2 e^{kx}}, V3 = span{e^{kx} cos bx, e^{kx} sin bx},
V4 = span{cos bx, sin bx, x cos bx, x sin bx, x^2 cos bx, x^2 sin bx}.

Here k and b are any real constants. Here we only check that V3 is invariant for both D and Ta, and we leave the rest of the verification to the reader as an exercise (see Drill 7).

Indeed, by the product rule of differentiation, we have

D(e^{kx} cos bx) = D(e^{kx}) cos bx + e^{kx} D(cos bx) = k e^{kx} cos bx − b e^{kx} sin bx,
D(e^{kx} sin bx) = D(e^{kx}) sin bx + e^{kx} D(sin bx) = k e^{kx} sin bx + b e^{kx} cos bx,

which are linear combinations of e^{kx} cos bx and e^{kx} sin bx and hence are in V3. This proves D(V3) ⊆ V3. Next,

Ta(e^{kx} cos bx) = e^{k(x+a)} cos(b(x + a)) = e^{kx+ka} cos(ab + bx)
  = e^{ak} e^{kx} (cos ab cos bx − sin ab sin bx)
  = e^{ak} cos ab (e^{kx} cos bx) − e^{ak} sin ab (e^{kx} sin bx),

which is a linear combination of e^{kx} cos bx and e^{kx} sin bx and hence is in V3. Similarly we can check that Ta(e^{kx} sin bx) is in V3. Hence Ta(V3) ⊆ V3. (We recommend that the reader review Examples 3.5.1 to 3.5.4 in the last chapter. In each of these examples the choice of the invariant subspace is suggested by our present example.)
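The differentiation computation above can also be confirmed symbolically. A minimal sketch using SymPy (our tooling choice), keeping k and b symbolic as in the text:

```python
# Checking symbolically that D maps e^{kx} cos(bx) to a combination of the
# two spanning functions of V3 (a sketch; SymPy is our choice of tool).
from sympy import symbols, exp, cos, sin, diff, simplify

x, k, b = symbols('x k b')
f = exp(k*x)*cos(b*x)

# Differentiating f should give k*f - b*e^{kx} sin(bx), a linear combination
# of e^{kx} cos(bx) and e^{kx} sin(bx), so D(V3) is contained in V3.
lhs = diff(f, x)
rhs = k*exp(k*x)*cos(b*x) - b*exp(k*x)*sin(b*x)
print(simplify(lhs - rhs))   # 0
```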

If v is a linear combination of a set of linearly independent vectors v1, v2, . . . , vr, then this linear combination must be unique. Indeed, suppose we have both

v = α1v1 + α2v2 + · · · + αrvr and v = β1v1 + β2v2 + · · · + βrvr.

Subtract these identities:

0 = v − v = (α1 − β1)v1 + (α2 − β2)v2 + · · · + (αr − βr)vr.

Since, by assumption, v1, v2, . . . , vr are linearly independent, we must have α1 − β1 = 0, α2 − β2 = 0, etc., and hence α1 = β1, α2 = β2, etc. Using the argument presented here, we can prove: a set of vectors forms a basis of V if and only if (1) this set is linearly independent, and (2) it spans V. Recall from Definition 2.6.1 that vectors v1, v2, . . . , vn in V form a basis of V if and only if each vector v can be uniquely written as a linear combination of them, that is, v can be written as a1v1 + a2v2 + · · · + anvn, where the scalars a1, a2, . . . , an are uniquely determined by v.

1.4. Let us take an arbitrary finite set of vectors v1, v2, . . . , vr in a linear space V. For each positive integer k with k ≤ r, let Vk be the span of v1, v2, . . . , vk, that is,

Vk = span{v1, v2, . . . , vk}.

For convenience, let V0 = {0}, the trivial subspace. So

V0 = {0}, V1 = span{v1}, V2 = span{v1, v2}, . . . , Vr = span{v1, v2, . . . , vr}.

Clearly the family of subspaces Vk is increasing:

{0} = V0 ⊆ V1 ⊆ V2 ⊆ · · · ⊆ Vr−1 ⊆ Vr.

(That Vk ⊆ Vk+1 means that Vk is contained in Vk+1, that is, every vector in Vk is also a vector in Vk+1.) For each positive integer k with 1 ≤ k ≤ r, there are two possibilities:

Case 1: Vk−1 = Vk. This happens exactly when vk is in Vk−1.
Case 2: Vk−1 ≠ Vk. This happens exactly when vk is not in Vk−1.

When the second case occurs, we call vk a leading vector. Thus vk is a leading vector if and only if vk is not in the span of the preceding vectors, namely v1, v2, . . . , vk−1.

Now we claim:

Theorem 1.4.1. With the notation and terminology as above, we have:

(a) the leading vectors are linearly independent, and
(b) any non-leading vector is a unique linear combination of the preceding leading vectors.

We postpone the proof of this theorem to the end of the present section. Here let us draw some important conclusions from it. First, let us take a spanning set of vectors v1, v2, . . . , vr in V. From part (a) of the above theorem, we see that the leading vectors in this set are linearly independent. From part (b) we see that all vectors vk (1 ≤ k ≤ r) in this spanning set are linear combinations of leading vectors. This shows that the leading vectors also span V. Now the leading vectors are both linearly independent and spanning. Hence the leading vectors form a basis of V. We have proved:

Corollary 1.4.2. Any finite set of vectors spanning a linear space V contains a subset of vectors that form a basis of V.



As a result of this corollary, we know:

Corollary 1.4.3. Every finite dimensional vector space has a basis.

Next, assume that a set of linearly independent vectors w1, w2, . . . , wp in a finite dimensional space V is given. Let v1, v2, . . . , vr be any spanning set of V. Now put these two sets of vectors together to form the list

w1, w2, . . . , wp, v1, v2, . . . , vr.

The above list of vectors certainly spans V. By our previous argument, we see that the leading vectors in this list form a basis of V, say B. Since the linearly independent vectors w1, w2, . . . , wp appear at the beginning of this list, all of them are leading vectors. So they belong to the basis B. We have shown:

Corollary 1.4.4. A finite set of linearly independent vectors in a finite dimensional linear space V can be enlarged to a basis of V.

1.5. Given a finite set of vectors v1, v2, . . . , vn in a finite dimensional linear space V, how do we find the leading vectors, and how do we express the other vectors as linear combinations of the leading vectors, as suggested by Theorem 1.4.1 above?

First, by using a coordinate system, we can convert these vectors into column vectors in the space F^m (where F is the field we work with and m is the number of coordinates), say C1, C2, . . . , Cn. In this way we turn the original problem about vectors into one about column vectors. We form an m × n matrix A by using these vectors as columns:

A = [C1 C2 · · · Cn].

Now we apply elementary row operations to reduce this matrix to its reduced row echelon form. Recall that elementary row operations either exchange two rows, add a scalar multiple of one row to another, or multiply one row by a nonzero scalar. An elementary row operation has the same effect as multiplying by an invertible matrix on the left. For example, given a 2 × 2 matrix A, exchanging the two rows (adding 2 × the second row to the first, resp.) has the same effect as multiplying A on the left by

$$\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \qquad \left( \text{by } \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}, \text{ resp.} \right),$$

as we can check that

$$\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} c & d \\ a & b \end{bmatrix}, \qquad \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} a + 2c & b + 2d \\ c & d \end{bmatrix}.$$


Applying elementary row operations several times to an m × n matrix A, we end up with an echelon matrix, say B. The effect of these operations is the same as multiplying A on the left consecutively by some invertible matrices, say E1, E2, . . . , Eℓ; that is, B = EA, where E = Eℓ · · · E2 E1. As a product of invertible matrices, E is invertible. Now

B = EA = E[C1 C2 · · · Cn] = [EC1 EC2 · · · ECn].

This identity tells us that EC1, EC2, . . . , ECn are the columns of B. We claim:

Theorem 1.5.1. If matrices A and B are row equivalent and if a linear relation holds among columns of A, then the same relation holds for the corresponding columns of B.

More precisely, this theorem says that if the linear relation

α1 C1 + α2 C2 + · · · + αn Cn = O

holds, then so does the linear relation

α1 EC1 + α2 EC2 + · · · + αn ECn = O.

But this is completely obvious! Just multiply the first identity on the left by E! With the help of this theorem, the question raised at the beginning of §1.5 is answered, as illustrated in the next example.

Example 1.5.2. Consider the following "vectors" in P3:

p1(x) = 1 + 3x + 4x^2 + 2x^3, p2(x) = 2 + 6x + 8x^2 + 4x^3,
p3(x) = 2 + x + x^2 + x^3, p4(x) = 1 + x^2 + x^3, p5(x) = 9 + 6x + 9x^2 + 7x^3.

We are asked to find the leading vectors and express each vector as a linear combination of the preceding leading vectors.

Solution. Using the natural basis {1, x, x^2, x^3}, we convert the given vectors into columns A1, . . . , A5 and form the matrix A = [A1 A2 A3 A4 A5]. Then we reduce A to its reduced row echelon form B = [B1 B2 B3 B4 B5]. The actual computation shows

$$A = \begin{bmatrix} 1 & 2 & 2 & 1 & 9 \\ 3 & 6 & 1 & 0 & 6 \\ 4 & 8 & 1 & 1 & 9 \\ 2 & 4 & 1 & 1 & 7 \end{bmatrix}, \qquad B = \begin{bmatrix} 1 & 2 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & 3 \\ 0 & 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}.$$

The leading column vectors in B are B1, B3, B4, and B2 = 2B1, B5 = B1 + 3B3 + 2B4. By Theorem 1.5.1, we know that in A the leading column vectors are A1, A3, A4, and A2 = 2A1, A5 = A1 + 3A3 + 2A4. Hence, among the given "vectors" in P3, the leading vectors are p1(x), p3(x), p4(x), and p2(x) = 2p1(x), p5(x) = p1(x) + 3p3(x) + 2p4(x).
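The row reduction in this example is easy to reproduce by machine. A sketch using SymPy's rref(), which returns both the reduced row echelon form and the indices of the pivot (leading) columns:

```python
# Reproducing Example 1.5.2 with SymPy (a sketch; SymPy is our tooling choice).
from sympy import Matrix

A = Matrix([[1, 2, 2, 1, 9],
            [3, 6, 1, 0, 6],
            [4, 8, 1, 1, 9],
            [2, 4, 1, 1, 7]])

B, pivots = A.rref()
print(pivots)   # (0, 2, 3) -> the leading columns are A1, A3, A4
print(B)        # non-pivot columns show A2 = 2*A1, A5 = A1 + 3*A3 + 2*A4
```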

Remark. As we know, any matrix A can always be row reduced to a reduced row echelon matrix, say B. However, row reduction can be performed in many different ways. Naturally a question is raised: is B uniquely determined by A? The answer is yes. The reason is, first, those columns in B containing leading ones correspond to leading column vectors of A; second, the entries of any other column in B are determined by the way it is written as a linear combination of the columns containing leading ones, in the same way as the corresponding column of A is expressed as a linear combination of the leading vectors, and such a linear combination is unique because the leading vectors are linearly independent. The uniqueness of the reduced row echelon form is considered by some people to be a "hard theorem". In fact, from our point of view it is easy to understand.

1.6. Now we return to the proof of Theorem 1.4.1. Let the leading vectors be $v_{k_1}, v_{k_2}, \dots, v_{k_p}$, where $1 \le k_1 < k_2 < \cdots < k_p \le r$. Suppose we have

$$a_1 v_{k_1} + a_2 v_{k_2} + \cdots + a_p v_{k_p} = 0. \tag{1.6.1}$$

If all coefficients $a_1, a_2, \dots, a_p$ in (1.6.1) are zeroes, then there is nothing to prove. So we assume that one of them is nonzero. Let $q$ be the largest index for which $a_q$ is nonzero. Then (1.6.1) becomes

$$a_1 v_{k_1} + a_2 v_{k_2} + \cdots + a_q v_{k_q} = 0 \tag{1.6.2}$$

with $a_q \ne 0$. We can rewrite (1.6.2) as

$$v_{k_q} = \left(-\frac{a_1}{a_q}\right) v_{k_1} + \left(-\frac{a_2}{a_q}\right) v_{k_2} + \cdots + \left(-\frac{a_{q-1}}{a_q}\right) v_{k_{q-1}}.$$

Since the vectors $v_{k_1}, v_{k_2}, \dots, v_{k_{q-1}}$ are in $V_{k_q - 1}$ (since $k_{q-1} < k_q$, we have $k_{q-1} \le k_q - 1$ and hence all vectors in $V_{k_{q-1}}$ are also in $V_{k_q - 1}$), this identity shows that $v_{k_q}$ is in $V_{k_q - 1}$, contradicting the fact that $v_{k_q}$ is a leading vector. This shows that all coefficients $a_1, a_2, \dots, a_p$ must be zeroes.

To prove part (b) of the theorem, take any non-leading vector $v_j$ from the list and let $v_{k_1}, v_{k_2}, \dots, v_{k_s}$ be all the leading vectors preceding $v_j$. Then we have $k_s < j < k_{s+1}$. Now, for all $i$ with $k_s < i < k_{s+1}$, we have $V_i = V_{k_s}$. This is because $v_i$ is not a leading vector and $v_{k_{s+1}}$ is the first leading vector after $v_{k_s}$. In particular, $V_j = V_{k_s}$ and hence $v_j$ is in $V_{k_s}$. We can repeat the argument for proving Corollary 1.4.2 to prove that $v_{k_1}, v_{k_2}, \dots, v_{k_s}$ form a basis of $V_{k_s}$. Now part (b) of Theorem 1.4.1 is clear.



EXERCISE SET II.1.

Review Questions. What is the big deal about linear independence? Can I explain this concept to my parents and convince them that I get my money's worth for paying tuition to learn important things like this? (Admittedly, this is hard!) Do I understand perfectly the meaning of each of the following concepts?

linear dependence (independence), linear combination, span, spanning set

Given a list of vectors, what are the leading vectors? Why are the leading vectors linearly independent, and why can any vector in the list be uniquely written as a linear combination of them? How do I use a matrix to find the leading vectors and the expression of any vector in the list as a linear combination of the leading vectors?

Drills

1. In each of the following cases, show that the given set S of vectors in V is linearly independent:

(a) V = R^3, S = {(1, 1, 0), (1, 2, 0)}.
(b) V = C^3, S = {(1, 1, i), (1, i, 1), (i, 1, 1)}.
(c) V = P2, S = {p(x), p′(x), p′′(x)}, where p(x) = 1 + x + x^2.
(d) V = P2, S = {p(x), p(x + 1), p(x + 2)}, where p(x) = x^2.
(e) V = M2,2, S = {A, A^2}, where
$$A = \begin{bmatrix} 1 & -1 \\ 0 & -2 \end{bmatrix}.$$

2. In each of the following cases, show that the given set S of vectors is linearly dependent:

(a) V = R^3, S = {(1, 1, 0), (1, 2, 0), (88, 89, 0)}.
(b) V = C^3, S = {(1, i, 1), (i, 1, i), (1, 0, 1)}.
(c) V = P2, S = {p(x), p(x + 1), p(x + 2), p(x + 3)}, where p(x) = x^2.
(d) V = M2,2, S = {A, A^2, A^3}, where
$$A = \begin{bmatrix} 1 & -1 \\ 0 & -2 \end{bmatrix}.$$

3. Suppose that u, v, w are linearly independent vectors in a complex vector space V. Show that

(a) u + v, u − v are linearly independent.
(b) u + v, u + w, v + w are linearly independent.



(c) u + iv, iu + v are linearly independent.

4. True or False:

(a) A set S consisting of a single vector, say S = {v}, is always linearly independent.
(b) If two vectors are linearly dependent, then one of them is a scalar multiple of the other.
(c) If two vectors are linearly dependent, then each of them is a scalar multiple of the other.
(d) If v1, v2, v3 are linearly dependent, then so are v1, v2.
(e) A set of vectors is linearly independent if and only if none of them is in the span of the rest of them.
(f) A set of three vectors is linearly independent if each pair of them is a linearly independent set.

5. Prove the following two statements:

(a) 2 × 2 matrices A1, A2, A3 (considered as vectors in M2,2) are linearly independent if BA1, BA2, BA3 are linearly independent, where B is some 2 × 2 matrix.
(b) Polynomials p1(x), p2(x), p3(x) (considered as vectors in P) are linearly independent if their derivatives p1′(x), p2′(x), p3′(x) are linearly independent.

6. In each of the following cases, find the leading vectors of the given list L of vectors in a linear space V and express the other vectors in the list as linear combinations of the leading vectors.

(a) V = P2, L = ( x + 1, 2x + 2, x − 1, 2x, x^2 − 2x, x^2 + 5x + 1 ).
(b) V = C^2, L = ( (1 + i, 1 − i), (i, 1), (1, i), (1, 2) ).
(c) V = C^3, L = ( C1, C2, C3, C4, C5, C6 ), assuming that the matrix C = [C1 C2 C3 C4 C5 C6] has the following reduced row echelon form:
$$R = \begin{bmatrix} 1 & 2 & 0 & 0 & 5 & 3 \\ 0 & 0 & 1 & 0 & 2 & 4 \\ 0 & 0 & 0 & 1 & 6 & 7 \end{bmatrix}.$$

7. Verify that the subspaces V1, V2, V4 in Example 1.3.1 are invariant for both operators D and Ta.



Exercises

1. Let T be a linear map from V to W and let v1, v2, . . . , vr be vectors in V. Show that, if Tv1, Tv2, . . . , Tvr are linearly independent, then so are v1, v2, . . . , vr. Give an example to show that the converse of this statement is false.

2. Show that vectors v1, v2, . . . , vr are linearly independent if and only if v1 ≠ 0 and, for each k with 2 ≤ k ≤ r, vk is not in the linear span of v1, v2, . . . , vk−1.

3. (a) Let T be a linear operator on a vector space V and let v be a vector in V. Suppose that n is a positive integer such that T^n v ≠ 0 and T^{n+1} v = 0. Show that the vectors v, Tv, T^2 v, . . . , T^n v are linearly independent. (b) Use this result to show that if p(x) is a polynomial of degree n, then the polynomials obtained by consecutive differentiation, p(x), p′(x), p′′(x), . . . , p^{(n)}(x), are linearly independent.

4. Let P be the 4 × 4 matrix given by
$$P = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \end{bmatrix}.$$
Show that I, P, P^2, P^3 are linearly independent.

5. (a) Give an example of three linearly dependent vectors v1, v2, v3 in an appropriate vector space such that every pair of them forms a linearly independent set. (b) Give an example of two linear operators S and T on an appropriate vector space V such that, as vectors in L(V), S and T are linearly independent, but, for every v in V, the vectors Sv and Tv are linearly dependent! Hint: Let S = MA and T = MB with
$$A = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}.$$

6. Suppose that H is a subspace of V spanned by the set SH ≡ {h1, h2, . . . , hr} and K is a subspace of V spanned by SK ≡ {k1, k2, . . . , ks}. Show that the set S ≡ {h1, h2, . . . , hr, k1, k2, . . . , ks} is linearly independent if and only if both sets SH and SK are linearly independent and H ∩ K = {0}, that is, the zero vector is the only vector in both H and K.

7. Prove that if H is a subspace of a finite dimensional vector space V, then there is a subspace K of V such that H + K = V and H ∩ K = {0}. (Hint: Take a basis of H and extend it to a basis of V.)



§2. Dimension

2.1. The main purpose of the present subsection is to establish the theorem stated below. The key role played by this theorem in linear algebra is to justify the definition of the dimension of a vector space.

Theorem 2.1.1. Any two bases of a vector space have the same number of vectors.

Let V be a finite dimensional vector space. Take any basis in V (we can do this because V has a basis, according to Corollary 1.4.3 in the previous section). The theorem tells us that the number of vectors in it does not depend on which basis we pick. So we can define the dimension of V, denoted by dim V, to be the number of vectors in any basis of V.

We provide two proofs of this theorem. The first proof given here is based on the material about leading vectors described in the last section. Let us take two bases of V, say E consisting of vectors e1, e2, . . . , em and F consisting of f1, f2, . . . , fn. Here of course m and n are the numbers of vectors in the bases E and F respectively. We need to prove m = n. Without loss of generality, let us assume n ≤ m. We use F as a coordinate system to convert the vectors in E into column vectors in F^n. Let

Ck = [ek]F (1 ≤ k ≤ m) and C = [C1 C2 · · · Cm].

Notice that C is an n × m matrix, with n ≤ m. Since the vectors e1, e2, . . . , em are linearly independent, all of them are leading vectors. This fact carries over to the column vectors of C: all of C1, C2, . . . , Cm are leading vectors. Hence the reduced row echelon form of C must be

$$\begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & & 0 \\ & & \ddots & \\ 0 & 0 & \cdots & 1 \end{bmatrix},$$

which is necessarily a square matrix. So C is a square matrix as well. Thus m = n.

Example 2.1.2. Now we determine the dimensions of familiar spaces. First, the dimension of F^n is n, since the standard basis given as follows has exactly n vectors:

e1 = (1, 0, . . . , 0, 0), e2 = (0, 1, . . . , 0, 0), . . . , en = (0, 0, . . . , 0, 1).

The linear space Mmn(F) of all m × n matrices over F can be identified with F^{mn} (as far as the linear structure is concerned) and hence its dimension is mn. Next, the space Pn of all polynomials of degree not exceeding n is of dimension n + 1, since its standard basis, consisting of

1, x, x^2, . . . , x^{n−1}, x^n,



has n + 1 elements. Now we determine the dimension of the space L(V, W) of all linear mappings from V into W, where dim V = n and dim W = m. Take a basis E in V and a basis F in W. Then each linear map T ∈ L(V, W) corresponds to the m × n matrix [T]^E_F, and vice versa. Thus the dimension of L(V, W) is the same as the dimension of the space of all m × n matrices, which is clearly mn. Finally, the dual V′ of a vector space is the special case of L(V, W) with W = F, and hence the dimension of V′ is 1 × n = n. In summary,

dim F^n = n, dim Mmn = mn, dim Pn = n + 1,
dim L(V, W) = (dim V)(dim W), dim V′ = dim V.

Example 2.1.3. Determine the dimension of the subspace M of C^3 consisting of vectors of the form (x, x + iy, 2y), where x, y are arbitrary complex numbers.

Solution. We can rewrite (x, x + iy, 2y) as x(1, 1, 0) + y(0, i, 2). Let b1 = (1, 1, 0), b2 = (0, i, 2). Then b1, b2 span M. Also, we can easily check that b1, b2 are linearly independent. Hence b1, b2 form a basis of M. Therefore dim M = 2.
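A quick machine check of this example (a sketch using SymPy, our tooling choice): the rank of the matrix whose columns are b1 and b2 is the dimension of their span.

```python
# Example 2.1.3 checked by rank (a sketch): dim span{b1, b2} = rank of the
# matrix having b1, b2 as its columns.
from sympy import Matrix, I

M = Matrix([[1, 0],
            [1, I],
            [0, 2]])   # columns are b1 = (1, 1, 0) and b2 = (0, i, 2)
print(M.rank())        # 2 -> b1, b2 form a basis of M, so dim M = 2
```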

Example 2.1.4. In Example 1.2.4 from the last section, we took n + 1 distinct complex numbers c1, c2, . . . , cn+1 and considered the n + 1 vectors vk = (c_1^k, c_2^k, . . . , c_n^k, c_{n+1}^k) (k = 0, 1, 2, . . . , n) in C^{n+1}. We showed that these vectors are linearly independent. Now the number of vectors here is the same as the dimension of the space C^{n+1}, namely n + 1. From the following corollary we see that these vectors form a basis of C^{n+1}.

Corollary 2.1.1. If E is a linearly independent set of m vectors in a linear space V with dim V = n, then m ≤ n. If, furthermore, m = n, then E forms a basis of V.

Indeed, by Corollary 1.4.4 from the last section, we know that E can be extended to a basis F of V. Since dim V = n, F is a set of n vectors. Now the corollary is clear.

Corollary 2.1.2. If E is a spanning set of m vectors in a linear space V with dim V = n, then m ≥ n. If, furthermore, m = n, then E forms a basis of V.

To see this, we use Corollary 1.4.2 from the last section to infer that E contains a basis of V, say B. Since dim V = n, there are n vectors in B. The rest of the argument is clear.

2.2. Now we give a second proof of Theorem 2.1.1 in a more direct way, relying neither on the material about leading vectors in the previous section nor on the fact that every matrix can be row reduced to a reduced row echelon form. However, basic skill in handling the summation symbol is required, so this proof can be skipped or postponed; §2 of Chapter III will be devoted to the basic skill of handling the summation symbol. You may return to the second proof after studying that section.

Again let E and F be two bases in V, where E consists of vectors e1, e2, . . . , em and F consists of f1, f2, . . . , fn. Now each vector in E can be written as a linear combination of vectors in F and vice versa, say

$$e_j = \sum_{k=1}^{n} p_{jk} f_k \quad \text{and} \quad f_k = \sum_{j=1}^{m} q_{kj} e_j.$$

For each ℓ with 1 ≤ ℓ ≤ n, we have

$$f_\ell = \sum_j q_{\ell j} e_j = \sum_j q_{\ell j} \left( \sum_k p_{jk} f_k \right) = \sum_k \left( \sum_j q_{\ell j} p_{jk} \right) f_k.$$

Since f1, f2, . . . , fn form a basis, the coefficient of fk on both sides must agree for each k: it equals 1 for k = ℓ and zero otherwise. Thus we have

$$\sum_j q_{\ell j} p_{jk} = \delta_{\ell k} \equiv \begin{cases} 1, & \text{if } k = \ell; \\ 0, & \text{otherwise.} \end{cases} \tag{2.2.1}$$

(Here, δℓk given above is the so-called Kronecker delta.) We can rewrite (2.2.1) in matrix form as QP = In, where In is the n × n identity matrix and

$$P = \begin{bmatrix} p_{11} & \cdots & p_{1n} \\ \vdots & & \vdots \\ p_{m1} & \cdots & p_{mn} \end{bmatrix}, \qquad Q = \begin{bmatrix} q_{11} & \cdots & q_{1m} \\ \vdots & & \vdots \\ q_{n1} & \cdots & q_{nm} \end{bmatrix}.$$

By reversing the roles of the p's and q's, we know that (2.2.1) holds with p and q switched, giving us PQ = Im.

♠ Aside: Merely PQ = In alone is not enough to guarantee that Q is the inverse of P (unless you also know that both P and Q are square matrices). One could have AB = In without the invertibility of A and B. For example, if

$$A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}, \qquad B = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 9 & 9 \end{bmatrix},$$

then AB = I2, but A and B are not invertible because they are not square matrices. ♠

Consider the sum S of all p_{jk} q_{kj} (= q_{kj} p_{jk}), where k runs from 1 to n and j runs from 1 to m. We can perform the addition in two different ways: one way is letting k run first, the other letting j run first. According to what we have above (in particular, the identity (2.2.1) and its twin with p and q switched):

$$S = \sum_{j=1}^{m} \sum_{k=1}^{n} p_{jk} q_{kj} = \sum_{j=1}^{m} 1 = m, \qquad S = \sum_{k=1}^{n} \sum_{j=1}^{m} q_{kj} p_{jk} = \sum_{k=1}^{n} 1 = n.$$

Therefore m = n. The second proof of Theorem 2.1.1 is complete.

2.3. Next is a basic theorem about "dimension counting" for a linear map.

Theorem 2.3.1. If T: V → W is a linear transformation from a finite dimensional vector space V into another W over the same field, then we have

dim ker T + dim T(V) = dim V.

We call dim ker T (the dimension of the kernel of T) the nullity of T, and we call dim T(V) (the dimension of the range of T) the rank of T. So the above theorem says:

nullity + rank = dimension of the domain V.

♠ Aside: The above identity has nothing to do with the dimension of W. To understand Theorem 2.3.1, we think of V as the space of vectors to start with, and its size is measured by dim V. The subspace ker T consists of those v in V nullified (or killed) by T: Tv = 0. The range of T is imagined to be the collection of those vectors who survive and land on a new place, W. So the identity of this theorem says: the size of those who get killed plus the size of the survivors is equal to the total size at the beginning in V. ♠

The proof of this theorem involves several straightforward steps of "unwinding" definitions. For convenience, let us put n = dim V, k = dim ker T and r = dim T(V), keeping in mind that we have to prove k + r = n. That k = dim ker T means that there are k vectors in ker T, say v1, v2, . . . , vk, which form a basis of ker T. Note that vj being in ker T (1 ≤ j ≤ k) means Tvj = 0. Let us look at the other identity: r = dim T(V). This means there are r vectors in T(V), say w1, w2, . . . , wr, which form a basis of T(V). As these vectors are in the range T(V), there are vectors u1, u2, . . . , ur such that

Tu1 = w1, Tu2 = w2, . . . , Tur = wr.

Let B = {v1, . . . , vk, u1, . . . , ur}. First we show the linear independence of B. Assume

a1v1 + · · · + akvk + b1u1 + · · · + brur = 0.    (2.3.1)



(Don’t forget that <strong>the</strong> <strong>the</strong>orem <strong>we</strong> are proving now mainly concerns a linear transformation,<br />

namely T . Naturally <strong>we</strong> apply T to both sides of <strong>the</strong> identity and see what happens.) Apply<br />

T to (2.3.1) above and write down T (a1v1 + · · · + akvk + b1u1 + · · · + brur) = T 0. Because<br />

T is linear, <strong>we</strong> can rewrite this as<br />

a1T v1 + · · · + akT vk + b1T u1 + · · · + brT ur = 0.<br />

Recall what happens when <strong>we</strong> apply T to <strong>the</strong>se v’s and u’s: T vj = 0, T uk = wk. So<br />

b1w1 + b2w2 + · · · + brwr = 0. (2.3.2)<br />

But w1, w2, . . . , wr form a basis of T (V ). <strong>In</strong> particular, <strong>the</strong>y are linearly independent. So<br />

(2.3.2) entails b1 = b2 = · · · = br = 0. Now go back to (2.3.1). It becomes<br />

a1v1 + a2v2 + · · · + akvk = 0.<br />

Since v1, v2, . . . , vk are linearly independent, (because <strong>the</strong>y form a basis of ker T ), <strong>we</strong> have<br />

a1 = a2 = · · · = ak = 0. Hence B ≡ {v1, . . . , vk, u1, . . . , ur} are linearly independent.<br />

Next we prove that B spans V. To this end, take an arbitrary vector v in V. Our goal is to write v as a linear combination of vectors from B. Applying T to v, we get Tv, a vector in the range T(V). In T(V) we have already picked a basis, namely {w1, w2, . . . , wr}, which is waiting to be used. Thus we can write Tv as a linear combination of them, say Tv = β1w1 + β2w2 + · · · + βrwr. Recall that w1 = Tu1, w2 = Tu2, etc. So

Tv = β1 Tu1 + · · · + βr Tur = T(β1u1 + · · · + βrur).

This tells us Tz = 0, where

z = v − (β1u1 + · · · + βrur).    (2.3.3)

In other words, z is in ker T. Now, in ker T there is a basis of vectors waiting for us, namely v1, v2, . . . , vk. Hence z is a linear combination of these basis vectors, say

z = α1v1 + α2v2 + · · · + αkvk.    (2.3.4)

Combining (2.3.3) and (2.3.4), we end up with v = α1v1 + · · · + αkvk + β1u1 + · · · + βrur, which is exactly what we want.
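Theorem 2.3.1 is easy to sanity-check numerically for a concrete matrix map T = MA. A sketch using SymPy; the matrix below is an example of our own choosing, not from the text:

```python
# A quick numerical check of Theorem 2.3.1 (rank-nullity), a sketch with a
# matrix of our own: rank + nullity = dimension of the domain.
from sympy import Matrix

A = Matrix([[1, 2, 0, 1],
            [0, 1, 1, 1],
            [1, 3, 1, 2]])   # a linear map T = M_A from F^4 to F^3

rank = A.rank()
nullity = len(A.nullspace())
print(rank, nullity, rank + nullity)   # 2 2 4 = dim of the domain F^4
```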

Theorem 2.3.2. If M and N are subspaces of a (finite dimensional) vector space V, then

dim(M + N) + dim(M ∩ N) = dim M + dim N.

Here, M + N stands for the subspace consisting of vectors which can be written in the form u + v, where u is in M and v is in N, and M ∩ N, called the intersection of M and N, stands for the subspace consisting of vectors in both M and N. Thus

M + N = {u + v : u ∈ M and v ∈ N} and M ∩ N = {w ∈ V : w ∈ M and w ∈ N}.

The proof of this theorem can be done by using a similar argument to that of the previous theorem. It is left as an exercise for the reader. (See Exercise 13.)

2.4. Now we draw some consequences of Theorem 2.3.1. Given vector spaces V and W, assume there is an invertible linear mapping T from V to W. (Recall that "T is invertible" means that there is a linear map S from W to V such that ST = IV and TS = IW.) The invertibility of T tells us that ker T = {0} and T(V) = W. Thus the identity dim ker T + dim T(V) = dim V becomes dim W = dim V. We have shown:

Corollary 2.4.1. If there is an invertible linear mapping from V to W, then V and W have the same dimension.

♠ Remark: There is a subject in mathematics called category theory (which is considered "abstract general nonsense" by some people) that puts every branch of mathematics under its umbrella. According to that theory (perhaps "philosophy" is a more appropriate word here), every branch of mathematics deals with a family of "objects" and "morphisms" between objects. Invertible morphisms are called isomorphisms. If there is an isomorphism between two objects, say X and Y, then we say that X and Y are isomorphic. Two isomorphic objects are indistinguishable in their behavior and often considered as the same. In linear algebra, the objects are finite dimensional vector spaces and the morphisms are linear mappings between them. The above corollary tells us that isomorphic vector spaces have the same dimension. The converse of this is also true: if V and W are vector spaces over F having the same dimension, say dim V = dim W = n, then they are isomorphic. Indeed, according to the last paragraph of subsection 2.6 in Chapter I, both V and W are isomorphic to F^n and hence they are isomorphic to each other. Thus, dimension is the only thing we need to tell whether two vector spaces are "the same" or "different". Compared with other branches of mathematics, the objects of linear algebra are considered to be "easy". ♠

Next, let T be a linear operator on a finite dimensional space V (that is, T is a linear map from V into V itself). Assume that ker T is the zero space, that is, ker T = {0}. The identity dim ker T + dim T(V) = dim V becomes dim T(V) = dim V. Now T(V) is a subspace of V, and it has the same dimension as V. So T(V) must coincide with V. This shows that all vectors in V are in the range of T; in other words, T is surjective.



So, for each v, there is a vector u in V such that Tu = v. Here u is uniquely determined by v. To see this, assume there is another vector w such that Tw = v. Then T(u − w) = Tu − Tw = v − v = 0. This shows that u − w is in ker T. The assumption ker T = {0} tells us that u − w = 0 and hence u = w. Now we can define the inverse T^{−1} of T by putting T^{−1}v = u if Tu = v holds. We have proved:

Corollary 2.4.2. If T is a linear operator on a finite dimensional space V with ker T = {0}, then T is invertible.

The argument for proving the above corollary also shows:

Corollary 2.4.3. If T is a linear mapping from a finite dimensional space V into itself and if the homogeneous equation Tx = 0 has no nontrivial solution, then, for any vector b in V, the equation Tx = b has a unique solution.

Corollary 2.4.4. If A is an m × n matrix with m < n, then the homogeneous equation Ax = 0 has nontrivial solutions. (Notice that m is the number of equations and n is the number of unknowns in the system of equations in vector form Ax = 0.)

Proof. Define the linear map T: C^n → C^m by putting Tx = Ax (in other words, T = MA). Then dim ker T + dim T(C^n) = dim C^n = n. On the other hand, since T(C^n) is a subspace of C^m and dim C^m = m, we have dim T(C^n) ≤ m. Thus dim ker T = dim C^n − dim T(C^n) ≥ n − m > 0. Hence ker T ≠ {0}. Any nonzero vector in ker T gives a nontrivial solution to Ax = 0.
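Here is a concrete instance of Corollary 2.4.4, as a sketch using SymPy; the 2 × 3 matrix is our own example:

```python
# Corollary 2.4.4 in action (a sketch): a 2 x 3 homogeneous system (m < n)
# must have a nontrivial solution, obtained here as a null-space vector.
from sympy import Matrix

A = Matrix([[1, 2, 3],
            [4, 5, 6]])       # m = 2 equations, n = 3 unknowns

ns = A.nullspace()
print(ns)                     # one basis vector of ker T, e.g. (1, -2, 1)
print(A * ns[0])              # the zero vector, confirming A x = 0
```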

Let A = [A1 A2 · · · An] be an m × n matrix. As usual, <strong>we</strong> study <strong>the</strong> homogeneous<br />

equation Ax = 0 by introducing <strong>the</strong> linear map T : C n → C m . A solution of this equation<br />

is just a vector in <strong>the</strong> kernel of T . So <strong>the</strong> dimension of <strong>the</strong> solution space is <strong>the</strong> nullity of<br />

T , which is also <strong>the</strong> number of parameters needed in writing down <strong>the</strong> general solution.<br />

The rank of A is defined to be <strong>the</strong> rank of T , which is <strong>the</strong> dimension of <strong>the</strong> range T (V )<br />

of V . From <strong>the</strong>n identity<br />

⎢<br />

T x = Ax = [A1 A2 · · · An] ⎢<br />

⎣<br />

⎡<br />

x1<br />

x2<br />

.<br />

xn<br />

⎤<br />

⎥<br />

⎦ = A1x1 + A2x2 + · · · + Anxn<br />

<strong>we</strong> see that <strong>the</strong> range of T is just <strong>the</strong> column space of A, that is, <strong>the</strong> linear span of<br />

<strong>the</strong> column vectors A1, A2, . . . , An of <strong>the</strong> matrix A. The leading vectors of this list of<br />

column vectors is a basis of <strong>the</strong> column space of A. The number of <strong>the</strong> leading vectors is<br />

<strong>the</strong> rank of A. Recall that leading vectors can be identified by <strong>the</strong> echelon form of A.<br />



2.5. Consider <strong>the</strong> following general problem: given a polynomial p(x) of degree d,<br />

how can <strong>we</strong> find a formula for<br />

    Sn = p(0) + p(1) + p(2) + · · · + p(n) ≡ Σ_{k=0}^{n} p(k)      (2.5.1)

which is valid for all n? Well, suppose that <strong>we</strong> know ahead of time that <strong>the</strong>re is a polynomial<br />

q(x) of degree d + 1 such that q(x + 1) − q(x) = p(x), <strong>the</strong>n from (2.5.1) <strong>we</strong> have<br />

Sn = (q(1) − q(0)) + (q(2) − q(1)) + (q(3) − q(2)) + · · · + (q(n + 1) − q(n))<br />

= (−q(0) + q(1)) + (−q(1) + q(2)) + (−q(2) + q(3)) + · · · + (−q(n) + q(n + 1))<br />

= q(n + 1) − q(0)<br />

by canceling the neighboring terms. So the required formula is Sn = q(n + 1) − q(0), or Sn = Q(n), where Q(x) = q(x + 1) − q(0) is a polynomial also of degree d + 1. Once we

know <strong>the</strong> existence of such q(x), <strong>we</strong> can find it by solving a system of linear equations,<br />

as shown in <strong>the</strong> following examples.<br />

Example 2.5.1. Find a formula for <strong>the</strong> sum Sn = 1 + 2 + 3 + · · · + n.<br />

Solution. Certainly <strong>we</strong> know this formula from an elementary math course. We apply<br />

<strong>the</strong> method described here for practice. Here <strong>we</strong> take p(x) = x, which is a polynomial<br />

of degree 1. So q(x) is a polynomial of degree 2. Let us write q(x) = ax^2 + bx + c. Then q(x + 1) − q(x) = a(x + 1)^2 + b(x + 1) + c − (ax^2 + bx + c) = 2ax + a + b. So q(x + 1) − q(x) = p(x) gives 2ax + a + b = x. Comparing both sides leads to the system 2a = 1, a + b = 0. The solution to this system is a = 1/2, b = −1/2. The constant term c is arbitrary and we set it to zero: c = 0. Hence q(x) = (x^2 − x)/2. Therefore the required

formula is

    Sn = q(n + 1) − q(0) = ((n + 1)^2 − (n + 1))/2 − 0 = n(n + 1)/2,

as we expect.

Example 2.5.2. Find a formula for the sum Sn = 1^2 + 2^2 + 3^2 + · · · + n^2.

Solution. We use a slightly different method. Take p(x) = x^2. Motivated by our discussion above, we try to find a polynomial Q(x) of degree 3 such that Sn = Q(n). Write Q(x) = ax^3 + bx^2 + cx + d. Then Q(0) = 0, Q(1) = 1, Q(2) = 1^2 + 2^2 = 5 and Q(3) = 1^2 + 2^2 + 3^2 = 14 give the following system of equations:

    d = 0,  a + b + c + d = 1,  8a + 4b + 2c + d = 5,  27a + 9b + 3c + d = 14.

You only need a little patience to find the solution: a = 1/3, b = 1/2, c = 1/6, d = 0. So Q(x) = x^3/3 + x^2/2 + x/6. Thus

    Sn = Q(n) = n^3/3 + n^2/2 + n/6 = n(n + 1)(2n + 1)/6,

which is the required formula.
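The same small system can be handed to a computer algebra system; a sketch assuming SymPy is available (exact rational arithmetic):

```python
from sympy import symbols, linsolve, factor

a, b, c, d = symbols('a b c d')

# The four interpolation conditions Q(0)=0, Q(1)=1, Q(2)=5, Q(3)=14
eqs = [d, a + b + c + d - 1, 8*a + 4*b + 2*c + d - 5,
       27*a + 9*b + 3*c + d - 14]
(sol,) = linsolve(eqs, (a, b, c, d))
print(sol)  # (1/3, 1/2, 1/6, 0)

# Check the closed form against n(n + 1)(2n + 1)/6
n = symbols('n')
Q = sol[0]*n**3 + sol[1]*n**2 + sol[2]*n + sol[3]
print(factor(Q))  # n*(n + 1)*(2*n + 1)/6
```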

Now <strong>we</strong> have <strong>the</strong> following situation: given p(x), once <strong>we</strong> known that <strong>the</strong>re is q(x)<br />

such that q(x + 1) − q(x) = p(x), <strong>the</strong>n <strong>we</strong> can find q(x). The question is, how do <strong>we</strong><br />

know such q(x) exists? Now Theorem 2.3.1 comes to help. Denote by Pd <strong>the</strong> space of all<br />

polynomials of degree not exceeding d. Notice that if q(x) is in Pd+ 1, <strong>the</strong>n <strong>the</strong> degree<br />

of q(x + 1) − q(x) will be 1 less than that of q(x) (because <strong>the</strong> highest po<strong>we</strong>r terms in<br />

q(x + 1) and q(x) are canceled out) and hence q(x + 1) − q(x) belongs to Pd. Thus <strong>we</strong> can<br />

define a mapping from Pd+ 1 to Pd by putting T (q(x)) = q(x + 1) − q(x). Now <strong>we</strong> apply<br />

dim ker T + dim T (V ) = dim V with V = Pd+ 1. As <strong>we</strong> know, dim V = dim Pd+ 1 = d + 2.<br />

We have to find out dim ker T . Suppose q(x) belongs to ker T . Then q(x + 1) − q(x) = 0,<br />

or q(x) = q(x + 1) (for all x). <strong>In</strong> particular, q(0) = q(1) = q(2) = q(3) = · · · . This<br />

tells us that all positive integers are roots of q(x) − q(0). But a polynomial cannot<br />

have infinitely many roots unless it is <strong>the</strong> zero polynomial. Hence q(x) − q(0) = 0, or<br />

q(x) = q(0), that is, q(x) is a constant polynomial. We have shown that ker T is <strong>the</strong><br />

space of constant polynomials. Hence dim ker T = 1. Now dim ker T + dim T (V ) = dim V<br />

becomes 1 + dim T (V ) = d + 2, or dim T (V ) = d + 1. Here T (V ) is a subspace of Pd with<br />

dim T (V ) = dim Pd = d + 1 and hence T (V ) = Pd. <strong>In</strong> particular p(x) ∈ Pd is in <strong>the</strong> range<br />

of T , showing that there exists q(x) in Pd+1 with q(x + 1) − q(x) = p(x).
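The dimension count in this argument can also be checked concretely by writing the matrix of T in the monomial bases of Pd+1 and Pd; a sketch assuming SymPy is available, with d = 3 as illustrative data:

```python
from sympy import Matrix, Poly, symbols

x = symbols('x')
d = 3                                   # p(x) has degree d

# Matrix of T(q) = q(x+1) - q(x) from P_{d+1} to P_d, column by column
cols = []
for j in range(d + 2):                  # input basis 1, x, ..., x^(d+1)
    Tq = Poly((x + 1)**j - x**j, x)
    cols.append([Tq.coeff_monomial(x**i) for i in range(d + 1)])
M = Matrix(cols).T                      # a (d+1) x (d+2) matrix

print(M.rank())                         # d + 1: T maps onto P_d
print(M.nullspace())                    # one vector: the constants
```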

2.6. Consider a variation of <strong>the</strong> general problem studied in <strong>the</strong> last subsection: given<br />

a polynomial p(x) of degree d, and a constant a, how can <strong>we</strong> find a formula for<br />

    Sn = p(0) + p(1)a + p(2)a^2 + · · · + p(n)a^n ≡ Σ_{k=0}^{n} p(k)a^k      (2.6.1)

which is valid for all n? Certainly, when a = 1, we return to the problem dealt with in the last subsection. So we assume a ≠ 1.

We look for an expression f(x) such that f(k + 1) − f(k) = p(k)a^k, so that we can use a “telescoping sum” as in the previous subsection to obtain Sn = f(n + 1) − f(0). Let us try f(k) = q(k)a^k, where q(x) is some polynomial. Now

    f(k + 1) − f(k) = q(k + 1)a^(k+1) − q(k)a^k = (aq(k + 1) − q(k))a^k.

So f(k + 1) − f(k) = p(k)a^k becomes (aq(k + 1) − q(k))a^k = p(k)a^k. Naturally we ask

if <strong>the</strong>re is a polynomial q(x) such that aq(x + 1) − q(x) = p(x). Again, <strong>the</strong> question is<br />

whe<strong>the</strong>r such q(x) exists. Once <strong>we</strong> know it does exist, <strong>we</strong> get <strong>the</strong> license to look for it!<br />

Define a linear operator T on Pd by putting T (q(x)) = aq(x + 1) − q(x). By<br />

Corollary 2.4.3, <strong>we</strong> know that, in order to show that T (q(x)) = p(x) has a solution (here<br />

p(x) is given and q(x) is unknown), it suffices to show that ker T is <strong>the</strong> zero space. Let<br />



q(x) be in ker T . Then aq(x + 1) = q(x). We claim that q(x) is <strong>the</strong> zero polynomial.<br />

Suppose on the contrary that q(x) ≠ 0. Notice that q(x) cannot be a (nonzero) constant, due to the assumption a ≠ 1. Let m be the degree of q(x). Then q(x + 1) − q(x) is of degree m − 1 or less. Now we write aq(x + 1) − q(x) = a(q(x + 1) − q(x)) + (1 − a)q(x). The degree of a(q(x + 1) − q(x)) is m − 1 or less. Since 1 − a ≠ 0, (1 − a)q(x) and q(x)

have <strong>the</strong> same degree, namely m. Thus <strong>the</strong> sum a(q(x + 1) − q(x)) + (1 − a)q(x) is a<br />

polynomial of degree m. This is impossible because it is equal to aq(x + 1) − q(x), which<br />

is assumed to be <strong>the</strong> zero polynomial. So ker T is necessarily <strong>the</strong> zero space. We conclude:<br />

for a ≠ 1, given any polynomial p(x), there is a polynomial q(x) of the same degree such that aq(x + 1) − q(x) = p(x).
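In coordinates, the same conclusion is visible at a glance: the matrix of T in the monomial basis of Pd is triangular with a − 1 on the diagonal, hence invertible exactly when a ≠ 1. A sketch assuming SymPy is available, with d = 3:

```python
from sympy import Matrix, Poly, symbols, factor

x, a = symbols('x a')
d = 3

# Matrix of T(q) = a*q(x+1) - q(x) on P_d in the basis 1, x, ..., x^d
cols = []
for j in range(d + 1):
    Tq = Poly(a*(x + 1)**j - x**j, x)
    cols.append([Tq.coeff_monomial(x**i) for i in range(d + 1)])
M = Matrix(cols).T

# Upper triangular with a - 1 on the diagonal, so T is invertible
# if and only if a != 1.
print(factor(M.det()))   # (a - 1)**4
```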

Example 2.6.1. Again we start with a simple one whose answer is well known: find

a formula for Sn = 1 + a + a^2 + · · · + a^n. In this case p(x) is the constant polynomial

1 (a polynomial of degree zero) and hence q(x) should also be a constant polynomial,<br />

say q(x) = c. Now aq(x + 1) − q(x) = p(x) becomes ac − c = 1 which gives c = 1/(a − 1).<br />

Thus f(k) = q(k)a^k = a^k/(a − 1) and the required formula is

    Sn = f(n + 1) − f(0) = a^(n+1)/(a − 1) − 1/(a − 1) = (a^(n+1) − 1)/(a − 1),

which is well known.

Example 2.6.2. Find a formula for Sn = a + 2a^2 + 3a^3 + · · · + na^n.

Solution. <strong>In</strong> <strong>the</strong> <strong>present</strong> case <strong>we</strong> have p(x) = x, which is of degree 1. We look for a<br />

polynomial q(x) of degree 1 such that aq(x + 1) − q(x) = p(x), say q(x) = cx + d. Now<br />

aq(x+1)−q(x) = p(x) becomes a(c(x+1)+d)−(cx+d) = x, or (ac−c)x+ac+ad−d = x,<br />

which leads to the system of equations ac − c = 1, ac + ad − d = 0, where a is given and c, d

are unknowns. Solving this system, we obtain c = 1/(a − 1) and d = −a/(a − 1)^2. Thus

    q(x) = ((a − 1)x − a)/(a − 1)^2 = ((x − 1)a − x)/(a − 1)^2,

and hence f(k) = q(k)a^k = ((k − 1)a^(k+1) − ka^k)/(a − 1)^2. So we have

    Sn = f(n + 1) − f(0) = (na^(n+2) − (n + 1)a^(n+1) + a)/(a − 1)^2,

which is the required formula.
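A quick numerical sanity check of this formula (a minimal sketch; the choice a = 3 is arbitrary illustrative data):

```python
a = 3.0
for n in range(1, 6):
    direct = sum(k * a**k for k in range(1, n + 1))
    closed = (n*a**(n + 2) - (n + 1)*a**(n + 1) + a) / (a - 1)**2
    assert abs(direct - closed) < 1e-9 * max(1.0, abs(direct))
print("formula confirmed for a = 3, n = 1..5")
```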


EXERCISE SET II.2.

Review Questions. Can I give all the technical details involved in defining the dimension of a vector

space? Can I make each of <strong>the</strong> following vague statements precise and prove it?<br />

“Spanning sets have more (or at worst an equal number of) vectors than independent sets.”

“A spanning set can be trimmed down to a basis.”<br />

“An independent set can be extended to a basis.”<br />

“An independent set is a basis of its span.”<br />

Can I explain <strong>the</strong> identity “rank + nullity = dimension” in detail? Do I know how to<br />

prove this identity?<br />

Drills<br />

1. Give <strong>the</strong> dimension of each of <strong>the</strong> following vector spaces over R.<br />

(a) R^4, (b) P5, (c) M3,4, (d) F^X with X = {1, 2, 3, 4}.

2. What is <strong>the</strong> dimension of C when it is considered as a vector space over R? What<br />

about C^n, also considered as a vector space over R?

3. True or False:<br />

(a) A linear homogeneous system of 100 equations with 99 unknowns (or variables)<br />

must have a nontrivial solution.<br />

(b) A linear homogeneous system of 100 equations with 200 unknowns must have a<br />

nontrivial solution.<br />

(c) 100 vectors in a subspace spanned by 50 vectors must be linearly dependent.<br />

(d) If a subspace M is spanned by 50 linearly independent vectors, <strong>the</strong>n <strong>the</strong> dimension<br />

of M is at most 50.<br />

(e) If vectors v0, v1, v2, . . . , v50 are linearly independent, <strong>the</strong>n <strong>the</strong> dimension of <strong>the</strong><br />

subspace M spanned by <strong>the</strong>m is 50.<br />

(f) If v1, v2 and v3 span a 2-dimensional vector space V , <strong>the</strong>n one of <strong>the</strong> following<br />

three sets is a basis of V : S1 = {v2, v3}, S2 = {v1, v3}, S3 = {v1, v2}.<br />

(g) If v1, v2, v3, . . . , v99 are linearly independent vectors in a vector space V with<br />

dim V = 100, <strong>the</strong>n <strong>we</strong> can pick a vector v100 in V such that v1, v2, . . . , v99, v100<br />

form a basis of V .<br />



(h) If S is a set of 123 vectors spanning a vector space V with dim V = 100, <strong>the</strong>n <strong>we</strong><br />

can delete 23 vectors from S such that <strong>the</strong> remaining vectors form a basis of V .<br />

(i) If dim V = 100 and v1, . . . , v123 span V , <strong>the</strong>n v1, . . . , v100 form a basis of V .<br />

4. Ans<strong>we</strong>r <strong>the</strong> following questions:<br />

(a) For two subspaces H and K of a vector space V , if <strong>the</strong> dimensions of H ∩ K, H<br />

and H + K are 2, 4 and 5 respectively, what is <strong>the</strong> dimension of K?<br />

(b) If M and N are subspaces of P10 with dim M = 7, dim N = 8 and dim(M ∩N) =<br />

5, is M + N = P10?<br />

(c) Describe linear operators of rank zero.<br />

(d) If <strong>the</strong> kernel of a linear transformation T defined on P4 consists of polynomials<br />

of the form a + bx + (a + b + c)x^2 + cx^3 + ax^4, where a, b, c are arbitrary scalars,

what is <strong>the</strong> rank of T ?<br />

5. Find <strong>the</strong> rank and <strong>the</strong> nullity of each of <strong>the</strong> following linear transformations. <strong>In</strong> each<br />

case, verify <strong>the</strong> identity “nullity + rank = dimension of <strong>the</strong> domain”.<br />

(a) T1 : F^2 → F^3 sending (x1, x2) to (0, x1, x2).

(b) T2 : F^3 → F^3 sending (x1, x2, x3) to (0, x1, x2).

(c) Sω : R^3 → R^3 given by Sω(x) = ω × x, where ω is a fixed nonzero vector in R^3.

(d) D : Pn → Pn sending p(x) ∈ Pn to its derivative p′(x).

(e) M : P3 → P5, sending p(x) to x^2 p(x).

(f) E : P2 → P2, sending p(x) to (1/2)(p(x) + p(−x)).

(g) Q : P2 → P2, sending p(x) to (1/2)(p(x) − p(−x)).

(h) δ : M2,2 → M2,2, given by δ(X) = AX − XA, where A = ⎡ 1  0 ⎤
                                                        ⎣ 0  2 ⎦ .

(i) Φ : M2,2 → M2,2 given by Φ(X) = BXC, where B = ⎡ 1  0 ⎤ , C = ⎡ 1  1 ⎤ .
                                                   ⎣ 1  0 ⎦       ⎣ 0  0 ⎦

6. True or false (S and T are linear operators on a finite dimensional vector space V of<br />

dimension at least 2):<br />

(a) If ST = T S and if v ∈ ker(T ), <strong>the</strong>n S(v) ∈ ker(T ).<br />

(b) T and T^2 always have the same kernel.

(c) If T and T^2 have the same rank, then they have the same range.

(d) If T and T^2 have the same rank, then they have the same kernel.

(e) The identity rank(S + T ) = rank(S) + rank(T ) − rank(ST ) holds.



(f) If S^2 = O, then rank(S) ≤ nullity of S.

7. We are given a matrix A and its reduced row echelon form B as follows:

    A = ⎡ 2  4i  7i   3    2  0 ⎤            B = ⎡ 1  2i  3i   1    1  0 ⎤
        ⎢ 1  2i  3i   1    2  0 ⎥                ⎢ 0   1   5   2   2i  0 ⎥
        ⎢ 0   i  5i  2i   −2  0 ⎥    =⇒         ⎢ 0   0   1  −i   2i  0 ⎥
        ⎣ 1  2i  4i   2    0  0 ⎦                ⎣ 0   0   0   0    0  0 ⎦

(a) Write down a basis for <strong>the</strong> range of MA.<br />

(b) Write down a basis for ker(MA).<br />

(c) Write u = (1, 3i, 6i, −1 + 2i, 3, 0) as a linear combination of <strong>the</strong> row vectors of B.<br />

(d) What is <strong>the</strong> rank and <strong>the</strong> nullity of MA?<br />

8. For each linear map, find a basis of its kernel and a basis of its range.<br />

(a) T : C^3 → C^2 given by T = MA, where A = ⎡ −1  i  1 + i ⎤
                                             ⎣  i  1  1 − i ⎦ .

(b) S : P3 → P2 given by S(p(x)) = xp′′(x) − 2p′(x) + p(1).

(c) Φ : M2,2 → M2,2 given by Φ(X) = AX − XA, where A = ⎡ 1  3 ⎤
                                                       ⎣ 0  5 ⎦ .

9. Find a polynomial p(x) of degree 4 such that p(x + 1) − p(x) = x(x + 1)(x + 2). Then use your result to give a formula for the sum Σ_{k=0}^{n} k(k + 1)(k + 2).

Exercises<br />

1. Give a careful proof of <strong>the</strong> following statement: a set of vectors in V is a basis of V<br />

if and only if it is linearly independent and it spans V .<br />

2. Prove that if vectors v1, v2, . . . , v50 are linearly independent and are linear combinations<br />

of vectors w1, w2, . . . , w50, <strong>the</strong>n w1, w2, . . . , w50 are linearly independent.<br />

3. Let T be a linear operator defined on a 2-dimensional real vector space V such that T^2 = −I. Show that there is a basis B of V such that

    [T ]B = ⎡ 0  −1 ⎤
            ⎣ 1   0 ⎦ .

4. Let T be a linear transformation from one vector space V into ano<strong>the</strong>r W . Define a<br />

new linear transformation T2 : V × V → W by putting T2(x, y) = T x + T y. Suppose

that <strong>the</strong> rank of T is r and <strong>the</strong> dimension of V is n. What is <strong>the</strong> nullity of T2? Hint:<br />

dim(V × V ) = 2n; T and T2 have <strong>the</strong> same range.<br />

5. Let T be a linear transformation from a finite dimensional vector space V to another W . Define a linear transformation T^(2) : V → W × W by putting T^(2)x = (T x, T x). Show that T and T^(2) have the same rank.



6. Suppose that T is a linear operator on a 3-dimensional space V satisfying T^2 = O.

Show that <strong>the</strong> rank of T is at most 1.<br />

7*. Let S and T be linear operators on a finite dimensional vector space V . Show that<br />

(a) rank(S + T ) ≤ rank(S)+ rank(T ).<br />

(b) rank(ST ) ≤ <strong>the</strong> minimum of rank(S) and rank(T ).<br />

8*. Let S and T be linear operators on a finite dimensional vector space satisfying ST S =<br />

S (i.e. T is a generalized inverse of S.) Show that ST , T S and S have <strong>the</strong> same rank.<br />

Hint: 7(b) above.<br />

9. Show that a nonzero m × n matrix A is rank one if and only if A can be written as

    A = ⎡ a1b1  a1b2  · · ·  a1bn ⎤
        ⎢ a2b1  a2b2  · · ·  a2bn ⎥
        ⎢   ⋮     ⋮            ⋮  ⎥
        ⎣ amb1  amb2  · · ·  ambn ⎦ ,

for some scalars a1, a2, . . . , am, b1, b2, . . . , bn. (Hint: If the rank of A is 1, then the induced linear transformation MA : F^n → F^m is a rank one operator and hence its range is spanned by a single vector in F^m, say [a1 a2 · · · am]. It is known that the columns of A span the range of MA.)

10*. Prove that, for each (a1, a2, a3, a4, a5), there is a unique polynomial p(x) of degree 4 such that p(0) = a1, p′(0) = a2, p′′(0) = a3, p(1) = a4 and p′(1) = a5. (Hint: Notice that, if p(0) = 0, p′(0) = 0 and p′′(0) = 0, then x^3 is a factor of p(x).)

11. Let a be a nonzero constant. (a) Check that if q(x) is in Pn, <strong>the</strong>n (x+a)q(x+1)−xq(x)<br />

is also in Pn. (b) Prove that, given p(x) in Pn, <strong>the</strong>re exists q(x) in Pn such that<br />

(x + a)q(x + 1) − xq(x) = p(x). (Hint: consider the linear map Tn : Pn → Pn defined by Tn(q(x)) = (x + a)q(x + 1) − xq(x).)

12. (a) Check that if q(x) is in Pn, <strong>the</strong>n xq(x+1)−(x+1)q(x) is also in Pn. (b) Prove that<br />

<strong>the</strong>re exists a polynomial that cannot be written in <strong>the</strong> form xq(x + 1) − (x + 1)q(x),<br />

where q(x) is in Pn.<br />

13*. Prove Theorem 2.3.2 as follows. Take a basis w1, w2, . . . , wr of M ∩ N. Extend it to a basis w1, . . . , wr, u1, u2, . . . , us of M, and to a basis w1, . . . , wr, v1, v2, . . . , vt of N. Verify that the vectors w1, w2, . . . , wr, u1, u2, . . . , us, v1, v2, . . . , vt form a basis of M + N.

14*. (Hard problem.) Let p(x) be a polynomial of degree n. Prove that, considered as<br />

vectors in Pn, <strong>the</strong> polynomials p(x), p(x + 1), . . . , p(x + n) are linearly independent.<br />



Appendices for Chapter II

Appendix A*: Quotient Spaces

Let V be a vector space, not necessarily finite dimensional. For subsets S and T in V ,<br />

and a scalar a, <strong>we</strong> define S + T (aS respectively) to be <strong>the</strong> set of vectors which can be<br />

expressed in <strong>the</strong> form u + v (au respectively) with u in S and v in T , that is,<br />

S + T = {u + v | u ∈ S and v ∈ T }, aS = {au | u ∈ S}.<br />

When S consists of a single point u, <strong>we</strong> write u + T for S + T . Let M be a linear subspace<br />

of V . We call a subset of V of <strong>the</strong> form u + M (where u is some vector in V ) a coset<br />

(of M). Notice that u in general is not uniquely determined. In fact, it is easy to check that two cosets u + M and v + M are equal if and only if u − v is in M. In case V is a 2-dimensional space represented by a plane, with a point representing the zero vector called the origin, a 1-dimensional subspace M is a line through the origin of this plane and the cosets of M are lines parallel to the line representing M.

It is easy to show that if A and B are cosets of M, <strong>the</strong>n so are A + B and aA, where<br />

a is any scalar. <strong>In</strong> fact, when A = u + M and B = v + M, <strong>we</strong> have A + B = u + v + M<br />

and aA = au + M. Furthermore, with addition and scalar multiplication of cosets defined in this manner, the collection of all cosets forms a vector space. This vector space of all

cosets of M in V is called <strong>the</strong> quotient space of V over M, denoted by V/M:<br />

V/M = {u + M| u ∈ V }.<br />

There is a natural linear map Q from V to <strong>the</strong> quotient space V/M given by Qu = u+M.<br />

The map Q : V → V/M is called <strong>the</strong> quotient map. It is easy to check that <strong>the</strong> range<br />

of Q is V/M and <strong>the</strong> kernel of Q is M. Hence it follows from Theorem 2.3.1 of <strong>the</strong> <strong>present</strong><br />

<strong>chapter</strong> that, when V is finite dimensional,<br />

dim V/M = dim V − dim M.<br />

Now suppose that N is ano<strong>the</strong>r subspace of V (assumed to be finite dimensional). Let R<br />

be <strong>the</strong> restriction of Q to N. Thus R: N → V/M is <strong>the</strong> linear map given by Ru = u+M<br />

for all u in N. Then <strong>we</strong> can check that <strong>the</strong> kernel of R is M ∩ N and <strong>the</strong> range of R is<br />

(N + M)/M. So, by Theorem 2.3.1 again, <strong>we</strong> get<br />

dim N − dim(M ∩ N) = dim (M + N)/M = dim(M + N) − dim M.



Hence dim(M + N) + dim(M ∩ N) = dim M + dim N. Thus we have given a conceptual

proof of Theorem 2.3.2.<br />
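Both sides of this identity are easy to compute in coordinates; a sketch assuming NumPy is available, with randomly chosen subspaces M, N of R^6 of dimensions 3 and 4 as illustrative data:

```python
import numpy as np

rng = np.random.default_rng(1)
# Bases of random subspaces M, N of R^6 with dim M = 3, dim N = 4
BM = rng.standard_normal((6, 3))
BN = rng.standard_normal((6, 4))

rank = np.linalg.matrix_rank
dim_sum = rank(np.hstack([BM, BN]))          # dim(M + N)

# Pairs (x, y) with BM x = BN y correspond to vectors of M ∩ N, so
# the nullity of [BM  -BN] equals dim(M ∩ N) when the columns of BM
# and of BN are each linearly independent.
K = np.hstack([BM, -BN])
dim_cap = K.shape[1] - rank(K)

print(dim_sum + dim_cap == BM.shape[1] + BN.shape[1])   # True
```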

Let T : V → W be a linear map bet<strong>we</strong>en vector spaces. Let M = ker T , <strong>the</strong> kernel<br />

of T and let Q : V → V/M be <strong>the</strong> quotient map. Then <strong>the</strong>re is a unique isomorphism<br />

T̃ : V/M → T (V ) such that T = j T̃ Q, where j : T (V ) → W is the inclusion map for

<strong>the</strong> subspace T (V ) of W . This fact roughly says that a linear map is essentially a quotient<br />

map, follo<strong>we</strong>d by an injection. (It is analogous to <strong>the</strong> first isomorphism <strong>the</strong>orem in group<br />

<strong>the</strong>ory or ring <strong>the</strong>ory.)<br />

Appendix B*: Duality<br />

Recall that <strong>the</strong> dual V ′ of a vector space V over F is <strong>the</strong> space of all linear functionals<br />

on V : V ′ = L (V, F). The dual of V ′ , naturally denoted by V ′′ , is called <strong>the</strong> bidual of V .<br />

There is a natural map J : V → V ′′ defined by (Ju)(φ) = φ(u) for all u ∈ V and φ ∈ V ′.

Now <strong>we</strong> prove: when V is finite dimensional, J is an isomorphism bet<strong>we</strong>en V and V ′′ .<br />

As <strong>we</strong> know, dim V ′ = dim V . Applying this identity to V ′ , <strong>we</strong> have dim V ′′ = dim V ′ .<br />

Consequently dim V ′′ = dim V . So it is enough to show that ker J = {0}. To this end,<br />

<strong>we</strong> show that, given a nonzero vector u in V , Ju = 0, <strong>the</strong>re is, <strong>the</strong>re exists φ ∈ V ′ such<br />

that (Ju)(φ) = φ(u) = 0. Since u = 0, <strong>the</strong>re is a basis of vectors b1, . . . , bn in V such<br />

that b1 = u. Now define a linear functional φ by putting φ(x1b1 + · · · + xnbn) = x1 for<br />

any vector x = x1b1 + · · · + xnbn in V . Then φ(u) = 1 = 0. Now <strong>we</strong> can conclude that<br />

J : V → V ′′ is an isomorphism. Using this natural isomorphism J, <strong>we</strong> can treat V ′′ and V<br />

as <strong>the</strong> same and hence V as <strong>the</strong> dual space of V ′ .<br />
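In coordinates, such a functional is easy to exhibit; a minimal sketch assuming NumPy is available (here φ is taken to be u/|u|^2 rather than the coordinate functional used in the proof, but it serves the same purpose):

```python
import numpy as np

u = np.array([2.0, -1.0, 3.0])      # a nonzero vector in R^3

# One functional with phi(u) != 0: the row vector u^T / |u|^2,
# which gives phi(u) = 1, so Ju is nonzero.
phi = u / np.dot(u, u)
print(np.dot(phi, u))               # 1.0
```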

Let M be a subspace of V . The annihilator M^o of M is defined as follows:

    M^o = {φ ∈ V ′ | φ(x) = 0 for all x ∈ M}.

Let φ be an element in M^o. Then φ(v + m) = φ(v) for all v ∈ V and all m ∈ M, and hence φ can

be treated as a linear functional on V/M. Conversely, if ψ is a linear functional on V/M,<br />

<strong>the</strong>n ψ ◦ Q is a linear functional on V belonging to M o . This shows that <strong>the</strong> dual of V/M<br />

is naturally isomorphic to M^o: (V/M)′ ≅ M^o. By a more sophisticated argument, we can show M′ ≅ V′/M^o.

Appendix C*: Projective Space<br />

Let V be a vector space over F of dimension n + 1. For any positive integer k with<br />

k ≤ n, denote by Gk(V ) <strong>the</strong> set of all k–dimensional subspaces of V ; (G here stands<br />



for Grassmann). The n-dimensional projective space modeled on V , denoted by P [V ]<br />

here, is G1(V ), <strong>the</strong> set consisting of all 1-dimensional subspaces of V . Thus a point in<br />

P [V ] is just a 1–dimensional subspace. For a nonzero vector v in V , denote by [v] <strong>the</strong><br />

1–dimensional subspace spanned by v, which re<strong>present</strong>s a point in P [V ]. Thus <strong>we</strong> may<br />

write<br />

P [V ] = G1(V ) = {[v] | v ∈ V, v ≠ 0}.

Notice that two points [u] and [v] in P [V ] coincide if and only if u = av for some scalar<br />

a ≠ 0. Given M ∈ G2(V ) (that is, M is a 2-dimensional subspace of V ), denote by [M]

<strong>the</strong> set of all points [v] in P [V ] with v ∈ M:<br />

[M] = {[v] ∈ P [V ] | v ∈ M}.<br />

<strong>In</strong> o<strong>the</strong>r words, [M] is <strong>the</strong> set of all 1–dimensional subspaces contained in M. A set of<br />

<strong>the</strong> form [M] with M ∈ G2(V ) is called a (projective) line. Now <strong>the</strong> statement “given<br />

two distinct points in a projective space, there is a unique line passing through them” in projective

geometry can be rephrased as “given two distinct 1–dimensional subspaces, <strong>the</strong>re is a<br />

unique 2–dimensional subspace containing both of <strong>the</strong>m” in linear algebra, which is more<br />

or less transparent. <strong>In</strong> <strong>the</strong> same way <strong>we</strong> define a (projective) plane in P [V ] to be [M] =<br />

{[v] ∈ P [V ] | v ∈ M}, where M ∈ G3(V ).<br />

When V = F^(n+1), we write P^n(F) or simply P^n for P [V ]. For a nonzero vector x = (x0, x1, . . . , xn) in F^(n+1), we write [x0 : x1 : x2 : · · · : xn] for [x]; (the numbers x0, x1, . . . , xn are called the homogeneous coordinates of the point). Notice that, for any scalar a ≠ 0, [ax0 : ax1 : · · · : axn] = [x0 : x1 : · · · : xn]. We identify a point (x1, x2, . . . , xn) in F^n with the point [1 : x1 : · · · : xn] in the projective space P^n, called an ordinary point. Notice that, when x0 ≠ 0, [x0 : x1 : · · · : xn] is an ordinary point because it can be rewritten as [1 : x1/x0 : · · · : xn/x0]. Thus a non-ordinary point is of the form [0 : x1 : · · · : xn]; it is called a point at infinity.

Now <strong>we</strong> give a closer look at <strong>the</strong> case n = 2 and F = R. <strong>In</strong> this case P 2 is called <strong>the</strong><br />

real projective plane. Let us consider a line<br />

ax + by = c (C1)<br />

in R^2. Write x = x1/x0 and y = x2/x0. Then we have a(x1/x0) + b(x2/x0) = c, or

a0x0 + a1x1 + a2x2 = 0, (C2)<br />

where a0 = −c, a1 = a, a2 = b are not simultaneously zero. When [x0 : x1 : x2] is an ordinary point on the line given by (C2) in the projective plane, it also represents a point (x, y), with x = x1/x0, y = x2/x0, on the line given by (C1) in the Euclidean plane R^2.

Take any point (p, q) on the line (C1), that is, ap + bq = c. We can put (C1) in parametric form as x = p + bt, y = q − at. The point (x, y) = (p + bt, q − at) can be identified with the point [1 : p + bt : q − at] in the projective plane, which can be rewritten as [t^(−1) : t^(−1)p + b : t^(−1)q − a] (assuming t ≠ 0). Letting t → ∞, we get a point at infinity, namely [0 : b : −a]. Clearly x0 = 0, x1 = b, x2 = −a satisfy (C2). Thus [0 : b : −a] is the

point on <strong>the</strong> line (C2) at infinity. Notice that <strong>the</strong> vector (b, −a) is parallel to <strong>the</strong> line<br />

(C1). So it points in <strong>the</strong> direction where <strong>the</strong> point at infinity on <strong>the</strong> line can be found.<br />

When a0 = 1, a1 = a2 = 0, (C2) becomes x0 = 0. We call x0 = 0 <strong>the</strong> equation for <strong>the</strong><br />

line at infinity. The point [0 : b : −a] is <strong>the</strong> intersection of <strong>the</strong> line (C2) and <strong>the</strong> line at<br />

infinity. Using <strong>the</strong> fact that two distinct 2–dimensional subspaces in R 3 intersections in a<br />

1–dimensional subspace, <strong>we</strong> deduce that two distinct lines in <strong>the</strong> projective plane intersect<br />

at a unique point. From the projective geometric point of view, two parallel Euclidean lines

intersect at infinity.<br />

The process of adding points at infinity to extend <strong>the</strong> line (C1) to (C2) is called <strong>the</strong><br />

completion of (C1). <strong>In</strong> fact, this process of completion by adding points at infinity works<br />

more generally for algebraic curves. The tangent lines to completed curves at points of<br />

infinity give us the asymptotes of the curves. We give an example to show this.

Example C1. Consider the quadratic curve x^2 − 2xy − 3y^2 + 4x + 3y + 1 = 0. Letting x = x1/x0 and y = x2/x0, we get x1^2 − 2x1x2 − 3x2^2 + 4x1x0 + 3x2x0 + x0^2 = 0. If [0 : x1 : x2] is a point at infinity of the curve, we have x1^2 − 2x1x2 − 3x2^2 = 0, or (x1 − 3x2)(x1 + x2) = 0.

Hence <strong>the</strong>re are two points at infinity, namely [0 : 3 : 1] and [0 : −1 : 1]. <strong>In</strong> general, <strong>the</strong><br />

tangent line at a point p = [p0 : p1 : p2] on an algebraic curve f(x0, x1, x2) = 0 in <strong>the</strong><br />

projective plane is given by <strong>the</strong> recipe<br />

f0(p)x0 + f1(p)x1 + f2(p)x2 = 0, where fk = ∂f/∂xk; k = 0, 1, 2.<br />

At present, f0 = 4x1 + 3x2 + 2x0, f1 = 2x1 − 2x2 + 4x0, f2 = −2x1 − 6x2 + 3x0. At the point [0 : 3 : 1], putting x0 = 0, x1 = 3 and x2 = 1, we get f0 = 15, f1 = 4, f2 = −12.

So <strong>the</strong> tangent line at this point becomes 15x0 + 4x1 − 12x2 = 0. So <strong>the</strong> corresponding<br />

asymptote is 4x − 12y + 15 = 0. We can find the tangent line at the other point [0 : −1 : 1] at infinity in the same way to obtain the second asymptote.
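The computations in Example C1 can be reproduced symbolically; a sketch assuming SymPy is available:

```python
from sympy import symbols, diff, factor

x0, x1, x2 = symbols('x0 x1 x2')

# Homogenization of x^2 - 2xy - 3y^2 + 4x + 3y + 1 = 0
f = x1**2 - 2*x1*x2 - 3*x2**2 + 4*x1*x0 + 3*x2*x0 + x0**2

# Points at infinity: set x0 = 0 and factor
print(factor(f.subs(x0, 0)))               # (x1 - 3*x2)*(x1 + x2)

# Tangent line at [0 : 3 : 1] via f0*x0 + f1*x1 + f2*x2 = 0
p = {x0: 0, x1: 3, x2: 1}
coeffs = [diff(f, v).subs(p) for v in (x0, x1, x2)]
print(coeffs)                              # [15, 4, -12]
```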

Projective spaces provide important examples in geometry and topology. When F is<br />

a finite field, P^n(F) is crucial in the theory of combinatorial designs.
