
Takashi Yamano
Fall Semester 2009

Lecture Notes on Advanced Econometrics

Lecture 4: Multivariate Regression Model in Matrix Form

In this lecture, we rewrite the multiple regression model in matrix form. A general multiple-regression model can be written as

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_k x_{ik} + u_i \qquad \text{for } i = 1, \dots, n.$$

In matrix form, we can rewrite this model as

$$\underset{n \times 1}{\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}} = \underset{n \times (k+1)}{\begin{bmatrix} 1 & x_{11} & x_{12} & \dots & x_{1k} \\ 1 & x_{21} & x_{22} & \dots & x_{2k} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_{n1} & x_{n2} & \dots & x_{nk} \end{bmatrix}} \; \underset{(k+1) \times 1}{\begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \vdots \\ \beta_k \end{bmatrix}} + \underset{n \times 1}{\begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix}}$$

$$Y = X\beta + u$$

We want to estimate $\beta$.

Least Squares Residual Approach in Matrix Form

(Please see Lecture Note A1 for details)

The strategy in the least squares residual approach is the same as in the bivariate linear regression model. First, we calculate the sum of squared residuals and, second, find a set of estimators that minimize the sum. Thus, the minimization problem for the sum of squared residuals in matrix form is

$$\min_{\beta}\; u'u = (Y - X\beta)'(Y - X\beta),$$

where $(Y - X\beta)'$ is $1 \times n$ and $(Y - X\beta)$ is $n \times 1$.

Notice here that $u'u$ is a scalar, i.e., a single number (such as 10,000), because $u'$ is a $1 \times n$ matrix, $u$ is an $n \times 1$ matrix, and the product of these two matrices is a $1 \times 1$ matrix (thus a scalar). Then we can take the first derivative of this objective function in matrix form. First, we simplify the matrices:

$$u'u = (Y' - \beta'X')(Y - X\beta)$$

$$= Y'Y - \beta'X'Y - Y'X\beta + \beta'X'X\beta$$

$$= Y'Y - 2\beta'X'Y + \beta'X'X\beta$$

(The two middle terms combine because $Y'X\beta$ is a scalar and therefore equals its own transpose, $\beta'X'Y$.)

Then, by taking the first derivative with respect to $\beta$, we have:

$$\frac{\partial(u'u)}{\partial \beta} = -2X'Y + 2X'X\beta$$
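This derivative uses two standard matrix-calculus rules (stated here for reference; they are not spelled out in the original note):

$$\frac{\partial (a'\beta)}{\partial \beta} = a, \qquad \frac{\partial (\beta' A \beta)}{\partial \beta} = (A + A')\beta = 2A\beta \quad \text{for symmetric } A,$$

applied with $a = X'Y$ (note that $\beta'X'Y = (X'Y)'\beta$) and $A = X'X$, which is symmetric.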

From the first order condition (F.O.C.), we have

$$-2X'Y + 2X'X\hat\beta = 0$$

$$X'X\hat\beta = X'Y$$

Notice that I have replaced $\beta$ with $\hat\beta$ because $\hat\beta$ satisfies the F.O.C. by definition.

Multiplying both sides by the inverse matrix $(X'X)^{-1}$, we have:

$$\hat\beta = (X'X)^{-1}X'Y \qquad (1)$$

This is the least squares estimator for the multivariate linear regression model in matrix form. We call it the Ordinary Least Squares (OLS) estimator. Note that $(X'X)^{-1}$ exists only when the columns of $X$ are linearly independent (assumption E3 below).

Note that the first order condition $X'X\hat\beta = X'Y$ can be written in matrix form as

$$X'(Y - X\hat\beta) = 0$$

$$\underset{(k+1) \times n}{\begin{bmatrix} 1 & 1 & \dots & 1 \\ x_{11} & x_{21} & \dots & x_{n1} \\ \vdots & \vdots & & \vdots \\ x_{1k} & x_{2k} & \dots & x_{nk} \end{bmatrix}} \left( \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} - \begin{bmatrix} 1 & x_{11} & \dots & x_{1k} \\ 1 & x_{21} & \dots & x_{2k} \\ \vdots & \vdots & & \vdots \\ 1 & x_{n1} & \dots & x_{nk} \end{bmatrix} \begin{bmatrix} \hat\beta_0 \\ \hat\beta_1 \\ \vdots \\ \hat\beta_k \end{bmatrix} \right) = 0$$

$$\begin{bmatrix} 1 & 1 & \dots & 1 \\ x_{11} & x_{21} & \dots & x_{n1} \\ \vdots & \vdots & & \vdots \\ x_{1k} & x_{2k} & \dots & x_{nk} \end{bmatrix} \; \underset{n \times 1}{\begin{bmatrix} y_1 - \hat\beta_0 - \hat\beta_1 x_{11} - \dots - \hat\beta_k x_{1k} \\ y_2 - \hat\beta_0 - \hat\beta_1 x_{21} - \dots - \hat\beta_k x_{2k} \\ \vdots \\ y_n - \hat\beta_0 - \hat\beta_1 x_{n1} - \dots - \hat\beta_k x_{nk} \end{bmatrix}} = 0$$

This is the same as the set of $k+1$ first order conditions we derived in the previous lecture note (on the simple regression model):

$$\sum_{i=1}^{n} (y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \hat\beta_2 x_{i2} - \dots - \hat\beta_k x_{ik}) = 0$$

$$\sum_{i=1}^{n} x_{i1}(y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \hat\beta_2 x_{i2} - \dots - \hat\beta_k x_{ik}) = 0$$

$$\vdots$$

$$\sum_{i=1}^{n} x_{ik}(y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \hat\beta_2 x_{i2} - \dots - \hat\beta_k x_{ik}) = 0$$

Example 4-1: A bivariate linear regression (k = 1) in matrix form

As an example, let's consider a bivariate model in matrix form. A bivariate model is

$$y_i = \beta_0 + \beta_1 x_i + u_i \qquad \text{for } i = 1, \dots, n.$$

In matrix form, this is

$$Y = X\beta + u$$

$$\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix} \begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix} + \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix}$$

From (1), we have

$$\hat\beta = (X'X)^{-1}X'Y \qquad (2)$$

Let's consider each component in (2).

$$X'X = \begin{bmatrix} 1 & 1 & \dots & 1 \\ x_1 & x_2 & \dots & x_n \end{bmatrix} \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix} = \begin{bmatrix} n & \sum_{i=1}^{n} x_i \\ \sum_{i=1}^{n} x_i & \sum_{i=1}^{n} x_i^2 \end{bmatrix} = \begin{bmatrix} n & n\bar{x} \\ n\bar{x} & \sum_{i=1}^{n} x_i^2 \end{bmatrix}$$

This is a $2 \times 2$ square matrix. Thus, using the rule that $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ has inverse $\frac{1}{ad-bc}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$, the inverse matrix of $X'X$ is

$$(X'X)^{-1} = \frac{1}{n\sum_{i=1}^{n} x_i^2 - n^2\bar{x}^2} \begin{bmatrix} \sum_{i=1}^{n} x_i^2 & -n\bar{x} \\ -n\bar{x} & n \end{bmatrix}$$


$$= \frac{1}{n\sum_{i=1}^{n} (x_i - \bar{x})^2} \begin{bmatrix} \sum_{i=1}^{n} x_i^2 & -n\bar{x} \\ -n\bar{x} & n \end{bmatrix},$$

using the identity $\sum_{i=1}^{n} x_i^2 - n\bar{x}^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2$.

The second term is

$$X'Y = \begin{bmatrix} 1 & 1 & \dots & 1 \\ x_1 & x_2 & \dots & x_n \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} \sum_{i=1}^{n} y_i \\ \sum_{i=1}^{n} x_i y_i \end{bmatrix} = \begin{bmatrix} n\bar{y} \\ \sum_{i=1}^{n} x_i y_i \end{bmatrix}$$

Thus the OLS estimators are:

$$\hat\beta = (X'X)^{-1}X'Y = \frac{1}{n\sum_{i=1}^{n} x_i^2 - n^2\bar{x}^2} \begin{bmatrix} \sum_{i=1}^{n} x_i^2 & -n\bar{x} \\ -n\bar{x} & n \end{bmatrix} \begin{bmatrix} n\bar{y} \\ \sum_{i=1}^{n} x_i y_i \end{bmatrix}$$

$$= \frac{1}{n\sum_{i=1}^{n} x_i^2 - n^2\bar{x}^2} \begin{bmatrix} n\bar{y}\sum_{i=1}^{n} x_i^2 - n\bar{x}\sum_{i=1}^{n} x_i y_i \\ n\sum_{i=1}^{n} x_i y_i - n^2\bar{x}\bar{y} \end{bmatrix}$$

$$= \frac{1}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \begin{bmatrix} \bar{y}\sum_{i=1}^{n} x_i^2 - \bar{x}\sum_{i=1}^{n} x_i y_i \\ \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) \end{bmatrix}$$

(in the second element, $\sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$). Rewriting the first element as $\bar{y}\sum_{i=1}^{n} (x_i - \bar{x})^2 - \bar{x}\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$ and dividing through, we obtain

$$\hat\beta = \begin{bmatrix} \hat\beta_0 \\ \hat\beta_1 \end{bmatrix} = \begin{bmatrix} \bar{y} - \hat\beta_1\bar{x} \\ \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \end{bmatrix}$$

This is what you studied in the previous lecture note.
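As a quick numerical check, with made-up data (not from the lecture notes) $(x_i, y_i) = (1,1), (2,2), (3,4)$:

$$X'X = \begin{bmatrix} 3 & 6 \\ 6 & 14 \end{bmatrix}, \qquad X'Y = \begin{bmatrix} 7 \\ 17 \end{bmatrix}, \qquad (X'X)^{-1} = \frac{1}{6}\begin{bmatrix} 14 & -6 \\ -6 & 3 \end{bmatrix},$$

so $\hat\beta = (X'X)^{-1}X'Y = \frac{1}{6}\begin{bmatrix} 98 - 102 \\ -42 + 51 \end{bmatrix} = \begin{bmatrix} -2/3 \\ 3/2 \end{bmatrix}$. The summation formulas agree: $\bar{x} = 2$, $\bar{y} = 7/3$, $\hat\beta_1 = \sum(x_i - \bar{x})(y_i - \bar{y})/\sum(x_i - \bar{x})^2 = 3/2$, and $\hat\beta_0 = \bar{y} - \hat\beta_1\bar{x} = -2/3$.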

End of Example 4-1

Unbiasedness of OLS

In this sub-section, we show the unbiasedness of OLS under the following assumptions.

Assumptions:

E1 (Linear in parameters): $Y = X\beta + u$

E2 (Zero conditional mean): $E(u \mid X) = 0$

E3 (No perfect collinearity): $X$ has full column rank $k+1$ (no independent variable is an exact linear combination of the others), so $(X'X)^{-1}$ exists.

From (2), we know the OLS estimators are

$$\hat\beta = (X'X)^{-1}X'Y$$

We can replace $Y$ with the population model (E1):

$$\hat\beta = (X'X)^{-1}X'(X\beta + u)$$

$$= (X'X)^{-1}X'X\beta + (X'X)^{-1}X'u$$

$$= \beta + (X'X)^{-1}X'u$$

By taking the expectation on both sides of the equation (conditioning on $X$), we have:

$$E(\hat\beta) = \beta + (X'X)^{-1}E(X'u),$$

where $E(X'u)$ can be written as

$$E(X'u) = E\left( \begin{bmatrix} 1 & 1 & \dots & 1 \\ x_{11} & x_{21} & \dots & x_{n1} \\ \vdots & \vdots & & \vdots \\ x_{1k} & x_{2k} & \dots & x_{nk} \end{bmatrix} \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix} \right) = E\begin{bmatrix} \sum_{i=1}^{n} u_i \\ \sum_{i=1}^{n} x_{i1}u_i \\ \vdots \\ \sum_{i=1}^{n} x_{ik}u_i \end{bmatrix} = \begin{bmatrix} E(u_i) \times n \\ E(x_{i1}u_i) \times n \\ \vdots \\ E(x_{ik}u_i) \times n \end{bmatrix}.$$


$E(x_{ik}u_i)$ can also be written as $E(x_{ik}u_i) = E(x_{ik})E(u_i) + \mathrm{cov}(x_{ik}, u_i)$. The first term is zero because $E(u \mid X) = E(u) = 0$. The second term is zero because $E(u \mid X) = 0$ implies that $u$ is uncorrelated with each independent variable in $X$. Therefore, we can conclude that $E(X'u) = 0$.

Finally, we find that the OLS estimators are unbiased under assumptions E1-E3:

$$E(\hat\beta) = \beta.$$
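A minimal Monte Carlo sketch in Stata (my addition, with hypothetical parameter values) illustrates the result: when the data are generated so that E1-E3 hold, the average of $\hat\beta_1$ across many simulated samples should be close to the true value.

* Hypothetical simulation: true model is y = 1 + 2*x + u with E(u|x) = 0
clear all
set seed 12345
program define olssim, rclass
    drop _all
    set obs 100
    generate x = rnormal()
    generate y = 1 + 2*x + rnormal()   // true beta0 = 1, beta1 = 2
    regress y x
    return scalar b1 = _b[x]
end
simulate b1 = r(b1), reps(1000) nodots: olssim
summarize b1    // mean of b1 should be close to 2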

The Variance of OLS Estimators

Next, we consider the variance of the estimators.

Assumption:

E4 (Homoskedasticity): $\mathrm{Var}(u_i \mid X) = \sigma^2$ and $\mathrm{Cov}(u_i, u_j) = 0$ for $i \neq j$; thus $\mathrm{Var}(u \mid X) = \sigma^2 I$.

Because of this assumption, we have

$$E(uu') = E\begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix} \begin{bmatrix} u_1 & u_2 & \dots & u_n \end{bmatrix} = \begin{bmatrix} E(u_1u_1) & E(u_1u_2) & \dots & E(u_1u_n) \\ E(u_2u_1) & E(u_2u_2) & \dots & E(u_2u_n) \\ \vdots & \vdots & & \vdots \\ E(u_nu_1) & E(u_nu_2) & \dots & E(u_nu_n) \end{bmatrix} = \begin{bmatrix} \sigma^2 & 0 & \dots & 0 \\ 0 & \sigma^2 & \dots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \dots & \sigma^2 \end{bmatrix} = \sigma^2 I$$

with dimensions $(n \times 1)(1 \times n) = n \times n$.

Therefore,

$$\mathrm{Var}(\hat\beta) = \mathrm{Var}[\beta + (X'X)^{-1}X'u]$$

$$= \mathrm{Var}[(X'X)^{-1}X'u]$$

$$= E[(X'X)^{-1}X'uu'X(X'X)^{-1}]$$

$$= (X'X)^{-1}X'E(uu')X(X'X)^{-1}$$

$$= (X'X)^{-1}X'\sigma^2 I\, X(X'X)^{-1} \qquad \text{(E4: Homoskedasticity)}$$

$$\mathrm{Var}(\hat\beta) = \sigma^2(X'X)^{-1} \qquad (3)$$

In practice $\sigma^2$ is unknown; as in Example 4-2 below, it is estimated by $s^2 = \hat{u}'\hat{u}/(n - k - 1)$, where $\hat{u} = Y - X\hat\beta$.

GAUSS-MARKOV Theorem:

Under assumptions E1-E4, $\hat\beta$ is the Best Linear Unbiased Estimator (BLUE).
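The proof is omitted in this note; a brief sketch of the standard argument (my addition): any other linear unbiased estimator can be written as $\tilde\beta = [(X'X)^{-1}X' + D]Y$, where unbiasedness forces $DX = 0$. Then

$$\mathrm{Var}(\tilde\beta \mid X) = \sigma^2[(X'X)^{-1}X' + D][(X'X)^{-1}X' + D]' = \sigma^2(X'X)^{-1} + \sigma^2 DD',$$

which exceeds $\mathrm{Var}(\hat\beta \mid X) = \sigma^2(X'X)^{-1}$ by the positive semi-definite matrix $\sigma^2 DD'$ (the cross terms vanish because $DX = 0$).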



Example 4-2: Step-by-Step Regression Estimation by STATA

In this sub-section, I would like to show you how the matrix calculations we have studied are used in econometrics packages. Of course, in practice you do not create matrix programs: econometrics packages already have built-in programs.

The following are matrix calculations in STATA using a data set called NFIncomeUganda.dta. Here we want to estimate the following model:

$$\ln(\mathrm{income}_i) = \beta_0 + \beta_1\,\mathrm{female}_i + \beta_2\,\mathrm{edu}_i + \beta_3\,\mathrm{edusq}_i + u_i$$

All the variables are defined in Example 3-1. Descriptive information about the variables is given here:

. su;

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
      female |       648    .2222222    .4160609          0          1
         edu |       648    6.476852    4.198633         -8         19
       edusq |       648    59.55093    63.28897          0        361
-------------+--------------------------------------------------------
 ln_nfincome |       648    12.81736    1.505715   7.600903   16.88356

First, we need to define matrices. In STATA, you can load specific variables (data) into matrices. The command is called mkmat. Here we create a matrix, called y, containing the dependent variable, ln_nfincome, and a matrix of independent variables, called x, containing female, edu, edusq, and a constant. (The listings below include a column of ones named const, so it must be loaded into x as well; the line generating it is added here for consistency.)

. gen const=1
. mkmat ln_nfincome, matrix(y)
. mkmat female edu edusq const, matrix(x)

Then, we create the components $X'X$, $(X'X)^{-1}$, and $X'Y$. First, $X'X$:

. matrix xx=x'*x;
. mat list xx;

symmetric xx[4,4]
        female      edu    edusq    const
female     144
edu        878    38589
edusq     8408   407073  4889565
const      144     4197    38589      648

Next, the inverse $(X'X)^{-1}$:

. matrix ixx=syminv(xx);
. mat list ixx;

symmetric ixx[4,4]
            female         edu       edusq       const
female    .0090144
edu      .00021374   .00053764
edusq   -.00001238  -.00003259   2.361e-06
const   -.00265043   -.0015892   .00007321   .00806547


Here is $X'Y$:

. matrix xy=x'*y;
. mat list xy;

xy[4,1]
        ln_nfincome
female    1775.6364
edu       55413.766
edusq     519507.74
const     8305.6492

Therefore the OLS estimators are $(X'X)^{-1}X'Y$:

. ** Estimating b hat;
. matrix bhat=ixx*xy;
. mat list bhat;

bhat[4,1]
        ln_nfincome
female   -.59366458
edu       .04428822
edusq     .00688388
const     12.252496

. ** Estimating standard error for b hat;
. matrix e=y-x*bhat;
. matrix ss=(e'*e)/(648-1-3);
. matrix kk=vecdiag(ixx);
. mat list ss;

symmetric ss[1,1]
             ln_nfincome
ln_nfincome    1.8356443

. mat list kk;

kk[1,4]
      female        edu      edusq      const
r1  .0090144  .00053764  2.361e-06  .00806547
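The standard errors are the square roots of $s^2$ (the element of ss) times the diagonal elements of $(X'X)^{-1}$ (the elements of kk). A hypothetical continuation of the do-file (my addition, not in the original log) computes them directly:

. matrix V = ss[1,1]*kk;
. display sqrt(V[1,1]);   /* se(female) = sqrt(1.8356443*.0090144) = .1286 */
. display sqrt(V[1,2]);   /* se(edu)    = .0314 */
. display sqrt(V[1,3]);   /* se(edusq)  = .0021 */
. display sqrt(V[1,4]);   /* se(const)  = .1217 */

These match the Std. Err. column in the regression output below.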

Let's verify what we have found.

. reg ln_nfincome female edu edusq;

      Source |       SS       df       MS              Number of obs =     648
-------------+------------------------------           F(  3,   644) =   51.70
       Model |  284.709551     3  94.9031835           Prob > F      =  0.0000
    Residual |  1182.15494   644  1.83564431           R-squared     =  0.1941
-------------+------------------------------           Adj R-squared =  0.1903
       Total |  1466.86449   647   2.2671785           Root MSE      =  1.3549

------------------------------------------------------------------------------
 ln_nfincome |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      female |  -.5936646   .1286361    -4.62   0.000    -.8462613   -.3410678
         edu |   .0442882   .0314153     1.41   0.159    -.0174005     .105977
       edusq |   .0068839   .0020818     3.31   0.001      .002796    .0109718
       _cons |    12.2525   .1216772   100.70   0.000     12.01356    12.49143
------------------------------------------------------------------------------

The coefficients match bhat exactly, and the residual mean square (1.83564431) matches ss.

end of do-file
