
OSU Economics 444: Elementary Econometrics

Ch. 10 Heteroskedasticity

• (Pure) heteroskedasticity is caused by the error term of a correctly specified equation:

Var(ε_i) = σ_i²,  i = 1, 2, ..., n,

i.e., the variance of the error term depends on exactly which observation it is.

1) Heteroskedasticity occurs in data sets in which there is a wide disparity between the largest and smallest observed values. We may expect the error term for very large observations to have a large variance, while the error term for small observations has a small variance.

2) Heteroskedasticity is more likely to arise in cross-sectional models, which often have observations of widely different sizes in a sample.

3) Heteroskedasticity may take on many complex forms.

4) A simple but special model of heteroskedasticity assumes that the variance of the error term is related to an exogenous variable z:

y_i = β_0 + β_1 x_1i + ... + β_k x_ki + ε_i

with

Var(ε_i) = σ² z_i².

(a) The variance of ε_i is proportional to the square of z_i: the higher the value of z_i, the higher the variance of ε_i.

(b) An example: the consumption of a household relative to its income. The expenditures of a low-income household are not likely to be as variable in absolute value as the expenditures of a high-income one.

Figure 10.3 here<br />
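As a quick numeric illustration of this special model (a hypothetical simulation, not course data), one can draw errors whose standard deviation is proportional to z and confirm that the error variance grows with z:

```python
import numpy as np

# Hypothetical simulation of Var(eps_i) = sigma^2 * z_i^2: the standard
# deviation of each error is sigma * z_i, so the error variance rises with z.
rng = np.random.default_rng(0)
n = 100_000
sigma = 2.0
z = rng.uniform(1.0, 10.0, n)       # exogenous proportionality factor
eps = rng.normal(0.0, sigma * z)    # sd = sigma * z  =>  Var = sigma^2 z^2

low = eps[z < 3.0]                  # observations with small z
high = eps[z > 8.0]                 # observations with large z
print(low.var(), high.var())        # the high-z group has a much larger variance
```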

• Impure heteroskedasticity

— heteroskedasticity that is caused by an error in specification, such as an omitted variable.

1) An omitted variable may cause a heteroskedastic error because the portion of the omitted effect not represented by the included explanatory variables may be absorbed by the error term.

2) The correct remedy is to find the omitted variable and include it in the regression.

•• Consequences of (Pure) Heteroskedasticity

1) Pure heteroskedasticity does not cause bias in the OLSEs of the regression coefficients.

(a) Consider the simple regression model y_i = βx_i + ε_i with Var(ε_i) = σ_i². The OLSE is

β̂ = (Σ_{i=1}^n x_i y_i) / (Σ_{i=1}^n x_i²) = β + (Σ_{i=1}^n x_i ε_i) / (Σ_{i=1}^n x_i²).

Therefore,

E(β̂) = β + (Σ_{i=1}^n x_i E(ε_i | x_i)) / (Σ_{i=1}^n x_i²) = β.
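A short Monte Carlo sketch of this unbiasedness result (all numbers are illustrative, not from the course):

```python
import numpy as np

# Monte Carlo check: with Var(eps_i) = x_i^2 (heteroskedastic errors),
# the OLSE beta_hat = sum(x*y)/sum(x^2) is still centered on the true beta.
rng = np.random.default_rng(1)
beta = 2.0
x = rng.uniform(1.0, 5.0, 200)      # fixed regressor across replications

estimates = []
for _ in range(2000):
    eps = rng.normal(0.0, x)        # sd of eps_i grows with x_i
    y = beta * x + eps
    estimates.append((x * y).sum() / (x ** 2).sum())

print(np.mean(estimates))           # close to beta = 2 despite heteroskedasticity
```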



2) The Gauss-Markov theorem does not hold. The OLSE may not be the estimator with the smallest variance within the class of linear unbiased estimators.

3) The variance formula for the OLSE is not correct. The variance formula tends to underestimate the true variance of the OLSE.

(a) For the simple regression model y_i = βx_i + ε_i with Var(ε_i) = σ_i², the true variance of the OLSE β̂ is

Var(β̂) = (Σ_{i=1}^n x_i² σ_i²) / (Σ_{i=1}^n x_i²)².

(b) The variance formula from the computer (ignoring heteroskedastic variances) is

(Σ_{i=1}^n e_i² / (n − 1)) · (1 / Σ_{i=1}^n x_i²).

It can be shown that

E(Σ_{i=1}^n e_i²) = Σ_{i=1}^n σ_i² − (Σ_{i=1}^n x_i² σ_i²) / (Σ_{i=1}^n x_i²).

If σ_i² and x_i² are positively correlated, one has

(Σ_{i=1}^n x_i² · Σ_{i=1}^n σ_i² − Σ_{i=1}^n x_i² σ_i²) / ((n − 1)(Σ_{i=1}^n x_i²)²) ≤ (Σ_{i=1}^n x_i² σ_i²) / (Σ_{i=1}^n x_i²)²,

where the left-hand side is the expected value of the estimated variance and the right-hand side is the true variance. That is, the expected value of the estimated variance is smaller than the true variance.
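This underestimation can be seen in a Monte Carlo sketch (illustrative numbers, with σ_i chosen proportional to x_i so that σ_i² and x_i² are positively correlated):

```python
import numpy as np

# When sigma_i^2 and x_i^2 are positively correlated, the conventional
# formula (sum e^2 / (n-1)) / sum x^2 reports too small a variance
# compared with the actual sampling variance of beta_hat.
rng = np.random.default_rng(2)
beta, n = 1.0, 100
x = rng.uniform(1.0, 5.0, n)

betas, reported = [], []
for _ in range(3000):
    eps = rng.normal(0.0, x)                    # sigma_i proportional to x_i
    y = beta * x + eps
    b = (x * y).sum() / (x ** 2).sum()
    e = y - b * x                               # OLS residuals
    betas.append(b)
    reported.append((e ** 2).sum() / (n - 1) / (x ** 2).sum())

true_var = np.var(betas)                        # sampling variance of beta_hat
avg_reported = np.mean(reported)                # what the standard formula reports
print(true_var, avg_reported)                   # the reported value is smaller
```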

•• Testing for Heteroskedasticity

There are many test statistics, depending on the model. The following are two familiar tests.

• The Park Test

It is designed to test possible heteroskedasticity of the form

Var(ε_i) = σ² z_i^δ.

It has three steps:

1. Obtain the OLS residuals: estimate the regression model by OLS (ignoring possible heteroskedasticity) and compute

e_i = y_i − β̂_0 − β̂_1 x_1i − ... − β̂_k x_ki,  i = 1, ..., n.

2. Run the regression

ln(e_i²) = α_0 + α_1 ln(z_i) + u_i,

where z_i is a possible (best choice) proportionality factor.

3. Test the significance of α̂_1 with a t-test. If it is significant, this is evidence of heteroskedasticity; otherwise, not.

4. An empirical example: Woody's restaurants.

OLSE:

ŷ_i = 102,192 − 9075 N_i + 0.355 P_i + 1.288 I_i
               (2053)     (0.073)     (0.543)
t =            −4.42      4.88        2.37
n = 33,  R̄² = 0.579,  F = 15.65,

where

y = the check volume at a Woody's restaurant
N = the number of nearby competitors
P = the nearby population
I = the average household income of the local area.

Park's test: see whether the residuals give any indication of heteroskedasticity by using the population P as the factor, because large error term variances might exist in more heavily populated areas.

The estimated auxiliary regression is

ln(e_i²) = 21.05 − 0.2865 ln P_i
                  (0.6263)
t =               −0.457
n = 33,  R² = 0.0067,  F = 0.209.

The calculated t-score of −0.457 is too small, so there is no strong evidence of heteroskedasticity.
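The three Park-test steps can be sketched on simulated data (everything below is hypothetical, not the Woody's data; the true δ is 2 by construction):

```python
import numpy as np

# Park test sketch. Step 1: OLS residuals from the original model.
# Step 2: regress ln(e^2) on ln(z). Step 3: t-test on the slope alpha_1.
rng = np.random.default_rng(3)
n = 500
z = rng.uniform(1.0, 10.0, n)
x = rng.uniform(0.0, 5.0, n)
y = 1.0 + 2.0 * x + rng.normal(0.0, z)   # Var(eps) = z^2, i.e. delta = 2

def ols(X, y):
    """Return coefficients, standard errors, and residuals for y = X b + e."""
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y
    e = y - X @ b
    s2 = (e @ e) / (len(y) - X.shape[1])
    se = np.sqrt(s2 * np.diag(XtX_inv))
    return b, se, e

# Step 1: OLS on the original model, keep residuals.
X = np.column_stack([np.ones(n), x])
_, _, e = ols(X, y)

# Step 2: auxiliary regression ln(e^2) = a0 + a1 ln(z) + u.
Z = np.column_stack([np.ones(n), np.log(z)])
(a0, a1), (se0, se1), _ = ols(Z, np.log(e ** 2))

# Step 3: t statistic for a1; a large |t| is evidence of heteroskedasticity.
t = a1 / se1
print(a1, t)                             # slope near delta = 2, t well above 2
```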

• The White Test

It is more general than the Park test and does not require choosing a possible z factor (as in the Park test).

1) Run a regression of the squared residuals on all the original independent variables, their squares, and their cross products.

2) For example, for y = β_0 + β_1 x_1 + β_2 x_2 + ε, White's test regression equation is

e_i² = α_0 + α_1 x_1i + α_2 x_2i + α_3 x_1i² + α_4 x_2i² + α_5 x_1i x_2i + u_i.

3) Test the overall significance of the regression coefficients of the test regression for e_i² (excluding the constant term) with an F-statistic. Alternatively, use nR², where R² is from the test regression equation, as a chi-square statistic with degrees of freedom equal to the number of slope coefficients.
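A sketch of White's test for the two-regressor example above, on simulated (hypothetical) data where the error variance depends on x_1:

```python
import numpy as np

# White's test: regress e^2 on x1, x2, their squares and cross product,
# then use n*R^2 as a chi-square statistic with 5 degrees of freedom
# (the number of slope coefficients in the auxiliary regression).
rng = np.random.default_rng(4)
n = 400
x1, x2 = rng.uniform(1, 5, n), rng.uniform(1, 5, n)
y = 1.0 + 0.5 * x1 - 0.5 * x2 + rng.normal(0.0, x1)   # variance depends on x1

# OLS residuals from the original equation.
X = np.column_stack([np.ones(n), x1, x2])
b = np.linalg.lstsq(X, y, rcond=None)[0]
e = y - X @ b

# Auxiliary regression for the squared residuals.
W = np.column_stack([np.ones(n), x1, x2, x1 ** 2, x2 ** 2, x1 * x2])
g = np.linalg.lstsq(W, e ** 2, rcond=None)[0]
fit = W @ g
r2 = 1 - ((e ** 2 - fit) ** 2).sum() / ((e ** 2 - (e ** 2).mean()) ** 2).sum()

nr2 = n * r2
print(nr2)   # compare with the chi-square(5) 5% critical value, about 11.07
```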

•• Remedies for Heteroskedasticity

• Weighted Least Squares — a version of GLS, designed specifically for the heteroskedastic problem.

The method is to transform the ε_i into a new disturbance with constant variance σ². The OLS approach is then applied to the transformed equation. The resulting OLS estimator for the transformed equation is called the weighted least squares estimator.

1) This approach requires knowledge of the specification of the variance function.

2) For the model y_i = β_0 + β_1 x_1i + ε_i, suppose the variance of ε_i is specified as

Var(ε_i) = σ² z_i².

The transformed equation is

y_i / z_i = β_0 (1/z_i) + β_1 (x_1i / z_i) + u_i,

because u_i = ε_i / z_i, which is homoskedastic.

a) Estimate the transformed equation by OLS, with dependent variable y_i/z_i and explanatory variables 1/z_i and x_1i/z_i.

b) Note that the transformed equation may not have an intercept term. That is OK.

c) An intercept term may appear if z is one of the explanatory variables x. For example, if z = x_1, then the transformed equation is

y_i / x_1i = β_0 (1/x_1i) + β_1 + u_i,

where β_1 becomes the intercept term in the transformed equation.

3) The weighted least squares estimates should be interpreted as the coefficients of the original (not transformed) regression equation.

4) The weighted least squares estimator is the BLUE, assuming that the variance function is correctly specified.
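The transformation in 2) can be sketched on simulated (hypothetical) data; note that the estimates recovered are the coefficients of the original equation:

```python
import numpy as np

# WLS for Var(eps_i) = sigma^2 * z_i^2: divide the whole equation by z_i,
# giving y/z = beta0*(1/z) + beta1*(x1/z) + u with homoskedastic u = eps/z,
# then run OLS on the transformed variables.
rng = np.random.default_rng(5)
n = 1000
x1 = rng.uniform(0.0, 10.0, n)
z = rng.uniform(1.0, 10.0, n)
y = 3.0 + 1.5 * x1 + rng.normal(0.0, z)   # true beta0 = 3, beta1 = 1.5

Xt = np.column_stack([1.0 / z, x1 / z])   # transformed regressors (no intercept)
yt = y / z                                # transformed dependent variable
b0_wls, b1_wls = np.linalg.lstsq(Xt, yt, rcond=None)[0]
print(b0_wls, b1_wls)                     # estimates of the ORIGINAL beta0, beta1
```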

• Robust variance estimates for the OLSE with an unknown form of heteroskedasticity

1) The OLSE (ignoring heteroskedastic variances) is unbiased, but the standard variance formula for the OLSE is not valid.

2) This approach does not attempt to obtain a possibly better coefficient estimate; rather, it attempts to obtain a valid (in large samples) estimate of the proper variance of the OLSE.

3) For example, for the model y_i = βx_i + ε_i (with only a single regressor and no intercept term, for illustration purposes), the heteroskedasticity-corrected variance estimate for the OLSE β̂ is

(Σ_{i=1}^n x_i² e_i²) / (Σ_{i=1}^n x_i²)²,

where the e_i are the OLS residuals.

4) The robust variance formula does not require any specification of the variance function. The technique works better in large samples.

5) The robust variance can be used in t-tests for hypothesis testing: use the robust standard error (the square root of the robust variance) in the denominator of the t-ratio formula.
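The single-regressor robust formula in 3) can be sketched on simulated (hypothetical) heteroskedastic data, alongside the conventional formula for comparison:

```python
import numpy as np

# Robust (White) variance for the no-intercept model y = beta*x + eps:
#   robust_var = sum(x^2 e^2) / (sum x^2)^2,
# versus the conventional formula that ignores heteroskedasticity.
rng = np.random.default_rng(6)
n = 2000
x = rng.uniform(1.0, 5.0, n)
y = 2.0 * x + rng.normal(0.0, x)         # heteroskedastic: sd of eps_i is x_i

b = (x * y).sum() / (x ** 2).sum()       # OLS estimate
e = y - b * x                            # OLS residuals

robust_var = (x ** 2 * e ** 2).sum() / (x ** 2).sum() ** 2
naive_var = (e ** 2).sum() / (n - 1) / (x ** 2).sum()
print(np.sqrt(robust_var), np.sqrt(naive_var))   # robust SE is the larger one here
```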

• Redefining the variables

Using theoretical reasoning, formulate the regression model with variables that might be less subject to heteroskedasticity.

1) For example, consider a model of total expenditures (EXP) by governments of different cities, explained by aggregate income (INC), the population (POP), and the average wage (WAGE) in each city. A regression model specified as

EXP_i = β_0 + β_1 POP_i + β_2 INC_i + β_3 WAGE_i + ε_i

is likely to have heteroskedastic disturbances, because larger cities have larger incomes and larger expenditures than smaller ones.



Another theoretical model may be

EXP_i / POP_i = α_0 + α_1 (INC_i / POP_i) + α_2 WAGE_i + ε_i,

where the variables are formulated in per capita terms. The disparity between large and small observations disappears with the per capita variables, and this specification might be less subject to the heteroskedasticity issue.

• An empirical example:

Try to explain petroleum consumption by state (PCON), using explanatory variables including the size of the state and the gasoline tax rate (TAX).

A possible specification is

PCON_i = β_0 + β_1 REG_i + β_2 TAX_i + ε_i,

where

PCON_i = petroleum consumption in the ith state
REG_i = motor vehicle registrations in the ith state
TAX_i = the gasoline tax rate in the ith state.

1) OLS approach: the estimated equation is

PCON_i = 551.7 + 0.1861 REG_i − 53.59 TAX_i
                 (0.0117)       (16.86)
t =              15.88          −3.18
R̄² = 0.861,  n = 50.

The estimated coefficients are significant and have the expected signs.

2) The equation might be subject to heteroskedasticity caused by variation in the size of the states. A plot of the OLS residuals against REG appears to show a wider distribution for large values of REG than for small values of REG.

(Figure 10.8 here)<br />

3) Run a Park test with ln(REG) as the factor; the estimated auxiliary regression is

ln(e_i²) = 1.650 + 0.952 ln(REG_i)
                   (0.308)
t =                3.09
R̄² = 0.148,  n = 50,  F = 9.533.

The critical t-value for a 1% two-tailed t-test is about 2.7. The computed t = 3.09 is larger than 2.7 and, hence, we reject the null hypothesis of homoskedasticity.



4) Use robust estimated variances for the OLSEs:

PCON_i = 551.7 + 0.1861 REG_i − 53.59 TAX_i
                 (0.022)        (23.90)
t =              8.64           −2.24
R̄² = 0.861,  n = 50.

The robust variances of the OLSEs are larger than those without correction, so the uncorrected variance formulas underestimate the proper variances of the OLSEs.

5) Estimation with the weighted least squares method:

PCON_i / REG_i = 218.54 (1/REG_i) + 0.168 − 17.389 (TAX_i / REG_i)
                                    (0.014)  (4.682)
t =                                 12.27     −3.71
R̄² = 0.333,  n = 50.

The weighted least squares estimates of β_1 and β_2 have smaller (estimated) standard errors than those of the OLSEs (compared with the robust variances) in 4). The overall fit is worse, but this is unimportant because the dependent variables differ between the two equations.

6) An alternative formulation uses per capita petroleum consumption (PCON/POP), where POP is the population of a state:

PCON_i / POP_i = 0.168 + 0.1082 (REG_i / POP_i) − 0.0103 TAX_i
                         (0.0716)                 (0.0035)
t =                      1.51                     −2.95
R̄² = 0.165,  n = 50.

This approach is quite different. It is not necessarily better, and it is not directly comparable to the other equations. Which specification is better will depend on the purposes of the research.

