MLE estimation in STATA
Notes on MLE in STATA<br />
July 31, 2013<br />
NOTE: I will not test you on this material, but I provide our notes here in case you need them in the future<br />
(e.g., for a homework problem).<br />
A good guide for using MLE in STATA has been written by Marco R. Steenbergen. I uploaded it to my website<br />
just in case; you can find it at http://www.econ.umn.edu/~evdok003/MLE_in_stata.pdf. I will illustrate the<br />
basic techniques with some examples, progressively more difficult.<br />
We start with the simplest possible model:<br />
$$y_i = x_i \beta + \epsilon_i, \qquad \epsilon_i \sim \text{i.i.d. } N(0, \sigma^2).$$<br />
The likelihood function is given by<br />
$$L = \prod_{i=1}^{N} f(y_i \mid x_i, \theta),$$<br />
where<br />
$$f(y_i \mid x_i, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(y_i - x_i \beta)^2}{2\sigma^2}} = \frac{1}{\sigma} \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{y_i - \mu_i}{\sigma}\right)^2} = \frac{1}{\sigma} \phi\left(\frac{y_i - \mu_i}{\sigma}\right).$$<br />
$\phi$ here is the standard normal density and $\mu_i$ the conditional mean $x_i \beta$. We can therefore write the likelihood<br />
function in terms of $\phi$ as<br />
$$L = \prod_{i=1}^{N} \frac{1}{\sigma} \phi\left(\frac{y_i - \mu_i}{\sigma}\right).$$<br />
Therefore,<br />
$$\ln(L) = \sum_{i=1}^{N} \left[ \ln \phi\left(\frac{y_i - \mu_i}{\sigma}\right) - \ln(\sigma) \right] = \sum_{i=1}^{N} \ln(f_i).$$<br />
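It can help to write the term inside the sum out in full. The block below is only a worked expansion, using the standard identity $\ln \phi(z) = -\frac{1}{2}\ln(2\pi) - \frac{z^2}{2}$; nothing in it goes beyond the model stated above.<br />

```latex
\begin{aligned}
\ln f_i &= \ln \phi\left(\frac{y_i - \mu_i}{\sigma}\right) - \ln \sigma
         = -\frac{1}{2}\ln(2\pi) - \ln \sigma - \frac{(y_i - x_i \beta)^2}{2\sigma^2}, \\
\ln L   &= -\frac{N}{2}\ln(2\pi) - N \ln \sigma - \frac{1}{2\sigma^2} \sum_{i=1}^{N} (y_i - x_i \beta)^2 .
\end{aligned}
```

For any fixed $\sigma$, maximizing $\ln L$ over $\beta$ is therefore the same as minimizing the sum of squared residuals.<br />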
The term inside the sum, $\ln(f_i)$, is programmed in STATA as follows<br />
program define normal<br />
version 1.0<br />
args lnf mu sigma<br />
quietly replace `lnf'=ln(normalden(($ML_y1-`mu')/`sigma'))-ln(`sigma')<br />
end<br />
The layout of the program is standard, so it is useful to go over it. The first line tells us that we are defining a<br />
program called “normal.” The second line is optional; it declares the version of STATA under which the commands in the program are interpreted. The third line defines<br />
the arguments of the program: $\ln(f_i)$ and the parameters $\ln(f_i)$ depends on, which are the mean ($\mu_i$) and<br />
the standard deviation ($\sigma$) of the distribution of $y_i$. The dependent variable $y_i$ is not an argument; it enters through the global macro $ML_y1. Notice that the fourth line tells the program how $y_i$ enters into $\ln(f_i)$!<br />
We then use the program defined above to create the full log-likelihood. This is done using the following line<br />
ml model lf normal (mu: y=x) (sigma:)<br />
Above, “lf” stands for “linear form.” This command tells STATA that $\ln(L) = \ln(f_1) + \ln(f_2) + ... + \ln(f_N)$. This<br />
will be true for every model we consider below. “normal” tells STATA what the shape of each individual $\ln(f_i)$ is.<br />
The two sets of parentheses tell STATA to estimate two parameters, using dependent variable<br />
y. The first parameter is $\mu_i$, which depends on some regressor $x_i$ (since $\mu_i = x_i \beta$), and the second parameter is the<br />
standard deviation $\sigma$, which does not depend on any regressor, so its equation contains only a constant. The labels “mu” and “sigma” merely name the two equations.<br />
Finally, we type<br />
ml max<br />
to maximize the log-likelihood. This will provide you the MLE estimates $\hat{\beta}_{MLE}$ and $\hat{\sigma}_{MLE}$ that you derived analytically<br />
in your first homework, as well as their standard errors.<br />
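One way to convince yourself that the procedure works is to run it on simulated data and compare with OLS. In this model the analytical MLE of $\beta$ is the OLS estimator and the MLE of $\sigma$ is $\sqrt{SSR/N}$. The sketch below is only an illustration under stated assumptions: the data-generating process ($y = 1 + 2x + e$, with $e$ having standard deviation 1.5), the seed, the sample size, and the evaluator name normal_check are all made up for this example and are not part of the course material.<br />

```stata
* Hypothetical simulated data: y = 1 + 2*x + e, e ~ N(0, 1.5^2)
clear
set seed 12345
set obs 1000
generate double x = rnormal()
generate double y = 1 + 2*x + 1.5*rnormal()

* Same evaluator as in the notes, under a hypothetical name
* so that this block is self-contained
capture program drop normal_check
program define normal_check
    args lnf mu sigma
    quietly replace `lnf' = ln(normalden(($ML_y1 - `mu')/`sigma')) - ln(`sigma')
end

* Two equations: mu depends on x, sigma is a constant only
ml model lf normal_check (mu: y = x) (sigma:)
ml maximize
matrix b_ml = e(b)    // columns: mu:x, mu:_cons, sigma:_cons

* OLS benchmark
regress y x
display "ML slope: " b_ml[1,1] "   OLS slope: " _b[x]
display "ML const: " b_ml[1,2] "   OLS const: " _b[_cons]
display "ML sigma: " b_ml[1,3] "   sqrt(SSR/N): " sqrt(e(rss)/e(N))
```

The two sets of numbers should agree up to the tolerance of the optimizer. Note that the OLS root MSE reported by regress divides by $N - k$ rather than $N$, so it will be slightly larger than the ML estimate of $\sigma$.<br />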
We next follow Bresnahan and Reiss (1991), looking at the market for tire manufacturers as an application of<br />
the MLE method. Our dataset (available at http://www.econ.umn.edu/~evdok003/BR.csv) includes markets<br />
with 0, 1, 2, 3, 4, 5 and > 5 entrants. First, we will focus on markets with 0 or 1 producers, modeling market entry<br />
using a probit specification. Thus,<br />
$$y_i = \begin{cases} 1 & \text{if } \Pi_1 > 0, \\ 0 & \text{otherwise,} \end{cases}$$<br />
where $\Pi_1$ are the firm’s profits. We further assume that $\Pi_1 = \bar{\Pi}_1 + \epsilon$, $\bar{\Pi}_1$ being the firm’s expected profits, a<br />
nonlinear function of a number of regressors, and $\epsilon \sim N(0, \sigma^2)$. NOTE THAT THE NONLINEAR NATURE OF<br />
THE EXPECTED PROFIT FUNCTION MAKES STATA’S PROBIT COMMAND INAPPLICABLE. We must<br />
therefore build the MLE estimator from scratch.<br />
The regressors are:<br />
• tpop - town population<br />
• opop - nearby population<br />
• ngrw - negative tpop growth<br />
• pgrw - positive tpop growth<br />
• octy - commuters out of county<br />
• landv - value per acre of farm-land and buildings<br />
• eld - percentage of the county population being 65 or older<br />
• ffrac - fraction of land in farms<br />
• pinc - per capita income<br />
• lnhdd - log of heating degree days<br />
The specific form of the expected profit function is assumed to be<br />
$$\bar{\Pi}_1 = S(Y; \lambda) \cdot V_1(Z; \alpha_1, \beta) - F_1(W; \gamma).$$<br />
Here, $S(Y; \lambda)$ is a measure of market size, which is a function of population parameters $Y$. $V_1(Z; \alpha_1, \beta)$ is a<br />
measure of per-capita demand, which depends on demand shifters $Z$. $F_1(W; \gamma)$ is a measure of costs, which depends<br />
on cost shifters $W$. We assume that $S$, $V_1$ and $F_1$ depend on the regressors linearly in the following way:<br />
$$S(Y; \lambda) = \text{tpop} + \lambda_1 \text{opop} + \lambda_2 \text{ngrw} + \lambda_3 \text{pgrw} + \lambda_4 \text{octy}$$<br />
$$V_1(Z; \alpha_1, \beta) = \alpha_1 + \beta_1 \text{eld} + \beta_2 \text{pinc} + \beta_3 \text{lnhdd} + \beta_4 \text{ffrac}$$<br />
$$F_1(W; \gamma) = \gamma_1 + \gamma_L \text{landv}$$<br />
Notice that our assumptions, with $\sigma$ normalized to 1, imply that $P(N = 1 \mid Y, Z, W) = \Phi(\bar{\Pi}_1)$.<br />
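The step from the assumptions to this probability, and from there to the log-likelihood contribution of each market, can be spelled out as follows. This is a sketch that assumes $\sigma = 1$ (so that $\epsilon$ is standard normal) and uses the symmetry of the normal distribution; the market subscript $i$ on $\bar{\Pi}_{1i}$ is added here for clarity.<br />

```latex
\begin{aligned}
P(N = 1 \mid Y, Z, W) &= P(\bar{\Pi}_1 + \epsilon > 0) = P(\epsilon > -\bar{\Pi}_1)
                       = 1 - \Phi(-\bar{\Pi}_1) = \Phi(\bar{\Pi}_1), \\
\ln f_i &= y_i \ln \Phi(\bar{\Pi}_{1i}) + (1 - y_i) \ln\left[1 - \Phi(\bar{\Pi}_{1i})\right].
\end{aligned}
```

The two cases $y_i = 1$ and $y_i = 0$ of $\ln f_i$ are what the two “replace” lines of the estimation program compute, with $\bar{\Pi}_{1i}$ written as s*v-f.<br />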
The model can be estimated with STATA using the following code:<br />
insheet using http://www.econ.umn.edu/~evdok003/BR.csv, clear<br />
* keep only markets with 0 or 1 firms<br />
drop if tire>1<br />
program monentry<br />
version 1.0<br />
args lnf s v f<br />
* expected profit is s*v-f; the entry probability is normal(s*v-f)<br />
quietly replace `lnf'=ln(normal(`s'*`v'-`f')) if $ML_y1==1<br />
quietly replace `lnf'=ln(1-normal(`s'*`v'-`f')) if $ML_y1==0<br />
end<br />
ml model lf monentry (lambda:tire=opop ngrw pgrw octy,nocons offset(tpop)) ///<br />
(beta:tire=eld pinc lnhdd ff<br />
ml search lambda -1 1 beta 0 1 gammaL -1 1<br />
ml max<br />
The results are<br />
Table 1: Estimation results (probit model)<br />
Variable Coefficient (Std. Err.)<br />
opop (λ 1 ) 7.404 (21.593)<br />
ngrw (λ 2 ) -12.980 (44.674)<br />
pgrw (λ 3 ) 21.061 (65.999)<br />
octy (λ 4 ) -3.349 (15.151)<br />
α 1 -0.163 (1.173)<br />
eld (β 1 ) 0.431 (0.977)<br />
p<strong>in</strong>c (β 2 ) 0.031 (0.083)<br />
lnhdd (β 3 ) -0.007 (0.139)<br />
ffrac (β 4 ) 0.182 (0.397)<br />
landv (γ L ) -0.725 (0.727)<br />
γ 1 0.802 ∗ (0.366)<br />
We will now look at the whole dataset and model the number of entrants using ordered probit. Following<br />
standard economic theory, we assume that profits are highest in a monopoly, and that entry of additional firms<br />
drives them down. Thus $\bar{\Pi}_1 > \bar{\Pi}_2 > \bar{\Pi}_3 > \bar{\Pi}_4 > \bar{\Pi}_5$. Following the standard ordered probit argument,<br />
$$P(N = 0 \mid Y, Z, W) = 1 - \Phi(\bar{\Pi}_1)$$<br />
$$P(N = J \mid Y, Z, W) = \Phi(\bar{\Pi}_J) - \Phi(\bar{\Pi}_{J+1}) \quad \forall J = 1, 2, 3, 4$$<br />
$$P(N \geq 5 \mid Y, Z, W) = \Phi(\bar{\Pi}_5)$$<br />
where $\bar{\Pi}_N = S(Y; \lambda) \cdot V_N(Z; \alpha, \beta) - F_N(W; \gamma)$.<br />
The specific forms of $S$, $V_N$ and $F_N$ are similar to those of $S$, $V_1$, and $F_1$ that we worked with above. Bresnahan<br />
and Reiss assume, however, that $\alpha_4 = 0$, so we will impose the same restriction.<br />
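For reference, the forms of $V_N$ and $F_N$ can be read off the way the estimation program in these notes builds the arguments of the normal c.d.f. The block below is a reconstruction from that code, not a quotation from the paper: each additional entrant $n$ lowers per-capita demand by $\alpha_n$ and raises costs by $\gamma_n$.<br />

```latex
\begin{aligned}
V_N &= V_1 - \sum_{n=2}^{N} \alpha_n, \qquad
F_N = F_1 + \sum_{n=2}^{N} \gamma_n, \qquad N = 2, \dots, 5, \\
\bar{\Pi}_N &= S \cdot \left( V_1 - \sum_{n=2}^{N} \alpha_n \right) - \left( F_1 + \sum_{n=2}^{N} \gamma_n \right), \\
\sum_{J=0}^{4} P(N = J) + P(N \geq 5)
  &= \left[1 - \Phi(\bar{\Pi}_1)\right] + \sum_{J=1}^{4} \left[\Phi(\bar{\Pi}_J) - \Phi(\bar{\Pi}_{J+1})\right] + \Phi(\bar{\Pi}_5) = 1 .
\end{aligned}
```

The last line is a telescoping sum, which confirms that the six probabilities add up to one. With $S > 0$, the ordering $\bar{\Pi}_1 > \bar{\Pi}_2 > \dots > \bar{\Pi}_5$ holds whenever the $\alpha_n$ and $\gamma_n$ are nonnegative and at least one of each pair is strictly positive.<br />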
Again, the method of Maximum Likelihood can be used to estimate the parameters of the model. The STATA code<br />
is provided below:<br />
insheet using http://www.econ.umn.edu/~evdok003/BR.csv, clear<br />
program firmentry<br />
version 1.0<br />
args lnf s v f alpha2 alpha3 alpha4 alpha5 gamma2 gamma3 gamma4 gamma5<br />
* p2-p5 hold normal(expected profit) with 2, 3, 4 and 5 firms in the market<br />
tempvar p2 p3 p4 p5<br />
qui gen double `p2'=normal(`s'*(`v'-`alpha2')-`f'-`gamma2')<br />
qui gen double `p3'=normal(`s'*(`v'-`alpha2'-`alpha3')-`f'-`gamma2'-`gamma3')<br />
qui gen double `p4'=normal(`s'*(`v'-`alpha2'-`alpha3'-`alpha4')-`f'-`gamma2'-`gamma3'-`gamma4')<br />
qui gen double `p5'=normal(`s'*(`v'-`alpha2'-`alpha3'-`alpha4'-`alpha5')-`f'-`gamma2'-`gamma3' ///<br />
-`gamma4'-`gamma5')<br />
* log-likelihood contribution for each observed number of firms<br />
quietly replace `lnf'=ln(1-normal(`s'*(`v')-`f')) if $ML_y1==0<br />
quietly replace `lnf'=ln(normal(`s'*(`v')-`f')-`p2') if $ML_y1==1<br />
quietly replace `lnf'=ln(`p2'-`p3') if $ML_y1==2<br />
quietly replace `lnf'=ln(`p3'-`p4') if $ML_y1==3<br />
quietly replace `lnf'=ln(`p4'-`p5') if $ML_y1==4<br />
quietly replace `lnf'=ln(`p5') if $ML_y1>=5<br />
end<br />
* impose alpha4 = 0, as in Bresnahan and Reiss<br />
constraint 1 [alpha4]_cons=0<br />
ml model lf firmentry (lambda:tire=opop ngrw pgrw octy,nocons offset(tpop)) ///<br />
(beta:tire=eld pinc lnhdd ffrac) (gammaL:tire=landv) (alpha2:tire=) ///<br />
(alpha3:tire=) (alpha4:tire=) (alpha5:tire=) (gamma2:tire=) (gamma3:tire=) ///<br />
(gamma4:tire=) (gamma5:tire=), constraint(1)<br />
ml search lambda 0 50 beta 0 1 alpha2 0 1 alpha3 0 1 alpha4 0 1 alpha5 0 1 ///<br />
gamma2 0 1 gamma3 0 1 gamma4 0 1 gamma5 0 1 gammaL -1 1<br />
ml max<br />
Table 2: Estimation results (ordered probit)<br />
Variable Coefficient (Std. Err.)<br />
opop (λ 1 ) -0.532 (0.404)<br />
ngrw (λ 2 ) 2.253 ∗ (0.976)<br />
pgrw (λ 3 ) 0.343 (0.612)<br />
octy (λ 4 ) 0.227 (0.408)<br />
eld (β 1 ) -0.488 (0.626)<br />
p<strong>in</strong>c (β 2 ) -0.031 (0.029)<br />
lnhdd (β 3 ) 0.004 (0.056)<br />
ffrac (β 4 ) -0.021 (0.077)<br />
α 1 0.863 † (0.464)<br />
α 2 0.035 (0.116)<br />
α 3 0.150 (0.093)<br />
α 4 0.000 (0.000)<br />
α 5 0.081 (0.050)<br />
landv (γ L ) -0.737 † (0.403)<br />
γ 1 0.529 ∗ (0.220)<br />
γ 2 0.756 ∗∗ (0.186)<br />
γ 3 0.465 ∗ (0.196)<br />
γ 4 0.598 ∗∗ (0.113)<br />
γ 5 0.120 (0.174)<br />
Notice that this replicates the last column of Table 4 in the paper (p. 994).<br />