Classes of Discrete Variable
Multiple Discrete Choice Models
Contents:
Ordered Probit
Ordered Logit
Methods of Estimation
Sequential Discrete Choice models
The Bivariate Probit model
The Multinomial Logit model
Econometrics 2 (SS 2008) 1 / 25
The Ordered Probit/Logit Model
Sometimes a simple binary choice model is inappropriate, e.g. for
a model of labour market status
degree of satisfaction
number of cars owned
Each of these examples involves more than two possible outcomes.
One possible model specification is the Ordered Probit or Logit model:
appropriate when the discrete outcomes have a natural (ordinal) ranking
major advantage: the resulting model is relatively easy to estimate
downside: the behavioural model may be considered too restrictive
Consider an independent sample of data $\{y_i, x_i\}$ of size $n$.
Let $y_i$ have $M$ possible outcomes, $y_i = m$ for $m = 1, \dots, M$, with a natural ordering (e.g. $m + 1$ is in some sense better than $m$).
Consider a latent variable $y_i^*$, where
$$y_i^* = x_i'\beta + u_i \quad \text{for } i = 1, \dots, n.$$
Define the following observability criterion:
$$y_i = m \quad \text{if } \alpha_{m-1} \le y_i^* \le \alpha_m \quad \text{for } m = 1, \dots, M,$$
$$\alpha_0 < \alpha_1 < \alpha_2 < \dots < \alpha_M, \qquad \alpha_0 = -\infty \text{ and } \alpha_M = \infty.$$
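The observability criterion can be sketched in a few lines of Python. This is only an illustration: the function name `observed_category` and the threshold values are assumptions, and only the finite thresholds $\alpha_1, \dots, \alpha_{M-1}$ are stored, with $\alpha_0 = -\infty$ and $\alpha_M = \infty$ left implicit.

```python
from bisect import bisect_left

def observed_category(y_star, cutpoints):
    """Map a latent value y* to its ordered category m in 1..M.

    `cutpoints` holds the finite thresholds alpha_1 < ... < alpha_{M-1};
    alpha_0 = -inf and alpha_M = +inf are implicit.
    """
    # bisect_left counts the thresholds strictly below y*, so a value equal
    # to alpha_m falls in category m (such ties have probability zero).
    return bisect_left(cutpoints, y_star) + 1

cutpoints = [-1.0, 0.5]  # illustrative alpha_1, alpha_2, so M = 3
categories = [observed_category(v, cutpoints) for v in (-2.0, 0.0, 2.0)]
```

With these illustrative thresholds, the latent values -2, 0 and 2 fall into categories 1, 2 and 3 respectively.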
The conditional probability of observing $y_i = m$ is
$$P(y_i = m \mid x_i) = P(\alpha_{m-1} \le y_i^* \le \alpha_m) = P(\alpha_{m-1} \le x_i'\beta + u_i \le \alpha_m).$$
Rearranging gives, for $m = 1, \dots, M$,
$$P(y_i = m \mid x_i) = P(\alpha_{m-1} - x_i'\beta \le u_i \le \alpha_m - x_i'\beta) = P(u_i \le \alpha_m - x_i'\beta) - P(u_i \le \alpha_{m-1} - x_i'\beta).$$
We need a distribution for $u_i$:
$u_i$ standard normal gives the Ordered Probit
$u_i$ logistic gives the Ordered Logit
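A minimal sketch of these category probabilities, with the distribution of $u_i$ passed in as a CDF. The helper names (`normal_cdf`, `logistic_cdf`, `ordered_probabilities`) and all numerical values are illustrative assumptions, not part of the lecture.

```python
from math import erf, exp, inf, sqrt

def normal_cdf(v):
    """Standard normal CDF (Ordered Probit)."""
    return 0.5 * (1.0 + erf(v / sqrt(2.0)))

def logistic_cdf(v):
    """Standard logistic CDF (Ordered Logit), written to avoid overflow."""
    if v >= 0:
        return 1.0 / (1.0 + exp(-v))
    e = exp(v)
    return e / (1.0 + e)

def ordered_probabilities(xb, cutpoints, cdf):
    """Return [P(y=1|x), ..., P(y=M|x)] for the linear index xb = x'beta."""
    alphas = [-inf] + list(cutpoints) + [inf]  # alpha_0, ..., alpha_M
    return [cdf(alphas[m] - xb) - cdf(alphas[m - 1] - xb)
            for m in range(1, len(alphas))]

probit_probs = ordered_probabilities(0.3, [-1.0, 0.5], normal_cdf)
logit_probs = ordered_probabilities(0.3, [-1.0, 0.5], logistic_cdf)
```

Because consecutive terms telescope, each list sums to one for any ordered set of thresholds.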
Ordered Probit: graphical representation
E.g. let $u_i \sim N(0, 1)$. Then
$$P(y_i = m \mid x_i) = \Phi(\alpha_m - x_i'\beta) - \Phi(\alpha_{m-1} - x_i'\beta).$$
Estimation
Estimate this non-linear model by maximum likelihood:
let $z_{im} = \mathbb{1}(y_i = m)$ for $m = 1, \dots, M$;
then the $i$th likelihood contribution is
$$L_i = \prod_{m=1}^{M} P(y_i = m \mid x_i)^{z_{im}} = \prod_{m=1}^{M} \left[\Phi(\alpha_m - x_i'\beta) - \Phi(\alpha_{m-1} - x_i'\beta)\right]^{z_{im}}.$$
The full likelihood function becomes
$$L(\alpha, \beta) = \prod_{i=1}^{n} \prod_{m=1}^{M} \left[\Phi(\alpha_m - x_i'\beta) - \Phi(\alpha_{m-1} - x_i'\beta)\right]^{z_{im}}.$$
Taking logs,
$$l = \sum_{i=1}^{n} \sum_{m=1}^{M} z_{im} \ln\left[\Phi(\alpha_m - x_i'\beta) - \Phi(\alpha_{m-1} - x_i'\beta)\right].$$
For ML estimates, solve
$$\frac{\partial l}{\partial \alpha} = 0 \quad \text{and} \quad \frac{\partial l}{\partial \beta} = 0.$$
discuss conditions
discuss consequences
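The log-likelihood can be evaluated directly, as in the sketch below. It is an illustration only: the function names, the toy data and the parameter values are assumptions, and no optimiser is included. Since $z_{im}$ is one only for $m = y_i$, the inner sum reduces to a single term per observation.

```python
from math import erf, inf, log, sqrt

def normal_cdf(v):
    return 0.5 * (1.0 + erf(v / sqrt(2.0)))

def ordered_probit_loglik(beta, cutpoints, xs, ys):
    """l = sum_i ln[Phi(alpha_{y_i} - x_i'beta) - Phi(alpha_{y_i - 1} - x_i'beta)]."""
    alphas = [-inf] + list(cutpoints) + [inf]  # alpha_0, ..., alpha_M
    total = 0.0
    for x, y in zip(xs, ys):
        xb = sum(b * v for b, v in zip(beta, x))
        # z_im selects the single term with m = y_i (categories are 1..M).
        total += log(normal_cdf(alphas[y] - xb) - normal_cdf(alphas[y - 1] - xb))
    return total

# Toy data: one regressor, three ordered outcomes.
xs = [[0.0], [1.0], [2.0]]
ys = [1, 2, 3]
ll_good = ordered_probit_loglik([1.0], [0.5, 1.5], xs, ys)
ll_bad = ordered_probit_loglik([-1.0], [0.5, 1.5], xs, ys)
```

In this toy sample higher values of the regressor go with higher categories, so the positive slope gives the larger log-likelihood. A numerical optimiser would search over $\beta$ and the thresholds subject to $\alpha_1 < \dots < \alpha_{M-1}$.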
The Sequential Probit/Logit model
What if decisions / alternatives are not independent?
Take as an example a sequential decision rule:
It can be used when the dependent variable can be separated into a sequence of binary choices.
For the simplest sequential model, we also assume the $u_i$ are independent.
Some examples:
Sequential Probit/Logit model: Example 1
labour force status
Sequential Probit/Logit model: Example 2
transport mode
Sequential Probit/Logit model
Consider a sample of data $\{y_{0i}, y_{1i}, x_i, z_i\}$.
Let $y_{0i}$ represent a binary indicator variable for some discrete choice.
Let $y_{1i}$ represent a second discrete choice, observed only when $y_{0i} = 1$.
Let the $k_0$ explanatory variables $x_i$ influence the first choice.
Let the $k_1$ explanatory variables $z_i$ influence the conditional choice.
For the first stage, assume, with $u_{0i} \sim N(0, 1)$ iid,
$$y_{0i}^* = x_i'\beta_0 + u_{0i}.$$
Observe $y_{0i} = \mathbb{1}(y_{0i}^* > 0)$.
Hence $P(y_{0i} = 1 \mid x_i) = \Phi(x_i'\beta_0)$.
Estimation is by standard Probit MLE on the full sample.
For the second stage, note first that
$$P(y_{0i} = 1, y_{1i} = 1) = P(y_{0i} = 1) \cdot P(y_{1i} = 1 \mid y_{0i} = 1).$$
Hence, select the subsample of the $n_1$ observations for which $y_{0i} = 1$.
Define, for $u_{1i} \sim N(0, 1)$ iid,
$$y_{1i}^* = z_i'\beta_1 + u_{1i}.$$
For the second stage, $y_{1i} = \mathbb{1}(y_{1i}^* > 0)$. So,
$$P(y_{1i} = 1 \mid z_i) = \Phi(z_i'\beta_1).$$
Estimation is by standard Probit MLE on the selected sample.
The overall probabilities of the three possible outcomes are
$$P(y_{0i} = 0 \mid x_i) = 1 - \Phi(x_i'\beta_0),$$
$$P(y_{0i} = 1, y_{1i} = 0 \mid x_i, z_i) = \Phi(x_i'\beta_0)\,[1 - \Phi(z_i'\beta_1)],$$
$$P(y_{0i} = 1, y_{1i} = 1 \mid x_i, z_i) = \Phi(x_i'\beta_0)\,\Phi(z_i'\beta_1).$$
Upside: easy to estimate.
Downside: ignores a potential correlation between $u_{0i}$ and $u_{1i}$.
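The three outcome probabilities, and the way the log-likelihood splits into a full-sample part and a selected-sample part, can be sketched as follows. The function names, the tuple layout of `sample` and the index values are illustrative assumptions; the linear indices $x_i'\beta_0$ and $z_i'\beta_1$ are passed in already computed.

```python
from math import erf, log, sqrt

def normal_cdf(v):
    return 0.5 * (1.0 + erf(v / sqrt(2.0)))

def sequential_probit_probabilities(xb0, zb1):
    """Probabilities of the three outcomes, keyed by (y0, y1)."""
    p_first = normal_cdf(xb0)    # P(y0 = 1 | x)
    p_second = normal_cdf(zb1)   # P(y1 = 1 | z), defined on the y0 = 1 subsample
    return {
        (0, None): 1.0 - p_first,
        (1, 0): p_first * (1.0 - p_second),
        (1, 1): p_first * p_second,
    }

def sequential_loglik(sample):
    """sample: list of (y0, y1, xb0, zb1); y1 is ignored when y0 == 0."""
    stage0 = stage1 = 0.0
    for y0, y1, xb0, zb1 in sample:
        p0 = normal_cdf(xb0)
        stage0 += log(p0) if y0 == 1 else log(1.0 - p0)      # full sample
        if y0 == 1:                                          # selected sample only
            p1 = normal_cdf(zb1)
            stage1 += log(p1) if y1 == 1 else log(1.0 - p1)
    return stage0, stage1

probs = sequential_probit_probabilities(0.2, -0.4)
sample = [(0, None, 0.2, -0.4), (1, 0, 0.2, -0.4), (1, 1, 0.2, -0.4)]
stage0, stage1 = sequential_loglik(sample)
joint_loglik = sum(log(p) for p in probs.values())
```

The first part depends only on $\beta_0$ and the second only on $\beta_1$, which is why the two Probit models can be estimated separately when $u_{0i}$ and $u_{1i}$ are independent.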
The Bivariate Probit Model
Binary decisions may form part of a system of choices rather than a sequence, e.g. simultaneous decisions on work and take-up of paid childcare.
The Bivariate Probit can be applied in these circumstances:
Consider $\{y_{0i}, y_{1i}, x_{0i}, x_{1i}\}$ for $i = 1, \dots, N$.
Here, $y_{0i}$ and $y_{1i}$ represent two binary indicator variables.
Assume an underlying system of propensities:
$$y_{0i}^* = x_{0i}'\beta_0 + u_{0i},$$
$$y_{1i}^* = x_{1i}'\beta_1 + u_{1i}.$$
The observability criteria:
$$y_{0i} = \mathbb{1}(y_{0i}^* > 0), \qquad y_{1i} = \mathbb{1}(y_{1i}^* > 0).$$
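To make the system of propensities concrete, the sketch below simulates pairs $(y_{0i}, y_{1i})$ from two latent equations. It assumes the errors are standard normal with correlation `rho`, as in the Bivariate Probit; the function name, the index values and the seed are illustrative assumptions.

```python
import random
from math import sqrt

def simulate_bivariate_probit(xb0, xb1, rho, rng):
    """Draw one (y0, y1) pair given linear indices xb0, xb1 and correlation rho."""
    e0, e1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    u0 = e0
    u1 = rho * e0 + sqrt(1.0 - rho ** 2) * e1  # var(u1) = 1, corr(u0, u1) = rho
    # Observability criteria: each indicator is 1 when its propensity is positive.
    return int(xb0 + u0 > 0), int(xb1 + u1 > 0)

rng = random.Random(0)  # fixed seed so the draws are reproducible
draws = [simulate_bivariate_probit(0.3, -0.2, 0.5, rng) for _ in range(2000)]
share_y0 = sum(y0 for y0, _ in draws) / len(draws)  # should be close to Phi(0.3)
```

Each marginal is an ordinary Probit, so the simulated share of $y_{0i} = 1$ should lie near $\Phi(0.3) \approx 0.62$; the correlation only affects the joint frequencies.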
For a Bivariate Probit model, $u_{0i}$ and $u_{1i}$ are standard bivariate normal with correlation $\rho$:
$$\phi_2(u_0, u_1; \rho) = \frac{1}{2\pi (1 - \rho^2)^{1/2}} \exp\left(-\frac{u_0^2 + u_1^2 - 2\rho u_0 u_1}{2(1 - \rho^2)}\right),$$
$$\Phi_2(u_0, u_1; \rho) = \int_{-\infty}^{u_1} \int_{-\infty}^{u_0} \phi_2(u, v; \rho)\, du\, dv.$$
Note that when $\rho = 0$, $\Phi_2(u_0, u_1; 0) = \Phi(u_0)\,\Phi(u_1)$.
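$\Phi_2$ has no closed form, so it is evaluated numerically. The sketch below is one simple way to do so, not a method prescribed by the lecture: it uses the fact that $u_1$ given $u_0 = u$ is $N(\rho u, 1 - \rho^2)$, so the double integral reduces to a single integral, approximated here by Simpson's rule. The function names, the truncation point `lower` and the number of steps are assumptions.

```python
from math import erf, exp, pi, sqrt

def normal_pdf(v):
    return exp(-0.5 * v * v) / sqrt(2.0 * pi)

def normal_cdf(v):
    return 0.5 * (1.0 + erf(v / sqrt(2.0)))

def bivariate_normal_cdf(a, b, rho, steps=2000, lower=-8.0):
    """Approximate Phi_2(a, b; rho) for |rho| < 1.

    Uses Phi_2(a, b; rho) = int_{-inf}^{a} phi(u) Phi((b - rho u) / sqrt(1 - rho^2)) du
    with Simpson's rule on [lower, a]; `steps` must be even.
    """
    if a <= lower:
        return 0.0
    scale = sqrt(1.0 - rho * rho)

    def integrand(u):
        return normal_pdf(u) * normal_cdf((b - rho * u) / scale)

    h = (a - lower) / steps
    total = integrand(lower) + integrand(a)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(lower + k * h)
    return total * h / 3.0

independent = bivariate_normal_cdf(0.4, -0.7, 0.0)
product = normal_cdf(0.4) * normal_cdf(-0.7)
orthant = bivariate_normal_cdf(0.0, 0.0, 0.5)
```

With $\rho = 0$ the result matches $\Phi(u_0)\,\Phi(u_1)$, and at the origin it can be compared with the known value $1/4 + \arcsin(\rho)/(2\pi)$, which is $1/3$ for $\rho = 0.5$.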
Estimating a Bivariate Probit
Derive the probabilities $P_{jk}$ for $j, k = 0, 1$.
For example,
$$\begin{aligned}
P_{00i} &= P(y_{0i} = 0, y_{1i} = 0 \mid x_{0i}, x_{1i}) \\
&= P(y_{0i}^* \le 0, y_{1i}^* \le 0 \mid x_{0i}, x_{1i}) \\
&= P(u_{0i} \le -x_{0i}'\beta_0, \; u_{1i} \le -x_{1i}'\beta_1) \\
&= \Phi_2(-x_{0i}'\beta_0, -x_{1i}'\beta_1; \rho).
\end{aligned}$$
Similarly,
$$\begin{aligned}
P_{11i} &= P(y_{0i} = 1, y_{1i} = 1 \mid x_{0i}, x_{1i}) = \Phi_2(x_{0i}'\beta_0, x_{1i}'\beta_1; \rho), \\
P_{01i} &= P(y_{0i} = 0, y_{1i} = 1 \mid x_{0i}, x_{1i}) = \Phi(x_{1i}'\beta_1) - P_{11i}, \\
P_{10i} &= P(y_{0i} = 1, y_{1i} = 0 \mid x_{0i}, x_{1i}) = \Phi(x_{0i}'\beta_0) - P_{11i}.
\end{aligned}$$
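As a consistency check, the four probabilities should sum to one. Writing $a = x_{0i}'\beta_0$ and $b = x_{1i}'\beta_1$ (shorthand introduced here), and using the identity $\Phi_2(-a, -b; \rho) = 1 - \Phi(a) - \Phi(b) + \Phi_2(a, b; \rho)$, which follows from inclusion-exclusion and the symmetry of the bivariate normal:

```latex
\begin{aligned}
P_{00i} + P_{01i} + P_{10i} + P_{11i}
  &= \bigl[1 - \Phi(a) - \Phi(b) + \Phi_2(a, b; \rho)\bigr]
   + \bigl[\Phi(b) - \Phi_2(a, b; \rho)\bigr] \\
  &\quad + \bigl[\Phi(a) - \Phi_2(a, b; \rho)\bigr]
   + \Phi_2(a, b; \rho) \\
  &= 1.
\end{aligned}
```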
Figure: contours of the bivariate normal distribution
Figure: Bivariate Probit probabilities
Estimation then follows by ML:
$$L(\beta_0, \beta_1, \rho) = \prod_{i=1}^{N} P_{00i}^{(1 - y_{0i})(1 - y_{1i})} \, P_{01i}^{(1 - y_{0i}) y_{1i}} \, P_{10i}^{y_{0i}(1 - y_{1i})} \, P_{11i}^{y_{0i} y_{1i}}.$$
Taking logs,
$$\begin{aligned}
\ln L(\beta_0, \beta_1, \rho) = \sum_{i=1}^{N} \bigl\{ & (1 - y_{0i})(1 - y_{1i}) \ln P_{00i} + (1 - y_{0i})\, y_{1i} \ln P_{01i} \\
& + y_{0i}(1 - y_{1i}) \ln P_{10i} + y_{0i}\, y_{1i} \ln P_{11i} \bigr\}.
\end{aligned}$$
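A sketch of this log-likelihood for given linear indices and a given $\rho$ follows. It is an illustration under stated assumptions: $\Phi_2$ is approximated by a simple Simpson rule rather than a library routine, and the function names, the tuple layout of `sample` and the data values are hypothetical.

```python
from math import erf, exp, log, pi, sqrt

def normal_cdf(v):
    return 0.5 * (1.0 + erf(v / sqrt(2.0)))

def bivariate_normal_cdf(a, b, rho, steps=1000, lower=-8.0):
    """Simpson approximation of Phi_2(a, b; rho), |rho| < 1, `steps` even."""
    if a <= lower:
        return 0.0
    scale = sqrt(1.0 - rho * rho)
    f = lambda u: exp(-0.5 * u * u) / sqrt(2.0 * pi) * normal_cdf((b - rho * u) / scale)
    h = (a - lower) / steps
    total = f(lower) + f(a)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(lower + k * h)
    return total * h / 3.0

def cell_probabilities(xb0, xb1, rho):
    """Return P_00, P_01, P_10, P_11 keyed by (y0, y1)."""
    p11 = bivariate_normal_cdf(xb0, xb1, rho)
    return {
        (0, 0): bivariate_normal_cdf(-xb0, -xb1, rho),
        (0, 1): normal_cdf(xb1) - p11,
        (1, 0): normal_cdf(xb0) - p11,
        (1, 1): p11,
    }

def bivariate_probit_loglik(sample, rho):
    """sample: list of (y0, y1, xb0, xb1) with xb0 = x0'beta0 and xb1 = x1'beta1."""
    # The exponents in L pick out exactly one cell probability per observation.
    return sum(log(cell_probabilities(xb0, xb1, rho)[(y0, y1)])
               for y0, y1, xb0, xb1 in sample)

sample = [(1, 1, 0.3, -0.2), (0, 1, -0.5, 0.4), (0, 0, 0.1, 0.1)]
cells = cell_probabilities(0.3, -0.2, 0.5)
ll_correlated = bivariate_probit_loglik(sample, 0.5)
ll_independent = bivariate_probit_loglik(sample, 0.0)
```

At $\rho = 0$ the value coincides with the sum of two separate Probit log-likelihoods, which gives a quick check of the implementation.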
The Multinomial Logit Model
Simplest model for unordered discrete choices where the covariates do not vary with $m$.
Example: public transport choice.
Consider $M$ discrete alternatives,
$$P_{mi} = P(y_i = m) \quad \text{for } m = 1, \dots, M.$$
Thinking again of latent variables, here utilities,
$$U_{im}^* = x_i'\beta_m + u_{im},$$
we get
$$P_{mi} = P\left(U_{im}^* > U_{ij}^* \;\; \forall j \ne m\right).$$
Let us derive the probability model generally:
For the Multinomial Model,
$$\frac{P_m}{P_m + P_M} = F(x'\beta_m), \quad m = 1, \dots, M - 1,$$
for a benchmark probability $P_M$.
This implies that
$$\frac{P_m}{P_M} = \frac{F(x'\beta_m)}{1 - F(x'\beta_m)} = \lambda(x'\beta_m),$$
which reminds us of the logit distribution.
We will see that $F(\cdot)$ is a cdf.
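The step from the first ratio to the second is short; the lines below spell it out and then evaluate $\lambda$ for the logistic cdf, writing $F$ for $F(x'\beta_m)$. The logistic choice of $F$ in the last line is the special case that yields the logit form.

```latex
\begin{aligned}
\frac{P_m}{P_m + P_M} = F
  &\;\Longrightarrow\; P_m = F\,(P_m + P_M)
   \;\Longrightarrow\; P_m\,(1 - F) = F\,P_M
   \;\Longrightarrow\; \frac{P_m}{P_M} = \frac{F}{1 - F}, \\[4pt]
F(u) = \frac{e^{u}}{1 + e^{u}}
  &\;\Longrightarrow\; 1 - F(u) = \frac{1}{1 + e^{u}}
   \;\Longrightarrow\; \lambda(u) = \frac{F(u)}{1 - F(u)} = e^{u}.
\end{aligned}
```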
Since $P_m \in (0, 1)$, we therefore have that
$$\frac{P_m}{P_m + P_M} \to 0 \text{ as } P_m \to 0, \qquad \frac{P_m}{P_m + P_M} \to 1 \text{ as } P_m \to 1.$$
So, $F(\cdot)$ is a monotone increasing function, with
$$F(u) \to 0 \text{ as } u \to -\infty, \qquad F(u) \to 1 \text{ as } u \to \infty.$$
Since $\sum_{m=1}^{M} P_m = 1$, we have that
$$\sum_{j=1}^{M-1} \frac{P_j}{P_M} = \frac{1 - P_M}{P_M} = \frac{1}{P_M} - 1.$$
Hence,
$$P_M = \left[1 + \sum_{j=1}^{M-1} \frac{P_j}{P_M}\right]^{-1} = \left[1 + \sum_{j=1}^{M-1} \lambda(x'\beta_j)\right]^{-1},$$
and, for all $m = 1, \dots, M - 1$,
$$P_m = \frac{\lambda(x'\beta_m)}{1 + \sum_{j=1}^{M-1} \lambda(x'\beta_j)}.$$
This is a general derivation (one possibility); for the MLM we set $\lambda(u) = \exp(u)$.
Alternatives are possible but rarely used.
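With $\lambda(u) = \exp(u)$ these formulae translate directly into code. The sketch below is illustrative: the function name, the covariate vector and the coefficient values are assumptions, and alternative $M$ is taken as the benchmark, which amounts to $\lambda = 1$ for that alternative.

```python
from math import exp

def multinomial_logit_probabilities(x, betas):
    """Return [P_1, ..., P_M] with alternative M as the benchmark.

    `betas` holds beta_1, ..., beta_{M-1}; the benchmark contributes the 1
    in the denominator.
    """
    lambdas = [exp(sum(b * v for b, v in zip(beta, x))) for beta in betas]
    denom = 1.0 + sum(lambdas)
    return [lam / denom for lam in lambdas] + [1.0 / denom]

x = [1.0, 2.0]                       # illustrative covariates (constant, one regressor)
betas = [[0.5, -0.25], [0.0, 0.1]]   # illustrative beta_1, beta_2, so M = 3
probs = multinomial_logit_probabilities(x, betas)
```

Here $x'\beta_1 = 0$, so alternative 1 has the same probability as the benchmark. For large indices a production implementation would subtract the maximum index before exponentiating to avoid overflow.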
The MLM suffers from certain restrictions, perhaps the most crucial being:
The Independence of Irrelevant Alternatives
Recall the formulae for the probabilities,
$$P_m = \frac{\exp(x'\beta_m)}{1 + \sum_{j=1}^{M-1} \exp(x'\beta_j)}$$
for all $m = 1, \dots, M - 1$.
However, looking at
$$\frac{P_j}{P_k} = \frac{\exp(x'\beta_j)}{\exp(x'\beta_k)},$$
we notice that this ratio is independent of the probability of any other outcome.
This is called the assumption of independence of irrelevant alternatives. Now, compare this with sequential decisions.
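The property is easy to see numerically: removing an alternative changes every probability but leaves the ratio of the remaining ones untouched. In the sketch below the alternative names and index values are hypothetical, and the probabilities are written in the equivalent form $\exp(v_m) / \sum_j \exp(v_j)$, where the benchmark has index $v = 0$.

```python
from math import exp

def logit_probabilities(indices):
    """P_m = exp(v_m) / sum_j exp(v_j) for linear indices v_m keyed by alternative."""
    weights = {name: exp(v) for name, v in indices.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

full = logit_probabilities({"car": 1.0, "bus": 0.4, "train": 0.0})
reduced = logit_probabilities({"car": 1.0, "bus": 0.4})  # train removed

ratio_full = full["car"] / full["bus"]
ratio_reduced = reduced["car"] / reduced["bus"]
```

Both ratios equal $\exp(1.0 - 0.4)$, even though the individual probabilities of car and bus rise once train is dropped.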
The Conditional Logit Model
still unordered discrete choices
now covariates may vary over $m$
Example: distance to store.
Consider $M$ discrete alternatives,
$$P_{mi} = P(y_i = m) \quad \text{for } m = 1, \dots, M.$$
Thinking again of latent variables, here utilities,
$$U_{im}^* = x_{im}'\beta + u_{im},$$
with $\beta$ fixed for identification. Again we have
$$P_{mi} = P\left(U_{im}^* > U_{ij}^* \;\; \forall j \ne m\right).$$
There is no benchmark; a similar derivation leads to
$$P_{mi} = \frac{\exp(x_{im}'\beta)}{\sum_{j=1}^{M} \exp(x_{ij}'\beta)}.$$
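A small sketch of the conditional logit probabilities, using the distance-to-store example. The function name, the three distances and the coefficient value are hypothetical; there is one coefficient vector shared by all alternatives, and the covariates differ across alternatives.

```python
from math import exp

def conditional_logit_probabilities(x_by_alternative, beta):
    """P_mi = exp(x_im'beta) / sum_j exp(x_ij'beta), with a common beta."""
    indices = [sum(b * v for b, v in zip(beta, x_m)) for x_m in x_by_alternative]
    shift = max(indices)  # subtracting the max leaves the ratios unchanged and avoids overflow
    weights = [exp(v - shift) for v in indices]
    total = sum(weights)
    return [w / total for w in weights]

distances = [[1.0], [2.5], [4.0]]  # hypothetical distance to each of three stores
probs = conditional_logit_probabilities(distances, beta=[-0.8])
```

With a negative coefficient on distance the nearest store is the most likely choice, and the ratio of any two probabilities depends only on the difference in their own distances.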