01.08.2014 Views

Classes of Discrete Variable


Multiple Discrete Choice Models

Classes of Discrete Variable

Contents:

Ordered Probit

Ordered Logit

Methods of Estimation

Sequential Discrete Choice models

The Bivariate Probit model

The Multinomial Logit model

Econometrics 2 (SS 2008) 1 / 25



The Ordered Probit/Logit Model

Sometimes a simple binary choice model is inappropriate, e.g.:

a model of labour market status

degree of satisfaction

number of cars owned

Each of these examples involves more than two possible outcomes.

One possible model specification is the Ordered Probit or Logit model:

appropriate when the discrete outcomes have a natural (ordinal) ranking

major advantage: the resulting model is relatively easy to estimate

downside: the behavioural model may be considered too restrictive


Consider an independent sample of data $\{y_i, x_i\}$ of size $n$.

Let $y_i$ have $M$ possible outcomes, $y_i = m$ for $m = 1, \dots, M$, with a natural ordering (e.g. $m + 1$ is in some sense better than $m$).

Consider a latent variable $y_i^*$, where

$$y_i^* = x_i'\beta + u_i \quad \text{for } i = 1, \dots, n.$$

Define the following observability criterion:

$$y_i = m \quad \text{if } \alpha_{m-1} \le y_i^* \le \alpha_m \quad \text{for } m = 1, \dots, M,$$

$$\alpha_0 < \alpha_1 < \alpha_2 < \dots < \alpha_M, \qquad \alpha_0 = -\infty \text{ and } \alpha_M = \infty.$$


The conditional probability of observing $y_i = m$ is

$$\begin{aligned} P(y_i = m \mid x_i) &= P(\alpha_{m-1} \le y_i^* \le \alpha_m) \\ &= P(\alpha_{m-1} \le x_i'\beta + u_i \le \alpha_m). \end{aligned}$$

Rearranging gives, for $m = 1, \dots, M$,

$$\begin{aligned} P(y_i = m \mid x_i) &= P(\alpha_{m-1} - x_i'\beta \le u_i \le \alpha_m - x_i'\beta) \\ &= P(u_i \le \alpha_m - x_i'\beta) - P(u_i \le \alpha_{m-1} - x_i'\beta). \end{aligned}$$

We need a distribution for $u_i$:

$u_i$ standard normal gives the Ordered Probit

$u_i$ logistic gives the Ordered Logit


Ordered Probit: graphical representation

E.g. let $u_i \sim N(0, 1)$. Then

$$P(y_i = m \mid x_i) = \Phi(\alpha_m - x_i'\beta) - \Phi(\alpha_{m-1} - x_i'\beta).$$
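A minimal Python sketch of these Ordered Probit probabilities may help fix ideas. The function names, the index value and the cutpoints are illustrative assumptions, not taken from the slides; only the standard library is used.

```python
import math

def norm_cdf(x):
    """Standard normal CDF, with the infinite cutpoints handled explicitly."""
    if x == math.inf:
        return 1.0
    if x == -math.inf:
        return 0.0
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def ordered_probit_probs(xb, cutpoints):
    """Return [P(y=1), ..., P(y=M)] for a scalar index xb = x'beta.

    cutpoints holds the interior thresholds alpha_1 < ... < alpha_{M-1};
    alpha_0 = -inf and alpha_M = +inf are added here.
    """
    alphas = [-math.inf] + list(cutpoints) + [math.inf]
    return [
        norm_cdf(alphas[m] - xb) - norm_cdf(alphas[m - 1] - xb)
        for m in range(1, len(alphas))
    ]

# Hypothetical values: index 0.3 and three interior cutpoints (M = 4 outcomes).
probs = ordered_probit_probs(xb=0.3, cutpoints=[-1.0, 0.0, 1.5])
```

Because the cutpoints partition the real line, the M probabilities sum to one for any value of the index.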



Estimation

Estimate this non-linear model by maximum likelihood:

let $z_{im} = \mathbf{1}(y_i = m)$ for $m = 1, \dots, M$;

then the $i$th likelihood contribution is

$$\begin{aligned} L_i &= \prod_{m=1}^{M} P(y_i = m \mid x_i)^{z_{im}} \\ &= \prod_{m=1}^{M} \left[\Phi(\alpha_m - x_i'\beta) - \Phi(\alpha_{m-1} - x_i'\beta)\right]^{z_{im}}. \end{aligned}$$

The full likelihood function becomes

$$L(\alpha, \beta) = \prod_{i=1}^{n} \prod_{m=1}^{M} \left[\Phi(\alpha_m - x_i'\beta) - \Phi(\alpha_{m-1} - x_i'\beta)\right]^{z_{im}}.$$


Taking logs,

$$l = \sum_{i=1}^{n} \sum_{m=1}^{M} z_{im} \ln\left[\Phi(\alpha_m - x_i'\beta) - \Phi(\alpha_{m-1} - x_i'\beta)\right].$$

For ML estimates, solve

$$\frac{\partial l}{\partial \alpha} = 0 \quad \text{and} \quad \frac{\partial l}{\partial \beta} = 0.$$

discuss conditions

discuss consequences
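The Ordered Probit log-likelihood can be sketched in a few lines of Python. Since $z_{im}$ is one only for the observed category, the inner sum reduces to a single term per observation. The data, parameter values and function names below are made up for illustration; a real application would pass this function to a numerical optimiser.

```python
import math

def norm_cdf(x):
    """Standard normal CDF, with the infinite cutpoints handled explicitly."""
    if x == math.inf:
        return 1.0
    if x == -math.inf:
        return 0.0
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def ordered_probit_loglik(y, X, beta, cutpoints):
    """Log-likelihood l(alpha, beta) for outcomes y_i in {1, ..., M}.

    X is a list of covariate rows, beta the coefficient list and
    cutpoints the interior thresholds alpha_1 < ... < alpha_{M-1}.
    """
    alphas = [-math.inf] + list(cutpoints) + [math.inf]
    total = 0.0
    for y_i, x_i in zip(y, X):
        xb = sum(b * x for b, x in zip(beta, x_i))
        # Only the observed category contributes, because z_im = 1(y_i = m).
        p = norm_cdf(alphas[y_i] - xb) - norm_cdf(alphas[y_i - 1] - xb)
        total += math.log(p)
    return total

# Hypothetical sample: four observations, one covariate, M = 3 outcomes.
y = [1, 2, 3, 2]
X = [[0.5], [-0.2], [1.0], [0.0]]
ll = ordered_probit_loglik(y, X, beta=[0.8], cutpoints=[-0.5, 0.7])
```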


The Sequential Probit/Logit model

What if decisions / alternatives are not independent?

Take as an example a sequential decision rule:

It can be used when the dependent variable can be separated into a sequence of binary choices.

For the simplest sequential model, we also assume that the errors $u_i$ are independent.

Some examples:


Sequential Probit/Logit model: Example 1<br />

labour force status<br />


Sequential Probit/Logit model: Example 2<br />

transport mode<br />


Sequential Probit/Logit model<br />

Consider a sample of data $\{y_{0i}, y_{1i}, x_i, z_i\}$.

Let $y_{0i}$ represent a binary indicator variable for some discrete choice.

Let $y_{1i}$ represent a second discrete choice, observed only when $y_{0i} = 1$.

Let the $k_0$ explanatory variables $x_i$ influence the first choice.

Let the $k_1$ explanatory variables $z_i$ influence the conditional choice.

For the first stage, assume, with $u_{0i} \sim N(0, 1)$ iid,

$$y_{0i}^* = x_i'\beta_0 + u_{0i}.$$


Observe $y_{0i} = \mathbf{1}(y_{0i}^* > 0)$.

Hence $P(y_{0i} = 1 \mid x_i) = \Phi(x_i'\beta_0)$.

Estimation is by standard Probit MLE on the full sample.

For the second stage, note first that

$$P(y_{0i} = 1, y_{1i} = 1) = P(y_{0i} = 1) \cdot P(y_{1i} = 1 \mid y_{0i} = 1).$$

Hence, select a sample of the $n_1$ observations for which $y_{0i} = 1$.

Define, for $u_{1i} \sim N(0, 1)$ iid,

$$y_{1i}^* = z_i'\beta_1 + u_{1i}.$$


For the second stage, $y_{1i} = \mathbf{1}(y_{1i}^* > 0)$. So,

$$P(y_{1i} = 1 \mid z_i) = \Phi(z_i'\beta_1).$$

Estimation is by standard Probit MLE on the selected sample.

The overall probabilities of the three possible outcomes are

$$P(y_{0i} = 0 \mid x_i) = 1 - \Phi(x_i'\beta_0),$$

$$P(y_{0i} = 1, y_{1i} = 0 \mid x_i, z_i) = \Phi(x_i'\beta_0) \cdot [1 - \Phi(z_i'\beta_1)],$$

$$P(y_{0i} = 1, y_{1i} = 1 \mid x_i, z_i) = \Phi(x_i'\beta_0) \cdot \Phi(z_i'\beta_1).$$

Upside: easy to estimate.

Downside: ignores a potential correlation between $u_{0i}$ and $u_{1i}$.
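The three outcome probabilities of the sequential Probit can be sketched as follows. The index values and names are hypothetical; independence of the two errors is what allows the simple product form.

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def sequential_probit_probs(xb0, zb1):
    """Outcome probabilities given the stage indices x'beta_0 and z'beta_1.

    Assumes u_0 and u_1 are independent standard normals, as on the slide.
    """
    p_first = norm_cdf(xb0)    # P(y0 = 1 | x)
    p_second = norm_cdf(zb1)   # P(y1 = 1 | z), second stage
    return {
        "y0=0": 1.0 - p_first,
        "y0=1,y1=0": p_first * (1.0 - p_second),
        "y0=1,y1=1": p_first * p_second,
    }

# Hypothetical index values for one individual.
probs = sequential_probit_probs(xb0=0.4, zb1=-0.2)
```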


The Bivariate Probit Model

Binary decisions may form part of a system of choices rather than a sequence, e.g. simultaneous decisions of work and take-up of paid childcare.

The Bivariate Probit can be applied in these circumstances:

Consider $\{y_{0i}, y_{1i}, x_{0i}, x_{1i}\}$ for $i = 1, \dots, N$.

Here, $y_{0i}$ and $y_{1i}$ represent two binary indicator variables.

Assume an underlying system of propensities:

$$y_{0i}^* = x_{0i}'\beta_0 + u_{0i},$$

$$y_{1i}^* = x_{1i}'\beta_1 + u_{1i}.$$


The observability criteria:

$$y_{0i} = \mathbf{1}(y_{0i}^* > 0),$$

$$y_{1i} = \mathbf{1}(y_{1i}^* > 0).$$

For a Bivariate Probit model, $u_{0i}$ and $u_{1i}$ are bivariate normal with zero means, unit variances and correlation $\rho$:

$$\phi_2(u_0, u_1; \rho) = \frac{1}{2\pi(1 - \rho^2)^{1/2}} \exp\left(-\frac{u_0^2 + u_1^2 - 2\rho u_0 u_1}{2(1 - \rho^2)}\right),$$

$$\Phi_2(u_0, u_1; \rho) = \int_{-\infty}^{u_1} \int_{-\infty}^{u_0} \phi_2(u, v; \rho)\, du\, dv.$$

Note that when $\rho = 0$, $\Phi_2(u_0, u_1; 0) = \Phi(u_0) \cdot \Phi(u_1)$.
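As a quick check of the factorisation at $\rho = 0$, here is a worked step assuming the standard bivariate normal density with zero means and unit variances:

```latex
\phi_2(u_0, u_1; 0)
  = \frac{1}{2\pi}\exp\left(-\frac{u_0^2 + u_1^2}{2}\right)
  = \phi(u_0)\,\phi(u_1)
\quad\Longrightarrow\quad
\Phi_2(u_0, u_1; 0)
  = \int_{-\infty}^{u_1}\phi(v)\,dv \int_{-\infty}^{u_0}\phi(u)\,du
  = \Phi(u_1)\,\Phi(u_0).
```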


Estimating a Bivariate Probit<br />

Derive probabilities $P_{jk}$ for $j, k = 0, 1$.

For example,

$$\begin{aligned} P_{00i} &= P(y_{0i} = 0, y_{1i} = 0 \mid x_{0i}, x_{1i}) \\ &= P(y_{0i}^* \le 0, y_{1i}^* \le 0 \mid x_{0i}, x_{1i}) \\ &= P(u_{0i} \le -x_{0i}'\beta_0, u_{1i} \le -x_{1i}'\beta_1) \\ &= \Phi_2(-x_{0i}'\beta_0, -x_{1i}'\beta_1; \rho). \end{aligned}$$

Similarly,

$$P_{11i} = P(y_{0i} = 1, y_{1i} = 1 \mid x_{0i}, x_{1i}) = \Phi_2(x_{0i}'\beta_0, x_{1i}'\beta_1; \rho),$$

$$P_{01i} = P(y_{0i} = 0, y_{1i} = 1 \mid x_{0i}, x_{1i}) = \Phi(x_{1i}'\beta_1) - P_{11i},$$

$$P_{10i} = P(y_{0i} = 1, y_{1i} = 0 \mid x_{0i}, x_{1i}) = \Phi(x_{0i}'\beta_0) - P_{11i}.$$
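A rough numerical sketch of the four Bivariate Probit probabilities follows. It is not the method of the slides: $\Phi_2$ is approximated here by Simpson's rule on the one-dimensional representation $\Phi_2(a, b; \rho) = \int_{-\infty}^{a} \phi(u)\, \Phi\big((b - \rho u)/\sqrt{1 - \rho^2}\big)\, du$, with the lower limit truncated at $-8$. All names and input values are assumptions made for illustration.

```python
import math

def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bvn_cdf(a, b, rho, steps=2000):
    """Approximate Phi_2(a, b; rho) for |rho| < 1 by Simpson's rule.

    The integrand is phi(u) * Phi((b - rho*u) / sqrt(1 - rho^2)) on [-8, a];
    steps must be even.
    """
    lower = -8.0  # phi(u) is negligible below this point
    if a <= lower:
        return 0.0
    scale = math.sqrt(1.0 - rho * rho)

    def integrand(u):
        return norm_pdf(u) * norm_cdf((b - rho * u) / scale)

    h = (a - lower) / steps
    total = integrand(lower) + integrand(a)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(lower + k * h)
    return total * h / 3.0

def bivariate_probit_probs(xb0, xb1, rho):
    """Return P00, P01, P10, P11 for indices x0'beta_0 and x1'beta_1."""
    p11 = bvn_cdf(xb0, xb1, rho)
    return {
        "P00": bvn_cdf(-xb0, -xb1, rho),
        "P01": norm_cdf(xb1) - p11,
        "P10": norm_cdf(xb0) - p11,
        "P11": p11,
    }

# Hypothetical index values and correlation.
probs = bivariate_probit_probs(xb0=0.3, xb1=-0.5, rho=0.4)
```

Two sanity checks are available in closed form: at $\rho = 0$ the result should equal $\Phi(a)\Phi(b)$, and $\Phi_2(0, 0; \rho) = 1/4 + \arcsin(\rho)/(2\pi)$.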


Figure: contours of the bivariate normal distribution


Figure: Bivariate Probit probabilities


Estimation then follows by ML:

$$L(\beta_0, \beta_1, \rho) = \prod_{i=1}^{N} P_{00i}^{(1 - y_{0i})(1 - y_{1i})} \cdot P_{01i}^{(1 - y_{0i}) y_{1i}} \cdot P_{10i}^{y_{0i}(1 - y_{1i})} \cdot P_{11i}^{y_{0i} y_{1i}}.$$

Taking logs,

$$\begin{aligned} \ln L(\beta_0, \beta_1, \rho) = \sum_{i=1}^{N} \{ & (1 - y_{0i})(1 - y_{1i}) \ln P_{00i} \\ & + (1 - y_{0i})\, y_{1i} \ln P_{01i} \\ & + y_{0i} (1 - y_{1i}) \ln P_{10i} \\ & + y_{0i}\, y_{1i} \ln P_{11i} \}. \end{aligned}$$


The Multinomial Logit Model

Simplest model for unordered discrete choices where covariates do not vary with $m$.

Example: public transport choice.

Consider $M$ discrete alternatives,

$$P_{mi} = P(y_i = m) \quad \text{for } m = 1, \dots, M.$$

Thinking again of latent variables, here utilities,

$$U_{im}^* = x_i'\beta_m + u_{im},$$

we get

$$P_{mi} = P(U_{im}^* > U_{ij}^*, \ \forall j \ne m).$$


Let us derive the probability model generally.

For the Multinomial Model,

$$\frac{P_m}{P_m + P_M} = F(x'\beta_m), \quad m = 1, \dots, M - 1,$$

for a benchmark probability $P_M$.

This implies that

$$\frac{P_m}{P_M} = \frac{F(x'\beta_m)}{1 - F(x'\beta_m)} = \lambda(x'\beta_m),$$

which reminds us of the logit distribution.

We will see that $F(\cdot)$ is a cdf.


Since $P_m \in (0, 1)$, we therefore have that

$$\frac{P_m}{P_m + P_M} \to 0 \quad \text{as } P_m \to 0,$$

$$\frac{P_m}{P_m + P_M} \to 1 \quad \text{as } P_m \to 1.$$

So, $F(\cdot)$ is a monotone increasing function,

$$F(u) \to 0 \text{ as } u \to -\infty, \qquad F(u) \to 1 \text{ as } u \to \infty.$$

Since $\sum_{m=1}^{M} P_m = 1$, we have that

$$\sum_{j=1}^{M-1} \frac{P_j}{P_M} = \frac{1 - P_M}{P_M} = \frac{1}{P_M} - 1.$$


Hence,

$$P_M = \left[1 + \sum_{j=1}^{M-1} \frac{P_j}{P_M}\right]^{-1} = \left[1 + \sum_{j=1}^{M-1} \lambda(x'\beta_j)\right]^{-1}$$

and, for all $m = 1, \dots, M - 1$,

$$P_m = \frac{\lambda(x'\beta_m)}{1 + \sum_{j=1}^{M-1} \lambda(x'\beta_j)}.$$

General derivation (one possibility);

for the MLM we set $\lambda(u) = \exp(u)$.

Alternatives are possible but rarely used.
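The Multinomial Logit probabilities with $\lambda(u) = \exp(u)$ can be sketched as below. The covariates and coefficient vectors are invented for illustration, and alternative $M$ is treated as the benchmark with odds equal to one.

```python
import math

def multinomial_logit_probs(x, betas):
    """Return [P_1, ..., P_{M-1}, P_M] for covariates x.

    betas holds one coefficient list per non-benchmark alternative
    m = 1, ..., M-1; the benchmark M contributes the 1 in the denominator.
    """
    odds = [
        math.exp(sum(b * xi for b, xi in zip(beta_m, x)))  # lambda(x'beta_m)
        for beta_m in betas
    ]
    denom = 1.0 + sum(odds)
    return [o / denom for o in odds] + [1.0 / denom]

# Hypothetical individual with two covariates and M = 3 alternatives.
probs = multinomial_logit_probs(x=[1.0, 0.5], betas=[[0.2, -0.4], [-0.1, 0.3]])
```

With all coefficients equal to zero every alternative receives probability $1/M$, which is a convenient check on an implementation.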

The MLM suffers from certain restrictions, perhaps the most crucial being:


The Independence of Irrelevant Alternatives

Recall the formulae for the probabilities,

$$P_m = \frac{\exp(x'\beta_m)}{1 + \sum_{j=1}^{M-1} \exp(x'\beta_j)}$$

for all $m = 1, \dots, M - 1$.

However, looking at

$$\frac{P_j}{P_k} = \frac{\exp(x'\beta_j)}{\exp(x'\beta_k)},$$

we notice that this ratio is independent of the probability of any other outcome.

This is called the assumption of independence of irrelevant alternatives. Now, compare this with sequential decisions.
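A small numerical illustration of this property, under assumed utility values: adding a third alternative changes the individual probabilities but leaves the ratio of the first two untouched. The benchmark is written here with systematic utility zero, which is equivalent to the "1 +" form because $\exp(0) = 1$.

```python
import math

def mnl_probs(utilities):
    """Multinomial Logit probabilities from systematic utilities x'beta_m."""
    weights = [math.exp(v) for v in utilities]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical utilities: one alternative (0.7) and the benchmark (0.0).
two = mnl_probs([0.7, 0.0])
# The same pair after a third alternative with utility 1.2 is added.
three = mnl_probs([0.7, 0.0, 1.2])

ratio_two = two[0] / two[1]
ratio_three = three[0] / three[1]
```

Both ratios equal $\exp(0.7)$, whatever utility the added alternative has.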


The Conditional Logit Model

still unordered discrete choices

now covariates may vary over $m$

Example: distance to store.

Consider $M$ discrete alternatives,

$$P_{mi} = P(y_i = m) \quad \text{for } m = 1, \dots, M.$$

Thinking again of latent variables, here utilities,

$$U_{im}^* = x_{im}'\beta + u_{im},$$

$\beta$ fixed for identification. Again we have

$$P_{mi} = P(U_{im}^* > U_{ij}^*, \ \forall j \ne m).$$

There is no benchmark; a similar derivation leads to

$$P_{mi} = \frac{\exp(x_{im}'\beta)}{\sum_{j=1}^{M} \exp(x_{ij}'\beta)}.$$
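A sketch of the Conditional Logit probabilities for one individual, with alternative-specific covariates and a single coefficient vector. The covariate values (a distance and a price index per store) and the coefficients are hypothetical.

```python
import math

def conditional_logit_probs(X_alt, beta):
    """Return [P_1, ..., P_M] given one covariate row x_im per alternative."""
    scores = [sum(b * x for b, x in zip(beta, x_m)) for x_m in X_alt]
    # Subtracting the maximum score avoids overflow; it cancels in the ratio.
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical data: three stores, covariates = (distance in km, price index).
X_alt = [[1.0, 2.0], [3.0, 1.5], [0.5, 2.5]]
probs = conditional_logit_probs(X_alt, beta=[-0.8, -0.5])
```

Here the third store has the highest systematic utility and therefore the largest choice probability.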
