Endogeneity in multinomial response models

The endogeneity problem in multinomial response models refers to correlation between the explanatory variables and an unobservable variable. Consider a choice model as an example of a multinomial response model with an unobservable variable:

$y_i = j \text{ if and only if } y^*_{j,i} = \max\{y^*_{1,i}, y^*_{2,i}, \ldots, y^*_{J,i\}}$

$y^*_{j,i} = x_{j,i}\beta + v_i + u_{j,i}$
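The choice rule can be illustrated with a small simulation. The sketch below assumes Gumbel-distributed response errors (the multinomial logit case) and draws repeated choices for a single individual, then compares the simulated choice frequencies with the closed-form logit probabilities. The values of `beta`, `x` and `v_i` are hypothetical and chosen only for illustration.

```python
import math
import random


def gumbel_draw(rng):
    """Draw from a standard Gumbel distribution by inverse-CDF sampling."""
    u = rng.random()
    while u == 0.0:  # guard against log(0)
        u = rng.random()
    return -math.log(-math.log(u))


def simulate_choice(systematic, rng):
    """Return the alternative j with the largest latent utility.

    systematic[j] plays the role of x_{j,i} * beta + v_i; a fresh
    Gumbel error u_{j,i} is added to each alternative.
    """
    utilities = [s + gumbel_draw(rng) for s in systematic]
    return max(range(len(utilities)), key=utilities.__getitem__)


def logit_probs(systematic):
    """Closed-form multinomial logit choice probabilities."""
    m = max(systematic)
    weights = [math.exp(s - m) for s in systematic]
    total = sum(weights)
    return [w / total for w in weights]


rng = random.Random(0)
beta = 0.8             # hypothetical coefficient
x = [1.0, 0.0, -0.5]   # hypothetical attribute of each of J = 3 alternatives
v_i = 0.3              # hypothetical unobserved individual effect
systematic = [xj * beta + v_i for xj in x]

n_draws = 20000
counts = [0] * len(x)
for _ in range(n_draws):
    counts[simulate_choice(systematic, rng)] += 1

freqs = [c / n_draws for c in counts]
probs = logit_probs(systematic)
print("simulated frequencies:", freqs)
print("logit probabilities:  ", probs)
```

With enough draws, the two printed vectors should agree closely, which is the sense in which Gumbel errors lead to a multinomial logit model.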

In the choice-making context, i indexes the N individuals and j indexes the J choices, for instance different brands of chocolate. v_i represents an unobservable feature of individual i, for instance the taste of individual i, and u_{j,i} denotes the i.i.d. response error, which might come from cognitive limitations, difficulty in analyzing information, and many other sources. y^*_{j,i} is a latent variable representing the utility of individual i when she chooses choice j.

The response error u_{j,i} is assumed to be i.i.d. with zero mean and density f_u(·). When f_u(·) is normal, the model is a multinomial probit model; when it is a Gumbel density, the model is a multinomial logit model.

Usually, v_i is assumed to be uncorrelated with x_{j,i}, which is typically a group of variables describing the features of choice j, for instance the weight of chocolate j; the features of individual i, for instance the age of individual i; and the interaction between choice j and individual i, for instance the quantity of chocolate j consumed by individual i last year. Under this assumption, the density of v_i does not depend on the explanatory variables: v_i | x_{j,i} ∼ f_v(·). Then the log-likelihood function of a typical multinomial response model with an unobservable variable can be written as:

$\sum_{i=1}^N \log\left\{\int P\left[y^*_{y_i,i} > y^*_{j,i}\ \forall j \neq y_i \mid v_i, x_{1,i}, \ldots, x_{J,i}\right] f_v(v_i)\,dv_i\right\}$
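As a rough illustration of how such an integrated log-likelihood might be evaluated, the sketch below uses a logit kernel and a normal density for v_i, and approximates the integral with a midpoint rule. Several things here are assumptions that do not come from the text above: the normal form of f_v, the scalar coefficient `beta`, the made-up data, and the presence of an outside option (index 0) whose systematic utility is normalized to zero. The outside option is included because a term added equally to every alternative drops out of the utility comparison, so v_i is only allowed to shift the inside alternatives.

```python
import math


def choice_prob(y, x, beta, v):
    """P(y | v, x) under Gumbel errors (logit kernel).

    Index 0 is the assumed outside option with systematic utility 0;
    inside alternative j (j >= 1) has utility x[j - 1] * beta + v.
    """
    util = [0.0] + [xj * beta + v for xj in x]
    m = max(util)
    weights = [math.exp(u - m) for u in util]
    return weights[y] / sum(weights)


def normal_pdf(v, mean, sigma):
    z = (v - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def integrated_prob(y, x, beta, sigma, n_grid=201, width=6.0):
    """Midpoint-rule approximation of the integral of P(y | v, x) f_v(v) dv,
    with f_v assumed to be N(0, sigma^2) and the grid covering +/- width * sigma."""
    lo = -width * sigma
    step = 2.0 * width * sigma / n_grid
    total = 0.0
    for k in range(n_grid):
        v = lo + (k + 0.5) * step
        total += choice_prob(y, x, beta, v) * normal_pdf(v, 0.0, sigma) * step
    return total


def log_likelihood(ys, xs, beta, sigma):
    """Sum over individuals of the log of the integrated choice probability."""
    return sum(math.log(integrated_prob(y, x, beta, sigma))
               for y, x in zip(ys, xs))


# Hypothetical data: 3 individuals, 2 inside alternatives plus the outside option.
xs = [[1.0, 0.2], [0.5, -0.3], [-1.0, 0.8]]
ys = [1, 0, 2]

ll = log_likelihood(ys, xs, beta=0.7, sigma=1.0)
prob_sum = sum(integrated_prob(y, xs[0], 0.7, 1.0) for y in range(3))
print("log-likelihood:", ll)
print("probabilities sum to:", prob_sum)
```

In practice the integral is often handled with Gauss-Hermite quadrature or simulation instead; the midpoint rule is used here only to keep the sketch dependency-free.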

However, in many applications the personal feature v_i is correlated with x_{j,i}. In this situation, estimates obtained without accounting for the correlation will be inconsistent. To fix this problem, the log-likelihood function should be revised so that the density of v_i is conditional on the explanatory variables:

$\sum_{i=1}^N \log\left\{\int P\left[y^*_{y_i,i} > y^*_{j,i}\ \forall j \neq y_i \mid v_i, x_{1,i}, \ldots, x_{J,i}\right] f_v(v_i \mid x_{1,i}, \ldots, x_{J,i})\,dv_i\right\}$

Once the dependence of v_i on the explanatory variables is correctly modeled, the model can be estimated consistently by maximum likelihood estimation (MLE). Because the form of this correlation can be highly non-standard, there is no unified solution for this type of problem. One common practice is to impose a parametric assumption on the distribution of the unobservable variable conditional on the observable explanatory variables, and then implement MLE based on the new likelihood function.
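One way such a parametric assumption could look is sketched below. It is an illustration only, not a method prescribed by the text: the unobservable is assumed to satisfy v_i | x_i ∼ N(γ·x̄_i, σ²), where x̄_i is the mean of individual i's attributes and `gamma` is a hypothetical parameter capturing the correlation. The logit kernel, the scalar `beta`, the made-up data, and the outside option (index 0, utility normalized to zero so that v_i does not cancel out of the comparison) are likewise assumptions.

```python
import math


def choice_prob(y, x, beta, v):
    """Logit kernel P(y | v, x); index 0 is the assumed outside option."""
    util = [0.0] + [xj * beta + v for xj in x]
    m = max(util)
    weights = [math.exp(u - m) for u in util]
    return weights[y] / sum(weights)


def normal_pdf(v, mean, sigma):
    z = (v - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def cond_integrated_prob(y, x, beta, gamma, sigma, n_grid=201, width=6.0):
    """Midpoint-rule approximation of the integral of P(y | v, x) f_v(v | x) dv,
    with the assumed conditional density v | x ~ N(gamma * mean(x), sigma^2)."""
    mean = gamma * sum(x) / len(x)
    lo = mean - width * sigma
    step = 2.0 * width * sigma / n_grid
    total = 0.0
    for k in range(n_grid):
        v = lo + (k + 0.5) * step
        total += choice_prob(y, x, beta, v) * normal_pdf(v, mean, sigma) * step
    return total


def cond_log_likelihood(ys, xs, beta, gamma, sigma):
    """Log-likelihood using the conditional density of v given x."""
    return sum(math.log(cond_integrated_prob(y, x, beta, gamma, sigma))
               for y, x in zip(ys, xs))


# Hypothetical data: 3 individuals, 2 inside alternatives plus the outside option.
xs = [[1.0, 0.2], [0.5, -0.3], [-1.0, 0.8]]
ys = [1, 0, 2]

ll_no_corr = cond_log_likelihood(ys, xs, beta=0.7, gamma=0.0, sigma=1.0)
ll_corr = cond_log_likelihood(ys, xs, beta=0.7, gamma=0.5, sigma=1.0)
print("gamma = 0.0:", ll_no_corr)
print("gamma = 0.5:", ll_corr)
```

Setting `gamma` to zero recovers the likelihood without correlation, so in this sketch the uncorrelated model is nested in the revised one; `beta`, `gamma` and `sigma` would then be estimated jointly by maximizing `cond_log_likelihood` with a numerical optimizer.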

Archive Provenance

Created: January 1, 2018

Deleted: May 29, 2019

Article size: 4.0 KB

Technical Metadata

Wikipedia page ID: 54255466

Subject Tags

Econometric models

Why Deleted

AfD

by Jo-Jo Eumerus

Articles for deletion/Endogeneity in multinomial response models (XFDcloser)
