Endogeneity in multinomial response models

Endogeneity in a multinomial response model refers to correlation between the explanatory variables and the unobservable variable. Consider a choice model with an unobservable variable as an example of a multinomial response model:

$y_i = j$ if and only if $y^*_{j,i} = \max\{y^*_{1,i}, y^*_{2,i}, \ldots, y^*_{J,i}\}$;

$y^*_{j,i} = x_{j,i}\beta + v_i + u_{j,i}$

In the choice context, $i$ indexes the $N$ individuals and $j$ indexes the $J$ choices, for instance different brands of chocolate. $v_i$ represents an unobservable feature of individual $i$, for instance her taste, and $u_{j,i}$ denotes the i.i.d. response error, which might come from cognitive limitations, difficulty in analyzing information and many other sources. $y^*_{j,i}$ is a latent variable representing the utility of individual $i$ when she chooses choice $j$. The response error $u_{j,i}$ is assumed to be i.i.d. with zero mean and density $f_u(\cdot)$. When $f_u(\cdot)$ is normal, the model is a multinomial probit model; when it is a Gumbel density, the model is a multinomial logit model.

Usually, $v_i$ is assumed to be uncorrelated with $x_{j,i}$, which is typically a group of variables describing the features of choice $j$, for instance the weight of chocolate $j$; the features of individual $i$, for instance her age; and the interaction between choice $j$ and individual $i$, for instance the quantity of chocolate $j$ consumed by individual $i$ last year. Under this assumption, $v_i \mid x_{j,i} \sim f_v(\cdot)$. The log-likelihood function of a typical multinomial response model with an unobservable variable can then be written as:

$\sum_{i=1}^N \log\left\{\int P\left[y^*_{y_i,i} > y^*_{j,i}\ \forall j \neq y_i \mid v_i, x_{1,i}, \ldots, x_{J,i}\right] f_v(v_i)\,dv_i\right\}$
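As a rough illustration of how a log-likelihood of this form might be evaluated numerically, the following Python sketch simulates choices from the latent-utility model and approximates the integral over $v_i$ by Gauss-Hermite quadrature. Several ingredients are assumptions made only for the example: Gumbel errors (the logit case), $v_i \sim N(0, \sigma_v^2)$, the function names, and a hypothetical loading vector `lam` that lets $v_i$ enter alternative $j$ as `lam[j] * v_i`. With `lam` equal to a vector of ones the sketch reproduces the model exactly as written, but a term that is identical across alternatives does not change which utility is largest, so in that case the integral collapses to the ordinary logit probability.

```python
import numpy as np

def simulate_choices(n, beta, lam, sigma_v, rng):
    """Draw data from y*_{j,i} = x_{j,i} beta + lam_j v_i + u_{j,i} (lam is hypothetical)."""
    J, K = len(lam), len(beta)
    x = rng.normal(size=(n, J, K))        # x_{j,i}: K attributes per alternative
    v = sigma_v * rng.normal(size=n)      # v_i: unobserved individual feature
    u = rng.gumbel(size=(n, J))           # u_{j,i}: i.i.d. Gumbel errors (logit case)
    y_star = x @ beta + v[:, None] * lam + u
    return x, y_star.argmax(axis=1)       # y_i = alternative with the largest utility

def log_likelihood(beta, lam, sigma_v, x, y, n_nodes=32):
    """Approximate sum_i log of the integral of P[y_i | v, x_i] f_v(v) dv, f_v = N(0, sigma_v^2)."""
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    v = np.sqrt(2.0) * sigma_v * nodes    # change of variable for a normal density
    w = weights / np.sqrt(np.pi)          # normalized weights, they sum to 1
    # index[i, j, m] = x_{j,i} beta + lam_j * v_m
    index = (x @ beta)[:, :, None] + lam[None, :, None] * v[None, None, :]
    index = index - index.max(axis=1, keepdims=True)   # numerical stability
    prob = np.exp(index) / np.exp(index).sum(axis=1, keepdims=True)
    p_chosen = prob[np.arange(len(y)), y, :]            # P[y_i | v_m, x_i], shape (n, M)
    return np.log(p_chosen @ w).sum()

rng = np.random.default_rng(0)
beta = np.array([1.0, -0.5])
lam = np.array([0.0, 1.0, 2.0])
x, y = simulate_choices(500, beta, lam, sigma_v=1.0, rng=rng)
ll = log_likelihood(beta, lam, 1.0, x, y)
print(ll)
```

In an actual estimation the function would be handed to a numerical optimizer and maximized over the unknown parameters; here it is only evaluated at the values used to simulate the data.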

However, in many applications the individual feature $v_i$ is correlated with $x_{j,i}$. In this situation, estimates from a model that ignores this correlation will be inconsistent. To fix this problem, the log-likelihood function should be revised as:

$\sum_{i=1}^N \log\left\{\int P\left[y^*_{y_i,i} > y^*_{j,i}\ \forall j \neq y_i \mid v_i, x_{1,i}, \ldots, x_{J,i}\right] f_v(v_i \mid x_{1,i}, \ldots, x_{J,i})\,dv_i\right\}$

The model can then be estimated consistently by maximum likelihood estimation (MLE). Because the form of this correlation can be very non-standard, there is no unified solution for this type of problem. One common practice is to impose a parametric assumption on the distribution of the unobservable variable conditional on the observable explanatory variables, and then to implement MLE based on the new likelihood function.
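One way such a parametric assumption could look is sketched below in Python. It is an illustration under assumptions that are not part of the text: the conditional distribution is taken to be $v_i \mid x_i \sim N(\bar{x}_i\gamma, \sigma^2)$, where $\bar{x}_i$ is the average of $x_{j,i}$ over alternatives; the errors are Gumbel (logit case); and $v_i$ enters alternative $j$ through a hypothetical loading `lam[j]`, since a term common to all alternatives would not affect the choice. The function name and all parameter values are likewise made up for the example.

```python
import numpy as np

def conditional_log_likelihood(beta, lam, gamma, sigma, x, y, n_nodes=32):
    """Approximate sum_i log of the integral of P[y_i | v, x_i] f_v(v | x_i) dv,
    assuming v_i | x_i ~ N(xbar_i gamma, sigma^2) (a hypothetical specification)."""
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    w = weights / np.sqrt(np.pi)                        # normalized weights, sum to 1
    mean_v = x.mean(axis=1) @ gamma                     # E[v_i | x_i], shape (n,)
    v = mean_v[:, None] + np.sqrt(2.0) * sigma * nodes[None, :]   # nodes per person, (n, M)
    # index[i, j, m] = x_{j,i} beta + lam_j * v_{i,m}
    index = (x @ beta)[:, :, None] + lam[None, :, None] * v[:, None, :]
    index = index - index.max(axis=1, keepdims=True)    # numerical stability
    prob = np.exp(index) / np.exp(index).sum(axis=1, keepdims=True)
    return np.log(prob[np.arange(len(y)), y, :] @ w).sum()

# Simulate data in which v_i is correlated with x_i (all values are illustrative).
rng = np.random.default_rng(1)
n, J, K = 1000, 3, 2
beta = np.array([1.0, -0.5])
lam = np.array([0.0, 1.0, 2.0])
gamma = np.array([0.8, 0.0])
sigma = 0.5
x = rng.normal(size=(n, J, K))
v = x.mean(axis=1) @ gamma + sigma * rng.normal(size=n)   # endogenous individual feature
u = rng.gumbel(size=(n, J))
y = (x @ beta + v[:, None] * lam + u).argmax(axis=1)

ll_conditional = conditional_log_likelihood(beta, lam, gamma, sigma, x, y)
# Setting gamma = 0 imposes f_v(v | x) = f_v(v), i.e. it ignores the correlation.
ll_ignoring = conditional_log_likelihood(beta, lam, np.zeros(K), sigma, x, y)
print(ll_conditional, ll_ignoring)
```

To estimate the model, the negative of this function would be minimized over `beta`, `lam`, `gamma` and `sigma` with a general-purpose optimizer such as `scipy.optimize.minimize`, subject to whatever normalizations are needed for identification.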