The specification of the propensity score in multilevel observational studies

Arpino, Bruno; Mealli, F.

doi:10.1016/j.csda.2010.11.008

The use of multilevel models for the estimation of the propensity score for data with a hierarchical structure and unobserved cluster-level variables is proposed. This approach is compared with models that ignore the hierarchy and with models in which the hierarchy is represented by a fixed parameter for each cluster. It is shown, by simulation, that simple models with dummy variables outperform both random-effects models and models ignoring the hierarchy in terms of the balance of unobserved cluster-level covariates and of omitted-variable bias. The representation of the clusters by fixed or random effects defines a model more general than would be ideal if the relevant cluster-level variables were available. The general conclusion confirms that, when conducting propensity score analysis, it is safer to specify a more general model than to pursue model parsimony.
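
As a rough illustration, the three propensity score specifications compared in the abstract might be written as follows. The notation and the logit link are assumptions made for this sketch and are not taken from the paper: $T_{ij}$ is the treatment indicator for unit $i$ in cluster $j$, $X_{ij}$ the observed covariates, $\alpha_j$ a cluster-specific intercept, and $u_j$ a random intercept, taken here to be normally distributed.

```latex
% Illustrative notation only (an assumption), not the paper's own.
\begin{align*}
\text{Hierarchy ignored:} \quad
  & \operatorname{logit}\Pr(T_{ij}=1 \mid X_{ij}) = \alpha + X_{ij}^{\top}\beta \\
\text{Fixed effects (cluster dummies):} \quad
  & \operatorname{logit}\Pr(T_{ij}=1 \mid X_{ij}) = \alpha_j + X_{ij}^{\top}\beta \\
\text{Random effects (multilevel):} \quad
  & \operatorname{logit}\Pr(T_{ij}=1 \mid X_{ij}, u_j) = \alpha + X_{ij}^{\top}\beta + u_j,
  \qquad u_j \sim N(0, \sigma_u^2) % normality assumed for illustration
\end{align*}
```

In the second and third forms the cluster term ($\alpha_j$ or $u_j$) can absorb cluster-level covariates that were not observed, which is the sense in which these models are more general than one built only from the observed variables.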