The specification of the propensity score in multilevel studies

Arpino, Bruno; Mealli, Fabrizia

Propensity Score Matching (PSM) has become a popular approach to estimating causal effects. It relies on the assumption that selection into treatment can be explained purely in terms of observable characteristics (the “unconfoundedness assumption”) and on the property that balancing on the propensity score is equivalent to balancing on the observed covariates. Several applications in the social sciences involve hierarchically structured data: units at the first level (e.g., individuals) are clustered into groups (e.g., provinces). In this paper we explore the use of multilevel models for estimating the propensity score for such hierarchical data when one or more relevant cluster-level variables are unobserved. We compare this approach with alternatives, such as a single-level model with cluster dummies. Using Monte Carlo evidence, we show that multilevel specifications usually achieve reasonably good balance in unobserved cluster-level covariates and consequently reduce the omitted variable bias. The same holds for the model with cluster dummies.
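To illustrate why cluster information in the propensity score can balance an unobserved cluster-level covariate, the following is a minimal sketch and not the simulation design of the paper. It assumes a deliberately simplified setting with no individual-level covariates, in which the maximum-likelihood propensity score from a logistic model with cluster dummies reduces to the within-cluster treated share, and the pooled model reduces to the overall treated share. Inverse-probability weighting stands in for matching purely to keep the example short. All names and parameter values (`N_CLUSTERS`, `N_PER_CLUSTER`, `weighted_gap`, the uniform distribution of `u`) are hypothetical.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is reproducible

N_CLUSTERS = 20       # hypothetical number of clusters (e.g., provinces)
N_PER_CLUSTER = 200   # hypothetical number of individuals per cluster


def sigmoid(z):
    """Logistic function mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# Unobserved cluster-level covariate u_j that drives selection into treatment.
u = [random.uniform(-1.5, 1.5) for _ in range(N_CLUSTERS)]

# Each record is (cluster index, treatment indicator, cluster covariate).
data = []
for j, u_j in enumerate(u):
    p_treat = sigmoid(u_j)
    for _ in range(N_PER_CLUSTER):
        t = 1 if random.random() < p_treat else 0
        data.append((j, t, u_j))

# Pooled single-level model without covariates: one common treated share.
overall_share = sum(t for _, t, _ in data) / len(data)

# Cluster-dummy model without covariates: within-cluster treated shares.
treated_in = [0] * N_CLUSTERS
for j, t, _ in data:
    treated_in[j] += t
cluster_share = [count / N_PER_CLUSTER for count in treated_in]


def weighted_gap(ps_of_cluster):
    """Weighted treated-minus-control difference in the mean of u.

    Weights are 1/e for treated and 1/(1-e) for control units, where e is
    the propensity score returned by ps_of_cluster for the unit's cluster.
    """
    num = {0: 0.0, 1: 0.0}
    den = {0: 0.0, 1: 0.0}
    for j, t, u_j in data:
        e = ps_of_cluster(j)
        w = 1.0 / e if t == 1 else 1.0 / (1.0 - e)
        num[t] += w * u_j
        den[t] += w
    return num[1] / den[1] - num[0] / den[0]


gap_pooled = weighted_gap(lambda j: overall_share)
gap_dummy = weighted_gap(lambda j: cluster_share[j])

print("imbalance in u, pooled propensity score:       ", gap_pooled)
print("imbalance in u, cluster-dummy propensity score:", gap_dummy)
```

In this simplified setting the cluster-dummy score balances `u` exactly (up to floating-point error), because weighting by the within-cluster treated share reproduces each cluster's full size in both groups. A random-intercept multilevel model would typically shrink the cluster-specific shares toward the overall share, so its balance would be expected to lie between the two cases shown; that model is not implemented here.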