Conjugate Bayes for probit regression via unified skew-normal distributions

Durante, Daniele

doi:10.1093/biomet/asz034

Regression models for dichotomous data are ubiquitous in statistics. Besides being useful for inference on binary responses, these methods serve as building blocks in more complex formulations, such as density regression, nonparametric classification and graphical models. Within the Bayesian framework, inference proceeds by updating the priors for the coefficients, typically taken to be Gaussians, with the likelihood induced by probit or logit regressions for the responses. In this updating, the apparent absence of a tractable posterior has motivated a variety of computational methods, including Markov chain Monte Carlo routines and algorithms that approximate the posterior. Despite being implemented routinely, Markov chain Monte Carlo strategies have mixing or time-inefficiency issues in large-p and small-n studies, whereas approximate routines fail to capture the skewness typically observed in the posterior. In this article it is proved that the posterior distribution for the probit coefficients has a unified skew-normal kernel under Gaussian priors. This result allows efficient Bayesian inference for a wide class of applications, especially in large-p and small-to-moderate-n settings where state-of-the-art computational methods face notable challenges. These advances are illustrated in a genetic study, and further motivate the development of a wider class of conjugate priors for probit models, along with methods for obtaining independent and identically distributed samples from the unified skew-normal posterior.
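
To see why a unified skew-normal kernel is plausible, note that the probit likelihood, a product of terms Phi(x_i' beta) for y_i = 1 and 1 - Phi(x_i' beta) for y_i = 0, can be written as the product of Phi((2 y_i - 1) x_i' beta) over i. This is an n-variate Gaussian cumulative distribution function with identity covariance evaluated at D beta, where row i of D is (2 y_i - 1) x_i'. Under a Gaussian prior the posterior kernel is therefore a Gaussian density multiplied by a multivariate Gaussian cumulative distribution function, which is the general shape of a unified skew-normal kernel. The following is a minimal sketch of that unnormalized log kernel, not the article's own code. It assumes NumPy and SciPy are available, and the function name, the symbols xi, Omega and D, and the demo data are hypothetical choices made here for illustration.

```python
import numpy as np
from scipy.stats import norm, multivariate_normal


def log_posterior_kernel(beta, X, y, xi, Omega):
    """Unnormalized log posterior of probit coefficients under a N(xi, Omega) prior.

    Illustrative only: the name and argument layout are assumptions,
    not taken from the article.
    """
    # Row i of D is (2*y_i - 1) * x_i, so the sign of each row encodes the response.
    D = (2 * y - 1)[:, None] * X
    # Gaussian prior density for the coefficients.
    log_prior = multivariate_normal.logpdf(beta, mean=xi, cov=Omega)
    # Probit likelihood: sum of log Phi(d_i' beta), i.e. log Phi_n(D beta; I_n).
    log_lik = norm.logcdf(D @ beta).sum()
    return log_prior + log_lik


# Small synthetic example (made-up data, for illustration only).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(5, 3))
y_demo = np.array([1, 0, 1, 1, 0])
beta_demo = np.array([0.3, -0.2, 0.1])
xi_demo = np.zeros(3)
Omega_demo = np.eye(3)

value_demo = log_posterior_kernel(beta_demo, X_demo, y_demo, xi_demo, Omega_demo)
```

The sketch only evaluates the kernel pointwise. Computing the normalizing constant or drawing independent samples from the unified skew-normal posterior requires multivariate Gaussian probabilities and truncated normal draws, which are not attempted here.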