Unbiased approximations of products of expectations

Zanella, Giacomo
2019

Abstract

We consider the problem of approximating the product of n expectations with respect to a common probability distribution μ. Such products routinely arise in statistics as values of the likelihood in latent variable models. Motivated by pseudo-marginal Markov chain Monte Carlo schemes, we focus on unbiased estimators of such products. The standard approach is to sample N particles from μ and assign each particle to one of the expectations; this is wasteful and typically requires the number of particles to grow quadratically with the number of expectations. We propose an alternative estimator that approximates each expectation using most of the particles while preserving unbiasedness, which is computationally more efficient when the cost of simulations greatly exceeds the cost of likelihood evaluations. We carefully study the properties of our proposed estimator, showing that in latent variable contexts it needs only O(n) particles to match the performance of the standard approach with O(n^2) particles. We demonstrate the procedure on two latent variable examples from approximate Bayesian computation and single-cell gene expression analysis, observing computational gains by factors of about 25 and 450, respectively.
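As an illustration only, the "standard approach" mentioned in the abstract can be sketched as follows: each of the n expectations receives its own disjoint block of particles, and the n sample means are multiplied. This is a minimal sketch under stated assumptions, not the authors' code. The names `standard_product_estimator`, `sample_mu`, `particles_per_term` and the toy target (μ uniform on (0, 1), f_i(x) = x + i) are hypothetical choices made for the example. The alternative estimator proposed in the paper is not reproduced here.

```python
import math
import random


def standard_product_estimator(fs, sample_mu, particles_per_term, rng):
    """Unbiased estimate of prod_i E_mu[f_i(X)] using disjoint particle blocks.

    fs: list of n functions f_i (hypothetical test functions)
    sample_mu: callable taking rng and returning one draw from mu (hypothetical)
    particles_per_term: particles assigned to each expectation, so N = n * this
    """
    estimate = 1.0
    for f in fs:
        # Fresh particles for every factor: the n sample means are independent,
        # so the expectation of their product is the product of expectations.
        block = [f(sample_mu(rng)) for _ in range(particles_per_term)]
        estimate *= sum(block) / particles_per_term
    return estimate


def demo(seed=0, replications=20000):
    """Average many replications to check unbiasedness on a toy problem."""
    rng = random.Random(seed)
    # Toy target (assumption): mu = Uniform(0, 1), f_i(x) = x + i,
    # so E[f_i(X)] = 0.5 + i and the product is 0.5 * 1.5 * 2.5.
    fs = [lambda x, i=i: x + i for i in range(3)]
    truth = math.prod(0.5 + i for i in range(3))
    draws = [
        standard_product_estimator(fs, lambda r: r.random(), 5, rng)
        for _ in range(replications)
    ]
    return truth, sum(draws) / replications
```

Calling `demo()` returns the exact product together with the Monte Carlo average of the estimator, which should be close to it. Because each factor sees only N/n of the N particles, each sample mean has relative variance of order n/N, and a product of n such factors has relative variance of order n^2/N to first order; this is one heuristic reading of the quadratic growth the abstract refers to.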
Lee, Anthony; Tiberi, Simone; Zanella, Giacomo
Files in this item:
File: 2019_Biometrika.pdf (not available)
Description: Paper
Type: Publisher's PDF (Publisher's layout)
License: NOT PUBLIC - Private/restricted access
Size: 161.28 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11565/4021094