Two‐group Poisson‐Dirichlet mixtures for multiple testing

Denti, Francesco; Guindani, Michele; Lijoi, Antonio
2021

Abstract

The simultaneous testing of multiple hypotheses is common in the analysis of high-dimensional data sets. The two-group model, first proposed in Efron (2004), identifies significant comparisons by allocating observations to a mixture of an empirical null and an alternative distribution. In the Bayesian nonparametrics literature, many approaches have suggested using mixtures of Dirichlet processes within the two-group model framework. Here, we instead investigate mixtures of two-parameter Poisson-Dirichlet processes (2PPD) and show how they provide a more flexible and effective tool for large-scale hypothesis testing. Our model further employs non-local prior densities to allow separation between the two mixture components. We obtain a closed-form expression for the exchangeable partition probability function of the two-group model, which leads to a straightforward MCMC implementation. We evaluate the performance of our method for large-scale inference in a simulation study and illustrate its use on both a prostate cancer dataset and a case-control microbiome study of the gastrointestinal tracts of children from underdeveloped countries who have recently been diagnosed with moderate to severe diarrhea.
2021
2020
Denti, Francesco; Guindani, Michele; Leisen, Fabrizio; Lijoi, Antonio; Wadsworth, Duncan; Vannucci, Marina
Files in this item:
File: maintext.pdf (not available)
Description: article
Type: Post-print document
License: Not public - private/restricted access
Size: 783.78 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11565/4033009