Bayesian inference on group differences in multivariate categorical data

DURANTE, DANIELE;
2018

Abstract

Multivariate categorical data are common in many fields. An illustrative example is provided by election poll studies assessing evidence of changes in voters’ opinions with their candidate preferences in the 2016 United States Presidential primaries or caucuses. Similar goals arise in routine applications, but the current literature lacks a general methodology that combines flexibility, efficiency, and tractability in testing for group differences in multivariate categorical data at different, potentially complex, scales. This contribution addresses this goal by leveraging a Bayesian representation that factorizes the joint probability mass function for the group variable and the multivariate categorical data as the product of the marginal probabilities for the groups and the conditional probability mass function of the multivariate categorical data given the group membership. To enhance flexibility, this conditional probability mass function is defined via a group-dependent mixture of tensor factorizations, which facilitates dimensionality reduction and borrowing of information while providing tractable computational procedures and accurate tests of global and local group differences. The proposed methods are compared with popular competitors, and the improved performance is illustrated in simulations and in American election poll studies.
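The factorization described in the abstract can be illustrated with a small numerical sketch. The code below is not the authors' implementation: every name and number in it is hypothetical, and it assumes one common form of such a model, in which the mixing weights depend on the group while the product-multinomial component kernels are shared across groups. It only shows how a joint probability mass function is assembled from the group marginals and a group-dependent mixture.

```python
from itertools import product
from math import prod

# All names and values are hypothetical, chosen only for illustration.

# Marginal probabilities of the groups: group_probs[g] = pr(group = g).
group_probs = [0.6, 0.4]

# Group-dependent mixing weights: weights[g][h] is the weight of
# mixture component h within group g (each row sums to 1).
weights = [
    [0.7, 0.3],
    [0.2, 0.8],
]

# Component kernels, ASSUMED shared across groups:
# kernels[h][j][c] = probability that variable j takes category c
# under component h. Variable 0 has 2 categories, variable 1 has 3.
kernels = [
    [[0.9, 0.1], [0.5, 0.3, 0.2]],
    [[0.2, 0.8], [0.1, 0.1, 0.8]],
]


def conditional_pmf(y, g):
    """pr(data = y | group = g) as a mixture of products of marginals."""
    return sum(
        w * prod(kernels[h][j][c] for j, c in enumerate(y))
        for h, w in enumerate(weights[g])
    )


def joint_pmf(y, g):
    """pr(group = g, data = y) = pr(group = g) * pr(data = y | group = g)."""
    return group_probs[g] * conditional_pmf(y, g)


# Enumerate every cell of the 2 x 3 contingency table.
cells = list(product(range(2), range(3)))

# The joint pmf over all groups and cells should sum to one.
total = sum(joint_pmf(y, g) for g in range(len(group_probs)) for y in cells)
```

In this toy setting, the groups differ only through their mixing weights, so a test for group differences would amount to asking whether the rows of `weights` coincide.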
Russo, Massimiliano; Durante, Daniele; Scarpa, Bruno
Files in this item:
File: CSDA_Durante2018.pdf (not available)
Description: Article
Type: Post-print document
License: Not public (private/restricted access)
Size: 1.13 MB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/11565/4014597