From post hoc explanations to Bayesian nonparametric models: unveiling hidden structures

GHIDINI, VALENTINA
2024

Abstract

Over recent years, the complexity of both models and data structures has increased remarkably. On the one hand, as models become more complex, there is a growing need to understand and explain the decision-making process of black boxes, both to enhance users' trust and to comply with legal requirements. On the other hand, complex data structures, such as complex networks, require flexible models to extract relevant information effectively. The leitmotif of this thesis is the search for hidden structures in models and data. The first part is devoted to explainability. Namely, we introduce the Xi method, a comprehensive statistical framework for defining post hoc explanations that possess theoretical guarantees. The rationale is to propose and evaluate a class of probabilistic sensitivity measures that quantify the degree of association between covariates and generic model predictions. These explanations are designed to be applicable across different models and data types, regardless of their specific characteristics. The second part focuses on Bayesian nonparametric models for community detection in complex networks. First, we define a stochastic block model for multiplex networks. This model identifies clusters specific to each layer, as well as a latent partition common to all layers. A non-trivial computational scheme for posterior inference is also introduced. This framework applies to a wide range of problems, including the analysis of latent structures in the brain networks of different subjects. Second, we propose a stochastic block model specifically tailored to weighted networks with continuous, multidimensional node attributes. This model has the potential to capture and exploit the information contained in these node features, while also learning how much of that information to incorporate. A real motivating application is showcased, addressing the need to identify a meaningful latent partition within a transportation network.
23-Jan-2024
English
35
2022/2023
STATISTICS
Sector SECS-S/01 - Statistics
PAPASPILIOPOULOS, OMIROS
DURANTE, DANIELE
Files in this item:
Thesis_final_ghidini.pdf (open access)
Description: Ghidini_revised_thesis
Type: Doctoral thesis
Size: 34.51 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11565/4062461