From post hoc explanations to Bayesian nonparametric models: unveiling hidden structures
GHIDINI, VALENTINA
2024
Abstract
In recent years, both models and data structures have grown markedly more complex. On the one hand, as models become more complex, there is a growing need to understand and explain the decision-making process of black boxes, both to strengthen user trust and to comply with legal requirements. On the other hand, complex data structures, such as complex networks, require flexible models to extract relevant information effectively. The leitmotif of this thesis is the search for hidden structures in models and data. The first part is devoted to explainability. We introduce the Xi method, a comprehensive statistical framework for defining post hoc explanations with theoretical guarantees. The rationale is to propose and evaluate a class of probabilistic sensitivity measures that quantify the degree of association between covariates and generic model predictions. These explanations are designed to apply across different models and data types, regardless of their specific characteristics. The second part focuses on Bayesian nonparametric models for community detection in complex networks. First, we define a stochastic block model for multiplex networks. This model identifies clusters specific to each layer, as well as a latent partition common to all layers. A non-trivial computational scheme for posterior inference is also introduced. The framework applies to a wide range of problems, including the analysis of latent structures in the brain networks of different subjects. Second, we propose a stochastic block model specifically tailored to weighted networks with continuous, multidimensional node attributes. This model can capture and use the information contained in these node features while also learning the optimal amount of that information to incorporate.
A real, motivating application is showcased, addressing the need to identify a meaningful latent partition within a transportation network.

File | Size | Format | |
---|---|---|---|
Thesis_final_ghidini.pdf (open access; description: Ghidini_revised_thesis; type: doctoral thesis) | 34.51 MB | Adobe PDF | View/Open |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.