Statistical Physics Methods for Non-Convex Neural Network Models

ANNESI, BRANDON LIVIO
2025

Abstract

In this doctoral thesis I employ the lens of Statistical Physics to study a number of non-convex models of Neural Networks. I start by using the replica method to investigate the loss landscape of a prototypical neural network model, the Negative Perceptron, and use the tool of Linear Mode Connectivity to describe how different types of solutions are connected. I show that the geometry of these solutions can be described as star-shaped, and numerically verify that the same connectivity properties hold for solutions found by algorithms. In the same model, and for the Tree-Committee Machine, I study the critical capacity under the full-RSB (fRSB) ansatz, and settle a long-standing open problem about the numerical value of this threshold. Comparing it to simulations with Gradient Descent, I observe an algorithmic gap: for some values of the constraint density, solutions exist but are not found by the algorithm. I also introduce a transition line that separates a phase where typical states exhibit an Overlap Gap from a phase where no such gap exists, and discuss potential algorithmic implications. Going back to the connectivity properties of the Negative Perceptron, I use the fRSB framework to characterize the disconnection transition. Finally, I go beyond the storage setting and study a Spiked Random Feature Model in the teacher-student scenario, where a low-rank correction to the random feature matrix can be learned. I observe a detection phenomenon in which a minimum amount of data is needed for the student to align its spike with that of the teacher, and compare it to numerical simulations with real datasets.
23 June 2025
English
36
2023/2024
STATISTICS AND COMPUTER SCIENCE
Sector FIS/02 - Theoretical Physics, Mathematical Models and Methods
ZECCHINA, RICCARDO
LUCIBELLO, CARLO
Files in this record:
ANNESI Thesis.pdf (open access)
Description: ANNESI Thesis
Type: Doctoral thesis
Size: 4.84 MB
Format: Adobe PDF


Use this identifier to cite or link to this document: https://hdl.handle.net/11565/4074062