IRIS, Università Commerciale Luigi Bocconi
https://iris.unibocconi.it
The IRIS digital repository system acquires, archives, indexes, preserves, and makes accessible digital research products.
Feed generated: Tue, 25 Jan 2022 02:29:29 GMT

Title: Generalized approximate survey propagation for high-dimensional estimation
Handle: http://hdl.handle.net/11565/4024163
Date: 2019-01-01

Title: Subdominant dense clusters allow for simple learning and high computational performance in neural networks with discrete synapses
Handle: http://hdl.handle.net/11565/3996613
Abstract: We show that discrete synaptic weights can be efficiently used for learning in large-scale neural systems, and lead to unanticipated computational performance. We focus on the representative case of learning random patterns with binary synapses in single-layer networks. The standard statistical analysis shows that this problem is exponentially dominated by isolated solutions that are extremely hard to find algorithmically. Here, we introduce a novel method that allows us to find analytical evidence for the existence of subdominant and extremely dense regions of solutions. Numerical experiments confirm these findings. We also show that the dense regions are surprisingly accessible by simple learning protocols, and that these synaptic configurations are robust to perturbations and generalize better than typical solutions. These outcomes extend to synapses with multiple states and to deeper neural architectures. The large-deviation measure also suggests how to design novel algorithmic schemes for optimization based on local entropy maximization.
Date: 2015-01-01

Title: Learning may need only a few bits of synaptic precision
Handle: http://hdl.handle.net/11565/3996608
Abstract: Learning in neural networks poses peculiar challenges when using discretized rather than continuous synaptic states. The choice of discrete synapses is motivated by biological reasoning and experiments, and possibly by hardware implementation considerations as well. In this paper we extend a previous large-deviations analysis which unveiled the existence of peculiar dense regions in the space of synaptic states that account for the possibility of learning efficiently in networks with binary synapses. We extend the analysis to synapses with multiple states and generally more plausible biological features. The results clearly indicate that the overall qualitative picture is unchanged with respect to the binary case, and very robust to variations in the details of the model. We also provide quantitative results which suggest that the advantages of increasing the synaptic precision (i.e., the number of internal synaptic states) rapidly vanish after the first few bits, and that therefore, for practical applications, only a few bits may be needed for near-optimal performance, consistent with recent biological findings. Finally, we demonstrate how the theoretical analysis can be exploited to design efficient algorithmic search strategies.
Date: 2016-01-01

Title: Local entropy as a measure for sampling solutions in constraint satisfaction problems
Handle: http://hdl.handle.net/11565/4007072
Abstract: We introduce a novel entropy-driven Monte Carlo (EdMC) strategy to efficiently sample solutions of random constraint satisfaction problems (CSPs). First, we extend a recent result showing, through a large-deviation analysis, that the geometry of the space of solutions of the binary perceptron learning problem (a prototypical CSP) contains regions of very high density of solutions. Despite being subdominant, these regions can be found by optimizing a local entropy measure. Building on these results, we construct a fast solver that relies exclusively on a local entropy estimate and can be applied to general CSPs. We describe its performance not only for the perceptron learning problem but also for the random K-satisfiability problem (another prototypical CSP with a radically different structure), and show numerically that a simple zero-temperature Metropolis search in the smooth local entropy landscape can reach subdominant clusters of optimal solutions in a small number of steps, while standard Simulated Annealing either requires extremely long cooling procedures or simply fails. We also discuss how EdMC can heuristically be made even more efficient for the cases we studied.
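The zero-temperature Metropolis search on a local-entropy landscape described in this abstract can be illustrated with a deliberately tiny toy model (a sketch, not the paper's implementation): a binary perceptron small enough that the full solution set can be enumerated by brute force, so the local entropy, here the log-count of solutions within a fixed Hamming radius, is exact. The sizes, radius, and teacher construction below are all illustrative assumptions:

```python
import itertools, math, random

random.seed(0)
N, P, R = 11, 6, 2  # synapses, patterns, Hamming radius (toy sizes)

# Toy binary perceptron: find w in {-1,+1}^N with w.x > 0 for every pattern x.
# Patterns are flipped so the all-ones "teacher" is a guaranteed solution.
patterns = [[random.choice((-1, 1)) for _ in range(N)] for _ in range(P)]
patterns = [x if sum(x) > 0 else [-v for v in x] for x in patterns]

def satisfies(w):
    return all(sum(wi * xi for wi, xi in zip(w, x)) > 0 for x in patterns)

# Brute-force the full solution set (only feasible at toy sizes).
solutions = [w for w in itertools.product((-1, 1), repeat=N) if satisfies(w)]

def local_entropy(w):
    """log(1 + number of solutions within Hamming distance R of w)."""
    near = sum(1 for s in solutions
               if sum(a != b for a, b in zip(w, s)) <= R)
    return math.log1p(near)

# Zero-temperature Metropolis in the local-entropy landscape: accept any
# single-flip move that does not decrease the local entropy.
w = [random.choice((-1, 1)) for _ in range(N)]
start_ent = cur = local_entropy(w)
for _ in range(2000):
    i = random.randrange(N)
    cand = w[:]
    cand[i] = -cand[i]
    cand_ent = local_entropy(cand)
    if cand_ent >= cur:
        w, cur = cand, cand_ent

print(f"{len(solutions)} solutions; local entropy {start_ent:.2f} -> {cur:.2f}")
```

At real problem sizes the solution set cannot be enumerated, which is why the paper relies on an analytical estimate of the local entropy rather than the exact count used here.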
Date: 2016-01-01

Title: From inverse problems to learning: a statistical mechanics approach
Handle: http://hdl.handle.net/11565/4006643
Date: 2018-01-01

Title: Unreasonable effectiveness of learning neural networks: from accessible states and robust ensembles to basic algorithmic schemes
Handle: http://hdl.handle.net/11565/3996619
Abstract: In artificial neural networks, learning from data is a computationally demanding task in which a large number of connection weights are iteratively tuned through stochastic-gradient-based heuristic processes over a cost function. It is not well understood how learning occurs in these systems, in particular how they avoid getting trapped in configurations with poor computational performance. Here, we study the difficult case of networks with discrete weights, where the optimization landscape is very rough even for simple architectures, and provide theoretical and numerical evidence of the existence of rare, but extremely dense and accessible, regions of configurations in the network weight space. We define a measure, the robust ensemble (RE), which suppresses trapping by isolated configurations and amplifies the role of these dense regions. We analytically compute the RE in some exactly solvable models and also provide a general algorithmic scheme that is straightforward to implement: define a cost function given by a sum of a finite number of replicas of the original cost function, with a constraint centering the replicas around a driving assignment. To illustrate this, we derive several powerful algorithms, ranging from Markov chains to message passing to gradient descent processes, where the algorithms target the robust dense states, resulting in substantial improvements in performance. The weak dependence on the number of precision bits of the weights leads us to conjecture that very similar reasoning applies to more conventional neural networks. Analogous algorithmic schemes can also be applied to other optimization problems.
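The replicated scheme described in this abstract can be sketched on a toy problem (a hedged illustration, not one of the paper's algorithms): gradient descent on several coupled copies of a rugged one-dimensional loss, each replica feeling its own gradient plus an elastic pull toward the replicas' center. The landscape, coupling strength gamma, and learning rate are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy rugged landscape: a wide basin at w = 2 plus high-frequency ripples
# that can trap plain gradient descent in narrow local minima.
def loss(w):
    return (w - 2.0) ** 2 + 0.5 * np.cos(12.0 * w)

def grad(w):
    return 2.0 * (w - 2.0) - 6.0 * np.sin(12.0 * w)

y, gamma, lr = 5, 1.0, 0.01          # replicas, coupling strength, step size
replicas = rng.uniform(-4.0, 4.0, size=y)

for _ in range(2000):
    center = replicas.mean()          # the "driving assignment" centering the replicas
    # each replica feels its own loss gradient plus an elastic pull to the center
    replicas -= lr * (grad(replicas) + gamma * (replicas - center))

center = replicas.mean()
print(f"center {center:.3f}, loss at center {loss(center):.3f}")
```

The intent of the coupling is that narrow ripples, which can hold one replica, cannot hold the whole coupled system, so the center is biased toward wide, dense basins; the sketch makes no convergence guarantee.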
Date: 2016-01-01

Title: Role of synaptic stochasticity in training low-precision neural networks
Handle: http://hdl.handle.net/11565/4011427
Abstract: Stochasticity and limited precision of synaptic weights in neural network models are key aspects of both biological and hardware modeling of learning processes. Here we show that a neural network model with stochastic binary weights naturally gives prominence to exponentially rare dense regions of solutions with a number of desirable properties such as robustness and good generalization performance, while typical solutions are isolated and hard to find. Binary solutions of the standard perceptron problem are obtained from a simple gradient descent procedure on a set of real values parametrizing a probability distribution over the binary synapses. Both analytical and numerical results are presented. An algorithmic extension that allows training of deep neural networks with discrete weights is also investigated.
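A minimal sketch of the idea above: gradient descent on real fields that parametrize a distribution over binary synapses, with the discrete weights read off at the end by taking signs. The hinge loss, teacher construction, and all sizes are illustrative assumptions, not the paper's exact procedure:

```python
import numpy as np

rng = np.random.default_rng(1)
N, P = 51, 20  # synapses, patterns (toy sizes; N odd so margins are never zero)

# Teacher-generated, hence learnable, dataset for a binary perceptron.
X = rng.choice((-1.0, 1.0), size=(P, N))
teacher = np.ones(N)
y = np.sign(X @ teacher)

# Each binary synapse w_i in {-1,+1} is parametrized by a real field h_i
# through its mean m_i = tanh(h_i) under P(w_i = +1) = (1 + tanh h_i) / 2.
h = 0.1 * rng.standard_normal(N)

def hinge_loss(h):
    margins = y * (X @ np.tanh(h))   # expected margins under the weight distribution
    return np.maximum(0.0, 1.0 - margins).sum()

lr = 0.02
for _ in range(500):
    m = np.tanh(h)
    margins = y * (X @ m)
    active = (margins < 1.0).astype(float)       # patterns still violating the margin
    grad_h = -((active * y) @ X) * (1.0 - m**2)  # chain rule through tanh
    h -= lr * grad_h

w_binary = np.sign(h)                            # extract the discrete synapses
train_err = np.mean(np.sign(X @ w_binary) != y)
print(f"training error of binarized weights: {train_err:.2f}")
```

With a teacher-generated dataset this small the binarized weights typically classify the training set correctly, though the sketch makes no guarantee of that.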
Date: 2018-01-01

Title: From statistical inference to a differential learning rule for stochastic neural networks
Handle: http://hdl.handle.net/11565/4012754
Date: 2018-01-01