A Max-Sum algorithm for training discrete neural networks
Baldassi, Carlo
2015
Abstract
We present an efficient learning algorithm for the problem of training neural networks with discrete synapses, a well-known hard (NP-complete) discrete optimization problem. The algorithm is a variant of the so-called Max-Sum (MS) algorithm. In particular, we show how, for bounded integer weights with $q$ distinct states and an independent concave a priori distribution (e.g. $\ell_1$ regularization), the algorithm's time complexity can be made to scale as $O\left(N\log N\right)$ per node update, thus putting it on par with alternative schemes, such as Belief Propagation (BP), without resorting to approximations. Two special cases are of particular interest: binary synapses $W\in \left\{-1,1\right\}$ and ternary synapses $W\in \left\{-1,0,1\right\}$ with $\ell_0$ regularization. The algorithm we present performs as well as BP on binary perceptron learning problems, and may be better suited to addressing the problem on fully-connected two-layer networks, since the inherent symmetries of two-layer networks are naturally broken under the MS approach.
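The record contains only the abstract, not the paper's pseudocode. As an illustration of the kind of message-passing update the abstract describes, below is a minimal sketch of a Max-Sum update for the binary special case $W\in\{-1,1\}$, restricted to a single perceptron-like factor enforcing correct classification of one pattern. Everything here is a reconstruction under stated assumptions, not the paper's implementation: the function names, the NumPy representation, and the margin parameter `theta` are ours. The point of the sketch is that the constrained cavity maximizations reduce to picking the cheapest "flips" after one sort, which is the flavor of argument that makes an $O(N\log N)$ per-node update plausible in the binary case.

```python
import numpy as np

INF = 1e9  # large finite stand-in for infinity; keeps later message sums well-defined


def perceptron_factor_messages(h, x, theta=1):
    """Max-Sum messages u_{a->i} from one perceptron factor a to every
    binary weight w_i in {-1,+1}.

    h[j]  : incoming cavity field h_{j->a} (real).
    x[j]  : sigma^a * xi^a_j in {-1,+1}, so the factor enforces
            sum_j w_j x[j] >= theta (pattern a correctly classified).
    Returns u[i] = (max with w_i=+1  -  max with w_i=-1) / 2 of the
    constrained cavity objective sum_{j!=i} h_j w_j.
    The single sort below makes the whole update O(N log N).
    """
    h = np.asarray(h, dtype=float)
    x = np.asarray(x, dtype=int)
    N = len(h)

    s_opt = np.where(h >= 0, 1, -1)   # unconstrained argmax of sum_j h_j w_j
    V = np.abs(h).sum()               # its value
    S = int((s_opt * x).sum())        # its stability sum_j s_opt_j x_j

    # Flipping w_j with s_opt_j * x_j = -1 raises the stability by 2 at
    # cost 2|h_j|; sort those candidates by cost and precompute prefix
    # sums so the k cheapest flips are available in O(1) per query.
    cand = np.where(s_opt * x == -1)[0]
    order = cand[np.argsort(np.abs(h[cand]))]
    costs = 2.0 * np.abs(h[order])
    prefix = np.concatenate(([0.0], np.cumsum(costs)))
    rank = {int(j): r for r, j in enumerate(order)}

    def cavity_max(i, s):
        # max of sum_{j!=i} h_j w_j  s.t.  s*x[i] + sum_{j!=i} w_j x[j] >= theta
        base_v = V - abs(h[i])
        base_s = S - s_opt[i] * x[i] + s * x[i]
        deficit = theta - base_s
        if deficit <= 0:
            return base_v                 # constraint already satisfied
        k = (deficit + 1) // 2            # number of +2 flips needed
        avail = len(order) - (1 if i in rank else 0)
        if k > avail:
            return -INF                   # constraint unsatisfiable
        if i in rank and rank[i] < k:
            # i would be among the k cheapest flips but is excluded:
            # take the k+1 cheapest and refund i's own cost.
            return base_v - (prefix[k + 1] - costs[rank[i]])
        return base_v - prefix[k]

    return np.array([0.5 * (cavity_max(i, +1) - cavity_max(i, -1))
                     for i in range(N)])


def variable_messages(U, prior=0.0):
    """Max-Sum cavity fields h_{i->a} = prior_i + sum_{b!=a} u_{b->i},
    given the matrix U[a, i] of factor-to-variable messages."""
    return prior + U.sum(axis=0, keepdims=True) - U
```

A full sweep in this sketch would call `perceptron_factor_messages` once per pattern to fill a matrix `U[a, i]`, then update the cavity fields with `variable_messages(U, prior)` and iterate; after convergence the weights are read off as $w_i = \mathrm{sign}(\text{prior}_i + \sum_a u_{a\to i})$. The sketch omits the damping or reinforcement terms that message-passing schemes of this kind typically need to converge, and it does not cover the $q>2$ case with concave priors that the abstract claims can be handled at the same cost.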
File | Description | Type | License | Size | Format
---|---|---|---|---|---
1505.05401.pdf (not available) | Main article | Pre-print document | NOT PUBLIC - Private/restricted access | 462.34 kB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.