A Max-Sum algorithm for training discrete neural networks
Baldassi, Carlo
2015
Abstract
We present an efficient learning algorithm for the problem of training neural networks with discrete synapses, a well-known hard (NP-complete) discrete optimization problem. The algorithm is a variant of the so-called Max-Sum (MS) algorithm. In particular, we show how, for bounded integer weights with $q$ distinct states and an independent concave a priori distribution (e.g. $\ell_1$ regularization), the algorithm's time complexity can be made to scale as $O\left(N\log N\right)$ per node update, thus putting it on par with alternative schemes, such as Belief Propagation (BP), without resorting to approximations. Two special cases are of particular interest: binary synapses $W\in \left\{-1,1\right\}$ and ternary synapses $W\in \left\{-1,0,1\right\}$ with $\ell_0$ regularization. The algorithm we present performs as well as BP on binary perceptron learning problems, and may be better suited to addressing the problem on fully-connected two-layer networks, since the inherent symmetries of two-layer networks are naturally broken under the MS approach.
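The record contains only the abstract, not the paper's pseudocode. As an illustration of the kind of message-passing update the abstract describes, below is a minimal sketch of a Max-Sum update for the binary special case $W\in\{-1,1\}$, restricted to a single perceptron-like factor enforcing correct classification of one pattern. Everything here is a reconstruction under stated assumptions, not the paper's implementation: the function names, the NumPy representation, and the margin parameter `theta` are ours. The point of the sketch is that the constrained cavity maximizations reduce to picking the cheapest "flips" after one sort, which is the flavor of argument that makes an $O(N\log N)$ per-node update plausible in the binary case.

```python
import numpy as np

INF = 1e9  # large finite stand-in for infinity; keeps later message sums well-defined


def perceptron_factor_messages(h, x, theta=1):
    """Max-Sum messages u_{a->i} from one perceptron factor a to every
    binary weight w_i in {-1,+1}.

    h[j]  : incoming cavity field h_{j->a} (real).
    x[j]  : sigma^a * xi^a_j in {-1,+1}, so the factor enforces
            sum_j w_j x[j] >= theta (pattern a correctly classified).
    Returns u[i] = (max with w_i=+1  -  max with w_i=-1) / 2 of the
    constrained cavity objective sum_{j!=i} h_j w_j.
    The single sort below makes the whole update O(N log N).
    """
    h = np.asarray(h, dtype=float)
    x = np.asarray(x, dtype=int)
    N = len(h)

    s_opt = np.where(h >= 0, 1, -1)   # unconstrained argmax of sum_j h_j w_j
    V = np.abs(h).sum()               # its value
    S = int((s_opt * x).sum())        # its stability sum_j s_opt_j x_j

    # Flipping w_j with s_opt_j * x_j = -1 raises the stability by 2 at
    # cost 2|h_j|; sort those candidates by cost and precompute prefix
    # sums so the k cheapest flips are available in O(1) per query.
    cand = np.where(s_opt * x == -1)[0]
    order = cand[np.argsort(np.abs(h[cand]))]
    costs = 2.0 * np.abs(h[order])
    prefix = np.concatenate(([0.0], np.cumsum(costs)))
    rank = {int(j): r for r, j in enumerate(order)}

    def cavity_max(i, s):
        # max of sum_{j!=i} h_j w_j  s.t.  s*x[i] + sum_{j!=i} w_j x[j] >= theta
        base_v = V - abs(h[i])
        base_s = S - s_opt[i] * x[i] + s * x[i]
        deficit = theta - base_s
        if deficit <= 0:
            return base_v                 # constraint already satisfied
        k = (deficit + 1) // 2            # number of +2 flips needed
        avail = len(order) - (1 if i in rank else 0)
        if k > avail:
            return -INF                   # constraint unsatisfiable
        if i in rank and rank[i] < k:
            # i would be among the k cheapest flips but is excluded:
            # take the k+1 cheapest and refund i's own cost.
            return base_v - (prefix[k + 1] - costs[rank[i]])
        return base_v - prefix[k]

    return np.array([0.5 * (cavity_max(i, +1) - cavity_max(i, -1))
                     for i in range(N)])


def variable_messages(U, prior=0.0):
    """Max-Sum cavity fields h_{i->a} = prior_i + sum_{b!=a} u_{b->i},
    given the matrix U[a, i] of factor-to-variable messages."""
    return prior + U.sum(axis=0, keepdims=True) - U
```

A full sweep in this sketch would call `perceptron_factor_messages` once per pattern to fill a matrix `U[a, i]`, then update the cavity fields with `variable_messages(U, prior)` and iterate; after convergence the weights are read off as $w_i = \mathrm{sign}(\text{prior}_i + \sum_a u_{a\to i})$. The sketch omits the damping or reinforcement terms that message-passing schemes of this kind typically need to converge, and it does not cover the $q>2$ case with concave priors that the abstract claims can be handled at the same cost.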
File | Description | Type | License | Size | Format
---|---|---|---|---|---
1505.05401.pdf (not available) | Main article | Pre-print document | NOT PUBLIC - Private/restricted access | 462.34 kB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.