

MiS Preprint

72/2014

On the number of response regions of deep feedforward networks with piecewise linear activations

Razvan Pascanu, Guido Montúfar and Yoshua Bengio

Abstract

This paper explores the complexity of deep feedforward networks with linear pre-synaptic couplings and rectified linear activations. This is a contribution to the growing body of work contrasting the representational power of deep and shallow network architectures. In particular, we offer a framework, based on computational geometry, for comparing deep and shallow models that belong to the family of piecewise linear functions. We look at a deep rectifier multi-layer perceptron (MLP) with linear output units and compare it with a single-layer version of the model. In the asymptotic regime, when the number of inputs stays constant, if the shallow model has $k n$ hidden units and $n_{0}$ inputs, then the number of linear regions is $O(k^{n_{0}} n^{n_{0}})$. For a $k$-layer model with $n$ hidden units on each layer it is $\Omega(\lfloor n / n_{0} \rfloor^{k-1} n^{n_{0}})$. The number $\lfloor n / n_{0} \rfloor^{k-1}$ grows faster than $k^{n_{0}}$ when $n$ tends to infinity or when $k$ tends to infinity and $n \geq 2 n_{0}$.

Additionally, even when $k$ is small, if we restrict $n$ to be $2 n_{0}$, we can show that a deep model has considerably more linear regions than a shallow one. We consider this a first step towards understanding the complexity of these models and specifically towards providing suitable mathematical tools for future analysis.
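
The growth rates quoted in the abstract can be tabulated numerically. The following is a minimal sketch, not code from the preprint: it evaluates only the expressions inside the $O(\cdot)$ and $\Omega(\cdot)$ bounds, with all hidden constants dropped, so the values are growth terms rather than actual region counts. The function names and the sample values of $k$, $n$ and $n_{0}$ are hypothetical and chosen for illustration.

```python
def shallow_growth_term(k: int, n: int, n0: int) -> int:
    """Term k^n0 * n^n0 inside the O(.) bound for a shallow model
    with k*n hidden units and n0 inputs (constants omitted)."""
    return (k ** n0) * (n ** n0)


def deep_growth_term(k: int, n: int, n0: int) -> int:
    """Term floor(n/n0)^(k-1) * n^n0 inside the Omega(.) bound for a
    k-layer model with n hidden units per layer (constants omitted)."""
    return ((n // n0) ** (k - 1)) * (n ** n0)


if __name__ == "__main__":
    n0 = 2        # illustrative number of inputs (assumption)
    n = 2 * n0    # width at the threshold n >= 2 * n0 from the abstract
    print("k  shallow_term  deep_term")
    for k in (2, 4, 8, 16):
        s = shallow_growth_term(k, n, n0)
        d = deep_growth_term(k, n, n0)
        print(f"{k:<2} {s:<13} {d}")
```

With $n = 2 n_{0}$ the deep term is exponential in $k$ while the shallow term is polynomial in $k$, which is the asymptotic comparison stated above. Because constants are omitted, the table says nothing about the small-$k$ regime; that case relies on the finer analysis in the preprint itself.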


Received: 29.07.14

Published: 30.07.14

MSC Codes: 68R99, 05A99, 82C32

Keywords: Deep learning, artificial neural network, rectifier unit, hyperplane arrangement, representational power

Related publications

inBook

2014 Repository Open Access

Razvan Pascanu, Guido Montúfar and Yoshua Bengio

On the number of inference regions of deep feed forward networks with piece-wise linear activations

In: Second international conference on learning representations - ICLR 2014 : 14-16 April 2014, Banff, Canada
Banff : ICLR, 2014.

arXiv: 1312.6098
Link: openreview.net