Tackling neural network expressivity via polytopes
- Christoph Hertrich (TU Berlin)
Abstract
This talk deals with polytope theory in the context of an open problem concerning the expressivity of feedforward neural networks (NNs) with rectified linear unit (ReLU) activations. While it has been shown that NNs with logarithmic depth (in the input dimension) can represent any continuous piecewise linear (CPWL) function, it is still unknown whether there exists any CPWL function that cannot be represented with three layers. In the first part of the talk, we present a first step towards depth lower bounds: based on mixed-integer linear programming, we show that a 3-layer NN cannot compute the maximum of five scalar inputs, under a natural yet unproven assumption. In the second part, we explain how polytope theory might help to tackle the open problem via Newton polytopes of convex CPWL functions (also known as tropical signomials). This is joint work in progress with Amitabh Basu, Marco Di Summa, and Martin Skutella.
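As background (a standard construction, not one of the new results announced in this talk), the logarithmic-depth bound mentioned above can be traced to the identity
\[
  \max\{x, y\} \;=\; y + \max\{x - y,\, 0\} \;=\; y + \mathrm{ReLU}(x - y),
\]
so a single ReLU layer suffices to compute the maximum of two numbers. Nesting this identity along a balanced binary tree, e.g.
\[
  \max\{x_1, x_2, x_3, x_4\} \;=\; \max\bigl\{\max\{x_1, x_2\},\, \max\{x_3, x_4\}\bigr\},
\]
computes the maximum of $2^k$ inputs with $k$ ReLU layers, i.e. with depth logarithmic in the number of inputs. The question addressed in the first part of the talk is whether this depth can be beaten, specifically whether a 3-layer NN can compute the maximum of five inputs.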