Talk

ReLU transformers and piecewise polynomials

  • Zehua Lai (UT Austin)
Live Stream

Abstract

We highlight a perhaps important but hitherto unobserved insight: the attention module in a ReLU-transformer is a cubic spline. Viewed in this manner, this mysterious but critical component of a transformer becomes a natural development of an old notion deeply entrenched in classical approximation theory. Conversely, if we assume the Pierce--Birkhoff conjecture, then every spline is also an encoder. Together, these observations give a satisfying answer to the question of the mathematical structure of ReLU-transformers.
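To make the spline claim concrete, here is a minimal numerical sketch (our own toy construction, not the speaker's): in one dimension, replace the softmax in a scalar attention score with ReLU. Query, key, and value are each linear in the input x, so the output ReLU(q·k)·v is a piecewise polynomial of degree at most 3. All weights below (wq, wk, wv) are hypothetical.

```python
import numpy as np

# ReLU applied elementwise
relu = lambda t: np.maximum(t, 0.0)

wq, wk, wv = 0.7, 1.3, 2.0  # hypothetical query/key/value weights

def relu_attention(x):
    q, k, v = wq * x, wk * x, wv * x  # each linear in x
    return relu(q * k) * v            # ReLU(score) * value: degree <= 3 in x

xs = np.linspace(-2.0, 2.0, 9)
ys = relu_attention(xs)

# Where the score wq*wk*x^2 is clipped to 0 by ReLU, the output is 0;
# elsewhere it equals the cubic wq*wk*wv * x^3 -- a piecewise cubic.
score = wq * wk * xs**2
expected = np.where(score > 0, wq * wk * wv * xs**3, 0.0)
assert np.allclose(ys, expected)
```

With higher-dimensional inputs the pieces become polyhedral regions cut out by the ReLU, which is exactly the setting of the Pierce--Birkhoff conjecture on piecewise-polynomial functions.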

seminar
07.11.24 – 19.12.24

Math Machine Learning seminar MPI MIS + UCLA

MPI for Mathematics in the Sciences

Katharina Matschke, MPI for Mathematics in the Sciences (contact via mail)
