ReLU transformers and piecewise polynomials

  • Zehua Lai (UT Austin)

Abstract

We highlight a perhaps important but hitherto unobserved insight: the attention module in a ReLU-transformer is a cubic spline. Viewed in this manner, this mysterious but critical component of a transformer becomes a natural development of an old notion deeply entrenched in classical approximation theory. Conversely, if we assume the Pierce--Birkhoff conjecture, then every spline is also an encoder. This gives a satisfying characterization of the mathematical structure of ReLU-transformers.
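The cubic-spline observation can be illustrated numerically. The sketch below is my own minimal construction, not the speaker's notation: the softmax in attention is replaced by ReLU, giving attn(X) = ReLU(X Wq (X Wk)^T) X Wv. On any region where the ReLU activation pattern is fixed, each output entry is a degree-3 polynomial in the entries of X, so the map as a whole is piecewise cubic, i.e. a spline.

```python
import numpy as np

def relu_attention(X, Wq, Wk, Wv):
    # ReLU in place of the usual softmax normalization
    scores = np.maximum((X @ Wq) @ (X @ Wk).T, 0.0)
    return scores @ (X @ Wv)

# Move X along a line X(t) = X0 + t*D chosen so that no ReLU argument
# changes sign (all entries stay positive); on this single activation
# region the output should be exactly cubic in t.
n, d = 3, 2
Wq = Wk = Wv = np.eye(d)           # illustrative choice, not from the talk
X0 = np.ones((n, d))
D = np.arange(1.0, n * d + 1).reshape(n, d)  # positive direction: no sign flips

ts = np.linspace(0.0, 1.0, 25)
ys = np.array([relu_attention(X0 + t * D, Wq, Wk, Wv)[0, 0] for t in ts])

coeffs = np.polyfit(ts, ys, 3)     # a cubic fits this region exactly
residual = np.max(np.abs(np.polyval(coeffs, ts) - ys))
print(residual < 1e-8)             # crossing a ReLU boundary would start a new cubic piece
```

Fitting across a point where some score crosses zero would instead require two different cubics, one per activation region; this stitching of cubic pieces is exactly the spline structure the abstract refers to.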

Date: 22.05.25 / 05.06.25

Math Machine Learning seminar MPI MIS + UCLA

MPI for Mathematics in the Sciences Live Stream
