Max Planck Institute for Mathematics in the Sciences, Germany
On the Natural Gradient for Deep Learning
The natural gradient method is one of the most prominent information-geometric methods within the field of machine learning.
It was proposed by Amari in 1998 and uses the Fisher-Rao metric as the Riemannian metric for defining a gradient in optimisation tasks. Since then it has proved to be extremely efficient in the context of neural networks, reinforcement learning, and robotics. In recent years, attempts have been made to apply the natural gradient method to the training of deep neural networks. However, due to the huge number of parameters of such networks, the method is currently not directly applicable in this context. In my presentation, I outline ways to simplify the natural gradient for deep learning. The corresponding simplifications are related to the locality of learning associated with the underlying network structure.
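As a small illustration of the update described above, the following sketch applies a natural gradient step to a three-outcome categorical model in softmax coordinates. It is only a toy under stated assumptions: the model, the target distribution, the learning rate, and the damping term are all chosen here for illustration and do not come from the talk. It also hints at why the method does not scale directly, since the Fisher matrix has one row and column per parameter and a linear system must be solved at every step.

```python
import numpy as np

def softmax(theta):
    # Numerically stable softmax: maps parameters to a probability vector.
    z = np.exp(theta - np.max(theta))
    return z / z.sum()

def kl(t, p):
    # Kullback-Leibler divergence KL(t || p) for strictly positive t and p.
    return float(np.sum(t * np.log(t / p)))

def natural_gradient_step(theta, target, lr=0.5, damping=1e-3):
    # Hypothetical helper: one damped natural gradient step on KL(target || p_theta).
    p = softmax(theta)
    grad = p - target                      # Euclidean gradient with respect to theta
    fisher = np.diag(p) - np.outer(p, p)   # Fisher matrix of the categorical model in softmax coordinates
    # The Fisher matrix is singular along the all-ones direction, so a small
    # damping term (an assumption of this sketch) makes the system solvable.
    nat_grad = np.linalg.solve(fisher + damping * np.eye(len(theta)), grad)
    return theta - lr * nat_grad

theta = np.zeros(3)
target = np.array([0.7, 0.2, 0.1])
losses = []
for _ in range(20):
    losses.append(kl(target, softmax(theta)))
    theta = natural_gradient_step(theta, target)
```

With n parameters the solve costs on the order of n cubed operations, which is the practical obstacle for deep networks that the abstract refers to.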
Souriau-Fisher Metric and Higher Order Extension
We introduce the symplectic structure of information geometry based on Souriau’s Lie group thermodynamics model, with a covariant definition of Gibbs equilibrium via invariances through the co-adjoint action of a group on its moment space, defining physical observables such as energy, heat, and moment as purely geometrical objects. Using the geometric Planck temperature of the Souriau model and the notion of a symplectic cocycle, the Fisher metric is identified as a Souriau geometric heat capacity. The Souriau model is based on affine representations of a Lie group and its Lie algebra, which we compare with Koszul’s work on the G/K homogeneous space and the bijective correspondence between the set of G-invariant flat connections on G/K and the set of affine representations of the Lie algebra of G. The Souriau-Fisher metric is linked to the KKS (Kostant–Kirillov–Souriau) 2-form, which associates a canonical homogeneous symplectic manifold to the co-adjoint orbits. We conclude with a higher-order extension of the Souriau model based on the works of R. S. Ingarden and W. Jaworski. The Souriau model of statistical physics is validated as compatible with the Balian gauge model of thermodynamics.
Araya Inc., Tokyo, Japan
Free energy, empowerment, and predictive information compared
Imperial College London, United Kingdom
Information loss in phase transitions
There are universal features associated with a range of critical phenomena, such as thermal phase transitions or transitions associated with 'exceptional points' where eigenvectors of a matrix coalesce. These include, for example, the breakdown of adiabatic approximations, the curvature divergence of the parameter-space at critical points, or the loss of information concerning the initial state of the system as one passes through critical points. This talk will sketch some of these ideas that suggest that perhaps exceptional point physics is not all that different from critical phenomena in thermal physics.
University of Albany, USA
Entropic Dynamics: from Information Geometry to Quantum Geometry
Entropic Dynamics (ED) is a framework in which dynamical laws are derived as an application of entropic methods of inference. The dynamics of the probability distribution is driven by entropy subject to constraints that are eventually codified into the phase of a wave function. The central challenge is to identify the relevant physical constraints and, in particular, to specify how those constraints are themselves updated.
In this talk I describe how the information geometry of the space of probabilities is extended to the ensemble-phase space of probabilities and phases. The result is a highly symmetric Riemannian geometry that incorporates the symplectic and complex structures that characterize the geometry of quantum mechanics. The ED that preserves these structures is a Hamiltonian flow and the simplest Hamiltonian suggested by the extended metric leads to quantum mechanics. Thus, in the entropic dynamics framework, Hamiltonians and complex wave functions arise as the natural consequence of information geometry.
Goethe University Frankfurt, Germany
Self-organized limit cycles and chaotic attractors in the sensorimotor loop of simulated and real-world robots
Harsha Kallumadatil Velluva
Indian Institute of Technology Bombay, Mumbai, India
Some Information Inequalities for Statistical Inference
Romanian Institute of Science and Technology, Romania
Second-order Information Geometry on a Parametric Exponential Family
Max Planck Institute for Intelligent Systems, Germany
(joint work with Cristina Pinneri)
Towards controlled self-organized behavioral exploration
Nagoya Institute of Technology, Japan
Hessian structures on deformed exponential families II
Universiteit Antwerpen, Belgium
Deformed exponential families in statistical physics and beyond
University of Sydney, Australia
Thermodynamic Interpretation of Information Geometric Curvature
University of Fukui, Japan
Behavioral analysis of certain nonlinear diffusion equations via information geometry
Collegio Carlo Alberto, Italy
Information Geometry: finite vs infinite sample space
The scope of IG has been clearly defined in the monograph by S. Amari and H. Nagaoka, which emphasizes the crucial importance of the dual affine structure of regular parametric statistical models. Amari’s research program is still in progress today, as shown by the papers presented at this conference. One direction of research is the extension of the dual affine structure to non-parametric models. While the parametric assumption can fit most of the needs of Statistics and Machine Learning, this assumption is too restrictive in fields such as Statistical Physics, Stochastics, or Evolution Equations. It is an easy but useful exercise to retell the geometry of finite-dimensional models, such as the probability simplex or the Gaussian distribution, in a non-parametric language. In the first part of the talk, we give a non-parametric presentation of the IG of the finite probability simplex based on the idea of a statistical bundle. After that, we discuss some of the available proposals for generalizing the same set-up to technically more complex situations.
The University of Sydney, Australia
On thermodynamic efficiency of collective computation
Universidad Nacional Autonoma de Mexico (UNAM), Mexico
Self-organization conducted by the dynamics towards the attractor at the onset of chaos
We construct an all-inclusive statistical-mechanical model for self-organization based on the hierarchical properties of the nonlinear dynamics towards the attractors that define the period-doubling route to chaos [1-3]. The aforementioned dynamics imprints a sequential assemblage of the model that privileges progressively lower entropies, while a new set of configurations emerges due to the collective partitioning of the original system into secluded portions. The initial canonical balance between numbers of configurations and Boltzmann-Gibbs (BG) statistical weights is drastically altered and ultimately eliminated by the sequential actions of the attractor. However, the emerging set of configurations implies a different and novel entropy growth process that eventually upsets the original loss and has the capability of locking the system into a self-organized state with characteristics of criticality, therefore reminiscent in spirit of the so-called self-organized criticality [4,5].
Some specifics of the approach we develop are: We systematically eliminate access to configurations of an otherwise elementary thermal system model by progressively partitioning it into isolated portions until only a subset of configurations of vanishing measure remains. Each isolated portion becomes essentially a micro-canonical ensemble. The thermal system consists of a large number of (effective) degrees of freedom, each occupying entropy levels with the form of inverse powers of two. The sequential process replaces the original configurations by an emerging discrete scale-invariant set of ensemble configurations with allowed entropies that are necessarily inverse powers of two. In doing this we achieve the following results:
1) The constrained thermal system becomes a close analogue of the dynamics towards the multifractal attractor at the period-doubling onset of chaos.
2) The statistical-mechanical properties of the thermal system depart from those of the ordinary Boltzmann-Gibbs form and acquire features from q-statistics.
3) Redefinition of entropy levels as logarithms of the original ones recovers the BG scheme and the free energy Legendre transform property.
Furthermore, the sequences of actions on the entropies associated with the degrees of freedom have the following consequences:
i) Degrees of freedom are confined to very few configurations of ever-decreasing entropies.
ii) The number of configurations is reduced from the initial exponential of the number of degrees of freedom to only one per micro-canonical ensemble, there being an equivalent exponential number of such ensembles for the entire system. As the thermodynamic limit is approached, a set of initial configurations with nonzero measure reduces to a set of vanishing measure. But in the process a new set of numbers of configurations develops. These are given by the degeneracies of the micro-canonical ensembles.
iii) The new emerging numbers of configurations grow more slowly than exponentially with the size of the system as they are binomial coefficients.
As with any partition function, the sum of the numbers of configurations times their probabilities can neither vanish nor diverge but must equal unity. Initially (the BG case) the numbers grow exponentially and the weights decrease exponentially with system size. After the actions of the attractor, the numbers of new ensemble degeneracies grow more slowly than exponentially and the new weights must now decrease accordingly. The new “canonical” partition function acquires q-exponential weights typical of q-statistics. The precise value of the tuning parameter q and its relationship with the inverse temperature are determined.
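For readers unfamiliar with q-statistics, the q-exponential mentioned above is usually defined as follows. This is the standard definition from the literature rather than a formula taken from this abstract; x and q are generic symbols, [a]_+ denotes max(a, 0), and the specific value of q obtained in the talk is not reproduced here.

```latex
\exp_q(x) = \bigl[\, 1 + (1-q)\,x \,\bigr]_+^{\frac{1}{1-q}},
\qquad
\lim_{q \to 1} \exp_q(x) = e^{x}
```

For q > 1 and negative argument this function decays as a power law rather than exponentially, which is consistent with weights that compensate degeneracies growing more slowly than exponentially.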
[1] Robledo, A., Moyano, L.G., “q-deformed statistical-mechanical property in the dynamics of trajectories en route to the Feigenbaum attractor”, Physical Review E 77, 032613 (2008).
[2] Robledo, A., “A dynamical model for hierarchy and modular organization: The trajectories en route to the attractor at the transition to chaos”, Journal of Physics: Conf. Ser. 394, 012007 (2012).
[3] Diaz-Ruelas, A., Robledo, A., “Emergent statistical-mechanical structure in the dynamics along the period-doubling route to chaos”, Europhysics Letters 105, 40004 (2014).
[4] Bak, P., “How Nature Works” (Copernicus, New York, 1996).
[5] Watkins, N.W., Pruessner, G., Chapman, S.C., Crosby, N.B., Jensen, J.K., “25 Years of Self-organized Criticality: Concepts and Controversies”, Space Science Reviews 198, 3 (2016).
Antonio Maria Scarfone
Istituto dei Sistemi Complessi (ISC - CNR), Italy
What about statistical physics of asymptotical scale free distributions?
University of Sydney, Australia
Relating information dynamics and stochastic thermodynamics
Chiba University, Japan
α-divergence appeared as rate function in large deviation estimate
Ibaraki University, Japan
On the canonical distributions of the thermal particles in a weakly confining potential
Santa Fe Institute, USA
(joint work with Artemy Kolchinsky)
Fundamental limits on the thermodynamics of circuits
The thermodynamics of computation specifies the minimum amount that entropy must increase in the environment of any physical system that implements a given computation, when there are no constraints on how the system operates (the so-called “Landauer limit”). However, common engineered computers use digital circuits that physically connect separate gates in a specific topology. Each gate in the circuit performs its own “local” computation, with no a priori constraints on how it operates. In contrast, the circuit’s topology introduces constraints on how the aggregate physical system implementing the overall “global” computation can operate. These constraints cause additional minimal entropy increase in the environment of the overall circuit, beyond that caused by the individual gates.
Here we analyze the relationship between a circuit’s topology and this additional entropy increase, which we call the “circuit Landauer cost”. We also compute a second kind of circuit cost, the “circuit mismatch cost”. This is the extra entropy that is generated if a physical system is designed to achieve minimal entropy production for a particular distribution q over its inputs, but is instead used with an input distribution p that differs from q.
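The two single-gate quantities underlying these costs can be sketched numerically. The example below assumes the forms commonly used in the stochastic thermodynamics literature, which the abstract itself does not spell out: the Landauer cost as the drop in Shannon entropy from input to output, and the mismatch cost as the drop in Kullback-Leibler divergence between the actual distribution p and the design distribution q. The AND gate, the two distributions, and all function names are hypothetical choices for illustration, and the circuit-level costs discussed in the talk are not computed here.

```python
import numpy as np

def entropy_bits(p):
    # Shannon entropy in bits, ignoring zero-probability states.
    p = np.asarray(p, dtype=float)
    nz = p[p > 0]
    return float(-np.sum(nz * np.log2(nz)))

def kl_bits(p, q):
    # Kullback-Leibler divergence KL(p || q) in bits; assumes q > 0 wherever p > 0.
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))

def push_forward(p_in, outputs, n_out):
    # Output distribution of a deterministic gate: add up the probability
    # of every input state that maps to the same output state.
    p_out = np.zeros(n_out)
    for prob, y in zip(p_in, outputs):
        p_out[y] += prob
    return p_out

and_gate = [0, 0, 0, 1]                  # outputs for inputs 00, 01, 10, 11
q_in = np.full(4, 0.25)                  # distribution the gate is assumed to be designed for
p_in = np.array([0.1, 0.2, 0.3, 0.4])    # distribution the gate is actually run on

p_out = push_forward(p_in, and_gate, 2)
q_out = push_forward(q_in, and_gate, 2)

# Entropy drop and divergence drop, both in bits.
landauer_bits = entropy_bits(p_in) - entropy_bits(p_out)
mismatch_bits = kl_bits(p_in, q_in) - kl_bits(p_out, q_out)
```

Under these assumptions both single-gate quantities are nonnegative for a deterministic gate; multiplying a value in bits by kT ln 2 converts it to an energy.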
We show that whereas the circuit Landauer cost cannot be negative, circuits can have positive or negative mismatch cost. In fact the total circuit cost (given by summing the two types of cost) can be positive or negative. Thus, the total amount of entropy increase in the environment can be either larger or smaller when a particular computation is implemented with a circuit.
Furthermore, in general different circuits computing the same Boolean function have both different Landauer costs and different mismatch costs. This provides a new set of challenges, never before considered, for how to design a circuit to implement a given computation with minimal thermodynamic cost. As a first step in addressing these challenges, we use tools from the computer science field of circuit complexity to analyze the scaling of thermodynamic costs for different computational tasks.
G. Çiğdem Yalçın
Istanbul University, Turkey
Entropies associated with attractors at transitions to chaos in low-dimensional nonlinear dynamics
MPI for Mathematics in the Sciences
University of Sydney (Australia)
- Nihat Ay, MPI for Mathematics in the Sciences, Leipzig (Germany)
- Domenico Felice, MPI for Mathematics in the Sciences, Leipzig (Germany)
- Carlos Gershenson, Universidad Nacional Autónoma de México, Computer Sciences Department, Mexico City (Mexico)
- Paolo Gibilisco, Università degli Studi di Roma "Tor Vergata", Facoltà di Economia, Roma (Italy)
- Daniel Polani, University of Hertfordshire, Department of Computer Science, Hatfield (United Kingdom)
- Mikhail Prokopenko, University of Sydney, Sydney (Australia)
- Richard Spinney, University of Sydney, Sydney (Australia)
- Justin Werfel, Harvard University, Cambridge (USA)
- Larry Yaeger, Google Inc., San Francisco (USA)
- G. Çiğdem Yalçın, İstanbul Üniversitesi, İstanbul (Turkey)
Administrative Contact: Antje Vandenberg
MPI for Mathematics in the Sciences