Reinforcement Learning in Complementarity Game and Population Dynamics
Jürgen Jost and Wei Li
We let different reinforcement learning schemes compete in a complementarity game played between members of two populations.
This leads us to a new version of Roth-Erev (NRE) reinforcement learning. In the NRE scheme, the probability that a player $n$ chooses an action $k$ at time $t$ is proportional to the accumulated rescaled reward that the player obtained from playing $k$ during the time steps prior to $t$. The rescaled reward is a power law of the payoff, and the optimal value of the exponent is 1.5. NRE reinforcement learning outperforms the original Roth-Erev, Bush-Mosteller, and SoftMax reinforcement learning schemes when each is run with its optimal parameters. NRE reinforcement learning also performs better than most evolutionary strategies, except for the simplest ones, which have the advantage of converging most quickly to some favorable fixed point.
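The selection rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives only the proportionality rule and the exponent 1.5, so the class name `NRELearner`, the exact rescaling `payoff ** exponent`, the initial propensity of 1.0, and the clipping of negative payoffs to zero are all assumptions made for illustration.

```python
import random


class NRELearner:
    """Illustrative learner: action probabilities proportional to accumulated rescaled rewards."""

    def __init__(self, n_actions, exponent=1.5, initial_propensity=1.0):
        # exponent=1.5 is the optimal value reported in the abstract.
        # initial_propensity is an assumption so that every action can be chosen at t = 0.
        self.exponent = exponent
        self.propensities = [initial_propensity] * n_actions

    def action_probabilities(self):
        # Probability of action k is its propensity divided by the sum of all propensities.
        total = sum(self.propensities)
        return [q / total for q in self.propensities]

    def choose_action(self, rng=random):
        # Sample one action index with weights equal to the propensities.
        actions = range(len(self.propensities))
        return rng.choices(actions, weights=self.propensities, k=1)[0]

    def update(self, action, payoff):
        # Assumed rescaling: reward = payoff ** exponent (a power law of the payoff).
        # Negative payoffs are clipped to zero here, which is also an assumption.
        self.propensities[action] += max(payoff, 0.0) ** self.exponent


if __name__ == "__main__":
    learner = NRELearner(n_actions=3)
    action = learner.choose_action()
    learner.update(action, payoff=4.0)
    print(action, learner.action_probabilities())
```

With an exponent above 1, larger payoffs are reinforced more than proportionally, so under these assumptions a single high payoff shifts the choice probabilities more strongly than several small ones of the same total.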