Reinforcement Learning in Complementarity Game and Population Dynamics
Jürgen Jost and Wei Li
Submission date: 29. Jan. 2013
published in: Physical Review E 89 (2014) 2, art. no. 022113
DOI number (of the published article): 10.1103/PhysRevE.89.022113
PACS numbers: 02.50.Le, 89.75.Fb, 89.90.+n
Keywords and phrases: Reinforcement Learning, population dynamics
We let different reinforcement learning schemes compete in a complementarity game played between members of two populations. This leads us to a new version of Roth-Erev (NRE) reinforcement learning. In the NRE scheme, the probability that a player n chooses action k at time t is proportional to the rescaled reward accumulated by playing k during the time steps prior to t. The rescaled reward is a power law of the payoff, with an optimal power exponent of 1.5. NRE reinforcement learning outperforms the original Roth-Erev, Bush-Mosteller, and SoftMax reinforcement learning schemes when all of them use optimal parameters. It also performs better than most evolutionary strategies, except for the simplest ones, which have the advantage of converging most quickly to some favorable fixed point.
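The NRE update rule described in the abstract can be sketched as follows. This is a minimal illustration, assuming the rescaled reward is payoff raised to the power 1.5 and that each player keeps a per-action propensity equal to the accumulated rescaled reward; the class name, initialization value, and method names are hypothetical, since the abstract does not specify implementation details.

```python
import random

ALPHA = 1.5  # power-law exponent of the rescaled reward (optimal value per the abstract)

class NREPlayer:
    """Sketch of New Roth-Erev (NRE) reinforcement learning: the
    probability of choosing action k is proportional to the rescaled
    rewards accumulated by playing k in earlier time steps."""

    def __init__(self, n_actions, init=1.0):
        # Small positive initial propensities so every action can be sampled.
        self.propensity = [init] * n_actions

    def choose(self, rng=random):
        # Sample an action with probability proportional to its propensity.
        total = sum(self.propensity)
        r = rng.random() * total
        acc = 0.0
        for k, p in enumerate(self.propensity):
            acc += p
            if r <= acc:
                return k
        return len(self.propensity) - 1

    def update(self, action, payoff):
        # Accumulate the rescaled (power-law) reward for the action played.
        self.propensity[action] += payoff ** ALPHA
```

In a simulation of the complementarity game, each player would call `choose` to pick an action, receive a payoff from the interaction with a member of the other population, and then call `update` before the next round.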