Nonconvergence to saddle boundary points under perturbed reinforcement learning

Author

Summary, in English

For several reinforcement learning models in strategic-form games, convergence to action profiles that are not Nash equilibria may occur with positive probability under certain conditions on the payoff function. In this paper, we explore how an alternative reinforcement learning model, in which the strategy of each agent is perturbed by a strategy-dependent perturbation (or mutation) function, may exclude convergence to non-Nash pure strategy profiles. This approach extends prior analysis of reinforcement learning in games that addresses the issue of convergence to saddle boundary points. It further provides a framework under which the effect of mutations can be analyzed in the context of reinforcement learning.

Publishing year

2015

Language

English

Pages

667-699

Publication/Series

International Journal of Game Theory

Volume

44

Issue

3

Document type

Journal article

Publisher

Springer

Topic

  • Control Engineering

Keywords

  • Learning in games
  • Reinforcement learning
  • Replicator dynamics

Status

Published

ISBN/ISSN/Other

  • ISSN: 1432-1270