Cooperative Multi-Agent Reinforcement Learning with Approximate Model Learning

Young Joon Park, Young Jae Lee, Seoung Bum Kim

Research output: Contribution to journal › Article › peer-review

11 Citations (Scopus)


In multi-agent reinforcement learning, agents must learn a communication protocol to optimize collaborative policies and to mitigate unstable learning. Existing actor-critic methods address the communication problem among agents. However, they have difficulty improving sample efficiency and learning robust policies because the dynamics and nonstationarity of the environment, induced by the changing policies of the other agents, are hard to capture. We propose a method for learning cooperative policies in multi-agent environments that accounts for communication among agents. The proposed method combines recurrent neural network-based actor-critic networks with deterministic policy gradients to centrally train decentralized policies. The actor networks enable the agents to communicate over forward and backward paths and to determine their subsequent actions. The critic network helps train the actor networks by sending gradient signals to each actor according to its contribution to the global reward. To address partial observability and unstable learning, we propose auxiliary prediction networks that approximate the state transitions and the reward function. We used multi-agent environments to demonstrate the usefulness and superiority of the proposed method, comparing it with existing multi-agent reinforcement learning methods in terms of both learning efficiency and goal achievement in the test phase. The results demonstrate that the proposed method outperforms the alternatives.
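The abstract's "approximate model learning" idea, auxiliary networks trained to predict the next state and the reward from observed transitions, can be illustrated with a minimal sketch. This is not the paper's architecture (the paper uses recurrent actor-critic networks); it is a hypothetical toy in which linear predictors for the joint next state and the shared reward are fit by SGD on random transitions, showing how such auxiliary prediction losses are trained. All names (`W_dyn`, `w_rew`, the toy dynamics) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-agent toy: joint state s (4-dim), joint action a (2-dim).
# Hidden "true" dynamics and reward, unknown to the learner.
A_true = rng.normal(scale=0.3, size=(4, 4))
B_true = rng.normal(scale=0.3, size=(4, 2))
w_true = rng.normal(size=6)

def step(s, a):
    """Environment transition: next joint state and shared global reward."""
    s_next = A_true @ s + B_true @ a
    r = w_true @ np.concatenate([s, a])
    return s_next, r

# Auxiliary prediction model: linear maps approximating the state
# transition and the reward function from the joint state-action pair.
W_dyn = np.zeros((4, 6))   # predicts s' from [s, a]
w_rew = np.zeros(6)        # predicts r from [s, a]
lr = 0.02

def model_loss(batch):
    """Mean squared prediction error over a batch of transitions."""
    loss = 0.0
    for s, a, s_next, r in batch:
        x = np.concatenate([s, a])
        loss += np.mean((W_dyn @ x - s_next) ** 2) + (w_rew @ x - r) ** 2
    return loss / len(batch)

# Collect transitions with random exploratory joint actions.
batch = []
for _ in range(200):
    s = rng.normal(size=4)
    a = rng.normal(size=2)
    s_next, r = step(s, a)
    batch.append((s, a, s_next, r))

initial = model_loss(batch)
for _ in range(500):
    # SGD on the squared prediction errors of both auxiliary heads.
    s, a, s_next, r = batch[rng.integers(len(batch))]
    x = np.concatenate([s, a])
    err_dyn = W_dyn @ x - s_next
    W_dyn -= lr * np.outer(err_dyn, x)
    err_rew = w_rew @ x - r
    w_rew -= lr * err_rew * x
final = model_loss(batch)
print(initial, "->", final)
```

In the paper these prediction losses are auxiliary objectives that shape the shared representation of the actor-critic networks; here they are trained in isolation only to show that the transition and reward predictors improve from experience.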

Original language: English
Article number: 9133381
Pages (from-to): 125389-125400
Number of pages: 12
Journal: IEEE Access
Publication status: Published - 2020

Bibliographical note

Funding Information:
This work was supported in part by the National Research Foundation of Korea funded by the Korea Government through the Ministry of Science and ICT (MSIT) under Grant NRF-2019R1A4A1024732, in part by the Ministry of Culture, Sports, and Tourism, and in part by the Korea Creative Content Agency, through the Culture Technology Research and Development Program, in 2019.

Publisher Copyright:
© 2013 IEEE.


Keywords

  • Reinforcement learning
  • actor-critic method
  • deterministic policy gradient
  • model-free method
  • multi-agent cooperation
  • multi-agent system

ASJC Scopus subject areas

  • General Engineering
  • General Materials Science
  • Electrical and Electronic Engineering
  • General Computer Science

