
Multi-Level Credit Assignment for Cooperative Multi-Agent Reinforcement Learning

  • School of Electronics and Information Engineering, Harbin Institute of Technology

Research output: Contribution to journal › Article › peer-review

Abstract

Multi-agent reinforcement learning (MARL) has grown steadily in popularity over recent decades, and the complexity of real-world environments makes high-level cooperation increasingly necessary. However, the multi-agent credit assignment problem, which is the main obstacle to high-level coordination, has still not been addressed properly. Although many methods have been proposed, none of them performs credit assignment across multiple levels. In this paper, we propose an approach that learns a better credit assignment scheme by assigning credit across multiple levels. First, we propose a hierarchical model that consists of a manager level and a worker level. The manager level incorporates a dilated Gated Recurrent Unit (GRU) to focus on high-level plans, and the worker level uses a GRU to execute primitive actions conditioned on those high-level plans. Then, one centralized critic is designed for each level to learn that level's credit assignment scheme. On this basis, we construct a novel hierarchical MARL algorithm, named MLCA, which achieves multi-level credit assignment. We also conduct experiments on three classical and challenging tasks to compare the proposed algorithm against three baseline methods. The results show that our method substantially improves performance on all maps that require high-level cooperation.
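As a rough illustration of the manager/worker structure described in the abstract, the sketch below pairs a dilated GRU (manager) with an ordinary GRU cell (worker). It is not the MLCA implementation. The abstract does not say how the dilation works or how the worker is conditioned on the plan, so two assumptions are made here: the dilated GRU keeps `dilation` separate hidden states and updates only state `t % dilation` at step `t` (the scheme used for dilated recurrent cores in FeUdal Networks), and the worker is conditioned by concatenating the manager's output to its observation. The class names `GRUCell` and `DilatedGRU`, the sizes, and the random weights are all hypothetical; biases, the centralized critics, and training are omitted.

```python
import math
import random


def _sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class GRUCell:
    """Minimal bias-free GRU cell on plain Python lists (hypothetical helper)."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = random.Random(seed)
        cols = input_size + hidden_size

        def init():
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                    for _ in range(hidden_size)]

        # One weight matrix per gate, applied to the concatenation [x, h].
        self.w_update, self.w_reset, self.w_cand = init(), init(), init()

    @staticmethod
    def _matvec(w, v):
        return [sum(wi * vi for wi, vi in zip(row, v)) for row in w]

    def step(self, x, h):
        z = [_sigmoid(a) for a in self._matvec(self.w_update, x + h)]
        r = [_sigmoid(a) for a in self._matvec(self.w_reset, x + h)]
        gated = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a) for a in self._matvec(self.w_cand, x + gated)]
        # Convex combination of the old state and the candidate state.
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


class DilatedGRU:
    """Assumed dilation scheme: step t updates only hidden state t % dilation."""

    def __init__(self, input_size, hidden_size, dilation, seed=0):
        self.cell = GRUCell(input_size, hidden_size, seed)
        self.dilation = dilation
        self.states = [[0.0] * hidden_size for _ in range(dilation)]
        self.t = 0

    def step(self, x):
        idx = self.t % self.dilation
        self.states[idx] = self.cell.step(x, self.states[idx])
        self.t += 1
        return self.states[idx]


# Toy rollout for one agent: the manager emits a plan vector, and the
# worker consumes [observation, plan] (assumed form of conditioning).
obs_size, plan_size, hidden_size = 3, 4, 5
manager = DilatedGRU(obs_size, plan_size, dilation=2)
worker = GRUCell(obs_size + plan_size, hidden_size, seed=1)
h = [0.0] * hidden_size
for t in range(4):
    obs = [0.1 * t, 0.2, -0.3]
    plan = manager.step(obs)
    h = worker.step(obs + plan, h)
```

Because each of the manager's hidden states is touched only every `dilation` steps, it carries information over a longer horizon than the worker's state, which is one plausible reading of "focus on high-level plans".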

Original language: English
Article number: 6938
Journal: Applied Sciences (Switzerland)
Volume: 12
Issue number: 14
State: Published - Jul 2022
Externally published: Yes

Keywords

  • credit assignment
  • hierarchical MARL
  • multi-agent reinforcement learning
