A Ranking Scheme for Trust Region Multi-agent Reinforcement Learning

  • Harbin Institute of Technology

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

In multi-agent reinforcement learning (MARL), trust region (TR) methods are widely used because they effectively mitigate the nonstationarity of multi-agent systems and facilitate collaboration among diverse agent types. Based on the multi-agent advantage decomposition lemma, TR methods adopt a sequential update scheme (i.e., agents' policy networks are trained in a certain order). However, current TR methods lack a ranking scheme and train the agents in a random order, which results in suboptimal performance and large variance. To solve this issue, we formulate ranking criteria based on agents' observations (the input of agents' policy networks) and propose ranking schemes built on these criteria. Specifically, we avoid ranking agents with similar observations adjacent to each other for training, and we give higher priority to agents with more information in their observations. We extend our schemes to popular TR methods and evaluate them on a series of StarCraft II, Google Football and Multi-Agent MuJoCo tasks. The results show that our ranking schemes can enhance current TR methods in many tasks in terms of performance, efficiency or stability, indicating their modeling capability on both homogeneous and heterogeneous agent tasks.
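
The two ranking criteria stated in the abstract can be illustrated with a small sketch. The abstract does not say how observation similarity or information content is measured, nor how the two criteria are combined, so everything concrete here is an assumption made for illustration: cosine similarity as the similarity measure, the count of non-zero observation entries as the information proxy, a greedy ordering rule, and the hypothetical names `cosine_similarity`, `information_score` and `rank_agents`. It is not the authors' scheme.

```python
import math


def cosine_similarity(a, b):
    """Assumed similarity measure between two observation vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def information_score(obs):
    """Assumed information proxy: number of non-zero observation entries."""
    return sum(1 for x in obs if x != 0)


def rank_agents(observations):
    """Return a training order (list of agent indices).

    Greedy illustration of the two criteria:
    1. start with the agent whose observation carries the most information;
    2. at each step, pick the remaining agent least similar to the one just
       placed, breaking ties in favour of higher information.
    """
    remaining = list(range(len(observations)))
    if not remaining:
        return []
    first = max(remaining, key=lambda i: information_score(observations[i]))
    order = [first]
    remaining.remove(first)
    while remaining:
        prev = observations[order[-1]]
        nxt = min(
            remaining,
            key=lambda i: (
                cosine_similarity(prev, observations[i]),
                -information_score(observations[i]),
            ),
        )
        order.append(nxt)
        remaining.remove(nxt)
    return order


# Toy observations: agents 0 and 1 see nearly the same thing, as do 2 and 3.
demo_observations = [
    [1.0, 0.0, 0.0, 0.0],
    [1.0, 0.1, 0.0, 0.0],
    [0.0, 1.0, 1.0, 1.0],
    [0.0, 0.9, 1.0, 1.0],
]
order = rank_agents(demo_observations)
print(order)  # [2, 0, 3, 1]
```

In this toy case the resulting order interleaves the two groups, so no two agents with near-identical observations are updated back to back. A sequential TR update would then train the agents' policies in that order instead of a random one.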

Original language: English
Title of host publication: 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025 - Proceedings
Editors: Bhaskar D Rao, Isabel Trancoso, Gaurav Sharma, Neelesh B. Mehta
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350368741
State: Published - 2025
Event: 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025 - Hyderabad, India
Duration: 6 Apr 2025 to 11 Apr 2025

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print): 1520-6149

Conference

Conference: 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025
Country/Territory: India
City: Hyderabad
Period: 6/04/25 to 11/04/25

Keywords

  • MARL
  • Ranking
  • Reinforcement Learning
  • Trust Region
