IBPO: Solving 3D Strategy Game with the Intrinsic Reward

  • School of Computer Science and Technology, Harbin Institute of Technology

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

In recent years, deep reinforcement learning has achieved great success in many fields, especially in games, with systems such as AlphaGo, AlphaZero and AlphaStar. However, reward sparsity remains a problem in 3D strategy games, which have higher-dimensional state spaces and more complex game scenarios. To address this problem, we propose an intrinsic-based policy optimization algorithm (IBPO). The IBPO incorporates the intrinsic reward into the traditional policy, which is composed of the differential fusion mechanism and the modified value network. The experimental results show that our method outperforms previous methods on VizDoom.

Original language: English
Title of host publication: Advances in Smart Vehicular Technology, Transportation, Communication and Applications - Proceedings of VTCA 2021
Editors: Tsu-Yang Wu, Shaoquan Ni, Shu-Chuan Chu, Chi-Hua Chen, Margarita Favorskaya
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 257-264
Number of pages: 8
ISBN (Print): 9789811640384
DOIs
State: Published - 2022
Externally published: Yes
Event: 4th International Conference on Smart Vehicular Technology, Transportation, Communication and Applications, VTCA 2021 - Chengdu, China
Duration: 22 May 2021 to 24 May 2021

Publication series

Name: Smart Innovation, Systems and Technologies
Volume: 250
ISSN (Print): 2190-3018
ISSN (Electronic): 2190-3026

Conference

Conference: 4th International Conference on Smart Vehicular Technology, Transportation, Communication and Applications, VTCA 2021
Country/Territory: China
City: Chengdu
Period: 22/05/21 to 24/05/21

Keywords

  • Deep reinforcement learning
  • Game
  • Intrinsic reward
  • Reward sparsity
