
Online Finite-Horizon ADP Algorithm for Solving Non-Cooperative Differential Games

  • Yingning Gao
  • Kemao Ma*
  • *Corresponding author for this work
  • Harbin Institute of Technology

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

This paper investigates an online reinforcement learning solution to finite-horizon non-cooperative differential games. There are two main challenges. First, the Hamilton-Jacobi-Isaacs equation is time-varying. Second, with respect to parameter convergence, conventional learning approaches that rely on satisfying a persistence of excitation condition may cause undesired oscillations in the system states and degrade system performance, especially in finite-horizon settings. To address these challenges, we develop a finite-horizon adaptive dynamic programming algorithm for continuous-time systems, in which the idea of experience simulation from the concurrent learning technique is adopted to guarantee fast online parameter convergence. Boundedness of the system states and of the weight estimation errors is proved. Simulation results demonstrate the effectiveness of the presented method.
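
For orientation, the sketch below shows one common textbook form of a finite-horizon Hamilton-Jacobi-Isaacs equation, to illustrate why the abstract calls it time-varying. It is not taken from the paper: the two-player zero-sum setting, the input-affine dynamics, and the symbols f, g, k, Q, R, \gamma, \phi and T are all assumptions made for illustration only.

```latex
% Assumed (hypothetical) setting: zero-sum game with control u and disturbance d
%   \dot{x} = f(x) + g(x)\,u + k(x)\,d, \qquad t \in [t_0, T]
%   J = \phi(x(T)) + \int_{t_0}^{T} \big( Q(x) + u^\top R\,u - \gamma^2 d^\top d \big)\, dt
%
% Finite-horizon HJI equation for the value function V(x,t):
\begin{aligned}
-\frac{\partial V}{\partial t}(x,t)
  &= \min_{u}\,\max_{d}\,\Big[\, Q(x) + u^\top R\,u - \gamma^2 d^\top d
     + \nabla_x V(x,t)^\top \big( f(x) + g(x)\,u + k(x)\,d \big) \Big], \\
V(x,T) &= \phi(x).
\end{aligned}
```

Under these assumptions, the explicit \partial V / \partial t term and the terminal condition at t = T make the value function depend on time as well as on the state. In the infinite-horizon case with time-invariant dynamics and cost, the value function is stationary and that term vanishes.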

Original language: English
Title of host publication: Proceedings of the 36th Chinese Control and Decision Conference, CCDC 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1364-1369
Number of pages: 6
ISBN (Electronic): 9798350387780
State: Published - 2024
Event: 36th Chinese Control and Decision Conference, CCDC 2024 - Xi'an, China
Duration: 25 May 2024 – 27 May 2024

Publication series

Name: Proceedings of the 36th Chinese Control and Decision Conference, CCDC 2024

Conference

Conference: 36th Chinese Control and Decision Conference, CCDC 2024
Country/Territory: China
City: Xi'an
Period: 25/05/24 – 27/05/24

Keywords

  • adaptive dynamic programming
  • concurrent learning
  • finite-horizon differential games
  • reinforcement learning
