TY - GEN
T1 - Online Finite-Horizon ADP Algorithm for Solving Non-Cooperative Differential Games
AU - Gao, Yingning
AU - Ma, Kemao
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - This paper investigates an online reinforcement learning solution to finite-horizon non-cooperative differential games. The main challenges are twofold: the Hamilton-Jacobi-Isaacs equation is time-varying, and, with respect to parameter convergence, conventional learning approaches that require the persistence of excitation condition to be satisfied may cause undesired oscillations in the system states and degrade system performance, especially in finite-horizon settings. To address these challenges, we develop a finite-horizon adaptive dynamic programming algorithm for continuous-time systems, in which the simulation-of-experience idea from concurrent learning is adopted to guarantee fast online parameter convergence. Boundedness of the system states and the weight estimation errors is proved. Simulation results demonstrate the effectiveness of the presented method.
AB - This paper investigates an online reinforcement learning solution to finite-horizon non-cooperative differential games. The main challenges are twofold: the Hamilton-Jacobi-Isaacs equation is time-varying, and, with respect to parameter convergence, conventional learning approaches that require the persistence of excitation condition to be satisfied may cause undesired oscillations in the system states and degrade system performance, especially in finite-horizon settings. To address these challenges, we develop a finite-horizon adaptive dynamic programming algorithm for continuous-time systems, in which the simulation-of-experience idea from concurrent learning is adopted to guarantee fast online parameter convergence. Boundedness of the system states and the weight estimation errors is proved. Simulation results demonstrate the effectiveness of the presented method.
KW - adaptive dynamic programming
KW - concurrent learning
KW - finite-horizon differential games
KW - reinforcement learning
UR - https://www.scopus.com/pages/publications/85200390832
U2 - 10.1109/CCDC62350.2024.10587453
DO - 10.1109/CCDC62350.2024.10587453
M3 - Conference contribution
AN - SCOPUS:85200390832
T3 - Proceedings of the 36th Chinese Control and Decision Conference, CCDC 2024
SP - 1364
EP - 1369
BT - Proceedings of the 36th Chinese Control and Decision Conference, CCDC 2024
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 36th Chinese Control and Decision Conference, CCDC 2024
Y2 - 25 May 2024 through 27 May 2024
ER -