Intelligent Online Multiconstrained Reentry Guidance Based on Hindsight Experience Replay

  • School of Astronautics, Harbin Institute of Technology
  • Beijing Institute of Aerospace Technology

Research output: Contribution to journal, Article, peer-reviewed

Abstract

Traditional guidance algorithms for hypersonic glide vehicles struggle to meet real-time requirements while remaining robust to multiple deviations or tasks. In this paper, an intelligent online multiconstrained reentry guidance method is proposed to substantially reduce the computational burden and to remain effective under multiple constraints. First, a simulation environment for reentry is built, including the dynamics, the multiple constraints, and the control variables. Unlike traditional decoupling methods, the bank angle command, including both its magnitude and its sign, is designed as the sole guidance variable. Second, a policy neural network is designed to output end-to-end guidance commands. By formulating the reentry process as a Markov Decision Process (MDP), the policy network can be trained with deep reinforcement learning (DRL). To address the sparse-reward problem caused by the multiple constraints, an improved Hindsight Experience Replay (HER) method is adaptively combined with the Deep Deterministic Policy Gradient (DDPG) algorithm by transforming the multiple constraints into multiple goals. As a result, the new training algorithm makes better use of data from failed episodes and converges faster. Finally, simulations of typical scenarios show that the policy network in the proposed guidance outputs effective commands in much less time than the traditional method. The guidance is robust to initial bias, different targets, and online aerodynamic deviation.
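The idea of turning constraints into goals so that failed episodes still yield useful training data can be illustrated with a minimal sketch of standard HER relabeling (the "future" strategy from the original HER work). This is not the improved variant proposed in the paper, whose details the abstract does not give. The `Transition` fields, the two-component goal (terminal altitude and range-to-go), the tolerances, and all numeric values are hypothetical and chosen only for illustration.

```python
import random
from dataclasses import dataclass, replace
from typing import List, Optional, Tuple

Vector = Tuple[float, ...]


@dataclass(frozen=True)
class Transition:
    state: Vector
    action: float       # bank angle command (magnitude and sign), hypothetical units
    next_state: Vector
    achieved: Vector    # constraint quantities actually reached after this step
    goal: Vector        # constraint quantities the episode was asked to reach
    reward: float


def sparse_reward(achieved: Vector, goal: Vector, tol: Vector) -> float:
    """Return 0 if every goal component is met within tolerance, else -1."""
    met = all(abs(a - g) <= t for a, g, t in zip(achieved, goal, tol))
    return 0.0 if met else -1.0


def her_relabel(episode: List[Transition], tol: Vector, k: int = 4,
                rng: Optional[random.Random] = None) -> List[Transition]:
    """Keep the original transitions and add k relabeled copies of each one.

    Each copy replaces the goal with an outcome achieved at the same or a
    later step of the same episode, then recomputes the sparse reward.
    """
    rng = rng or random.Random(0)
    out = list(episode)
    for i, tr in enumerate(episode):
        for _ in range(k):
            j = rng.randrange(i, len(episode))   # index of a "future" step
            new_goal = episode[j].achieved
            out.append(replace(
                tr,
                goal=new_goal,
                reward=sparse_reward(tr.achieved, new_goal, tol),
            ))
    return out


def demo_episode() -> Tuple[List[Transition], Vector]:
    """Build a short failed episode; every number here is made up."""
    goal = (25.0, 0.0)    # hypothetical (terminal altitude km, range-to-go km)
    tol = (1.0, 5.0)      # hypothetical tolerances on each component
    outcomes = [(60.0, 900.0), (45.0, 400.0), (32.0, 80.0)]
    episode = []
    prev = (70.0, 1500.0)
    for achieved in outcomes:
        episode.append(Transition(
            state=prev, action=0.3, next_state=achieved,
            achieved=achieved, goal=goal,
            reward=sparse_reward(achieved, goal, tol),
        ))
        prev = achieved
    return episode, tol


if __name__ == "__main__":
    episode, tol = demo_episode()
    data = her_relabel(episode, tol, k=4)
    successes = sum(t.reward == 0.0 for t in data)
    print(f"{len(episode)} original transitions, {len(data)} after relabeling, "
          f"{successes} with a non-penalty reward")
```

In the demo episode every original transition carries the penalty reward because the terminal constraints are never met. After relabeling, some copies count as successes against the substituted goals, which is the mechanism that lets an off-policy learner such as DDPG extract a learning signal from failed trajectories.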

Original language: English
Article number: 5883080
Journal: International Journal of Aerospace Engineering
Volume: 2023
State: Published - 2023
Externally published: Yes
