Abstract
Hypersonic vehicles in near-space are fast and highly maneuverable. To address this, the paper proposes a deep reinforcement learning guidance algorithm based on Trust Region Policy Optimization (TRPO), with the aim of improving the accuracy, robustness, and intelligence of guidance when intercepting targets with different initial states and different maneuver modes. The TRPO-based guidance algorithm consists of two policy (action) networks and a critic network, and it maps the state of the relative motion between the near-space target and the interceptor directly to the interceptor's guidance command. For training, continuous action and state spaces are designed, and the reward function weighs energy consumption, relative distance, and other factors to accelerate convergence. Finally, the trained agent model is tested in different task scenarios. The simulation results show that, compared with the traditional Proportional Navigation (PN) guidance law and the Improved Proportional Navigation (IPN) guidance law, the proposed algorithm achieves smaller miss distances and more stable interception, is robust in both learned and unseen scenarios, and can run on computers with a range of hardware configurations.
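The abstract gives neither the form of the PN baseline nor the reward function, so the following is only a minimal sketch under stated assumptions. `pn_command` implements the textbook planar PN law, a = N · Vc · (LOS rate). `shaped_reward` is a hypothetical per-step reward that trades off range reduction against control effort. The function names, the navigation gain of 3, and the weights `w_dist` and `w_energy` are illustrative assumptions, not values taken from the paper.

```python
import math


def pn_command(rel_pos, rel_vel, nav_gain=3.0):
    """Textbook planar proportional navigation: a = N * Vc * lambda_dot.

    rel_pos, rel_vel: target-minus-interceptor position (m) and velocity (m/s).
    nav_gain: navigation constant N (3.0 is an assumed, commonly used value).
    """
    rx, ry = rel_pos
    vx, vy = rel_vel
    r2 = rx * rx + ry * ry
    r = math.sqrt(r2)
    los_rate = (rx * vy - ry * vx) / r2        # line-of-sight angular rate, rad/s
    closing_speed = -(rx * vx + ry * vy) / r   # Vc = -d(range)/dt, m/s
    return nav_gain * closing_speed * los_rate


def shaped_reward(r_prev, r_now, accel_cmd, w_dist=1.0, w_energy=0.01):
    """Hypothetical shaped reward: reward range reduction, penalize control effort.

    The weights are placeholders; the paper's actual reward is not given here.
    """
    return w_dist * (r_prev - r_now) - w_energy * accel_cmd ** 2


# Usage: target 1000 m ahead, closing at 300 m/s with a 10 m/s crossing component.
cmd = pn_command((1000.0, 0.0), (-300.0, 10.0))
step_reward = shaped_reward(1000.0, 990.0, cmd)
```

In a learned guidance law of the kind the abstract describes, the policy network would take the place of `pn_command`, and a reward of roughly this shape would be summed over each simulated engagement.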
| Translated title of the contribution | Trust region policy optimization guidance algorithm for intercepting maneuvering target |
|---|---|
| Original language | Chinese (Traditional) |
| Article number | 327596 |
| Journal | Hangkong Xuebao/Acta Aeronautica et Astronautica Sinica |
| Volume | 44 |
| Issue number | 11 |
| DOIs | |
| State | Published - 15 Jun 2023 |
| Externally published | Yes |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 7 Affordable and Clean Energy