Energy-optimal Attitude Control of Ultra-large-scale Spacecraft: An Off-policy Actor-critic Approach

  • Harbin Institute of Technology

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

With the advancement of space-based remote sensing technology, ultra-large-scale spacecraft have become crucial platforms for Earth observation and deep-space exploration missions. However, these spacecraft face significant challenges in high-precision attitude control due to the nonlinear coupling between flexible structural vibrations and rigid-body attitude motion, further compounded by complex model uncertainties. Additionally, attitude control methods for ultra-large-scale spacecraft that rely on chemical or electric propulsion exhibit substantial energy consumption, making them unsuitable for long-duration missions. Consequently, achieving energy-optimal rigid-flexible coupling control under uncertain system dynamics remains a critical challenge in the field of ultra-large-scale spacecraft control. While existing approaches, such as robust control, adaptive control, and performance-constrained control, have successfully achieved attitude stabilization, they generally neglect energy optimization objectives. To address these issues, this paper proposes an energy-optimal attitude control strategy based on off-policy reinforcement learning for rigid-flexible-coupling ultra-large-scale spacecraft with model uncertainties. Specifically, a nonlinear dynamic model is first established to capture the coupling effects between flexible appendage vibration modes and rigid-body motion. Model uncertainties, environmental disturbances, and internal nonlinear couplings are encapsulated into a lumped unknown term, which is estimated online via a neural network (NN)-based adaptive identifier. Then, an efficient online actor-critic learning strategy integrated with an experience replay mechanism is proposed to address the data-efficiency and stability limitations of traditional policy iteration methods.
The actor-critic structure is employed to solve the Bellman optimality equation for the energy-optimal control problem, where policy evaluation benefits from reusing historically collected data to improve learning efficiency. Furthermore, the off-policy learning scheme facilitates the learning of optimal control strategies by leveraging past experiences, strengthening the stability and robustness of such a complex system. Theoretical analysis proves that the iterative sequences of the value function and control strategy converge to their optimal counterparts. In addition, a rigorous proof of Lyapunov stability of the system in the sense of uniform ultimate boundedness (UUB) is provided, which ensures that the quaternion and angular velocity converge to a neighborhood of the equilibrium. Numerical simulation results based on a large-scale deployable truss spacecraft validate the effectiveness of the proposed strategy. The results highlight its potential for achieving efficient energy-optimal attitude stabilization of ultra-large-scale spacecraft.
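
The idea of off-policy policy iteration with replayed data can be illustrated on a much simpler problem than the one the abstract addresses. The sketch below is not the paper's algorithm: it assumes a hypothetical scalar linear plant with a quadratic cost (all numerical values and function names are invented for illustration), uses a quadratic critic fitted by least squares in place of a neural network, and omits the NN-based identifier and the flexible dynamics entirely. It only shows how one fixed batch of exploratory transitions can be reused at every policy-evaluation step, while the "actor" is a single feedback gain.

```python
import numpy as np

# Toy scalar plant x' = A*x + B*u. All values are hypothetical, not from the paper.
A, B = 0.9, 0.5
Q_COST, R_COST = 1.0, 1.0  # stage cost q*x^2 + r*u^2; the r*u^2 term stands in for control energy


def collect_replay_buffer(n_samples=200, seed=0):
    """Record transitions under random exploratory inputs (the behavior policy)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, n_samples)
    u = rng.uniform(-1.0, 1.0, n_samples)
    x_next = A * x + B * u
    cost = Q_COST * x**2 + R_COST * u**2
    return x, u, x_next, cost


def features(x, u):
    """Basis for the quadratic critic Q(x, u) = h_xx*x^2 + 2*h_xu*x*u + h_uu*u^2."""
    return np.stack([x**2, 2.0 * x * u, u**2], axis=1)


def evaluate_policy(buffer, gain):
    """Critic step: least-squares fit of Q(x, u) - Q(x', -gain*x') = cost on the replayed batch."""
    x, u, x_next, cost = buffer
    u_next = -gain * x_next  # action the target policy would take, not the recorded one
    regressors = features(x, u) - features(x_next, u_next)
    theta, *_ = np.linalg.lstsq(regressors, cost, rcond=None)
    return theta


def improve_policy(theta):
    """Actor step: u = -gain*x minimizes the fitted quadratic Q over u."""
    _, h_xu, h_uu = theta
    return h_xu / h_uu


def off_policy_iteration(buffer, gain=0.0, n_iters=15):
    """Alternate critic and actor steps, reusing the same stored data every iteration."""
    for _ in range(n_iters):
        theta = evaluate_policy(buffer, gain)
        gain = improve_policy(theta)
    return gain


def riccati_gain(n_iters=1000):
    """Model-based reference gain from the scalar discrete-time Riccati recursion."""
    p = Q_COST
    for _ in range(n_iters):
        p = Q_COST + A**2 * p - (A * B * p) ** 2 / (R_COST + B**2 * p)
    return A * B * p / (R_COST + B**2 * p)
```

Calling `off_policy_iteration(collect_replay_buffer())` and comparing the result with `riccati_gain()` gives a quick check that the data-driven gain matches the model-based one for this toy plant. The initial gain of zero is admissible here only because the assumed open-loop plant is already stable.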

Original language: English
Title of host publication: IAF Astrodynamics Symposium - Held at the 76th International Astronautical Congress, IAC 2025
Publisher: International Astronautical Federation, IAF
Pages: 69-75
Number of pages: 7
ISBN (Electronic): 9798331329358
State: Published - 2025
Event: 2025 IAF Astrodynamics Symposium at the 76th International Astronautical Congress, IAC 2025 - Sydney, Australia
Duration: 29 Sep 2025 → 3 Oct 2025

Publication series

Name: Proceedings of the International Astronautical Congress, IAC
Volume: 1-F219391
ISSN (Print): 0074-1795

Conference

Conference: 2025 IAF Astrodynamics Symposium at the 76th International Astronautical Congress, IAC 2025
Country/Territory: Australia
City: Sydney
Period: 29/09/25 → 3/10/25

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 7 - Affordable and Clean Energy

Keywords

  • Energy-Optimal Control
  • Model Uncertainty Compensation
  • Off-Policy Reinforcement Learning
  • Rigid-Flexible Coupling
