
Demonstration-enhanced policy search for space multi-arm robot collaborative skill learning

  • Tian GAO
  • Chengfei YUE*
  • Xiaozhe JU
  • Tao LIN
  • *Corresponding author for this work
  • Harbin Institute of Technology Shenzhen
  • Harbin Institute of Technology

Research output: Contribution to journal › Article › peer-review

Abstract

The increasing complexity of on-orbit tasks places growing demands on the flexible operation of space robotic arms, driving the development of space robots from single-arm manipulation toward multi-arm collaboration. This paper proposes an approach that combines Learning from Demonstration (LfD) and Reinforcement Learning (RL) for space multi-arm collaborative skill learning. The combination addresses both the trade-off between learning efficiency and solution feasibility in LfD and the time-consuming search for an optimal solution in RL. With the prior knowledge provided by LfD, space robotic arms can learn efficiently under guidance in a high-dimensional state-action space. Specifically, an LfD approach based on Probabilistic Movement Primitives (ProMP) is first used to encode and reproduce the demonstrated actions, generating a distribution that initializes the policy. Then, in the RL stage, a Relative Entropy Policy Search (REPS) algorithm modified for continuous state-action space is employed for further policy improvement. Importantly, the learned behaviors preserve and reflect the characteristics of the demonstrations. In addition, a series of supplementary policy search mechanisms is designed to accelerate exploration. The effectiveness of the proposed method has been verified both theoretically and experimentally, and comparisons with state-of-the-art methods indicate that the approach outperforms them.
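
The two-stage idea in the abstract (a ProMP fitted to demonstrations initializes the policy, then REPS refines it) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: it uses a single degree of freedom, a toy reward, the textbook episodic form of REPS with a weighted maximum-likelihood update, and a grid search over the temperature. It does not reproduce the paper's modified REPS or its supplementary search mechanisms, and all function names and parameter values are hypothetical.

```python
import numpy as np

def rbf_features(n_steps, n_basis=8, width=0.02):
    """Normalized Gaussian basis functions over the phase z in [0, 1]."""
    z = np.linspace(0.0, 1.0, n_steps)[:, None]
    centers = np.linspace(0.0, 1.0, n_basis)[None, :]
    phi = np.exp(-((z - centers) ** 2) / (2.0 * width))
    return phi / phi.sum(axis=1, keepdims=True)

def fit_promp(demos, phi, ridge=1e-6):
    """Ridge-regress one weight vector per demonstration, then fit a
    Gaussian over those weights (the ProMP trajectory distribution)."""
    A = phi.T @ phi + ridge * np.eye(phi.shape[1])
    W = np.linalg.solve(A, phi.T @ demos.T).T  # shape (n_demos, n_basis)
    mean = W.mean(axis=0)
    cov = np.cov(W, rowvar=False) + 1e-6 * np.eye(phi.shape[1])
    return mean, cov

def reps_weights(returns, epsilon=0.5):
    """Episodic REPS sample weights. The temperature eta minimizes the dual
    g(eta) = eta * epsilon + eta * log(mean(exp(R / eta))); returns are
    shifted and rescaled first, which only rescales eta."""
    adv = (returns - returns.max()) / (returns.max() - returns.min() + 1e-12)
    etas = np.logspace(-3, 3, 400)  # coarse grid instead of an optimizer
    dual = [e * epsilon + e * np.log(np.mean(np.exp(adv / e))) for e in etas]
    eta = etas[int(np.argmin(dual))]
    w = np.exp(adv / eta)
    return w / w.sum()

def reps_update(mean, cov, reward_fn, rng, n_samples=50, epsilon=0.5):
    """Sample weight vectors, score them, and refit the Gaussian by
    weighted maximum likelihood."""
    theta = rng.multivariate_normal(mean, cov, size=n_samples)
    returns = np.array([reward_fn(th) for th in theta])
    w = reps_weights(returns, epsilon)
    new_mean = w @ theta
    diff = theta - new_mean
    new_cov = (w[:, None] * diff).T @ diff + 1e-8 * np.eye(len(mean))
    return new_mean, new_cov, returns.mean()

# Stage 1 (LfD): noisy synthetic demonstrations of a half-sine motion.
rng = np.random.default_rng(0)
n_steps = 50
t = np.linspace(0.0, 1.0, n_steps)
phi = rbf_features(n_steps)
demos = np.stack([
    np.sin(np.pi * t) * (1.0 + 0.1 * rng.standard_normal())
    + 0.01 * rng.standard_normal(n_steps)
    for _ in range(10)
])
mean, cov = fit_promp(demos, phi)

# Stage 2 (RL): the hypothetical task prefers a slightly larger amplitude.
target = 1.2 * np.sin(np.pi * t)

def reward_fn(theta):
    return -np.sum((phi @ theta - target) ** 2)

history = []
for _ in range(30):
    mean, cov, avg_return = reps_update(mean, cov, reward_fn, rng)
    history.append(avg_return)

print(f"average return: first {history[0]:.3f}, last {history[-1]:.3f}")
```

Because the search distribution starts from the demonstration statistics, early samples stay close to the demonstrated motion, and the KL bound `epsilon` limits how far each update can move away from it. A multi-arm version would presumably stack the weights of all joints into one vector so that the covariance captures coupling between arms, but that extension is not shown here.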

Original language: English
Article number: 103187
Journal: Chinese Journal of Aeronautics
Volume: 38
Issue number: 3
State: Published - Mar 2025
Externally published: Yes

Keywords

  • Demonstrations
  • Policy search mechanism
  • Probabilistic Movement Primitives
  • Reinforcement Learning
  • Relative Entropy Policy Search
  • Space multi-arm collaboration
