Abstract
The increasing complexity of on-orbit tasks places high demands on the flexible operation of space robotic arms, driving the development of space robots from single-arm manipulation toward multi-arm collaboration. In this paper, an approach combining Learning from Demonstration (LfD) and Reinforcement Learning (RL) is proposed for space multi-arm collaborative skill learning. The combination addresses both the trade-off between learning efficiency and solution feasibility in LfD and the time-consuming search for an optimal solution in RL. With the prior knowledge provided by LfD, space robotic arms can learn efficiently, under guidance, in a high-dimensional state-action space. Specifically, an LfD approach based on Probabilistic Movement Primitives (ProMP) is first used to encode and reproduce the demonstrated actions, generating a distribution that initializes the policy. Then, in the RL stage, a Relative Entropy Policy Search (REPS) algorithm modified for continuous state-action spaces is employed for further policy improvement. Importantly, the learned behaviors retain and reflect the characteristics of the demonstrations. In addition, a series of supplementary policy search mechanisms is designed to accelerate exploration. The effectiveness of the proposed method has been verified both theoretically and experimentally, and comparisons with state-of-the-art methods confirm that the proposed approach outperforms them.
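The two-stage pipeline described above can be illustrated with a minimal sketch, assuming a toy single-degree-of-freedom task, NumPy, and a basic episodic REPS update. It is not the paper's modified REPS, and the supplementary policy search mechanisms are not reproduced. All function names, the basis-function choice, the reward, and the grid search over the temperature `eta` are illustrative assumptions.

```python
import numpy as np

def rbf_features(n_steps, n_basis, width=0.1):
    """Normalized Gaussian basis functions over the phase z in [0, 1]."""
    z = np.linspace(0.0, 1.0, n_steps)[:, None]
    centers = np.linspace(0.0, 1.0, n_basis)[None, :]
    phi = np.exp(-0.5 * ((z - centers) / width) ** 2)
    return phi / phi.sum(axis=1, keepdims=True)  # shape (n_steps, n_basis)

def fit_promp(demos, phi, ridge=1e-4):
    """LfD stage: ridge-regress one weight vector per demonstration,
    then fit a Gaussian over the weights (the initial policy)."""
    gram = phi.T @ phi + ridge * np.eye(phi.shape[1])
    weights = np.linalg.solve(gram, phi.T @ demos.T).T  # (n_demos, n_basis)
    mean = weights.mean(axis=0)
    cov = np.cov(weights, rowvar=False) + 1e-6 * np.eye(phi.shape[1])
    return mean, cov

def reps_weights(returns, epsilon=0.5):
    """RL stage: episodic REPS sample weights under a KL bound epsilon.
    The dual is minimized on a coarse grid (a simplification)."""
    adv = returns - returns.max()  # shift for numerical stability
    etas = np.logspace(-6, 3, 400)
    dual = [eta * epsilon + eta * np.log(np.mean(np.exp(adv / eta)))
            for eta in etas]
    eta = etas[int(np.argmin(dual))]
    w = np.exp(adv / eta)
    return w / w.sum()

def reps_update(samples, weights):
    """Weighted maximum-likelihood refit of the Gaussian policy."""
    mean = weights @ samples
    diff = samples - mean
    cov = (weights[:, None] * diff).T @ diff
    return mean, cov + 1e-6 * np.eye(samples.shape[1])

rng = np.random.default_rng(0)
n_steps, n_basis = 50, 8
phi = rbf_features(n_steps, n_basis)
t = np.linspace(0.0, 1.0, n_steps)

# Synthetic "demonstrations": they stop near 0.8, short of the goal at 1.0.
demos = np.stack([
    (0.8 + 0.05 * rng.standard_normal()) * np.sin(0.5 * np.pi * t)
    + 0.01 * rng.standard_normal(n_steps)
    for _ in range(10)
])
mean0, cov0 = fit_promp(demos, phi)

def episode_return(w):
    """Hypothetical reward: negative squared distance to the goal at the end."""
    return -((phi @ w)[-1] - 1.0) ** 2

mean, cov = mean0.copy(), cov0.copy()
for _ in range(20):
    samples = rng.multivariate_normal(mean, cov, size=100)
    returns = np.array([episode_return(w) for w in samples])
    mean, cov = reps_update(samples, reps_weights(returns))

err_before = abs(phi[-1] @ mean0 - 1.0)
err_after = abs(phi[-1] @ mean - 1.0)
print(f"final-position error: {err_before:.3f} -> {err_after:.3f}")
```

Because the search starts from the demonstration distribution and each update is bounded in relative entropy, the refined trajectories stay close in shape to the demonstrations. In a multi-arm setting, the weight vector would presumably stack the basis weights of every joint of every arm.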
| Original language | English |
|---|---|
| Article number | 103187 |
| Journal | Chinese Journal of Aeronautics |
| Volume | 38 |
| Issue number | 3 |
| State | Published - Mar 2025 |
| Externally published | Yes |
Keywords
- Demonstrations
- Policy search mechanism
- Probabilistic Movement Primitives
- Reinforcement Learning
- Relative Entropy Policy Search
- Space multi-arm collaboration
Fingerprint
Dive into the research topics of 'Demonstration-enhanced policy search for space multi-arm robot collaborative skill learning'. Together they form a unique fingerprint.