
一种基于深度强化学习的多对多在轨服务优化调度方法

Translated title of the contribution: A Multi-to-Multi On-orbit Servicing Optimization Scheduling Method Based on Deep Reinforcement Learning
  • School of Astronautics, Harbin Institute of Technology

Research output: Contribution to journal › Article › peer-review

Abstract

An intelligent approach based on deep reinforcement learning is proposed for the optimal scheduling of multi-to-multi on-orbit servicing missions. The problem is first modeled as an orbit-related vehicle routing problem. An encoder-decoder neural network with an attention mechanism then defines a stochastic policy that generates a solution for a given problem instance: the encoder produces graph embeddings and node embeddings, and the decoder builds the solution step by step from these embeddings. The network is trained with the REINFORCE algorithm using a greedy rollout baseline. Extensive experiments demonstrate the effectiveness and superiority of the proposed method, which has three advantages: it provides near real-time solutions to scheduling problems, it yields better solutions than meta-heuristic algorithms on large-scale scheduling problems, and it generalizes well, since models trained on instances with a specific number of targets can be applied to instances with different numbers of targets.
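The training scheme named in the abstract, REINFORCE with a greedy rollout baseline, can be illustrated with a toy sketch. This is not the authors' implementation. It rests on several assumptions: the attention-based encoder-decoder is replaced by a one-parameter softmax policy (`theta`), the orbit-related transfer cost is replaced by planar Euclidean distance, and the baseline is refreshed on a fixed schedule instead of through a statistical comparison. All function and parameter names are hypothetical.

```python
import math
import random


def tour_length(points, tour):
    """Closed-tour Euclidean length (assumed stand-in for the orbit-related cost)."""
    n = len(tour)
    return sum(math.dist(points[tour[i]], points[tour[(i + 1) % n]]) for i in range(n))


def rollout(points, theta, greedy, rng):
    """Build a tour step by step; return (tour, d log p(tour) / d theta)."""
    tour, grad_logp = [0], 0.0
    unvisited = list(range(1, len(points)))
    while unvisited:
        dists = [math.dist(points[tour[-1]], points[j]) for j in unvisited]
        logits = [-theta * d for d in dists]  # nearer nodes score higher when theta > 0
        top = max(logits)
        weights = [math.exp(l - top) for l in logits]  # numerically stable softmax
        total = sum(weights)
        probs = [w / total for w in weights]
        if greedy:
            k = max(range(len(probs)), key=probs.__getitem__)
        else:
            k = rng.choices(range(len(probs)), weights=probs)[0]
        # For logits -theta * d: d/dtheta log p_k = -d_k + sum_j p_j * d_j
        grad_logp += -dists[k] + sum(p * d for p, d in zip(probs, dists))
        tour.append(unvisited.pop(k))
    return tour, grad_logp


def train(n_nodes=8, steps=300, batch=16, lr=0.05, seed=0):
    """REINFORCE where the baseline is the greedy rollout of a frozen policy copy."""
    rng = random.Random(seed)
    theta, baseline_theta = 0.0, 0.0
    for step in range(steps):
        grad = 0.0
        for _ in range(batch):
            pts = [(rng.random(), rng.random()) for _ in range(n_nodes)]
            tour, grad_logp = rollout(pts, theta, greedy=False, rng=rng)
            base_tour, _ = rollout(pts, baseline_theta, greedy=True, rng=rng)
            advantage = tour_length(pts, tour) - tour_length(pts, base_tour)
            grad += advantage * grad_logp
        theta -= lr * grad / batch  # gradient descent on the expected tour cost
        if (step + 1) % 20 == 0:  # assumed fixed refresh schedule for the baseline
            baseline_theta = theta
    return theta


def mean_greedy_cost(theta, n_nodes=8, n_instances=200, seed=1):
    """Average greedy tour length on a fixed set of held-out random instances."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_instances):
        pts = [(rng.random(), rng.random()) for _ in range(n_nodes)]
        tour, _ = rollout(pts, theta, greedy=True, rng=rng)
        total += tour_length(pts, tour)
    return total / n_instances
```

Calling `train()` and then comparing `mean_greedy_cost(theta)` with `mean_greedy_cost(0.0)` shows whether the learned policy decodes shorter tours than the untrained one. Subtracting the greedy rollout cost reduces gradient variance without biasing the gradient, because the baseline does not depend on the sampled actions.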

Original language: Chinese (Traditional)
Pages (from-to): 204-214
Number of pages: 11
Journal: Yuhang Xuebao/Journal of Astronautics
Volume: 46
Issue number: 1
State: Published - Jan 2025
Externally published: Yes
