TY - GEN
T1 - Research on Target Trajectory Planning Method of Humanoid Manipulators Based on Reinforcement Learning
AU - Liang, Keyao
AU - Zha, Fusheng
AU - Sheng, Wentao
AU - Guo, Wei
AU - Wang, Pengfei
AU - Sun, Lining
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd 2023.
PY - 2023
Y1 - 2023
AB - Most asymmetrically coordinated manipulation tasks of humanoid manipulators have multilevel goals. For example, a bottle-cap screwing task is composed of several sub-objectives, such as reaching, grasping, aligning, and screwing. In addition, the flexible interaction requirements of dual-arm robots challenge trajectory planning methods for manipulators with high-dimensional and strongly coupled characteristics. However, traditional reinforcement learning algorithms cannot quickly learn and generate such trajectories. Based on the idea of multi-agent control, this paper proposes a dual-agent deep deterministic policy gradient algorithm, which uses two agents to simultaneously plan the coordinated trajectories of the left and right arms online. This algorithm solves the problem of online trajectory planning for multi-objective tasks of humanoid manipulators. The design of observations and actions in the dual-agent structure reduces the dimension of the humanoid manipulators’ trajectory planning problem and decouples it to a certain extent, thus speeding up learning. Moreover, a reward function is constructed to coordinate the two agents and to encourage them to generate continuous trajectories for multi-objective tasks. Finally, the effectiveness of the proposed algorithm is verified in a Baxter multi-objective task simulation environment in Gym. The results show that the algorithm can quickly learn and plan online the coordinated trajectories of humanoid manipulators for multi-objective tasks.
KW - Bimanual coordination
KW - Deep deterministic policy gradient
KW - Multi-objective trajectory planning
UR - https://www.scopus.com/pages/publications/85175823581
U2 - 10.1007/978-981-99-6492-5_39
DO - 10.1007/978-981-99-6492-5_39
M3 - Conference contribution
AN - SCOPUS:85175823581
SN - 9789819964918
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 452
EP - 463
BT - Intelligent Robotics and Applications - 16th International Conference, ICIRA 2023, Proceedings
A2 - Yang, Huayong
A2 - Liu, Honghai
A2 - Zou, Jun
A2 - Yin, Zhouping
A2 - Liu, Lianqing
A2 - Yang, Geng
A2 - Ouyang, Xiaoping
A2 - Wang, Zhiyong
PB - Springer Science and Business Media Deutschland GmbH
T2 - 16th International Conference on Intelligent Robotics and Applications, ICIRA 2023
Y2 - 5 July 2023 through 7 July 2023
ER -