TY - GEN
T1 - Sample-efficient policy learning based on completely behavior cloning
AU - Zou, Qiming
AU - Wang, Ling
AU - Li, Yu
AU - Liu, Jie
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019/10
Y1 - 2019/10
N2 - Direct policy search is one of the most important classes of algorithms in reinforcement learning. However, learning from scratch requires a large amount of experience data and is prone to poor local optima. To overcome these challenges, this paper proposes a training-free behavior cloning algorithm called Policy Learning based on Completely Behavior Cloning (PLCBC). PLCBC transforms a Model Predictive Control (MPC) controller into a PieceWise Affine (PWA) function using multi-parametric programming, and uses a neural network to represent this function. In this way, off-the-shelf deep reinforcement learning algorithms can be used to fine-tune the neural network. The experiments show that the method helps the agent learn in the high-reward state region and converge faster and to a better result.
AB - Direct policy search is one of the most important classes of algorithms in reinforcement learning. However, learning from scratch requires a large amount of experience data and is prone to poor local optima. To overcome these challenges, this paper proposes a training-free behavior cloning algorithm called Policy Learning based on Completely Behavior Cloning (PLCBC). PLCBC transforms a Model Predictive Control (MPC) controller into a PieceWise Affine (PWA) function using multi-parametric programming, and uses a neural network to represent this function. In this way, off-the-shelf deep reinforcement learning algorithms can be used to fine-tune the neural network. The experiments show that the method helps the agent learn in the high-reward state region and converge faster and to a better result.
UR - https://www.scopus.com/pages/publications/85076737767
U2 - 10.1109/SMC.2019.8914085
DO - 10.1109/SMC.2019.8914085
M3 - Conference contribution
AN - SCOPUS:85076737767
T3 - Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics
SP - 2543
EP - 2548
BT - 2019 IEEE International Conference on Systems, Man and Cybernetics, SMC 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2019 IEEE International Conference on Systems, Man and Cybernetics, SMC 2019
Y2 - 6 October 2019 through 9 October 2019
ER -