TY - GEN
T1 - CAC
T2 - 23rd IEEE International Conference on Data Mining, ICDM 2023
AU - Busaranuvong, Palawat
AU - Zhang, Xin
AU - Li, Yanhua
AU - Zhou, Xun
AU - Luo, Jun
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Rapid advances in perception, planning, and decision-making for self-driving vehicles have greatly improved their function and capabilities and have put several prototypes on roads and streets, such as Waymo Driver, TuSimple, and Nuro. Among the various applications of self-driving vehicles, ride service is a promising one, as it has the potential to improve service quality and productivity and to provide service to anyone at any time. Extensive studies have been conducted on self-driving planning and safety, but few works focus on self-driving ride service decision-making and routing. In this work, we take the lead in studying the self-driving ride service planning and decision-making problem by leveraging human-generated spatial-temporal data, and we propose a data-driven Conservative Actor-Critic approach (CAC) based on offline reinforcement learning. CAC is able to make conservative decisions in a complicated environment with multiple goal states and to avoid dangerous and overly optimistic behaviors by exploiting human decisions. Extensive experiments with real-world data demonstrate that CAC-learned policies drastically improve taxi service operation efficiency and quality, shortening passenger waiting time and increasing service revenue.
AB - Rapid advances in perception, planning, and decision-making for self-driving vehicles have greatly improved their function and capabilities and have put several prototypes on roads and streets, such as Waymo Driver, TuSimple, and Nuro. Among the various applications of self-driving vehicles, ride service is a promising one, as it has the potential to improve service quality and productivity and to provide service to anyone at any time. Extensive studies have been conducted on self-driving planning and safety, but few works focus on self-driving ride service decision-making and routing. In this work, we take the lead in studying the self-driving ride service planning and decision-making problem by leveraging human-generated spatial-temporal data, and we propose a data-driven Conservative Actor-Critic approach (CAC) based on offline reinforcement learning. CAC is able to make conservative decisions in a complicated environment with multiple goal states and to avoid dangerous and overly optimistic behaviors by exploiting human decisions. Extensive experiments with real-world data demonstrate that CAC-learned policies drastically improve taxi service operation efficiency and quality, shortening passenger waiting time and increasing service revenue.
KW - actor-critic
KW - conservative Q-learning
KW - offline reinforcement learning
KW - spatial-temporal data mining
UR - https://www.scopus.com/pages/publications/85185410264
U2 - 10.1109/ICDM58522.2023.00011
DO - 10.1109/ICDM58522.2023.00011
M3 - Conference contribution
AN - SCOPUS:85185410264
T3 - Proceedings - IEEE International Conference on Data Mining, ICDM
SP - 21
EP - 30
BT - Proceedings - 23rd IEEE International Conference on Data Mining, ICDM 2023
A2 - Chen, Guihai
A2 - Khan, Latifur
A2 - Gao, Xiaofeng
A2 - Qiu, Meikang
A2 - Pedrycz, Witold
A2 - Wu, Xindong
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 1 December 2023 through 4 December 2023
ER -