TY - GEN
T1 - USB-Rec
T2 - 19th ACM Conference on Recommender Systems, RecSys 2025
AU - Wen, Jianyu
AU - Wang, Jingyun
AU - Yan, Cilin
AU - Cai, Jiayin
AU - Jiang, Xiaolong
AU - Zhang, Ying
N1 - Publisher Copyright:
© 2025 Copyright is held by the owner/author(s). Publication rights licensed to ACM.
PY - 2025/8/7
Y1 - 2025/8/7
N2 - Recently, Large Language Models (LLMs) have been widely employed in Conversational Recommender Systems (CRSs). Unlike traditional language model approaches that focus on training, existing LLM-based approaches mainly focus on leveraging the summarization and analysis capabilities of LLMs while ignoring the issue of training. Therefore, in this work, we propose an integrated training-inference framework, User-Simulator-Based framework (USB-Rec), for improving the performance of LLMs in conversational recommendation at the model level. Firstly, we design an LLM-based Preference Optimization (PO) dataset construction strategy for RL training, which helps the LLMs understand the strategies and methods in conversational recommendation. Secondly, we propose a Self-Enhancement Strategy (SES) at the inference stage to further exploit the conversational recommendation potential obtained from RL training. Extensive experiments on various datasets demonstrate that our method consistently outperforms previous state-of-the-art methods. Code is available at https://github.com/John-Wendell/USB_Rec.
AB - Recently, Large Language Models (LLMs) have been widely employed in Conversational Recommender Systems (CRSs). Unlike traditional language model approaches that focus on training, existing LLM-based approaches mainly focus on leveraging the summarization and analysis capabilities of LLMs while ignoring the issue of training. Therefore, in this work, we propose an integrated training-inference framework, User-Simulator-Based framework (USB-Rec), for improving the performance of LLMs in conversational recommendation at the model level. Firstly, we design an LLM-based Preference Optimization (PO) dataset construction strategy for RL training, which helps the LLMs understand the strategies and methods in conversational recommendation. Secondly, we propose a Self-Enhancement Strategy (SES) at the inference stage to further exploit the conversational recommendation potential obtained from RL training. Extensive experiments on various datasets demonstrate that our method consistently outperforms previous state-of-the-art methods. Code is available at https://github.com/John-Wendell/USB_Rec.
KW - Conversational Recommendation
KW - Large Language Model
KW - Reinforcement Learning
UR - https://www.scopus.com/pages/publications/105019647293
U2 - 10.1145/3705328.3748089
DO - 10.1145/3705328.3748089
M3 - Conference contribution
AN - SCOPUS:105019647293
T3 - RecSys2025 - Proceedings of the 19th ACM Conference on Recommender Systems
SP - 472
EP - 481
BT - RecSys2025 - Proceedings of the 19th ACM Conference on Recommender Systems
PB - Association for Computing Machinery, Inc
Y2 - 22 September 2025 through 26 September 2025
ER -