TY - GEN
T1 - Privacy-Preserving Intelligence-based Reinforcement Learning for Large Language Model via Homomorphic Encryption
AU - Wu, Feiyang
AU - Sun, Xiaoqiang
AU - Sun, Zhiwei
AU - Liu, Wei
AU - Jiang, Zoe L.
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - Reinforcement learning (RL) is a widely used framework for sequential decision-making, but conventional RL often struggles with sparse rewards, limited reasoning, and long-horizon dependencies. Intelligence-based reinforcement learning (IRL) introduces intelligence-oriented metrics and is mainly applied to large language models (LLMs) to enhance contextual reasoning and to process both numerical and textual information. Nevertheless, deploying IRL with LLMs in sensitive domains such as healthcare, finance, and defense raises severe privacy risks, including gradient inversion, model extraction, and sensitive data leakage. We propose a privacy-preserving IRL framework that integrates LLMs and is built on the CKKS fully homomorphic encryption scheme, which supports encrypted computation on real-valued data and allows training to be performed entirely in the encrypted domain. Security and efficiency analyses indicate that the framework achieves strong cryptographic security, meeting indistinguishability under chosen-plaintext attack (IND-CPA), and practical efficiency with low communication overhead, enabling secure deployment in privacy-critical environments despite increased computational cost.
KW - Fully Homomorphic Encryption
KW - Intelligence
KW - Large Language Models
KW - Privacy-Preserving
KW - Reinforcement Learning
UR - https://www.scopus.com/pages/publications/105033796239
U2 - 10.1109/ISAICS66888.2025.11350197
DO - 10.1109/ISAICS66888.2025.11350197
M3 - Conference contribution
AN - SCOPUS:105033796239
T3 - Proceedings of 2025 2nd International Symposium on AI and Cybersecurity, ISAICS 2025
BT - Proceedings of 2025 2nd International Symposium on AI and Cybersecurity, ISAICS 2025
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2nd International Symposium on AI and Cybersecurity, ISAICS 2025
Y2 - 24 October 2025 through 26 October 2025
ER -