TY - GEN
T1 - Joint Task Scheduling and Resource Allocation in Cloud-Edge Collaborative Computing Systems
AU - Du, Boyu
AU - Zhou, Jingya
AU - Wang, Jin
AU - Wang, Jiangwei
AU - Li, Zhijun
N1 - Publisher Copyright: © 2025 Copyright held by the owner/author(s).
PY - 2025/12/20
Y1 - 2025/12/20
N2 - Cloud-edge collaborative computing (CECC) facilitates the sharing of computing resources by collaboratively scheduling tasks among servers, thereby maximizing task execution efficiency. Task scheduling and resource allocation (TS-RA) are two interrelated issues that significantly affect the efficient utilization of computing resources. In this paper, we decouple the joint optimization problem of TS-RA and propose a novel model based on multi-agent reinforcement learning (TRMARL), which is applicable to distributed task scheduling and resource allocation in a heterogeneous CECC system. TRMARL consists of two modules: 1) the task scheduling module, where we introduce a value factorization algorithm to maximize joint rewards of distributed scheduling actions; 2) the resource allocation module, where we present a proximal policy optimization (PPO) algorithm based mechanism to optimize resource allocation. TRMARL efficiently captures the state difference among heterogeneous servers through a graph attention network-based recurrent deep Q-network (GAT-based recurrent-DQN) architecture and learns different strategies for heterogeneous services through a multi-expert schema. The experimental results demonstrate that TRMARL effectively improves the task completion rate, reduces average system latency, and enhances convergence stability in a heterogeneous CECC system.
KW - cloud-edge collaborative computing
KW - multi-agent reinforcement learning
KW - resource allocation
KW - task scheduling
UR - https://www.scopus.com/pages/publications/105026444400
U2 - 10.1145/3754598.3754646
DO - 10.1145/3754598.3754646
M3 - Conference contribution
AN - SCOPUS:105026444400
T3 - 54th International Conference on Parallel Processing, ICPP 2025 - Main Conference Proceedings
SP - 586
EP - 596
BT - 54th International Conference on Parallel Processing, ICPP 2025 - Main Conference Proceedings
PB - Association for Computing Machinery, Inc
T2 - 54th International Conference on Parallel Processing, ICPP 2025
Y2 - 8 September 2025 through 11 September 2025
ER -