Abstract
Edge devices generate vast amounts of data, which are transmitted to the remote cloud for inference, a process that can cause excessive network load and privacy risks. Edge collaborative inference has emerged as a promising solution to these challenges. However, existing collaborative inference methods are limited by transmission costs, inflexible model partitioning, and redundant computation. These issues prevent efficient utilization of edge resources and limit overall system performance. To address these limitations, we apply scalable pipelines to resource and model orchestration, enabling dynamic adaptation to varying workloads and improving resource utilization. Specifically, we propose ScalPipe, a collaborative pipeline inference framework that enables scalable orchestration of resources and model partitions for efficient edge inference. To better accommodate edge devices, ScalPipe employs a lightweight, customized heuristic algorithm for resource-adaptive model partitioning. A fine-tuning algorithm dynamically adjusts the scheduling strategy whenever monitored inference times deviate beyond a predefined threshold. We provide theoretical analysis that establishes performance bounds and the computational complexity of the proposed algorithms. Comprehensive experiments in heterogeneous environments demonstrate that ScalPipe consistently outperforms state-of-the-art methods across diverse model architectures and evaluation metrics. ScalPipe reduces average inference latency by 20%-40% while achieving over 90% resource utilization.
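The abstract does not specify ScalPipe's algorithms, so the following is only a minimal illustrative sketch of the two ideas it names: partitioning a model across heterogeneous devices, and re-planning when monitored stage times drift past a threshold. Everything here is an assumption made for illustration, not the authors' method: the function names (`partition`, `stage_times`, `needs_rebalance`, `reestimate_speeds`), the cost model (per-layer cost divided by device speed, with transmission cost ignored), the fixed device order, and the use of a binary search over the bottleneck stage time in place of the paper's customized heuristic.

```python
from typing import List, Optional, Sequence


def _greedy_fill(costs: Sequence[float], speeds: Sequence[float],
                 bound: float) -> Optional[List[List[int]]]:
    """Assign contiguous layers to devices in order, keeping each stage time <= bound."""
    plan: List[List[int]] = []
    i = 0
    for speed in speeds:
        stage: List[int] = []
        load = 0.0
        while i < len(costs) and (load + costs[i]) / speed <= bound:
            load += costs[i]
            stage.append(i)
            i += 1
        plan.append(stage)
    # Feasible only if every layer found a place.
    return plan if i == len(costs) else None


def partition(costs: Sequence[float], speeds: Sequence[float],
              iters: int = 60) -> List[List[int]]:
    """Binary-search the smallest bottleneck stage time that admits a feasible plan."""
    lo, hi = 0.0, sum(costs) / min(speeds)  # hi: all layers fit on the first device
    best = _greedy_fill(costs, speeds, hi)
    for _ in range(iters):
        mid = (lo + hi) / 2
        plan = _greedy_fill(costs, speeds, mid)
        if plan is None:
            lo = mid
        else:
            hi, best = mid, plan
    return best


def stage_times(plan: List[List[int]], costs: Sequence[float],
                speeds: Sequence[float]) -> List[float]:
    """Expected time of each pipeline stage under the assumed cost model."""
    return [sum(costs[i] for i in stage) / s for stage, s in zip(plan, speeds)]


def needs_rebalance(expected: Sequence[float], observed: Sequence[float],
                    threshold: float = 0.2) -> bool:
    """True if any stage deviates from its expected time by more than `threshold` (relative)."""
    return any(abs(o - e) > threshold * e for e, o in zip(expected, observed) if e > 0)


def reestimate_speeds(plan: List[List[int]], costs: Sequence[float],
                      speeds: Sequence[float], observed: Sequence[float]) -> List[float]:
    """Update each device's speed from its observed stage time; keep the old value if idle."""
    updated = []
    for stage, old, t in zip(plan, speeds, observed):
        load = sum(costs[i] for i in stage)
        updated.append(load / t if load > 0 and t > 0 else old)
    return updated


# Hypothetical workload: five layers, two devices, the second twice as fast.
costs = [4, 2, 6, 3, 5]
speeds = [1.0, 2.0]
plan = partition(costs, speeds)
expected = stage_times(plan, costs, speeds)

# Suppose monitoring shows the second device running at half its assumed speed.
observed = [6.0, 14.0]
if needs_rebalance(expected, observed):
    speeds = reestimate_speeds(plan, costs, speeds, observed)
    plan = partition(costs, speeds)
```

In this toy run the initial plan places the first two layers on the slower device; after the observed slowdown the re-estimated speeds are equal, and re-planning shifts a third layer onto the first device. A real system would also need to account for activation transfer between stages, which this sketch deliberately leaves out.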
| Original language | English |
|---|---|
| Journal | IEEE Transactions on Mobile Computing |
| DOIs | |
| State | Accepted/In press - 2026 |
| Externally published | Yes |
Keywords
- Edge computing
- collaborative inference
- model split
- resource scheduling
Title
ScalPipe: Scalable Collaborative Pipeline Inference for Distributed Heterogeneous Devices