Abstract
With the rapid advancement of Large Vision Models (LVMs) such as Sora, the emerging ability of large AI models to comprehend physical laws has garnered significant attention, as it enables them to interpret and apply physical principles with increasing accuracy and sophistication. Nevertheless, due to resource limitations and delay constraints, traditional cloud-based LVM services often fail to meet the diverse needs of users, particularly in scenarios requiring real-time responsiveness. In this work, we explore Mobile Edge Computing (MEC)-empowered LVM services in wireless networks, where heterogeneous LVMs are deployed on both cloud and edge servers, and LVM Users (LUs) can offload computation tasks to edge servers to reduce delay and energy consumption. In this scenario, we focus on the joint optimization of model inferencing and task offloading for LUs, aiming to maximize the total service utility while minimizing delay and energy consumption. First, to characterize the utility of LVM services, we propose a multi-dimensional video quality metric based on real measurements, which incorporates both prompt-video alignment and classic video quality indicators. Then, to solve the problem in a decentralized manner, we propose a two-stage solution that combines learning and optimization techniques. In the first stage, we design a reinforcement learning-based Multi-Agent Proximal Policy Optimization (MAPPO) approach to make real-time model inferencing and task offloading decisions. In the second stage, we employ optimization-based Sequential Least Squares Programming (SLSQP) to make efficient resource allocation decisions. Simulation results show that our proposed solution outperforms the benchmark schemes, reducing delay and energy consumption by up to 17.2% and 21.7%, respectively, while increasing service utility by up to 3%.
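The trade-off described in the abstract (service utility against delay and energy, coupled through a shared edge server) can be illustrated with a toy example. The sketch below is not the paper's formulation. The delay and energy expressions are textbook MEC assumptions, the names `User`, `KAPPA`, `cost`, `objective`, and `best_decisions` are hypothetical, and an exhaustive search over binary offloading decisions stands in for the MAPPO and SLSQP stages, which the abstract does not specify in detail.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class User:
    data_bits: float      # input size uploaded when offloading (bits)
    cycles: float         # CPU cycles the inference task needs
    f_local: float        # local CPU speed (cycles/s)
    rate: float           # uplink rate (bits/s)
    p_tx: float           # transmit power (W)
    utility_local: float  # assumed quality score of the local result
    utility_edge: float   # assumed quality score of the edge result


# Assumed effective switched-capacitance coefficient of the local CPU.
KAPPA = 1e-27


def cost(user, offload, f_edge_share):
    """Return (utility, delay in s, device energy in J) for one user."""
    if offload:
        t_up = user.data_bits / user.rate
        delay = t_up + user.cycles / f_edge_share
        energy = user.p_tx * t_up  # device only pays for the upload
        return user.utility_edge, delay, energy
    delay = user.cycles / user.f_local
    energy = KAPPA * user.f_local ** 2 * user.cycles
    return user.utility_local, delay, energy


def objective(users, decisions, f_edge, w_delay=1.0, w_energy=1.0):
    """Total utility minus weighted delay and energy.

    Assumption: edge compute is split equally among offloading users.
    """
    n_off = sum(decisions)
    share = f_edge / n_off if n_off else f_edge
    total = 0.0
    for user, offload in zip(users, decisions):
        util, delay, energy = cost(user, offload, share)
        total += util - w_delay * delay - w_energy * energy
    return total


def best_decisions(users, f_edge, **weights):
    """Brute-force the joint offloading decision (tiny instances only)."""
    candidates = product((False, True), repeat=len(users))
    return max(candidates, key=lambda d: objective(users, d, f_edge, **weights))


if __name__ == "__main__":
    fast = User(1e6, 1e9, 1e9, 1e7, 0.5, 1.0, 1.0)
    slow_link = User(1e6, 1e9, 1e9, 1e5, 0.5, 1.0, 1.0)
    print(best_decisions([fast, slow_link], f_edge=1e10))
```

Exhaustive search grows as 2^N in the number of users, which is why it only serves as an illustration here; it makes the coupling visible (each additional offloading user shrinks every other user's edge share) without claiming anything about how the learning and optimization stages in the paper are implemented.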
| Original language | English |
|---|---|
| Title of host publication | INFOCOM 2025 - IEEE Conference on Computer Communications |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| ISBN (Electronic) | 9798331543051 |
| State | Published - 2025 |
| Externally published | Yes |
| Event | 2025 IEEE Conference on Computer Communications, INFOCOM 2025 - London, United Kingdom. Duration: 19 May 2025 → 22 May 2025 |
Publication series
| Name | Proceedings - IEEE INFOCOM |
|---|---|
| ISSN (Print) | 0743-166X |
Conference
| Conference | 2025 IEEE Conference on Computer Communications, INFOCOM 2025 |
|---|---|
| Country/Territory | United Kingdom |
| City | London |
| Period | 19/05/25 → 22/05/25 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 7 Affordable and Clean Energy
Keywords
- Large Vision Model
- Mobile Edge Computing
Fingerprint
Dive into the research topics of 'Joint Optimization of Model Inferencing and Task Offloading for MEC-Empowered Large Vision Model Services'. Together they form a unique fingerprint.