Real-Time Downlink Resource Allocation in NOMA Systems: A Reinforcement Learning Approach

  • Song Yan
  • Kuang Chih Chou
  • Hsiao Hwa Chen*
  • Qing Guo*
  • *Corresponding author for this work
  • Harbin Institute of Technology
  • National Cheng Kung University

Research output: Contribution to journal › Article › peer-review

Abstract

Non-orthogonal multiple access (NOMA) is an important multiple access technology for next-generation wireless communications. This work focuses on realtime resource allocation in NOMA systems based on reinforcement learning (RL). Q-learning (QL) is an agile RL approach that can adjust its learning strategy to dynamic channel states, making it a perfect machine learning algorithm that can tell agents what to do to maximize their rewards. It does not require a given model of the environment and thus can work adaptively in different scenarios. However, the majority of existing works have treated QL only as a tool to solve an optimization problem. In this work, QL participates in the entire resource allocation process in NOMA systems. As long as user equipment (UE) locations are given, it can optimize resource allocation to achieve a maximum sum rate. In particular, we demonstrate its effectiveness with simulation results for three NOMA schemes: multi-user superposition transmission (MUST), pattern division multiple access (PDMA), and sparse code multiple access (SCMA). The excellent tracking convergence property of the proposed schemes makes them an ideal choice for realtime resource allocation in wireless communications.
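
As a rough illustration of the model-free idea described in the abstract, the following is a minimal sketch of tabular Q-learning that picks a power split for a two-UE power-domain NOMA downlink. It is not the authors' algorithm: the channel gains, the SNR, the candidate power fractions, the minimum-rate rule `R_MIN`, and the cyclic channel dynamics are all assumptions chosen only to keep the example small.

```python
import math
import random

# All numbers below are illustrative assumptions, not values from the paper.
SNR = 100.0                          # transmit SNR (20 dB), linear scale
G_NEAR = 1.0                         # normalized channel gain of the near UE
G_FAR_LEVELS = [0.04, 0.2, 0.6]      # quantized far-UE gains = the agent's states
ALPHAS = [0.5, 0.6, 0.7, 0.8, 0.9]   # actions: power fraction given to the far UE
R_MIN = 1.0                          # hypothetical minimum far-UE rate, bit/s/Hz


def reward(state, action):
    """Sum rate of a two-UE NOMA downlink, or 0 if the far UE is starved."""
    g_far, alpha = G_FAR_LEVELS[state], ALPHAS[action]
    # Far UE decodes its signal while treating the near UE's signal as noise.
    r_far = math.log2(1 + alpha * g_far * SNR / ((1 - alpha) * g_far * SNR + 1))
    # Near UE cancels the far UE's signal (SIC) and then decodes its own.
    r_near = math.log2(1 + (1 - alpha) * G_NEAR * SNR)
    return r_far + r_near if r_far >= R_MIN else 0.0


def train(steps=6000, lr=0.5, gamma=0.5, eps=0.3, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    n_s, n_a = len(G_FAR_LEVELS), len(ALPHAS)
    q = [[0.0] * n_a for _ in range(n_s)]
    s = 0
    for _ in range(steps):
        if rng.random() < eps:
            a = rng.randrange(n_a)                       # explore
        else:
            a = max(range(n_a), key=lambda i: q[s][i])   # exploit
        r = reward(s, a)
        s_next = (s + 1) % n_s   # toy channel dynamics: states cycle in order
        # Q-learning update toward r + gamma * max over next-state actions.
        q[s][a] += lr * (r + gamma * max(q[s_next]) - q[s][a])
        s = s_next
    return q


def greedy_policy(q):
    """Best action index for each state under the learned Q-table."""
    return [max(range(len(row)), key=lambda i: row[i]) for row in q]


if __name__ == "__main__":
    for s, a in enumerate(greedy_policy(train())):
        print(f"far-UE gain {G_FAR_LEVELS[s]}: alpha = {ALPHAS[a]}")
```

The agent never sees the rate formula; it only observes the reward returned for each state-action pair, which is the sense in which Q-learning needs no model of the environment. Because the next state in this toy setup does not depend on the action, the learned policy should coincide with the per-state reward maximizer, which makes the sketch easy to sanity-check.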

Original language: English
Pages (from-to): 17779-17795
Number of pages: 17
Journal: IEEE Transactions on Vehicular Technology
Volume: 74
Issue number: 11
State: Published - 2025

Keywords

  • NOMA
  • realtime resource allocation
  • reinforcement learning
  • tracking convergence property
