
GL2T-Diff: Medical image translation via spatial-frequency fusion diffusion models

  • Dong Sui
  • Nanting Song
  • Xiao Tian
  • Han Zhou
  • Yacong Li*
  • Maozu Guo
  • Kuanquan Wang
  • Gongning Luo

*Corresponding author for this work

  • Beijing University of Civil Engineering and Architecture
  • Beijing Academy of Artificial Intelligence
  • Faculty of Computing, Harbin Institute of Technology

Research output: Contribution to journal, Article, peer-review

Abstract

Diffusion Probabilistic Models (DPMs) are effective in medical image translation (MIT), but they tend to lose high-frequency details during the noise addition process, making it challenging to recover these details during the denoising process. This hinders the model’s ability to accurately preserve anatomical details during MIT tasks, which may ultimately affect the accuracy of diagnostic outcomes. To address this issue, we propose a diffusion model (GL2T-Diff) based on convolutional channel and Laplacian frequency attention mechanisms, which is designed to enhance MIT tasks by effectively preserving critical image features. We introduce two novel modules: the Global Channel Correlation Attention Module (GC2A Module) and the Laplacian Frequency Attention Module (LFA Module). The GC2A Module enhances the model’s ability to capture global dependencies between channels, while the LFA Module effectively retains high-frequency components, which are crucial for preserving anatomical structures. To leverage the complementary strengths of both the GC2A Module and the LFA Module, we propose the Laplacian Convolutional Attention with Phase-Amplitude Fusion (FusLCA), which facilitates effective integration of spatial and frequency domain features. Experimental results show that GL2T-Diff outperforms state-of-the-art (SOTA) methods, including those based on Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), and other DPMs, across the BraTS-2021/2024, IXI, and Pelvic datasets. The code is available at https://github.com/puzzlesong8277/GL2T-Diff.
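
As a rough illustration of what "high-frequency components" means in the abstract, the sketch below splits a small image into a low-frequency part and a Laplacian-style high-frequency residual (the image minus a blurred copy). This is not the authors' LFA Module: the 3x3 box blur, the function names `box_blur` and `laplacian_split`, and the toy image are all assumptions chosen for clarity.

```python
def box_blur(img):
    """3x3 mean filter with edge replication (assumed stand-in low-pass filter)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)  # clamp row index at the border
                    xx = min(max(x + dx, 0), w - 1)  # clamp column index at the border
                    total += img[yy][xx]
            out[y][x] = total / 9.0
    return out


def laplacian_split(img):
    """Return (low, high) where high = img - blur(img), so low + high == img."""
    low = box_blur(img)
    high = [[img[y][x] - low[y][x] for x in range(len(img[0]))]
            for y in range(len(img))]
    return low, high


# Toy image with a vertical edge: the residual is non-zero only near the edge.
step = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
low, high = laplacian_split(step)
for row in high:
    print([round(v, 3) for v in row])
```

In this toy case the residual is zero in the flat regions and non-zero only in the two columns next to the edge, which is the kind of fine structure the abstract says is hard to recover once noise has been added.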

Original language: English
Article number: 104586
Journal: Computer Vision and Image Understanding
Volume: 263
State: Published - Jan 2026
Externally published: Yes

Keywords

  • Diffusion Probabilistic Models
  • FusLCA
  • GC2A module
  • LFA module
  • Medical image translation
