
FDPMambaFuse: A frequency-domain and parallel Mamba-based model for multimodal medical image fusion

  • Baiya Li
  • Boheng Zhang
  • Yang Liu
  • Haorui Huang
  • Cailing Lin
  • Rui Fan
  • Mingjian Sun*
  • *Corresponding author for this work
  • First Affiliated Hospital of Xi'an Jiaotong University
  • Harbin Institute of Technology
  • Pohang University of Science and Technology
  • Harbin Institute of Technology Weihai
  • Tongji University
  • Shandong Laboratory of Advanced Biomaterials and Medical Devices in Weihai

Research output: Contribution to journal › Article › peer-review

Abstract

Multimodal medical image fusion (MMIF) integrates complementary information from different imaging modalities, providing more comprehensive and reliable support for clinical diagnosis and analysis. Although Mamba-based models efficiently capture long-range dependencies with low computational cost, existing approaches often overlook frequency-domain features and directional structural information, resulting in insufficient detail preservation and reduced semantic consistency. To address these limitations, we propose FDPMambaFuse, a fusion network that combines frequency-domain decomposition with multi-directional parallel Mamba. The framework employs a multi-level CNN-Mamba feature extractor, in which CNNs capture shallow spatial features, whereas a discrete wavelet transform (DWT)-based parallel Mamba module models multi-scale frequency components, long-range dependencies, and direction-sensitive structural details. A dual-stage fusion module is subsequently introduced to enhance cross-modal complementarity through channel interaction and spatially adaptive attention. In addition, we design an unsupervised hybrid loss that combines pixel-level uncertainty with image gradients to improve structural consistency and overall visual quality. To further improve model generalization, we construct and publicly release a high-quality photoacoustic-ultrasound fusion dataset, HIT-MMIF-PAUS. Extensive qualitative and quantitative experiments demonstrate that FDPMambaFuse outperforms state-of-the-art methods in fusion quality, structural fidelity, and edge clarity. Moreover, it achieves superior accuracy and robustness in downstream tumor segmentation tasks, further verifying its practical potential in clinical applications. Our code is publicly available at https://github.com/670768312/FDPMambaFuse.
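To make the frequency-domain decomposition mentioned in the abstract concrete, the following is a minimal sketch of a single-level 2D Haar wavelet transform in plain Python. It is an illustration only, not the authors' implementation: the abstract does not state which wavelet FDPMambaFuse uses, so the Haar basis, the function names `haar_dwt2` and `haar_idwt2`, and the subband keys are all assumptions made for this example. It shows how an image splits into one low-frequency approximation band and three detail bands, and that the split is lossless.

```python
# Illustrative single-level 2D Haar DWT (an assumption for demonstration;
# the paper's actual wavelet and implementation are not given in the abstract).

def haar_dwt2(image):
    """Split an even-sized 2D grid into four orthonormal Haar subbands."""
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image height and width must be even")
    bands = {"approx": [], "detail_cols": [], "detail_rows": [], "detail_diag": []}
    for r in range(0, rows, 2):
        out = {name: [] for name in bands}
        for c in range(0, cols, 2):
            # One 2x2 block: a b / d e
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            out["approx"].append((a + b + d + e) / 2)       # local average (low frequency)
            out["detail_cols"].append((a - b + d - e) / 2)  # change across columns
            out["detail_rows"].append((a + b - d - e) / 2)  # change across rows
            out["detail_diag"].append((a - b - d + e) / 2)  # diagonal change
        for name in bands:
            bands[name].append(out[name])
    return bands


def haar_idwt2(bands):
    """Invert haar_dwt2, recovering the original grid from the four subbands."""
    image = []
    for i, approx_row in enumerate(bands["approx"]):
        top, bottom = [], []
        for j, ll in enumerate(approx_row):
            dc = bands["detail_cols"][i][j]
            dr = bands["detail_rows"][i][j]
            dd = bands["detail_diag"][i][j]
            top += [(ll + dc + dr + dd) / 2, (ll - dc + dr - dd) / 2]
            bottom += [(ll + dc - dr - dd) / 2, (ll - dc - dr + dd) / 2]
        image += [top, bottom]
    return image


if __name__ == "__main__":
    bands = haar_dwt2([[1, 2], [3, 4]])
    print(bands["approx"])    # [[5.0]]
    print(haar_idwt2(bands))  # [[1.0, 2.0], [3.0, 4.0]]
```

Each subband is half the original resolution in both dimensions, which is what lets a downstream sequence model process the frequency components separately at reduced cost. In practice a wavelet library would typically replace this hand-written loop.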

Original language: English
Article number: 108994
Journal: Neural Networks
Volume: 202
State: Published - Oct 2026

Keywords

  • Frequency-domain feature modeling
  • Multi-directional vision Mamba
  • Multimodal medical image fusion
  • Photoacoustic-ultrasound fusion
  • Unsupervised learning
