
MSFFRN: Multimodal Spatial–Frequency Fusion and Refinement Network for Hyperspectral and LiDAR Data Classification

  • School of Electronics and Information Engineering, Harbin Institute of Technology

Research output: Contribution to journal › Article › peer-review

Abstract

The fusion of hyperspectral image (HSI) and light detection and ranging (LiDAR) data leverages complementary spectral and elevation information to improve land cover classification, yet its effectiveness is hindered by heterogeneous data distributions and a common oversight of frequency-domain information. To address this, we propose a novel multimodal spatial–frequency fusion and refinement network (MSFFRN). The framework is designed to holistically capture and fuse multimodal features. It employs a dual-branch architecture that integrates wavelet-based frequency decomposition with convolutional neural network (CNN)-based spatial encoding for complementary feature extraction. Furthermore, a hierarchical fusion strategy using adaptive gating and cross-attention mechanisms dynamically integrates cross-modality and cross-domain information, while a dedicated feature refinement module (FRM) enhances robustness against environmental distortions and preserves structural features. Extensive experiments on the Houston 2013 and MUUFL datasets demonstrate that MSFFRN achieves state-of-the-art performance, increasing the overall accuracy (OA) by 2.66% and 1.32% on the respective datasets.
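To make the two ingredients named in the abstract more concrete, the following is a minimal sketch of a single-level 2D wavelet decomposition followed by a gated fusion of two feature maps. The abstract does not state which wavelet family or gate formulation MSFFRN uses, so the Haar wavelet, the scalar sigmoid gate, and the names `haar_dwt2` and `gated_fusion` are assumptions chosen for illustration only; this is not the authors' implementation.

```python
import math


def haar_dwt2(img):
    """Single-level 2D Haar transform of a 2D list with even height and width.

    Returns four sub-bands, each half the input size in both dimensions:
    approximation, row-difference detail, column-difference detail, diagonal detail.
    The Haar wavelet is an assumption; the abstract names no wavelet family.
    """
    rows, cols = len(img), len(img[0])
    if rows % 2 or cols % 2:
        raise ValueError("height and width must be even")
    approx, row_det, col_det, diag_det = [], [], [], []
    for r in range(0, rows, 2):
        a_row, r_row, c_row, d_row = [], [], [], []
        for c in range(0, cols, 2):
            # The 2x2 block: top-left, top-right, bottom-left, bottom-right.
            tl, tr = img[r][c], img[r][c + 1]
            bl, br = img[r + 1][c], img[r + 1][c + 1]
            a_row.append((tl + tr + bl + br) / 2.0)  # low-pass in both directions
            r_row.append((tl + tr - bl - br) / 2.0)  # top row minus bottom row
            c_row.append((tl - tr + bl - br) / 2.0)  # left column minus right column
            d_row.append((tl - tr - bl + br) / 2.0)  # diagonal difference
        approx.append(a_row)
        row_det.append(r_row)
        col_det.append(c_row)
        diag_det.append(d_row)
    return approx, row_det, col_det, diag_det


def gated_fusion(x, y, w_x=1.0, w_y=1.0, bias=0.0):
    """Fuse two same-sized feature maps with an element-wise sigmoid gate.

    The gate g = sigmoid(w_x * x + w_y * y + bias) weights x, and (1 - g) weights y.
    The scalar parameters are hypothetical stand-ins for learned weights.
    """
    fused = []
    for x_row, y_row in zip(x, y):
        out_row = []
        for xv, yv in zip(x_row, y_row):
            g = 1.0 / (1.0 + math.exp(-(w_x * xv + w_y * yv + bias)))
            out_row.append(g * xv + (1.0 - g) * yv)
        fused.append(out_row)
    return fused
```

In a trained network the gate parameters would be learned and the inputs would be multi-channel feature maps from the HSI and LiDAR branches; the scalar version here only shows how a gate can weight one modality against the other at each position.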

Original language: English
Article number: 5501905
Journal: IEEE Geoscience and Remote Sensing Letters
Volume: 23
State: Published - 2026
Externally published: Yes

Keywords

  • Classification
  • hyperspectral image (HSI)
  • light detection and ranging (LiDAR) data
  • multimodal
  • wavelet transform
