Abstract
Highlights
What are the main findings?
- This study introduces the Hybrid Attention Fusion Network (HAFNet), a unified framework that integrates content, scale, and frequency-domain adaptivity to address the insufficient feature adaptivity of existing deep learning-based pansharpening methods.
- Comprehensive evaluations on the WorldView-3, GF-2, and QuickBird datasets demonstrate that HAFNet achieves superior performance in balancing spatial detail enhancement with spectral preservation.
What is the implication of the main finding?
- The study validates that a coordinated hybrid attention strategy effectively resolves the fundamental spatial-spectral trade-off in remote sensing image fusion.
- HAFNet establishes a new architectural paradigm for adaptive feature learning, offering broad applicability to other remote sensing tasks that require multi-dimensional feature integration, such as image fusion and super-resolution.

Deep learning–based pansharpening methods for remote sensing have advanced rapidly in recent years. However, current methods still face three limitations that directly affect reconstruction quality. First, content adaptivity is often implemented as an isolated step, which prevents effective interaction across scales and feature domains. Second, dynamic multi-scale mechanisms remain constrained, since their scale selection is usually guided by global statistics and ignores regional heterogeneity. Third, frequency and spatial cues are commonly fused in a static manner, leading to an imbalance between global structural enhancement and local texture preservation. To address these issues, we design three complementary modules. The Adaptive Convolution Unit (ACU) generates content-aware kernels through local feature clustering, thereby achieving fine-grained adaptation to diverse ground structures. The Multi-Scale Receptive Field Selection Unit (MSRFU) provides flexible scale modeling by selecting informative branches at varying receptive fields. The Frequency–Spatial Attention Unit (FSAU) dynamically fuses spatial representations with frequency information, which strengthens detail reconstruction while minimizing spectral distortion. We integrate these modules in the Hybrid Attention Fusion Network (HAFNet), which employs the Hybrid Attention-Driven Residual Block (HARB) as its fundamental unit to dynamically combine the three specialized components. This design enables dynamic content adaptivity, multi-scale responsiveness, and cross-domain feature fusion within a unified framework. Experiments on public benchmarks confirm the effectiveness of each component and demonstrate HAFNet’s state-of-the-art performance.
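The abstract contrasts scale selection driven by global statistics with selection that responds to regional heterogeneity. The following is a minimal 1-D sketch of that idea only; it is not the MSRFU or any other part of HAFNet. The function names (`box_filter`, `local_variance`, `select_scales`), the box-filter branches, and the variance-based gating rule are all assumptions chosen for illustration: each position blends branches of different receptive field using softmax weights computed from its own neighbourhood, so flat regions and textured regions receive different weights.

```python
import math


def box_filter(signal, radius):
    """Mean filter; a stand-in for one branch with a given receptive field."""
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - radius): i + radius + 1]
        out.append(sum(window) / len(window))
    return out


def local_variance(signal, radius):
    """Variance of the neighbourhood around each position (a regional statistic)."""
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - radius): i + radius + 1]
        mean = sum(window) / len(window)
        out.append(sum((v - mean) ** 2 for v in window) / len(window))
    return out


def softmax(scores):
    """Numerically stable softmax over a short list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_scales(signal, radii=(0, 1, 2), sharpness=4.0):
    """Blend multi-scale branches with per-position weights.

    Hypothetical gating rule (an assumption, not from the paper): the higher
    the local variance, the more a large receptive field is penalised, so
    textured positions favour the smallest branch.
    """
    branches = [box_filter(signal, r) for r in radii]
    variance = local_variance(signal, max(radii))
    fused, weights = [], []
    for i in range(len(signal)):
        scores = [-sharpness * variance[i] * r for r in radii]
        w = softmax(scores)
        weights.append(w)
        fused.append(sum(wk * branch[i] for wk, branch in zip(w, branches)))
    return fused, weights


if __name__ == "__main__":
    # Flat region followed by a rapidly alternating (textured) region.
    signal = [1.0] * 6 + [0.0, 4.0, 0.0, 4.0, 0.0, 4.0]
    fused, weights = select_scales(signal)
    for value, w in zip(fused, weights):
        print(round(value, 3), [round(x, 3) for x in w])
```

In the flat part of the demo signal the three branches receive equal weight, while in the alternating part the weight concentrates on the smallest receptive field. A single set of weights derived from a global statistic could not make both choices at once, which is the limitation the abstract attributes to existing dynamic multi-scale mechanisms.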
| Original language | English |
|---|---|
| Article number | 526 |
| Journal | Remote Sensing |
| Volume | 18 |
| Issue number | 3 |
| DOIs | |
| State | Published - Feb 2026 |
Keywords
- adaptive convolution
- frequency-spatial attention
- hybrid attention
- multi-scale feature fusion
- pansharpening
- remote sensing
Title: HAFNet: Hybrid Attention Fusion Network for Remote Sensing Pansharpening