Abstract
Multimodal data can improve the accuracy and robustness of conventional RGB semantic segmentation. However, redundant information shared across modalities hinders the mining of each modality's complementary information, and misalignment between modalities aggravates this effect. In this paper, we propose a complementary information mining network (CIMNet) for RGB-Thermal (RGB-T) semantic segmentation. We jointly consider the link between the difficulty of mining information amid redundancy and the misalignment between modalities. Through mutual information minimization and adaptive updating of the modality bias, we achieve more accurate and robust segmentation in complex environments. Specifically, we introduce a complementary information promotion and amplification (CIPA) module, based on mutual information minimization and a channel attention mechanism, that prevents the multi-modality network from focusing on redundant information and amplifies informative cross-modality features. We then design a spatial-channel sequential feature rectification (SCSFR) module with adaptive offset modeling to calibrate misaligned modality features. Extensive experiments on public datasets demonstrate that CIMNet outperforms other state-of-the-art (SOTA) methods in both objective metrics and subjective visual comparisons.
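To make the quantity behind "mutual information minimization" concrete, the following is a minimal sketch of a plug-in (histogram) estimate of I(X;Y) on toy discrete data. It is not the estimator used in CIMNet, which the abstract does not specify; the function name `mutual_information` and the toy sequences `rgb`, `thermal_redundant`, and `thermal_complementary` are assumptions made purely for illustration.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits for two equal-length
    sequences of discrete symbols (illustrative only)."""
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))   # counts of (x, y) pairs
    px = Counter(xs)               # marginal counts of x
    py = Counter(ys)               # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        p_xy = c / n
        # p(x,y) / (p(x) p(y)) written with raw counts
        mi += p_xy * math.log2(c * n / (px[x] * py[y]))
    return mi


# Hypothetical quantized feature responses (assumed toy data).
rgb = [0, 0, 1, 1, 0, 0, 1, 1]
thermal_redundant = list(rgb)                      # duplicates the RGB signal
thermal_complementary = [0, 1, 0, 1, 0, 1, 0, 1]   # statistically independent of rgb

mi_redundant = mutual_information(rgb, thermal_redundant)
mi_complementary = mutual_information(rgb, thermal_complementary)
print(mi_redundant, mi_complementary)
```

In this toy setting the duplicated thermal signal shares one full bit with the RGB signal, while the independent one shares none. Driving such a quantity down between modality features is the intuition behind discouraging a network from encoding the same content twice; a practical network would need a differentiable estimator or bound on continuous features rather than this counting scheme.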
| Original language | English |
|---|---|
| Journal | IEEE Transactions on Geoscience and Remote Sensing |
| DOIs | |
| State | Accepted/In press - 2026 |
| Externally published | Yes |
Keywords
- RGB-T semantic segmentation
- cross-modality
- feature rectification
- mutual information minimization
Paper title: Complementary Information Mining for Redundancy and Weakly Aligned RGB-T Semantic Segmentation