
Cross-Domain Few-Shot Hyperspectral Image Classification With Cross-Modal Alignment and Supervised Contrastive Learning

  • Zhaokui Li*
  • Chenyang Zhang
  • Yan Wang
  • Wei Li
  • Qian Du
  • Zhuoqun Fang
  • Yushi Chen
  • *Corresponding author for this work
  • Shenyang Aerospace University
  • Beijing Institute of Technology
  • Mississippi State University
  • School of Electronics and Information Engineering, Harbin Institute of Technology

Research output: Contribution to journal › Article › peer-review

Abstract

Recently, metric-based few-shot learning (FSL) methods have achieved good performance in hyperspectral image (HSI) classification. However, existing methods suffer from two problems. First, over-reliance on image-modality information leads to inaccurate prototype representations, where a prototype is the centroid of each class in the dataset. Second, the impact of redundant and noisy pixels on model discriminability is rarely considered. These problems leave the model insufficiently discriminative in the target domain. To address them, we propose a cross-domain few-shot HSI classification framework with cross-modal alignment and supervised contrastive learning (CDFS-CASCL). It is well known that human visual learning benefits greatly from input in multiple modalities, such as vision, language, and video. Inspired by the way humans abstract image class concepts into language and thereby grasp the essence of classes, we perform cross-modal alignment (CA) between similar image and text prototypes, using abstract text semantics to guide the model toward semantically relevant image features that generalize well, which improves the accuracy of the image prototype representations. In addition, through supervised contrastive learning (SCL) based on a neighborhood pixel mask in the target domain, enhanced sample features of the same class are pulled closer together, while those of different classes are pushed further apart. This enables the model to learn mask-robust discriminative feature representations, suppresses the negative impact of redundant and noisy pixels, and improves the model's discriminability. The experimental results demonstrate the superiority of the proposed CDFS-CASCL. The code is available at https://github.com/Li-ZK/CDFS-CASCL-2024.
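The two generic building blocks named in the abstract, class prototypes as centroids and a supervised contrastive objective, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (their code is at the repository linked above). The function names, the Euclidean nearest-prototype rule, the cosine-similarity form of the loss, and the `temperature` value are all assumptions made for illustration, and the sketch omits the text prototypes, the cross-modal alignment, and the neighborhood pixel mask.

```python
import numpy as np


def class_prototypes(features, labels):
    """Return (classes, prototypes); each prototype is the mean feature of one class."""
    classes = np.unique(labels)
    protos = np.stack([features[labels == c].mean(axis=0) for c in classes])
    return classes, protos


def nearest_prototype(queries, classes, protos):
    """Assign each query to the class with the closest prototype (Euclidean, assumed)."""
    # Pairwise squared distances, shape (n_queries, n_classes).
    d2 = ((queries[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
    return classes[d2.argmin(axis=1)]


def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Generic supervised contrastive loss over one batch (illustrative form).

    Same-label samples act as positives for each anchor; all other
    non-self samples appear in the softmax denominator.
    Assumes at least one class has two or more samples in the batch.
    """
    z = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    n = len(labels)
    not_self = ~np.eye(n, dtype=bool)
    # Shift by the row maximum for numerical stability before exponentiating.
    sim = sim - sim.max(axis=1, keepdims=True)
    exp_sim = np.exp(sim) * not_self
    log_prob = sim - np.log(exp_sim.sum(axis=1, keepdims=True))
    positives = (labels[:, None] == labels[None, :]) & not_self
    n_pos = positives.sum(axis=1)
    valid = n_pos > 0  # anchors with no positive are skipped
    mean_log_prob_pos = (log_prob * positives).sum(axis=1)[valid] / n_pos[valid]
    return float(-mean_log_prob_pos.mean())


# Toy example: four 2-D features forming two well-separated classes.
feats = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
labs = np.array([0, 0, 1, 1])
classes, protos = class_prototypes(feats, labs)
preds = nearest_prototype(np.array([[0.8, 0.2], [0.2, 0.8]]), classes, protos)
loss = supervised_contrastive_loss(feats, labs)
```

In this toy setting the loss is small when the labels match the geometric clusters and grows when they do not, which is the behavior a contrastive objective of this kind is meant to reward.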

Original language: English
Article number: 5519319
Pages (from-to): 1-19
Number of pages: 19
Journal: IEEE Transactions on Geoscience and Remote Sensing
Volume: 62
State: Published - 2024
Externally published: Yes

Keywords

  • Cross-modal alignment (CA)
  • few-shot learning (FSL)
  • hyperspectral image (HSI) classification
  • supervised contrastive learning (SCL)
