SCL-SOD: A hybrid self-supervised contrastive learning framework for salient object detection

  • Zhengda Wu
  • Jinbao Wang*
  • Yingchun Cui
  • Jinghua Zhu

* Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Salient Object Detection (SOD) aims to identify the most visually distinctive objects in images, with broad applications in object detection, image classification, and image synthesis. Most existing SOD methods adopt supervised learning frameworks that rely heavily on labeled images as supervision signals. However, these methods often underperform in complex scenarios where camouflaged objects and backgrounds exhibit high similarity, primarily due to two limitations: (1) insufficient supervision from labels fails to capture holistic salient regions, and (2) task-driven supervised learning focuses too heavily on target objects while neglecting contextual receptive fields, resulting in elevated false-positive rates. To address these challenges, we propose a novel hybrid model, SCL-SOD, that integrates self-supervised contrastive representation learning with supervised learning in an encoder-decoder architecture with a T2T-ViT backbone. Specifically, our model has two key components: (1) an Image-wise Contrastive Learning Encoder (ICLE) that enhances global feature discriminability by learning invariant representations across different augmented views, and (2) a Pixel-wise Contrastive Learning Decoder (PCLD) that refines local prediction accuracy by enforcing feature consistency at the pixel level. The final optimization objective combines the weighted supervised detection loss and the self-supervised contrastive loss. Extensive experiments on six standard RGB benchmarks across five evaluation metrics demonstrate that our proposed SCL-SOD model outperforms 11 state-of-the-art SOD methods, particularly in challenging scenarios with cluttered backgrounds.
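
The abstract does not give the exact form of either loss term, so the following is only an illustrative sketch of how a weighted supervised loss and an image-wise contrastive loss might be combined. It assumes binary cross-entropy for the detection term and an InfoNCE-style loss over two augmented views for the contrastive term. Neither choice is confirmed by the source, and the function names and the weights `w_sup` and `w_con` are hypothetical. The pixel-wise term of the PCLD is not sketched here.

```python
import math

def binary_cross_entropy(pred, target, eps=1e-7):
    """Mean BCE between predicted saliency probabilities and binary labels.
    Assumption: BCE stands in for the paper's supervised detection loss."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(pred)

def _normalize(v):
    """Scale a vector to unit L2 norm (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def info_nce(view_a, view_b, temperature=0.1):
    """InfoNCE-style loss over a batch of embeddings.
    Row i of view_a and row i of view_b are two augmented views of the
    same image (positive pair); all other rows of view_b act as negatives.
    Assumption: this stands in for the image-wise contrastive loss."""
    a = [_normalize(v) for v in view_a]
    b = [_normalize(v) for v in view_b]
    loss = 0.0
    for i, ai in enumerate(a):
        # cosine similarities scaled by the temperature
        logits = [sum(x * y for x, y in zip(ai, bj)) / temperature for bj in b]
        m = max(logits)  # subtract the max for a stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        loss += log_denom - logits[i]
    return loss / len(a)

def hybrid_loss(pred, target, view_a, view_b, w_sup=1.0, w_con=0.5):
    """Weighted sum of the supervised and contrastive terms.
    The weights are hypothetical placeholders, not values from the paper."""
    return (w_sup * binary_cross_entropy(pred, target)
            + w_con * info_nce(view_a, view_b))
```

In this sketch the contrastive term is small when the two views of each image have similar embeddings and large when an embedding is closer to a different image's view, which is one way to encourage the invariance across augmented views that the abstract describes.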

Original language: English
Article number: 132889
Journal: Neurocomputing
Volume: 674
State: Published - 14 Apr 2026
Externally published: Yes

Keywords

  • Contrastive learning
  • Salient object detection
  • Self-supervised learning
  • Transformer
