
Weakly Supervised Salient Object Detection with Text Supervision

  • Shenzhen University
  • School of Computer Science and Technology, Harbin Institute of Technology
  • Faculty of Computing, Harbin Institute of Technology
  • Peng Cheng Laboratory
  • Nanjing University of Science and Technology
  • The Chinese University of Hong Kong, Shenzhen

Research output: Contribution to journal › Article › peer-review

Abstract

Weakly supervised salient object detection using image-category supervision offers a cost-effective alternative to dense annotations, yet suffers from significant performance degradation. This is primarily attributed to the limitations of existing pseudo-label generation methods, which tend to either under- or over-activate object regions and indiscriminately label all non-activated pixels as background, introducing considerable label noise. Furthermore, these methods are limited in their ability to capture objects outside the pre-trained category set. To overcome these challenges, we propose a CLIP-based pseudo-label generation approach that exploits text prompts to jointly activate generic background and salient objects, breaking the dependency on specific categories. However, we find that this paradigm faces three challenges: uncertainty about the optimal prompt, background redundancy, and object-background conflict. To mitigate these, we propose three key modules. First, spatial distribution-guided prompt selection evaluates the spatial distribution of activation regions to identify the optimal prompt. Second, center and scale prior-guided activation refinement integrates self-attention and superpixel cues to suppress background noise. Third, learning feedback-guided pseudo-label update learns saliency knowledge from other pseudo-labels to resolve conflicting regions and iteratively refine supervision. Extensive experiments demonstrate that our method surpasses both previous weakly supervised methods that use image-category supervision and unsupervised approaches.
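
As an illustration of the first module only, the following is a minimal sketch of how candidate prompts might be ranked by the spatial distribution of their activation maps. The abstract does not state the actual criterion, so the scoring rule here (activation-weighted spatial variance, lower is better), the function names `spatial_spread` and `select_prompt`, and the example prompt strings are all assumptions made for illustration, not the authors' method. Activation maps are taken as given; producing them with CLIP is outside the scope of the sketch.

```python
from typing import Dict, List

# A 2D activation map as nested lists of non-negative floats (assumed input format).
Map = List[List[float]]


def spatial_spread(act: Map) -> float:
    """Activation-weighted variance of pixel coordinates, normalised by image size.

    Hypothetical scoring rule: a compact activation region yields a small value,
    a scattered one yields a large value.
    """
    h, w = len(act), len(act[0])
    total = sum(sum(row) for row in act)
    if total <= 0:
        return float("inf")  # no activation at all: treat as the worst case

    # Activation-weighted centroid (row, column).
    cy = sum(y * v for y, row in enumerate(act) for v in row) / total
    cx = sum(x * v for row in act for x, v in enumerate(row)) / total

    # Weighted mean squared distance to the centroid, in normalised coordinates.
    return sum(
        v * (((y - cy) / h) ** 2 + ((x - cx) / w) ** 2)
        for y, row in enumerate(act)
        for x, v in enumerate(row)
    ) / total


def select_prompt(activations: Dict[str, Map]) -> str:
    """Return the prompt whose activation map is most spatially compact."""
    return min(activations, key=lambda p: spatial_spread(activations[p]))


if __name__ == "__main__":
    compact = [
        [0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0],
    ]
    scattered = [
        [1, 0, 0, 1],
        [0, 0, 0, 0],
        [0, 0, 0, 0],
        [1, 0, 0, 1],
    ]
    # Prompt strings are placeholders, not prompts taken from the paper.
    maps = {"a photo of a salient object": compact, "a photo of an object": scattered}
    print(select_prompt(maps))
```

Under this assumed criterion, the prompt with the centrally clustered activation is preferred over the one whose activation is split across the image corners. A criterion closer to the one described in the abstract would presumably also account for the jointly activated background.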

Original language: English
Article number: 74
Journal: International Journal of Computer Vision
Volume: 134
Issue number: 2
State: Published - Feb 2026

Keywords

  • Language-vision large model
  • Salient object detection
  • Unsupervised learning
  • Weakly supervised learning
