
Feature Shrinkage Pyramid for Camouflaged Object Detection with Transformers

  • Zhou Huang
  • Hang Dai
  • Tian Zhu Xiang*
  • Shuo Wang
  • Huai Xin Chen
  • Jie Qin
  • Huan Xiong
  • *Corresponding author for this work
  • Sichuan Changhong Electric Co.,Ltd.
  • University of Electronic Science and Technology of China
  • University of Glasgow
  • G42
  • ETH Zurich
  • Nanjing University of Aeronautics and Astronautics
  • Mohamed Bin Zayed University of Artificial Intelligence

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Vision transformers have recently shown strong global context modeling capabilities in camouflaged object detection. However, they suffer from two major limitations: less effective locality modeling and insufficient feature aggregation in decoders, which are not conducive to camouflaged object detection that explores subtle cues from indistinguishable backgrounds. To address these issues, in this paper, we propose a novel transformer-based Feature Shrinkage Pyramid Network (FSPNet), which aims to hierarchically decode locality-enhanced neighboring transformer features through progressive shrinking for camouflaged object detection. Specifically, we propose a non-local token enhancement module (NL-TEM) that employs the non-local mechanism to interact neighboring tokens and explore graph-based high-order relations within tokens to enhance local representations of transformers. Moreover, we design a feature shrinkage decoder (FSD) with adjacent interaction modules (AIM), which progressively aggregates adjacent transformer features through a layer-by-layer shrinkage pyramid to accumulate imperceptible but effective cues as much as possible for object information decoding. Extensive quantitative and qualitative experiments demonstrate that the proposed model significantly outperforms the existing 24 competitors on three challenging COD benchmark datasets under six widely-used evaluation metrics. Our code is publicly available at https://github.com/ZhouHuang23/FSPNet.
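
The layer-by-layer shrinkage described in the abstract can be illustrated with a small, self-contained sketch. It is not the FSPNet implementation: the function names are hypothetical, the merge step is a plain element-wise mean standing in for an adjacent interaction module, and the pairing schedule (non-overlapping neighbors, with an unpaired last feature carried over) is an assumption made only to show how a stack of adjacent features can shrink to a single one.

```python
from typing import Callable, List, Sequence

Feature = List[float]


def merge_adjacent(a: Feature, b: Feature) -> Feature:
    """Hypothetical stand-in for an adjacent interaction module:
    fuses two neighboring features with an element-wise mean."""
    return [(x + y) / 2.0 for x, y in zip(a, b)]


def shrink_once(
    features: Sequence[Feature],
    merge: Callable[[Feature, Feature], Feature] = merge_adjacent,
) -> List[Feature]:
    """Fuse non-overlapping neighboring pairs (assumed schedule).
    An unpaired last feature is carried over unchanged."""
    out = [
        merge(features[i], features[i + 1])
        for i in range(0, len(features) - 1, 2)
    ]
    if len(features) % 2 == 1:
        out.append(list(features[-1]))
    return out


def shrinkage_pyramid(
    features: Sequence[Feature],
    merge: Callable[[Feature, Feature], Feature] = merge_adjacent,
) -> List[List[Feature]]:
    """Return every pyramid level, from the input stack down to one feature."""
    if not features:
        raise ValueError("at least one feature is required")
    levels = [[list(f) for f in features]]
    while len(levels[-1]) > 1:
        levels.append(shrink_once(levels[-1], merge))
    return levels


# Twelve toy "layer features" of length 4; layer i is filled with the value i.
layers = [[float(i)] * 4 for i in range(12)]
levels = shrinkage_pyramid(layers)
sizes = [len(level) for level in levels]  # [12, 6, 3, 2, 1]
```

Passing a different `merge` callable swaps the fusion rule without touching the pyramid loop, which is the only point the sketch is meant to make; the real decoder is available in the repository linked above.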

Original language: English
Title of host publication: Proceedings - 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2023
Publisher: IEEE Computer Society
Pages: 5557-5566
Number of pages: 10
ISBN (Electronic): 9798350301298
DOIs
State: Published - 2023
Externally published: Yes
Event: 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2023 - Vancouver, Canada
Duration: 18 Jun 2023 to 22 Jun 2023

Publication series

Name: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Volume: 2023-June
ISSN (Print): 1063-6919

Conference

Conference: 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2023
Country/Territory: Canada
City: Vancouver
Period: 18/06/23 to 22/06/23

Keywords

  • Segmentation, grouping and shape analysis
