
Look-Into-Object: Self-Supervised Structure Modeling for Object Recognition

  • Mohan Zhou
  • Yalong Bai
  • Wei Zhang
  • Tiejun Zhao
  • Tao Mei
  • Harbin Institute of Technology
  • JD AI Research

Research output: Contribution to journal › Conference article › peer-review

Abstract

Most object recognition approaches predominantly focus on learning discriminative visual patterns, while overlooking the holistic object structure. Though important, structure modeling usually requires significant manual annotation and is therefore labor-intensive. In this paper, we propose to "look into object" (explicitly yet intrinsically model the object structure) by incorporating self-supervision into the traditional framework. We show that the recognition backbone can be substantially enhanced for more robust representation learning, without any extra annotation cost or loss of inference speed. Specifically, we first propose an object-extent learning module for localizing the object according to the visual patterns shared among instances of the same category. We then design a spatial context learning module for modeling the internal structure of the object by predicting relative positions within the extent. These two modules can be easily plugged into any backbone network during training and detached at inference time. Extensive experiments show that our look-into-object approach (LIO) achieves large performance gains on a number of benchmarks, including generic object recognition (ImageNet) and fine-grained object recognition tasks (CUB, Cars, Aircraft). We also show that this learning paradigm is highly generalizable to other tasks such as object detection and segmentation (MS COCO). Project page: https://github.com/JDAI-CV/LIO.
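
To make the idea of "predicting the relative positions within the extent" more concrete, the following is a minimal sketch of how self-supervised position targets could be derived from a feature grid. It is an illustration only, not the authors' implementation: the function name `relative_position_targets`, the binary extent mask, the choice of reference cell, and the normalised (row, column) offset encoding are all assumptions made for this example. The abstract does not specify how positions are encoded.

```python
from typing import Dict, List, Tuple

Cell = Tuple[int, int]


def relative_position_targets(
    extent_mask: List[List[int]],
    reference: Cell,
) -> Dict[Cell, Tuple[float, float]]:
    """Build regression targets for cells inside an object extent.

    Hypothetical helper: for every grid cell where extent_mask is non-zero,
    return its (row, column) offset from the reference cell, normalised by
    the grid height and width. No manual annotation is needed; the targets
    come from the grid layout itself.
    """
    n_rows = len(extent_mask)
    n_cols = len(extent_mask[0])
    ref_r, ref_c = reference

    targets: Dict[Cell, Tuple[float, float]] = {}
    for r in range(n_rows):
        for c in range(n_cols):
            if extent_mask[r][c]:
                # Offsets are relative to the reference cell, so the
                # reference itself always maps to (0.0, 0.0).
                targets[(r, c)] = ((r - ref_r) / n_rows, (c - ref_c) / n_cols)
    return targets


if __name__ == "__main__":
    # Assumed 2x2 grid in which three cells belong to the object extent.
    mask = [[0, 1],
            [1, 1]]
    for cell, offset in sorted(relative_position_targets(mask, (1, 1)).items()):
        print(cell, offset)
```

In a training setup along the lines the abstract describes, an auxiliary head would be trained to regress such targets from backbone features and would then be dropped at inference, so the deployed backbone is unchanged.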

Original language: English
Article number: 9156918
Pages (from-to): 11771-11780
Number of pages: 10
Journal: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
DOIs
State: Published - 2020
Event: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020 - Virtual, Online, United States
Duration: 14 Jun 2020 → 19 Jun 2020
