Deformable Template Network (DTN) for Object Detection

  • Shuai Wu
  • Yong Xu*
  • Bob Zhang
  • Jian Yang
  • David Zhang

*Corresponding author for this work

  • Harbin Institute of Technology Shenzhen
  • University of Macau
  • Nanjing University of Science and Technology
  • The Chinese University of Hong Kong, Shenzhen

Research output: Contribution to journal › Article › peer-review

Abstract

Objects often have different appearances because of viewpoint changes or part deformation. Modeling these variations well remains a major challenge for object detection. In this paper, we propose a novel Deformable Template Network (DTN), which exploits the pictorial structure to model possible variations of an object. DTN represents an object through a generated template that is allowed to deform. It has two key modules: the template generating module and the part matching module. The template generating module produces a template for a given object, which defines the anchor positions of the $k \times k$ parts. Based on this template, the part matching module performs part alignment around the anchor positions. For each part, the matching process makes a trade-off between maximizing the detection score and minimizing the deformation cost relative to the anchor position. Moreover, DTN is a fully convolutional network, which keeps it competitive in detection efficiency. We evaluate DTN on both the PASCAL VOC and MSCOCO datasets, achieving state-of-the-art results: an accuracy of 82.7% on PASCAL VOC and 44.9% on MSCOCO.
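The per-part trade-off described in the abstract can be illustrated with a small sketch. This is not the paper's implementation: the function name `match_part`, the square search window (`radius`), the quadratic deformation cost, and the weight `weight` are all assumptions chosen for illustration, in the spirit of classical pictorial-structure models. The abstract does not specify the actual cost function or how it is learned.

```python
import numpy as np

def match_part(score_map, anchor, radius=2, weight=0.1):
    """Hypothetical part matching around one anchor position.

    Searches a (2*radius+1)^2 window around `anchor` and returns the
    position maximizing: detection score - weight * squared displacement.
    The quadratic cost is an assumption, not taken from the paper.
    """
    h, w = score_map.shape
    ay, ax = anchor
    best_value, best_pos = -np.inf, anchor
    for y in range(max(0, ay - radius), min(h, ay + radius + 1)):
        for x in range(max(0, ax - radius), min(w, ax + radius + 1)):
            # Deformation cost grows with distance from the anchor.
            cost = weight * ((y - ay) ** 2 + (x - ax) ** 2)
            value = score_map[y, x] - cost
            if value > best_value:
                best_value, best_pos = value, (y, x)
    return best_pos, float(best_value)

# Toy score map with a single strong response one pixel right of the anchor.
scores = np.zeros((9, 9))
scores[4, 5] = 1.0

# A small weight lets the part move to the strong response.
loose_pos, loose_val = match_part(scores, anchor=(4, 4), weight=0.1)

# A large weight keeps the part at its anchor.
stiff_pos, stiff_val = match_part(scores, anchor=(4, 4), weight=2.0)
```

With the small weight the part shifts to the high-scoring location, while the large weight makes the displacement too costly and the part stays at its anchor. In a full model this search would be repeated for each of the $k \times k$ anchors defined by the template.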

Original language: English
Pages (from-to): 2058-2068
Number of pages: 11
Journal: IEEE Transactions on Multimedia
Volume: 24
DOIs
State: Published - 2022
Externally published: Yes

Keywords

  • deformable template
  • deformation cost
  • object detection
  • part matching
