
Object Tracking via Spatial-Temporal Memory Network

  • School of Computer Science and Technology, Harbin Institute of Technology
  • Peng Cheng Laboratory
  • University of Science and Technology of China

Research output: Contribution to journal › Article › peer-review

Abstract

Temporal and spatial contexts, characterizing target appearance variations and target-background differences, respectively, are crucial for improving the online adaptive ability and instance-level discriminative ability of object tracking. However, most existing trackers focus on either the temporal context or the spatial context during tracking and have not exploited these contexts simultaneously and effectively. In this paper, we propose a Spatial-TEmporal Memory (STEM) network to exploit these contexts jointly for object tracking. Specifically, we develop a key-value structured memory model equipped with a key-value index-based memory reading mechanism to model the spatial and temporal contexts simultaneously. To update the memory with new target states and ensure the diversity of the memory, we introduce a similarity-aware memory update scheme. In addition, we construct an entropy-guided ensemble strategy to fuse the prediction models based on these two contexts, such that these two contexts can be exploited to estimate the target state jointly. Extensive experimental results on eight challenging datasets, including OTB2015, TC128, UAV123, VOT2018, LaSOT, TrackingNet, GOT-10k, and OxUvA, demonstrate that the proposed method performs favorably against state-of-the-art trackers.
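
The three mechanisms named in the abstract (key-value memory reading, similarity-aware memory update, and entropy-guided fusion) can be illustrated with a toy sketch. This is not the STEM implementation: the abstract gives no formulas, so the cosine similarity, the softmax read, the `sim_threshold` parameter, the oldest-first eviction, and the `exp(-entropy)` fusion weights are all assumptions chosen for illustration, as are the names `KeyValueMemory` and `entropy_guided_fusion`.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0


class KeyValueMemory:
    """Hypothetical key-value memory; not the structure used in the paper."""

    def __init__(self, capacity, sim_threshold=0.9):
        self.capacity = capacity            # maximum number of slots
        self.sim_threshold = sim_threshold  # assumed near-duplicate cutoff
        self.keys = []
        self.values = []

    def read(self, query):
        """Soft read: softmax over key similarities weights the stored values."""
        if not self.keys:
            return None
        sims = [cosine(query, k) for k in self.keys]
        peak = max(sims)
        exps = [math.exp(s - peak) for s in sims]  # shifted for numerical stability
        total = sum(exps)
        weights = [e / total for e in exps]
        dim = len(self.values[0])
        return [sum(w * v[i] for w, v in zip(weights, self.values))
                for i in range(dim)]

    def update(self, key, value):
        """Similarity-aware write (assumed rule): overwrite a near-duplicate
        slot instead of appending, so the stored entries stay diverse."""
        if self.keys:
            sims = [cosine(key, k) for k in self.keys]
            best = max(range(len(sims)), key=sims.__getitem__)
            if sims[best] >= self.sim_threshold:
                self.keys[best] = key
                self.values[best] = value
                return
        if len(self.keys) >= self.capacity:
            # Memory full and the entry is novel: evict the oldest slot (assumption).
            self.keys.pop(0)
            self.values.pop(0)
        self.keys.append(key)
        self.values.append(value)


def entropy(p):
    """Shannon entropy (natural log) of a normalized distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)


def entropy_guided_fusion(p_a, p_b):
    """Fuse two normalized response maps, giving more weight to the
    lower-entropy (more peaked) one. The exp(-entropy) weighting is assumed."""
    w_a = math.exp(-entropy(p_a))
    w_b = math.exp(-entropy(p_b))
    total = w_a + w_b
    return [(w_a * a + w_b * b) / total for a, b in zip(p_a, p_b)]
```

In this sketch a confident (peaked) prediction dominates the fused result, and the memory grows only when a new key differs enough from every stored key; the paper itself should be consulted for the actual reading, update, and ensemble rules.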

Original language: English
Pages (from-to): 2976-2989
Number of pages: 14
Journal: IEEE Transactions on Circuits and Systems for Video Technology
Volume: 32
Issue number: 5
State: Published - 1 May 2022
Externally published: Yes

Keywords

  • Memory network
  • Object tracking
  • Spatial context
  • Temporal context
