ODTrack: Online Dense Temporal Token Learning for Visual Tracking

  • Yaozong Zheng
  • Bineng Zhong*
  • Qihua Liang
  • Zhiyi Mo
  • Shengping Zhang
  • Xianxian Li
  • *Corresponding author for this work
  • Guangxi Normal University
  • Wuzhou University

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

Online contextual reasoning and association across consecutive video frames are critical to perceive instances in visual tracking. However, most current top-performing trackers persistently lean on sparse temporal relationships between reference and search frames via an offline mode. Consequently, they can only interact independently within each image-pair and establish limited temporal correlations. To alleviate the above problem, we propose a simple, flexible and effective video-level tracking pipeline, named ODTrack, which densely associates the contextual relationships of video frames in an online token propagation manner. ODTrack receives video frames of arbitrary length to capture the spatio-temporal trajectory relationships of an instance, and compresses the discrimination features (localization information) of a target into a token sequence to achieve frame-to-frame association. This new solution brings the following benefits: 1) the purified token sequences can serve as prompts for the inference in the next video frame, whereby past information is leveraged to guide future inference; 2) the complex online update strategies are effectively avoided by the iterative propagation of token sequences, and thus we can achieve more efficient model representation and computation. ODTrack achieves a new SOTA performance on seven benchmarks, while running at real-time speed. Code and models are available at https://github.com/GXNU-ZhongLab/ODTrack.

Original language: English
Title of host publication: Technical Tracks 14
Editors: Michael Wooldridge, Jennifer Dy, Sriraam Natarajan
Publisher: Association for the Advancement of Artificial Intelligence
Pages: 7588-7596
Number of pages: 9
Edition: 7
ISBN (Electronic): 1577358872, 9781577358879
DOIs
State: Published - 25 Mar 2024
Event: 38th AAAI Conference on Artificial Intelligence, AAAI 2024 - Vancouver, Canada
Duration: 20 Feb 2024 to 27 Feb 2024

Publication series

Name: Proceedings of the AAAI Conference on Artificial Intelligence
Number: 7
Volume: 38
ISSN (Print): 2159-5399
ISSN (Electronic): 2374-3468

Conference

Conference: 38th AAAI Conference on Artificial Intelligence, AAAI 2024
Country/Territory: Canada
City: Vancouver
Period: 20/02/24 to 27/02/24
