Skip to main navigation Skip to search Skip to main content

A Novel Temporal Channel Enhancement and Contextual Excavation Network for Temporal Action Localization

  • Zan Gao
  • , Xinglei Cui
  • , Yibo Zhao*
  • , Tao Zhuo
  • , Weili Guan
  • , Meng Wang
  • *Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

The temporal action localization (TAL) task aims to locate and classify action instances in untrimmed videos. Most previous methods use classifiers and locators to act on the same feature; thus, the classification and localization processes are relatively independent. Therefore, if the classification results and localization results are fused, there will be a problem that the classification results are correct while the localization results are wrong, resulting in inaccurate final results, and vice versa. To solve this problem, we propose a novel temporal channel enhancement and contextual excavation network (TCN) for the TAL task, which generates robust classification and localization features and refines the final localization results. Specifically, a temporal channel enhancement module is designed to enhance the temporal and channel information of the feature sequence. Then, the temporal semantic contextual excavation module is developed to establish relationships between similar frames. Finally, the features with enhanced contextual information are transferred to a classifier. While executing the classification process, we obtain powerful classification features. Most importantly, with the robust classification features, the final localization features are produced by the refine localization module, which is applied to obtain the final localization results. Extensive experiments show that TCN can outperform all the SOTA methods on the THUMOS14 dataset, and achieves a comparable performance on the ActivityNet1.3 dataset. Compared with ActionFormer (ECCV 2022) and BREM (MM 2022) on the THUMOS14 dataset, the proposed TCN can achieve improvements of 1.8% and 5.0%, respectively.

Original languageEnglish
Title of host publicationMM 2023 - Proceedings of the 31st ACM International Conference on Multimedia
PublisherAssociation for Computing Machinery, Inc
Pages6724-6733
Number of pages10
ISBN (Electronic)9798400701085
DOIs
StatePublished - 27 Oct 2023
Externally publishedYes
Event31st ACM International Conference on Multimedia, MM 2023 - Ottawa, Canada
Duration: 29 Oct 20233 Nov 2023

Publication series

NameMM 2023 - Proceedings of the 31st ACM International Conference on Multimedia

Conference

Conference31st ACM International Conference on Multimedia, MM 2023
Country/TerritoryCanada
CityOttawa
Period29/10/233/11/23

Keywords

  • refine localization
  • temporal action localization
  • temporal channel enhancement
  • temporal semantic contextual

Fingerprint

Dive into the research topics of 'A Novel Temporal Channel Enhancement and Contextual Excavation Network for Temporal Action Localization'. Together they form a unique fingerprint.

Cite this