FineBadminton: A Multi-Level Dataset for Fine-Grained Badminton Video Understanding

  • Xusheng He
  • Wei Liu
  • Shanshan Ma*
  • Qian Liu
  • Chenghao Ma
  • Jianlong Wu*
  • *Corresponding author for this work
  • Harbin Institute of Technology Shenzhen
  • China Electronics Standardization Institute
  • Shandong University

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

Fine-grained analysis of complex and high-speed sports like badminton presents a significant challenge for Multimodal Large Language Models (MLLMs), despite their notable advancements in general video understanding. This difficulty arises primarily from the scarcity of datasets with sufficiently rich and domain-specific annotations. To bridge this gap, we introduce FineBadminton, a novel and large-scale dataset featuring a unique multi-level semantic annotation hierarchy (Foundational Actions, Tactical Semantics, and Decision Evaluation) for comprehensive badminton understanding. The construction of FineBadminton is powered by an innovative annotation pipeline that synergistically combines MLLM-generated proposals with human refinement. We also present FBBench, a challenging benchmark derived from FineBadminton, to rigorously evaluate MLLMs on nuanced spatio-temporal reasoning and tactical comprehension. Together, FineBadminton and FBBench provide a crucial ecosystem to catalyze research in fine-grained video understanding and advance the development of MLLMs in sports intelligence. Furthermore, we propose an optimized baseline approach incorporating Hit-Centric Keyframe Selection to focus on pivotal moments and Coordinate-Guided Condensation to distill salient visual information. The results on FBBench reveal that while current MLLMs still face significant challenges in deep sports video analysis, our proposed strategies nonetheless achieve substantial performance gains. The project homepage is available at https://finebadminton.github.io/FineBadminton/.

Original language: English
Title of host publication: MM 2025 - Proceedings of the 33rd ACM International Conference on Multimedia, Co-Located with MM 2025
Publisher: Association for Computing Machinery, Inc
Pages: 12776-12783
Number of pages: 8
ISBN (Electronic): 9798400720352
State: Published - 27 Oct 2025
Externally published: Yes
Event: 33rd ACM International Conference on Multimedia, MM 2025 - Dublin, Ireland
Duration: 27 Oct 2025 to 31 Oct 2025

Publication series

Name: MM 2025 - Proceedings of the 33rd ACM International Conference on Multimedia, Co-Located with MM 2025

Conference

Conference: 33rd ACM International Conference on Multimedia, MM 2025
Country/Territory: Ireland
City: Dublin
Period: 27/10/25 to 31/10/25

Keywords

  • badminton video dataset
  • multi-modal llms
  • sports understanding
