Abstract
Temporal sentence grounding in videos (TSGV) is a challenging task that aims to match text queries with semantically relevant segments in untrimmed videos. However, existing methods face limitations in modeling modality features, which constrains the expressive power of candidate moment features. To address this challenge, we propose a novel Enhanced Feature Interaction Network (EFIN) that effectively captures semantic information within each modality and aligns relationships between modalities. Additionally, EFIN enhances the fusion of information between candidate moments and modality features. Specifically, our model begins by extracting modality features to generate candidate moments as priors. Building upon these modality features, we introduce an enhanced feature encoder to extract semantic information within each modality, thereby improving intra-modality feature representation. Simultaneously, the encoder captures alignment relationships between modalities to optimize cross-modality feature representation, enhancing the overall modeling capacity of modality features. Moreover, we design an information fusion module to enrich the comprehension of modality information for candidate moments. Extensive experiments on four benchmark datasets demonstrate the superiority of our proposed EFIN model. Notably, EFIN achieves maximum performance improvements of approximately 1.67% and 1.91% across different evaluation metrics on the TACoS dataset.
| Field | Value |
|---|---|
| Original language | English |
| Journal | IEEE Transactions on Multimedia |
| State | Accepted/In press - 2026 |
| Externally published | Yes |
Keywords
- Enhanced Feature Encoder
- Enhanced Feature Interaction Network
- Information Fusion Module
- Temporal Sentence Grounding in Videos
Fingerprint
Dive into the research topics of 'EFIN: A Novel Enhanced Feature Interaction Network for Temporal Sentence Grounding in Videos'. Together they form a unique fingerprint.