
DoCIA: An Online Document-Level Context Incorporation Agent for Speech Translation

  • Xinglin Lyu
  • Wei Tang
  • Yuang Li
  • Xiaofeng Zhao
  • Ming Zhu
  • Junhui Li
  • Yunfei Lu
  • Min Zhang
  • Daimeng Wei
  • Hao Yang*
  • Min Zhang
  • *Corresponding author for this work
  • Huawei Translation Services Center
  • Soochow University
  • Huawei Consumer Business Group

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Document-level context is crucial for handling discourse challenges in text-to-text document-level machine translation (MT). Despite the increased discourse challenges introduced by noise from automatic speech recognition (ASR), the integration of document-level context in speech translation (ST) remains insufficiently explored. In this paper, we develop DoCIA, an online framework that enhances ST performance by incorporating document-level context. DoCIA decomposes the ST pipeline into four stages. Document-level context is integrated into the ASR refinement, MT, and MT refinement stages through auxiliary LLM (large language model)-based modules. Furthermore, DoCIA leverages document-level information in a multi-level manner while minimizing computational overhead. Additionally, a simple yet effective determination mechanism is introduced to prevent hallucinations from excessive refinement, ensuring the reliability of the final results. Experimental results show that DoCIA significantly outperforms traditional ST baselines in both sentence and discourse metrics across four LLMs, demonstrating its effectiveness in improving ST performance.

Original language: English
Title of host publication: Findings of the Association for Computational Linguistics
Subtitle of host publication: ACL 2025
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Publisher: Association for Computational Linguistics (ACL)
Pages: 14910-14924
Number of pages: 15
ISBN (Electronic): 9798891762565
State: Published - 2025
Externally published: Yes
Event: 63rd Annual Meeting of the Association for Computational Linguistics, ACL 2025 - Vienna, Austria
Duration: 27 Jul 2025 → 1 Aug 2025

Publication series

Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics
ISSN (Print): 0736-587X

Conference

Conference: 63rd Annual Meeting of the Association for Computational Linguistics, ACL 2025
Country/Territory: Austria
City: Vienna
Period: 27/07/25 → 1/08/25
