Skip to main navigation Skip to search Skip to main content

Modeling lexical cohesion for document-level machine translation

  • Deyi Xiong
  • , Guosheng Ben
  • , Min Zhang*
  • , Yajuan Lü
  • , Qun Liu
  • *Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

Lexical cohesion arises from a chain of lexical items that establish links between sentences in a text. In this paper we propose three different models to capture lexical cohesion for document-level machine translation: (a) a direct reward model where translation hypotheses are rewarded whenever lexical cohesion devices occur in them, (b) a conditional probability model where the appropriateness of using lexical cohesion devices is measured, and (c) a mutual information trigger model where a lexical cohesion relation is considered as a trigger pair and the strength of the association between the trigger and the triggered item is estimated by mutual information. We integrate the three models into hierarchical phrase-based machine translation and evaluate their effectiveness on the NIST Chinese-English translation tasks with large-scale training data. Experiment results show that all three models can achieve substantial improvements over the baseline and that the mutual information trigger model performs better than the others.

Original languageEnglish
Title of host publicationIJCAI 2013 - Proceedings of the 23rd International Joint Conference on Artificial Intelligence
Pages2183-2189
Number of pages7
StatePublished - 2013
Externally publishedYes
Event23rd International Joint Conference on Artificial Intelligence, IJCAI 2013 - Beijing, China
Duration: 3 Aug 20139 Aug 2013

Publication series

NameIJCAI International Joint Conference on Artificial Intelligence
ISSN (Print)1045-0823

Conference

Conference23rd International Joint Conference on Artificial Intelligence, IJCAI 2013
Country/TerritoryChina
CityBeijing
Period3/08/139/08/13

Fingerprint

Dive into the research topics of 'Modeling lexical cohesion for document-level machine translation'. Together they form a unique fingerprint.

Cite this