Skip to main navigation Skip to search Skip to main content

Unsupervised bilingual word embedding agreement for unsupervised neural machine translation

  • Harbin Institute of Technology
  • Japan National Institute of Information and Communications Technology

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

Unsupervised bilingual word embedding (UBWE), together with other technologies such as back-translation and denoising, has helped unsupervised neural machine translation (UNMT) achieve remarkable results in several language pairs. In previous methods, UBWE is first trained using nonparallel monolingual corpora and then this pre-trained UBWE is used to initialize the word embedding in the encoder and decoder of UNMT. That is, the training of UBWE and UNMT are separate. In this paper, we first empirically investigate the relationship between UBWE and UNMT. The empirical findings show that the performance of UNMT is significantly affected by the performance of UBWE. Thus, we propose two methods that train UNMT with UBWE agreement. Empirical results on several language pairs show that the proposed methods significantly outperform conventional UNMT.

Original languageEnglish
Title of host publicationACL 2019 - 57th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference
PublisherAssociation for Computational Linguistics (ACL)
Pages1235-1245
Number of pages11
ISBN (Electronic)9781950737482
StatePublished - 2020
Event57th Annual Meeting of the Association for Computational Linguistics, ACL 2019 - Florence, Italy
Duration: 28 Jul 20192 Aug 2019

Publication series

NameACL 2019 - 57th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference

Conference

Conference57th Annual Meeting of the Association for Computational Linguistics, ACL 2019
Country/TerritoryItaly
CityFlorence
Period28/07/192/08/19

Fingerprint

Dive into the research topics of 'Unsupervised bilingual word embedding agreement for unsupervised neural machine translation'. Together they form a unique fingerprint.

Cite this