TY - GEN
T1 - SJTU-NICT's Supervised and Unsupervised Neural Machine Translation Systems for the WMT20 News Translation Task
AU - Li, Zuchao
AU - Zhao, Hai
AU - Wang, Rui
AU - Chen, Kehai
AU - Utiyama, Masao
AU - Sumita, Eiichiro
N1 - Publisher Copyright:
© 2020 Association for Computational Linguistics
PY - 2020
Y1 - 2020
N2 - In this paper, we introduce our joint team SJTU-NICT's participation in the WMT 2020 machine translation shared task. We participated in four translation directions across three language pairs: English-Chinese and English-Polish on the supervised machine translation track, and German-Upper Sorbian on the low-resource and unsupervised machine translation tracks. Depending on the conditions of each language pair, we experimented with diverse neural machine translation (NMT) techniques: document-enhanced NMT, XLM pre-trained language model enhanced NMT, bidirectional translation as pre-training, reference-language-based UNMT, a data-dependent Gaussian prior objective, and BT-BLEU collaborative filtering self-training. We also used the TF-IDF algorithm to filter the training set, obtaining a subset whose domain is closer to that of the test set for fine-tuning. Among our submissions, the primary systems won first place in the English-to-Chinese, Polish-to-English, and German-to-Upper-Sorbian translation directions.
AB - In this paper, we introduce our joint team SJTU-NICT's participation in the WMT 2020 machine translation shared task. We participated in four translation directions across three language pairs: English-Chinese and English-Polish on the supervised machine translation track, and German-Upper Sorbian on the low-resource and unsupervised machine translation tracks. Depending on the conditions of each language pair, we experimented with diverse neural machine translation (NMT) techniques: document-enhanced NMT, XLM pre-trained language model enhanced NMT, bidirectional translation as pre-training, reference-language-based UNMT, a data-dependent Gaussian prior objective, and BT-BLEU collaborative filtering self-training. We also used the TF-IDF algorithm to filter the training set, obtaining a subset whose domain is closer to that of the test set for fine-tuning. Among our submissions, the primary systems won first place in the English-to-Chinese, Polish-to-English, and German-to-Upper-Sorbian translation directions.
UR - https://www.scopus.com/pages/publications/85121360932
M3 - Conference contribution
AN - SCOPUS:85121360932
T3 - 5th Conference on Machine Translation, WMT 2020 - Proceedings
SP - 218
EP - 219
BT - 5th Conference on Machine Translation, WMT 2020 - Proceedings
A2 - Barrault, Loic
A2 - Bojar, Ondrej
A2 - Bougares, Fethi
A2 - Chatterjee, Rajen
A2 - Costa-Jussa, Marta R.
A2 - Federmann, Christian
A2 - Fishel, Mark
A2 - Fraser, Alexander
A2 - Graham, Yvette
A2 - Guzman, Paco
A2 - Haddow, Barry
A2 - Huck, Matthias
A2 - Yepes, Antonio Jimeno
A2 - Koehn, Philipp
A2 - Martins, Andre
A2 - Morishita, Makoto
A2 - Monz, Christof
A2 - Nagata, Masaaki
A2 - Nakazawa, Toshiaki
A2 - Negri, Matteo
PB - Association for Computational Linguistics (ACL)
T2 - 5th Conference on Machine Translation, WMT 2020
Y2 - 19 November 2020 through 20 November 2020
ER -