TY - GEN
T1 - Context-Aware Smoothing for Neural Machine Translation
AU - Chen, Kehai
AU - Wang, Rui
AU - Utiyama, Masao
AU - Sumita, Eiichiro
AU - Zhao, Tiejun
N1 - Publisher Copyright: ©2017 AFNLP.
PY - 2017
Y1 - 2017
AB - In Neural Machine Translation (NMT), each word is represented as a low-dimensional, real-valued vector that encodes its syntactic and semantic information. As a result, a word is represented by the same fixed vector when learning the source representation, even when it appears in different sentence contexts. Moreover, a large number of Out-Of-Vocabulary (OOV) words, which carry different syntactic and semantic information, are all represented by the same vector, that of unk. To alleviate these problems, we propose a novel context-aware smoothing method that dynamically learns a sentence-specific vector for each word (including OOV words) from its local context words in the sentence. The learned context-aware representation is integrated into the NMT model to improve translation performance. Empirical results on the NIST Chinese-to-English translation task show that the proposed approach achieves an average improvement of 1.78 BLEU points over a strong attentional NMT baseline and outperforms some existing systems.
UR - https://www.scopus.com/pages/publications/105019293956
M3 - Conference contribution
AN - SCOPUS:105019293956
T3 - 8th International Joint Conference on Natural Language Processing - Proceedings of the IJCNLP 2017, System Demonstrations
SP - 11
EP - 20
BT - 8th International Joint Conference on Natural Language Processing - Proceedings of the IJCNLP 2017
PB - Association for Computational Linguistics (ACL)
T2 - 8th International Joint Conference on Natural Language Processing, IJCNLP 2017
Y2 - 27 November 2017 through 1 December 2017
ER -