Skip to main navigation Skip to search Skip to main content

Bridging subword gaps in pretrain-finetune paradigm for natural language generation

  • Xin Liu
  • , Baosong Yang
  • , Dayiheng Liu
  • , Haibo Zhang
  • , Weihua Luo
  • , Min Zhang
  • , Haiying Zhang
  • , Jinsong Su*
  • *Corresponding author for this work
  • Xiamen University
  • Alibaba Group Holding Ltd.
  • Soochow University
  • Pengcheng Laboratory

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

A well-known limitation in pretrain-finetune paradigm lies in its inflexibility caused by the one-size-fits-all vocabulary. This potentially weakens the effect when applying pretrained models into natural language generation (NLG) tasks, especially for the subword distributions between upstream and downstream tasks with significant discrepancy. Towards approaching this problem, we extend the vanilla pretrain-finetune pipeline with an extra embedding transfer step. Specifically, a plug-and-play embedding generator is introduced to produce the representation of any input token, according to pre-trained embeddings of its morphologically similar ones. Thus, embeddings of mismatch tokens in downstream tasks can also be efficiently initialized. We conduct experiments on a variety of NLG tasks under the pretrain-finetune fashion. Experimental results and extensive analyses show that the proposed strategy offers us opportunities to feel free to transfer the vocabulary, leading to more efficient and better performed downstream NLG models.

Original languageEnglish
Title of host publicationACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference
PublisherAssociation for Computational Linguistics (ACL)
Pages6001-6011
Number of pages11
ISBN (Electronic)9781954085527
DOIs
StatePublished - 2021
Externally publishedYes
EventJoint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL-IJCNLP 2021 - Virtual, Online
Duration: 1 Aug 20216 Aug 2021

Publication series

NameACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference
Volume1

Conference

ConferenceJoint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL-IJCNLP 2021
CityVirtual, Online
Period1/08/216/08/21

Fingerprint

Dive into the research topics of 'Bridging subword gaps in pretrain-finetune paradigm for natural language generation'. Together they form a unique fingerprint.

Cite this