JANUS: Joint Autoregressive and Non-autoregressive Training with Auxiliary Loss for Sequence Generation

  • Soochow University
  • Microsoft USA

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Transformer-based autoregressive (AR) and non-autoregressive (NAR) models have played an essential role in sequence generation tasks. The AR model can obtain excellent performance, while the NAR model brings fast decoding speed for inference. In this paper, we propose JANUS, a Joint Autoregressive and Non-autoregressive training method using aUxiliary losS, to enhance model performance in both the AR and NAR manner simultaneously and to effectively alleviate the problem of distribution discrepancy. Further, we pre-train BART with JANUS on a large corpus at minimal cost (16 GPU days) and make BART-JANUS capable of non-autoregressive generation, demonstrating that our approach can transfer AR knowledge to NAR. Empirically, we show that our approach and BART-JANUS achieve significant improvements on multiple generation tasks, including machine translation and the GLGE benchmarks. Our code is available on GitHub.
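
The abstract does not state the form of the auxiliary loss, so the following is only an illustrative sketch of what a joint AR/NAR objective could look like, not the authors' implementation. It assumes, purely for illustration, that the auxiliary term is a symmetric KL divergence between the AR and NAR output distributions at each target position; the names `joint_loss` and `aux_weight` are hypothetical.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one position's vocabulary logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """KL(p || q) for two distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def joint_loss(
    ar_logits: Sequence[Sequence[float]],
    nar_logits: Sequence[Sequence[float]],
    targets: Sequence[int],
    aux_weight: float = 1.0,  # hypothetical weighting hyperparameter
) -> float:
    """Average per-token loss: AR cross-entropy + NAR cross-entropy
    + aux_weight * symmetric KL between the two output distributions.

    ASSUMPTION: the symmetric-KL auxiliary term is a stand-in chosen for
    illustration; the source does not specify the actual auxiliary loss.
    """
    ar_ce = nar_ce = aux = 0.0
    for ar_row, nar_row, target in zip(ar_logits, nar_logits, targets):
        p_ar = softmax(ar_row)
        p_nar = softmax(nar_row)
        ar_ce += -math.log(p_ar[target])    # AR cross-entropy
        nar_ce += -math.log(p_nar[target])  # NAR cross-entropy
        # Penalize disagreement between the two decoding modes.
        aux += 0.5 * (kl_divergence(p_ar, p_nar) + kl_divergence(p_nar, p_ar))
    return (ar_ce + nar_ce + aux_weight * aux) / len(targets)
```

In this sketch the auxiliary term vanishes when both modes predict the same distribution and grows as they diverge, which is one simple way to picture "alleviating distribution discrepancy" between AR and NAR training.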

Original language: English
Title of host publication: Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022
Editors: Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Publisher: Association for Computational Linguistics (ACL)
Pages: 8050-8060
Number of pages: 11
ISBN (Electronic): 9781959429401
State: Published - 2022
Externally published: Yes
Event: 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022 - Hybrid, Abu Dhabi, United Arab Emirates
Duration: 7 Dec 2022 to 11 Dec 2022
