
Open-ended Long Text Generation via Masked Language Modeling

  • Xiaobo Liang
  • Zecheng Tang
  • Juntao Li*
  • Min Zhang
  • *Corresponding author for this work
  • Soochow University

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Pre-trained autoregressive (AR) language models such as BART and GPTs have dominated Open-ended Long Text Generation (Open-LTG). However, the AR nature decreases inference efficiency as generation length increases, which hinders their application in Open-LTG. To improve inference efficiency, we instead explore the potential of pre-trained masked language models (MLMs) combined with a representative iterative non-autoregressive (NAR) decoding strategy for Open-LTG. Our preliminary study shows that pre-trained MLMs can generate only short text and collapse when modeling long text. To enhance the long text generation capability of MLMs, we introduce two simple yet effective strategies for the iterative NAR model: dynamic sliding window attention (DSWA) and linear temperature decay (LTD). These strategies alleviate the long-distance collapse problem and enable longer text generation with a flexible trade-off between performance and inference speedup. Experiments on storytelling and multi-paragraph opinionated article writing tasks show that pre-trained MLMs can achieve speedups ranging from more than 3× to 13× with better performance than strong AR models. Our code is available at GitHub.
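
The abstract names the two strategies but does not give their formulas, so the following is only a rough illustrative sketch, not the authors' implementation. It assumes that LTD lowers the sampling temperature in a straight line over the decoding iterations, and it shows a plain fixed-width sliding-window attention mask; how the paper makes the window "dynamic" is not stated here. All function names, parameters, and default values (`linear_temperature`, `sliding_window_mask`, `t_start`, `t_end`, `window`) are hypothetical.

```python
from typing import List


def linear_temperature(step: int, total_steps: int,
                       t_start: float = 1.0, t_end: float = 0.1) -> float:
    """Assumed schedule: interpolate linearly from t_start at the first
    decoding iteration (step 0) to t_end at the last iteration."""
    if total_steps <= 1:
        return t_end
    frac = step / (total_steps - 1)
    return t_start + (t_end - t_start) * frac


def sliding_window_mask(seq_len: int, window: int) -> List[List[bool]]:
    """Assumed fixed-width window: mask[i][j] is True when position i may
    attend to position j, i.e. when |i - j| <= window."""
    return [[abs(i - j) <= window for j in range(seq_len)]
            for i in range(seq_len)]


if __name__ == "__main__":
    # Temperature falls steadily across five hypothetical refinement steps.
    print([round(linear_temperature(s, 5), 3) for s in range(5)])
    # Each token only sees neighbours within one position on either side.
    for row in sliding_window_mask(4, 1):
        print(row)
```

Under these assumptions, early iterations sample with more diversity and later ones become more deterministic, while the mask restricts each position to local context instead of the full sequence.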

Original language: English
Title of host publication: Long Papers
Publisher: Association for Computational Linguistics (ACL)
Pages: 223-241
Number of pages: 19
ISBN (Electronic): 9781959429722
State: Published - 2023
Externally published: Yes
Event: 61st Annual Meeting of the Association for Computational Linguistics, ACL 2023 - Toronto, Canada
Duration: 9 Jul 2023 to 14 Jul 2023

Publication series

Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics
Volume: 1
ISSN (Print): 0736-587X

Conference

Conference: 61st Annual Meeting of the Association for Computational Linguistics, ACL 2023
Country/Territory: Canada
City: Toronto
Period: 9/07/23 to 14/07/23
