TY - GEN
T1 - Grammar-Based Patches Generation for Automated Program Repair
AU - Tang, Yu
AU - Zhou, Long
AU - Blanco, Ambrosio
AU - Liu, Shujie
AU - Wei, Furu
AU - Zhou, Ming
AU - Yang, Muyun
N1 - Publisher Copyright: © 2021 Association for Computational Linguistics
PY - 2021
Y1 - 2021
N2 - Automated program repair (APR) aims to fix programming language bugs automatically, without human intervention, and can potentially reduce debugging costs and improve software quality. Conventional approaches adopt learning-based methods such as sequence-to-sequence models for patch generation. However, they tend to ignore code structure information and suffer from grammar and syntax errors. To take grammar and syntax information into account, in this paper we propose a grammar-based rule-to-rule model, which treats the repair process as a transformation of grammar rules and uses two encoders to model both the original token sequence and the grammar rules, enhanced with a new tree-based self-attention. In addition, to guarantee grammatical correctness, we employ a grammatically restricted inference method that generates each grammar rule within a legally constrained sub-search-space determined by the previously generated rules. Experimental evaluations on a Java dataset demonstrate that the proposed approach significantly outperforms the state-of-the-art baselines in terms of generated code accuracy.
UR - https://www.scopus.com/pages/publications/85116662355
U2 - 10.18653/v1/2021.findings-acl.111
DO - 10.18653/v1/2021.findings-acl.111
M3 - Conference contribution
AN - SCOPUS:85116662355
T3 - Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
SP - 1300
EP - 1305
BT - Findings of the Association for Computational Linguistics
A2 - Zong, Chengqing
A2 - Xia, Fei
A2 - Li, Wenjie
A2 - Navigli, Roberto
PB - Association for Computational Linguistics (ACL)
T2 - Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
Y2 - 1 August 2021 through 6 August 2021
ER -