
SCEDIT: Script-based Assessment of Knowledge Editing

  • Xinye Li*
  • Zunwen Zheng
  • Qian Zhang
  • Dekai Zhuang
  • Jiabao Kang
  • Liyan Xu
  • Qingbin Liu
  • Xi Chen
  • Zhiying Tu
  • Dianhui Chu
  • Dianbo Sui*
  • *Corresponding author for this work
  • Harbin Institute of Technology
  • Jilin University
  • Tencent

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Knowledge Editing (KE) has gained increasing attention, yet current KE tasks remain relatively simple. Under current evaluation frameworks, many editing methods achieve exceptionally high scores, sometimes nearing perfection. However, few studies integrate KE into real-world application scenarios (e.g., recent interest in LLM-as-agent). To support our analysis, we introduce a novel script-based benchmark - SCEDIT (Script-based Knowledge Editing Benchmark) - which encompasses both counterfactual and temporal edits. We integrate token-level and text-level evaluation methods, comprehensively analyzing existing KE techniques. The benchmark extends traditional fact-based (“What”-type question) evaluation to action-based (“How”-type question) evaluation. We observe that all KE methods exhibit a drop in performance on established metrics and face challenges on text-level metrics, indicating a challenging task. Our benchmark is available at https://github.com/asdfo123/ScEdit.

Original language: English
Title of host publication: Findings of the Association for Computational Linguistics
Subtitle of host publication: ACL 2025
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Publisher: Association for Computational Linguistics (ACL)
Pages: 2032-2052
Number of pages: 21
ISBN (Electronic): 9798891762565
DOIs
State: Published - 2025
Event: 63rd Annual Meeting of the Association for Computational Linguistics, ACL 2025 - Vienna, Austria
Duration: 27 Jul 2025 to 1 Aug 2025

Publication series

Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics
ISSN (Print): 0736-587X

Conference

Conference: 63rd Annual Meeting of the Association for Computational Linguistics, ACL 2025
Country/Territory: Austria
City: Vienna
Period: 27/07/25 to 1/08/25
