Linguistic Rule Induction Improves Adversarial and OOD Robustness in Large Language Models

  • Harbin Institute of Technology Shenzhen
  • Peng Cheng Laboratory

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

Ensuring robustness is especially important when AI is deployed in responsible or safety-critical environments. ChatGPT can perform brilliantly in both adversarial and out-of-distribution (OOD) robustness. Still, other popular large language models (LLMs), like LLaMA-2, ERNIE, and ChatGLM, do not perform satisfactorily in this regard. Therefore, it is valuable to study what efforts play essential roles in ChatGPT, and how to transfer these efforts to other LLMs. This paper experimentally finds that linguistic rule induction is the foundation for identifying the cause-effect relationships in LLMs. Accurately processing the cause-effect relationships in LLMs can improve their adversarial and OOD robustness. Furthermore, we explore a low-cost way of aligning LLMs with linguistic rules. Specifically, we construct a linguistic rule instruction dataset to fine-tune LLMs. To further energize LLMs for reasoning step-by-step with the linguistic rules, we propose the task-relevant LingR-based chain-of-thoughts. Experiments show that LingR-induced LLaMA-13B achieves results comparable to or better than GPT-3.5 and GPT-4 on various adversarial and OOD robustness evaluations.

Original language: English
Title of host publication: 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings
Editors: Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Publisher: European Language Resources Association (ELRA)
Pages: 10565-10577
Number of pages: 13
ISBN (Electronic): 9782493814104
State: Published - 2024
Externally published: Yes
Event: Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024 - Hybrid, Torino, Italy
Duration: 20 May 2024 to 25 May 2024

Publication series

Name: 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings

Conference

Conference: Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024
Country/Territory: Italy
City: Hybrid, Torino
Period: 20/05/24 to 25/05/24

Keywords

  • Adversarial
  • Cause-effect
  • Chain-of-thoughts
  • Linguistic Rule
  • Out-of-distribution
  • Robustness
