
CC-TUNING: A Cross-Lingual Connection Mechanism for Improving Joint Multilingual Supervised Fine-Tuning

  • Yangfan Ye
  • Xiaocheng Feng*
  • Zekun Yuan
  • Xiachong Feng
  • Libo Qin
  • Lei Huang
  • Weitao Ma
  • Yichong Huang
  • Zhirui Zhang
  • Yunfei Lu
  • Xiaohui Yan
  • Duyu Tang
  • Dandan Tu
  • Bing Qin
  • *Corresponding author for this work
  • Harbin Institute of Technology
  • Peng Cheng Laboratory
  • The University of Hong Kong
  • Central South University
  • Huawei Technologies Co., Ltd.

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Current large language models (LLMs) often exhibit imbalanced multilingual capabilities due to their English-centric training corpora. To address this, existing fine-tuning approaches operate at the data level (e.g., through data augmentation or distillation) and typically introduce only implicit cross-lingual alignment, overlooking the potential for more profound, latent-level cross-lingual interactions. In this work, we propose CC-TUNING, a novel multilingual fine-tuning paradigm that explicitly establishes a cross-lingual connection mechanism at the latent level. During training, CC-TUNING fuses the feed-forward activations from both English and non-English inputs, enabling the model to benefit from both linguistic resources. This process is facilitated by a trainable Decision Maker that identifies beneficial activations. Furthermore, during inference, a Transform Matrix is used to simulate the cross-lingual connection in a monolingual setting through representation transformation. Our experiments on six benchmarks covering 22 languages show that CC-TUNING outperforms vanilla SFT and offers a strong latent-level alternative to data-level augmentation methods. Further analysis also highlights the practicality of CC-TUNING and the potential of latent-level cross-lingual interactions in advancing the multilingual performance of LLMs.
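
The abstract gives no implementation details, so the following is only a toy sketch of the general idea, not the authors' method. It assumes, purely for illustration, that the Decision Maker is a linear scorer followed by a softmax, that fusion is a weighted addition, and that the Transform Matrix is applied as a plain matrix-vector product. All function names, shapes, and values are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def decision_weights(english_acts, scorer):
    """Hypothetical Decision Maker: score each candidate English
    activation with a linear scorer, then normalise the scores."""
    return softmax([dot(scorer, act) for act in english_acts])


def fuse(non_english_act, english_acts, weights):
    """Assumed fusion rule: add the weighted mix of English
    activations to the non-English activation."""
    mixed = [
        sum(w * act[i] for w, act in zip(weights, english_acts))
        for i in range(len(non_english_act))
    ]
    return [x + m for x, m in zip(non_english_act, mixed)]


def simulate_english(non_english_repr, transform):
    """Assumed inference-time stand-in: map a non-English
    representation to a pseudo-English activation."""
    return [dot(row, non_english_repr) for row in transform]


# Training-time toy case: both English and non-English activations exist.
english_acts = [[1.0, 0.0], [0.0, 1.0]]
non_english_act = [0.5, 0.5]
scorer = [0.0, 0.0]  # untrained scorer, so the weights are uniform
weights = decision_weights(english_acts, scorer)
fused_train = fuse(non_english_act, english_acts, weights)

# Inference-time toy case: no English input, only the transform.
transform = [[1.0, 0.0], [0.0, 1.0]]  # identity used as a placeholder
pseudo_english = simulate_english(non_english_act, transform)
fused_infer = fuse(non_english_act, [pseudo_english], [1.0])
```

In this sketch the transform is an identity placeholder; in practice such a matrix would have to be learned or estimated, and the abstract does not say how.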

Original language: English
Title of host publication: Long Papers
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Publisher: Association for Computational Linguistics (ACL)
Pages: 19036-19051
Number of pages: 16
ISBN (Electronic): 9798891762510
State: Published - 2025
Event: 63rd Annual Meeting of the Association for Computational Linguistics, ACL 2025 - Vienna, Austria
Duration: 27 Jul 2025 to 1 Aug 2025

Publication series

Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics
Volume: 1
ISSN (Print): 0736-587X

Conference

Conference: 63rd Annual Meeting of the Association for Computational Linguistics, ACL 2025
Country/Territory: Austria
City: Vienna
Period: 27/07/25 to 1/08/25
