Quartet: A Holistic Hybrid Parallel Framework for Training Large Language Models

  • Weigang Zhang
  • Biyu Zhou*
  • Xing Wu
  • Chaochen Gao
  • Zhibing Liu
  • Xuehai Tang
  • Ruixuan Li
  • Jizhong Han
  • Songlin Hu
  • *Corresponding author for this work
  • CAS - Institute of Information Engineering
  • University of Chinese Academy of Sciences

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Hybrid parallelism is popular in training large language models (LLMs). However, existing efforts have focused on optimizing individual strategies in hybrid parallelism, such as pipeline scheduling or device assignment, in isolation, which limits the overall training efficiency. This paper explores the intricate dependencies among four pivotal strategies (model scaling, model splitting, pipeline scheduling, and device assignment) and proposes Quartet, a holistic hybrid parallel framework for their joint optimization. The novelty lies in the formulation of parameterized pipeline scheduling and device assignment, alongside a pioneering analysis of model scaling's impact on throughput. These provide the basis for orchestrating the four strategies within a unified framework to maximize the overall training throughput efficiently. Evaluation results show that, for representative LLMs, Quartet improves the training throughput by up to 2.16× over state-of-the-art synchronous hybrid parallel approaches.

Original language: English
Title of host publication: Euro-Par 2024
Subtitle of host publication: Parallel Processing - 30th European Conference on Parallel and Distributed Processing, Proceedings
Editors: Jesus Carretero, Javier Garcia-Blas, Sameer Shende, Ivona Brandic, Katzalin Olcoz, Martin Schreiber
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 424-438
Number of pages: 15
ISBN (Print): 9783031697654
DOIs
State: Published - 2024
Externally published: Yes
Event: 30th International Conference on Parallel and Distributed Computing, Euro-Par 2024 - Madrid, Spain
Duration: 26 Aug 2024 to 30 Aug 2024

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14802 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 30th International Conference on Parallel and Distributed Computing, Euro-Par 2024
Country/Territory: Spain
City: Madrid
Period: 26/08/24 to 30/08/24

Keywords

  • Distributed Training
  • Hybrid Parallelism
  • Large Language Models
