
Dynamic fault-tolerant workflow scheduling with hybrid spatial-temporal re-execution in clouds

  • School of Computer Science and Technology, Harbin Institute of Technology

Research output: Contribution to journal / Article / peer-review

Abstract

Improving reliability is one of the major concerns of scientific workflow scheduling in clouds. The ever-growing computational complexity and data size of workflows present challenges to fault-tolerant workflow scheduling. Therefore, it is essential to design a cost-effective fault-tolerant scheduling approach for large-scale workflows. In this paper, we propose a dynamic fault-tolerant workflow scheduling (DFTWS) approach with hybrid spatial and temporal re-execution schemes. First, DFTWS calculates the time attributes of tasks and identifies the critical path of the workflow in advance. Then, in the initial resource allocation phase, DFTWS assigns an appropriate virtual machine (VM) to each task according to the task's urgency and budget quota. Finally, DFTWS performs online scheduling, making real-time fault-tolerant decisions based on failure type and task criticality throughout workflow execution. The proposed algorithm is evaluated on real-world workflows, and the factors that affect the performance of DFTWS are analyzed. The experimental results demonstrate that DFTWS achieves a trade-off between the objectives of high reliability and low cost in cloud computing environments.
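
The first step named in the abstract, computing task time attributes and identifying the critical path, can be illustrated with the textbook critical-path method on a small task graph. The sketch below is not the DFTWS formulation: the four-task workflow, the run times, and the function name `time_attributes` are hypothetical, and data-transfer times between tasks are assumed to be zero.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical workflow (not from the paper): estimated run time per task
# and, for each task, the set of tasks that must finish before it starts.
runtime = {"A": 3, "B": 2, "C": 4, "D": 1}
preds = {"A": set(), "B": {"A"}, "C": {"A"}, "D": {"B", "C"}}


def time_attributes(runtime, preds):
    """Return earliest start/finish, latest finish, slack, and critical tasks."""
    # Topological order guarantees predecessors are visited before successors.
    order = list(TopologicalSorter(preds).static_order())

    # Forward pass: a task starts once its slowest predecessor has finished.
    est, eft = {}, {}
    for task in order:
        est[task] = max((eft[p] for p in preds[task]), default=0)
        eft[task] = est[task] + runtime[task]
    makespan = max(eft.values())

    # Invert the predecessor map to get successors for the backward pass.
    succs = {task: set() for task in runtime}
    for task, parents in preds.items():
        for p in parents:
            succs[p].add(task)

    # Backward pass: a task must finish before the latest start of any successor.
    lft = {}
    for task in reversed(order):
        lft[task] = min(
            (lft[s] - runtime[s] for s in succs[task]), default=makespan
        )

    # Tasks with zero slack cannot be delayed without delaying the workflow.
    slack = {task: lft[task] - eft[task] for task in order}
    critical = [task for task in order if slack[task] == 0]
    return est, eft, lft, slack, critical


est, eft, lft, slack, critical = time_attributes(runtime, preds)
print(critical)  # ['A', 'C', 'D']
```

In this toy graph, task B has a slack of 2 time units and is therefore off the critical path. Slack is one plausible measure of how much time a non-critical task has for a retry, which is the kind of information a scheduler could use when choosing between re-execution schemes.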

Original language: English
Article number: 169
Journal: Information (Switzerland)
Volume: 10
Issue number: 5
State: Published - 2019
Externally published: Yes

Keywords

  • Cloud computing
  • Fault-tolerant
  • Re-execution
  • Workflow scheduling
