NetSync: A Network Adaptive and Deduplication-Inspired Delta Synchronization Approach for Cloud Storage Services

  • Harbin Institute of Technology Shenzhen
  • CAS - Institute of Computing Technology
  • Tsinghua University

Research output: Contribution to journal › Article › peer-review

Abstract

Delta sync (synchronization) is a key bandwidth-saving technique for cloud storage services. The representative delta sync utility, rsync, matches data chunks by sliding a search window byte by byte to maximize redundancy detection for bandwidth efficiency. However, this process is poorly suited to forthcoming high-bandwidth cloud storage services, which require lightweight delta sync that supports large files well. Moreover, rsync uses fixed chunking and compression methods throughout the sync process, so it cannot serve clients in varied network environments, which require the sync approach to perform well under different network conditions. Inspired by the Content-Defined Chunking (CDC) technique used in data deduplication, we propose NetSync, a network-adaptive, CDC-based lightweight delta sync approach with lower computing and protocol (metadata) overhead than state-of-the-art delta sync approaches. In addition, NetSync can choose appropriate compression and chunking strategies for different network conditions. The key ideas of NetSync are (1) to simplify chunk matching by proposing a fast weak hash called FastFP that is piggybacked on the rolling hashes from CDC, and by redesigning the delta sync protocol to exploit deduplication locality and weak/strong hash properties; and (2) to minimize the sync time by adaptively choosing chunking parameters and compression methods according to the current network conditions. Our evaluation results, driven by both benchmark and real-world datasets, suggest that NetSync runs 2× to 10× faster and supports 30% to 80% more clients than the state-of-the-art rsync-based WebR2sync+ and deduplication-based approaches.
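The first idea in the abstract (content-defined chunking plus a weak hash that reuses the rolling-hash work, confirmed by a strong hash) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The Gear-style rolling hash, the `MASK`, `MIN_SIZE`, and `MAX_SIZE` values, the running-sum weak hash standing in for FastFP, and the function names are all assumptions made for illustration. Both files are also processed in one process, whereas a real protocol would exchange hashes between client and server.

```python
import hashlib
import random

# Hypothetical parameters, chosen only for illustration.
MASK = (1 << 6) - 1   # cut when the low 6 bits of the rolling hash are zero
MIN_SIZE = 16         # smallest chunk the chunker will emit before the tail
MAX_SIZE = 256        # forced cut if no boundary is found earlier

# Gear-style table: one fixed pseudo-random 32-bit value per byte value.
_rng = random.Random(2022)
GEAR = [_rng.getrandbits(32) for _ in range(256)]


def cdc_chunks(data: bytes):
    """Split data at content-defined boundaries.

    Returns a list of (offset, size, weak_hash). The weak hash is a running
    sum of the rolling-hash values, so it costs one extra addition per byte.
    This is a stand-in for FastFP, not its actual definition.
    """
    chunks = []
    start, h, weak = 0, 0, 0
    for i, b in enumerate(data):
        h = ((h << 1) + GEAR[b]) & 0xFFFFFFFF   # rolling hash update
        weak = (weak + h) & 0xFFFFFFFF          # piggybacked weak hash
        size = i - start + 1
        if (size >= MIN_SIZE and (h & MASK) == 0) or size >= MAX_SIZE:
            chunks.append((start, size, weak))
            start, h, weak = i + 1, 0, 0
    if start < len(data):                       # trailing partial chunk
        chunks.append((start, len(data) - start, weak))
    return chunks


def strong_hash(chunk: bytes) -> bytes:
    """Strong hash used to confirm a weak-hash match."""
    return hashlib.sha256(chunk).digest()


def make_delta(old: bytes, new: bytes):
    """Describe `new` as copies from `old` plus literal data."""
    index = {}
    for off, size, weak in cdc_chunks(old):
        index.setdefault(weak, []).append((off, size))

    delta = []
    for off, size, weak in cdc_chunks(new):
        chunk = new[off:off + size]
        match = None
        # The strong hash is computed only when the weak hash already matches.
        for o_off, o_size in index.get(weak, ()):
            if o_size == size and strong_hash(old[o_off:o_off + o_size]) == strong_hash(chunk):
                match = (o_off, o_size)
                break
        if match is not None:
            delta.append(("copy", match[0], match[1]))
        else:
            delta.append(("data", chunk))
    return delta


def apply_delta(old: bytes, delta) -> bytes:
    """Rebuild the new file from the old file and the delta."""
    out = bytearray()
    for op in delta:
        if op[0] == "copy":
            out += old[op[1]:op[1] + op[2]]
        else:
            out += op[1]
    return bytes(out)
```

Because chunk boundaries depend on content rather than on fixed offsets, an insertion in the middle of a file changes only the chunks near the edit. Chunks before it match as they are, and chunks after it match again once the boundaries realign, so neither side needs the byte-by-byte window search that rsync performs.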

Original language: English
Pages (from-to): 2554-2570
Number of pages: 17
Journal: IEEE Transactions on Parallel and Distributed Systems
Volume: 33
Issue number: 10
DOIs
State: Published - 1 Oct 2022
Externally published: Yes

Keywords

  • cloud storage
  • content-defined chunking
  • network adaptive
  • rsync
