
Garbage Collection Does Not Only Collect Garbage: Piggybacking-Style Defragmentation for Deduplicated Backup Storage

  • Dingbang Liu
  • Xiangyu Zou*
  • Tao Lu
  • Philip Shilane
  • Wen Xia
  • Wenxuan Huang
  • Yanqi Pan
  • Hao Huang
  • *Corresponding author for this work
  • Harbin Institute of Technology Shenzhen
  • DapuStor Corporation
  • Dell

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Deduplication is widely used in backup storage and reduces storage overhead by allowing backups to share common data chunks. However, it naturally disrupts the sequential layout of backup images, leading to fragmentation, which slows down backup restoration. Existing solutions to this issue often come with trade-offs, either reducing deduplication effectiveness or introducing significant I/O overhead. In this paper, we propose GCCDF, a novel approach that enhances the efficiency of deduplication-based backup storage. It reorders data as part of garbage collection to eliminate fragmentation, avoiding additional I/O costs. During the reordering, it effectively groups related data for better locality and aligns with the storage layout in backup storage. Evaluation results demonstrate that GCCDF significantly improves restoration speed, offsets data migration overhead, and preserves the deduplication ratio.

Original language: English
Title of host publication: EuroSys 2025 - Proceedings of the 2025 20th European Conference on Computer Systems
Publisher: Association for Computing Machinery, Inc
Pages: 1026-1043
Number of pages: 18
ISBN (Electronic): 9798400711961
DOIs
State: Published - 30 Mar 2025
Externally published: Yes
Event: 20th European Conference on Computer Systems, EuroSys 2025, co-located 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2025 - Rotterdam, Netherlands
Duration: 30 Mar 2025 to 3 Apr 2025

Publication series

Name: EuroSys 2025 - Proceedings of the 2025 20th European Conference on Computer Systems

Conference

Conference: 20th European Conference on Computer Systems, EuroSys 2025, co-located 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2025
Country/Territory: Netherlands
City: Rotterdam
Period: 30/03/25 to 3/04/25
