TY - GEN
T1 - Garbage Collection Does Not Only Collect Garbage
T2 - 20th European Conference on Computer Systems, EuroSys 2025, co-located with the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2025
AU - Liu, Dingbang
AU - Zou, Xiangyu
AU - Lu, Tao
AU - Shilane, Philip
AU - Xia, Wen
AU - Huang, Wenxuan
AU - Pan, Yanqi
AU - Huang, Hao
N1 - Publisher Copyright: © 2025 Copyright held by the owner/author(s).
PY - 2025/3/30
Y1 - 2025/3/30
N2 - Deduplication is widely used in backup storage and reduces storage overhead by allowing backups to share common data chunks. However, it naturally disrupts the sequential layout of backup images, leading to fragmentation, which slows down backup restoration. Existing solutions to this issue often come with trade-offs, either reducing deduplication effectiveness or introducing significant I/O overhead. In this paper, we propose GCCDF, a novel approach that enhances the efficiency of deduplication-based backup storage. It reorders data as part of garbage collection to eliminate fragmentation, avoiding additional I/O costs. During the reordering, it effectively groups related data for better locality and aligns with the storage layout in backup storage. Evaluation results demonstrate that GCCDF significantly improves restoration speed, offsets data migration overhead, and preserves the deduplication ratio.
UR - https://www.scopus.com/pages/publications/105002236643
U2 - 10.1145/3689031.3717493
DO - 10.1145/3689031.3717493
M3 - Conference contribution
AN - SCOPUS:105002236643
T3 - EuroSys 2025 - Proceedings of the 2025 20th European Conference on Computer Systems
SP - 1026
EP - 1043
BT - EuroSys 2025 - Proceedings of the 2025 20th European Conference on Computer Systems
PB - Association for Computing Machinery, Inc
Y2 - 30 March 2025 through 3 April 2025
ER -