
A data grouping model based on cache transaction for unstructured data storage systems

  • School of Computer Science and Technology, Harbin Institute of Technology
  • School of Astronautics, Harbin Institute of Technology
  • Sanming University

Research output: Contribution to journal (article, peer-reviewed)

Abstract

Cache prefetching has become the mainstream data access optimization strategy in Industrial Intelligent Systems (IIS) and data centers. However, the rapid growth of unstructured data generates a massive number of pairwise access relationships, so researchers have to choose between spatial locality and temporal locality to keep the computational complexity acceptable. We propose a cache-transaction-based data grouping model (CTDGM) that addresses these problems by optimizing the feature representation method and the grouping efficiency. First, we define the cache transaction and propose a method for extracting the cache transaction feature (CTF). Second, we design a data chunking algorithm based on CTF and spatiotemporal locality to make the relationship calculation more efficient. Third, we propose CTDGM, which constructs a relation graph and partitions the data into independent groups according to the strength of the data access relation. In our experiments, compared with state-of-the-art and traditional methods, our algorithm achieves an average increase in the cache hit rate of 5%–20% on the MSR, VDI-LUN, and KC data sets, which in turn reduces the number of data I/O accesses by 30%–60%.
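
The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea of grouping data by access-relation strength, not the authors' CTDGM. It assumes, hypothetically, that a cache transaction is a run of accesses separated by short time gaps, that relation strength is the number of transactions in which two blocks co-occur, and that groups are the connected components of the relation graph after weak edges are dropped. The names `split_transactions`, `group_blocks`, `max_gap`, and `min_strength` are illustrative and do not come from the paper.

```python
from collections import Counter, defaultdict
from itertools import combinations


def split_transactions(trace, max_gap):
    """Split a time-ordered list of (timestamp, block_id) accesses into transactions.

    Assumption (not from the paper): a new transaction starts whenever the
    gap between two consecutive accesses exceeds max_gap.
    """
    transactions, current, last_ts = [], [], None
    for ts, block in trace:
        if last_ts is not None and ts - last_ts > max_gap:
            transactions.append(current)
            current = []
        current.append(block)
        last_ts = ts
    if current:
        transactions.append(current)
    return transactions


def group_blocks(transactions, min_strength):
    """Group blocks whose co-occurrence count reaches min_strength.

    Builds a weighted relation graph (edge weight = number of transactions
    containing both blocks), keeps edges with weight >= min_strength, and
    returns the connected components as sorted lists.
    """
    # Edge weights: count each unordered pair once per transaction.
    strength = Counter()
    for tx in transactions:
        for a, b in combinations(sorted(set(tx)), 2):
            strength[(a, b)] += 1

    # Union-find over all blocks seen in the trace.
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for tx in transactions:
        for block in tx:
            find(block)

    # Merge the endpoints of every sufficiently strong edge.
    for (a, b), weight in strength.items():
        if weight >= min_strength:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for block in list(parent):
        groups[find(block)].add(block)
    return sorted(sorted(members) for members in groups.values())
```

In this sketch a prefetcher would load every member of a group when any one member is requested. Raising `min_strength` yields smaller, more tightly related groups, while lowering it merges more blocks into each group.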

Original language: English
Pages (from-to): 4488-4514
Number of pages: 27
Journal: International Journal of Intelligent Systems
Volume: 37
Issue number: 8
State: Published - Aug 2022
Externally published: Yes

Keywords

  • cache prefetching
  • correlation analysis
  • data grouping model
  • distributed storage systems
  • feature extraction method
