Skip to main navigation Skip to search Skip to main content

Trapezoidal Sketch: A Sketch Structure for Frequency Estimation of Data Streams

  • Ning Li*
  • , Xin Yuan
  • , Jose Fernan Martinez Ortega
  • , Vicente Hernandez Diaz
  • *Corresponding author for this work

Research output: Contribution to journalArticlepeer-review

Abstract

The sketch is one of the typical and widely used data structures for estimating the frequencies of items in data streams. However, since the counter sizes in traditional rectangular sketch (r-sketch) are the same, it is hard to achieve small space usage, high capacity (i.e. the maximum frequency can be recorded) and high estimated accuracy simultaneously. Moreover, when considering the high skewness of data streams, this problem will become even worse. Consequently, we propose the trapezoidal sketch (t-sketch) in this paper. In the t-sketch, different from the r-sketch, the counter sizes in different layers are different. Therefore, the low space usage and high capacity can be achieved simultaneously in the t-sketch. Moreover, based on the basic t-sketch, we propose the space-saving t-sketch and the capacity-improvement t-sketch and analyze the properties of these two t-sketches. Finally, for improving the estimation accuracy of the t-sketch further, we propose the probabilistic-based estimation error-reducing algorithm. Compared with the CM sketch, CU sketch, C sketch and A sketch, the simulation results show that the performances on space usage, capacity and estimation accuracy are improved successfully by the space-saving t-sketch and the capacity-improvement t-sketch.

Original languageEnglish
Pages (from-to)2656-2673
Number of pages18
JournalComputer Journal
Volume66
Issue number11
DOIs
StatePublished - 1 Nov 2023
Externally publishedYes

Keywords

  • capacity improvement
  • data streams
  • frequency estimation
  • sketch
  • space saving

Fingerprint

Dive into the research topics of 'Trapezoidal Sketch: A Sketch Structure for Frequency Estimation of Data Streams'. Together they form a unique fingerprint.

Cite this