Abstract
The Count-Min sketch and its variations are widely used to solve the frequency estimation problem due to its sub-linear space cost. However, the collisions between high-frequency and low-frequency items introduce a significant estimation error. In this paper, we propose two learned sketches called the Learned Count-Min sketch and Learned Augmented sketch. We combine the machine learning methods with the traditional Count-Min sketch and Augmented sketch to improve the performance. We used a regression model trained from historical data to predict the frequencies and separate the high-frequency items and low-frequency items. The experimental results indicated that our learned sketches outperform the traditional Count-Min sketch and Augmented sketch. The learned sketches can provide a more accurate estimation with a more compact synopsis size.
| Original language | English |
|---|---|
| Pages (from-to) | 365-385 |
| Number of pages | 21 |
| Journal | Information Sciences |
| Volume | 507 |
| DOIs | |
| State | Published - Jan 2020 |
Keywords
- Frequency estimation
- Query processing
- Sketches
Fingerprint
Dive into the research topics of 'Learned sketches for frequency estimation'. Together they form a unique fingerprint.Cite this
- APA
- Author
- BIBTEX
- Harvard
- Standard
- RIS
- Vancouver