Abstract
Range query is an important operation on big multi-dimensional data. This paper studies the problem of multi-dimensional range query filtering for speeding up the range query processing by avoiding reading the useless data. To solve the problem, a novel multi-dimensional range filter is proposed to filter the multi-dimensional range queries, while the existing one-dimensional range filters can not provide efficient filtering. Based on the multi-dimensional range filter, an efficient range query processing algorithm is presented. It can directly return the locations of the I/O units that contain the data in the query result without any access to the input dataset. The time complexity of the algorithm is O(3mh)O(3mh), where hh is the number of I/O units partially overlapping with a range query, and mm is the dimension number. Since mm is usually o(log n)o(logn), it is a sublinear time algorithm if V=O(n)V=O(n), where nn is the size of the input dataset, V=πi=1m dii, and di is the number of distinct values on the ii-th dimension of the dataset for 1≤ i≤ m. Experimental results show that the multi-dimensional range filter has low false positive rate and good filtering efficiency. The proposed range query processing algorithm achieves at least 3∼7 times improvement compared to the one-dimensional filter based algorithms on different datasets.
| Original language | English |
|---|---|
| Pages (from-to) | 6600-6613 |
| Number of pages | 14 |
| Journal | IEEE Transactions on Knowledge and Data Engineering |
| Volume | 36 |
| Issue number | 11 |
| DOIs | |
| State | Published - 2024 |
Keywords
- Multi-dimensional data
- range filter
- range query
- sublinear time
Fingerprint
Dive into the research topics of 'MulRF: A Multi-Dimensional Range Filter for Sublinear Time Range Query Processing'. Together they form a unique fingerprint.Cite this
- APA
- Author
- BIBTEX
- Harvard
- Standard
- RIS
- Vancouver