Skip to main navigation Skip to search Skip to main content

MulRF: A Multi-Dimensional Range Filter for Sublinear Time Range Query Processing

  • Shuai Han
  • , Xianmin Liu
  • , Jianzhong Li*
  • *Corresponding author for this work

Research output: Contribution to journalArticlepeer-review

Abstract

Range query is an important operation on big multi-dimensional data. This paper studies the problem of multi-dimensional range query filtering for speeding up the range query processing by avoiding reading the useless data. To solve the problem, a novel multi-dimensional range filter is proposed to filter the multi-dimensional range queries, while the existing one-dimensional range filters can not provide efficient filtering. Based on the multi-dimensional range filter, an efficient range query processing algorithm is presented. It can directly return the locations of the I/O units that contain the data in the query result without any access to the input dataset. The time complexity of the algorithm is O(3mh)O(3mh), where hh is the number of I/O units partially overlapping with a range query, and mm is the dimension number. Since mm is usually o(log n)o(logn), it is a sublinear time algorithm if V=O(n)V=O(n), where nn is the size of the input dataset, V=πi=1m dii, and di is the number of distinct values on the ii-th dimension of the dataset for 1≤ i≤ m. Experimental results show that the multi-dimensional range filter has low false positive rate and good filtering efficiency. The proposed range query processing algorithm achieves at least 3∼7 times improvement compared to the one-dimensional filter based algorithms on different datasets.

Original languageEnglish
Pages (from-to)6600-6613
Number of pages14
JournalIEEE Transactions on Knowledge and Data Engineering
Volume36
Issue number11
DOIs
StatePublished - 2024

Keywords

  • Multi-dimensional data
  • range filter
  • range query
  • sublinear time

Fingerprint

Dive into the research topics of 'MulRF: A Multi-Dimensional Range Filter for Sublinear Time Range Query Processing'. Together they form a unique fingerprint.

Cite this