TY - GEN
T1 - EEPH
T2 - 39th IEEE International Conference on Data Engineering, ICDE 2023
AU - Chen, Qi
AU - Hu, Hao
AU - Deng, Cai
AU - Liu, Dingbang
AU - Li, Shiyi
AU - Tang, Bo
AU - Yao, Ting
AU - Xia, Wen
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
AB - In recent years, the performance of hash indexes has been significantly improved by exploiting emerging persistent memory (PMem). However, the performance improvement of hash indexes mainly comes from exploiting the hardware features of PMem. Only a few studies optimize the hash index itself to fully exploit the potential of PMem. Interestingly, many of these studies improve the write performance of hash indexes on PMem but disregard their read performance. Through extensive experimental evaluation, we find that the major reason for inefficient reads in hash indexes on PMem is the high overhead of hash collision processing. To address that, we propose a novel Efficient Extendible Perfect Hashing (EEPH) on a PMem-DRAM hybrid data layout to improve the read performance of hash indexes. Specifically, we reduce the overhead of dynamic perfect hashing extension on PMem by combining it with extendible hashing. We then design a hybrid data layout to unlock the inherent read strengths of perfect hashing (i.e., zero collision). Last, we devise a complement move algorithm to efficiently guarantee the zero collision of perfect hashing when data is moved on PMem. We compare EEPH with the state-of-the-art hash indexes on PMem by conducting comprehensive experiments on several real-world read-intensive and read-skew workloads. The experimental results confirm the superiority of EEPH, as it achieves up to 2.21× higher throughput and about 1/3 of the 99th percentile latency of state-of-the-art hash indexes.
KW - hybrid PMem DRAM
KW - perfect hashing
UR - https://www.scopus.com/pages/publications/85167684672
U2 - 10.1109/ICDE55515.2023.00109
DO - 10.1109/ICDE55515.2023.00109
M3 - Conference contribution
AN - SCOPUS:85167684672
T3 - Proceedings - International Conference on Data Engineering
SP - 1366
EP - 1378
BT - Proceedings - 2023 IEEE 39th International Conference on Data Engineering, ICDE 2023
PB - IEEE Computer Society
Y2 - 3 April 2023 through 7 April 2023
ER -