QCFE: An Efficient Feature Engineering for Query Cost Estimation

  • Yu Yan
  • Hongzhi Wang*
  • Junfang Huang
  • Dake Zhong
  • Tao Yu
  • Kaixin Zhang
  • Man Yang
  • Tianqing Wang
  • *Corresponding author for this work
  • Harbin Institute of Technology
  • Huawei Technologies Co., Ltd.

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Query cost estimation is a classical task in database management. Recently, researchers have applied AI-driven methods to query cost estimation to achieve high accuracy. However, two defects in feature design lead to a poor trade-off between time and accuracy. First, existing works encode only the query plan and data statistics and ignore other important variables, such as storage structure, hardware, and database knobs, which also have a significant impact on query cost. Second, existing works suffer from heavy model training and inference costs because of inefficient features, such as the index encoding of write-only workloads. To address these two problems, we first propose an efficient feature engineering method for query cost estimation, called QCFE, which consists of a feature snapshot and a feature reduction algorithm. (1) We design a novel concept called the feature snapshot to efficiently integrate the influence of the missing variables. (2) We propose a difference-propagation feature reduction method for query cost estimation to filter out ineffective features. Compared with state-of-the-art methods, QCFE demonstrates significant improvements on well-known benchmarks: it reduces model training time by up to 50%, improves the mean q-error by 19.8% on TPCH, and delivers up to an 8x speedup in inference throughput.
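
The abstract reports accuracy as mean q-error without defining it. As a rough illustration only, the sketch below uses the definition commonly adopted in the cost-estimation literature, where the q-error of one estimate is max(estimated/actual, actual/estimated) and 1.0 is a perfect estimate. The function names `q_error` and `mean_q_error` are hypothetical and are not taken from the paper, which may compute the metric differently.

```python
from typing import Sequence


def q_error(estimated: float, actual: float) -> float:
    """Q-error of a single estimate (assumed standard definition).

    Returns max(estimated/actual, actual/estimated), so over- and
    under-estimation by the same factor are penalized equally.
    """
    if estimated <= 0 or actual <= 0:
        raise ValueError("q-error is only defined for positive costs")
    return max(estimated / actual, actual / estimated)


def mean_q_error(estimated: Sequence[float], actual: Sequence[float]) -> float:
    """Arithmetic mean of per-query q-errors over a workload."""
    if len(estimated) != len(actual):
        raise ValueError("estimated and actual must have the same length")
    if not estimated:
        raise ValueError("at least one query is required")
    errors = [q_error(e, a) for e, a in zip(estimated, actual)]
    return sum(errors) / len(errors)
```

For example, an estimate of 10 against an actual cost of 5 and an estimate of 5 against an actual cost of 10 both have a q-error of 2.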

Original language: English
Title of host publication: Proceedings - 2024 IEEE 40th International Conference on Data Engineering, ICDE 2024
Publisher: IEEE Computer Society
Pages: 4302-4315
Number of pages: 14
ISBN (Electronic): 9798350317152
DOIs
State: Published - 2024
Event: 40th IEEE International Conference on Data Engineering, ICDE 2024 - Utrecht, Netherlands
Duration: 13 May 2024 to 17 May 2024

Publication series

Name: Proceedings - International Conference on Data Engineering
ISSN (Print): 1084-4627
ISSN (Electronic): 2375-0286

Conference

Conference: 40th IEEE International Conference on Data Engineering, ICDE 2024
Country/Territory: Netherlands
City: Utrecht
Period: 13/05/24 to 17/05/24

Keywords

  • Cost Estimation
  • Feature Engineering
  • Query
