bNDCRepair: Cleaning both Data Errors and Inaccurate Constraints on Numerical Sequential Data

  • Harbin Institute of Technology
  • Tsinghua University

Research output: Contribution to journal, Conference article, peer-review

Abstract

Numerical sequence data from intelligent devices often have quality issues. While existing data cleaning methods focus on repairing data, we address the problem of repairing both data errors and inaccurate constraints. We propose two operations for modifying inaccurate constraints: expanding and compressing their value domains. Our solution includes constraint modification functions and algorithms that prevent under- and over-fitting in data cleaning. Theoretical evaluations demonstrate the reliability and effectiveness of the proposed solution, whose repair lies within a bounded distance of the optimal repair. Experiments on real-life and synthetic datasets show that our bNDCRepair method improves F1-score by 17.6% compared to using the original constraints and performs best on MNAD. Combining bNDCRepair with the state-of-the-art CVtRepair and Clean4TSDB also yields strong performance on sequential data tasks.
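
As a rough illustration of what expanding and compressing a constraint's value domain could mean, the sketch below uses a simple range constraint on each value of a sequence. Everything in it is an assumption made for illustration: the `RangeConstraint` class, its `expand` and `compress` methods, the `delta` parameter, and the clamping repair are hypothetical and are not the constraint modification functions or repair algorithms of bNDCRepair.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RangeConstraint:
    """Hypothetical value-domain constraint: lo <= x <= hi for every value x."""
    lo: float
    hi: float

    def expand(self, delta: float) -> "RangeConstraint":
        # Widen the value domain, so fewer values are flagged as errors.
        return RangeConstraint(self.lo - delta, self.hi + delta)

    def compress(self, delta: float) -> "RangeConstraint":
        # Narrow the value domain, so more values are flagged as errors.
        if 2 * delta > self.hi - self.lo:
            raise ValueError("delta would invert the interval")
        return RangeConstraint(self.lo + delta, self.hi - delta)

    def violations(self, seq: List[float]) -> List[int]:
        # Indices of values that fall outside the value domain.
        return [i for i, x in enumerate(seq) if not (self.lo <= x <= self.hi)]

    def repair(self, seq: List[float]) -> List[float]:
        # Naive stand-in repair: clamp each value to the nearest bound.
        return [min(max(x, self.lo), self.hi) for x in seq]


seq = [10.0, 10.5, 11.25, 30.0, 10.75]
original = RangeConstraint(10.0, 11.0)
expanded = original.expand(0.5)      # domain becomes [9.5, 11.5]
compressed = original.compress(0.25)  # domain becomes [10.25, 10.75]

print(original.violations(seq))    # [2, 3]
print(expanded.violations(seq))    # [3]
print(compressed.violations(seq))  # [0, 2, 3]
print(expanded.repair(seq))        # [10.0, 10.5, 11.25, 11.5, 10.75]
```

In this toy setting, a constraint that is too tight flags the plausible reading 11.25 as an error, while the expanded constraint flags only the outlier 30.0. A constraint that is too loose would behave the opposite way, which is where compression would apply.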

Original language: English
Pages (from-to): 5676-5688
Number of pages: 13
Journal: Proceedings of the VLDB Endowment
Volume: 18
Issue number: 13
State: Published - 2025
Event: 52nd International Conference on Very Large Data Bases, VLDB 2026 - Boston, United States
Duration: 31 Aug 2026 to 4 Sep 2026
