Abstract
Numerical sequence data from intelligent devices often have quality issues. Existing data cleaning methods focus on repairing the data alone; we address the problem of repairing both data errors and inaccurate constraints. We propose two operations for modifying inaccurate constraints: expanding and compressing their value domains. Our solution includes constraint modification functions and algorithms that prevent under- and over-fitting in data cleaning. Theoretical analysis demonstrates the reliability and effectiveness of the proposed solution, which achieves a repair within a bounded distance of the optimal repair. Experiments on real-life and synthetic datasets show that our bNDCRepair method improves F1-score by 17.6% compared to using the original constraints and achieves the best MNAD. Combining bNDCRepair with the state-of-the-art CVtRepair and Clean4TSDB also yields strong performance on sequential data tasks.
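To make the two constraint operations concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a hypothetical `RangeConstraint` that bounds the change between consecutive readings, and shows how expanding or compressing its value domain changes which points are flagged. The class, the `violations` helper, the sample readings, and the `delta` values are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RangeConstraint:
    """Hypothetical constraint: the change between consecutive
    values must lie in the value domain [low, high]."""
    low: float
    high: float

    def expand(self, delta: float) -> "RangeConstraint":
        # Widen the value domain by delta on both sides,
        # so fewer points are treated as errors.
        return RangeConstraint(self.low - delta, self.high + delta)

    def compress(self, delta: float) -> "RangeConstraint":
        # Narrow the value domain by delta on both sides,
        # so more points are treated as errors.
        if self.low + delta > self.high - delta:
            raise ValueError("delta would make the value domain empty")
        return RangeConstraint(self.low + delta, self.high - delta)


def violations(values, constraint):
    """Return each index i where values[i] - values[i-1]
    falls outside the constraint's value domain."""
    return [
        i
        for i in range(1, len(values))
        if not constraint.low <= values[i] - values[i - 1] <= constraint.high
    ]


# Illustrative readings (not from the paper's datasets).
readings = [10.0, 10.5, 13.0, 13.2, 12.9]
original = RangeConstraint(-1.0, 1.0)

print(violations(readings, original))                # [2]
print(violations(readings, original.expand(2.0)))    # []
print(violations(readings, original.compress(0.6)))  # [1, 2]
```

In this toy setting, a constraint that is too loose leaves real errors unflagged and one that is too tight flags valid readings, which is one way to read the under- and over-fitting concern raised in the abstract.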
| Original language | English |
|---|---|
| Pages (from-to) | 5676-5688 |
| Number of pages | 13 |
| Journal | Proceedings of the VLDB Endowment |
| Volume | 18 |
| Issue number | 13 |
| DOIs | |
| State | Published - 2025 |
| Event | 52nd International Conference on Very Large Data Bases, VLDB 2026, Boston, United States. Duration: 31 Aug 2026 → 4 Sep 2026 |
| Title | bNDCRepair: Cleaning both Data Errors and Inaccurate Constraints on Numerical Sequential Data |