Abstract
Billions of data points are generated by devices equipped with thousands of sensors, leading to significant data quality issues in time series data. These errors not only complicate time series data management but also compromise the accuracy and reliability of analysis based on such data. Given the noteworthy characteristics of time series data, existing cleaning methods struggle to provide adequate repairs, and tools supporting expressive constraints for time series remain scarce. To address this, we develop Clean4TSDB, a specialized data cleaning system for time series databases. This system integrates three key modules: expressive data quality constraint discovery, violation detection, and multivariate time series repairing, forming a comprehensive “profiling-detection-repair” workflow. Technically, we introduce TSDD, a data quality constraint that effectively captures contextual relationships within multivariate time series, and implement an efficient algorithm for its automated mining. Leveraging both row- and column-based constraints, we propose an effective time series cleaning algorithm. From a system standpoint, Clean4TSDB is pre-configured for seamless integration with time series databases such as Apache IoTDB. Using user-provided and algorithmically mined constraints, it effectively identifies various error patterns and offers reliable cleaning solutions. Furthermore, we establish a comprehensive library of state-of-the-art time series repair algorithms to meet the diverse needs of different management scenarios.
| Original language | English |
|---|---|
| Pages (from-to) | 4377-4380 |
| Number of pages | 4 |
| Journal | Proceedings of the VLDB Endowment |
| Volume | 17 |
| Issue number | 12 |
| State | Published - 2024 |
| Event | 50th International Conference on Very Large Data Bases, VLDB 2024 - Guangzhou, China. Duration: 24 Aug 2024 → 29 Aug 2024 |
| Title | Clean4TSDB: A Data Cleaning Tool for Time Series Databases |