Abstract
Missing data is a common issue in many real-world applications, significantly impacting the efficiency of statistical estimators. This is particularly prevalent in fields like air quality monitoring, where high missing rates and long consecutive gaps frequently occur due to sensor malfunctions. Much of the existing research on univariate time series imputation, however, focuses on scenarios with low missing rates and short gaps, leaving a significant gap in evaluating methods under more complex conditions. In this paper, we fill this gap by examining the robustness of various univariate time series imputation methods, notably under conditions of high missing data rates and long gaps. Through extensive simulations, we find that the autoregressive integrated moving average (ARIMA) combined with Kalman smoothing performs exceptionally well for time series with strong seasonality and trends, even in challenging scenarios. We further validate this method by imputing missing values in particulate matter (PM2.5) data from multi-site air quality monitoring in Beijing, where the imputed values maintained consistency with observed temporal patterns. Overall, our findings underscore the necessity of adopting imputation approaches matched to the individual characteristics of the missing data and the underlying time series patterns.
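As a rough illustration of how Kalman smoothing can fill gaps in a univariate series, the sketch below runs a Kalman filter and a backward smoothing pass over a local level model (whose reduced form is ARIMA(0,1,1)) and replaces each missing value with the smoothed state mean. It is not the implementation evaluated in the paper: the function name `kalman_impute`, the fixed noise variances, and the near-diffuse prior are all assumptions made for this example, and a practical workflow would estimate the variances (and a richer seasonal ARIMA structure) by maximum likelihood.

```python
import math


def kalman_impute(y, sigma2_eps=1.0, sigma2_eta=0.1):
    """Fill missing entries (None or NaN) of y using a local level model.

    Assumed model (hypothetical, for illustration only):
        y_t      = mu_t + eps_t,   eps_t ~ N(0, sigma2_eps)
        mu_{t+1} = mu_t + eta_t,   eta_t ~ N(0, sigma2_eta)
    The variances are fixed inputs here, not estimated from the data.
    """
    n = len(y)
    missing = [v is None or (isinstance(v, float) and math.isnan(v)) for v in y]
    if all(missing):
        raise ValueError("need at least one observed value")

    # Near-diffuse prior centred on the first observed value.
    a = next(v for v, m in zip(y, missing) if not m)
    p = 1e7

    a_pred, p_pred = [0.0] * n, [0.0] * n  # one-step-ahead mean and variance
    a_filt, p_filt = [0.0] * n, [0.0] * n  # filtered mean and variance

    # Forward pass: Kalman filter; the update step is skipped at gaps.
    for t in range(n):
        a_pred[t], p_pred[t] = a, p
        if not missing[t]:
            gain = p / (p + sigma2_eps)
            a = a + gain * (y[t] - a)
            p = (1.0 - gain) * p
        a_filt[t], p_filt[t] = a, p
        p = p + sigma2_eta  # predict the next state (mean is unchanged)

    # Backward pass: fixed-interval (Rauch-Tung-Striebel) smoother.
    a_smooth = a_filt[:]
    for t in range(n - 2, -1, -1):
        c = p_filt[t] / p_pred[t + 1]
        a_smooth[t] = a_filt[t] + c * (a_smooth[t + 1] - a_pred[t + 1])

    # Observed values are kept; only gaps receive smoothed estimates.
    return [a_smooth[t] if missing[t] else y[t] for t in range(n)]


filled = kalman_impute([1.0, None, None, 4.0])
```

Because the smoother uses observations on both sides of a gap, the imputed values move gradually from the level before the gap toward the level after it, which is what makes this family of methods attractive for long consecutive gaps.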
| Original language | English |
|---|---|
| Pages (from-to) | 1602-1622 |
| Number of pages | 21 |
| Journal | Communications in Statistics - Theory and Methods |
| Volume | 55 |
| Issue number | 5 |
| DOIs | |
| State | Published - 2026 |
| Externally published | Yes |
Keywords
- ARIMA state space
- Kalman filtering method
- Missing values
- Air quality data
- Univariate imputation
Title
Robust univariate missing data imputation in time series for air quality data