L2C: Learn to Clean Time Series Data

Mayuresh Hooli, Rabi Mahapatra

2025

Abstract

Decisions in today’s data-driven economy hinge on vast amounts of data from diverse sources such as social media and government agencies, so the accuracy of that data is paramount. However, errors such as missing values and outliers undermine its integrity. To address this, we introduce a novel machine learning framework, L2C (Learn to Clean), designed specifically to clean time series data. Unlike existing methods such as SVR and ARIMA, which handle only one or two types of outliers, L2C integrates techniques from SVR, ARIMA, and Loess to robustly identify and correct all three major types of outliers: global, contextual, and collective. This paper marks the first implementation of a framework capable of detecting collective outliers in time series data. We demonstrate L2C’s effectiveness by applying it to air quality data sampled every 120 seconds from wireless sensors, where it outperforms traditional methods such as ARIMA and Loess in outlier detection and in improving data integrity.
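To make the three outlier categories named above concrete, the following is a minimal illustrative sketch and not the L2C method itself. It swaps in simple median-based rules for SVR, ARIMA, and Loess, and it runs on a synthetic periodic series rather than the air quality data. The function names, thresholds, window size, and the "stuck sensor" definition of a collective outlier are all assumptions made for illustration.

```python
import math
from statistics import median


def global_outliers(series, thresh=3.5):
    """Points far from the bulk of the whole series (robust z-score via MAD)."""
    med = median(series)
    mad = median(abs(v - med) for v in series) or 1e-9
    return {i for i, v in enumerate(series)
            if abs(0.6745 * (v - med) / mad) > thresh}


def contextual_outliers(series, window=5, tol=5.0):
    """Points that deviate from the median of their immediate neighbours.

    `tol` is an assumed tolerance in data units, not a value from the paper.
    """
    half = window // 2
    flagged = set()
    for i, v in enumerate(series):
        lo, hi = max(0, i - half), min(len(series), i + half + 1)
        neighbours = series[lo:i] + series[i + 1:hi]
        if abs(v - median(neighbours)) > tol:
            flagged.add(i)
    return flagged


def collective_outliers(series, min_run=6, tol=1e-6):
    """Runs of at least `min_run` near-identical readings (assumed stuck sensor).

    Each reading in the run may look plausible on its own; only the run
    as a whole is suspicious.
    """
    flagged, start = set(), 0
    for i in range(1, len(series) + 1):
        if i == len(series) or abs(series[i] - series[i - 1]) > tol:
            if i - start >= min_run:
                flagged.update(range(start, i))
            start = i
    return flagged


# Synthetic series: four cycles of a sine wave with period 24 (hypothetical data).
series = [50 + 10 * math.sin(2 * math.pi * i / 24) for i in range(96)]
series[10] = 120.0            # global: far outside the overall value range
series[30] = 42.0             # contextual: within the overall range, wrong for its position
series[63:71] = [47.0] * 8    # collective: a flat run of individually plausible values

glob = global_outliers(series)
ctx = contextual_outliers(series) - glob   # a global outlier also breaks its local context
coll = collective_outliers(series)
print("global:", sorted(glob))
print("contextual:", sorted(ctx))
print("collective:", sorted(coll))
```

The global outlier is subtracted from the contextual set because a value that is extreme for the whole series is necessarily extreme for its neighbourhood too. A fuller treatment would replace the neighbour median with a fitted model of the expected value, which is the role the abstract assigns to SVR, ARIMA, and Loess.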

Paper Citation


in Harvard Style

Hooli M. and Mahapatra R. (2025). L2C: Learn to Clean Time Series Data. In Proceedings of the 10th International Conference on Internet of Things, Big Data and Security - Volume 1: IoTBDS; ISBN 978-989-758-750-4, SciTePress, pages 361-369. DOI: 10.5220/0013421300003944


in Bibtex Style

@conference{iotbds25,
author={Mayuresh Hooli and Rabi Mahapatra},
title={L2C: Learn to Clean Time Series Data},
booktitle={Proceedings of the 10th International Conference on Internet of Things, Big Data and Security - Volume 1: IoTBDS},
year={2025},
pages={361-369},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013421300003944},
isbn={978-989-758-750-4},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 10th International Conference on Internet of Things, Big Data and Security - Volume 1: IoTBDS
TI - L2C: Learn to Clean Time Series Data
SN - 978-989-758-750-4
AU - Hooli M.
AU - Mahapatra R.
PY - 2025
SP - 361
EP - 369
DO - 10.5220/0013421300003944
PB - SciTePress