Clustering-Based Pattern Prediction Framework for Air Pollution Prediction

Athiruj Poositaporn, Hanmin Jung

2025

Abstract

Accurately predicting patterns from large and complex datasets remains a significant challenge, particularly in environments where real-time predictions are crucial. Despite advances in predictive modeling, a gap persists in effectively integrating clustering techniques with advanced similarity metrics to enhance prediction accuracy. This research introduces a clustering-based pattern prediction framework that integrates K-means with our Overall Difference with Crossover Penalty (OD with CP) similarity metric to predict data patterns. In the experiment, we demonstrated its application to air pollution pattern prediction by comparing 15 model-cluster combinations: five predictive models (Euclidean Distance, Markov Chain, XGBoost, Random Forest, and LSTM), each predicting the next day's pollution pattern at three cluster sizes (K = 10, 20, and 30). Our aim was to address the limitations of traditional clustering methods in pattern prediction by evaluating the performance of each model-cluster combination to determine which yields the most accurate predictions. The results showed that our framework identified the most accurate model-cluster combination. The study therefore highlighted the generalizability of our framework and indicated its adaptability in pattern prediction. In the future, we aim to apply our framework to a Large Language Model (LLM) combined with Retrieval-Augmented Generation (RAG) to enhance in-depth result interpretation. Furthermore, we intend to expand the study to include client engagement strategies to further validate the effectiveness of our framework in real-world applications.
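The pipeline described in the abstract (cluster daily patterns, then predict the next day's cluster at several values of K) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the data are synthetic, the function names (make_days, kmeans, markov_predict) are hypothetical, and plain squared Euclidean distance stands in for the OD with CP metric, which the abstract does not define. Only one of the five predictors, a first-order Markov chain over cluster labels, is shown.

```python
import random
from collections import Counter, defaultdict


def sq_dist(a, b):
    """Squared Euclidean distance (a stand-in for OD with CP, which is not defined here)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's K-means; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # converged: assignments stopped changing
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def markov_predict(labels):
    """Predict the next label as the most frequent successor of the last label."""
    transitions = defaultdict(Counter)
    for cur, nxt in zip(labels, labels[1:]):
        transitions[cur][nxt] += 1
    last = labels[-1]
    if not transitions[last]:
        return last  # no observed successor: fall back to persistence
    return transitions[last].most_common(1)[0][0]


def make_days(n_days=120, seed=1):
    """Synthetic daily patterns: 4 readings per day cycling through 3 regimes."""
    rng = random.Random(seed)
    regimes = [[10, 12, 15, 11], [40, 55, 60, 45], [90, 110, 120, 95]]
    return [[v + rng.gauss(0, 5) for v in regimes[d % 3]] for d in range(n_days)]


days = make_days()
predictions = {}
for k in (10, 20, 30):  # the three cluster sizes named in the abstract
    labels, centroids = kmeans(days, k)
    nxt = markov_predict(labels)
    # The predicted pattern is taken to be the centroid of the predicted cluster.
    predictions[k] = (nxt, centroids[nxt])
    print(f"K={k}: predicted next-day cluster {nxt}")
```

Comparing model-cluster combinations, as the study does, would additionally require a held-out evaluation of each predictor at each K; that step is omitted here because the abstract does not state the evaluation procedure.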



Paper Citation


in Harvard Style

Poositaporn A. and Jung H. (2025). Clustering-Based Pattern Prediction Framework for Air Pollution Prediction. In Proceedings of the 10th International Conference on Internet of Things, Big Data and Security - Volume 1: IoTBDS; ISBN 978-989-758-750-4, SciTePress, pages 428-435. DOI: 10.5220/0013474300003944


in Bibtex Style

@conference{iotbds25,
author={Athiruj Poositaporn and Hanmin Jung},
title={Clustering-Based Pattern Prediction Framework for Air Pollution Prediction},
booktitle={Proceedings of the 10th International Conference on Internet of Things, Big Data and Security - Volume 1: IoTBDS},
year={2025},
pages={428-435},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013474300003944},
isbn={978-989-758-750-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 10th International Conference on Internet of Things, Big Data and Security - Volume 1: IoTBDS
TI - Clustering-Based Pattern Prediction Framework for Air Pollution Prediction
SN - 978-989-758-750-4
AU - Poositaporn A.
AU - Jung H.
PY - 2025
SP - 428
EP - 435
DO - 10.5220/0013474300003944
PB - SciTePress