Authors:
Hassiba Laifa; Raoudha Khcherif and Henda Ben Ghezala
Affiliation:
RIADI Laboratory, ENSI, University of Manouba, Manouba, Tunisia
Keyword(s):
Train Delay, Prediction, Machine Learning, Classification, Regression, LightGBM.
Abstract:
Train delay is a critical problem in railway systems. Predicting delays in advance helps passengers re-plan their journeys more reliably. It is also essential for railway operators, who need to check the feasibility of timetable realization and build more efficient train schedules. This paper presents a novel two-level Light Gradient Boosting Machine (LightGBM) approach that combines classification and regression in a hybrid model to predict passenger train delays on the Tunisian railway.
The first level predicts the delay class: delays are divided into 5-minute intervals ([0,5], [6,10], …, [>60]), giving 13 classes in total. The second level then predicts the actual delay in minutes, taking into account the delay class predicted at the first level. The model was trained and tested on historical train operation data collected by the Tunisian National Railways Company (SNCFT), together with infrastructure characteristics. Our methodology consists of the following phases: data collection, data cleaning, complete data analysis, feature engineering, modeling and evaluation. The results indicate that the two-level LightGBM approach outperforms the one-level method as well as the benchmark models.
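
The 5-minute delay classes described above can be illustrated with a short Python sketch. It is not the authors' code: the function names `delay_class` and `class_label` are hypothetical, and the handling of fractional minutes (assigned to the next interval up) and of negative delays (rejected) are assumptions, since the abstract only lists the intervals [0,5], [6,10], …, [>60].

```python
import math

# Twelve 5-minute intervals covering 0..60 minutes, plus one class for delays > 60.
NUM_CLASSES = 13


def delay_class(delay_minutes: float) -> int:
    """Map a delay in minutes to a class index in 0..12 (hypothetical helper)."""
    if delay_minutes < 0:
        # Assumption: early arrivals are not covered by the listed intervals.
        raise ValueError("negative delays are not covered by the listed intervals")
    if delay_minutes > 60:
        return NUM_CLASSES - 1
    # [0,5] -> 0, (5,10] -> 1, ..., (55,60] -> 11.
    # Assumption: a fractional value such as 5.5 falls into the next interval up.
    return max(0, math.ceil(delay_minutes / 5) - 1)


def class_label(index: int) -> str:
    """Return the interval label for a class index, in the abstract's notation."""
    if not 0 <= index < NUM_CLASSES:
        raise ValueError("class index must be between 0 and 12")
    if index == NUM_CLASSES - 1:
        return ">60"
    low = 0 if index == 0 else 5 * index + 1
    high = 5 * (index + 1)
    return f"[{low},{high}]"


if __name__ == "__main__":
    for minutes in (0, 5, 6, 37, 60, 95):
        idx = delay_class(minutes)
        print(minutes, idx, class_label(idx))
```

In a two-level setup of the kind the abstract describes, labels like these would be the target of the first-level classifier, while the raw delay in minutes would remain the target of the second-level regressor.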