Intelligent Transportation Systems: A Survey on Data Engineering

Safa Batita, Achraf Makni and Ikram Amous

National School of Electronics and Telecommunications (ENET’Com), University of Sfax, MIRACL Laboratory,

Airport Road, km 4, B.P. 1088, 3018 Sfax, Tunisia

Keywords:

Intelligent Transportation Systems, Databases, Artiﬁcial Intelligence, Advanced Driver Assistance Systems.

Abstract:

This paper presents an examination of data engineering within Intelligent Transportation Systems (ITS), focus-

ing on integrating advanced technologies such as Real-Time Databases (RT-DBs), Graph Databases (GDBs),

and Artiﬁcial Intelligence (AI) to improve ITS capabilities. The decision to focus on database systems and AI

in this paper is based on their crucial roles in shaping modern transportation systems and offers a comprehen-

sive view of the technological framework inﬂuencing ITS. Through an extensive review of existing literature,

the paper explores how these solutions synergistically contribute to data collection, organization, processing,

and extraction of value from various ITS data. The paper analyzes the transformative impact of real-time data

management in connected vehicle systems and the efﬁcacy of GDBs in capturing complex relationships within

intelligent transportation networks. Additionally, it assesses the adaptability of AI in various ITS applications,

including trafﬁc prediction, driver assistance, and accident analysis. Despite their beneﬁts, the paper discusses

persistent challenges related to system complexity, interoperability, data management, and model accuracy,

which impact the widespread deployment of ITS. Furthermore, the paper presents recommendations for ad-

dressing these challenges and emphasizes research directions that require further exploration, underscoring

the importance of intelligent and efﬁcient transportation worldwide.

1 INTRODUCTION

In the structure of modern civilization, transportation

systems are deeply integrated into daily human activ-

ities. With an estimated 40% of the world’s popula-

tion spending at least one hour on the road every day,

it’s clear that transportation is central to our collective

existence. This widespread dependence has increased

signiﬁcantly in recent years, a trend that reﬂects the

rapid urbanization and globalization seen around the

world. As a result, transportation infrastructure is at

an intersection, facing multiple opportunities for in-

novation and development while handling numerous

challenges.

ITS are advanced applications that offer innova-

tive services for different modes of transport and traf-

ﬁc management. ITS use information and communi-

cation technologies to improve transportation system

efﬁciency, safety, and environmental performance.

These systems, supported by sophisticated computational models and databases, are central to the transition towards autonomous vehicles and interconnected urban traffic networks. This paper comprehensively analyzes emerging research on data engineering for ITS applications relating to the integration of advanced technologies like RT-DBs, GDBs, and AI.

https://orcid.org/0009-0005-2988-6516

https://orcid.org/0000-0002-6992-5824

https://orcid.org/0000-0002-5893-9833

The survey examines the role of RT-DBs in

enhancing Advanced Driver Assistance Systems

(ADAS), a key ITS application for vehicle safety and

autonomy. It explores the use of GDBs in effectively

managing the complex relationships inherent in ITS

networks, including connected vehicles, smart infras-

tructure, and transportation systems. Additionally, the

transformational impact of AI and Machine Learning

(ML) techniques is analyzed across various ITS im-

plementations encompassing trafﬁc prediction, con-

gestion avoidance, driver behavior modeling, and ac-

cident analysis. These results underscore the signiﬁ-

cance of integrating advanced database systems and

AI technologies in shaping the future of intelligent

transportation.

The rest of this paper is organized as follows: Sec-

tion 2 presents ITS integrating databases. Section

3 delves into the ITS using AI. Section 4 offers a

comprehensive discussion on the current state of ITS

technology, accompanied by recommendations and research challenges for future initiatives. The paper concludes with Section 5.

Batita, S., Makni, A. and Amous, I.
Intelligent Transportation Systems: A Survey on Data Engineering.
DOI: 10.5220/0012857300003756
Paper published under CC license (CC BY-NC-ND 4.0)
In Proceedings of the 13th International Conference on Data Science, Technology and Applications (DATA 2024), pages 169-179
ISBN: 978-989-758-707-8; ISSN: 2184-285X

2 ITS INTEGRATING DATABASES

ITS rely on advanced technologies to handle the vast

volume and velocity of data produced by connected

vehicles, transit networks, and smart infrastructure.

Effective data storage and processing capabilities are

crucial for extracting value and enabling enhanced

transportation analytics and services. This section

analyzes two key data management technologies ap-

plied across various ITS implementations: RT-DBs

and GDBs.

This section explores research that integrates real-

time relational databases and enhances ITS represen-

tation with GDBs.

2.1 ITS Enhanced with Real-Time Relational Databases

Modern ITS face the significant challenge of managing increasingly voluminous data, much of which is generated in real time. This makes ensuring data consistency and validity critical. To address this challenge, integrating databases

into ADAS has been proposed in several studies. RT-

DBs are especially instrumental in ADAS, empow-

ering reliable analysis of dynamic vehicle data from

various onboard sensors. By consistently integrating

with ADAS, RT-DBs ensure efﬁcient management of

constantly evolving driving variables such as vehicle

position, speed, and direction. This integration im-

proves the overall effectiveness and responsiveness of

ADAS.

In their pursuit of advancing autonomous ADAS,

(Marouane et al., 2016) focused on leveraging pat-

tern reuse methods. They introduced the integration

of a real-time database system and formulated three

distinct design patterns. These patterns were crafted

to articulate real-time constraints and manage real-

time data, considering both structural and dynamic as-

pects of the system. Continuing this line of research,

(Marouane et al., 2018) proposed a specialized UML

(Uniﬁed Modeling Language) proﬁle, named UML-

RTDB2 (UML-RealTimeDataBase), to express pat-

tern variability and real-time constraints and convey

non-functional properties. This further reﬁned the

framework for developing sophisticated autonomous

ADAS.

In turn, (Elleuch et al., 2023) aimed to enhance the performance of Cooperative Advanced Driver Assistance Systems (C-ADAS) with a

focus on improving road safety. This enhancement

addressed various scenarios such as collision avoid-

ance at intersections, obstacle and dangerous area de-

tection (Elleuch et al., 2021), and management of

overtaking maneuvers (Elleuch et al., 2019). The key

aspect of their approach involved effectively manag-

ing the signiﬁcant amount of sensor data transmitted

through Vehicle-To-Vehicle (V2V) and Vehicle-To-

Infrastructure (V2I) communications. To address this

challenge, the authors proposed integrating a real-time database into C-ADAS within the framework of Vehicular Ad-hoc NETworks (VANETs), which would

be managed by a suitable Real-Time Database Man-

agement System (RT-DBMS). The incorporation of

RT-DB has proven to enhance communication efﬁ-

ciency by reducing the number of messages sent, ex-

changed, and lost. Additionally, it improves response

times by establishing new formulas, conditions, and

rules based on the information stored in the real-time

database.

In summary, the approaches cited above signify considerable progress in the development of ADAS and C-ADAS technologies. By emphasizing the reuse of patterns, integrating real-time databases, and formulating specialized UML profiles, these approaches mark a move towards autonomous driving systems that are not only more intricate and capable but also safer and more efficient.

Note that these approaches are based on the real-

time Relational Database (RDB) model proposed by

(Idoudi et al., 2008). According to this model, real-time data is defined as a quadruple: $d = (d_{value}, d_{stamp}, d_{avi}, mde)$. Here, $d_{value}$ represents the current value of the data, $d_{stamp}$ denotes the timestamp of the value update, $d_{avi}$ signifies the absolute validity interval, and $mde$ refers to the maximum permissible error between

the actual and stored values. This model is imple-

mented to ensure precise data management and pro-

cessing in autonomous driving systems.
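The quadruple above maps naturally onto a small record type. The following Python sketch is illustrative only and is not taken from (Idoudi et al., 2008): the class name `RealTimeData`, the freshness rule (a value is usable until `stamp + avi`), and the update rule (write only when the deviation exceeds `mde` or the stored value has expired) are assumptions chosen to show how the four fields could interact.

```python
from dataclasses import dataclass


@dataclass
class RealTimeData:
    """Real-time data item d = (d_value, d_stamp, d_avi, mde)."""
    value: float  # d_value: current stored value
    stamp: float  # d_stamp: time of the last update (seconds)
    avi: float    # d_avi: absolute validity interval (seconds)
    mde: float    # mde: maximum tolerated error between actual and stored value

    def is_valid(self, now: float) -> bool:
        # Assumed freshness rule: the value is usable until stamp + avi.
        return now <= self.stamp + self.avi

    def needs_update(self, measured: float) -> bool:
        # Assumed rule: a write is needed only when the deviation exceeds mde.
        return abs(measured - self.value) > self.mde

    def update(self, measured: float, now: float) -> bool:
        # Refresh when the stored value is stale or too far from the measurement.
        if self.needs_update(measured) or not self.is_valid(now):
            self.value, self.stamp = measured, now
            return True
        return False


# Hypothetical vehicle-speed item refreshed by an onboard sensor.
speed = RealTimeData(value=50.0, stamp=0.0, avi=0.5, mde=1.0)
print(speed.is_valid(now=0.4))       # still inside the validity interval
print(speed.update(50.6, now=0.45))  # small deviation, still fresh: no write
print(speed.update(53.0, now=0.48))  # deviation above mde: value refreshed
```

Under these assumptions, small sensor fluctuations do not trigger writes, which is one plausible way a real-time database could limit update traffic while keeping stored values temporally consistent.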

Despite the beneﬁts of the real-time aspect, incor-

porating RDB presents challenges like implementa-

tion intricacies, data processing complexities, and

scalability and adaptability issues. As this domain

progresses, future research is expected to address these challenges, aiming for autonomous driving systems that are more adaptable and flexible while

achieving a balance between technological sophisti-

cation and practical usability.

2.2 ITS Empowered by Graph Databases

Graph databases are increasingly signiﬁcant in ITS

due to their exceptional ability to handle complex and

interconnected data, which is essential for modern

transportation networks. This technology is pivotal

for managing and analyzing vast amounts of data gen-

erated by various components of ITS, including trafﬁc

ﬂow, transportation networks, and vehicle communi-

cations.

Several studies have delved deeply into this area

to illustrate the signiﬁcance of GDBs in ITS further.

For instance, the research by (Oberoi et al., 2018) pro-

poses a structured time-varying graph (TVG) model

for understanding dynamic road trafﬁc environments,

which integrates both spatial and temporal dimen-

sions. The purpose of this model is to record and an-

alyze the interactions and changes within urban traf-

ﬁc systems while considering static and dynamic el-

ements such as vehicles and road infrastructures. By

integrating time-varying node and edge presence and

labeling functions, the model improves the precision

of trafﬁc ﬂow analysis in urban environments. The

paper details the theoretical foundation for the TVG

and grounds it in existing work on spatial graph

modeling. The goal of this study is to establish the

groundwork for future application development, with

a focus on using these models to develop graph algo-

rithms that can analyze and interpret trafﬁc patterns.

The incorporation of real-world trafﬁc data, gathered

by CEREMA in Rouen, France, will simplify the im-

plementation and evaluation of the proposed models

and algorithms.
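The idea of time-varying node and edge presence can be illustrated with a short Python sketch. It is a simplified assumption-based illustration, not the formal model of (Oberoi et al., 2018): the class `TimeVaryingGraph`, the half-open presence intervals, and the example entities `lane_1` and `car_A` are hypothetical, and the labeling functions mentioned above are omitted.

```python
from dataclasses import dataclass, field


@dataclass
class TimeVaryingGraph:
    # Presence intervals [start, end) per node and per edge; times are illustrative.
    node_presence: dict = field(default_factory=dict)
    edge_presence: dict = field(default_factory=dict)

    def add_node(self, node, start, end):
        self.node_presence.setdefault(node, []).append((start, end))

    def add_edge(self, u, v, start, end):
        self.edge_presence.setdefault((u, v), []).append((start, end))

    @staticmethod
    def _present(intervals, t):
        # True when t falls inside at least one presence interval.
        return any(start <= t < end for start, end in intervals)

    def snapshot(self, t):
        """Return the static graph (nodes, edges) that exists at time t."""
        nodes = {n for n, iv in self.node_presence.items() if self._present(iv, t)}
        edges = {
            (u, v)
            for (u, v), iv in self.edge_presence.items()
            if self._present(iv, t) and u in nodes and v in nodes
        }
        return nodes, edges


tvg = TimeVaryingGraph()
tvg.add_node("lane_1", 0, 100)           # static infrastructure, present throughout
tvg.add_node("car_A", 10, 40)            # vehicle observed from t=10 to t=40
tvg.add_edge("car_A", "lane_1", 10, 25)  # car_A is on lane_1 during [10, 25)

print(tvg.snapshot(20))  # both nodes and the "is on" edge
print(tvg.snapshot(30))  # both nodes, edge no longer present
```

Taking snapshots at successive times yields a sequence of static graphs, which is one simple way to analyze how interactions between vehicles and infrastructure change over time.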

(Wirawan et al., 2019) proposed a database design for multimodal transportation, focusing on Semarang. They developed this model using an Oriented Entity-Relationship Diagram (O-ERD), later converting it into a graph database schema executed on the Neo4j graph database. The model includes three main nodes: Shelter, Angkot Stopper, and Closer Place, representing Bus Rapid Transit (BRT) shelters, city transportation, and nearby locations. A unique feature is the "Angkot Stopper" node, symbolizing angkots with flexible stopping points. The model's efficiency was tested through search queries written in the Cypher query language, particularly using the "collect" function to enhance path formation. This approach differs from previous studies in that it integrates routing algorithms within the graph database system, simplifying route construction and improving route discovery based on passenger destination proximity.
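To make the notion of route discovery over typed nodes concrete, the sketch below searches a tiny in-memory graph. It deliberately swaps in a plain breadth-first search in Python rather than the Cypher queries run on Neo4j in the study; the node names (`S1`, `K1`, `Campus`), the links, and the function `shortest_route` are hypothetical and only the three node types come from the text.

```python
from collections import deque

# Hypothetical nodes, typed after the three node kinds named in the text.
node_type = {
    "S1": "Shelter", "S2": "Shelter", "S3": "Shelter",
    "K1": "Angkot Stopper",
    "Campus": "Closer Place",
}
# Hypothetical undirected connections between stops and places.
links = [("S1", "S2"), ("S2", "S3"), ("S2", "K1"), ("K1", "Campus")]

adjacency = {name: [] for name in node_type}
for a, b in links:
    adjacency[a].append(b)
    adjacency[b].append(a)


def shortest_route(start, goal):
    """Breadth-first search returning the route with the fewest hops, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in adjacency[path[-1]]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


route = shortest_route("S1", "Campus")
print([(stop, node_type[stop]) for stop in route])
```

In a graph database the same question would typically be expressed as a path pattern in the query language, so that route construction happens inside the database rather than in application code.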

(Bhogaram et al., 2020) highlighted the utility of

the graph database Neo4j for analyzing transportation

networks. They used centrality algorithms to identify

important nodes and critical paths within the trans-

portation network, improving resilience against chal-

lenges like heavy trafﬁc and natural disasters. This

study demonstrates the efﬁciency of GDBs in analyz-

ing and optimizing transportation systems.

(Chandra et al., 2020) introduced GraphRQI, an algorithm for classifying driving behaviors by analyzing movement paths. This method employs a supervised learning algorithm and spectral analysis of traffic graphs to enhance computational efficiency. In the GraphRQI model, drivers are depicted as nodes in a sparse, undirected, and unweighted graph, with their interactions indicated by edges. It classifies behaviors such as aggressive or conservative driving based on interactions within traffic graphs. Its eigenvalue algorithm computes the spectrum of the traffic graph at twice the speed of previous methods. Tests using traffic videos and autonomous driving datasets, particularly in urban areas, showed a 25% improvement in accuracy over existing driver behavior classification methods. However, its accuracy depends on the reliability of the tracking technology used to monitor road agent positions.

(Zhang et al., 2021) presented a novel method to

analyze the structure and behavior of Autonomous

Transportation Systems (ATS). ATS represents an ad-

vanced form of intelligent transportation, known for

its self-organizing and autonomous features. The

team’s approach involved creating a knowledge graph

network to represent the ATS, categorizing it into

ﬁve distinct nodes: Technology, Demand, Service,

Function, and Component. Each of these nodes cap-

tures different aspects of the ATS. The research used

Neo4j to store structured data, forming a compre-

hensive knowledge graph of the transportation sys-

tem network. This graph is composed of two layers:

the model layer and the data layer. The model layer

outlines the relationships between various entities in

the ATS, according to the ﬁve elements and their at-

tributes. This creates a structural framework for the

system. The data layer, meanwhile, uses Neo4j’s ca-

pabilities to store and visually present data related to

the ATS.

(Bollen et al., 2021) explored Neo4j for managing data in sensor-equipped transportation networks, centering on spatial and temporal data querying using Cypher with custom functions and procedures. Their study aims to bridge the gap between spatiotemporal data and queries, laying the groundwork for advanced analytical improvements in ITS.

(García et al., 2022) introduced the interoperable graph-based Local Dynamic Map (iLDM) for

autonomous and connected vehicles. This local

database effectively integrates both static and dy-

namic data from multiple sources employing Neo4j

and OpenLABEL, ensuring adaptability in the rapidly

changing vehicle technology sector. A thorough per-

formance testing process, involving a vehicle discov-

ery service function, showcased iLDM’s superiority

over other LDM implementations, making it highly

practical for the real-time development of advanced

driver assistance systems.

(Maduako et al., 2022) introduced a novel ap-

proach for distinguishing high-risk trafﬁc accident lo-

cations, incorporating Neo4j to illustrate the dynamic

relationship between accidents and the road network

as a space-time-varying graph. By analyzing net-

work connectivity through graph analytics metrics

such as degree centrality and PageRank, the research

identiﬁes high-risk areas for urban planners, enabling

proactive accident prevention.
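The two graph metrics named above can be sketched in a few lines of Python. This is a generic, assumption-based illustration rather than the procedure of (Maduako et al., 2022): the road network `roads`, the damping factor of 0.85, and the fixed number of iterations are hypothetical choices, and the study itself relied on Neo4j rather than hand-written code.

```python
def degree_centrality(graph):
    """Fraction of the other nodes each node is directly connected to."""
    n = len(graph)
    return {node: len(neigh) / (n - 1) for node, neigh in graph.items()}


def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank on an adjacency dict {node: [neighbours]}."""
    n = len(graph)
    rank = {node: 1.0 / n for node in graph}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in graph}
        for node, neighbours in graph.items():
            if not neighbours:
                # Dangling node: spread its rank evenly over all nodes.
                for other in graph:
                    new_rank[other] += damping * rank[node] / n
            else:
                share = damping * rank[node] / len(neighbours)
                for neigh in neighbours:
                    new_rank[neigh] += share
        rank = new_rank
    return rank


# Hypothetical undirected road network: intersections A-E, stored symmetrically.
roads = {
    "A": ["B", "C", "D"],
    "B": ["A", "C"],
    "C": ["A", "B"],
    "D": ["A", "E"],
    "E": ["D"],
}

dc = degree_centrality(roads)
pr = pagerank(roads)
print(max(dc, key=dc.get), max(pr, key=pr.get))  # most connected / highest-ranked node
```

In an accident-analysis setting, nodes that score highly on such metrics would be candidates for closer inspection, although the link between centrality and risk depends on the data and is not established by this sketch.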

(Zhang et al., 2022) presented a comprehen-

sive analysis of trafﬁc accident data and constructed

a knowledge graph to enhance trafﬁc safety man-

agement. By integrating multidimensional factors

such as people, vehicles, roads, and the environ-

ment, the knowledge graph facilitates the acquisition

and reuse of valuable insights within structured case

data. Through visualization analysis, including ac-

cident portraits, classiﬁcations, statistics, and corre-

lation paths, the knowledge graph provides complex

relationships among accident elements. This helps

both researchers and trafﬁc management departments

better understand accident characteristics and imple-

ment effective measures to avoid accidents and im-

prove overall safety.

(Yuan et al., 2023) focused on developing a

knowledge graph for trafﬁc safety management us-

ing Neo4j. The study addresses the complexity and

scattered nature of trafﬁc safety data by integrating

various data types into a structured knowledge graph.

It includes creating node and relationship entities to

represent different aspects of trafﬁc safety, like ille-

gal acts, vehicle failures, and emergency responses.

The study also discusses the implementation of

query functions using Cypher and rule matching for

effective data analysis and decision-making in trafﬁc

safety management. It highlights the potential of us-

ing Neo4j for organizing and analyzing complex data

in the context of trafﬁc safety.

Note that the previous studies collectively under-

score the importance of GDBs, through Neo4j, as in-

dispensable tools for transportation network analysis

and administration, driving advancements in safety,

efﬁciency, and resilience across various transportation

domains. Each study presented aimed to highlight the unique capabilities and applications of graph databases through Neo4j. For instance, (Oberoi et al.,

2018) demonstrated that using a graph representation

in Neo4j facilitated effective modeling and analysis

of the dynamic spatial and temporal aspects of the ur-

ban intersection scenario. While (Oberoi et al., 2018)

concentrated on developing theoretical graph mod-

els, the experimental ﬁndings showcased the practi-

cal utility of using Neo4j’s graph queries for real-time

collision detection. Similar to (Wirawan et al., 2019)

and (Bhogaram et al., 2020), the experiments utilized

Neo4j’s capabilities for route optimization and iden-

tifying critical paths, further validating the beneﬁts of

GDBs for transportation network analysis. In addi-

tion, the integration of AI techniques, like neural net-

works, aligns with (Chandra et al., 2020) and (Mad-

uako et al., 2022), which utilized Neo4j to analyze

driving behaviors and mitigate risks within the trans-

portation systems, emphasizing its role in improving

accuracy and facilitating proactive risk identiﬁcation.

Other studies, including (Zhang et al., 2021), (García et al., 2022), (Zhang et al., 2022), and (Yuan et al.,

2023) further investigate the development of compre-

hensive knowledge graphs for ITS, with a focus on

real-time data management, spatial queries, and vehi-

cle trajectory prediction. Combining these functions

within a uniﬁed knowledge graph could pave the way

for future advancements, using Neo4j’s capabilities

to improve safety, efﬁciency, and robustness across

transportation networks. Each cited study highlights the importance of GDBs in ITS, demonstrating

their value in improving network operations, analyz-

ing accidents, managing risks proactively, and opti-

mizing trafﬁc ﬂow in various transportation settings.

Table 1 offers a comprehensive comparison of var-

ious ITS that have integrated GDBs. Each approach

is evaluated based on its objective, use of real-world

and real-time data, decision-making, reliance on data

quality, and complexity. The table shows variations

among approaches, with some excelling in leveraging

real-world and real-time data, whereas others distin-

guish themselves in decision-making. In any case, it’s

essential to recognize that each approach has its own

set of strengths and weaknesses.

For instance, methodologies heavily dependent on real-world and real-time data may yield more precise and timely insights, thereby improving ITS.

However, managing and processing such extensive

data volumes can pose challenges.

In conclusion, while Table 1 gives an extensive overview of these various approaches and their contributions to ITS using GDBs, it is important to consider the strengths and weaknesses of these approaches.

3 ITS USING AI

Advancements in artificial intelligence have significantly influenced the evolution of intelligent transportation systems. These innovations are revolutionizing our strategies concerning urban mobility, traffic management, and vehicle safety. Within this dynamic environment, numerous studies have emerged, each investigating various applications and approaches of artificial intelligence to enhance the efficiency and efficacy of transportation systems. This review delves into significant research endeavors that highlight the integration of these technologies in managing and improving traffic flow, predicting driver actions, and optimizing safety protocols at intersections and urban roads. The incorporation of these technologies is not just improving current systems but also paving the way for future advancements in transportation safety and efficiency.

(Meena et al., 2020) introduced a tool to predict traffic flow accurately and in a timely manner, considering diverse environmental factors that can affect traffic, such as traffic signals, accidents, and road maintenance. Given the recent exponential increase in traffic data and the move toward big data concepts for transportation, today's traffic prediction methods that rely on traffic models are still insufficient for real-world applications. To analyze the vast amounts of transportation system data with less complexity, the authors intend to use machine learning, genetic algorithms, soft computing, deep learning algorithms, and image processing techniques for traffic sign recognition. The proposed algorithm reduced complexity and showed greater accuracy than previous algorithms.

(Hu et al., 2020) simulated driver behavior at signalized intersections under diverse traffic scenarios involving many vehicle types like cars, buses, and motorized three-wheelers. This research collects real-world GPS data and video data from vehicles approaching intersections with red signals in Delhi and Mumbai, India. It examines the acceleration and deceleration patterns of these vehicles to determine the impact zone of the intersection, that is, the distance from the intersection at which drivers begin decelerating after observing the red signal. The research aims to classify drivers as aggressive, normal, or timid based on their acceleration and deceleration behavior. Nevertheless, it finds that drivers cannot be easily classified, and their behavior is better represented by a continuous normal distribution rather than discrete classes. The analysis, which does not employ machine learning or deep learning techniques, emphasizes the complexity of modeling driver behavior in diverse traffic conditions and the dependency on high-quality GPS and video data.

(Lv et al., 2020) used Deep Learning (DL) to address safety issues in ITS. The research examines various aspects such as data transmission performance, prediction accuracy, and route change strategies. In the analysis of the system's data transmission performance, it is found that when the probability of successful transmission is 100% and the λ value is between 0.01 and 0.05, the result is closest to the actual result, and the data delay is the smallest. In the analysis of prediction accuracy, using the Gated Recurrent Unit (GRU) and Long Short-Term Memory (LSTM) algorithms, the authors found that, across different types of cases, the improved system has the best prediction performance with increasing iterations. Further analysis of the system's route guidance strategy shows that it can effectively inhibit congestion propagation in the face of congested road sections and achieve timely evacuation of traffic congestion. As a result, the improved ITS can significantly reduce system data transmission delay, improve prediction accuracy, and effectively change the path in the face of congestion to suppress congestion propagation, providing an experimental reference for the further development of transportation.

(Olayode et al., 2021) compared the Markov

Chain Model (MCM) and the Artiﬁcial Neural Net-

work (ANN) model for predicting vehicle trafﬁc ﬂow

at signalized intersections. Trafﬁc datasets were ob-

tained from South African highways, roads, and inter-

sections, courtesy of the South African Department

of Transport. This trafﬁc information was obtained

using sophisticated trafﬁc monitoring equipment and

techniques, such as inductive loop detectors, video

cameras, and GPS-controlled equipment stationed

throughout the road. In the ANN model, 100 sets of

trafﬁc data were considered, 70% for learning, 15%

for testing, and 15% for validation. According to the

results obtained in this study, the best trafﬁc dataset

training performance was obtained when the number

of hidden neurons was 9, giving a good coefﬁcient of

determination of 0.96304.
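The data partitioning and the reported metric can be illustrated with a short Python sketch. It is not the code of (Olayode et al., 2021): the helper names `split_70_15_15` and `r_squared`, the random shuffling with a fixed seed, and the use of integers as stand-ins for traffic records are assumptions; only the 70/15/15 proportions and the coefficient of determination come from the text.

```python
import random


def split_70_15_15(samples, seed=0):
    """Shuffle and split samples into training, testing, and validation parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 70 // 100
    n_test = len(shuffled) * 15 // 100
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]
    return train, test, validation


def r_squared(observed, predicted):
    """Coefficient of determination R^2 = 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# 100 placeholder records standing in for the 100 sets of traffic data.
train, test, validation = split_70_15_15(range(100))
print(len(train), len(test), len(validation))  # 70 15 15
```

A value of R^2 close to 1, such as the 0.96304 reported above, means that the predictions explain most of the variance in the observed traffic flow.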

Table 1: Comparative table of ITS that have integrated GDB.

| Approach | Objective | Real-world data | Real-time data | Decision making | Reliance on data quality | Complexity analysis |
|---|---|---|---|---|---|---|
| (Oberoi et al., 2018) | Develop dynamic traffic graph algorithms | yes | yes | no | yes | yes |
| (Wirawan et al., 2019) | Design multimodal transportation database | no | no | no | yes | yes |
| (Bhogaram et al., 2020) | Analyze critical transport paths | yes | yes | no | yes | no |
| (Chandra et al., 2020) | Classify driving behaviors | yes | no | no | yes | no |
| (Zhang et al., 2021) | Create and analyze ATS networks | yes | yes | no | no | yes |
| (Bollen et al., 2021) | Optimize sensor network queries | yes | no | yes | no | yes |
| (García et al., 2022) | Create an interoperable LDM with OpenLABEL | no | no | no | no | yes |
| (Maduako et al., 2022) | Identify high-risk traffic locations | yes | yes | no | no | yes |
| (Zhang et al., 2022) | Analyze traffic accident data | no | no | yes | no | yes |
| (Yuan et al., 2023) | Develop traffic safety graphs | no | no | yes | no | no |

(Karri et al., 2021) aimed to enhance safety at signalized intersections by employing machine learning to address the challenges drivers face in the dilemma zone, the critical moment when a traffic light turns yellow. The research used Support Vector Machine (SVM) and K-Nearest Neighbors (KNN) to classify driver decisions as safe or unsafe. It analyzed behaviors from 49 drivers in varied environments. It found that except for the cubic SVM kernel, all SVM approaches predicted behavior with over 85% accuracy, with the linear SVM being the most precise. Although the coarse Gaussian SVM ranked second in accuracy, it demanded more computation time. KNN and Linear Discriminant Analysis also demonstrated high accuracy rates, with 90.1% and 89.4% respectively. The findings offer crucial insights into the efficacy of different ML techniques in predicting driver behavior and potentially reducing accidents at intersections.
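As a rough illustration of how a K-Nearest Neighbors classifier assigns a safe or unsafe label, consider the following Python sketch. It is not the pipeline of (Karri et al., 2021): the two features (speed and distance to the stop line), the six training points, their labels, and the function `knn_predict` are all hypothetical, and no feature scaling is applied.

```python
from collections import Counter
from math import dist


def knn_predict(train_points, train_labels, query, k=3):
    """Label the query by majority vote among its k nearest training points."""
    order = sorted(range(len(train_points)),
                   key=lambda i: dist(train_points[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical features at yellow onset: (speed in m/s, distance to stop line in m).
points = [(8.0, 40.0), (9.0, 35.0), (10.0, 45.0),
          (18.0, 20.0), (20.0, 15.0), (22.0, 25.0)]
labels = ["safe", "safe", "safe", "unsafe", "unsafe", "unsafe"]

print(knn_predict(points, labels, (19.0, 18.0)))  # nearest neighbours are all "unsafe"
print(knn_predict(points, labels, (9.0, 42.0)))   # nearest neighbours are all "safe"
```

With real driving data, the choice of features, the value of k, and the normalization of each feature would strongly affect accuracy, which is why studies of this kind compare several classifiers.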

(Bagheri et al., 2022) proposed an Artiﬁcial Neu-

ral Network-based simulation model for gap accep-

tance behavior. This model was developed through

ANN simulations, leveraging real-world data from a

comprehensive database collected at a stop-controlled

intersection in New Jersey. The practicality of inte-

grating this model into a microscopic simulation tool

was evaluated using the Simulation of Urban MObil-

ity (SUMO) package’s Application Programming In-

terface (API). The ANN model was trained to mimic

drivers’ gap acceptance decisions and subsequently

implemented in SUMO through its API, enabling the

simulation of driver behavior at intersections. This

model was benchmarked against the standard SUMO

settings and a calibrated version of SUMO based on

waiting times and acceptable deviations of vehicles

on the minor road approach. The comparative analy-

sis revealed that the ANN-based model outperformed

the default and calibrated SUMO models in terms of

the selected output metrics. Furthermore, the study

highlighted that the ANN model yielded a more accu-

rate representation of vehicle driving behavior on the

major road approach to the intersection, indicating its

potential for enhancing the realism and accuracy of

trafﬁc simulations.

(Singh et al., 2022) distinguished between the In-

tersection Zone Of Inﬂuence (IZOI) and the middle

of a block by analyzing the driver’s acceleration and

deceleration maneuvers. This behavioral data is cap-

tured using a Global Positioning System (GPS) in ve-

hicles, particularly after drivers encounter a red signal

at an intersection. Additionally, the study attempts to

determine the optimal approach length for intersec-

tion simulations that affect driver behaviors, catego-

rizing them as aggressive, normal, or shy based on

their acceleration/deceleration patterns. This comprehensive approach provides a nuanced understanding of how different zones at intersections influence

driver behavior.

(Bharadiya, 2023) focused on exploring the piv-

otal role of ML and AI in the development of smart

cities. Its primary goal is to understand how these

technologies contribute to managing growing urban

areas, enhancing economic growth, reducing energy

consumption, and improving residents’ living stan-

dards. Additionally, the study examines the infor-

mation ﬂow related to Information and Communica-

tion Technology (ICT) in smart cities. Methodologi-

cally, this research encompasses conducting surveys

to identify typical technologies supporting commu-

nication in smart cities and systematically evaluating

current trends in publications concerning ICT in these

urban areas. ML and AI techniques are employed to

analyze and interpret the data gathered. The ﬁndings

reveal that ML and AI are instrumental in various as-

pects of smart city development, especially in ITS.

These technologies are used for tasks like modeling

and simulation, dynamic routing, congestion manage-

ment, and intelligent trafﬁc control, extending their

use across different modes of transportation such as

rail and road travel.

(Sayed et al., 2023) provided a comprehensive re-

view of ML and DL techniques utilized in trafﬁc pre-

diction, along with addressing the challenges inherent

in applying ML and DL in this domain. The rapid

expansion of the Internet of Things (IoT) has facili-

tated the emergence of smart cities, with ITS at their

core, aiming to enhance transportation efﬁciency and

mobility, particularly in addressing trafﬁc congestion.

With the increasing adoption of artiﬁcial intelligence

approaches, the accuracy of trafﬁc ﬂow prediction

models has improved signiﬁcantly.

(Shaffiee Haghshenas et al., 2023) focused on predicting the Level of Road Crash Severity (LRCS) using ML methods applied to real data from 1627 accidents on roads in Calabria, Italy. The main

objectives include building accurate prediction mod-

els, comparing the performance of ANN and Convo-

lutional Neural Networks (CNN), and identifying the

most influential parameters through sensitivity

analysis. Results indicate that, although the gap in

accuracy between the two models is not significant,

the CNN model scores higher than the ANN model,

reaching 68.4% accuracy compared to 61.7%.

Sensitivity analysis reveals

the number of vehicles and road elements as the most

and least important factors affecting LRCS, respec-

tively. The study concludes that these models offer

valuable tools for predicting LRCS, with variations

depending on speciﬁc case studies.

The advancements in AI have significantly

influenced the evolution of ITS, as is evident in the

wide array of studies examining its applications.

These studies address challenges ranging from

enhancing safety and traffic flow prediction to

facilitating smart city development and crash

severity prediction. Thus, (Lv

et al., 2020) focused on improving safety within ITS

through DL, highlighting enhancements in data trans-

mission performance and congestion management. In

contrast, (Olayode et al., 2021) conducted a compar-

ative analysis between the MCM and ANN for traf-

ﬁc ﬂow prediction at signalized intersections, with

the ANN demonstrating superior performance. (Karri

et al., 2021) aimed to enhance safety at signalized in-

tersections by employing ML techniques to classify

driver decisions during critical moments. Conversely,

(Bagheri et al., 2022) proposed an Artiﬁcial Neu-

ral Network-based simulation model for gap accep-

tance behavior, surpassing standard simulation mod-

els in accuracy. Additionally, (Singh et al., 2022) dif-

ferentiated between intersection zones’ inﬂuences on

driver behavior, providing nuanced insights into traf-

ﬁc safety measures. (Bharadiya, 2023) explored the

role of ML and AI in smart city development, em-

phasizing their contributions to urban growth man-

agement and transportation efﬁciency. (Sayed et al.,

2023) provided a comprehensive review of ML and

DL techniques in trafﬁc prediction, highlighting their

increasing accuracy in addressing congestion chal-

lenges. Lastly, (Shafﬁee Haghshenas et al., 2023)

employed ML methods to predict road crash

severity, with CNN outperforming ANN. Taken

together, these studies highlight the breadth of AI

applications in ITS, from enhancing safety and

traffic flow prediction to facilitating smart city

development and crash severity prediction.

Table 2 presents a comprehensive overview of

studies within ITS using AI, detailing their objectives

as well as the strengths and weaknesses related to

each approach. Several studies leverage real-world data

to address complex challenges in transportation safety

and efﬁciency. Whereas these approaches harness the

power of ML or DL algorithms to extract valuable in-

sights from vast datasets, they also encounter chal-

lenges related to the complexity of modeling and the

dependency on data quality. The use of AI in these

studies offers promising advancements in understand-

ing and managing trafﬁc ﬂow, driver behavior, and

road safety. However, ensuring the reliability and

representativeness of the data remains essential to

maximize the efficiency of AI-powered solutions.

In conclusion, Table 2 provides a comprehen-

sive look at the various studies in ITS that utilize

AI, highlighting the importance of considering the

strengths and weaknesses related to each approach

to understand their contributions and potential chal-

lenges completely.


Table 2: Comparative table of ITS using AI.

Approach                            Objective                                       Real-world data  Complexity  Dependency on data quality  ML   DL
(Meena et al., 2020)                Enhance traffic flow prediction                 yes              yes         no                          yes  no
(Hu et al., 2020)                   Analyze intersection driver behavior            yes              yes         no                          no   yes
(Lv et al., 2020)                   Improve safety in ITS                           yes              yes         no                          no   yes
(Olayode et al., 2021)              Predict vehicle trajectory                      yes              no          no                          no   yes
(Karri et al., 2021)                Classify driving behaviors                      no               no          yes                         yes  no
(Bagheri et al., 2022)              Determine the acceptable gap in intersections   no               no          yes                         yes  no
(Singh et al., 2022)                Modeling driver behaviors at intersections      yes              no          no                          no   no
(Bharadiya, 2023)                   Optimize urban management                       no               no          yes                         yes  no
(Sayed et al., 2023)                Improve traffic prediction accuracy             no               no          yes                         yes  yes
(Shaffiee Haghshenas et al., 2023)  Predict road crash                              yes              yes         yes                         no   yes

4 DISCUSSION AND FUTURE DIRECTIONS

This section starts by exploring the strengths and chal-

lenges associated with the integration of RT-DBs and

GDBs in ITS. By combining the capabilities of timely

data processing and efﬁcient data structures, these

databases can offer promising avenues to improve ef-

ﬁciency and decision-making processes within trans-

portation networks. Then, it discusses the integration

of AI in ITS and the benefits gained. It also

highlights the close relationship between databases

and AI.

Real-time databases offer signiﬁcant advantages

that make them highly suitable for applications in

ITS. These databases handle, analyze, and store data

with minimal delay, ensuring that the system always

has access to the latest information. This is vital

for making time-sensitive decisions that are related

to optimizing routes, managing trafﬁc, and respond-

ing to incidents. In real-time environments, where

quick handling and action on data are crucial, RT-DBs

can effectively manage the high-velocity data streams

generated by sources such as trafﬁc sensors, cameras,

and GPS devices. We therefore recommend implementing

RT-DBs in ITS and leveraging their beneﬁts by incor-

porating advanced analytics techniques for real-time

data processing, and regularly updating and optimiz-

ing database infrastructure to manage increasing data

volumes and changing system requirements. Addi-

tionally, encouraging collaboration among database

engineers, transportation specialists, and AI experts

can aid the development of innovative solutions

that harness the full capabilities of RT-DBs in enhanc-

ing transportation efﬁciency and safety.
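
As a minimal sketch of the "latest information" requirement described above, the following Python fragment keeps only the most recent reading per data item and treats a reading as unusable once it is older than a validity window. It is an illustration under stated assumptions, not the design of any surveyed RT-DB: the class name LatestValueStore, the key format, and the validity_s parameter are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Reading:
    value: float
    timestamp_s: float  # time at which the sensor produced the value
    validity_s: float   # assumed freshness window for this item, in seconds


class LatestValueStore:
    """Hypothetical in-memory store that keeps one fresh reading per key."""

    def __init__(self) -> None:
        self._data: Dict[str, Reading] = {}

    def write(self, key: str, value: float, timestamp_s: float, validity_s: float) -> None:
        # Ignore out-of-order updates so that a late packet never
        # overwrites a more recent reading.
        current = self._data.get(key)
        if current is None or timestamp_s >= current.timestamp_s:
            self._data[key] = Reading(value, timestamp_s, validity_s)

    def read(self, key: str, now_s: float) -> Optional[float]:
        # Return the value only if it is still within its validity window.
        reading = self._data.get(key)
        if reading is None:
            return None
        if now_s - reading.timestamp_s > reading.validity_s:
            return None
        return reading.value


store = LatestValueStore()
store.write("sensor_12.speed_kmh", 54.0, timestamp_s=100.0, validity_s=2.0)
store.write("sensor_12.speed_kmh", 10.0, timestamp_s=99.0, validity_s=2.0)  # stale update, ignored
fresh = store.read("sensor_12.speed_kmh", now_s=101.0)
expired = store.read("sensor_12.speed_kmh", now_s=103.5)
```

Returning no value for an expired reading, rather than the last known one, forces the caller to decide explicitly how to act on missing data, which is one possible design choice for time-sensitive decisions.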

It should be noted that relational databases are

mainly used in the research work proposing RT-DBs.

Within the relational model based on tabular for-

mat, modeling the intricate interconnections among

entities such as vehicles, roads, and obstacles poses

a signiﬁcant challenge, resulting in computationally

costly queries. This mismatch between the network-

like structure of ITS data and the tabular format of

relational databases can lead to challenges in

handling complex relationships and querying

interconnected data, which are common in ITS

scenarios. This can degrade performance and the

efficiency of data retrieval. We therefore consider

that relational databases may not be the best choice

for ITS applications.

In addition, relational databases struggle to keep

up with the rapidly evolving data sources and require-


ments of ITS due to their rigid and static schema de-

sign. This often results in costly and disruptive re-

structuring efforts.

In contrast, NoSQL databases such as GDBs excel

in managing interconnected data while using ﬂexible

data models to effectively represent complex relation-

ships. They are characterized by great efﬁciency and

scalability and are often considered a natural form of

representation of ITS data. Thus, as the volume and

complexity of ITS data continue to grow, relational

databases are becoming increasingly impractical for

advanced ITS applications, while NoSQL alternatives

have architectural advantages that are better aligned

with ITS requirements.
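
To illustrate the difference between the tabular and the graph representation discussed above, the sketch below answers the same question (which road segments can be reached from a given segment) in two ways: with a recursive join over an edge table in SQLite, and with a traversal over an adjacency structure. It is only an illustration: the segment names and the connects table are hypothetical, and the adjacency dictionary stands in for a graph database rather than reproducing any specific product.

```python
import sqlite3
from collections import deque
from typing import Dict, List, Set

# Hypothetical directed connections between road segments.
EDGES = [("A", "B"), ("B", "C"), ("C", "D"), ("B", "E")]


def reachable_sql(start: str) -> Set[str]:
    """Relational view: edges are rows, traversal is a recursive self-join."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE connects (src TEXT, dst TEXT)")
    conn.executemany("INSERT INTO connects VALUES (?, ?)", EDGES)
    rows = conn.execute(
        """
        WITH RECURSIVE reach(node) AS (
            SELECT ?
            UNION
            SELECT c.dst FROM connects AS c JOIN reach AS r ON c.src = r.node
        )
        SELECT node FROM reach WHERE node <> ?
        """,
        (start, start),
    ).fetchall()
    conn.close()
    return {row[0] for row in rows}


def reachable_graph(start: str) -> Set[str]:
    """Graph view: each node holds its neighbours, traversal follows them."""
    adjacency: Dict[str, List[str]] = {}
    for src, dst in EDGES:
        adjacency.setdefault(src, []).append(dst)
    seen: Set[str] = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    seen.discard(start)  # match the SQL version, which excludes the start node
    return seen


from_a_sql = reachable_sql("A")
from_a_graph = reachable_graph("A")
```

Both functions return the same set; the point of the sketch is only that the relational version expresses each hop as a join, whereas the graph version follows stored neighbour links directly.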

To compare the performance of a graph database

with that of a relational database (RDB), we

conducted comparative studies by simulating several

road situations with obstacles and executing the

queries needed to pass these situations safely. The

first kind of simulation used an RDB system, whereas

the second was based on a graph database. Figure 1

shows an example of four vehicles used in the

simulation.

The target objective: “Vehicle1” must avoid the

obstacle while taking into account vehicles coming in

the opposite direction.

The useful queries to execute by “Vehicle1” are:

• Q1: First vehicle in the opposite direction

• Q2: Distance between two vehicles

• Q3: List of vehicles preceding “Vehicle1”

• Q4: List of vehicles in the opposite direction to

“Vehicle1”
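
The four queries above can be sketched in plain Python as follows. This is a hedged illustration of what each query computes, not the RDB or graph implementation used in the simulations: the Vehicle fields, the one-dimensional road axis, the +1/-1 direction encoding, the sample positions, and the reading of "preceding" as "ahead in the same direction" are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Vehicle:
    name: str
    position_m: float  # position along the road axis, in metres (assumed)
    direction: int     # +1 or -1 along that axis (assumed encoding)


def distance(a: Vehicle, b: Vehicle) -> float:
    """Q2: distance between two vehicles along the road axis."""
    return abs(a.position_m - b.position_m)


def is_ahead(ego: Vehicle, other: Vehicle) -> bool:
    """True if `other` lies in front of `ego` relative to ego's heading."""
    return (other.position_m - ego.position_m) * ego.direction > 0


def opposite(ego: Vehicle, vehicles: List[Vehicle]) -> List[Vehicle]:
    """Q4: vehicles travelling in the opposite direction to `ego`."""
    return [v for v in vehicles if v.direction == -ego.direction]


def first_opposite(ego: Vehicle, vehicles: List[Vehicle]) -> Optional[Vehicle]:
    """Q1: nearest oncoming vehicle that is still ahead of `ego`."""
    oncoming = [v for v in opposite(ego, vehicles) if is_ahead(ego, v)]
    return min(oncoming, key=lambda v: distance(ego, v), default=None)


def preceding(ego: Vehicle, vehicles: List[Vehicle]) -> List[Vehicle]:
    """Q3: same-direction vehicles ahead of `ego`, nearest first."""
    ahead = [
        v for v in vehicles
        if v != ego and v.direction == ego.direction and is_ahead(ego, v)
    ]
    return sorted(ahead, key=lambda v: distance(ego, v))


# Hypothetical scene with four vehicles; positions are illustrative only.
v1 = Vehicle("Vehicle1", 0.0, +1)
v2 = Vehicle("Vehicle2", 40.0, +1)
v3 = Vehicle("Vehicle3", 120.0, -1)
v4 = Vehicle("Vehicle4", 200.0, -1)
fleet = [v1, v2, v3, v4]

q1 = first_opposite(v1, fleet)
q2 = distance(v1, v3)
q3 = [v.name for v in preceding(v1, fleet)]
q4 = [v.name for v in opposite(v1, fleet)]
```

In an overtaking decision, "Vehicle1" would typically combine Q1 and Q2 to check whether the gap to the first oncoming vehicle is large enough to pass the obstacle.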

Figure 1: Example of four vehicles on the road.

The results presented in Figures 2, 3, 4, and 5

indicate that the graph database outperforms the

RDB. In certain cases, the execution time is reduced

by a factor of four. Additionally, for certain queries,

the quantity of data handled by the graph database

does not affect the execution time.

Therefore, we recommend exploring the integra-

tion of NoSQL databases, especially GDBs, in ITS

applications to address the limitations of relational

databases and improve system performance and scal-

ability.

Figure 2: Query execution time of query 1.

Figure 3: Query execution time of query 2.

Figure 4: Query execution time of query 3.

Figure 5: Query execution time of query 4.

Moreover, we believe that integrating real-time

capabilities into GDBs could present a promising

path forward, combining the beneﬁts of real-time

data processing with the ﬂexibility and efﬁciency of

GDB structures. By using real-time data processing

and analysis features, GDBs empower ITS to react

promptly to changing trafﬁc conditions and make in-

formed decisions in real-time. Moreover, this integra-

tion facilitates accurate trafﬁc pattern prediction and

route planning optimization, improving trafﬁc man-


agement and decreasing congestion.

On the other hand, the integration of AI within

ITS brings various beneﬁts to transportation systems.

AI algorithms use extensive datasets from various

sources like sensors and cameras to predict trafﬁc

ﬂows with precision, optimize route planning, and

improve real-time decision-making. Through the uti-

lization of AI, ITS can improve trafﬁc management,

boost safety, and optimize transportation networks,

ultimately leading to more efﬁcient and sustainable

urban mobility solutions.

The integration of AI within ITS underscores the

importance of factors like data quality which can have

a substantial impact on the performance and

accuracy of AI algorithms. This reveals a close

relationship between AI and databases, in which

databases serve as a foundation for AI. Indeed,

databases can deliver the timely and relevant data

needed for training data sets. Furthermore, database

performance and speed directly impact the ability to

process data on time. The ability of the database to grow

with data, known as scalability, is another important

advantage. In addition, the use of databases allows

building a pipeline that performs data-science-driven

model hosting.

For higher performance, NoSQL databases are

required. Indeed, in addition to their high

scalability, NoSQL databases support various data structures,

which is beneﬁcial for AI applications requiring ﬂexi-

bility in data modeling. Moreover, with NoSQL graph

databases, a knowledge graph describes the meaning

of relationships between two elements. This kind of

graph provides a semantic view of data and can be an

efficient way to model semantics, which is important

for obtaining pertinent results from AI.
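
As a small illustration of how a knowledge graph attaches meaning to the relationship between two elements, the sketch below stores facts as (subject, predicate, object) triples and retrieves them by pattern. The entity names, predicate names, and the match helper are hypothetical and chosen for the example; a graph database would provide its own query language for this purpose.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical facts about a small road scene: the predicate names the
# meaning of the link between the subject and the object.
TRIPLES: List[Triple] = [
    ("Vehicle1", "drives_on", "RoadA"),
    ("Vehicle2", "drives_on", "RoadA"),
    ("Obstacle1", "blocks", "RoadA"),
    ("RoadA", "connects_to", "RoadB"),
]


def match(
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> List[Triple]:
    """Return the triples that agree with every field that is not None."""
    return [
        t for t in TRIPLES
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    ]


# Which elements block RoadA, and which vehicles drive on it?
blockers = [s for s, _, _ in match(predicate="blocks", obj="RoadA")]
drivers = [s for s, _, _ in match(predicate="drives_on", obj="RoadA")]
```

Because the predicate is stored explicitly, an AI component consuming this data can distinguish a vehicle that drives on a road from an obstacle that blocks it, even though both are linked to the same road.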

Consequently, we again recommend integrating

NoSQL databases, especially GDBs, in

ITS applications to build efficient support for AI.

Conversely, AI can have an impact on databases

by providing innovative solutions to effectively man-

age the growing complexity and volume of data gen-

erated by applications. Thus, the incorporation of

AI techniques into databases improves their perfor-

mance, facilitating enhanced data processing, storage,

and retrieval capabilities. This can increase the poten-

tial to signiﬁcantly augment ITS capabilities, present-

ing promising avenues for future research and devel-

opment in both ITS and database management. This

enhanced performance plays a critical role in

advancing ITS by ensuring that databases can manage

the increasing needs for real-time data analysis and

decision-making in transportation systems.

5 CONCLUSION

In conclusion, the research highlights the importance

of combining ITS with advanced database manage-

ment systems and AI technologies to revolutionize

transportation systems. ADAS with RT-DBs excels

in quick decision-making for safety, while ITS with

GDBs effectively handles complex network relation-

ships. The versatility of AI in various ITS applica-

tions, such as trafﬁc prediction, driver assistance, and

accident analysis, has been explored. However, chal-

lenges like system complexity, interoperability issues,

and data handling constraints persist. These chal-

lenges need to be resolved to enable widespread im-

plementation of ITS solutions.

Moving forward, future ITS research should con-

centrate on overcoming these obstacles to ensure the

consistent integration of database systems and AI for

more efﬁcient, safe, and adaptable urban mobility

solutions. By aligning these technologies, the fu-

ture of intelligent transportation shows great poten-

tial for transforming transportation systems globally.

Moreover, the recommendations presented in this

research offer valuable insights into addressing these

challenges and advancing the field of ITS.

REFERENCES

Bagheri, M., Bartin, B., and Ozbay, K. (2022). Simulation

of vehicles’ gap acceptance decision at unsignalized

intersections using sumo. Procedia Computer Sci-

ence, 201:321–329.

Bharadiya, J. (2023). Artiﬁcial intelligence in transporta-

tion systems a critical review. American Journal of

Computing and Engineering, 6(1):34–45.

Bhogaram, P., Wu, X., He, M., and Okenwa, O. (2020). Op-

timal and critical path analysis of state transportation

network using neo4j. International Journal of Urban

and Civil Engineering, 14(10):312–317.

Bollen, E., Hendrix, R., Kuijpers, B., and Vaisman, A.

(2021). Time-series-based queries on stable trans-

portation networks equipped with sensors. ISPRS In-

ternational Journal of Geo-Information, 10(8):531.

Chandra, R., Bhattacharya, U., Mittal, T., Li, X., Bera, A.,

and Manocha, D. (2020). Graphrqi: Classifying driver

behaviors using graph spectrums. In 2020 IEEE In-

ternational Conference on Robotics and Automation

(ICRA), pages 4350–4357. IEEE.

Elleuch, I., Makni, A., and Bouaziz, R. (2019). Coop-

erative overtaking assistance system based on v2v

communications and rtdb. The Computer Journal,

62(10):1426–1449.

Elleuch, I., Makni, A., and Bouaziz, R. (2021). An intelli-

gent and efﬁcient safe driving system. In International

Conference on Hybrid Intelligent Systems, pages 181–

193. Springer.


Elleuch, I., Makni, A., and Bouaziz, R. (2023). Cicaps: a

cooperative intersection collision avoidance persistent

system for cooperative intersection adas. The Journal

of Supercomputing, 79(6):6087–6114.

García, M., Urbieta, I., Nieto, M., González de Mendibil, J.,

and Otaegui, O. (2022). ildm: An interoperable graph-

based local dynamic map. Vehicles, 4(1):42–59.

Hu, J., Huang, M.-C., and Yu, X. (2020). Efﬁcient mapping

of crash risk at intersections with connected vehicle

data and deep learning models. Accident Analysis &

Prevention, 144:105665.

Idoudi, N., Duvallet, C., Sadeg, B., Bouaziz, R., and

Gargouri, F. (2008). Structural model of real-time

databases: An illustration. In 2008 11th IEEE In-

ternational Symposium on Object and Component-

Oriented Real-Time Distributed Computing (ISORC),

pages 58–65. IEEE.

Karri, S. L., De Silva, L. C., Lai, D. T. C., and Yong, S. Y.

(2021). Classiﬁcation and prediction of driving be-

haviour at a trafﬁc intersection using svm and knn. SN

computer science, 2:1–11.

Lv, Z., Zhang, S., and Xiu, W. (2020). Solving the security

problem of intelligent transportation system with deep

learning. IEEE Transactions on Intelligent Trans-

portation Systems, 22(7):4281–4290.

Maduako, I., Ebinne, E., Uzodinma, V., Okolie, C., and

Chiemelu, E. (2022). Computing trafﬁc accident high-

risk locations using graph analytics. Spatial informa-

tion research, 30(4):497–511.

Marouane, H., Duvallet, C., Makni, A., Bouaziz, R., and

Sadeg, B. (2018). An uml proﬁle for representing real-

time design patterns. Journal of King Saud University-

Computer and Information Sciences, 30(4):478–497.

Marouane, H., Makni, A., Bouaziz, R., Duvallet, C., and

Sadeg, B. (2016). Deﬁnition of design patterns for

advanced driver assistance systems. In Proceedings of

the 10th Travelling Conference on Pattern Languages

of Programs, pages 1–10.

Meena, G., Sharma, D., and Mahrishi, M. (2020). Traf-

ﬁc prediction for intelligent transportation system us-

ing machine learning. In 2020 3rd International Con-

ference on Emerging Technologies in Computer En-

gineering: Machine Learning and Internet of Things

(ICETCE), pages 145–148. IEEE.

Oberoi, K. S., Del Mondo, G., Dupuis, Y., and Vasseur,

P. (2018). Modeling road trafﬁc takes time (short

paper). In 10th International Conference on Ge-

ographic Information Science (GIScience 2018).

Schloss-Dagstuhl-Leibniz Zentrum für Informatik.

Olayode, I. O., Tartibu, L. K., and Okwu, M. O. (2021).

Trafﬁc ﬂow prediction at signalized road intersec-

tions: a case of markov chain and artiﬁcial neural net-

work model. In 2021 IEEE 12th International Con-

ference on Mechanical and Intelligent Manufacturing

Technologies (ICMIMT), pages 287–292. IEEE.

Sayed, S. A., Abdel-Hamid, Y., and Hefny, H. A. (2023).

Artiﬁcial intelligence-based trafﬁc ﬂow prediction: a

comprehensive review. Journal of Electrical Systems

and Information Technology, 10(1):13.

Shafﬁee Haghshenas, S., Guido, G., Vitale, A., and Astarita,

V. (2023). Assessment of the level of road crash sever-

ity: Comparison of intelligence studies. Expert Sys-

tems with Applications, 234:121118.

Singh, M. K., Pathivada, B. K., Rao, K. R., and Perumal, V.

(2022). Driver behaviour modelling of vehicles at sig-

nalized intersection with heterogeneous trafﬁc. IATSS

research, 46(2):236–246.

Wirawan, P. W., Riyanto, D. E., Nugraheni, D. M. K., and

Yasmin, Y. (2019). Graph database schema for mul-

timodal transportation in semarang. Journal of In-

formation Systems Engineering and Business Intelli-

gence, 5(2):163–170.

Yuan, D., Zhou, K., and Yang, C. (2023). Architecture and

application of trafﬁc safety management knowledge

graph based on neo4j. Sustainability, 15(12):9786.

Zhang, L., Jiang, S., Huang, K., Xiao, Y., You, L., and Cai,

M. (2021). Knowledge graph-based network analy-

sis on the elements of autonomous transportation sys-

tem. In 2021 IEEE 21st International Conference on

Software Quality, Reliability and Security Companion

(QRS-C), pages 536–542. IEEE.

Zhang, L., Zhang, M., Tang, J., Ma, J., Duan, X., Sun, J.,

Hu, X., and Xu, S. (2022). Analysis of trafﬁc acci-

dent based on knowledge graph. Journal of advanced

transportation, 2022.
