Adaptive Highway Traffic Management: A Reinforcement Learning
Approach for Variable Speed Limit Control with Random Anomalies
Bálint Pelenczei (1,a), István Gellért Knáb (1,b), Bálint Kővári (2,3,c),
Tamás Bécsi (2,d) and László Palkovics (1,4,e)
1 Systems and Control Laboratory, HUN-REN Institute for Computer Science and Control (SZTAKI), Budapest, Hungary
2 Department of Control for Transportation and Vehicle Systems, Faculty of Transportation Engineering and Vehicle Engineering, Budapest University of Technology and Economics, Budapest, Hungary
3 Asura Technologies Ltd., Budapest, Hungary
4 Széchenyi István University, Győr, Hungary
{pelenczei.balint, knab.istvan.gellert}@sztaki.hun-ren.hu, {kovari.balint, becsi.tamas}@kjk.bme.hu,
Keywords:
Reinforcement Learning, Variable Speed Limit Control, Intelligent Transportation Systems, Cooperative
Traffic Control, Multi-Agent Systems.
Abstract:
Efficient traffic flow management in highway scenarios is crucial for ensuring safety and minimizing
emissions through the reduction of so-called shockwave effects. In this paper, we propose a novel approach based
on cooperative Multi-Agent Reinforcement Learning for optimizing traffic flow, utilizing Variable Speed Limit
Control in dynamic simulation environments with random anomalies. Our method leverages Reinforcement
Learning to adaptively adjust speed limits on distinct road sections in response to changing traffic conditions,
thereby improving not only general traffic flow parameters, but also sustainability measures.
Through extensive simulations in a Simulation of Urban MObility environment, we demonstrate the
superiority of our approach in traffic flow efficiency and robustness over alternative solutions
found in the literature. Our findings reveal that RL-based VSL control outperforms traditional
approaches owing to its generalizability, which contributes to the progression of Intelligent Transportation Systems
by presenting a proactive and adaptable solution for highway traffic management within dynamic real-world
contexts.
1 INTRODUCTION
Continuous improvements are on the horizon concerning the spread of
Artificial Intelligence-based solutions across interconnected domains
relevant to everyday life, such as logistics (Richey Jr et al., 2023),
autonomous vehicles (Fényes et al., 2021), and lastly
traffic control (Kővári et al., 2021). These advancements
are positioned to revolutionize traditional practices,
offering remarkable levels of efficiency, safety
and sustainability. With AI technologies increasingly
integrated into various aspects of society, the potential
for transformative impact on these critical areas is
vast.
a https://orcid.org/0000-0001-9194-8574
b https://orcid.org/0009-0007-6906-3308
c https://orcid.org/0000-0003-2178-2921
d https://orcid.org/0000-0002-1487-9672
e https://orcid.org/0000-0001-5872-7008
In the realm of Intelligent Transportation Systems
(ITS), AI holds immense promise for optimizing traf-
fic flow, enhancing road safety, and reducing envi-
ronmental impacts (An et al., 2011). Through real-
time data analysis, predictive modeling, and adap-
tive control algorithms, AI-powered ITS solutions can
dynamically respond to changing traffic conditions,
minimizing congestion and improving overall effi-
ciency. Moreover, the integration of AI with emerg-
ing technologies, such as Connected Autonomous
Vehicles (CAV), opens up possibilities for seam-
less vehicle-to-everything (V2X) communication, en-
abling coordinated traffic management and enhanced
safety measures (Kavas-Torris et al., 2022).
However, for the time being, almost no vehicles on public
roads possess the capabilities necessary to utilize
these advanced features, resulting in bottlenecks
appearing throughout all fundamental parts of a traffic
network. At intersections, the primary challenge is the
Traffic Signal Control (TSC) problem, where optimizing
signal timings to manage varying traffic volumes is
critical. In urban areas, pedestrian crossings introduce
additional complexities, requiring careful coordination
to ensure both vehicular flow and pedestrian safety.
Pelenczei, B., Knáb, I., Kővári, B., Bécsi, T. and Palkovics, L.
Adaptive Highway Traffic Management: A Reinforcement Learning Approach for Variable Speed Limit Control with Random Anomalies.
DOI: 10.5220/0012920700003822
Paper published under CC license (CC BY-NC-ND 4.0)
In Proceedings of the 21st International Conference on Informatics in Control, Automation and Robotics (ICINCO 2024) - Volume 2, pages 117-124
ISBN: 978-989-758-717-7; ISSN: 2184-2809
Proceedings Copyright © 2024 by SCITEPRESS – Science and Technology Publications, Lda.
Additionally, disturbances caused by sudden braking,
lane changes or speed fluctuations are particularly
problematic in areas like toll plazas, highway on- and
off-ramps, and in the vicinity of lane closures.
This phenomenon, known as a traffic shockwave, is
visualized in Figure 1, which plots the space-time
trajectories of vehicles; the slope of each curve
therefore represents the speed of the corresponding vehicle.
Advanced traffic management systems, such as
Variable Speed Limit Control (VSLC), often integrate
Machine Learning and predictive analytics to dynamically
adjust speed limits based on real-time traffic
conditions on virtually separated road sections,
independently of each other. By continuously adapting
speed limits in this way, VSLC systems aim to manage
disturbances and maintain optimal flow conditions, while
minimizing the risk of congestion and accidents.
Therefore, in this research we introduce a cooperative
Multi-Agent Reinforcement Learning (MARL) approach for
VSLC in a generalized highway setting with random
anomalies. As detailed in Section 6, by leveraging the
Machine Learning framework's inherent generalizability,
our approach demonstrates superior performance compared
to the baseline methods outlined in Section 5.1.
2 RELATED WORK
In the realm of ITS, various methodological ap-
proaches are employed for different tasks, ranging
from rule-based systems to sophisticated ML tech-
niques capable of high-level decision-making. This
section provides a comprehensive overview of such
methods in literature, with a focus on Variable Speed
Limit Control.
Rule-based methods are widely accepted due to
their simplicity and ease of applicability. For instance,
the Motorway Control System (MCS), proposed by
(Van Toorenburg and De Kok, 1999), has
been successfully applied in real traffic scenarios.
This method is also discussed in Section 5.1 in more
detail as one of the baselines for this research.
Another benchmark, the Mainstream Traffic Flow
Control (MTFC) introduced by (Müller et al., 2013),
stabilizes traffic flow at the maximum throughput
level based on occupancy measures. A comparative
study by (Grumert et al., 2018) found that MTFC
outperformed four other methods, including MCS, as
shown in Figure 2.
Figure 1: Space-time trajectories of vehicles demonstrating
shockwave effects (Huang et al., 2010).
Figure 2: Comparison of mean speeds over the entire traffic
network achieved by different algorithms (Grumert et al.,
2018).
Another innovative, Reinforcement Learning-
based approach is presented by (Kim et al., 2024);
they developed a proactive traffic safety management
methodology based on real-time crash risk estimation.
Their system has been able to reduce real-time crash
risk by approximately 55% during lane closure sce-
narios.
In (Kővári et al., 2024), a Deep Reinforcement
Learning approach has been introduced that utilizes
a Multi-Agent framework with a Deep Q-Network
algorithm to capture the spatial-temporal characteristics
of traffic flow. The MARL-based VSL system, trained
and tested in a static simulator environment with a
fixed-location lane drop bottleneck, demonstrated
significant improvements in traffic stability and
congestion reduction compared to a free-flow baseline.
For further insights into different kinds of methods
applied to the VSLC problem, refer to (Khondaker
and Kattan, 2015) and (Lahmiss and Khatory,
2020). For an overview of Reinforcement Learning
techniques concerning this application, see (Kušić
et al., 2020).
3 CONTRIBUTION
While numerous approaches address the issue of
shockwave effects in highway scenarios, Variable
Speed Limit Control has emerged as a particularly
promising solution, offering several demonstrated
benefits. This paper explores the effectiveness and
advantages of Machine Learning by proposing a cooperative
Multi-Agent Reinforcement Learning methodology
that outperforms two well-established traditional
solutions cited in the literature.
Therefore, the contribution of this paper is
twofold. Firstly, we have developed a cooperative
MARL framework and successfully evaluated it in
dynamic scenarios with random anomalies, hence
demonstrating that the method generalizes.
Secondly, we present the results of an extensive
comparative experiment, according to which the
performance of the trained agent surpasses that of three
alternative methods: a simple free-flow scenario with
no control realized; the Motorway Control System
implemented in Stockholm, Sweden; and lastly the
Mainstream Traffic Flow Control method, which is
recognized as a superior solution among the compared
methods according to (Grumert et al., 2018).
4 ENVIRONMENT
In this research, training, testing and evaluation have
been conducted using the Simulation of Urban MObility
(SUMO) software package proposed in (Alvarez Lopez
et al., 2018), which is widely recognized
in the literature as a leading traffic simulator. Firstly,
SUMO is open-source software capable of simulating
both real and artificial traffic networks. Moreover,
it offers the ability to numerically monitor various
traffic flow metrics, including density, travel time and
waiting time, as well as sustainability parameters, such as
CO2 emission, NOx emission and fuel consumption.
Additionally, SUMO supports appropriate randomization
and provides a comprehensive set of tools for
scenario generation and modification. The package
also includes the TraCI interface, enabling environment
manipulation through various programming languages.
The schematic illustration of the communication
among software components is presented later in
Figure 4.
The geometric design of the network is illustrated
in Figure 3. It comprises a total of eight straight,
unidirectional road segments, each 300 m in length and
consisting of three lanes. Each lane is 3.2 m wide,
making the total segment width 9.6 m. At the beginning
of the first segment, designated spawn points for
each lane generate and inject vehicles into the
simulation randomly at each step. The initial two segments,
covering the first 600 m, are solely for observation
purposes. Segments 3 and 4 are designated as
Variable Speed Limit Control zones, but no anomalies
are generated in this section of the network, allowing
the VSLC system to proactively prevent and manage
anomalies that may appear in subsequent locations.
Segments 5 through 8 are also VSLC zones, where
random anomalies can be set up, hence constructing
a lane drop bottleneck. If an anomaly is generated
on any of the lanes, all subsequent sections are closed.
Each of the 18 VSLC zones can be independently
controlled with bounded speed limits, ranging from a
minimum of 30 km/h to a maximum of 130 km/h and
adjusted in 10 km/h increments.
The control system's task is to determine the optimal
speed limit for each of these zones.
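The layout and the admissible speed limits described above can be enumerated in a few lines. The sketch below is purely illustrative: the identifiers and the `seg{n}_lane{m}` naming scheme are hypothetical, and the reading that the 18 zones correspond to one zone per lane in each of the six controlled segments is an assumption that merely matches the stated count.

```python
# Illustrative enumeration of the VSLC zones and admissible speed limits.
# All names are hypothetical; only the numeric values come from the text.

NUM_LANES = 3
VSLC_SEGMENTS = range(3, 9)      # segments 3..8 are VSLC zones
ANOMALY_SEGMENTS = range(5, 9)   # anomalies may only appear in segments 5..8

# Speed limits from 30 km/h to 130 km/h in 10 km/h steps.
SPEED_LIMITS_KMH = list(range(30, 131, 10))

# Assumption: one independently controlled zone per (segment, lane) pair.
zone_ids = [
    f"seg{segment}_lane{lane}"
    for segment in VSLC_SEGMENTS
    for lane in range(NUM_LANES)
]

print(len(zone_ids))          # 18
print(len(SPEED_LIMITS_KMH))  # 11
```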
4.1 Abstractions
Specifically in Reinforcement Learning, the proper
definition of the abstractions (the state representation,
the action space and the reward function) is
fundamental, as these are the only connections with
the environment through which the agent can develop
its behaviour and understand the complex inner dynamics
of the processes.
Figure 3: Structural design of the traffic network.
4.1.1 State Representation
The state abstraction encapsulates critical elements of
the environment, enabling the agent to accurately
perceive and interpret it, which is essential for learning
effective policies. In the context of Intelligent
Transportation Systems, designers must carefully select
state abstractions that maximize the information value of
state sequences, while ensuring that the process of
data acquisition remains relatively cost-efficient and
straightforward.
In our study, the state representation for a single
observation section consists solely of a lane occu-
pancy metric ρ, which is calculated from the number
of vehicles on the given section N and the length of
the section l. This value can be efficiently measured
using either a roadside camera unit or a simple loop
detector device located at each end of the VSLC zone,
thereby providing a practical, yet sufficiently infor-
mative state abstraction for RL-based traffic manage-
ment.
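The text states only that ρ is calculated from N and l, without giving the formula. A minimal sketch, assuming the simplest reading ρ = N / l (vehicles per metre of section), could look as follows; the function name and the example values are hypothetical.

```python
def lane_occupancy(num_vehicles: int, section_length_m: float) -> float:
    """Occupancy of one section, assuming rho = N / l (vehicles per metre)."""
    if section_length_m <= 0:
        raise ValueError("section length must be positive")
    return num_vehicles / section_length_m


# Example: 12 vehicles counted on a 300 m section.
rho = lane_occupancy(num_vehicles=12, section_length_m=300.0)
print(rho)  # 0.04
```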
4.1.2 Action Space
The action space represents the set of possible inter-
ventions the agent can take in a given environmental
state, balancing complexity and expressiveness to en-
able efficient exploration of various strategies.
Many studies in this field employ high-
dimensional vectors to address the broad range
of legal speed limit values. In contrast, our research
defines the action space only as a three-dimensional
vector of speed-limit increments, as shown in
Equation 1:
$$\text{action} = \begin{bmatrix} +10\,\text{km/h} \\ 0\,\text{km/h} \\ -10\,\text{km/h} \end{bmatrix} \tag{1}$$
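To illustrate how such an increment-based action could be turned into a new speed limit, the sketch below maps an action index to one of the three increments of Equation 1 and keeps the result inside the 30 to 130 km/h range of Section 4. The clamping at the bounds and all identifiers are assumptions made for illustration, not a description of the authors' implementation.

```python
# Action index -> speed-limit increment in km/h (Equation 1).
ACTIONS_KMH = (+10, 0, -10)
MIN_LIMIT_KMH, MAX_LIMIT_KMH = 30, 130


def apply_action(current_limit_kmh: int, action_index: int) -> int:
    """Return the new speed limit after applying the chosen increment.

    Assumption: increments that would leave the allowed range are clamped.
    """
    new_limit = current_limit_kmh + ACTIONS_KMH[action_index]
    return max(MIN_LIMIT_KMH, min(MAX_LIMIT_KMH, new_limit))


print(apply_action(100, 2))  # 90
print(apply_action(130, 0))  # 130 (already at the upper bound)
```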
4.1.3 Reward Function
The abstraction of reward signals constitutes a criti-
cal element in Reinforcement Learning, being a single
scalar value provided by the environment to evaluate
the effectiveness of a given action. The aim of this
feedback mechanism is to quantify the quality of the
action within its scenario, thereby guiding the agent’s
learning process and shaping its behavior to achieve
the desired objectives as defined by the reward func-
tion.
We have formulated the reward function of the
agents to minimize waiting time on the entire ob-
served traffic network, as shown in Equation 2:
$$\text{reward}_{t+1} = \frac{1}{w_t + \varepsilon} \tag{2}$$
where the reward at time step $t+1$ is the reciprocal of the
waiting time $w_t$ accumulated during the preceding time
interval $t$, with a small constant $\varepsilon$ added to
prevent division by zero.
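Equation 2 translates directly into code. In the sketch below the value of ε is a hypothetical choice, since the text does not specify it.

```python
EPSILON = 1e-6  # hypothetical value; the text only calls it a small constant


def reward(waiting_time_s: float, eps: float = EPSILON) -> float:
    """Reward of Equation 2: reciprocal of the network-wide waiting time."""
    return 1.0 / (waiting_time_s + eps)


# Shorter waiting times yield larger rewards, and zero waiting time is safe.
print(reward(5.0) > reward(10.0))  # True
print(reward(0.0))                 # large but finite
```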
5 METHODOLOGY
5.1 Baseline Solutions
For any novel controller,
the inclusion of relevant baseline solutions is
essential, as these benchmarks facilitate the
evaluation and comparison of newly proposed
algorithms. By providing a point of reference, baseline
controllers enable researchers to quantify
improvements and understand the practical significance of
their methods. Additionally, they may help in
identifying strengths and weaknesses of an approach,
ensuring that advancements are both meaningful and
contextually significant within the landscape of
existing technologies.
5.1.1 Free-Flow (FF)
In the context of Variable Speed Limit Control, the
free-flow scenario serves as a fundamental baseline
method. This approach assumes that no active speed
limit control is implemented, thus the maximum
allowed speed in each zone is 130 km/h.
5.1.2 Motorway Control System (MCS)
One examined VSL algorithm is the rule-based Motorway
Control System (Van Toorenburg and De Kok,
1999), which uses predefined thresholds to
determine when to decrease or increase speed
limits at certain sections, based on real-time traffic
conditions detected by simple sensors. Each lane is
equipped with a detector and a corresponding VSL
sign, although a common speed limit is applied across
adjacent lanes. The decision-making process for
setting the speed limit $v_{t,j}$ at time $t$ and detector
location $j$ relies on the measured speed $\tilde{v}_{t,j}$.
The system assumes that the most restrictive lane, i.e. the
lane with the lowest mean speed, regulates the common
speed limit.
5.1.3 Mainstream Traffic Flow Control (MTFC)
The model-based Mainstream Traffic Flow Control
algorithm (Müller et al., 2013), the most advanced
baseline used in the comparison, is designed
to optimize speed limits by regulating traffic
occupancy at bottlenecks, thereby maintaining efficient
flow conditions. This algorithm determines the
variable speed limit at time $t$ as a fraction $b(t)$ of the
original road speed limit, updated using Equation 3:
$$b(t) = b(t-1) + K_I \cdot e_0(t) \tag{3}$$
where $K_I$ denotes the integral gain and $e_0(t)$ is the
occupancy error, defined as the difference between the
critical occupancy $\hat{o}_{out}$ and the measured occupancy
at the bottleneck $\tilde{o}_{out}$.
The system integrates four detectors positioned
around the bottleneck, utilizing the maximum occu-
pancy measurement from these sensors. Then, the
calculated speed limits are applied to the 300 m sec-
tions.
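The integral update of Equation 3, combined with the maximum over the four detectors, can be sketched as below. The gain, the critical occupancy and the bounds on b(t) are hypothetical values chosen only for illustration, and limiting b(t) to a fixed interval is an assumption rather than something stated in the text.

```python
def mtfc_update(b_prev, detector_occupancies, critical_occupancy, k_i,
                b_min=0.2, b_max=1.0):
    """One step of b(t) = b(t-1) + K_I * e(t), following Equation 3.

    The measured occupancy is the maximum over the bottleneck detectors.
    Assumption: b is kept inside [b_min, b_max].
    """
    measured = max(detector_occupancies)
    error = critical_occupancy - measured
    b_new = b_prev + k_i * error
    return min(b_max, max(b_min, b_new))


# Occupancy above the critical value lowers the speed-limit fraction.
b = mtfc_update(1.0, [0.10, 0.20, 0.30, 0.25], critical_occupancy=0.2, k_i=0.5)
print(round(b, 2))         # 0.95
print(round(b * 130.0, 1)) # 123.5 km/h before rounding to a posted limit
```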
5.2 Proposed Solution
5.2.1 Reinforcement Learning
Reinforcement Learning has become a pivotal branch
of Machine Learning for addressing sequential
decision-making problems and optimization challenges,
demonstrating its strength in numerous applications
ranging from robotics through autonomous
vehicle control to traffic signal control. In contrast
to Supervised Learning, the most widespread
technique in the vehicle industry, RL does not rely
on pre-annotated datasets: the agent generates its own
training samples through continuous interaction with
an environment during the training process.
A single interaction between the RL
agent and the SUMO environment, together with the
communication framework, is depicted in Figure 4.
The agent’s learning process involves updating
the action-value function Q(s,a) firstly using the
Bellman-equation, and secondly approximating it by
a neural network with parameters θ. The update rule
for Q-values is given by Equation 4:
$$Q(s,a) \leftarrow Q(s,a) + \alpha \cdot \left[ r + \gamma \cdot \max_{a'} Q(s',a') - Q(s,a) \right] \tag{4}$$
where $Q(s,a)$ denotes the current estimate of the
action-value function, $\alpha$ is the learning rate, $r$
is the reward received after taking action $a$ in state $s$
and transitioning to the next state $s'$, and $\gamma$
denotes the discount factor.
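As a concrete instance of the update rule in Equation 4, the sketch below applies one tabular Q-learning step. It is a deliberate simplification: the approach described here approximates Q with a neural network, whereas the example stores Q-values in a dictionary, and the learning rate, discount factor and state labels are hypothetical.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    """Apply Q(s,a) <- Q(s,a) + alpha * [r + gamma * max_a' Q(s',a') - Q(s,a)]."""
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


q_table = defaultdict(float)   # unseen (state, action) pairs start at 0.0
actions = (0, 1, 2)            # indices of the three speed-limit increments

new_value = q_update(q_table, state="s0", action=1, reward=1.0,
                     next_state="s1", actions=actions)
print(new_value)  # 0.1
```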
5.2.2 Multi Agent Reinforcement Learning
MARL extends the conventional single-agent RL
paradigm introduced in Section 5.2.1 by incorporat-
ing multiple interacting agents, each learning and
making decisions to optimize their respective objec-
tives within a shared environment.
In the context of MARL, each agent $i$ similarly
aims to maximize its own expected cumulative
reward $G_t$, as discussed above. The policy improvement
step, which seeks a new policy that maximizes
this value, is described by Equation 5:
$$\pi_i^{\text{new}} = \arg\max_{\pi_i} \, \mathbb{E}_{s \sim d^{\pi},\, a \sim \pi_i} \left[ Q^{\pi_i}(s,a) \right] \tag{5}$$
where $\pi_i^{\text{new}}$ denotes the improved policy for agent $i$.
The expectation $\mathbb{E}$ is taken over the state distribution
$d^{\pi}$ induced by the current policy $\pi$ and the action
distribution $\pi_i$.
By integrating these methodologies, MARL pro-
vides a robust approach for developing either compet-
itive or cooperative strategies among agents, enhanc-
ing the overall performance of decision-making.
5.2.3 Cooperative Multi Agent Reinforcement
Learning (cMARL)
In this study, the investigated road infrastructure is
segmented into discrete sections, each capable of au-
tonomous decision-making. This segmentation facil-
itates the resolution of shock waves across the net-
work by implementing an independent learner multi-
agent system. Throughout training iterations, individ-
ual agents operate without prior knowledge of their
peers’ actions, thus preserving autonomy and mitigat-
ing coordination complexities.
Experiences gained from these interactions are
stored in a shared buffer, forming the basis of
subsequent learning. Notably, all agents share a common
neural network architecture, which leads to a self-play
paradigm. This paradigm allows agents to adapt
to environmental changes without
necessitating explicit cooperation.
In our implementation, agents pursue a unified
objective guided by a predefined reward function,
which follows the identical payoff scheme. Such
design promotes an implicit cooperative behaviour,
wherein agents optimize their individual actions to-
wards achieving a globally optimal network state.
By structuring the system in this manner, the net-
work retains flexibility, negating the need for agents
to identify the active section. This modularity ensures
scalability, enabling the seamless integration of addi-
tional sections without altering the underlying state
representation.
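The shared-buffer idea can be sketched as follows: every zone agent pushes its own transition into one common replay memory, and each transition carries the same network-wide reward. This is a minimal illustration under stated assumptions; the class name, the tuple layout and the capacity are hypothetical, and the shared neural network itself is omitted.

```python
import random
from collections import deque, namedtuple

Transition = namedtuple("Transition", "state action reward next_state done")


class SharedReplayBuffer:
    """Single replay memory filled by all agents (independent learners)."""

    def __init__(self, capacity: int):
        self._memory = deque(maxlen=capacity)  # oldest samples drop out first

    def push(self, state, action, reward, next_state, done):
        self._memory.append(Transition(state, action, reward, next_state, done))

    def sample(self, batch_size: int):
        return random.sample(list(self._memory), batch_size)

    def __len__(self):
        return len(self._memory)


buffer = SharedReplayBuffer(capacity=10_000)
global_reward = 0.2  # identical payoff: every agent receives the same reward
for zone in range(18):
    # Each zone observes only its own occupancy and stores its own experience.
    buffer.push((0.04,), 1, global_reward, (0.05,), False)

batch = buffer.sample(4)
print(len(buffer), len(batch))  # 18 4
```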
Figure 4: Reinforcement Learning training loop and communication framework for Variable Speed Limit Control using the
Simulation of Urban MObility (SUMO) environment and the Traffic Control Interface (TraCI).
Table 1: Statistical comparison of baseline methods and the Cooperative Multi-Agent Reinforcement Learning approach based
on average values of 100 test episodes in high traffic density conditions.
Method                    Distribution  Travel time [s]  Waiting time [s]  Queue length [veh/s]  CO2 [kg/s]  NOx [g/s]  Fuel [kg/s]
FF                        Uniform       314.7            16.21             29.46                 2251.9      956.3      718.3
FF                        Poisson       305.9            14.37             26.75                 2156.1      914.1      687.7
MCS                       Uniform       312.6            16.52             29.25                 2357.0      1005.5     751.8
MCS                       Poisson       295.9            12.19             24.11                 2247.5      957.3      716.8
MTFC                      Uniform       434.0            10.94             15.98                 2515.4      1071.1     802.3
MTFC                      Poisson       441.1            7.95              10.79                 2490.6      1061.9     794.4
cMARL                     Uniform       275.6            5.62              7.99                  1953.6      817.3      623.1
cMARL                     Poisson       277.2            5.50              6.39                  1874.2      782.8      597.8
Minimal Performance Gain  Uniform       11.8%            48.6%             50.0%                 13.3%       14.5%      13.3%
Minimal Performance Gain  Poisson       6.3%             30.8%             40.8%                 13.1%       14.4%      13.1%
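The "Minimal Performance Gain" rows of Table 1 are numerically consistent with the relative improvement of cMARL over the best-performing baseline for each metric. That interpretation is an inference from the numbers rather than a definition given in the text. A short sketch reproducing two of the entries, with a hypothetical function name:

```python
def minimal_gain(baseline_values, proposed_value):
    """Relative improvement (in %) over the best baseline; lower is better."""
    best_baseline = min(baseline_values)
    return 100.0 * (best_baseline - proposed_value) / best_baseline


# Uniform distribution, high traffic density (values from Table 1).
travel_time = {"FF": 314.7, "MCS": 312.6, "MTFC": 434.0}
waiting_time = {"FF": 16.21, "MCS": 16.52, "MTFC": 10.94}

print(f"{minimal_gain(travel_time.values(), 275.6):.1f}%")   # 11.8%
print(f"{minimal_gain(waiting_time.values(), 5.62):.1f}%")   # 48.6%
```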
6 RESULTS
In order to evaluate our methodology and test its
performance against established baselines, we conducted
experiments under consistent, identical
environmental conditions.
To validate our approach and support its robust-
ness, we have tested all the methods employing ran-
domized traffic flows based on two different distribu-
tions. Furthermore, each distribution has been evalu-
ated under two traffic density levels, later referred to
as normal and high.
Over 100 seeded pseudorandom test episodes, we
have measured six key metrics. Three of them, namely
travel time, waiting time and queue length,
reflect general traffic flow characteristics, while the
other three, fuel consumption, CO2 and NOx
emissions, are critical sustainability indicators.
Figure 5 gives a visual illustration of the results,
with high-density traffic conditions shown in grey and
normal density in blue. On the x-axis, our solution,
abbreviated as cMARL, is consistently positioned
rightmost, followed by the baseline methods in descending
order of performance. The data clearly demonstrate
that cMARL outperforms the baseline methods across
all traffic flow parameters and significantly reduces
the emission metrics. Notably, the performance gain of
cMARL increases with traffic density, highlighting its
scalability and efficiency under congested conditions.
Figure 5: Average performance of the different solutions over 100 test episodes for both classic and sustainability measures.
The same can be seen numerically in Table 1
for the high traffic densities. These results
confirm the earlier assumption about the superiority of RL
over the other solutions. This is further supported by
the substantial reduction in waiting times, which
reflects efficient traffic management. The mean queue
lengths were also shorter with cMARL, suggesting a
smoother traffic flow and reduced congestion. On the
sustainability front, cMARL yields a substantial
reduction in CO2 and NOx emissions, demonstrating
its environmental benefits. Additionally, the
reduced fuel consumption underlines the economic
and ecological advantages of our approach.
In summary, the cMARL algorithm not only en-
hances traffic flow efficiency by reducing average
waiting times and travel times but also contributes
to environmental sustainability through lower emis-
sions. This dual benefit underscores the potential of
Cooperative Multi-Agent Reinforcement Learning to
improve public road conditions and support a sustain-
able future.
7 CONCLUSION
This paper deals with the problem of Variable Speed
Limit Control under randomly emerging anomalies
that cause bottlenecks on a simulated traffic network.
In VSLC, the objective is to construct a speed limit
sequence for each VSLC zone (most commonly consecutive
segments of fixed length) such that the algorithm
proactively responds to real-time traffic conditions
and minimizes certain flow parameters, including
queue length and waiting time, thereby preventing
the formation of congestion.
The real-time nature of the problem and the difficulty
of constructing precise mathematical models for
such events in highway scenarios both motivate
the choice of Machine Learning-based solutions.
This study demonstrates the efficacy and scalability
of a Cooperative Multi-Agent Reinforcement Learning
(cMARL) approach.
As detailed above, three methods have been used
as benchmarks in our extensive comparison scheme:
the free-flow condition with no control enabled,
the MCS currently used in real-world application in
Sweden, and the MTFC, which can achieve the best
performance in certain flow metrics according to a
survey carried out among VSLC techniques.
The results, obtained from extensive simulations
in both normal and high traffic density conditions,
highlight the significant advantages of cMARL over
traditional traffic management methods. Specifically,
cMARL consistently outperforms baseline methods
in reducing average travel times, waiting times, and
queue lengths, thus enhancing overall traffic flow
efficiency. Furthermore, the approach proves to be highly
effective in lowering fuel consumption and emissions
of CO2 and NOx.
Overall, the dual benefits of improved traffic flow
and reduced environmental impact make cMARL a
promising solution for modern traffic management
challenges. The results of this study pave the way
for future research: further development of the RL
abstractions, such as a sliding kernel-type state
representation for ease of interpretability, would
likely yield even better results.
ACKNOWLEDGEMENTS
This work was supported by the European Union
within the framework of the National Laboratory
for Autonomous Systems (RRF-2.3.1-21-2022-00002).
T.B. was supported by BO/00233/21/6: the
János Bolyai Research Scholarship of the Hungarian
Academy of Sciences.
REFERENCES
Alvarez Lopez, P., Behrisch, M., Bieker-Walz, L., Erdmann,
J., Flötteröd, Y.-P., Hilbrich, R., Lücken, L., Rummel,
J., Wagner, P., and Wießner, E. (2018). Microscopic
traffic simulation using SUMO. In 2019 IEEE Intelligent
Transportation Systems Conference (ITSC),
pages 2575–2582. IEEE.
An, S.-h., Lee, B.-H., and Shin, D.-R. (2011). A survey
of intelligent transportation systems. In 2011
third international conference on computational intelligence,
communication systems and networks, pages
332–337. IEEE.
Fényes, D., Németh, B., and Gáspár, P. (2021). A novel
data-driven modeling and control design method for
autonomous vehicles. Energies, 14(2):517.
Grumert, E. F., Tapani, A., and Ma, X. (2018). Characteristics
of variable speed limit systems. European transport
research review, 10:1–12.
Huang, D., Shere, S., and Ahn, S. (2010). Dynamic highway
congestion detection and prediction based on
shock waves. In Proceedings of the seventh ACM international
workshop on VehiculAr InterNETworking,
pages 11–20.
Kavas-Torris, O., Gelbal, S. Y., Cantas, M. R., Aksun Guvenc,
B., and Guvenc, L. (2022). V2X communication
between connected and automated vehicles
(CAVs) and unmanned aerial vehicles (UAVs). Sensors,
22(22):8941.
Khondaker, B. and Kattan, L. (2015). Variable speed limit:
an overview. Transportation Letters, 7(5):264–278.
Kim, Y., Kang, K., Park, N., Park, J., and Oh, C. (2024).
Reinforcement learning approach to develop variable
speed limit strategy using vehicle data and simulations.
Journal of Intelligent Transportation Systems,
pages 1–18.
Kővári, B., Szőke, L., Bécsi, T., Aradi, S., and Gáspár, P.
(2021). Traffic signal control via reinforcement learning
for reducing global vehicle emission. Sustainability,
13(20):11254.
Kušić, K., Ivanjko, E., Gregurić, M., and Miletić, M.
(2020). An overview of reinforcement learning methods
for variable speed limit control. Applied Sciences,
10(14):4917.
Kővári, B., Knáb, I., and Bécsi, T. (EasyChair, 2024). Variable
speed limit control for highway scenarios a multi-agent
reinforcement learning based approach.
EasyChair Preprint no. 13400.
Lahmiss, H. and Khatory, A. (2020). Variable speed limit
(VSL) system applications in motorways. In 2020 IEEE
13th International Colloquium of Logistics and Supply
Chain Management (LOGISTIQUA), pages 1–5.
IEEE.
Müller, E. R., Carlson, R. C., Kraus, W., and Papageorgiou,
M. (2013). Microscopic simulation analysis of mainstream
traffic flow control with variable speed limits.
In 16th International IEEE Conference on Intelligent
Transportation Systems (ITSC 2013), pages 998–1003.
IEEE.
Richey Jr, R. G., Chowdhury, S., Davis-Sramek, B., Giannakis,
M., and Dwivedi, Y. K. (2023). Artificial intelligence
in logistics and supply chain management: A
primer and roadmap for research.
Van Toorenburg, J. and De Kok, M. (1999). Automatic incident
detection in the motorway control system MTM.
Bureau Transpute, Gouda.