required and legal frameworks need to be considered
in data-based models (Zhang, 2022). As the third
category, Deep Reinforcement Learning (DRL)
models can continuously learn and improve while
interacting with the environment; they generalize
better than rule-based models and effectively avoid
the need for large datasets.
Combining DRL with vehicle networking technology
for urban road traffic control is a current research
hotspot and frontier field (Sutton, 2018). At present,
few studies have examined the interactions between
multiple autonomous vehicles in cooperative lane
changing, and previous work has rarely considered
how vehicles perform under different speed limits.
The primary aim of this study is to delve into the
cooperative lane-change decisions of multiple
autonomous vehicles. Firstly, employing the deep
deterministic policy gradient (DDPG) framework,
this study tackles the highway lane-change problem
for multiple autonomous vehicles in mixed traffic
scenarios. Here, vehicles collaborate to learn safe and
efficient driving strategies, evaluated by their
averaged output performance. Secondly, to ensure optimal
vehicle operations, the paper imposes penalties for
unnecessary lane changes while incentivizing
effective lane changes. This addresses the issue of
vehicles excessively or insufficiently changing lanes
to maximize reward values. Thirdly, this study
analyzes and compares the predictive performance of
models under different lane-change reward schemes.
Moreover, this study incorporates various speed
limits commonly observed on highways (40, 60, 80
meters per second), adjusting safety distances
between cars accordingly. This resolves the limitation
of employing a uniform speed limit for all vehicles.
Additionally, in crafting the reward function, this
study considers sudden accelerations or decelerations
of vehicles, thereby mitigating the tendency to
prioritize driving efficiency over passenger comfort,
a common oversight in previous studies.
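To make the speed-dependent safety distance concrete, the sketch below uses a standard constant-time-headway rule; the headway and standstill-gap values are illustrative assumptions, not the exact rule used in this study.

```python
# Illustrative sketch of a speed-dependent safety distance, assuming a
# constant-time-headway rule (parameters are hypothetical).
def safe_distance(speed_mps, time_headway=1.5, standstill_gap=10.0):
    """Minimum gap (m) to the leading vehicle at a given speed (m/s)."""
    return standstill_gap + time_headway * speed_mps

# Higher speed limits imply larger required gaps between vehicles.
for v in (40, 60, 80):  # the speed limits considered in this study
    print(f"{v} m/s -> {safe_distance(v):.0f} m")
```

Under this rule, the required gap grows linearly with the speed limit, which is why a uniform gap for all speed limits would be either unsafe or overly conservative.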
2 METHODOLOGIES
2.1 Dataset Description and
Preprocessing
In order to simulate driving scenarios, the highway-
env platform is used in this study (Leurent, 2018).
Highway-env is an open-source simulation
environment for developing and testing autonomous
driving strategies. Created by Edouard Leurent, the
environment provides a series of customizable, rule-
based traffic scenarios for evaluating the decision-
making and control systems of self-driving vehicles.
In highway-env, there are six specialized driving
scenarios to choose from: highway, merge,
roundabout, parking, intersection, and racetrack. This
study considers a three-lane, one-way highway with
a constant density of 8 autonomous vehicles and 10
manually driven vehicles.
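The scenario above can be expressed as a highway-env configuration dictionary; the key names follow highway-env's documented configuration scheme, and instantiating the environment itself requires the highway-env package, so that step is only indicated in comments.

```python
# Scenario configuration for the highway-env "highway" scenario used in
# this study: a three-lane, one-way highway with 8 controlled
# (autonomous) vehicles and 10 manually driven background vehicles.
# Key names follow highway-env's documented configuration dictionary.
config = {
    "lanes_count": 3,          # three-lane, one-way highway
    "controlled_vehicles": 8,  # autonomous vehicles (the agents)
    "vehicles_count": 10,      # manually driven background vehicles
}

# With highway-env installed, the environment would be created as, e.g.:
#   import gymnasium as gym
#   import highway_env  # registers the highway-env scenarios
#   env = gym.make("highway-v0", config=config)

total_vehicles = config["controlled_vehicles"] + config["vehicles_count"]
print(total_vehicles)  # the vehicle density is kept constant at 18
```

Keeping `vehicles_count` and `controlled_vehicles` fixed across experiments isolates the effects of the speed limit and the lane-change penalty from traffic density.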
2.2 Proposed Approach
The objective of this research is to investigate the
lane-change performance of autonomous vehicles
when different levels of penalty are imposed for
unnecessary lane changes at different speed limits,
in order to find effective lane-change
strategies. The approach is based on DDPG, a
DRL algorithm, combined with a highway simulation
environment.
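DDPG trains a deterministic actor alongside a critic that regresses onto a bootstrapped target, with slowly tracking target networks. The sketch below shows these two core updates with toy scalar/array parameters; the linear shapes and coefficient values are hypothetical, for illustration only.

```python
import numpy as np

# Minimal sketch of DDPG's two core updates (toy parameters).
GAMMA, TAU = 0.99, 0.005  # discount factor and soft-update rate

def td_target(reward, next_q, done, gamma=GAMMA):
    """Critic regression target: y = r + gamma * Q'(s', mu'(s'))."""
    return reward + gamma * (1.0 - done) * next_q

def soft_update(target_params, online_params, tau=TAU):
    """Polyak averaging: theta' <- tau * theta + (1 - tau) * theta'."""
    return tau * online_params + (1.0 - tau) * target_params

y = td_target(reward=1.0, next_q=10.0, done=0.0)
theta_target = soft_update(np.zeros(3), np.ones(3))
print(y)             # 1 + 0.99 * 10 = 10.9
print(theta_target)  # each entry moves tau of the way toward 1.0
```

The slow target updates (small `TAU`) stabilize the critic's regression target, which is what allows continuous-action control of the simulated vehicles.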
To enhance lane-change effectiveness, a penalty
based on unnecessary lane-change distance is introduced.
Meanwhile, the acceleration during lane changes and
the range of inter-vehicle distances after lane
changes are limited, ensuring that lane changes
provide a higher level of comfort and have minimal
impact on neighbouring vehicles. This paper
evaluates the average lane-change performance of
multiple vehicles in different kinds of traffic scenarios by
varying the speed limit and the penalty for unnecessary
lane changes while keeping the vehicle density
constant. The controlled autonomous vehicles (the
agents) interact with the simulated traffic
environment and utilize the returns from the
environment to develop a lane-change strategy.
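The reward shaping described above can be sketched as follows; the terms and coefficients are illustrative assumptions rather than the study's exact reward function, but they show how efficiency, the lane-change incentive/penalty, and comfort trade off.

```python
# Illustrative reward shaping (hypothetical coefficients): rewards speed,
# incentivizes effective lane changes, penalizes unnecessary ones, and
# penalizes harsh acceleration for passenger comfort.
def reward(speed, speed_limit, lane_changed, change_was_useful,
           accel, w_speed=1.0, w_change=0.5, w_comfort=0.1):
    r = w_speed * speed / speed_limit       # driving efficiency
    if lane_changed:
        if change_was_useful:
            r += w_change                   # incentive for effective changes
        else:
            r -= w_change                   # penalty for unnecessary changes
    r -= w_comfort * abs(accel)             # comfort: sudden accel/decel
    return r

# An unnecessary lane change scores lower than staying in lane:
stay = reward(30, 40, lane_changed=False, change_was_useful=False, accel=0.0)
bad = reward(30, 40, lane_changed=True, change_was_useful=False, accel=0.0)
print(stay > bad)  # True
```

Because the penalty and incentive are symmetric around the no-change baseline, the agent is pushed toward changing lanes only when doing so actually improves its driving outcome.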
Figure 1 below illustrates the structure of the system.
Figure 1: The pipeline of the model (Photo/Picture credit: Original).