ical training algorithms may lead to a locally instead of a globally optimal solution of the objective function (11). We therefore use a Monte Carlo-type approach with 50 parameter initializations for each of the NNs to make the trained networks more comparable.
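As an illustration, a minimal Python sketch of such a multi-start training loop could look as follows; the names build_network, train, and evaluate_objective are hypothetical placeholders for the actual network architecture, training routine, and evaluation of objective (11), and are not part of our implementation.

import torch

def multi_start_training(build_network, train, evaluate_objective,
                         n_starts=50, seed=0):
    # Train the same architecture from n_starts random initializations
    # and keep the parameters with the lowest objective value.
    best_net, best_loss = None, float("inf")
    for i in range(n_starts):
        torch.manual_seed(seed + i)      # different random initialization per run
        net = build_network()            # fresh, randomly initialized network
        train(net)                       # local (e.g. gradient-based) training
        loss = evaluate_objective(net)   # assumed: objective (11) on held-out data
        if loss < best_loss:
            best_net, best_loss = net, loss
    return best_net, best_loss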
5 SUMMARY AND CONCLUSION
We have presented different model-based and data-driven approaches to solve the ring road control problem. While the NLMPC approach achieves the highest performance and performs well even if the modeled dynamical system differs from the real system (cf. Section 4.2.1), its computing times may prevent it from being applicable in real time. We have therefore replaced the MPC controller with an imitation learning approach which can be extended to include several different training scenarios. With this extension, the controller performs well even in scenarios that have not been observed exactly during training and may not even require data from all vehicles in the system (cf. Section 4.2.2). By imitating another controller, the network incorporates expert knowledge about the dynamical system, which leads to much faster training than for other ML techniques such as RL.
For the training, we require an at least basic understanding of the HD dynamics induced by the car-following model. To run the controller in real-world applications, however, we need real-time data from other vehicles. We expect that current developments in communication technologies will enable the realization of such controllers in real-world traffic situations in the near future.
Although their performance depends on the observed training data, data-driven controllers such as the extIL approach offer advantages in real-world situations. While the presented NLMPC controller is only valid for the closed ring road, for which we are able to describe the dynamics, the extIL controller may be applied in all set-ups in which data from one or several leading vehicles are available. Since the stop-and-go waves and traffic jams occurring, for instance, in city traffic are similar to those on the ring road, we expect our extIL controller to perform well in these situations, too.
As in other applications in which ML techniques show promising results, there are still open questions regarding the use of NN-based controllers in safety-critical situations such as steering a vehicle. While we have shown a certain robustness of our approach, guaranteed accuracy and stability are still hard to prove and remain, at least partially, open tasks. Furthermore, the tuning of hyperparameters and the reproducibility of neural network training results are critical issues as well; they may be mitigated by using other structures such as radial basis function (RBF) networks (Bishop, 2006).
In this work, we have focused on scenarios with only one intelligently controlled vehicle to emphasize that a single such vehicle already suffices to dissipate the emerging stop-and-go wave. However, taking several AVs into account may further improve the results and lead to an even more efficient dissipation of the wave, cf. (Chou et al., 2022).
In future work, we aim to further analyze the robustness and stability of our approach and of other AI-based techniques, in particular to guarantee certain safety criteria so that the controller can be applied not only in artificial but also in realistic real-world traffic situations.
REFERENCES
Baumgart, U. and Burger, M. (2021). A reinforcement
learning approach for traffic control. In Proceedings
of the 7th International Conference on Vehicle Tech-
nology and Intelligent Transport Systems - VEHITS,
pages 133–141. INSTICC, SciTePress.
Bishop, C. M. (2006). Pattern Recognition and Machine
Learning. Springer-Verlag New York.
Chou, F.-C., Bagabaldo, A. R., and Bayen, A. M. (2022).
The Lord of the Ring Road: A Review and Evaluation
of Autonomous Control Policies for Traffic in a Ring
Road. ACM Transactions on Cyber-Physical Systems
(TCPS), 6(1):1–25.
Cui, S., Seibold, B., Stern, R., and Work, D. B. (2017). Sta-
bilizing traffic flow via a single autonomous vehicle:
Possibilities and limitations. In 2017 IEEE Intelligent
Vehicles Symposium (IV), pages 1336–1341.
Cybenko, G. (1989). Approximation by superpositions of a
sigmoidal function. Mathematics of Control, Signals
and Systems, 2(4):303–314.
De Schutter, B. and De Moor, B. (1998). Optimal Traf-
fic Light Control for a Single Intersection. European
Journal of Control, 4(3):260–276.
Duan, Y., Chen, X., Houthooft, R., Schulman, J., and
Abbeel, P. (2016). Benchmarking Deep Reinforce-
ment Learning for Continuous Control. In Proceed-
ings of The 33rd International Conference on Ma-
chine Learning, volume 48 of Proceedings of Machine
Learning Research, pages 1329–1338.
Gazis, D. C., Herman, R., and Rothery, R. W. (1961). Non-
linear Follow-the-Leader Models of Traffic Flow. Op-
erations Research, 9(4):545–567.
Gerdts, M. (2011). Optimal Control of ODEs and DAEs.
De Gruyter.
Gipps, P. (1981). A behavioural car-following model for
computer simulation. Transportation Research Part
B: Methodological, 15(2):105–111.
Grüne, L. and Pannek, J. (2011). Nonlinear Model Predictive Control. Springer-Verlag, London.