12 CONCLUSION AND FUTURE WORK
This paper presents a methodological approach to analyze the exact output regions of policy FF-DNNs for high-dimensional input-output spaces. Such an exact analysis is necessary for understanding agents' strategies, especially with regard to CNI operation. The paper builds upon the exact transformation of FF-DNNs into a pruned DT. The paths from the root to the leaf nodes of such a DT make up output regions that can be represented by polytopes. To provide an overview of a polytope, its inner box is computed. Furthermore, the output regions are depicted as vertices in the introduced NHG, which are connected by edges if the corresponding regions are direct neighbors. Further properties, such as the polytope constraints, the inner box bounds and their volume, the Chebyshev ball center and radius, the bounding box, and the cosine distances of the normal vectors of the hyperplanes, are also mapped to the vertices and thus to the hyperplanes of the output regions. Thereby, the Neighbor Graph especially allows us to visualize the relations between higher-dimensional output regions.
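To make the stored per-region properties concrete, the sketch below computes the Chebyshev ball of a single output region given in half-space form and attaches it as vertex attributes of a graph vertex. This is a minimal sketch and not the paper's implementation; the use of SciPy and NetworkX, the toy polytope, and all identifiers are assumptions made for illustration only.

# Minimal sketch (assumed libraries and names, not the paper's implementation).
import networkx as nx
import numpy as np
from scipy.optimize import linprog


def chebyshev_ball(A, b):
    """Largest ball inside the polytope {x : A x <= b}.

    Solved as an LP: maximize r subject to a_i^T c + ||a_i|| r <= b_i, r >= 0.
    """
    n = A.shape[1]
    row_norms = np.linalg.norm(A, axis=1, keepdims=True)
    obj = np.zeros(n + 1)
    obj[-1] = -1.0                           # minimize -r  <=>  maximize r
    A_ub = np.hstack([A, row_norms])         # variables: center c, radius r
    bounds = [(None, None)] * n + [(0, None)]
    res = linprog(obj, A_ub=A_ub, b_ub=b, bounds=bounds)
    return res.x[:n], res.x[-1]              # (center, radius)


# Toy output region: the unit square 0 <= x1, x2 <= 1 in half-space form.
A = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
b = np.array([1.0, 0.0, 1.0, 0.0])
center, radius = chebyshev_ball(A, b)        # -> (0.5, 0.5), 0.5

# Store the region as a vertex of a neighbor graph and attach its properties;
# edges would connect regions that are direct neighbors (e.g., share a facet).
nhg = nx.Graph()
nhg.add_node(0, constraints=(A, b), cheby_center=center, cheby_radius=radius)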
In future work, these analysis tools could be eval-
uated in real scenarios with even higher dimensions.
They could also be used to analyze the learned control
strategies of ARL agents in the power grid for larger
input domains. This would make it possible to obtain a better overview of the agent's entire possible behavior.
ACKNOWLEDGEMENTS
This work was funded by the German Federal Ministry of Education and Research (BMBF) under Grant No. 01IS22071.