sequential decision problem to reason in order to generalize faster and in a more interpretable manner. Consequently, we proposed a solution for fully observable problems in the form of such a two-component system. The reasoning component makes use of a KB organized as a layered directed acyclic graph, and after a sufficient period of learning and generalization the learner turns to the KB to expedite learning the optimal policy. This is analogous to capitalizing on a memory of past experience, which forms the cognitive abilities of the solver, when facing the unknown, rather than confronting it with nothing but high hopes.
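As a minimal illustration of this design, the sketch below represents the KB as a layered directed acyclic graph whose nodes pair state conditions with recommended actions, and shows how a tabular learner might consult it to warm-start its value estimates. All names and structure here (KBNode, LayeredKB, suggest, warm_start_q) are illustrative assumptions, not the implementation described in this work.

```python
# Hypothetical sketch: a layered-DAG knowledge base consulted by a tabular
# Q-learner to warm-start its estimates. Names and structure are assumptions
# for illustration, not the paper's implementation.
from collections import defaultdict


class KBNode:
    def __init__(self, condition, action, value):
        self.condition = condition  # predicate over states at this abstraction level
        self.action = action        # action recommended when the condition holds
        self.value = value          # confidence / estimated return of the rule
        self.children = []          # more specific rules in the layer below


class LayeredKB:
    """Knowledge base organized as a layered directed acyclic graph:
    the top layer holds the most general rules, deeper layers refine them."""

    def __init__(self):
        self.roots = []

    def suggest(self, state):
        """Descend from general to specific rules; return the (action, value)
        pair of the deepest rule whose condition matches the given state."""
        best, best_depth = None, -1
        frontier = [(n, 0) for n in self.roots if n.condition(state)]
        while frontier:
            node, depth = frontier.pop()
            if depth > best_depth:
                best, best_depth = (node.action, node.value), depth
            frontier.extend((c, depth + 1) for c in node.children if c.condition(state))
        return best


def warm_start_q(kb, states, default=0.0, bonus=1.0):
    """Initialize a Q-table, biasing KB-recommended actions so the learner
    explores them first instead of starting from a uniform blank slate."""
    q = defaultdict(lambda: default)
    for s in states:
        hint = kb.suggest(s)
        if hint is not None:
            action, value = hint
            q[(s, action)] = default + bonus * value
    return q
```

In this sketch the warm start only biases the initial estimates; standard temporal-difference updates remain free to overwrite the KB's suggestions wherever they turn out to be suboptimal.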
We are currently investigating extensions of the proposed neuro-symbolic framework to partially observable environments and the integration of additional cognitive elements into the KB. Furthermore, we are exploring ways to improve the efficiency of the inference procedures over the proposed KB structure. Additionally, we plan to conduct empirical validations across diverse problem domains to provide insights into the broader applicability and robustness of the proposed approach, paving the way for advances in neuro-symbolic systems for artificial intelligence applications.