An Extended Q Learning System with Emotion State to Make Up an
Agent with Individuality
Masanao Obayashi¹, Shunsuke Uto¹, Takashi Kuremoto¹, Shingo Mabu¹ and Kunikazu Kobayashi²
¹Graduate School of Science and Engineering, Yamaguchi University, Ube, Yamaguchi, Japan
²School of Information Science and Technology, Aichi Prefectural University, Nagakute, Aichi, Japan
Keywords: Reinforcement Learning, Amygdala, Emotional Model, Q Learning, Individuality.
Abstract: Recently, research on intelligent robots incorporating knowledge from neuroscience has been actively
carried out. In particular, many studies make use of reinforcement learning; among them, the
"reinforcement learning methods with emotions" proposed so far are very attractive, because taking
emotions into account makes it possible to achieve complicated objectives that cannot be achieved
by conventional reinforcement learning methods. In this paper, we propose an extended reinforcement (Q)
learning system with amygdala (emotion) models that gives each agent its own individual emotions.
In addition, through computer simulations in which the proposed method is applied to a goal search
problem with a variety of distinctive solutions, we find that each agent is able to reach its own
individual solution.
1 INTRODUCTION
Reinforcement learning (RL) for the behavior
selection of agents/robots has been studied since the
1950s. As a machine learning method, it uses trial-
and-error search: rewards are given by the
environment as the results of the exploration/
exploitation behaviors of the agent and are used to improve its
action-selection policy (Sutton et al., 1998).
The architecture of an RL system is shown in Fig. 1.
However, when a human makes a decision, he finally
does so using various functions of the brain, e.g.,
emotion. Even when the environmental state is the same,
many different behavior selections may be
made depending on his emotional state at that time.
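The trial-and-error update at the core of such a system is the standard tabular Q learning rule. A minimal sketch follows; the chain environment, learning rate, discount factor and episode count are illustrative assumptions, not the settings used in this paper:

```python
import random

def q_learning(n_states, n_actions, step, alpha=0.1, gamma=0.9,
               epsilon=0.1, episodes=200):
    """Tabular Q learning: Q(s,a) += alpha*(r + gamma*max_a' Q(s',a') - Q(s,a))."""
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # epsilon-greedy trade-off between exploration and exploitation
            if random.random() < epsilon:
                a = random.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda x: Q[s][x])
            s2, r, done = step(s, a)
            target = r if done else r + gamma * max(Q[s2])
            Q[s][a] += alpha * (target - Q[s][a])
            s = s2
    return Q

# Toy chain environment: moving right (action 1) leads to the goal state.
def chain_step(s, a, n=5):
    s2 = min(s + 1, n - 1) if a == 1 else max(s - 1, 0)
    return s2, (1.0 if s2 == n - 1 else 0.0), s2 == n - 1

random.seed(0)
Q = q_learning(5, 2, chain_step)
```

After learning, the greedy policy at every state moves toward the goal; the emotion models introduced later modulate exactly this kind of value-based action selection.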
A computational emotion model has been
proposed by J. Moren and C. Balkenius (Moren et
al., 2001). Their emotion model consists of four
parts of the brain: thalamus, sensory cortex,
orbitofrontal cortex and amygdala, as shown in
Fig. 2, which represents the flow from the reception of
sensory stimuli to the assessment of their value. So far,
the emotion model has been applied to various fields,
especially to control problems. For
example, H. Rouhani, et al. applied it to speed and
position control of a switched reluctance motor
(Rouhani, et al., 2007) and to micro heat exchanger
control (Rouhani, et al., 2007). N. Goerke applied it
to robot control (Nils, 2006), and E. Daglari, et al.
applied it to behavioral task processing for a cognitive
robot (Daglari, et al., 2009). On the other hand,
Obayashi et al. combined the emotion model with
reinforcement Q learning to realize an agent with
individuality (Obayashi, et al., 2012). F. Yang et al.
also proposed a behaviour decision-making
system for agents based on artificial emotion, using a
cerebellar model arithmetic computer (CMAC)
network (Fuping, et al., 2014). H. Xue et al.
proposed an emotion expression method for robots with
personality, which enables robots to have different
personalities (Xue, et al., 2013). Kuremoto et al.
applied the emotion model to a dynamic associative memory system
(Kuremoto, et al., 2009). All of these applications
achieved good results.
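The amygdala–orbitofrontal computation in the Moren–Balkenius model can be sketched as follows. This is a simplified sketch written from the general brain-emotional-learning (BEL) literature, not the exact formulation of this paper; the variable names, learning rates and the use of a single scalar reward are our illustrative assumptions:

```python
def bel_step(S, V, W, reward, alpha=0.2, beta=0.2):
    """One learning step of a simplified brain-emotional-learning (BEL) model.

    S: sensory stimulus intensities
    V: amygdala weights (excitatory; they only grow, i.e. fast acquisition)
    W: orbitofrontal weights (inhibitory correction of the amygdala output)
    Returns the emotional output E and the updated weight lists.
    """
    A = [s * v for s, v in zip(S, V)]   # amygdala node outputs
    O = [s * w for s, w in zip(S, W)]   # orbitofrontal node outputs
    E = sum(A) - sum(O)                 # model output (emotional signal)
    # Amygdala learning cannot be undone (max(0, .)); the orbitofrontal
    # cortex learns to cancel the mismatch between output and reward.
    V = [v + alpha * s * max(0.0, reward - sum(A)) for s, v in zip(S, V)]
    W = [w + beta * s * (E - reward) for s, w in zip(S, W)]
    return E, V, W

# Repeatedly pairing a stimulus with reward 1.0 drives E toward the reward.
S, V, W = [1.0, 0.5], [0.0, 0.0], [0.0, 0.0]
for _ in range(50):
    E, V, W = bel_step(S, V, W, reward=1.0)
```

The asymmetry between the two pathways (the amygdala acquires quickly and permanently, while the orbitofrontal cortex corrects over-responding) is what the applications above exploit.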
In this paper, we propose a reinforcement
learning system equipped with emotion
models to make up "individuality" for each
agent.
The rest of this paper is organized as follows. In
Section 2, a computational emotion model we used
is provided. Our proposed hierarchical Q learning
system with emotions is given in Section 3.
Obayashi, M., Uto, S., Kuremoto, T., Mabu, S. and Kobayashi, K..
An Extended Q Learning System with Emotion State to Make Up an Agent with Individuality.
In Proceedings of the 7th International Joint Conference on Computational Intelligence (IJCCI 2015) - Volume 3: NCTA, pages 70-78
ISBN: 978-989-758-157-1
Copyright © 2015 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved