tions in comparison with the equivalent decisions taken by a human specialist. At the moment, we are evaluating the performance of the agent when handling specific management situations; assessing how well the algorithm copes with variations of the scenario; and changing the set of attributes used to generate the rules, which can make them less susceptible to influence. These questions are the subject of future research.
ACKNOWLEDGEMENTS
This research has been supported by the Araucária Foundation and the National Council for Scientific and Technological Development (CNPq), under grant numbers 378/2014 and 484859/2013-7, respectively.