Coalition Formation for Simulating and Analyzing Iterative Prisoner’s Dilemma
Udara Weerakoon
2015
Abstract
In this paper, we analyze the iterative version of the non-zero-sum two-player game, the Prisoner's Dilemma. We do so by simulating the players in a memetic framework. Our primary motivation is the tragedy of the commons, a dilemma in which individuals acting selfishly destroy the shared resources of the population. In addressing this problem, we identify strategies for applying coalition formation to the spatial distribution of cooperative and defective agents. We apply two reinforcement learning methods, temporal difference learning and Q-learning, to the agents in the environment. This overcomes the negative impact of random selection without cooperation between neighbors. Agents of the memetic framework form coalitions in which leaders make the decisions, as a way of improving performance. By imposing a reward and cost schema on the multiagent system, we are able to measure the performance of the individual leader as well as the performance of the organization.
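To make the setup concrete, the following is a minimal sketch of tabular Q-learning agents playing the iterated Prisoner's Dilemma against each other. It is an illustrative reconstruction, not the paper's implementation: the payoff values, learning parameters, and the choice of the opponent's last move as the state are standard assumptions, and the coalition-formation layer described in the paper is omitted.

```python
import random

# Standard PD payoff matrix (row player's payoff); the specific
# values here are illustrative assumptions, not the paper's parameters.
PAYOFF = {
    ("C", "C"): 3,  # mutual cooperation (reward)
    ("C", "D"): 0,  # sucker's payoff
    ("D", "C"): 5,  # temptation to defect
    ("D", "D"): 1,  # mutual defection (punishment)
}

class QLearningAgent:
    """Tabular Q-learner whose state is the opponent's previous move."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # exploration probability
        self.q = {}             # Q[(state, action)] -> estimated value

    def choose(self, state):
        # Epsilon-greedy action selection over {cooperate, defect}.
        if random.random() < self.epsilon:
            return random.choice(["C", "D"])
        return max(["C", "D"], key=lambda a: self.q.get((state, a), 0.0))

    def update(self, state, action, reward, next_state):
        # Standard Q-learning update toward reward + discounted best next value.
        best_next = max(self.q.get((next_state, a), 0.0) for a in ["C", "D"])
        old = self.q.get((state, action), 0.0)
        self.q[(state, action)] = old + self.alpha * (
            reward + self.gamma * best_next - old)

# Two learners playing the iterated game; initial state None means
# "no previous move observed yet".
a, b = QLearningAgent(), QLearningAgent()
state_a = state_b = None
for _ in range(5000):
    move_a, move_b = a.choose(state_a), b.choose(state_b)
    a.update(state_a, move_a, PAYOFF[(move_a, move_b)], move_b)
    b.update(state_b, move_b, PAYOFF[(move_b, move_a)], move_a)
    state_a, state_b = move_b, move_a
```

In the spatial variant the paper studies, each agent would instead play its grid neighbors, and coalition leaders would choose actions on behalf of their members; this sketch only shows the underlying learning rule.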
References
- Aunger, R. (2000). Darwinizing Culture: The Status of Memetics as a Science. Oxford University Press.
- Axelrod, R. (1995). A Model of the Emergence of New Political Actors. In Artificial Societies: The Computer Simulation of Social Life. London: University College Press.
- Burguillo-Rial, J. C. (2009). A memetic framework for describing and simulating spatial prisoner's dilemma with coalition formation. In The 8th International Conference on Autonomous Agents and Multiagent Systems, Budapest, Hungary.
- Gotts, N. M., Polhill, J. G., and Law, A. N. R. (2003). Agent-based simulation in the study of social dilemmas. Artif. Intell. Rev., 19(1):3-92.
- Griffiths, N. (2008). Tags and image scoring for robust cooperation. In Padgham, Parkes, Müller, and Parsons, editors, Proc. of 7th Int. Conf. on Autonomous Agents and Multiagent Systems (AAMAS 2008), pages 575-582, Estoril, Portugal.
- Hardin, G. (1968). The tragedy of the commons. Science, 162:1243-1248.
- Leng, J., Sathyaraj, B., and Jain, L. (2008). Temporal difference learning and simulated annealing for optimal control: A case study. In Agent and Multi-Agent Systems: Technologies and Applications, volume 4953, pages 495-504.
- Nowak, M. A. and Sigmund, K. (1998). Evolution of indirect reciprocity by image scoring. Nature, 393:573-577.
- Riolo, R. L., Cohen, M. D., and Axelrod, R. (2001). Evolution of cooperation without reciprocity. Nature, 414:441-443.
- Maynard Smith, J. and Price, G. R. (1973). The logic of animal conflict. Nature, 246:15-18.
- Watkins, C. and Dayan, P. (1992). Technical note: Q-learning. Machine Learning, 8(3-4):279-292.
Paper Citation
in Harvard Style
Weerakoon U. (2015). Coalition Formation for Simulating and Analyzing Iterative Prisoner’s Dilemma. In Proceedings of the International Conference on Agents and Artificial Intelligence - Volume 1: ICAART, ISBN 978-989-758-073-4, pages 22-31. DOI: 10.5220/0005199000220031
in Bibtex Style
@conference{icaart15,
author={Udara Weerakoon},
title={Coalition Formation for Simulating and Analyzing Iterative Prisoner’s Dilemma},
booktitle={Proceedings of the International Conference on Agents and Artificial Intelligence - Volume 1: ICAART},
year={2015},
pages={22-31},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005199000220031},
isbn={978-989-758-073-4},
}
in EndNote Style
TY - CONF
JO - Proceedings of the International Conference on Agents and Artificial Intelligence - Volume 1: ICAART
TI - Coalition Formation for Simulating and Analyzing Iterative Prisoner’s Dilemma
SN - 978-989-758-073-4
AU - Weerakoon U.
PY - 2015
SP - 22
EP - 31
DO - 10.5220/0005199000220031