Decision Making from Confidence Measurement on the Reward Growth using Supervised Learning - A Study Intended for Large-scale Video Games

D. Taralla, Z. Qiu, A. Sutera, R. Fonteneau, D. Ernst

2016

Abstract

Video games have become more and more complex over the past decades. Today, players wander in visually- and option-rich environments, and each choice they make, at any given time, can have a combinatorial number of consequences. However, the artificial intelligence of modern video games is still usually hard-coded, and as game environments become increasingly complex, this hard-coding becomes exponentially difficult. Recent research has started to let autonomous video game agents learn instead of being taught, which makes them more intelligent. This contribution falls under this very perspective, as it aims to develop a framework for the generic design of autonomous agents for large-scale video games. We consider a class of games for which expert knowledge is available to define a state quality function that indicates how close an agent is to its objective. The decision-making policy is based on a confidence measurement on the growth of the state quality function, computed by a supervised learning classification model. Additionally, no stratagems aiming to reduce the action space are used. As a proof of concept, we tested this simple approach on the collectible card game Hearthstone and obtained encouraging results.
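The policy sketched in the abstract can be read as a simple recipe: label every observed (state, action) transition by whether the state quality function grew, train a classifier on those labels, and at play time select the action the classifier is most confident will make quality grow. The Python sketch below illustrates this reading; ExtraTreesClassifier merely stands in for the unspecified supervised model, and state_quality, encode and episodes are hypothetical placeholders, not the authors' implementation.

# Minimal sketch (hypothetical names) of the decision rule described above.
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier

def build_training_set(episodes, state_quality, encode):
    # Label each transition 1 if the action made the state quality grow.
    X, y = [], []
    for state, action, next_state in episodes:
        X.append(encode(state, action))
        y.append(int(state_quality(next_state) > state_quality(state)))
    return np.array(X), np.array(y)

def make_policy(classifier, encode):
    # The policy plays the action with the highest predicted probability
    # of quality growth: the "confidence measurement" of the title.
    def policy(state, legal_actions):
        features = np.array([encode(state, a) for a in legal_actions])
        confidence = classifier.predict_proba(features)[:, 1]
        return legal_actions[int(np.argmax(confidence))]
    return policy

# Usage sketch:
#   X, y = build_training_set(episodes, state_quality, encode)
#   clf = ExtraTreesClassifier(n_estimators=100).fit(X, y)
#   act = make_policy(clf, encode)

Scoring and ranking every legal action this way is consistent with the abstract's claim that no action-space-reducing stratagem is needed: the classifier simply evaluates all legal actions and the agent picks the most promising one.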



Paper Citation


in Harvard Style

Taralla D., Qiu Z., Sutera A., Fonteneau R. and Ernst D. (2016). Decision Making from Confidence Measurement on the Reward Growth using Supervised Learning - A Study Intended for Large-scale Video Games. In Proceedings of the 8th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART, ISBN 978-989-758-172-4, pages 264-271. DOI: 10.5220/0005666202640271


in Bibtex Style

@conference{icaart16,
author={D. Taralla and Z. Qiu and A. Sutera and R. Fonteneau and D. Ernst},
title={Decision Making from Confidence Measurement on the Reward Growth using Supervised Learning - A Study Intended for Large-scale Video Games},
booktitle={Proceedings of the 8th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART},
year={2016},
pages={264-271},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005666202640271},
isbn={978-989-758-172-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 8th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART
TI - Decision Making from Confidence Measurement on the Reward Growth using Supervised Learning - A Study Intended for Large-scale Video Games
SN - 978-989-758-172-4
AU - Taralla D.
AU - Qiu Z.
AU - Sutera A.
AU - Fonteneau R.
AU - Ernst D.
PY - 2016
SP - 264
EP - 271
DO - 10.5220/0005666202640271
ER -