Why you should Empirically Evaluate your AI Tool - From SPOSH to yaPOSH

Jakub Gemrot; Martin Černý; Cyril Brom

doi:10.5220/0004818604610468

2014

Abstract

The autonomous agents community has been developing specific agent-oriented programming languages for more than two decades. Academia has considered some of these languages as possible tools for developing artificial intelligence (AI) for non-player characters in computer games. However, because most research on new AI languages within the agent community does not reach production quality, the resulting languages are seldom adopted by the games industry. As our experience has shown, it is not only the language itself that matters. The toolchain supporting the language and its integration (or lack thereof) with a development environment can make or break the success of the language in practical applications. In this paper, we describe our methodology for evaluating AI languages and associated tools in practice, based on controlled experiments with programmers and/or game designers. The methodology is demonstrated on our development and evaluation of the SPOSH and yaPOSH high-level agent behavior languages. We show that incomplete development support may prevent a tool from giving any benefit to developers at all. We also present our experience with transferring knowledge gained during yaPOSH development to actual AI design for an upcoming AAA game.

Paper Citation

in Harvard Style

Gemrot J., Černý M. and Brom C. (2014). Why you should Empirically Evaluate your AI Tool - From SPOSH to yaPOSH. In Proceedings of the 6th International Conference on Agents and Artificial Intelligence - Volume 1: ICAART, ISBN 978-989-758-015-4, pages 461-468. DOI: 10.5220/0004818604610468

in Bibtex Style

@conference{icaart14,
author={Jakub Gemrot and Martin Černý and Cyril Brom},
title={Why you should Empirically Evaluate your AI Tool - From SPOSH to yaPOSH},
booktitle={Proceedings of the 6th International Conference on Agents and Artificial Intelligence - Volume 1: ICAART},
year={2014},
pages={461-468},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004818604610468},
isbn={978-989-758-015-4},
}

in EndNote Style

TY - CONF
JO - Proceedings of the 6th International Conference on Agents and Artificial Intelligence - Volume 1: ICAART
TI - Why you should Empirically Evaluate your AI Tool - From SPOSH to yaPOSH
SN - 978-989-758-015-4
AU - Gemrot J.
AU - Černý M.
AU - Brom C.
PY - 2014
SP - 461
EP - 468
DO - 10.5220/0004818604610468