In the near future, AI agents will produce results far better than humans can in almost every respect. We aim to leverage this phenomenon by using ‘machines’ to judge human decisions in our model. The same procedures can also be employed to model a decision-maker, by tuning the agent down to match the decision-maker’s native characteristics. Numerous aspects, such as the speed-accuracy trade-off, the effect of procrastination, and the impact of time pressure, can also be analyzed, and their effect on decision-makers’ performance can be tested. Other fields where this model can be applied include, but are not limited to, economics, psychology, test-taking, sports, stock-market trading, and software benchmarking.
We wish to devise a tool that measures the quality of human decisions from performance data. We hope this tool can serve applications ranging from personnel assessment to cheating detection. Though we have concentrated on the chess domain, which is a constrained environment, we wish to apply what we learn there to adapt the model to other domains, from test-taking to stock-market trading.
8 STAGE OF THE RESEARCH
Research on judging decisions made by fallible (human) agents is not as advanced as research on finding optimal decisions, or on the supervision of AI agents’ decisions by humans. Human decisions are often influenced by factors such as risk, uncertainty, time pressure, and depth of cognitive capability, whereas decisions by an AI agent can be effectively optimal, free of these limitations. The concept of ‘depth’, a well-defined term in game theory (including chess), does not have a clear formulation in decision theory. To quantify ‘depth’ in decision theory, we can configure an AI agent of supreme competence to ‘think’ at depths beyond the capability of any human, and in the process collect evaluations of decisions at various depths. One research goal is to create an intrinsic measure of the depth of thinking required to answer certain test questions, toward a reliable means of assessing their difficulty apart from item-response statistics.
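To make this data-collection step concrete, the following is a minimal sketch of gathering evaluations over a range of search depths. It assumes the python-chess library and a locally installed UCI engine; the engine path "stockfish" and the depth cap of 20 are illustrative assumptions rather than fixed choices of our method.

import chess
import chess.engine

def evaluations_by_depth(fen, engine_path="stockfish", max_depth=20):
    """Return {depth: centipawn score, from the side to move} for `fen`."""
    board = chess.Board(fen)
    scores = {}
    with chess.engine.SimpleEngine.popen_uci(engine_path) as engine:
        for depth in range(1, max_depth + 1):
            info = engine.analyse(board, chess.engine.Limit(depth=depth))
            # Mate announcements are mapped to a large finite value.
            scores[depth] = info["score"].relative.score(mate_score=10000)
    return scores

if __name__ == "__main__":
    print(evaluations_by_depth(chess.STARTING_FEN))

Each position thus yields a profile of evaluations indexed by depth, which is the raw material for the depth, complexity, and difficulty measures discussed below.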
Currently, we are working on relating the depth of human cognition to the depth of search through alternatives, and on using this information to infer the quality of the decisions made, so as to judge the decision-maker from his or her decisions. Our research extends the model of Regan and Haworth to quantify depth, plus related measures of complexity and difficulty, in the context of chess. We use large data sets from real chess tournaments and evaluations by chess programs (AI agents) whose strength exceeds that of all human players. We then seek to transfer the results to other decision-making fields in which effectively optimal judgments can be obtained from hindsight, answer banks, or powerful AI agents. In some applications, such as multiple-choice tests, we establish an isomorphism of the underlying mathematical quantities, which induces a correspondence between various measurement theories and the chess model.
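As a simplified illustration of the kind of correspondence we have in mind, the Rasch model of item-response theory (Andrich, 1988) gives the probability that an examinee of ability \theta answers an item of difficulty b correctly as

\Pr(X = 1 \mid \theta, b) = \frac{e^{\theta - b}}{1 + e^{\theta - b}},

so that \theta and b play roles analogous to a chess player’s rating and the difficulty of finding the best move in a given position. This equation is shown only to fix ideas; the precise quantities identified by the isomorphism are developed in (Biswas and Regan, 2015).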
We provide results toward the objective of applying the correspondence in reverse, so as to obtain and quantify measures of depth and difficulty for multiple-choice tests, stock-market trading, and other real-world applications, and of utilizing this knowledge to design intelligent, automated systems that judge the quality of human or artificial agents.
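To indicate the shape of this inference, the following is a simplified sketch that fits a one-parameter softmax choice model to game records annotated with engine evaluations. The stand-in model, its data format, and the parameter name beta are our own illustrative assumptions, not the published Regan–Haworth formulation.

import math

def move_probabilities(evals, beta):
    """Softmax over move evaluations (in pawns, from the mover's view):
    beta = 0 gives uniform choice; large beta concentrates on the best move."""
    m = max(evals)
    weights = [math.exp(beta * (e - m)) for e in evals]
    total = sum(weights)
    return [w / total for w in weights]

def log_likelihood(positions, beta):
    """positions: list of (evals of each legal move, index of played move)."""
    return sum(math.log(move_probabilities(evals, beta)[played])
               for evals, played in positions)

def fit_beta(positions, lo=0.0, hi=50.0, iters=60):
    """Maximum-likelihood beta via golden-section search; the softmax
    log-likelihood is concave in beta, so a unimodal search suffices."""
    phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    for _ in range(iters):
        c, d = b - phi * (b - a), a + phi * (b - a)
        if log_likelihood(positions, c) < log_likelihood(positions, d):
            a = c
        else:
            b = d
    return (a + b) / 2

The fitted parameter acts as a crude intrinsic quality score, computed from the decisions themselves rather than from game outcomes; the full model refines this with the depth, complexity, and difficulty measures described above.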
REFERENCES
Allis, L. V. (1994). Searching for solutions in games and artificial intelligence. PhD thesis, Rijksuniversiteit Maastricht, Maastricht, the Netherlands.
Andersen, E. (1973). Conditional inference for multiple-choice questionnaires. Brit. J. Math. Stat. Psych., 26:31–44.
Andrich, D. (1978). A rating scale formulation for ordered response categories. Psychometrika, 43:561–573.
Andrich, D. (1988). Rasch Models for Measurement. Sage Publications, Beverly Hills, California.
Baker, F. (2004). Item Response Theory: Parameter Estimation Techniques. Marcel Dekker, New York.
Baker, F. B. (2001). The Basics of Item Response Theory. ERIC Clearinghouse on Assessment and Evaluation.
Biswas, T. and Regan, K. (2015). Quantifying depth and complexity of thinking and knowledge. In Proceedings, International Conference on Agents and Artificial Intelligence (ICAART).
Busemeyer, J. R. and Townsend, J. T. (1993). Decision field theory: a dynamic-cognitive approach to decision making in an uncertain environment. Psychological Review, 100(3):432.
Chabris, C. and Hearst, E. (2003). Visualization, pattern recognition, and forward search: Effects of playing speed and sight of the position on grandmaster chess errors. Cognitive Science, 27:637–648.
DiFatta, G., Haworth, G., and Regan, K. (2009). Skill rating by Bayesian inference. In Proceedings, 2009 IEEE Symposium on Computational Intelligence and Data Mining (CIDM’09), Nashville, TN, March 30–April 2, 2009, pages 89–94.
Fox, C. R. (1999). Strength of evidence, judged probability, and choice under uncertainty. Cognitive Psychology, 38(1):167–189.
Fox, C. R. and Tversky, A. (1998). A belief-based account of decision under uncertainty. Management Science, 44(7):879–895.