
Table 2: Estimated strength of imaginary Pokémon or real Pokémon not present in the dataset. The last one is the Pokémon
most frequently used in the official online single battles of the latest season at the time of the experiment, yielding a higher
rating than others with the same or higher total stat.
HP    Attack  Defence  Sp. Atk  Sp. Def  Speed  Type            Output of E
70    70      70       70       70       70     Steel           6.15
80    80      80       80       80       80     Steel           6.99
90    90      90       90       90       90     Steel           7.83
100   100     100      100      100      100    Steel           8.67
100   100     100      100      100      100    Fairy           8.65
100   100     100      100      100      100    Normal          8.58
100   100     100      100      100      100    Ice             8.54
100   100     100      100      100      100    Bug             8.52
100   134     110      70       84       72     Rock, Electric  6.28
55    55      55       135      135      135    Fairy, Ghost    9.84
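As a reading aid for Table 2, the following minimal sketch checks how the output of E changes across the uniform Steel-type rows and converts a rating gap into a win probability. The conversion is an assumption made only for illustration: it treats the outputs as log-strengths in the classical Bradley-Terry model, which this table alone does not confirm. The function name `bt_win_probability` is hypothetical.

```python
import math

# Rows copied from Table 2: uniform base stat -> output of E (Steel type).
steel_uniform = {70: 6.15, 80: 6.99, 90: 7.83, 100: 8.67}

stats = sorted(steel_uniform)
# Change in the output of E for each 10-point increase in every stat.
steps = [
    round(steel_uniform[b] - steel_uniform[a], 2)
    for a, b in zip(stats, stats[1:])
]


def bt_win_probability(rating_a: float, rating_b: float) -> float:
    """Bradley-Terry probability that A beats B.

    ASSUMPTION: ratings are log-strengths, so the odds are exp(a - b).
    """
    return 1.0 / (1.0 + math.exp(rating_b - rating_a))


# Last two rows of Table 2: Fairy/Ghost (9.84) against Rock/Electric (6.28).
p_last_vs_previous = bt_win_probability(9.84, 6.28)

print(steps)
print(round(p_last_vs_previous, 3))
```

The step between consecutive uniform rows is the same in the tabulated values, so within this range the estimate grows linearly with the uniform stat. Under the stated assumption, the 3.56-point gap between the last two rows would correspond to a strongly one-sided matchup.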
into neural network structures. Our method successfully
quantified the desired properties in both symmetric
and asymmetric experimental settings.
Our framework provides new ground for data
mining and offers an alternative dataset format,
especially in environments where comparison
data is structurally easier to collect. On online
platforms where people choose and click on items, those
choices can simply be recorded as training data. The
same holds for online multiplayer games, which
generate many match results between decks or
parties. Even when a survey is required, gathering
comparison data can be a better option than collecting graded
human scores, since comparisons are more precise
and reliable, and our rating based on them provides
strong insight into the outcome of comparisons
between items. We expect many such applications to be
carried out in the future.
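To illustrate how recorded choices can serve directly as training data, the sketch below fits classical Bradley-Terry strengths to a small log of pairwise choices using the MM iteration of Hunter (2004). This is the classical item-wise model, not the neural estimator proposed in this paper; the item names, the toy click log, and the function name `fit_bradley_terry` are hypothetical.

```python
from collections import Counter


def fit_bradley_terry(comparisons, n_iter=200):
    """Fit Bradley-Terry strengths with the MM update.

    comparisons: list of (chosen, rejected) pairs.
    Returns a dict mapping item -> strength, normalised to sum to 1.
    Assumes every item is chosen at least once, so no strength collapses to 0.
    """
    items = sorted({x for pair in comparisons for x in pair})
    wins = Counter(chosen for chosen, _ in comparisons)
    # Number of comparisons per unordered pair of items.
    pair_counts = Counter(frozenset(pair) for pair in comparisons)

    strength = {i: 1.0 / len(items) for i in items}
    for _ in range(n_iter):
        updated = {}
        for i in items:
            denom = sum(
                pair_counts.get(frozenset((i, j)), 0) / (strength[i] + strength[j])
                for j in items
                if j != i
            )
            updated[i] = wins[i] / denom if denom > 0 else strength[i]
        total = sum(updated.values())
        strength = {i: s / total for i, s in updated.items()}
    return strength


# Hypothetical click log: (chosen item, item shown alongside but not chosen).
click_log = (
    [("A", "B")] * 3 + [("B", "A")] * 1
    + [("B", "C")] * 3 + [("C", "B")] * 1
    + [("A", "C")] * 2 + [("C", "A")] * 1
)

strengths = fit_bradley_terry(click_log)
print(strengths)
```

Each log entry needs only the chosen and the rejected item, which is why this format is cheap to collect. A feature-based estimator would replace the per-item strengths with a function of item attributes.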
ACKNOWLEDGEMENTS
I would like to thank Professor Hideki Tsuiki for
meaningful discussions. I am also grateful to the
ICAART referees for useful comments.
REFERENCES
Bradley, R. A. and Terry, M. E. (1952). Rank analysis of
incomplete block designs: I. The method of paired
comparisons. Biometrika, 39(3/4):324–345.
Elo, A. E. and Sloan, S. (1978). The rating of chessplayers:
Past and present.
Ford, L. R. J. (1957). Solution of a ranking problem
from binary comparisons. The American Mathematical
Monthly, 64(8P2):28–33.
He, K., Zhang, X., Ren, S., and Sun, J. (2016). Deep residual
learning for image recognition. In Proceedings of
the IEEE Conference on Computer Vision and Pattern
Recognition, pages 770–778.
Herbrich, R., Minka, T., and Graepel, T. (2006).
TrueSkill™: A Bayesian skill rating system. Advances
in Neural Information Processing Systems, 19.
Hinton, G. E. and Salakhutdinov, R. R. (2006). Reducing
the dimensionality of data with neural networks.
Science, 313(5786):504–507.
Hunter, D. R. (2004). MM algorithms for generalized
Bradley-Terry models. The Annals of Statistics,
32(1):384–406.
Inan, H., Khosravi, K., and Socher, R. (2016). Tying word
vectors and word classifiers: A loss framework for
language modeling. arXiv preprint arXiv:1611.01462.
Jalili, M., Ahmadian, S., Izadi, M., Moradi, P., and Salehi,
M. (2018). Evaluating collaborative filtering recommender
algorithms: A survey. IEEE Access, 6:74003–74024.
Kingma, D. P. and Ba, J. (2014). Adam: A
method for stochastic optimization. arXiv preprint
arXiv:1412.6980.
LeCun, Y., Bottou, L., Bengio, Y., and Haffner, P. (1998).
Gradient-based learning applied to document recognition.
Proceedings of the IEEE, 86(11):2278–2324.
Li, S., Ma, H., and Hu, X. (2021). Neural image beauty
predictor based on Bradley-Terry model. arXiv preprint
arXiv:2111.10127.
Lundberg, S. M. and Lee, S.-I. (2017). A unified approach
to interpreting model predictions. Advances in Neural
Information Processing Systems, 30.
Maystre, L. (2015). choix.
https://choix.lum.li/en/latest/index.html.
Menke, J. E. and Martinez, T. R. (2008). A Bradley–Terry
artificial neural network model for individual ratings
in group competitions. Neural Computing and Applications,
17:175–186.
T7 (2017). Pokemon- Weedle's Cave.
https://www.kaggle.com/datasets/terminus7/pokemon-challenge.
Xi, W.-D., Huang, L., Wang, C.-D., Zheng, Y.-Y., and Lai,
J.-H. (2022). Deep rating and review neural network
ICAART 2024 - 16th International Conference on Agents and Artificial Intelligence