
8 CONCLUSION AND FUTURE WORK
We introduced and empirically validated model-agnostic metrics for evaluating black-box classification algorithms: algorithmic bias, entropic expressivity, and algorithmic capacity. These information-theoretic metrics provide interpretable insights into model behavior. Moving forward, we hope to explore how these metrics behave on non-static data and on data of varying entropy. Because the methods rely on bootstrapping and retraining, we must also test these metric estimations on larger and more complex algorithms and verify their practical applicability within modern machine learning ecosystems.
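The bootstrap-and-retrain estimation scheme the paragraph above refers to can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the toy majority-class learner, and the choice of measuring the Shannon entropy of the induced labeling distribution on a fixed holdout set are all illustrative assumptions.

```python
import math
import random
from collections import Counter

def entropy(dist):
    """Shannon entropy (in bits) of a probability distribution."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

def bootstrap_label_entropy(train, holdout, fit, n_boot=200, seed=0):
    """Estimate the entropy of the labeling distribution a learner
    induces on a fixed holdout set, by retraining on bootstrap
    resamples of the training data.

    `fit(sample)` must return a predictor mapping a point to a label.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_boot):
        # Bootstrap resample: draw len(train) points with replacement.
        sample = [rng.choice(train) for _ in train]
        predict = fit(sample)
        # Record the full labeling the retrained model assigns.
        labeling = tuple(predict(x) for x, _ in holdout)
        counts[labeling] += 1
    total = sum(counts.values())
    return entropy([c / total for c in counts.values()])

# Toy learner: always predicts the majority class of its sample.
def fit_majority(sample):
    majority = Counter(y for _, y in sample).most_common(1)[0][0]
    return lambda x: majority

train = [(i, i % 2) for i in range(10)]         # balanced binary labels
holdout = [(i, i % 2) for i in range(10, 14)]
H = bootstrap_label_entropy(train, holdout, fit_majority)
# The majority learner can induce at most two labelings here
# (all-0 or all-1), so H lies between 0 and 1 bit.
```

The cost concern raised above is visible even in this toy: each bootstrap iteration retrains the model from scratch, so the estimate scales linearly with both the number of resamples and the cost of a single training run.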
Model Characterization with Inductive Orientation Vectors