Model Characterization with Inductive Orientation Vectors
Kerria Pang-Naylor^{1,a}, Eric Chen^{1,2,b} and George D. Montañez^{1,c}
^1 AMISTAD Lab, Dept. of Computer Science, Harvey Mudd College, Claremont, CA, U.S.A.
^2 Department of Computer Science, Stanford University, Stanford, CA, U.S.A.
{kpangnaylor, gmontanez}@hmc.edu, erchen22@stanford.edu
^a https://orcid.org/0009-0007-3329-5211, ^b https://orcid.org/0000-0002-0469-3858, ^c https://orcid.org/0000-0002-1333-4611
Keywords: Inductive Orientation Vector, Inductive Bias, Algorithmic Bias, Algorithmic Capacity, Entropic Expressivity, Algorithmic Search Framework, Interpretable AI, Labeling Distribution Matrix.
Abstract:
As models rise in complexity, black-box evaluation and interpretation methods become critical. We introduce
estimation methods for characterizing model-theoretic quantities such as algorithm flexibility, responsiveness
to changes in training data, and ability to specialize. These methods are applicable to any black-box clas-
sification algorithm. Past theoretical work has shown how such qualities affect probability of task success,
generalization, and tendency to overfit. We perform metric estimations of interpretable models across hy-
perparameters and corroborate the metrics’ behavior with known algorithm heuristics. This work presents a
general model-agnostic interpretability tool.
1 INTRODUCTION
Machine learning practitioners face seemingly end-
less choices of models and hyperparameters. With
this, scalable methods to evaluate and interpret al-
gorithms are critical. Model-agnostic techniques, i.e., methods that approach models as black-box functions, provide flexibility crucial for describing highly complex algorithms (e.g., deep neural networks) and for straightforward model comparison (Ribeiro et al., 2016a).
The inductive orientation vector offers one such
black-box evaluation and interpretation technique. As a vectorized representation of a trained model's inductive bias (Mitchell, 1980), it allows one to easily compare black-box algorithms and identify model relationships (Bekerman et al., 2022). Grounded in the
algorithmic search framework (Montanez, 2017a), the
inductive orientation vector can be used to calcu-
late interpretable model characteristics, namely, en-
tropic expressivity, algorithmic capacity, and algorith-
mic bias (Bekerman et al., 2022). These metrics de-
scribe, respectively, an algorithm’s flexibility, respon-
siveness to changes in training data, and ability to spe-
cialize (Bashir et al., 2020; Lauw et al., 2019). Un-
like established model-agnostic evaluation and inter-
pretability methods, the inductive orientation vector produces understandable model-theoretic metrics that generalize to a trained model's overall behavior.
Past work formalized the inductive orientation
vector and analyzed common algorithms’ relation-
ships based on pairwise vector distances (Bekerman
et al., 2022). However, the inductive orientation vec-
tor’s potential use as a model evaluation and charac-
terization method remains unexplored.
We present empirical estimations and analyses of interpretable model characteristics (algorithmic bias, algorithmic capacity, and entropic expressivity) through the inductive orientation vector. This method
may be applied to any black-box classifier, i.e., met-
rics are estimated given only input and output data.
We ground this method by corroborating the results
of interpretable classification models like decision
trees or k-nearest neighbors with known, algorithm-
specific theoretical characteristics (Section 4). Exper-
iments over a range of algorithms and datasets also
confirm trade-off bounds between entropic expressiv-
ity and algorithmic bias (Lauw et al., 2019; Bashir
et al., 2020) that have only been shown theoretically
(Section 5). Our work presents and verifies a new
method of model-agnostic characterization.
2 BACKGROUND
2.1 Algorithmic Search Framework
The algorithmic search framework (ASF) provides
the theoretical foundation of the inductive orientation
vector and consequent model characterizations (Mon-
tanez, 2017b). The ASF is a formalization of search through a three-tuple, (Ω, T, F): the search space, target set, and external information resource. We reduce the ASF to classification inference on n data points, which we refer to as the holdout set H. Given all possible labelings of the n data points, a black-box classification algorithm A "searches" for labelings with high accuracy (i.e., labelings close to the n elements' true labels). Formally, suppose we classify n data points with c categories. Then, the search process (Ω, T, F) is defined as follows.
1. Search space (Ω) contains all possible c^n labelings of the holdout set. For example, if c = 2 and n = 5, Ω contains elements (0, 0, 0, 0, 1), (0, 0, 1, 0, 1), and so on.
2. Target set (T) is a subset of Ω containing labelings with accuracies above some minimum threshold q_min (for example, 80%). We may encode this as a target function t, a |T|-hot binary encoding vector of length |Ω| where each index indicates an element's inclusion in the target set T (see the sketch after this list).
3. External information resource (F) represents information used by the algorithm to guide its search. In our problem, F embeds the training data the model receives, sampled from some distribution D, along with its loss or fitness function.
Figure 1: ASF process (Montanez, 2017a): the black-box algorithm consults its history of queried points ω and evaluations F(ω) to choose the next point in Ω at each time step i.
Over iterations of the search, the algorithm consults external resource F and its search history H̃ to assign a probability mass function P_i over the search space, rating each element's likelihood of belonging to target set T (Figure 1). "Success" is defined by finding at least one element of T during the search. By the end of the search, a probability distribution sequence P̃ is produced (Bekerman et al., 2022). Normalizing across all steps (given constant resource F), we denote the averaged probability distribution induced on Ω as P_F (Bekerman et al., 2022), where

$$P_F := \mathbb{E}_{\tilde{P},\tilde{H}}\left[\frac{1}{|\tilde{P}|}\sum_{i=1}^{|\tilde{P}|} P_i \;\middle|\; F\right]. \quad (1)$$
2.2 Inductive Orientation Vector
Provided with the same external information, learning
algorithms are not guaranteed to generate the same
probability distribution over the search space; dif-
ferent learning architectures achieve different losses
when trained on the same data. These differences
can be attributed to an algorithm’s innate character-
istics known as its inductive bias (Mitchell, 1980).
Any black-box evaluation of an algorithm’s induc-
tive bias requires that bias is estimated with respect
to some generator of training data, D. Otherwise, al-
gorithm behavior cannot be observed. Shared model
behavior across various data-generating distributions
D suggests algorithm characteristics that are indepen-
dent of training data or its inductive bias. We estimate
the behavior on each data distribution using an induc-
tive orientation vector, P
D
, which can be thought of
as an expectation of algorithm behavior over different
training datasets F D.
P
D
:= E
D
P
F
= E
D
"
E
˜
P,
˜
H
"
1
|
˜
P|
˜
P
i=1
P
i
F
##
. (2)
The inductive orientation vector is a useful proxy
for inductive bias when comparing several algorithms
on a fixed data source D. Experiments by Beker-
man et al. (2022) have shown that inductive orienta-
tion vectors confirm known relationships between al-
gorithms’ inductive biases. The inductive orientation
vector can also be used to calculate the three model-
theoretic metrics: algorithmic bias, entropic expres-
sivity, and algorithmic capacity. This use is the sub-
ject of our work.
2.2.1 Algorithmic Bias
Algorithmic bias quantifies how much an algorithm
deviates in performance from that of uniform random
sampling.
Definition 1 (Algorithmic Bias, Montañez et al. (2021)). Let D be a distribution over a space of information resources F and let F ∼ D. For a given D and a fixed k-hot target function t,

$$\mathrm{Bias}(D,t) = t^{\top} P_D - \frac{\|t\|^2}{|\Omega|}. \quad (3)$$
Recall that P_D is an averaged probability distribution across Ω whose probability mass indicates an element's expected likelihood of belonging to the target set. Letting t be a |T|-hot vector representation of target set T, the inner product t^⊤ P_D is equivalent to the sum of the probability mass P_D places on elements of the target set. Thus, t^⊤ P_D is the algorithm's expected probability of success. We then subtract the probability of success under uniform random sampling, which is simply |T|/|Ω| = ‖t‖²/|Ω|.
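As a minimal sketch of Equation (3), the snippet below computes algorithmic bias from an inductive orientation vector and a |T|-hot target vector; both arrays are synthetic stand-ins, not estimates from the paper's experiments.

```python
# Minimal sketch of Equation (3): algorithmic bias from an estimated
# inductive orientation vector P_D and a |T|-hot target vector t.
import numpy as np

omega_size = 32                           # |Omega| = 2^5 for a holdout set of 5
rng = np.random.default_rng(0)

P_D = rng.dirichlet(np.ones(omega_size))  # stand-in inductive orientation vector
t = np.zeros(omega_size)
t[:6] = 1                                 # stand-in target set of size |T| = 6

def algorithmic_bias(P_D, t):
    # Expected probability of success minus the uniform-sampling baseline.
    return float(t @ P_D - (np.linalg.norm(t) ** 2) / len(P_D))

print(algorithmic_bias(P_D, t))
```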
An algorithm without algorithmic bias cannot
generalize beyond training data and will behave like
random uniform sampling (Mitchell, 1980; Montañez et al., 2019). Mathematically, algorithmic bias is
high when the algorithm’s inductive orientation vec-
tor points towards the target function, resulting in a
greater than uniform probability of success. There-
fore, algorithmic bias captures whether the algo-
rithm’s assumptions are biased toward or against the
task at hand.
2.2.2 Entropic Expressivity
The inductive orientation vector also determines the entropic expressivity of an algorithm. The entropic expressivity measures an algorithm's ability to distribute its probability mass over the search space (Lauw et al., 2019). Since the inductive orientation vector represents the expected probability distribution over the search space relative to D, its Shannon entropy H(P_D) serves as a measure of the spread of the algorithm's probability mass.
Definition 2 (Entropic Expressivity, Montañez et al. (2021)).

$$H(P_D) = H\left(\mathbb{E}_{D}[P_F]\right) = H(U) - D_{\mathrm{KL}}\left(P_D \,\|\, U\right) \quad (4)$$

where D_KL(P_D ‖ U) is the Kullback-Leibler divergence between the inductive orientation vector P_D and the uniform distribution U over Ω.
The spread of probability mass on an output space
relative to a data distribution could either be due to
an algorithm’s intrinsic randomness or its nonrandom
response to data. Due to this ambiguity, entropic ex-
pressivity is often difficult to interpret in practice.
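The following sketch, using a synthetic stand-in for P_D, computes entropic expressivity both directly as the Shannon entropy of P_D and via the H(U) − D_KL(P_D ‖ U) form of Equation (4); the two agree up to floating-point error.

```python
# Minimal sketch of Equation (4): entropic expressivity as the Shannon
# entropy of P_D, equivalently H(U) minus the KL divergence from uniform.
import numpy as np
from scipy.stats import entropy

omega_size = 32
rng = np.random.default_rng(1)
P_D = rng.dirichlet(np.ones(omega_size))   # stand-in orientation vector
U = np.full(omega_size, 1.0 / omega_size)  # uniform distribution over Omega

H_PD = entropy(P_D, base=2)                                   # direct H(P_D)
H_U_minus_KL = entropy(U, base=2) - entropy(P_D, U, base=2)   # H(U) - D_KL(P_D || U)

print(H_PD, H_U_minus_KL)  # the two quantities agree up to floating point
```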
2.2.3 Algorithmic Capacity
Algorithmic capacity is defined as the maximum mu-
tual information between the algorithm and data dis-
tribution D (Bashir et al., 2020). Also known as dis-
tributional algorithmic capacity, what we call algo-
rithmic capacity is conditioned on a specific data dis-
tribution. True algorithm capacity, or an algorithm’s
general ability to learn, is the algorithm’s theoretical
supremum of algorithmic capacity over all possible
data-generating distributions (Bashir et al., 2020).
Definition 3 (Distributional Algorithmic Capacity, Bashir et al. (2020)). For a fixed distribution D, the algorithm capacity specific to that distribution is represented by

$$C_{A,D} = H(P_D) - \mathbb{E}_{D}\left[H(P_F)\right].$$
The first term, H(P_D), represents the spread of the overall probability distribution in expectation, namely, the entropic expressivity. It measures the "flatness" of distribution P_D, which can result either from averaging flat P_F distributions or from averaging together many "sharp" distributions P_F that place mass on different parts of Ω (Bashir et al., 2020). The second term, E_D[H(P_F)], measures the expected flatness for a given information resource F, i.e., an algorithm's innate stochasticity when training on the same data F. By subtracting away the algorithm's intrinsic randomness, C_{A,D} isolates the algorithm's nonrandom response to data.
For a deterministic algorithm, retraining on the same data will always produce the same model parameters, making each distribution vector P_F place all its probability mass on a single outcome. This results in E_D[H(P_F)] = E_D[0] = 0 and causes algorithmic capacity to equal entropic expressivity.
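A minimal sketch of Definition 3, assuming the per-subset vectors P_{F_k} are available as columns of a labeling distribution matrix (here filled with synthetic values): capacity is the entropy of the averaged vector minus the average per-column entropy.

```python
# Sketch of Definition 3: distributional algorithmic capacity from a
# labeling distribution matrix whose columns are the per-subset vectors P_{F_k}.
import numpy as np
from scipy.stats import entropy

rng = np.random.default_rng(2)
K, omega_size = 10, 32
LDM = rng.dirichlet(np.ones(omega_size), size=K).T  # synthetic matrix, shape (|Omega|, K)

P_D = LDM.mean(axis=1)                              # averaged orientation vector
expressivity = entropy(P_D, base=2)                 # H(P_D)
expected_entropy = np.mean([entropy(LDM[:, k], base=2) for k in range(K)])  # E_D[H(P_F)]

capacity = expressivity - expected_entropy          # C_{A,D}
print(capacity)
```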
3 METHODS
3.1 Estimations of Inductive
Orientation Vectors
All explored metrics require precise estimation of the
inductive orientation vector. We adopt the methodol-
ogy proposed by Bekerman et al. to estimate an ex-
pected inductive orientation vector (Bekerman et al.,
2022). Full details of the procedure and its theoretical
justification can be found in their work, but we will
briefly summarize the key steps.
We assume some dataset D as a proxy for our data-
generating distribution D (Section 6 discusses proper-
ties and limitations of this approach). We first create
K subsets of D that serve as training datasets, denoted F_k. We sample with replacement to form each sub-
set, ensuring that each subset comes from the same
underlying distribution while allowing for variance
between samples. We train the binary classification algorithm on each F_k a total of r times. This repeated training on each F_k captures possible stochastic behavior within the same training set.
After training, each model is evaluated on a common holdout set H ⊂ D to infer its inductive orientation on the holdout data. The model's labeling of H is represented as a one-hot encoded vector of length |Ω| = 2^|H|, where the "1" element corresponds to the labeling sequence produced by the model trained on F_k. The average of these vectors over the r repetitions for the same subset F_k is denoted P_{F_k}, representing the inductive orientation vector for that subset.
We compute the expected inductive orientation vector P_D by averaging the P_{F_k} vectors across all K subsets. This results in an estimate of the overall inductive orientation relative to the overall dataset.
for k = 1, ..., K do
    F_k ← sample with replacement from the training set;
    for r = 1, ..., R do
        Generate P_{F_k}^r by training A on F_k and labeling the holdout set;
        P_{F_k} ← P_{F_k} + P_{F_k}^r;
    end
    P_{F_k} ← P_{F_k} / R;
    Store P_{F_k} as a column of the LDM;
end
P_D ← average of the columns of the LDM;
return P_D;
Algorithm 1: Generate Labeling Distribution Matrix (LDM) and Inductive Orientation Vector (P_D).
Algorithmic bias, entropic expressivity, and algo-
rithmic capacity are computed as in Section 2.2.
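For concreteness, the following simplified Python sketch of Algorithm 1 estimates the LDM and P_D for a scikit-learn classifier on synthetic data. The helper name estimate_orientation and the choices of K, R, subset fraction, and model are our own illustrative assumptions; the paper's actual settings are described in Section 3.1.1.

```python
# A simplified, hedged sketch of Algorithm 1 using scikit-learn.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

def estimate_orientation(model_factory, X_train, y_train, X_hold,
                         K=20, R=5, subset_frac=0.15, seed=0):
    rng = np.random.default_rng(seed)
    n_hold = len(X_hold)
    ldm = np.zeros((2 ** n_hold, K))   # columns are the per-subset vectors P_{F_k}
    m = int(subset_frac * len(X_train))
    for k in range(K):
        idx = rng.choice(len(X_train), size=m, replace=True)  # subset F_k
        for _ in range(R):
            clf = model_factory().fit(X_train[idx], y_train[idx])
            labels = clf.predict(X_hold)
            # index of the predicted labeling within the 2^|H| search space
            position = int("".join(map(str, labels.astype(int))), 2)
            ldm[position, k] += 1.0 / R
    return ldm, ldm.mean(axis=1)       # LDM and estimated P_D

X, y = make_classification(n_samples=500, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
ldm, P_D = estimate_orientation(lambda: DecisionTreeClassifier(), X_tr, y_tr, X_te[:5])
print(P_D.shape)  # (32,)
```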
3.1.1 Experimental Parameters
In our experiments, each data subset F_k is 15% the size of the training dataset (which is 80% the size of the entire dataset). We pick each holdout set H to be 5 data points from the 20% test set. This means there are 2^5 elements in the search space Ω. We selected 100 holdout sets per dataset to obtain a confidence interval. This results in 100 inductive orientation vectors per pair of model and dataset. Bias, expressivity, and capacity are calculated from each vector. Note that we chose to train many inductive orientation vectors rather than increasing the size of the holdout set because the size of the inductive orientation vector scales exponentially with the holdout set size. Rather than choosing a single fixed target set, we generated results with five target sets corresponding to five minimum accuracy thresholds: 1/5, 2/5, 3/5, 4/5, and 5/5.
Table 1: Theoretical maximum and minimum values for ex-
pressivity, capacity, and bias of all thresholds.
Metric Minimum Maximum
Entropic Expressivity 0 5
Algorithmic Capacity 0 5
Algorithmic Bias (size 1) -0.9688 0.0312
Algorithmic Bias (size 2) -0.8125 0.1875
Algorithmic Bias (size 3) -0.5000 0.5000
Algorithmic Bias (size 4) -0.1875 0.8125
Algorithmic Bias (size 5) -0.0313 0.9688
Table 2: Summary of experiment hyperparameter ranges. Each range entry lists the hyperparameter's [minimum, maximum; step size].
Algorithm Parameter Range & Step
k-nearest neighbors neighbors [1,200;5]
Decision tree max. depth [1,70;5]
Linear SVC iterations [1,1000;50]
c-support SVC iterations [1,1000;50]
Logistic regression iterations [1,200;10]
Random forest max. depth [1,70;5]
Random forest estimators [1,200;5]
Adaboost estimators [1,100;5]
3.2 Maximum and Minimum Values
The minimum algorithmic capacity and entropic expressivity are 0, which occurs when the model always places all probability mass on one element of the search space. In contrast, the maximum of both corresponds to the Shannon entropy of a uniform distribution on the 2^|H| = 2^5 search space, which is 5 bits.
Algorithmic bias compares model performance to uniform sampling. For binary classification on a holdout size of 5 with threshold z, the probability of success for uniform random sampling (i.e., getting at least z labels correct) is

$$p_z = \frac{1}{2^5}\sum_{l=z}^{5}\binom{5}{l}.$$

The model performance ranges from 100% to 0%. Therefore, bias ranges from 1 − p_z down to −p_z. See Table 1 for all ranges.
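As a worked check of the bias ranges in Table 1, the short script below computes p_z for each threshold z and the corresponding [−p_z, 1 − p_z] interval under the stated setup (|H| = 5, binary labels).

```python
# Worked check of the bias ranges in Table 1 for |H| = 5, binary labels.
from math import comb

for z in range(1, 6):
    p_z = sum(comb(5, l) for l in range(z, 6)) / 2 ** 5   # uniform-sampling success prob.
    print(z, round(-p_z, 4), round(1 - p_z, 4))           # bias range [-p_z, 1 - p_z]
# e.g., z = 1 -> -0.9688, 0.0312 and z = 5 -> -0.0312, 0.9688 (cf. Table 1)
```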
3.3 Datasets & Algorithms
We explore classic, highly interpretable classification
algorithms. This lets us corroborate experimental re-
sults with known algorithm properties and, therefore,
more reliably ground the model evaluation technique.
For each selected algorithm, we measured algorithmic
bias, entropic expressivity, and algorithmic capac-
ity over a wide range of possible hyper-parameters.
All models were built with Scikit-learn (Pedregosa
et al., 2011). All algorithm hyperparameter choices
are shown in Table 2.
We derived metrics from each algorithm’s per-
formance on ten UCI Machine Learning Repository
Table 3: Summary of datasets. B.S.E. refers to the bootstrap
standard error averaged over all features (Section 6).
Dataset Size Balance |F_k| B.S.E.
EEG Eye State 14979 0.449 1797 41.045
Random 2000 0.501 240 1.8638
Shopper’s Intention 12245 0.155 1469 3.5670
Bank Marketing 11162 0.474 1339 2.4254
Abalone 4177 0.312 501 0.0118
Car Evaluation 1728 0.922 207 0.0758
Letter Recognition 1609 0.495 193 0.1497
Obesity 2111 0.460 253 0.1186
Spam 4600 0.394 552 0.6473
Wine Quality 6497 0.754 779 0.2457
datasets (Dua and Graff, 2017) and one synthetically
generated dataset (Random). Datasets were binarized
either by thresholding the label value or by choosing
two classes. For example, we only use the letters “T”
and “U” from the Letter Recognition dataset.
4 INDIVIDUAL ALGORITHM
ANALYSIS
In this section, we analyze metrics obtained from in-
ductive orientation vector estimations on interpretable
algorithms (decision trees, random forests, and k-
nearest neighbors). We corroborate known algorithm-
specific heuristics with experimental results. Unless
noted otherwise, the described trends and analyses
generalize an algorithm’s behavior across all datasets.
However, for all tree-based algorithms, we only dis-
play results on the EEG Eye State dataset to conserve
space and for clear comparisons.
4.1 Decision Trees
Decision tree classifiers are trained by recursively
splitting data with feature boundaries that maximize
information gain. The final tree consists of decision
nodes that lead to leaf nodes representing the pre-
dicted class. Increasing a decision tree’s depth grows
its complexity and allows the algorithm to capture
more patterns in data. However, too many layers let
the decision tree “memorize” the noise of a dataset
and overfit (Bramer, 2007; Bashir et al., 2020). Many
techniques aim to prevent and correct overfitting, such
as limiting a tree’s maximum depth (Bramer, 2007).
Across all non-random datasets, we observe an
initial sharp upward trend in nontrivial threshold algo-
rithmic bias as maximum depth grows (Figure 2). At a
certain depth, typically between 5 to 10 layers, the al-
gorithmic bias plateaus. Given that algorithmic bias is
performance compared to uniform random guessing,
this trend unsurprisingly mirrors that of training and
testing accuracy (Figure 4). Heuristically, this plateau
Figure 2: Estimated algorithmic bias for decision tree on
EEG dataset, averaged over 100 trials. Shaded regions indi-
cate 95% confidence intervals.
in accuracy when varying maximum depth indicates that the algorithm has stopped learning generalizable patterns and its additional layers are simply memorizing noise (Ying, 2019).
Decision trees’ entropic expressivity and algorith-
mic capacity exhibit a nearly identical upward trend
for the first 4 to 10 layers, up to around where al-
gorithmic bias begins to plateau. This increasing al-
gorithmic capacity indicates that adding layers at low
depths helps the tree respond more to changes in train-
ing data (i.e., higher mutual information between D
and model predictions). Such behavior is consistent
with general knowledge of decision trees. Increasing a tree's maximum depth, particularly at low layers, increases its complexity and allows it to handle a greater variety of inputs (Bramer, 2007). This increased learning capacity is consistent with the identical upward trend in testing and training accuracy in this 1 to 10 layer region (Figure 4). These metrics' plateau at a higher depth indicates that any new layers will have exhausted the learnable patterns and are only learning from noise (Bramer, 2007), leaving the distribution on Ω with the same averaged entropy.
As the maximum depth of the tree increases fur-
ther, algorithmic capacity is constant or dips slightly.
Entropic expressivity, on the other hand, increases
for a few more layers before plateauing. High en-
tropic expressivity indicates that decision trees with
more layers induce an unpredictable, "flat" probability mass over Ω.
When entropic expressivity departs from algorith-
mic capacity, we know this increased unpredictability
is due to stochasticity rather than increased model re-
sponsiveness. Recall that E_D[H(P_F)] is the difference between entropic expressivity and algorithmic capacity. This term captures the spread of models' predictions across repeated trainings on the same dataset F_k
Figure 3: Estimated entropic expressivity and algorithmic
capacity for decision tree of EEG dataset, averaged over 100
trials. Shaded regions indicate 95% confidence intervals.
Figure 4: Decision tree train and test accuracies on the EEG
dataset, averaged over 100 trials. Confidence intervals are
negligible.
taken in expectation over all F_k ∼ D. In other words, if you retrain a model on the same dataset, E_D[H(P_F)]
captures how much it changes. This is the innate
stochasticity of the decision tree’s training process.
Decision trees only make random choices when there
are ties or “clashes” between alternative boundary de-
cisions due to data points with similar features but dif-
ferent outputs (Pedregosa et al., 2011; Bramer, 2007).
A common cause of these splitting ties is overfitting; specifically, the deeper decision nodes try to learn from random noise rather than patterns (Bramer, 2007; Rong et al., 2021). Thus, an increase in the average value of E_D[H(P_F)] could indicate overfitting.
Heuristically, non-negligible differences between test and train accuracy may indicate overfitting (Ying, 2019). For all datasets where the decision tree's test-train accuracy gap exceeds 3% at any depth, we observed overall strong and statistically significant Spearman¹
¹Spearman coefficients may be more relevant than Pearson coefficients because monotonic trends are less sensitive to outliers compared to linear relationships.
Table 4: Spearman and Pearson correlation coefficients of E_D[H(P_F)] vs. test-train accuracy deviation over maximum depth values of 1 to 70. Bolded entries denote that a train-test accuracy difference of more than 3 percent was reached. ** indicates p < 0.05, and * indicates p < 0.07 significance.
Dataset Pearson Spearman
EEG Eye State 0.9990 0.6791**
Random 0.9967 0.6923**
Shopper’s Intention 0.9988 0.9893*
Bank Marketing 0.9978 0.9626**
Abalone 0.9963 0.9963**
Car Evaluation 0.9875 0.1253*
Letter Recognition 0.7839 0.4374
Obesity 0.9907 0.2044
Spam 0.9607 0.5165*
Wine Quality -0.9759 0.2
and Pearson correlation coefficients between the estimated E_D[H(P_F)] and the average difference between train and test accuracy (Table 4). Overfitting and underfitting are undecidable model properties (Bashir et al., 2020; Sehra et al., 2021), but such a strong correlation between E_D[H(P_F)] and accuracy deviations may indicate a relationship between E_D[H(P_F)] and overfitting in tree-based models.
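A hedged sketch of how coefficients like those in Table 4 can be computed with SciPy, using toy per-depth arrays in place of our measured E_D[H(P_F)] values and accuracy gaps:

```python
# Sketch of the correlation analysis behind Table 4, on toy arrays.
import numpy as np
from scipy.stats import pearsonr, spearmanr

expected_entropy = np.array([0.0, 0.1, 0.3, 0.6, 0.9, 1.1])   # toy E_D[H(P_F)] per depth
accuracy_gap = np.array([0.0, 0.01, 0.03, 0.07, 0.12, 0.15])  # toy train-test accuracy gap

r, r_p = pearsonr(expected_entropy, accuracy_gap)
rho, rho_p = spearmanr(expected_entropy, accuracy_gap)
print(f"Pearson r={r:.3f} (p={r_p:.3f}), Spearman rho={rho:.3f} (p={rho_p:.3f})")
```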
4.2 Random Forest
Next, we analyze how decision trees behave when en-
sembled as random forests. Designed to address the
noise sensitivity of individual decision trees, a ran-
dom forest is formed by bootstrap aggregation of n
trees (i.e., the number of “estimators”). The random
forest trains n decision trees on n bootstrapped sam-
ples of the training dataset, each ignoring some ran-
domly selected subset of features. A random forest
will run any input through each of its n trees and out-
put the majority class (Parmar et al., 2019; Pedregosa
et al., 2011).
We calculated metrics for random forests when
varying both the number of estimators and the max-
imum depth of each tree. While varying depth, we
maintained the default estimator count of 100. When
varying the number of estimators, we did not impose
any pruning or maximum depth limit.
Like with individual decision trees, increasing the
maximum number of layers results in a similar up-
ward then plateauing trend for algorithmic bias val-
ues of non-trivial target threshold sizes (Figure 5).
An initial increase in model complexity allows indi-
vidual trees to capture patterns, but too many layers
let the model overfit and do not improve performance
(Bramer, 2007).
Figure 5: Random forest algorithm bias trend varying max-
imum depth (EEG dataset).
Figure 6: Random forest bias when varying estimator count
range (EEG dataset).
Increasing the number of estimators in a random
forest is generally thought to improve performance, as
more “voters” will overwhelm the few “uninformed”
trees given irrelevant features (Probst and Boulesteix,
2018). Unsurprisingly, we observe this increase in
performance and algorithmic bias over all datasets.
Typically, a sharp increase of algorithmic bias occurs
in the 1 to 20 estimator range, likely due to the forest
gaining enough decision trees to “cover” all features
of the dataset. Due to the aggregation/voting process
of random forests, the addition of one estimator can
change the overall forest output. This results in an
“even-odd” alternating pattern (Figure 6).
Aggregation with a large number of estimators also produces a stabilizing effect on inferences. More trees voting will cause the forest to produce more consistent labelings of the holdout set and decrease its vulnerability to noise and overfitting (Parmar et al., 2019). Because of this, we observe that the entropic expressivity, i.e., the spread of the final inductive orientation vector, decreases as the number of estima-
Figure 7: Random forest capacity and expressivity varying
number of estimators (Bank marketing).
tors increases. More specifically, a random forest with a large number of estimators is less prone to innate stochastic effects in its outcome (Parmar et al., 2019; Bramer, 2007). When performing repeated trainings on the same dataset, this property causes the entropy within individual training sets, E_D[H(P_F)] (i.e., the difference between expressivity and capacity), to decrease, and so entropic expressivity and algorithmic capacity grow closer as the number of estimators grows (Figure 7). The downward trend of algorithmic capacity indicates less variation from altering training subsets F_k ∼ D. In the context of random forests, these trends can be interpreted as a forest becoming more focused and less prone to randomness and to changes in data within D as the number of estimators grows. This is consistent with random forests' stabilizing effect on outputs (Parmar et al., 2019; Bramer, 2007).
Varying each estimator's maximum depth has a smaller effect on E_D[H(P_F)] than the number of estimators does (Figure 8). The variations between predictions of random forest models trained on the same dataset are mainly caused by the random feature selection process (Parmar et al., 2019). This source of stochasticity is independent of tree depth, so E_D[H(P_F)] is less affected by varying tree depth. Changing the maximum depth causes much more dramatic effects on individual decision trees, as in Figure 3. Furthermore, the algorithmic capacity and entropic expressivity of random forests are consistently lower than those of individual trees across all non-random datasets (Figure 3). This is consistent with the stabilization that random forests provide. An ensemble of trees collectively reacts less strongly to changes in training subsets within D and is less innately stochastic than individual trees.
Figure 8: Expressivity and capacity of random forests var-
ied by maximum depth (EEG).
4.3 K-Nearest Neighbors
The K-Nearest Neighbors (KNN) classifier labels an input using the majority class of the k nearest data points to its feature vector. Similar to random forests, KNN voting produces an even-odd pattern in algorithmic bias (Figures 9, 10). As k grows large relative to the number of samples, the classifier resembles majority voting and ignores local patterns (Mucherino et al., 2009). For imbalanced datasets, majority voting labels all data the same way, and for balanced datasets, majority voting resembles random guessing. Both are typically less accurate than small-k voting, so we observe an overall downward trend in bias (Figures 9, 10).
The KNN algorithm directly depends on data points' locations, and so KNN classification performs best on datasets where classes are clearly separated in the feature space. The modified Letter Recognition dataset is one such case, as the distinct features of 'T' and 'U' characters create distinct contiguous regions of the feature space. (This is confirmed by the linear-kernel SVC algorithm's near-perfect 0.9956 accuracy, indicating that classes are easily separable into contiguous regions.) When trained on the modified Letter Recognition dataset, KNNs at low neighbor count k nearly meet the theoretical upper limits for algorithmic bias (see Section 5) and have >98% test and train accuracy, likely due to Letter Recognition's well-separated classes. As k approaches 190, roughly the size of its training set (Table 3), the algorithm simply becomes majority voting on a highly balanced dataset (Table 3), and so we see a steep fall in performance. This fall in performance is observed on all other non-random datasets, such as EEG (Figure 10).
Since KNN is a deterministic algorithm (i.e., training a KNN on the same data always produces the same output), E_D[H(P_F)] is zero and algorith-
Figure 9: KNN algorithmic bias varying number of neigh-
bors (Letter Recognition).
Figure 10: KNN algorithmic bias varying number of neigh-
bors (EEG).
mic capacity and entropic expressivity are equivalent.
At low k, KNNs trained on any subset F_k of the Letter Recognition dataset have near-perfect accuracy and place almost all probability mass on the single element of the search space Ω representing the correct labeling of holdout set H. This results in an extremely low-entropy inductive orientation vector P_D, and therefore both expressivity and capacity are near zero at small k. As k rises, the classifier is no longer perfect, and training on different F_k training samples will produce different labelings of H depending on the majority class of each (balanced) random subset of D. Thus, the algorithm will place probability mass on more elements of Ω (not just the correct labeling), and entropic expressivity and algorithmic capacity will increase (Figure 12).
However, most datasets' classes are not perfectly separated in the feature space. This means that low-neighbor KNNs lack local class purity, resulting in a lower starting algorithmic bias compared to Letter Recognition (Figure 10). With this "greater room
Figure 11: KNN accuracy varying number of neighbors
(Letter Recognition).
Figure 12: KNN expressivity and capacity varying number
of neighbors (Letter Recognition).
to fail", entropic expressivity and algorithmic capacity begin at much higher values compared to Letter Recognition (Figure 13). For balanced datasets such as EEG, expressivity and capacity do not see dramatic changes (Figure 13). High-neighbor KNNs, where k approaches the size of a balanced training set |F_k|, are subject to the randomness of the majority class of F_k, and low-k KNNs are subject to the local randomness of a non-locally-pure dataset.
However, the expressivity and capacity for KNNs trained on highly imbalanced datasets like Car Evaluation quickly fall to zero as k approaches |F_k|. This is because if k = |F_k| (and F_k is highly imbalanced), the KNN algorithm will always choose the majority class of the dataset, resulting in all probability mass placed on the element of Ω where all five elements of H are labeled as the majority class. Thus, the expected entropy over the inductive orientation is zero, and expressivity and capacity are also zero (Figure 14).
Figure 13: KNN expressivity and capacity varying number
of neighbors (EEG).
Figure 14: KNN expressivity and capacity varying number
of neighbors (Car Evaluation).
5 THE BIAS-EXPRESSIVITY
TRADEOFF
Bashir et al. (2020) and Lauw et al. (2019) proved trade-off bounds between algorithmic bias and both entropic expressivity and algorithmic capacity. Intuitively, this tradeoff reflects how an algorithm cannot be both highly effective at one task (i.e., high bias) and flexible across all tasks (i.e., high expressivity). Letting p be the probability of success from random sampling, Lauw et al. (2019) proved the following ranges for expressivity given values of bias (Table 5).
Table 5: Ranges of entropic expressivity for different levels of bias on target t, where k is the target set size.
Bias(D,t)      E[t^⊤ P_F]    Expressivity Range
−p (Min)       0             [0, log2(|Ω| − k)]
0              p             [H(p), log2 |Ω|]
1 − p (Max)    1             [0, log2 k]
Figure 15: Random forest algorithmic bias (Random
dataset).
Figure 16: Random forest entropic expressivity and algo-
rithmic capacity (Random dataset).
Trade-off bounds are most visible where expressivity and bias reach close to their theoretical limits. Following the bounds from Table 5, when the threshold-4/5 algorithmic bias of the k-nearest neighbors algorithm nears its theoretical maximum at low k (Figure 9), expressivity² is around 0.25 (Figure 12). Furthermore, the two figures show that capacity and expressivity do not increase until bias decreases from its theoretical maximum (achieved by increasing k). Similarly, when algorithmic bias is close to 0, expressivity is constrained only by the upper bound of log2 |Ω| = 5. For example, the random forest run on the Random dataset has near-zero bias across all estimator counts and thresholds (Figure 15). As a result, entropic expressivity appears upper bounded by 5 (Figure 16).
We have also verified the direct upper bound for-
mulas for algorithmic bias and entropic expressivity
²This is below log2(|T|) when |T| = 1.2, which approaches the size-1 target set of our strongest 5/5 threshold.
Figure 17: C-support SVC entropic expressivity and upper
bounds (Shopper’s Intention).
developed by Lauw et al. (2019). In our experiments, the upper bound for algorithmic bias was consistently above 1 (and thus trivial). The entropic expressivity upper bound, H(P_D) ≤ log2 |Ω| − 2 Bias(D,t)², was nontrivial for all experiments and often mirrors the expressivity trends (Figure 17).
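The bound can also be checked numerically; the sketch below, using a synthetic orientation vector and target, verifies H(P_D) ≤ log2|Ω| − 2·Bias(D,t)² for the toy inputs.

```python
# Numerical check of the entropic-expressivity upper bound from Lauw et al. (2019),
# H(P_D) <= log2|Omega| - 2 * Bias(D,t)^2, on a toy orientation vector.
import numpy as np
from scipy.stats import entropy

omega_size = 32
rng = np.random.default_rng(3)
P_D = rng.dirichlet(np.ones(omega_size))  # stand-in inductive orientation vector
t = np.zeros(omega_size)
t[:6] = 1                                 # stand-in target vector, |T| = 6

bias = t @ P_D - (np.linalg.norm(t) ** 2) / omega_size
upper_bound = np.log2(omega_size) - 2 * bias ** 2

print(entropy(P_D, base=2) <= upper_bound + 1e-9)  # True for any valid P_D and t
```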
6 DISCUSSION & LIMITATIONS
The inductive orientation vector allows researchers to
estimate algorithmic bias, entropic expressivity, and
algorithmic capacity, which are three model-theoretic
values with established properties and behaviors (Segura et al., 2019; Bashir et al., 2020; Rong et al., 2021; Ramalingam et al., 2022; Montañez et al., 2021). In this section, we discuss important factors to consider when estimating and using these metrics.
First, recall that algorithmic bias is dependent on a
minimum accuracy threshold on H which determines
whether a sequence of labels is included in the target
set. This threshold can be chosen by the experiment
setup. For a holdout size |H|, there are |H| + 1 pos-
sible thresholds and thus versions of algorithmic bias.
As the threshold lowers, the size of the target set T
approaches ||, and it becomes harder for the algo-
rithm to outperform uniform random sampling. This
leads to low algorithmic bias when the threshold is
close to 0. On the other extreme, when the threshold
is near maximum (|H|/|H|), both random sampling
and trained algorithms tend to struggle, which gener-
ally causes a dip in bias. In our experiments (where
|H| = 5) we consider thresholds of
1
5
,
2
5
,
3
5
,
4
5
,
5
5
(ig-
noring the trivial 0 threshold). By the aforementioned
logic, “middle ground” thresholds of
3
5
,
4
5
tend to have
the highest bias values. It is important to generate al-
gorithmic bias based on the threshold most aligned
with the needed accuracy for a problem.
When interpreting the estimated algorithmic ca-
pacity of a trained model in terms of mutual infor-
mation, it is important to stress that inductive orien-
tation vectors measure distributional algorithmic ca-
pacity with respect to a training dataset D, as devel-
oped by Bashir et al. (2020). Unlike classical def-
initions of capacity, our estimated distributional al-
gorithmic capacity captures mutual information be-
tween model outputs and potential subsets within a
given training dataset. Theoretical algorithm capac-
ity posits D as a theoretical universal data-generating
distribution where sampled datasets F D may be
entirely different. Thus, for most datasets, distribu-
tional algorithmic capacity does not reflect the clas-
sical intuition that capacity is an algorithm’s “ability
to learn”. Rather, this form of algorithmic capacity
is the mutual information between a dataset’s subsets
and model behavior. In other words, how much does
knowing which subset of D was selected (i.e., the out-
come of “random variable” D) tell you about the be-
havior of the model it will train (and vice versa).
One consequence is that an extremely low entropy
dataset may produce F
k
subsets that are virtually iden-
tical and, therefore, algorithmic capacity will be zero.
For example, imagine a dataset with only two data
points repeated n times. Then, models trained on dif-
ferent F
k
subsets randomly sampled from D will be
identical and capacity will be zero. Hence, it is im-
portant to interpret algorithmic capacity with respect
to the dataset’s entropy or bootstrap variance. (Our
datasets’ averaged bootstrap standard errors are dis-
played in Table 3). Similarly, if a model’s behavior
is zero entropy, that is, the model always assigns the
labels of H to the same values, then capacity will also
be zero (this is demonstrated in the KNN behavior in
Figures 14 and 12).
To estimate true non-distributional algorithmic capacity, that is, an algorithm's ability to learn, one must swap each F_k for an entire dataset generated by some distribution. In practice, this would require synthetic data or some large, representative data generator (e.g., real-time internet data).
Regarding computational constraints, the size of the search space Ω scales exponentially with the hold-
out set size |H|. Our method characterizes model
behavior on H, so we recommend addressing uncer-
tainty by running estimations with different randomly
sampled small holdout sets rather than increasing |H|,
as briefly mentioned in Section 3.1.1. Unfortunately,
this need for repeated inferences may be resource-
intensive for larger algorithms.
That said, these metrics offer practical insights. For example, an online learning algorithm may use estimates of entropic expressivity and algorithmic capacity to quantify the stochasticity arising from the model as opposed to its time-dependent data distribution. Our analysis showcased connections between these metrics and known model behavior, suggesting predictive abilities on general black-box algorithms. Past theoretical work has also proven how such qualities affect model generalization and the tendency to overfit (Montañez et al., 2021; Bashir et al., 2020; Ramalingam et al., 2022).
7 RELATED WORK
Traditional evaluation metrics such as accuracy, precision/recall, and F1 score effectively describe a model's overall performance for a general task, but offer little insight into an algorithm's innate behavior (Powers, 2020). On the other hand, model-agnostic explainability techniques such as Local Interpretable Model-agnostic Explanations (LIME) and other local estimation methods (Craven and Shavlik, 1995; Strumbelj and Kononenko, 2010; Baehrens et al., 2010) use interpretable algorithms (e.g., decision trees, linear functions) to approximate a model's underlying behavior local to an individual prediction, but struggle when describing general model behavior (Ribeiro et al., 2016b,a). Inductive orientation vectors
allow a middle ground between generalization and
interpretability, describing overall model behavior in
terms of information-theoretic model properties.
SHapley Additive exPlanations (SHAP) is another
model interpretation technique used to determine
which features are most influential for model output
(Lundberg and Lee, 2017). This means SHAP pro-
vides interpretability at the inference stage. In con-
trast, our approach focuses on evaluating a model's capacity to learn and adapt to specific problems, which is more useful for model selection.
We specifically introduce a method for estimating distributional algorithmic capacity as a proxy for capacity, i.e., mutual information, for a specific input dataset. In contrast, existing methods of es-
timating mutual information (Butakov et al., 2024)
assume a specific data distribution, and classical
methods such as applying VC dimension estimation
and Rademacher complexity provide capacity upper
bounds rather than direct estimations (Segura et al.,
2019).
8 CONCLUSION AND FUTURE
WORK
We introduced and empirically validated model-
agnostic metrics for evaluating black-box classifica-
tion algorithms: algorithmic bias, entropic expres-
sivity, and algorithmic capacity. These information-
theoretic metrics provide interpretable insights into
model behavior. Moving forward, we hope to explore
the behavior of these metrics with non-static data and
data of varying entropy. Given the methods' reliance on bootstrapping and retraining, we must also test these metric estimations on larger and more complex algorithms and verify their practical applicability within modern machine learning ecosystems.
REFERENCES
Baehrens, D., Schroeter, T., Harmeling, S., Kawanabe, M.,
Hansen, K., and Müller, K.-R. (2010). How to explain
individual classification decisions. The Journal of Ma-
chine Learning Research, 11:1803–1831.
Bashir, D., Montañez, G. D., Sehra, S., Segura, P. S., and Lauw, J. (2020). An information-theoretic perspective
Lauw, J. (2020). An information-theoretic perspective
on overfitting and underfitting. In AI 2020: Advances
in Artificial Intelligence: 33rd Australasian Joint Con-
ference, AI 2020, Canberra, ACT, Australia, November
29–30, 2020, Proceedings 33, pages 347–358. Springer.
Bekerman, S., Chen, E., Lin, L., and Montañez, G. D.
(2022). Vectorization of bias in machine learning algo-
rithms. In ICAART (2), pages 354–365.
Bramer, M. (2007). Avoiding overfitting of decision trees.
Principles of data mining, pages 119–134.
Butakov, I., Tolmachev, A., Malanchuk, S., Neopryatnaya,
A., and Frolov, A. (2024). Mutual information estimation
via normalizing flows. arXiv preprint arXiv:2403.02187.
Craven, M. and Shavlik, J. (1995). Extracting tree-
structured representations of trained networks. Advances
in neural information processing systems, 8.
Dua, D. and Graff, C. (2017). UCI Machine Learning Repository.
Lauw, J., Macias, D., Trikha, A., Vendemiatti, J., and Mon-
tanez, G. D. (2019). The bias-expressivity trade-off.
arXiv preprint arXiv:1911.04964.
Lundberg, S. M. and Lee, S.-I. (2017). A unified approach
to interpreting model predictions. Advances in neural
information processing systems, 30.
Mitchell, T. M. (1980). The Need for Biases in Learning
Generalizations. Department of Computer Science, Lab-
oratory for Computer Science Research, Rutgers Univ.
Montanez, G. D. (2017a). The famine of forte: Few search
problems greatly favor your algorithm. In 2017 IEEE
International Conference on Systems, Man, and Cyber-
netics (SMC), pages 477–482. IEEE.
Montanez, G. D. (2017b). Why machine learning works. URL https://www.cs.cmu.edu/~gmontane/montanez_dissertation.pdf.
Montañez, G. D., Bashir, D., and Lauw, J. (2021). Trading
bias for expressivity in artificial learning. In Agents and
Artificial Intelligence: 12th International Conference,
ICAART 2020, Valletta, Malta, February 22–24, 2020,
Revised Selected Papers 12, pages 332–353. Springer.
Montañez, G. D., Hayase, J., Lauw, J., Macias, D., Trikha,
A., and Vendemiatti, J. (2019). The futility of bias-free
learning and search. In Australasian Joint Conference on
Artificial Intelligence, pages 277–288. Springer.
Mucherino, A., Papajorgji, P. J., and Pardalos, P. M.
(2009). K-nearest neighbor classification. Data mining
in agriculture, pages 83–106.
Parmar, A., Katariya, R., and Patel, V. (2019). A review on
random forest: An ensemble classifier. In International
conference on intelligent data communication technolo-
gies and internet of things (ICICI) 2018, pages 758–763.
Springer.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V.,
Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P.,
Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cour-
napeau, D., Brucher, M., Perrot, M., and Duchesnay, E.
(2011). Scikit-learn: Machine learning in Python. Jour-
nal of Machine Learning Research, 12:2825–2830.
Powers, D. M. (2020). Evaluation: from precision, recall
and f-measure to roc, informedness, markedness and cor-
relation. arXiv preprint arXiv:2010.16061.
Probst, P. and Boulesteix, A.-L. (2018). To tune or not to
tune the number of trees in random forest. Journal of
Machine Learning Research, 18(181):1–18.
Ramalingam, R., Dice, N. E., Kaye, M. L., and Montañez,
G. D. (2022). Bounding generalization error through bias
and capacity. In 2022 International Joint Conference on
Neural Networks (IJCNN), pages 1–8. IEEE.
Ribeiro, M. T., Singh, S., and Guestrin, C. (2016a). Model-
agnostic interpretability of machine learning. arXiv
preprint arXiv:1606.05386.
Ribeiro, M. T., Singh, S., and Guestrin, C. (2016b). ”why
should I trust you?”: Explaining the predictions of any
classifier. CoRR, abs/1602.04938.
Rong, K., Khant, A., Flores, D., and Montañez, G. D.
(2021). The label recorder method: Testing the memo-
rization capacity of machine learning models. In Interna-
tional Conference on Machine Learning, Optimization,
and Data Science, pages 581–595. Springer.
Segura, P. S., Lauw, J., Bashir, D., Shah, K., Sehra, S., Ma-
cias, D., and Montanez, G. (2019). The labeling distribu-
tion matrix (ldm): a tool for estimating machine learning
algorithm capacity. arXiv preprint arXiv:1912.10597.
Sehra, S., Flores, D., and Montañez, G. D. (2021). Undecid-
ability of Underfitting in Learning Algorithms. In 2021
2nd International Conference on Computing and Data
Science (CONF-CDS), pages 591–594.
Strumbelj, E. and Kononenko, I. (2010). An efficient ex-
planation of individual classifications using game theory.
The Journal of Machine Learning Research, 11:1–18.
Ying, X. (2019). An overview of overfitting and its solu-
tions. In Journal of physics: Conference series, volume
1168, page 022022. IOP Publishing.