tions that require responsibility.
In our proposed method, when a model makes a misprediction for a certain test input, we first extract the k-nearest neighbors of that input from the training set based on a specific distance calculation approach, and then feed these extracted samples into the model to obtain auxiliary predictions, which are used for post-hoc analysis. Considering the original misprediction together with the auxiliary predictions, we perform both sample-based individual analyses and a collective statistical analysis on them. The main contribution of this study is a methodology, supported by experimental results, that analyzes the model's behavior on the k-nearest neighbors of the mispredicted sample to understand the reasons for the model's inaccurate estimations, and that presents a more appropriate distance calculation method for nearest neighbor search when dealing with image data.
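A rough sketch of this pipeline is given below; it assumes a plain Euclidean distance over flattened feature vectors and a generic model.predict interface, whereas the more appropriate distance calculation for image data is discussed later in the paper. The helper name and return structure are illustrative only.

import numpy as np

def explain_misprediction(model, x_test, X_train, y_train, k=5):
    # Sketch only: retrieve the k nearest training samples of a
    # mispredicted input and collect the model's auxiliary predictions
    # on them. Euclidean distance is a placeholder for the distance
    # calculation discussed later in the paper.
    dists = np.linalg.norm(X_train - x_test, axis=1)   # X_train: (n, d), x_test: (d,)
    nn_idx = np.argsort(dists)[:k]                     # indices of the k nearest neighbors

    # Auxiliary predictions: feed the retrieved neighbors back into the model.
    aux_preds = model.predict(X_train[nn_idx])

    return {
        "neighbor_indices": nn_idx,
        "neighbor_labels": y_train[nn_idx],            # ground-truth labels of the neighbors
        "auxiliary_predictions": aux_preds,            # input to the sample-based and
                                                       # collective statistical analyses
    }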
The rest of the paper is organized as follows: First,
in Section 2, we give an overview of related work
and explain how our work differs from prior studies.
Then, in Section 3, we present our post-hoc analysis
method to explain inaccurate decisions of deep learn-
ing models. Section 4 presents our experimental analysis on two different datasets. Finally, we conclude the paper with final remarks.
2 RELATED WORK
There are certain concepts that are closely related to model explainability, and some studies provide well-defined meanings for these concepts and discuss their differences. For example, in (Roscher et al., 2020),
the authors review XAI in view of applications in the
natural sciences and discuss three main relevant el-
ements: transparency, interpretability, and explain-
ability. Transparency can be considered the opposite of “black-boxness” (Lipton, 2018), whereas
interpretability pertains to the capability of making
sense of an obtained ML model (Roscher et al., 2020).
The work (Holzinger et al., 2019) introduces the notion of causability as a property of a person, in contrast to explainability, which is a property of a system, and discusses their difference for medical applications.
Other studies providing a comprehensive outline of the different aspects of XAI are (Chakraborti et al., 2020), (Arrieta et al., 2020), and (Cui et al., 2019).
Rule Extraction. One common and longstanding approach used to explain AI decisions is rule extraction, which aims to construct a simpler counterpart of a complex model via approximation, such as building a decision tree or linear model that yields predictions similar to those of the complex model. An
early work in this category belongs to Ribeiro et
al. (Ribeiro et al., 2016), who present a method
to explain the predictions of any model by learning
an interpretable sparse linear model in a local re-
gion around the prediction. In another work (Bar-
bado and Corcho, 2019), the authors evaluate some
of the most important rule extraction techniques over
the One-Class SVM model, which is a method for unsupervised anomaly detection. In addition, they propose algorithms to compute XAI-related metrics regarding the “comprehensibility”, “representativeness”, “stability” and “diversity” of the extracted rules. The works (Bologna and Hayashi, 2017;
Bologna, 2019; Bologna and Fossati, 2020) present
a few different variants of a similar propositional
rule extraction technique from several neural network
models trained for various tasks such as sentiment analysis and image classification.
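To illustrate the local surrogate idea of (Ribeiro et al., 2016) mentioned above, the sketch below fits a sparse linear model on randomly perturbed copies of a single input, weighted by their proximity to that input. It is a simplified approximation of that approach rather than the actual LIME implementation; the perturbation scale, kernel width and regularization strength are arbitrary assumptions, and predict_fn stands for any black-box scoring function.

import numpy as np
from sklearn.linear_model import Lasso

def local_linear_surrogate(predict_fn, x, n_samples=500, sigma=0.5, alpha=0.01):
    # Fit a sparse linear surrogate around a single input x.
    # predict_fn maps a batch of inputs to a scalar score per sample,
    # e.g. the probability of the class of interest.
    rng = np.random.default_rng(0)
    Z = x + sigma * rng.standard_normal((n_samples, x.shape[0]))  # local perturbations
    y = predict_fn(Z)                                             # query the black-box model
    # Weight perturbed samples by their proximity to x (RBF kernel).
    w = np.exp(-np.linalg.norm(Z - x, axis=1) ** 2 / (2 * sigma ** 2))
    surrogate = Lasso(alpha=alpha)                                # sparsity-inducing linear model
    surrogate.fit(Z, y, sample_weight=w)
    return surrogate.coef_                                        # locally important features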
Post-hoc Analysis. Another widely adopted approach is post-hoc analysis, which comprises different techniques that try to explain the predictions of ML models that are not transparent by design. In
this category, the authors of (Petkovic et al., 2018)
develop frameworks for post-training analysis of a
trained random forest with the objective of explaining
the model’s behavior. Adopting a user-centered ap-
proach, they generate an easy-to-interpret, one-page explainability summary report from the trained RF classifier, and claim that the reports dramatically boosted the user's understanding of the model and trust in the
system. In another study (Hendricks et al., 2016),
the authors introduce a visual explanation method that focuses on the discriminative properties of the visible
object, jointly predicts a class label, and explains why
the predicted label is appropriate for the image.
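For a flavor of the kind of post-training summary mentioned above (and not the actual framework of (Petkovic et al., 2018)), the following sketch prints a compact report for a trained random forest consisting of its out-of-bag accuracy and ranked feature importances; the hyperparameters and report layout are arbitrary choices.

from sklearn.ensemble import RandomForestClassifier

def rf_summary_report(X_train, y_train, feature_names):
    # Train a random forest and print a compact post-training summary:
    # overall out-of-bag accuracy plus ranked feature importances.
    rf = RandomForestClassifier(n_estimators=200, oob_score=True, random_state=0)
    rf.fit(X_train, y_train)
    print(f"Out-of-bag accuracy: {rf.oob_score_:.3f}")
    ranked = sorted(zip(feature_names, rf.feature_importances_),
                    key=lambda pair: pair[1], reverse=True)
    for name, importance in ranked:
        print(f"{name:>20s}  {importance:.3f}")
    return rf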
A model explanation technique relevant to our
proposed method is explanation by example, a subcategory of the post-hoc analysis approach (Arrieta et al., 2020). As an early work in this category, Bien and Tibshirani (Bien and Tibshirani, 2011) develop a Prototype Selection (PS) method that seeks a minimal representative subset of samples, where a prototype can be considered a very close or identical observation in the training set, with the objective of making the dataset more easily “human-readable”. Aligned
with (Bien and Tibshirani, 2011), Li et al. (Li et al.,
2018) use prototypes to design an interpretable neural
network architecture whose predictions are based on
the similarity of the input to a small set of prototypes.
Similarly, Caruana et al. (Caruana et al., 1999) suggest that a comparison of the representation predicted by a single-layer neural network with the representations learned on its training data would help identify points in the training data that best explain the