
cation. They aim to establish a baseline for testing bias mitigation through regularization while ensuring that the models remain interpretable and efficient. All models share the following architecture and are trained with the Adam optimizer (a code sketch follows the list):
• Input Layer. Accepts a vector of six features as in-
put: three relevant features (skill, experience, and
education) and three sensitive features (race, gen-
der, and disability status).
• Hidden Layer. A fully connected dense layer with 10 hidden units and ReLU (Rectified Linear Unit) activation.
• Output Layer. A fully connected dense layer with
a single sigmoid output that is interpreted as the
likelihood of the sample belonging to the higher-
skilled job class (class 1). A threshold of 0.5 is
applied to classify samples into class 0 or class 1.
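A minimal sketch of this shared architecture, assuming a Keras implementation and a binary cross-entropy loss (neither the framework nor the loss is specified here), is the following:

    import tensorflow as tf

    def build_model():
        # 3 relevant + 3 sensitive input features, one hidden layer of
        # 10 ReLU units, and a single sigmoid output read as P(class 1).
        model = tf.keras.Sequential([
            tf.keras.layers.Input(shape=(6,)),
            tf.keras.layers.Dense(10, activation="relu"),
            tf.keras.layers.Dense(1, activation="sigmoid"),
        ])
        # Adam optimizer as stated; the loss function is an assumption.
        model.compile(optimizer="adam", loss="binary_crossentropy",
                      metrics=["accuracy"])
        return model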
To evaluate generalization and robustness, 10-fold cross-validation is applied. Training meta-parameters such as the batch size, the training/validation split percentages, and the learning rate are identical across all experiments. Each model is trained for a fixed number of 200 epochs in each cross-validation fold. The choice of 200 epochs is based on initial experiments, which suggested that the models reach stable accuracy and loss convergence; no early-stopping criterion is applied. For each cross-validation fold, we proceed as follows (a code sketch follows the list):
• The data is partitioned into 90% for training and
10% for validation.
• A new model is initialized and trained from scratch in each fold.
• After each fold, the training and validation metrics
are accumulated.
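A sketch of this loop, assuming scikit-learn's KFold and the illustrative build_model() helper defined above, is the following:

    from sklearn.model_selection import KFold

    def cross_validate(X, y, n_folds=10, epochs=200):
        histories = []
        for train_idx, val_idx in KFold(n_splits=n_folds, shuffle=True).split(X):
            model = build_model()                     # fresh model per fold
            h = model.fit(X[train_idx], y[train_idx],
                          validation_data=(X[val_idx], y[val_idx]),
                          epochs=epochs, verbose=0)   # fixed epochs, no early stopping
            histories.append(h.history)               # accumulate per-fold metrics
        return histories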
Moreover, accuracy and saliency values are collected for each sensitive and relevant feature, both for profiles that match the filter and for those that do not.
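For reference, a gradient-based per-feature saliency (the mean absolute gradient of the output with respect to each input feature) could be computed as sketched below; the exact saliency definition used in this work may differ, so this is only an assumption:

    import tensorflow as tf

    def feature_saliency(model, X):
        # Mean |d output / d input| per feature, taken as a simple
        # per-feature saliency score over a batch of profiles.
        X = tf.convert_to_tensor(X, dtype=tf.float32)
        with tf.GradientTape() as tape:
            tape.watch(X)
            y = model(X)
        grads = tape.gradient(y, X)
        return tf.reduce_mean(tf.abs(grads), axis=0).numpy()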
To interpret our results correctly, it is important to note that we are dealing with classification tasks whose ground-truth labels come from an unfair generation process that assigns them with inherent biases. As a result, comparing regularized and non-regularized models under otherwise identical experimental conditions may show a decrease in accuracy for the regularized model. Such a decrease may reflect the impact of bias mitigation, which promotes fair decision-making that differs from the biased labels assigned by the unfair data generation mechanism. The following paragraphs report our analysis and the corresponding results in detail.
3.1 Saliency Analysis
We directly assessed the impact of regularization on saliency maps. In particular, we evaluated the effect of model type on bias mitigation by using the estimated amount of bias reduction as the dependent variable of ANOVA models.
To proceed, we initially tested the null hypothesis that there is no difference in mean saliency values between sensitive and relevant features of penalized subjects, at a conservative 1% significance level. The p-value of 0.0139, obtained from a two-sample t-test comparing the accumulated mean sensitive and mean relevant saliency values of the non-regularized model (see Tab. 1), was greater than this conservative threshold; thus, there is no statistical evidence to reject the null hypothesis.
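A sketch of this test, with placeholder arrays standing in for the accumulated per-fold mean saliency values of Tab. 1, is the following:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    # Placeholders for the accumulated per-fold mean saliency values
    # (the real inputs are the values reported in Tab. 1).
    mean_sensitive = rng.normal(0.5, 0.1, size=10)
    mean_relevant = rng.normal(0.5, 0.1, size=10)

    t_stat, p_value = stats.ttest_ind(mean_sensitive, mean_relevant)
    print(f"p = {p_value:.4f}; reject H0 at 1%: {p_value < 0.01}")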
Note that while Algorithm 1 establishes a linear correlation between relevant features and job classes (for each profile) and penalizes filtered subjects based on sensitive features, the bias-generating mechanism applied here shows no statistical evidence of differing feature-importance values when assigning labels to different job classes, according to the interpretation of the saliency maps (SM). This offered a valuable scenario for our estimation: a situation in which bias perpetuates through the neural processes, with sensitive and relevant features contributing equally to the unfair decision-making about penalized subjects. In other words, by assuming equal feature contributions to unfair decisions, we can reasonably estimate the amount of bias reduction achieved by regularization as the difference between mean relevant and mean sensitive saliency values for profiles matching the filter.
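In symbols (notation introduced here for clarity), the estimated bias reduction for a given model is

$\Delta_{\text{saliency}} = \bar{S}_{\text{relevant}} - \bar{S}_{\text{sensitive}}$,

where $\bar{S}_{\text{relevant}}$ and $\bar{S}_{\text{sensitive}}$ are the mean saliency values of the relevant and sensitive features, respectively, computed per fold over the profiles matching the filter.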
Following the above considerations, we conducted an ANOVA with a post-hoc test to assess the effect of model type and to estimate the amount of bias reduction provided by the regularized models. To this end, we used the difference between mean relevant and mean sensitive saliency values (referred to as the delta of saliency in this analysis) as the ANOVA dependent variable. Delta values were accumulated over folds for the models considered and are reported in Fig. 1. Only profiles that match the biased filter are included. The ANOVA in Fig. 2 indicates a statistically significant effect of model type on delta values (p < 0.01). Therefore, we conducted a post-hoc test, with the following results.
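A sketch of this analysis, assuming a one-way ANOVA followed by Tukey's HSD (the specific post-hoc procedure is an assumption) and placeholder per-fold delta values, is the following:

    import numpy as np
    import pandas as pd
    from scipy import stats
    from statsmodels.stats.multicomp import pairwise_tukeyhsd

    rng = np.random.default_rng(0)
    # Placeholder per-fold delta-of-saliency values for two model types;
    # the real inputs are the accumulated deltas shown in Fig. 1.
    df = pd.DataFrame({
        "model": ["baseline"] * 10 + ["regularized"] * 10,
        "delta": np.concatenate([rng.normal(0.0, 0.05, 10),
                                 rng.normal(0.2, 0.05, 10)]),
    })

    groups = [g["delta"].values for _, g in df.groupby("model")]
    f_stat, p_value = stats.f_oneway(*groups)
    if p_value < 0.01:                 # significant effect of model type
        print(pairwise_tukeyhsd(df["delta"], df["model"]))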