2016). Analysis of COMPAS, the system used by judges and by probation and parole officers to assess the likelihood of criminal recidivism, showed that black people are more likely to be incorrectly classified as a higher risk of re-offending (Angwin et al., 2016b), whereas white people are more likely to be incorrectly classified as a lower risk of re-offending (Angwin et al., 2016a). Google's ad system tends to show ads for high-paying jobs to male users more often than to female users (Datta et al., 2015). These cases do not imply the inefficacy of ML, but rather the need for further research on fair ML.
1.2 Existing Literature
Some work has already been done on increasing the fairness of ML models. In this paper, we build upon the existing literature in which fairness is addressed with an ensemble of classifiers. Fair Forests (Raff et al., 2017) were proposed to induce fairness in decision trees by altering how information gain is calculated with respect to the sensitive feature. This approach improves fairness in individual decision trees, whereas we wanted an iterative process in which fairness is corrected in steps. Thus, instead of a forest (an ensemble of uncorrelated trees), we opted for the boosting technique in an ensemble of decision trees. Boosting for fairness was already touched upon in a case study by researchers from the University of Illinois (Fish et al., 2015), where boosting increased fairness on the Census Income dataset by relabeling instances according to fairness rules. That approach focuses on improving individual fairness, while we wanted to focus on group unfairness. Next, AdaFair (Iosifidis and Ntoutsi, 2019) was proposed to boost instances using cumulative fairness while also tackling the class imbalance of the datasets used. This approach changes how the weights are updated so that they take the model's confidence score and equalized odds into account. Our approach, on the other hand, uses the maximum difference between the two groups to calculate the estimator error and the fairness of each group to update the weights, making equal treatment of the groups the main priority of the proposed approach.
1.3 Contributions
Building on this, we present a boosting classification ensemble that strives to optimize both group fairness and overall model quality simultaneously. The proposed approach is used to address the common unfairness problem in the Drug benchmark dataset, which is notorious for its historical bias with respect to age and ethnicity (Donini et al., 2020).
Thus, our main contributions are the following:
• We define the fairness of a sensitive feature group, which we use to alter the weights calculated by the original AdaBoost.
• We propose the Fair AdaBoost classification ensemble, which balances the results across the groups of the sensitive feature while achieving the same overall quality.
2 METHODOLOGY
AdaBoost was first introduced by Freund and Schapire in 1995 (Freund and Schapire, 1997). It is an adaptive boosting algorithm in which weak learners are combined to create a strong one. The boosting technique enables a weak learner to learn from its own mistakes and improve over the iterations. Estimators are created iteratively, and at the end each estimator receives a weight corresponding to its accuracy. The final prediction is a weighted combination of the predictions of all estimators.
At the beginning of the algorithm, an equal weight is assigned to each instance. At the end of every iteration, the weights of misclassified instances are increased, allowing the learner to focus on the more challenging instances in the next iteration. The weights, which reflect the importance of the instances, are adapted according to the estimation error until a given number of iterations is reached or a perfect estimator with estimator error 0 is obtained.
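For concreteness, the sketch below illustrates this standard multi-class AdaBoost (SAMME) training loop (Hastie et al., 2009) with decision stumps as weak learners. The function names, the use of scikit-learn stumps, and the early stop on a perfect estimator are our own illustrative assumptions, not the implementation evaluated in this work.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def samme_fit(X, y, n_estimators=50):
    n_classes = len(np.unique(y))
    w = np.full(len(y), 1.0 / len(y))          # equal initial instance weights
    estimators, alphas = [], []
    for _ in range(n_estimators):
        stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
        miss = (stump.predict(X) != y)
        err = np.dot(w, miss) / w.sum()        # weighted estimator error
        estimators.append(stump)
        if err == 0:                           # perfect estimator: stop early
            alphas.append(1.0)
            break
        # estimator weight grows as its weighted error shrinks (SAMME)
        alpha = np.log((1.0 - err) / err) + np.log(n_classes - 1)
        alphas.append(alpha)
        w = w * np.exp(alpha * miss)           # boost misclassified instances
        w = w / w.sum()
    return estimators, alphas

def samme_predict(X, estimators, alphas, classes):
    votes = np.zeros((len(X), len(classes)))   # weighted vote over all estimators
    for est, alpha in zip(estimators, alphas):
        pred = est.predict(X)
        for k, c in enumerate(classes):
            votes[:, k] += alpha * (pred == c)
    return classes[np.argmax(votes, axis=1)]

The update np.exp(alpha * miss) leaves correctly classified instances unchanged and inflates the misclassified ones, which, after normalization, is exactly the reweighting behaviour described above.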
2.1 Fair AdaBoost
We propose the Fair AdaBoost algorithm, an extension of the multi-class AdaBoost algorithm (Hastie et al., 2009) that considers fairness while training its classification model. Like AdaBoost, Fair AdaBoost is based on the boosting technique, where each data instance has a weight that is updated through the iterations until an optimal result is achieved. Misclassified instances get an increased weight, while the weights of correctly classified instances decrease. In addition, the weight of each instance also incorporates the error rate of its sensitive feature group. At the end of each iteration, a model is created and assigned a weight according to its performance, and the weighted models are combined into the final prediction. A few hundred iterations may be performed before a perfect estimator with estimator error 0 is obtained or before the performance starts to stagnate.
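To make the idea concrete, the sketch below shows one plausible way of folding a per-group error rate into the instance weight update. The function name, the (1 + group error) boost factor, and the way the group term enters the exponent are hypothetical illustrations only and do not reproduce the exact Fair AdaBoost update rule.

import numpy as np

def fair_weight_update(w, y, pred, group, alpha):
    # w: instance weights, y: true labels, pred: current estimator predictions,
    # group: sensitive-feature group of each instance, alpha: estimator weight
    group = np.asarray(group)
    miss = (pred != y).astype(float)
    # per-group error rate of the current estimator
    group_err = {g: miss[group == g].mean() for g in np.unique(group)}
    # misclassified instances from a worse-treated group receive a larger boost
    fairness = np.array([group_err[g] for g in group])
    w = w * np.exp(alpha * miss * (1.0 + fairness))
    return w / w.sum()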
As presented, Fair AdaBoost takes into account
fairness when updating instance weights and, in that