Comparing the counterfactual effect predicted by the models that integrate LFR with the prediction made without any fairness constraint (see Table 2), it is clear that LFR makes the result somewhat fairer: 𝑍 (median household income) predicted from 𝑋 is no longer unduly low, and the outcome 𝑌 (total number of violent crimes per 100K population) remains relatively low.
Moreover, we notice that adopting LFR on the edge 𝑋→𝑍 pushes the predicted 𝑍 above its overall average of 0.36. If we split the dataset into a protected group and an unprotected group according to 𝑋 (samples with 𝑋 > 0.23 are assigned to the protected group), the mean of 𝑍 in the protected group is 0.24 (below the average), while the mean in the unprotected group is 0.40 (above the average). Interestingly, the 𝑍 predicted with LFR is close to 0.40.
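As an illustration of the group split above, a minimal sketch in Python follows; the file name and the pre-renamed columns X, Z, and Y are assumptions, while the 0.23 threshold and the reported means come from our experiments.

```python
import pandas as pd

# Hypothetical file name; we assume the normalized Communities and Crime
# attributes have already been renamed to X (sensitive attribute),
# Z (median household income), and Y (violent crimes per 100K population).
df = pd.read_csv("communities_crime_normalized.csv")

protected = df[df["X"] > 0.23]     # protected group (threshold from the text)
unprotected = df[df["X"] <= 0.23]  # unprotected group

print("overall mean of Z:    ", round(df["Z"].mean(), 2))          # 0.36 in our data
print("protected mean of Z:  ", round(protected["Z"].mean(), 2))   # 0.24
print("unprotected mean of Z:", round(unprotected["Z"].mean(), 2)) # 0.40
```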
5 CONCLUSION
In this paper, we took fairness in machine learning as our starting point, since it has become a pressing social issue. We chose the Communities and Crime dataset because it contains many attributes, including several sensitive ones, and we conducted experiments on it to explore approaches for improving fairness. Our study examined fairness from three different perspectives.
In causal inference, we focused on the effect of intervening on certain variables on the outcome variable, which we denote as 𝑃(𝑌 = 𝑦|𝑑𝑜(𝑋 = 𝑥)), inspired by previous work (Binns, 2018). Since the effect of an intervention is not directly observable, we need to convert expressions involving the 𝑑𝑜-operator into probabilities conditioned on observable variables. The adjustment formula and the back-door and front-door criteria are of great importance here.
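For reference, the back-door adjustment formula of Pearl et al. (2016) that underlies this conversion is

\[
P(Y = y \mid do(X = x)) \;=\; \sum_{w} P(Y = y \mid X = x, W = w)\, P(W = w),
\]

where W denotes a set of observed covariates satisfying the back-door criterion relative to (X, Y); we use W rather than Z here to avoid a clash with the median-income variable Z used elsewhere in this paper.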
However, the dataset we chose contains mostly continuous attributes, which makes the probability of any single point meaningless. In our research, we therefore proposed replacing the probability 𝑃 with the expectation 𝔼: for example, instead of 𝑃(𝑌 = 𝑦|𝑑𝑜(𝑋 = 𝑥)) we estimate 𝔼[𝑌|𝑑𝑜(𝑋 = 𝑥)]. Then, either by direct calculation or by prediction with machine learning models, we can obtain the expectation of 𝑌 under an intervention on 𝑋.
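A minimal sketch of this estimation step follows, assuming a back-door adjustment set W and using scikit-learn's LinearRegression; the helper name and column names are illustrative rather than a description of our exact pipeline.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

def expected_y_under_do(df, x_value, x_col="X", y_col="Y", w_cols=("W1", "W2")):
    """Estimate E[Y | do(X = x_value)] by back-door adjustment: fit a
    regression of Y on (X, W), then average its prediction over the
    empirical distribution of W with X forced to the intervention value."""
    w_cols = list(w_cols)
    model = LinearRegression().fit(df[[x_col] + w_cols], df[y_col])

    df_do = df[w_cols].copy()
    df_do.insert(0, x_col, x_value)          # set X = x_value for every sample
    return model.predict(df_do[[x_col] + w_cols]).mean()
```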
We then applied two fairness measures to our dataset: the natural direct effect (NDE) and the path-specific effect (PSE), following Nabi and Shpitser (2018). These measures apply when there are multiple paths from 𝑋, the variable we are interested in, to 𝑌, the outcome variable. By holding the mediators between 𝑋 and 𝑌 fixed, we can isolate the effect of 𝑋 on 𝑌 along certain fair paths. For example, it is unfair for gender to affect a job offer directly, but it is fair for gender to influence the qualifications relevant to the offer.
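For completeness, the natural direct effect can be written in counterfactual notation (with M denoting the mediators between X and Y, and x, x' the two levels of the sensitive attribute being compared) as

\[
\mathrm{NDE}_{x \to x'}(Y) \;=\; \mathbb{E}\big[\, Y_{x',\, M_x} \big] - \mathbb{E}\big[\, Y_{x} \big],
\]

that is, the expected change in Y when X is switched from x to x' while the mediators are held at the values they would naturally take under X = x.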
Finally, we studied counterfactual inference. The goal of this part is to compute the counterfactual 𝑌𝑥(𝑢), the value 𝑌 would take had 𝑋 been set to 𝑥, given exogenous variables 𝑈 = 𝑢. The first step is to build a model for each edge of the causal graph, where edges signify the causal relationships between variables. We tried four different machine learning models: Linear Regression, Decision Tree, Support Vector Regression, and Bayesian Ridge, and trained each of them with 10-fold cross-validation. We then computed the counterfactual effect following the approach introduced in Chapter 4 of Causal Inference in Statistics: A Primer (Pearl, Glymour, & Jewell, 2016).
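A minimal sketch of this edge-fitting step follows, assuming each edge of the causal graph is modelled as a regression from its parent variables and that the best-scoring candidate under 10-fold cross-validation is kept (that selection rule and the column names are illustrative).

```python
from sklearn.linear_model import LinearRegression, BayesianRidge
from sklearn.tree import DecisionTreeRegressor
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_score

def fit_edge_model(df, parents, child):
    """Fit a model for one edge set (parents -> child) of the causal graph,
    keeping the candidate with the best mean 10-fold cross-validation R^2."""
    candidates = [LinearRegression(), DecisionTreeRegressor(),
                  SVR(), BayesianRidge()]
    X, y = df[list(parents)], df[child]
    scores = [cross_val_score(m, X, y, cv=10).mean() for m in candidates]
    best = candidates[scores.index(max(scores))]
    return best.fit(X, y)

# Example: a model for the edge X -> Z (sensitive attribute -> median income).
# model_z = fit_edge_model(df, parents=["X"], child="Z")
```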
Since we found that building the models described above without regard to fairness may lead to discrimination against certain protected groups, we introduced learning fair representations (LFR) to improve fairness. This approach performs well on both group and individual fairness (Zemel et al., 2013). The results of our experiments showed that after integrating LFR into some of the counterfactual problems, fairness was greatly improved while accuracy remained at a relatively high level.
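For context, the LFR objective of Zemel et al. (2013) trades off a group-fairness (statistical parity) term L_z, a reconstruction term L_x, and a prediction-accuracy term L_y, weighted by hyperparameters A_z, A_x, and A_y:

\[
L \;=\; A_z L_z + A_x L_x + A_y L_y .
\]

The relative weights control how much predictive accuracy is traded for fairness in the learned representation.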
Indeed, our current research still has several limitations. For the PSE and NDE, we applied the same algorithm as in the causal inference part (2), that is, replacing probabilities with expectations and computing them with the help of machine learning models. However, this did not work well: the predicted values over-emphasized the fairness criteria and exhibited significant error.
REFERENCES
Acharya, A., Blackwell, M., & Sen, M. (2016). Explaining
causal findings without bias: Detecting and assessing
direct effects. American Political Science Review, 110(3), 512-529.
Binns, R. (2018, January). Fairness in machine learning:
Lessons from political philosophy. In Conference on
Fairness, Accountability and Transparency (pp. 149-
159). PMLR.
Pearl, J., Glymour, M., & Jewell, N. (2016). Causal inference in statistics: A primer. Wiley.
Nabi, R., & Shpitser, I. (2018, April). Fair inference on
outcomes. In Proceedings of the AAAI Conference on
Artificial Intelligence (Vol. 32, No. 1).
Mehrabi, N., Morstatter, F., Saxena, N., Lerman, K., &
Galstyan, A. (2021). A survey on bias and fairness in