further promising application of our setup, because our template-based approach naturally allows the inclusion of multiple protected groups in the test set as well. Finally, a further extension of the benchmark, e.g., with multi-attribute templates or with targets other than occupations, could improve the evaluation further.
ACKNOWLEDGEMENTS
We gratefully acknowledge funding by the German Federal Ministry of Economic Affairs and Energy (01MK20007E) and by the Ministry of Culture and Science of the State of North Rhine-Westphalia within the project "Bias aus KI-Modellen".