All code for this paper can be found at https://github.com/TheMody/Debiasing-Sentence-Embedders-through-contrastive-word-pairs.
ACKNOWLEDGEMENTS
We gratefully acknowledge funding by the BMWi
(01MK20007E) in the project AI-marketplace.
REFERENCES
Abid, A., Farooqi, M., and Zou, J. (2021). Persistent anti-Muslim bias in large language models. In Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society, AIES '21, pages 298–306, New York, NY, USA. Association for Computing Machinery.
Bolukbasi, T., Chang, K.-W., Zou, J. Y., Saligrama, V., and Kalai, A. T. (2016). Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. In Lee, D., Sugiyama, M., Luxburg, U., Guyon, I., and Garnett, R., editors, Advances in Neural Information Processing Systems, volume 29, pages 4349–4357. Curran Associates, Inc.
Caliskan, A., Bryson, J. J., and Narayanan, A. (2017). Semantics derived automatically from language corpora contain human-like biases. Science, 356(6334):183–186.
Cheng, P., Hao, W., Yuan, S., Si, S., and Carin, L. (2021). FairFil: Contrastive neural debiasing method for pretrained text encoders. CoRR, abs/2103.06413.
Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.
Fabbri, A. R., Li, I., She, T., Li, S., and Radev, D. R. (2019). Multi-News: A large-scale multi-document summarization dataset and abstractive hierarchical model.
Gonen, H. and Goldberg, Y. (2019). Lipstick on a pig: Debiasing methods cover up systematic gender biases in word embeddings but do not remove them. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.
Liang, P. P., Li, I. M., Zheng, E., Lim, Y. C., Salakhutdinov, R., and Morency, L.-P. (2020). Towards debiasing sentence representations.
Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., and Stoyanov, V. (2019). RoBERTa: A robustly optimized BERT pretraining approach. CoRR, abs/1907.11692.
Manzini, T., Yao Chong, L., Black, A. W., and Tsvetkov, Y. (2019). Black is to criminal as Caucasian is to police: Detecting and removing multiclass bias in word embeddings. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1.
May, C., Wang, A., Bordia, S., Bowman, S. R., and Rudinger, R. (2019). On measuring social biases in sentence encoders.
Mikolov, T., Chen, K., Corrado, G., and Dean, J. (2013). Efficient estimation of word representations in vector space.
Radford, A., Kim, J. W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., et al. (2021). Learning transferable visual models from natural language supervision. In International Conference on Machine Learning, pages 8748–8763. PMLR.
Ramesh, A., Dhariwal, P., Nichol, A., Chu, C., and Chen, M. (2022). Hierarchical text-conditional image generation with CLIP latents. CoRR, abs/2204.06125.
Ravfogel, S., Elazar, Y., Gonen, H., Twiton, M., and Goldberg, Y. (2020). Null it out: Guarding protected attributes by iterative nullspace projection. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.
Schröder, S., Schulz, A., Kenneweg, P., and Hammer, B. (2023). So can we use intrinsic bias measures or not? In International Conference on Pattern Recognition Applications and Methods.
Schulz, A., Hinder, F., and Hammer, B. (2020). DeepView: Visualizing classification boundaries of deep neural networks as scatter plots using discriminative dimensionality reduction. In Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., and Polosukhin, I. (2017). Attention is all you need. In Guyon, I., Luxburg, U. V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., and Garnett, R., editors, Advances in Neural Information Processing Systems, volume 30. Curran Associates, Inc.
Wang, A., Singh, A., Michael, J., Hill, F., Levy, O., and Bowman, S. R. (2019). GLUE: A multi-task benchmark and analysis platform for natural language understanding. In Proceedings of ICLR.
Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., Davison, J., Shleifer, S., von Platen, P., Ma, C., Jernite, Y., Plu, J., Xu, C., Scao, T. L., Gugger, S., Drame, M., Lhoest, Q., and Rush, A. M. (2020). HuggingFace's Transformers: State-of-the-art natural language processing.
Zhao, J., Wang, T., Yatskar, M., Cotterell, R., Ordonez, V., and Chang, K.-W. (2019). Gender bias in contextualized word embeddings.
Zhao, J., Wang, T., Yatskar, M., Ordonez, V., and Chang, K.-W. (2018). Gender bias in coreference resolution: Evaluation and debiasing methods. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2.
ICPRAM 2023 - 12th International Conference on Pattern Recognition Applications and Methods