Comparing Deep Learning Models for Multi-label Classification of Arabic Abusive Texts in Social Media
Salma Azzi, Chiraz Zribi
2022
Abstract
Countering abusive text on social networks is gradually becoming a mainstream NLP research topic. However, work on detecting its specific forms is still scarce. Most automatic solutions cast the problem as two-class or three-class classification, without taking its variety of aspects into account. In Arabic specifically, one of the most widely spoken languages, abusive social media texts are written in a mix of different dialects, which further complicates detection. The goal of this research is to detect eight specific forms of abusive language on Arabic social platforms, namely Racism, Sexism, Xenophobia, Violence, Hate, Pornography, Religious hatred, and LGBTQ Hate. In our experiments, we evaluated CNN, BiLSTM, and BiGRU deep neural networks with pre-trained Arabic word embeddings (AraVec). We also investigated the Bidirectional Encoder Representations from Transformers (BERT) model with its own tokenizer. Results show that the DNN classifiers achieved nearly the same performance, with an overall average precision of 85%. Although all the deep learning models obtained very close results, BERT slightly outperformed the others, with a precision of 90% and a micro-averaged F1 score of 79%.
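For readers unfamiliar with the metric, the micro-averaged F1 score reported above pools true positives, false positives, and false negatives over every (text, label) pair before computing precision and recall. The following is a minimal sketch of that standard definition, not the authors' evaluation code. The function name `micro_scores` and the toy label matrices are assumptions made up for illustration and are not drawn from the paper's dataset.

```python
from typing import Sequence, Tuple

# Each row is one text; each of the 8 columns is one abuse label,
# in the order the abstract lists them. 1 = label applies, 0 = it does not.
# These rows are invented toy data, not the paper's corpus.
Y_TRUE = [
    [1, 0, 0, 0, 1, 0, 0, 0],
    [0, 1, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 1, 0],
]
Y_PRED = [
    [1, 0, 0, 0, 0, 0, 0, 0],  # misses the fifth label (one false negative)
    [0, 1, 0, 1, 0, 0, 0, 0],  # fully correct
    [0, 0, 0, 0, 1, 0, 1, 0],  # adds a spurious fifth label (one false positive)
]


def micro_scores(
    y_true: Sequence[Sequence[int]], y_pred: Sequence[Sequence[int]]
) -> Tuple[float, float, float]:
    """Return (precision, recall, F1), micro-averaged over all labels."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if p == 1 and t == 1:
                tp += 1
            elif p == 1 and t == 0:
                fp += 1
            elif p == 0 and t == 1:
                fn += 1
    # Guard against division by zero when nothing is predicted or nothing is true.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return precision, recall, f1


if __name__ == "__main__":
    p, r, f1 = micro_scores(Y_TRUE, Y_PRED)
    print(f"precision={p:.2f} recall={r:.2f} micro-F1={f1:.2f}")
```

Because the counts are pooled before dividing, frequent labels weigh more heavily in the micro-average than rare ones, which is why precision and micro-F1 can differ noticeably on imbalanced multi-label data.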
Paper Citation
in Harvard Style
Azzi S. and Zribi C. (2022). Comparing Deep Learning Models for Multi-label Classification of Arabic Abusive Texts in Social Media. In Proceedings of the 17th International Conference on Software Technologies - Volume 1: ICSOFT, ISBN 978-989-758-588-3, pages 374-381. DOI: 10.5220/0011141700003266
in Bibtex Style
@conference{icsoft22,
author={Salma Azzi and Chiraz Zribi},
title={Comparing Deep Learning Models for Multi-label Classification of Arabic Abusive Texts in Social Media},
booktitle={Proceedings of the 17th International Conference on Software Technologies - Volume 1: ICSOFT},
year={2022},
pages={374-381},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011141700003266},
isbn={978-989-758-588-3},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 17th International Conference on Software Technologies - Volume 1: ICSOFT
TI - Comparing Deep Learning Models for Multi-label Classification of Arabic Abusive Texts in Social Media
SN - 978-989-758-588-3
AU - Azzi S.
AU - Zribi C.
PY - 2022
SP - 374
EP - 381
DO - 10.5220/0011141700003266