on other NLP tasks, such as Named Entity Recognition
(NER) (Aguilar et al., 2017). Furthermore, the
presented work did not investigate file paths,
which could be pivotal evidence when the file name
is meaningless, such as a name made up of
numbers or random characters (e.g., kf3kfk3985.png).
Also, the metadata of the file, such as its header, size,
and extension, could provide further clues to predict its
class correctly.
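As an illustration of the kind of additional evidence mentioned above, the following minimal sketch collects tokens from the parent directories of a file together with its extension, size, and leading header bytes. It is not part of the presented work: the function name `path_and_metadata_features`, its parameters, the tokenisation rule, and the example path are all assumptions made for illustration only.

```python
import re
from pathlib import PurePosixPath


def path_and_metadata_features(path, size_bytes=None, header=b""):
    """Hypothetical feature extractor (illustrative only, not the paper's method).

    path       -- full path of the file as a POSIX-style string
    size_bytes -- file size, if known
    header     -- first bytes of the file content, if available
    """
    p = PurePosixPath(path)
    # Split every parent directory name on non-alphanumeric characters
    # and lowercase the pieces; empty pieces (e.g. from "/") are dropped.
    dir_tokens = [
        token.lower()
        for part in p.parent.parts
        for token in re.split(r"[^A-Za-z0-9]+", part)
        if token
    ]
    return {
        "dir_tokens": dir_tokens,
        "extension": p.suffix.lower().lstrip("."),
        "size_bytes": size_bytes,
        # Hex encoding of the first 8 header bytes (the file "magic number").
        "header_hex": header[:8].hex(),
    }


features = path_and_metadata_features(
    "/home/user/Holiday_Photos/kf3kfk3985.png",
    size_bytes=20480,
    header=b"\x89PNG\r\n\x1a\n",
)
```

Here the file name alone carries no meaning, while the directory tokens, the extension, and the header signature still yield usable signals that could be fed to a classifier alongside the file-name features.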
The assessment of transformer-based models,
such as BERT (Luo et al., 2018), RoBERTa (Liu et al.,
2019), and XLNet (Yang et al., 2019), for text classification
is part of our immediate future research, as they
have shown promising results on various NLP tasks.
This research has been funded with support from the
European Commission under the 4NSEEK project
with Grant Agreement 821966. This publication reflects
the views only of the author, and the European
Commission cannot be held responsible for any
use which may be made of the information contained