Authors:
Rahma Basti 1; Salma Jamoussi 2; Anis Charfi 3 and Abdelmajid Ben Hamadou 4
Affiliations:
1 Multimedia InfoRmation Systems and Advanced Computing Laboratory (MIRACL), University of Sfax, Tunis, Tunisia
2 Higher Institute of Computer Science and Multimedia of Sfax, 1173 Sfax 3038, Tunisia
3 Carnegie Mellon University Qatar, Qatar
4 Digital Research Center of Sfax (DRCS), Tunisia
Keyword(s):
Author Profiling, Arabic Text Processing, Age and Gender Prediction, Dangerous Profiles, Stylometric Features.
Related Ontology Subjects/Areas/Topics:
Social Media Analytics; Society, e-Business and e-Government; Web Information Systems and Technologies
Abstract:
Recent years have seen rapid growth in social networking and micro-blogging sites such as Twitter. On these sites, users share a variety of data, including personal details, interests, and opinions. However, the shared data is not always truthful. Social media users often hide behind fake profiles and may use them to spread rumors or threaten others. To address this, different methods and techniques have been proposed for user profiling. In this article, we use machine learning to predict the age and gender of the user behind a profile and to assess whether the profile is dangerous, based on the user's tweets and features. Our approach uses several types of stylistic features: character-based, word-based, and syntax-based. Moreover, the user's topics of interest are included in the profiling task. We obtained the best accuracy with SVM: 73.49% for age, 83.7% for gender, and 88.7% for dangerous profile detection.
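To make the three stylistic feature families named in the abstract concrete, the following is a minimal sketch of how character-based, word-based, and syntax-based features might be computed for a single tweet. The function name `stylometric_features`, the specific features chosen, and the use of words per sentence as a crude syntax proxy are all assumptions made for illustration; the abstract does not list the actual feature set used in the article.

```python
import re
import string

# Assumption: ASCII punctuation plus common Arabic marks (comma, semicolon,
# question mark). The article's actual punctuation inventory is not given.
PUNCTUATION = set(string.punctuation) | set("،؛؟")


def stylometric_features(text: str) -> dict:
    """Compute a few illustrative stylometric features for one tweet.

    Hypothetical helper: the feature names and definitions are assumptions,
    not the feature set described in the article.
    """
    words = text.split()
    n_words = len(words)

    # Character-based features: counts over raw characters.
    char_count = len(text)
    digit_count = sum(ch.isdigit() for ch in text)
    punct_count = sum(ch in PUNCTUATION for ch in text)

    # Word-based features: counts and averages over whitespace tokens.
    avg_word_len = sum(len(w) for w in words) / n_words if n_words else 0.0
    hashtag_count = sum(w.startswith("#") for w in words)
    mention_count = sum(w.startswith("@") for w in words)

    # Syntax-based feature (crude proxy): words per sentence, where sentences
    # are split on terminal punctuation, including the Arabic question mark.
    sentences = [s for s in re.split(r"[.!?؟]+", text) if s.strip()]
    words_per_sentence = n_words / len(sentences) if sentences else 0.0

    return {
        "char_count": char_count,
        "digit_count": digit_count,
        "punct_count": punct_count,
        "word_count": n_words,
        "avg_word_len": avg_word_len,
        "hashtag_count": hashtag_count,
        "mention_count": mention_count,
        "words_per_sentence": words_per_sentence,
    }


if __name__ == "__main__":
    print(stylometric_features("Hello @bob, see #news 2024!"))
```

Feature dictionaries of this kind could then be converted to numeric vectors and passed to a classifier such as an SVM, the model family the abstract reports as giving the best accuracy.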