Experimental Analysis of the Relevance of Features and Effects on Gender Classification Models for Social Media Author Profiling

Paloma Piot–Perez-Abadin, Patricia Martin–Rodilla, Javier Parapar

2021

Abstract

Automatic user profiling from social networks has become a popular task because of its commercial applications (targeted advertising, market studies, and so on). Automatic profiling models infer demographic characteristics of social network users from the content they generate or from their interactions. Users' demographic information is also valuable for tasks of greater social concern, such as the automatic early detection of mental disorders. For this type of user analysis, the way users employ language has been shown to be an important indicator that contributes to the effectiveness of the models. We therefore consider that, for identifying aspects such as gender, age, or a user's origin, it is worth examining language use through both psycho-linguistic and semantic features. A good selection of features is vital for the performance of retrieval, classification, and decision-making software systems. In this paper, we address gender classification as part of the automatic profiling task. We present an experimental analysis of the performance of existing gender classification models based on an external corpus and baselines for automatic profiling. We analyse in depth the influence of linguistic features on the classification accuracy of the model. Based on that analysis, we have put together a feature set for gender classification models in social networks whose accuracy is above existing baselines.

Paper Citation


in Harvard Style

Piot–Perez-Abadin P., Martin–Rodilla P. and Parapar J. (2021). Experimental Analysis of the Relevance of Features and Effects on Gender Classification Models for Social Media Author Profiling. In Proceedings of the 16th International Conference on Evaluation of Novel Approaches to Software Engineering - Volume 1: ENASE, ISBN 978-989-758-508-1, pages 103-113. DOI: 10.5220/0010431901030113


in Bibtex Style

@conference{enase21,
author={Paloma Piot–Perez-Abadin and Patricia Martin–Rodilla and Javier Parapar},
title={Experimental Analysis of the Relevance of Features and Effects on Gender Classification Models for Social Media Author Profiling},
booktitle={Proceedings of the 16th International Conference on Evaluation of Novel Approaches to Software Engineering - Volume 1: ENASE},
year={2021},
pages={103-113},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010431901030113},
isbn={978-989-758-508-1},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 16th International Conference on Evaluation of Novel Approaches to Software Engineering - Volume 1: ENASE
TI - Experimental Analysis of the Relevance of Features and Effects on Gender Classification Models for Social Media Author Profiling
SN - 978-989-758-508-1
AU - Piot–Perez-Abadin P.
AU - Martin–Rodilla P.
AU - Parapar J.
PY - 2021
SP - 103
EP - 113
DO - 10.5220/0010431901030113