Authors: Yaguang Liu (1), Lisa Singh (1) and Zeina Mneimneh (2)

Affiliations: (1) Department of Computer Science, Georgetown University, 3700 O St., NW, Washington, DC, U.S.A.; (2) Survey Research Center, University of Michigan, 426 Thompson Street, Ann Arbor, Michigan, U.S.A.

Keyword(s): Demographic Inference, Siamese Network, BERT, Deep Learning.

Abstract: In order for social scientists to use social media as a source for understanding human behavior and public opinion, they need to understand the demographic characteristics of the population participating in the conversation. What proportion are female? What proportion are young? While previous literature has investigated this problem, this work presents a larger scale study that investigates inference techniques for predicting age and gender using Twitter data. We consider classic text features used in previous work and introduce new ones. Then we use a range of learning approaches, from classic machine learning models to deep learning ones, to understand the role of different language representations for demographic inference. On a data set created from Wikidata, we compare the value of different feature sets with different algorithms. In general, we find that classic models using statistical features and unigrams perform well. Neural networks also perform well, particularly models using sentence embeddings, e.g. a Siamese network configuration with attention to tweets and user biographies. The differences are marginal for age, but more significant for gender. In other words, it is reasonable to use simpler, interpretable models for some demographic inference tasks (like age). However, using a richer language model is important for gender, highlighting the varying role language plays for demographic inference on social media.

CC BY-NC-ND 4.0

Paper citation in several formats:
Liu, Y.; Singh, L. and Mneimneh, Z. (2021). A Comparative Analysis of Classic and Deep Learning Models for Inferring Gender and Age of Twitter Users. In Proceedings of the 2nd International Conference on Deep Learning Theory and Applications - DeLTA; ISBN 978-989-758-526-5; ISSN 2184-9277, SciTePress, pages 48-58. DOI: 10.5220/0010559500480058

@conference{delta21,
author={Yaguang Liu and Lisa Singh and Zeina Mneimneh},
title={A Comparative Analysis of Classic and Deep Learning Models for Inferring Gender and Age of Twitter Users},
booktitle={Proceedings of the 2nd International Conference on Deep Learning Theory and Applications - DeLTA},
year={2021},
pages={48-58},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010559500480058},
isbn={978-989-758-526-5},
issn={2184-9277},
}

TY - CONF
JO - Proceedings of the 2nd International Conference on Deep Learning Theory and Applications - DeLTA
TI - A Comparative Analysis of Classic and Deep Learning Models for Inferring Gender and Age of Twitter Users
SN - 978-989-758-526-5
IS - 2184-9277
AU - Liu, Y.
AU - Singh, L.
AU - Mneimneh, Z.
PY - 2021
SP - 48
EP - 58
DO - 10.5220/0010559500480058
PB - SciTePress