Quality of Wikipedia Articles: Analyzing Features and Building a Ground Truth for Supervised Classification

Elias Bassani, Marco Viviani

2019

Abstract

Wikipedia is nowadays one of the biggest online resources on which users rely as a source of information. The amount of collaboratively generated content that is submitted to the online encyclopedia every day can lead to the creation of low-quality articles (and, consequently, misinformation) if it is not properly monitored and revised. For this reason, this paper considers the problem of automatically assessing the quality of Wikipedia articles. In particular, the focus is (i) on the analysis of groups of hand-crafted features that supervised machine learning techniques can employ to classify Wikipedia articles by quality, and (ii) on the analysis of some issues behind the construction of a suitable ground truth. The analyzed features are evaluated on a specifically built labeled dataset by implementing different supervised classifiers based on distinct machine learning algorithms, which produced promising results.

Paper Citation


in Harvard Style

Bassani E. and Viviani M. (2019). Quality of Wikipedia Articles: Analyzing Features and Building a Ground Truth for Supervised Classification. In Proceedings of the 11th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2019) - Volume 1: KDIR; ISBN 978-989-758-382-7, SciTePress, pages 338-346. DOI: 10.5220/0008149303380346


in Bibtex Style

@conference{kdir19,
author={Elias Bassani and Marco Viviani},
title={Quality of Wikipedia Articles: Analyzing Features and Building a Ground Truth for Supervised Classification},
booktitle={Proceedings of the 11th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2019) - Volume 1: KDIR},
year={2019},
pages={338-346},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0008149303380346},
isbn={978-989-758-382-7},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 11th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2019) - Volume 1: KDIR
TI - Quality of Wikipedia Articles: Analyzing Features and Building a Ground Truth for Supervised Classification
SN - 978-989-758-382-7
AU - Bassani E.
AU - Viviani M.
PY - 2019
SP - 338
EP - 346
DO - 10.5220/0008149303380346
PB - SciTePress