Authors:
Nikos Kirtsis; Sofia Stamou; Paraskevi Tzekou and Nikos Zotos
Affiliation:
Patras University, Greece
Keyword(s):
Wikipedia, Information Uniqueness, Content Duplication, Quality Assessment.
Related Ontology Subjects/Areas/Topics:
Searching and Browsing; Social Information Systems; Society, e-Business and e-Government; Web Information Systems and Technologies; Web Interfaces and Applications
Abstract:
Wikipedia is one of the most successful worldwide collaborative efforts to put together user-generated content in a meaningfully organized and intuitive manner. Currently, Wikipedia hosts millions of articles on a variety of topics, supplied by thousands of contributors. A critical factor in Wikipedia’s success is its open nature, which enables everyone to edit, revise and/or question (via talk pages) the article contents. Considering the phenomenal growth of Wikipedia and the lack of a peer review process for its contents, it becomes evident that both editors and administrators have difficulty in validating its quality on a systematic and coordinated basis. This difficulty has motivated several research works on how to assess the quality of Wikipedia articles. In this paper, we propose a novel indicator of Wikipedia article quality, namely information uniqueness. In this respect, we describe a method that captures the information duplication across the article contents in an attempt to infer the amount of distinct information every article communicates. Our approach relies on the intuition that an article offering unique information about its subject is of better quality than an article that discusses issues already addressed in several other Wikipedia articles.