INFORMATION UNIQUENESS IN WIKIPEDIA ARTICLES

Nikos Kirtsis, Sofia Stamou, Paraskevi Tzekou, Nikos Zotos

Abstract

Wikipedia is one of the most successful worldwide collaborative efforts to put together user-generated content in a meaningfully organized and intuitive manner. Currently, Wikipedia hosts millions of articles on a variety of topics, supplied by thousands of contributors. A critical factor in Wikipedia’s success is its open nature, which enables everyone to edit, revise and/or question (via talk pages) the article contents. Given the phenomenal growth of Wikipedia and the lack of a peer review process for its contents, both editors and administrators have difficulty validating its quality on a systematic and coordinated basis. This difficulty has motivated several research works on how to assess the quality of Wikipedia articles. In this paper, we propose a novel indicator of Wikipedia article quality, namely information uniqueness. To this end, we describe a method that captures the information duplication across article contents in an attempt to infer the amount of distinct information every article communicates. Our approach relies on the intuition that an article offering unique information about its subject is of better quality than an article that discusses issues already addressed in several other Wikipedia articles.


Paper Citation


in Harvard Style

Kirtsis N., Stamou S., Tzekou P. and Zotos N. (2010). INFORMATION UNIQUENESS IN WIKIPEDIA ARTICLES. In Proceedings of the 6th International Conference on Web Information Systems and Technology - Volume 2: WEBIST, ISBN 978-989-674-025-2, pages 137-143. DOI: 10.5220/0002841401370143


in Bibtex Style

@conference{webist10,
author={Nikos Kirtsis and Sofia Stamou and Paraskevi Tzekou and Nikos Zotos},
title={INFORMATION UNIQUENESS IN WIKIPEDIA ARTICLES},
booktitle={Proceedings of the 6th International Conference on Web Information Systems and Technology - Volume 2: WEBIST},
year={2010},
pages={137-143},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002841401370143},
isbn={978-989-674-025-2},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 6th International Conference on Web Information Systems and Technology - Volume 2: WEBIST
TI - INFORMATION UNIQUENESS IN WIKIPEDIA ARTICLES
SN - 978-989-674-025-2
AU - Kirtsis N.
AU - Stamou S.
AU - Tzekou P.
AU - Zotos N.
PY - 2010
SP - 137
EP - 143
DO - 10.5220/0002841401370143