WIKIPEDIA AS DOMAIN KNOWLEDGE NETWORKS - Domain Extraction and Statistical Measurement

Zheng Fang, Jie Wang, Benyuan Liu, Weibo Gong

Abstract

This paper investigates knowledge networks of specific domains extracted from Wikipedia and performs statistical measurements on selected domains. In particular, we first present an efficient method for extracting a specific domain knowledge network from Wikipedia. We then extract four domain networks covering mathematics, physics, biology, and chemistry, respectively. We compare the mathematics domain network extracted from Wikipedia with MathWorld, the web’s most extensive mathematical resource, created and maintained by professional mathematicians, and show that the two are statistically similar. This indicates that MathWorld and Wikipedia’s mathematics domain knowledge share a similar internal structure. Such information may be useful for investigating knowledge networks.



Paper Citation


in Harvard Style

Fang Z., Wang J., Liu B. and Gong W. (2011). WIKIPEDIA AS DOMAIN KNOWLEDGE NETWORKS - Domain Extraction and Statistical Measurement. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2011) ISBN 978-989-8425-79-9, pages 151-157. DOI: 10.5220/0003633001590165


in Bibtex Style

@conference{kdir11,
author={Zheng Fang and Jie Wang and Benyuan Liu and Weibo Gong},
title={WIKIPEDIA AS DOMAIN KNOWLEDGE NETWORKS - Domain Extraction and Statistical Measurement},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2011)},
year={2011},
pages={151-157},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003633001590165},
isbn={978-989-8425-79-9},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2011)
TI - WIKIPEDIA AS DOMAIN KNOWLEDGE NETWORKS - Domain Extraction and Statistical Measurement
SN - 978-989-8425-79-9
AU - Fang Z.
AU - Wang J.
AU - Liu B.
AU - Gong W.
PY - 2011
SP - 151
EP - 157
DO - 10.5220/0003633001590165