Lauri Lahti


We propose a new semi-automated method for generating personalized learning paths from the Wikipedia online encyclopedia by following chains of inter-article hyperlinks, ranked according to statistics retrieved for the articles. Alternative perspectives on a learning topic emerge when the next hyperlink to follow is selected based on the hierarchy of hyperlinks, the repetition of hyperlink terms, article size, viewing rate, editing rate, or a user-defined weighted mixture of all of these. We have implemented the method in a prototype that enables the learner to build concept maps independently, according to her own needs and judgment. A list of related concepts is shown in the desired ranking order to label new nodes (the titles of the target articles of the current hyperlinks), accompanied by explanation phrases parsed from the sentences surrounding each hyperlink to label the directed arcs connecting the nodes. In experiments, the alternative ranking schemes supported various learning needs well, suggesting new pedagogical networking practices.
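The weighted-mixture ranking mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the prototype's implementation: the `LinkStats` record, the min-max normalization, the linear combination of scores, and all numbers are assumptions made purely for illustration, and each statistic is assumed to be scaled so that a larger value means a more preferred hyperlink.

```python
from dataclasses import dataclass

# Hypothetical record for one candidate hyperlink; the field names are
# illustrative labels for the five statistics named in the abstract.
@dataclass
class LinkStats:
    title: str
    hierarchy: float   # assumed: larger = more prominent in the hierarchy
    repetition: float  # how often the hyperlink term repeats
    size: float        # size of the target article
    views: float       # viewing rate of the target article
    edits: float       # editing rate of the target article

CRITERIA = ("hierarchy", "repetition", "size", "views", "edits")

def normalize(values):
    """Min-max scale values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def rank_links(links, weights):
    """Return (title, score) pairs sorted by a weighted mixture of criteria.

    `weights` maps criterion names to user-defined weights; criteria that
    are missing from the mapping are ignored (weight 0).
    """
    if not links:
        return []
    scores = [0.0] * len(links)
    for criterion in CRITERIA:
        weight = weights.get(criterion, 0.0)
        if weight == 0.0:
            continue
        column = normalize([getattr(link, criterion) for link in links])
        for i, value in enumerate(column):
            scores[i] += weight * value
    # Highest score first; ties are broken alphabetically by title.
    order = sorted(range(len(links)), key=lambda i: (-scores[i], links[i].title))
    return [(links[i].title, scores[i]) for i in order]

# Made-up statistics for three candidate hyperlinks (not real Wikipedia data).
links = [
    LinkStats("A", hierarchy=3, repetition=5, size=1000, views=50, edits=2),
    LinkStats("B", hierarchy=1, repetition=2, size=4000, views=200, edits=10),
    LinkStats("C", hierarchy=2, repetition=9, size=2500, views=120, edits=4),
]

by_views = rank_links(links, {"views": 1.0})
mixed = rank_links(links, {"views": 0.5, "repetition": 0.5})
```

Ranking by a single criterion (for example `{"views": 1.0}`) and ranking by a mixture can put different articles first, which is how alternative rankings could offer different perspectives on the same topic. A learner-facing tool would presumably take the top entries of the returned list as candidate next nodes.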



Paper Citation

in Harvard Style

Lahti L. (2010). PERSONALIZED LEARNING PATHS BASED ON WIKIPEDIA ARTICLE STATISTICS. In Proceedings of the 2nd International Conference on Computer Supported Education - Volume 1: CSEDU, ISBN 978-989-674-023-8, pages 110-120. DOI: 10.5220/0002800901100120

in Bibtex Style

author={Lauri Lahti},
booktitle={Proceedings of the 2nd International Conference on Computer Supported Education - Volume 1: CSEDU},

in EndNote Style

JO - Proceedings of the 2nd International Conference on Computer Supported Education - Volume 1: CSEDU
SN - 978-989-674-023-8
AU - Lahti L.
PY - 2010
SP - 110
EP - 120
DO - 10.5220/0002800901100120