A Graph-based Disambiguation Approach for Construction of an Expert Repository from Public Online Sources

Anna Hristoskova, Elena Tsiporkova, Tom Tourwé, Simon Buelens, Mattias Putman, Filip De Turck

2013

Abstract

This paper describes a dynamic framework for constructing and maintaining an expert-finding repository through the continuous gathering and processing of online information. An initial set of online sources relevant to the topic of interest is identified and used to collect a first set of author profiles and publications. The extracted information then serves as a seed for enriching the expert profiles from other, potentially complementary, online data sources. The resulting expert repository is represented as a graph in which related author profiles are dynamically clustered through an author disambiguation process that continuously merges and splits author nodes. Several rules assign weights to the links in the graph based on author similarities such as name, affiliation, e-mail, co-authors, and interests. Dynamic clustering of the authors according to these weights identifies the unique experts for a specific domain. The disambiguation and author clustering algorithms are validated on several authors with varying name notations and show a 28% improvement in the identification of unique profiles compared to the results from DBLP.
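
The idea of weighting links by author similarity and then clustering can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration: the `Profile` schema, the rule weights in `WEIGHTS`, the merge threshold, and the name-matching heuristic are hypothetical and are not taken from the paper. The clustering step is also a deliberate simplification: it groups profiles by connected components over links whose weight reaches the threshold (via union-find), which stands in for, and is not, the dynamic merging and splitting of author nodes described above.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Profile:
    """One author record harvested from a single source (hypothetical schema)."""
    key: str
    name: str
    affiliation: str = ""
    email: str = ""
    coauthors: frozenset = frozenset()
    interests: frozenset = frozenset()


# Illustrative rule weights; assumed values, not the ones used in the paper.
WEIGHTS = {"name": 0.3, "affiliation": 0.15, "email": 0.35,
           "coauthors": 0.1, "interests": 0.1}


def jaccard(a, b):
    """Set overlap in [0, 1]; defined as 0 when either set is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def name_match(a, b):
    """Return 1.0 when surname and first initial agree (e.g. 'J. Doe' vs 'Jane Doe')."""
    pa = a.replace(".", " ").lower().split()
    pb = b.replace(".", " ").lower().split()
    if not pa or not pb:
        return 0.0
    return 1.0 if pa[-1] == pb[-1] and pa[0][0] == pb[0][0] else 0.0


def link_weight(p, q):
    """Sum the rule contributions for one pair of profiles."""
    score = WEIGHTS["name"] * name_match(p.name, q.name)
    if p.affiliation and p.affiliation.lower() == q.affiliation.lower():
        score += WEIGHTS["affiliation"]
    if p.email and p.email.lower() == q.email.lower():
        score += WEIGHTS["email"]
    score += WEIGHTS["coauthors"] * jaccard(p.coauthors, q.coauthors)
    score += WEIGHTS["interests"] * jaccard(p.interests, q.interests)
    return score


def cluster(profiles, threshold=0.5):
    """Group profile keys whose pairwise link weight reaches the threshold."""
    parent = {p.key: p.key for p in profiles}

    def find(k):
        # Union-find lookup with path halving.
        while parent[k] != k:
            parent[k] = parent[parent[k]]
            k = parent[k]
        return k

    for p, q in combinations(profiles, 2):
        if link_weight(p, q) >= threshold:
            parent[find(p.key)] = find(q.key)

    groups = {}
    for p in profiles:
        groups.setdefault(find(p.key), set()).add(p.key)
    return sorted(sorted(g) for g in groups.values())


if __name__ == "__main__":
    # Fictional records: two notations of one person and one namesake.
    records = [
        Profile("a", "Jane Doe", "Example University", "",
                frozenset({"R. Roe"}), frozenset({"graphs"})),
        Profile("b", "J. Doe", "Example University", "",
                frozenset({"R. Roe"}), frozenset({"graphs"})),
        Profile("c", "J. Doe", "Other Institute", "",
                frozenset({"M. Moe"}), frozenset({"chemistry"})),
    ]
    print(cluster(records))
```

In this sketch a name match alone is not enough to merge two profiles; supporting evidence such as a shared affiliation or co-author is needed to reach the threshold, which is one simple way to keep namesakes apart.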

Paper Citation


in Harvard Style

Hristoskova A., Tsiporkova E., Tourwé T., Buelens S., Putman M. and De Turck F. (2013). A Graph-based Disambiguation Approach for Construction of an Expert Repository from Public Online Sources. In Proceedings of the 5th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART, ISBN 978-989-8565-39-6, pages 24-33. DOI: 10.5220/0004192300240033


in Bibtex Style

@conference{icaart13,
author={Anna Hristoskova and Elena Tsiporkova and Tom Tourwé and Simon Buelens and Mattias Putman and Filip De Turck},
title={A Graph-based Disambiguation Approach for Construction of an Expert Repository from Public Online Sources},
booktitle={Proceedings of the 5th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART},
year={2013},
pages={24-33},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004192300240033},
isbn={978-989-8565-39-6},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 5th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART
TI - A Graph-based Disambiguation Approach for Construction of an Expert Repository from Public Online Sources
SN - 978-989-8565-39-6
AU - Hristoskova A.
AU - Tsiporkova E.
AU - Tourwé T.
AU - Buelens S.
AU - Putman M.
AU - De Turck F.
PY - 2013
SP - 24
EP - 33
DO - 10.5220/0004192300240033