Social Network Analysis for Predicting Emerging Researchers

Syed Masum Billah, Susan Gauch


Finding rising stars in academia early in their careers has many implications for hiring new faculty, applying for promotion, and requesting grants. Typically, the impact and productivity of a researcher are assessed by a popular measure called the h-index, which grows linearly with the academic age of a researcher. As a result, the h-indices of researchers in the early stages of their careers are almost uniformly low, making it difficult to identify those who will, in the future, emerge as influential leaders in their field. To overcome this problem, we use social network analysis to identify the young researchers most likely to become successful, as measured by their h-index. We assume that the co-authorship graph reveals a great deal of information about the potential of young researchers. We built a social network of 62,886 researchers using the data available in CiteSeerx. We then designed and trained a linear SVM classifier to identify emerging authors based on their personal attributes and/or their networks of co-authors. We evaluated our classifier’s ability to predict the future research impact of a set of 26,170 young researchers, defined as those with an h-index of two or less in 2005. By examining their actual impact six years later, we demonstrate that the success of young researchers can be predicted more accurately from their professional network than from their established track records.
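To make the quantities in the abstract concrete, the following is a minimal sketch (not the authors' implementation) of how an h-index can be computed, how a co-authorship graph can be built from paper author lists, and how a cohort of young researchers (h-index of two or less) can be selected. The toy data, the function names, and the `mean_coauthor_h` feature are assumptions made for illustration only; the abstract does not list the features actually used by the classifier.

```python
from collections import defaultdict
from itertools import combinations


def h_index(citation_counts):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def author_h_indices(papers):
    """Compute the h-index of every author from (authors, citations) pairs."""
    cites = defaultdict(list)
    for authors, n_citations in papers:
        for author in authors:
            cites[author].append(n_citations)
    return {author: h_index(counts) for author, counts in cites.items()}


def coauthor_graph(papers):
    """Map each author to the set of people they have co-authored with."""
    graph = defaultdict(set)
    for authors, _ in papers:
        for author in authors:
            graph[author]  # ensure solo authors also appear as nodes
        for a, b in combinations(authors, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def mean_coauthor_h(author, graph, h):
    """Hypothetical network feature: average h-index of direct co-authors."""
    neighbours = graph[author]
    if not neighbours:
        return 0.0
    return sum(h[n] for n in neighbours) / len(neighbours)


# Hypothetical toy data: (author list, citation count) for each paper.
papers = [
    (["alice", "bob"], 10),
    (["alice", "bob"], 8),
    (["alice", "carol"], 5),
    (["alice"], 4),
    (["bob", "dave"], 3),
    (["carol", "dave"], 1),
]

h = author_h_indices(papers)
graph = coauthor_graph(papers)

# Young researchers in the sense of the abstract: h-index of two or less.
young = sorted(a for a, value in h.items() if value <= 2)

for author in young:
    print(author, h[author], mean_coauthor_h(author, graph, h))
```

In this toy example both young researchers have the same h-index, yet their co-author neighbourhoods differ, which is the kind of signal a network-based classifier could draw on when track records alone cannot separate candidates.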



Paper Citation

in Harvard Style

Masum Billah S. and Gauch S. (2015). Social Network Analysis for Predicting Emerging Researchers. In Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2015) ISBN 978-989-758-158-8, pages 27-35. DOI: 10.5220/0005593500270035
