After completing the experiments on our training and
test data sets, we used our classifier to predict which
of 8,849 researchers with an h-index of two or less in
2011 would produce research impact in the coming
years. When we examined the results just four years
later (in 2015), we found that the researchers
predicted to be emerging had since become mature
researchers.
While this work provides a basic framework
for finding emerging authors, there is still plenty of
room for improvement. For example, we extract
the social features of a node from its immediate
neighbors only (one level deep). It would be
interesting to study the effect of extracting
features from nodes at distance two or more, which
would make fuller use of an author's academic social
network. Moreover, other than degree centrality, we
do not use any centrality measure of a node (such as
betweenness or eigenvector centrality) in the
coauthorship graph. Finally, we would like to see the
results of our algorithm on a more recent data set.
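The distance-two extension mentioned above can be illustrated with a
breadth-first search over the coauthorship graph. The following is only a
minimal sketch, not the implementation used in this work: the function names
neighbors_within and degree_centrality, the adjacency-set representation, and
the toy graph are all assumptions made for illustration.

```python
from collections import deque


def neighbors_within(graph, source, max_distance):
    """Return {author: hop distance} for authors within max_distance of source."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if distances[node] == max_distance:
            continue  # do not expand beyond the requested depth
        for coauthor in graph.get(node, ()):
            if coauthor not in distances:
                distances[coauthor] = distances[node] + 1
                queue.append(coauthor)
    del distances[source]  # the source is not its own neighbor
    return distances


def degree_centrality(graph, node):
    """Fraction of the other authors in the graph that node has coauthored with."""
    n = len(graph)
    return len(graph[node]) / (n - 1) if n > 1 else 0.0


# Hypothetical toy coauthorship graph (undirected, stored as adjacency sets).
coauthors = {
    "A": {"B", "C"},
    "B": {"A", "D"},
    "C": {"A"},
    "D": {"B", "E"},
    "E": {"D"},
}

one_hop = neighbors_within(coauthors, "A", 1)  # immediate coauthors only
two_hop = neighbors_within(coauthors, "A", 2)  # adds coauthors of coauthors
```

Social features could then be aggregated over two_hop instead of one_hop,
optionally weighting each author by hop distance. How such features should be
weighted is an open design choice that this sketch does not address.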
ACKNOWLEDGEMENTS
This research is partially supported by the NSF grant
number 0958123 - Collaborative Research:
CIADDO-EN: Semantic CiteSeerX.