MEASURING TWITTER USER SIMILARITY AS A FUNCTION OF STRENGTH OF TIES
John Conroy, Josephine Griffith, Colm O’Riordan
2011
Abstract
Users of online social networks reside in social graphs, where any given pair of users may be connected or unconnected. These connections may be formal or inferred social links, and may be binary or weighted. We might expect users who are connected by a social tie to be more similar in what they write than unconnected users, and more strongly connected pairs to be more similar still than less strongly connected pairs, but this has never been formally tested. This work describes a method for calculating the similarity between Twitter social entities based on what they have written, and then examines the similarity between pairs of Twitter users as a function of how tightly connected they are. We show that the similarity between pairs of Twitter users is indeed positively correlated with the strength of the tie between them.
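The abstract does not specify the similarity measure or how tie strength is quantified, so the following is only an illustrative sketch and not the authors' method. It assumes a TF-IDF vector space representation of each user's text, cosine similarity between user vectors, and a Pearson correlation against a numeric tie strength. The user names, texts, and tie-strength values are hypothetical toy data.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Map each user's text to a sparse TF-IDF vector (term -> weight).

    Assumption: one "document" per user, made of all of that user's text,
    tokenized by lowercasing and splitting on whitespace.
    """
    tokenized = {user: text.lower().split() for user, text in docs.items()}
    n_docs = len(tokenized)
    # Document frequency: number of users whose text contains each term.
    df = Counter(t for tokens in tokenized.values() for t in set(tokens))
    vectors = {}
    for user, tokens in tokenized.items():
        tf = Counter(tokens)
        vectors[user] = {t: c * math.log(n_docs / df[t]) for t, c in tf.items()}
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


# Hypothetical toy data: concatenated text per user.
docs = {
    "alice": "football match tonight great goal",
    "bob": "great football goal tonight",
    "carol": "football recipe pasta dinner",
    "dave": "pasta dinner recipe wine",
}
# Hypothetical tie strengths, e.g. a count of messages exchanged per pair.
tie_strength = {("alice", "bob"): 5, ("alice", "carol"): 1, ("alice", "dave"): 0}

vectors = tfidf_vectors(docs)
pairs = list(tie_strength)
sims = [cosine(vectors[a], vectors[b]) for a, b in pairs]
strengths = [tie_strength[p] for p in pairs]
r = pearson(strengths, sims)
print(f"correlation between tie strength and similarity: {r:.3f}")
```

On real data, the pair list would cover many sampled user pairs across several tie-strength levels, and a rank correlation might be preferable if tie strength is ordinal.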
Paper Citation
in Harvard Style
Conroy J., Griffith J. and O’Riordan C. (2011). MEASURING TWITTER USER SIMILARITY AS A FUNCTION OF STRENGTH OF TIES. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2011) ISBN 978-989-8425-79-9, pages 254-262. DOI: 10.5220/0003661902620270
in Bibtex Style
@conference{kdir11,
author={John Conroy and Josephine Griffith and Colm O’Riordan},
title={MEASURING TWITTER USER SIMILARITY AS A FUNCTION OF STRENGTH OF TIES},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2011)},
year={2011},
pages={254-262},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003661902620270},
isbn={978-989-8425-79-9},
}
in EndNote Style
TY - CONF
JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2011)
TI - MEASURING TWITTER USER SIMILARITY AS A FUNCTION OF STRENGTH OF TIES
SN - 978-989-8425-79-9
AU - Conroy J.
AU - Griffith J.
AU - O’Riordan C.
PY - 2011
SP - 254
EP - 262
DO - 10.5220/0003661902620270