STREAM VOLUME PREDICTION IN TWITTER WITH ARTIFICIAL NEURAL NETWORKS

Gabriela Dominguez, Juan Zamora, Miguel Guevara, Héctor Allende, Rodrigo Salas

Abstract

Twitter is one of the most important social networks, and extracting useful information from it is of paramount importance to many application areas. Many works to date have tried to mine this information by exploiting the network structure, the language itself, or patterns in the words employed by users. However, a simple idea that might be useful for every challenging mining task, and that to our knowledge has not been tackled yet, consists of predicting the number of messages (the stream volume) that will be emitted in a specific time span. In this work, using almost 180k messages collected over a period of one week, we present a preliminary analysis of the temporal structure of the stream volume in Twitter. The expected contribution is a model based on artificial neural networks that predicts the number of posts in a specific time window, taking into account the past history and the daily behavior of the network in terms of the emission rate of the message stream.
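The abstract does not give the network architecture or the input encoding, so the following is only a minimal sketch of the general idea, not the authors' model. It assumes that message timestamps are binned into fixed-width windows, that the "past history" is a few lagged counts, and that the "daily behavior" is encoded as the sine and cosine of the position within the day. The function names (bin_counts, make_dataset, train_mlp), the hidden-layer size, and the learning rate are all hypothetical choices made for illustration.

```python
import numpy as np

def bin_counts(timestamps, bin_seconds, start, end):
    """Count messages per fixed-width time bin over [start, end)."""
    n_bins = int((end - start) // bin_seconds)
    idx = ((np.asarray(timestamps) - start) // bin_seconds).astype(int)
    idx = idx[(idx >= 0) & (idx < n_bins)]  # drop messages outside the range
    return np.bincount(idx, minlength=n_bins).astype(float)

def make_dataset(counts, n_lags, bins_per_day):
    """Inputs: previous n_lags counts plus sin/cos of the position in the day.
    Target: the count of the next bin. (Assumed encoding, not from the paper.)"""
    X, y = [], []
    for t in range(n_lags, len(counts)):
        phase = 2 * np.pi * (t % bins_per_day) / bins_per_day
        X.append(np.concatenate([counts[t - n_lags:t],
                                 [np.sin(phase), np.cos(phase)]]))
        y.append(counts[t])
    return np.array(X), np.array(y)

def train_mlp(X, y, hidden=8, epochs=2000, lr=0.05, seed=0):
    """One-hidden-layer tanh network fitted by full-batch gradient descent
    on the mean squared error. Inputs and targets should be scaled first."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0, 0.5, (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0, 0.5, hidden)
    b2 = 0.0
    n = len(y)
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)             # hidden activations
        err = H @ W2 + b2 - y                # prediction error per sample
        dH = np.outer(err, W2) * (1 - H**2)  # backpropagate through tanh
        W2 -= lr * (H.T @ err) / n
        b2 -= lr * err.mean()
        W1 -= lr * (X.T @ dH) / n
        b1 -= lr * dH.mean(axis=0)
    return lambda Z: np.tanh(Z @ W1 + b1) @ W2 + b2

# Usage on synthetic hourly counts with a daily cycle (not real Twitter data).
t = np.arange(7 * 24)
counts = 100 + 50 * np.sin(2 * np.pi * t / 24)
X, y = make_dataset(counts / counts.max(), n_lags=3, bins_per_day=24)
predict = train_mlp(X, y)
next_volume = predict(X[-1:]) * counts.max()  # rescale back to message counts
```

Dividing the counts by their maximum keeps the inputs in a range where the tanh units do not saturate; any other normalization would serve the same purpose.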


Paper Citation


in Harvard Style

Dominguez G., Zamora J., Guevara M., Allende H. and Salas R. (2012). STREAM VOLUME PREDICTION IN TWITTER WITH ARTIFICIAL NEURAL NETWORKS. In Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 2: ICPRAM, ISBN 978-989-8425-99-7, pages 488-493. DOI: 10.5220/0003837004880493


in Bibtex Style

@conference{icpram12,
author={Gabriela Dominguez and Juan Zamora and Miguel Guevara and Héctor Allende and Rodrigo Salas},
title={STREAM VOLUME PREDICTION IN TWITTER WITH ARTIFICIAL NEURAL NETWORKS},
booktitle={Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 2: ICPRAM},
year={2012},
pages={488-493},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003837004880493},
isbn={978-989-8425-99-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 2: ICPRAM
TI - STREAM VOLUME PREDICTION IN TWITTER WITH ARTIFICIAL NEURAL NETWORKS
SN - 978-989-8425-99-7
AU - Dominguez G.
AU - Zamora J.
AU - Guevara M.
AU - Allende H.
AU - Salas R.
PY - 2012
SP - 488
EP - 493
DO - 10.5220/0003837004880493