Analyzing Social Media Discourse - An Approach using Semi-supervised Learning

Álvaro Figueira, Luciana Oliveira

2016

Abstract

The abilities to handle large amounts of unstructured information, to optimize strategic business opportunities, and to identify fundamental lessons among competitors through benchmarking are essential skills in every business sector. Currently, dozens of social media analytics applications aim to provide organizations with tools for informed decision making. However, these applications rely on quantitative information rather than on qualitative information that is relevant and intelligible to managers. To address this gap, we propose a semi-supervised learning procedure that discovers and compiles information taken from online social media, organizing it in a scheme that can be strategically relevant. We illustrate our procedure with a case study in which we collected and analyzed the social media discourse of 43 organizations operating in the Higher Public Polytechnic Education Sector. During the analysis we created an “editorial model” that characterizes the posts in the area. We describe in detail the training and the execution of an ensemble of classifying algorithms. In this study we focus on the techniques used to increase the accuracy and stability of the classifiers.


Paper Citation


in Harvard Style

Figueira Á. and Oliveira L. (2016). Analyzing Social Media Discourse - An Approach using Semi-supervised Learning. In Proceedings of the 12th International Conference on Web Information Systems and Technologies - Volume 2: WEBIST, ISBN 978-989-758-186-1, pages 188-195. DOI: 10.5220/0005786601880195


in Bibtex Style

@conference{webist16,
author={Álvaro Figueira and Luciana Oliveira},
title={Analyzing Social Media Discourse - An Approach using Semi-supervised Learning},
booktitle={Proceedings of the 12th International Conference on Web Information Systems and Technologies - Volume 2: WEBIST},
year={2016},
pages={188-195},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005786601880195},
isbn={978-989-758-186-1},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 12th International Conference on Web Information Systems and Technologies - Volume 2: WEBIST
TI - Analyzing Social Media Discourse - An Approach using Semi-supervised Learning
SN - 978-989-758-186-1
AU - Figueira Á.
AU - Oliveira L.
PY - 2016
SP - 188
EP - 195
DO - 10.5220/0005786601880195