A CASE STUDY - Classification of Stock Exchange News by Support Vector Machines

P. Kroha, K. Kröber, R. Janetzko

Abstract

In this paper, we present a case study on classifying text messages with Support Vector Machines. We collected about 700,000 news items and formulated the hypothesis that negative messages are in the majority when markets are falling and positive messages are in the majority when markets are rising. The hypothesis rests on the assumption that investors' behavior is news-driven. To test it, we needed to classify the market news. We describe how Support Vector Machines were applied for this purpose and report on our experiments. We found that the news classification shows some correlation with long-term market trends.
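The hypothesis stated in the abstract can be illustrated with a minimal sketch. It assumes the news has already been labeled as positive or negative by some classifier and grouped into time periods, each paired with a market return. The function names, the data layout, and the toy values are hypothetical and are not taken from the paper; the sketch only counts how often the majority label of a period agrees with the direction of the market.

```python
from collections import Counter


def majority_sentiment(labels):
    """Return 'positive' or 'negative' by simple majority, or None on a tie."""
    counts = Counter(labels)
    pos, neg = counts.get("positive", 0), counts.get("negative", 0)
    if pos == neg:
        return None
    return "positive" if pos > neg else "negative"


def hypothesis_agreement(periods):
    """Fraction of periods whose majority label matches the market direction.

    `periods` is a list of (market_return, labels) pairs. This layout is an
    illustrative assumption, not the data format used in the paper.
    """
    hits = total = 0
    for market_return, labels in periods:
        majority = majority_sentiment(labels)
        if majority is None or market_return == 0:
            continue  # skip ties and flat markets
        expected = "positive" if market_return > 0 else "negative"
        hits += majority == expected
        total += 1
    return hits / total if total else float("nan")


# Toy data (invented for illustration only).
toy_periods = [
    (+0.02, ["positive", "positive", "negative"]),
    (-0.01, ["negative", "negative", "positive"]),
    (+0.03, ["negative", "negative", "positive"]),
    (-0.02, ["positive", "negative"]),  # tie, skipped
]
agreement = hypothesis_agreement(toy_periods)
print(f"agreement: {agreement:.2f}")
```

A value well above 0.5 on real data would be consistent with the hypothesis, while a value near 0.5 would not support it. The choice of period length and the handling of ties are design decisions left open here.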



Paper Citation


in Harvard Style

Kroha P., Kröber K. and Janetzko R. (2010). A CASE STUDY - Classification of Stock Exchange News by Support Vector Machines. In Proceedings of the 5th International Conference on Software and Data Technologies - Volume 1: DMIA, (ICSOFT 2010), ISBN 978-989-8425-22-5, pages 331-336. DOI: 10.5220/0003043403310336


in Bibtex Style

@conference{dmia10,
author={P. Kroha and K. Kröber and R. Janetzko},
title={A CASE STUDY - Classification of Stock Exchange News by Support Vector Machines},
booktitle={Proceedings of the 5th International Conference on Software and Data Technologies - Volume 1: DMIA, (ICSOFT 2010)},
year={2010},
pages={331-336},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003043403310336},
isbn={978-989-8425-22-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 5th International Conference on Software and Data Technologies - Volume 1: DMIA, (ICSOFT 2010)
TI - A CASE STUDY - Classification of Stock Exchange News by Support Vector Machines
SN - 978-989-8425-22-5
AU - Kroha P.
AU - Kröber K.
AU - Janetzko R.
PY - 2010
SP - 331
EP - 336
DO - 10.5220/0003043403310336