IMPACT OF FEATURE SELECTION AND FEATURE TYPES ON FINANCIAL STOCK PRICE PREDICTION

Michael Hagenau, Michael Liebmann, Dirk Neumann

Abstract

In this paper, we examine whether stock price effects can be predicted automatically by analyzing unstructured textual information in financial news. To this end, we enhance existing text mining methods to evaluate the information content of financial news as an instrument for investment decisions. The main contribution of this paper is the use of more expressive features to represent text, obtained by employing market feedback as part of our word selection process. In a comprehensive benchmark, we show that robust feature selection, when combined with complex feature types, lifts classification accuracy significantly above that of previous approaches. This is because our approach selects only semantically relevant features and thus reduces the problem of over-fitting when applying a machine learning approach. The methodology can be transferred to any other application area that provides textual information and corresponding effect data.
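To illustrate how market feedback could drive feature selection, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not name the selection metric or the feature types, so the chi-squared statistic, the use of word 2-grams as the "more expressive" features, the binary label (1 if the stock price rose after the news, 0 otherwise), and the names `bigrams`, `chi_squared`, and `select_features` are all assumptions made for illustration, as is the toy data.

```python
from collections import Counter


def bigrams(text):
    """Lower-case, split on whitespace, and return the set of word 2-grams."""
    tokens = text.lower().split()
    return {" ".join(pair) for pair in zip(tokens, tokens[1:])}


def chi_squared(n11, n10, n01, n00):
    """Chi-squared statistic for a 2x2 table.

    n11: positive-label documents containing the feature
    n10: negative-label documents containing the feature
    n01: positive-label documents without the feature
    n00: negative-label documents without the feature
    """
    n = n11 + n10 + n01 + n00
    denom = (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    if denom == 0:
        return 0.0
    return n * (n11 * n00 - n10 * n01) ** 2 / denom


def select_features(docs, labels, k):
    """Rank 2-grams by chi-squared against the market-feedback label; keep the top k."""
    pos_total = sum(labels)
    neg_total = len(labels) - pos_total
    pos_df, neg_df = Counter(), Counter()  # document frequencies per class
    for text, label in zip(docs, labels):
        (pos_df if label else neg_df).update(bigrams(text))

    scores = {}
    for feat in set(pos_df) | set(neg_df):
        n11, n10 = pos_df[feat], neg_df[feat]
        scores[feat] = chi_squared(n11, n10, pos_total - n11, neg_total - n10)
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda f: (-scores[f], f))[:k]


# Toy data (hypothetical): label 1 = price rose after the news, 0 = it did not.
docs = [
    "profit rises sharply",
    "profit rises again",
    "profit warning issued",
    "profit warning repeated",
]
labels = [1, 1, 0, 0]
selected = select_features(docs, labels, k=2)
print(selected)
```

In this sketch, 2-grams that occur only alongside one price direction score highest, while those that appear in a single document score lower, so only features with a consistent market reaction reach the classifier. This is the sense in which selection against effect data can reduce over-fitting.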



Paper Citation


in Harvard Style

Hagenau M., Liebmann M. and Neumann D. (2011). IMPACT OF FEATURE SELECTION AND FEATURE TYPES ON FINANCIAL STOCK PRICE PREDICTION. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2011) ISBN 978-989-8425-79-9, pages 295-300. DOI: 10.5220/0003665603030308


in Bibtex Style

@conference{kdir11,
author={Michael Hagenau and Michael Liebmann and Dirk Neumann},
title={IMPACT OF FEATURE SELECTION AND FEATURE TYPES ON FINANCIAL STOCK PRICE PREDICTION},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2011)},
year={2011},
pages={295-300},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003665603030308},
isbn={978-989-8425-79-9},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2011)
TI - IMPACT OF FEATURE SELECTION AND FEATURE TYPES ON FINANCIAL STOCK PRICE PREDICTION
SN - 978-989-8425-79-9
AU - Hagenau M.
AU - Liebmann M.
AU - Neumann D.
PY - 2011
SP - 295
EP - 300
DO - 10.5220/0003665603030308