Towards Online Data Mining System for Enterprises

Jan Kupčík, Tomáš Hruška

Abstract

As the volume of data generated and stored in enterprises grows, fast analysis of that data becomes increasingly important. This paper introduces a data mining system designed for high-performance analysis of very large data sets and presents its principles. The system processes data stored in relational databases and data warehouses as well as data streams, and discovers knowledge from these sources using data mining algorithms. The set of installed algorithms can be updated without restarting the system, so high availability can be achieved. Data analysis tasks are defined in a programming language of the Microsoft .NET platform using libraries provided by the system. Experienced users are therefore not limited by graphical designers and their features, and can create complex, intelligent analytic tasks. A special storage system for storing and querying results is also outlined.



Paper Citation


in Harvard Style

Kupčík J. and Hruška T. (2012). Towards Online Data Mining System for Enterprises. In Proceedings of the 7th International Conference on Evaluation of Novel Approaches to Software Engineering - Volume 1: ENASE, ISBN 978-989-8565-13-6, pages 187-192. DOI: 10.5220/0004098101870192


in BibTeX Style

@conference{enase12,
author={Jan Kupčík and Tomáš Hruška},
title={Towards Online Data Mining System for Enterprises},
booktitle={Proceedings of the 7th International Conference on Evaluation of Novel Approaches to Software Engineering - Volume 1: ENASE},
year={2012},
pages={187-192},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004098101870192},
isbn={978-989-8565-13-6},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 7th International Conference on Evaluation of Novel Approaches to Software Engineering - Volume 1: ENASE
TI - Towards Online Data Mining System for Enterprises
SN - 978-989-8565-13-6
AU - Kupčík J.
AU - Hruška T.
PY - 2012
SP - 187
EP - 192
DO - 10.5220/0004098101870192