A DATA MINING APPROACH TO LEARNING PROBABILISTIC USER BEHAVIOR MODELS FROM DATABASE ACCESS LOG
Mikhail Petrovskiy
2006
Abstract
The problem of user behavior modeling arises in many fields of computer science and software engineering. In this paper we investigate a data mining approach to learning probabilistic user behavior models from database usage logs. We propose a procedure for translating database traces into a representation suitable for data mining methods. Most existing data mining methods, however, rely only on the order of actions and ignore the time intervals between them. To address this limitation we propose a novel method that combines a decision tree classification algorithm with an empirical time-dependent feature map motivated by the theory of potential functions. The proposed method was evaluated experimentally on real-world data. A comparison with existing state-of-the-art data mining methods confirmed the strong performance of our method in predictive user behavior modeling and demonstrated competitive results in anomaly detection.
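The abstract does not give the exact form of the time-dependent feature map, so the following is only a minimal sketch of the general idea, under stated assumptions: each past action contributes a weight that decays with the time elapsed since it occurred, and the weights are summed per action type. The decay function `1 / (1 + alpha * dt^2)`, the parameter `alpha`, the action names, and the helper names `potential` and `time_feature_map` are all hypothetical choices made for illustration and are not taken from the paper.

```python
from collections import defaultdict


def potential(dt, alpha=1.0):
    """Assumed decay weight: 1 at dt = 0, falling toward 0 as dt grows."""
    return 1.0 / (1.0 + alpha * dt * dt)


def time_feature_map(history, now, actions, alpha=1.0):
    """Map a timestamped action history to one feature per action type.

    history: list of (timestamp, action) pairs with timestamp <= now.
    Each feature is the sum of the decay weights of the past
    occurrences of that action, so recent actions count for more.
    """
    features = defaultdict(float)
    for ts, action in history:
        features[action] += potential(now - ts, alpha)
    # Fixed feature order, so every vector has the same layout.
    return [features[a] for a in actions]


# Hypothetical action vocabulary and a toy access log (illustration only).
ACTIONS = ["SELECT", "INSERT", "UPDATE"]
log = [(0.0, "SELECT"), (1.0, "INSERT"), (3.0, "SELECT")]

vector = time_feature_map(log, now=3.0, actions=ACTIONS)
print(vector)
```

Unlike a purely order-based encoding, two histories with the same action sequence but different time gaps produce different vectors here. Vectors of this kind, paired with the next observed action as the label, could then be passed to any decision tree classifier.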
Paper Citation
in Harvard Style
Petrovskiy M. (2006). A DATA MINING APPROACH TO LEARNING PROBABILISTIC USER BEHAVIOR MODELS FROM DATABASE ACCESS LOG. In Proceedings of the First International Conference on Software and Data Technologies - Volume 2: ICSOFT, ISBN 978-972-8865-69-6, pages 73-78. DOI: 10.5220/0001321200730078
in Bibtex Style
@conference{icsoft06,
author={Mikhail Petrovskiy},
title={A DATA MINING APPROACH TO LEARNING PROBABILISTIC USER BEHAVIOR MODELS FROM DATABASE ACCESS LOG},
booktitle={Proceedings of the First International Conference on Software and Data Technologies - Volume 2: ICSOFT},
year={2006},
pages={73-78},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001321200730078},
isbn={978-972-8865-69-6},
}
in EndNote Style
TY - CONF
JO - Proceedings of the First International Conference on Software and Data Technologies - Volume 2: ICSOFT
TI - A DATA MINING APPROACH TO LEARNING PROBABILISTIC USER BEHAVIOR MODELS FROM DATABASE ACCESS LOG
SN - 978-972-8865-69-6
AU - Petrovskiy M.
PY - 2006
SP - 73
EP - 78
DO - 10.5220/0001321200730078