# Fast Regularized Least Squares and k-means Clustering Method for Intrusion Detection Systems

### Parisa Movahedi, Paavo Nevalainen, Markus Viljanen, Tapio Pahikkala

#### Abstract

Intrusion detection systems are intended to detect attacks in a large networked system reliably, accurately and efficiently. Machine learning methods have shown promising accuracy, but they share one disadvantage when applied to intrusion detection: the high computational cost of the training and prediction phases. Several methods have recently been introduced to improve this efficiency. Kernel-based methods are among the most popular in the literature, and extending them with the approximation techniques described in this paper substantially reduces the computation time of the Intrusion Detection System (IDS). This paper proposes optimized Regularized Least Squares (RLS) classification combined with k-means clustering. Standard techniques are used to choose the optimal RLS predictor parameters. The optimization leads to fewer basis vectors, which improves the prediction speed of the IDS. Evaluated on the KDD99 benchmark IDS dataset, our algorithm demonstrates considerable improvements in the training and prediction times of intrusion detection while maintaining accuracy.
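The abstract does not spell out the algorithm, so the following is only a minimal sketch of one way k-means and RLS classification could be combined. It assumes a Gaussian kernel, uses the k-means centroids as the reduced set of basis vectors, and solves a subset-of-regressors style linear system. The function names, the hyperparameters `k`, `lam` and `gamma`, and the synthetic two-class data are all illustrative assumptions, not the authors' implementation or the KDD99 setup.

```python
import numpy as np

def gaussian_kernel(A, B, gamma):
    # Pairwise squared Euclidean distances, then exp(-gamma * d^2).
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def kmeans(X, k, n_iter=50, seed=0):
    # Plain Lloyd iterations; centroids start from k random training points.
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d2.argmin(1)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old centroid if a cluster is empty
                centers[j] = members.mean(0)
    return centers

def fit_reduced_rls(X, y, k=10, lam=1.0, gamma=0.1):
    # Assumption: the k centroids serve as the basis vectors.
    centers = kmeans(X, k)
    K_nm = gaussian_kernel(X, centers, gamma)        # n x k
    K_mm = gaussian_kernel(centers, centers, gamma)  # k x k
    # Regularized least squares restricted to the k basis vectors;
    # the small diagonal term is only there for numerical stability.
    A = K_nm.T @ K_nm + lam * K_mm + 1e-8 * np.eye(k)
    coef = np.linalg.solve(A, K_nm.T @ y)
    return centers, coef

def predict(X, centers, coef, gamma=0.1):
    # Prediction needs only k kernel evaluations per sample.
    return np.sign(gaussian_kernel(X, centers, gamma) @ coef)

# Synthetic stand-in for "normal" (-1) versus "attack" (+1) traffic.
rng = np.random.default_rng(1)
def make_data(n):
    X = np.vstack([rng.normal(-2.0, 1.0, (n, 5)), rng.normal(2.0, 1.0, (n, 5))])
    y = np.concatenate([-np.ones(n), np.ones(n)])
    return X, y

X_train, y_train = make_data(200)
X_test, y_test = make_data(100)
centers, coef = fit_reduced_rls(X_train, y_train, k=10)
acc = (predict(X_test, centers, coef) == y_test).mean()
print(f"basis vectors: {len(centers)}, test accuracy: {acc:.3f}")
```

With `k` basis vectors, the linear system is `k x k` rather than `n x n`, and each prediction costs `k` kernel evaluations, which is the kind of saving the abstract attributes to using fewer basis vectors.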

#### Paper Citation

#### in Harvard Style

Movahedi P., Nevalainen P., Viljanen M. and Pahikkala T. (2015). **Fast Regularized Least Squares and k-means Clustering Method for Intrusion Detection Systems**. In *Proceedings of the International Conference on Pattern Recognition Applications and Methods - Volume 2: ICPRAM*, ISBN 978-989-758-077-2, pages 264-269. DOI: 10.5220/0005246802640269

#### in Bibtex Style

@conference{icpram15,

author={Parisa Movahedi and Paavo Nevalainen and Markus Viljanen and Tapio Pahikkala},

title={Fast Regularized Least Squares and k-means Clustering Method for Intrusion Detection Systems},

booktitle={Proceedings of the International Conference on Pattern Recognition Applications and Methods - Volume 2: ICPRAM},

year={2015},

pages={264-269},

publisher={SciTePress},

organization={INSTICC},

doi={10.5220/0005246802640269},

isbn={978-989-758-077-2},

}

#### in EndNote Style

TY - CONF

JO - Proceedings of the International Conference on Pattern Recognition Applications and Methods - Volume 2: ICPRAM

TI - Fast Regularized Least Squares and k-means Clustering Method for Intrusion Detection Systems

SN - 978-989-758-077-2

AU - Movahedi P.

AU - Nevalainen P.

AU - Viljanen M.

AU - Pahikkala T.

PY - 2015

SP - 264

EP - 269

DO - 10.5220/0005246802640269