Authors:
Suravi Akhter ¹; Afia Sajeeda ² and Ahmedul Kabir ²
Affiliations:
¹ Department of Computer Science and Engineering, University of Liberal Arts Bangladesh, Dhaka, Bangladesh
² Institute of Information Technology, University of Dhaka, Dhaka, Bangladesh
Keyword(s):
Software Defect Prediction, Bug Severity Classification, Feature Selection.
Abstract:
A software anomaly is a bug, defect, or anything else that causes the software to deviate from its normal behavior. Anomalies must be identified correctly to build more stable, error-free software systems. Various machine learning-based approaches exist for anomaly detection. For these approaches, feature selection is a necessary step: it removes noisy and irrelevant features and thus reduces the dimensionality of the feature vector. Most existing feature selection methods rank the features using a selection criterion such as mutual information (MI) or distance. However, these methods, especially the MI-based ones, fail to capture feature interactions during ranking/selection when the feature dimension is large, which degrades the discriminative ability of the selected feature set. Moreover, it is difficult to decide how many features to take from the ranked list to obtain acceptable performance. To address these problems, this paper proposes anomaly detection for software data (ADSD), a feature subset selection method that captures interactive and relevant feature subsets. Experimental results on 15 benchmark software defect datasets and two bug severity classification datasets demonstrate the performance of ADSD in comparison to four state-of-the-art methods.