Outlier Detection Through Connectivity-Based Outlier Factor for Software Defect Prediction
Andrada-Mihaela-Nicoleta Moldovan, Andreea Vescan
Regression testing becomes expensive in terms of time when changes are made frequently. To simplify testing, supervised or unsupervised binary classification techniques for Software Defect Prediction (SDP) may rule out non-defective components or highlight the components that are most prone to defects. In this paper, outlier detection methods for SDP are investigated. The novelty of this approach is that it has not previously been used for this particular task. Two approaches are implemented: the plain use of the connectivity-based local outlier factor (Connectivity-based Outlier Factor, COF), and an improvement of it by the Pareto rule (COF + Pareto), in which the samples with the 20% highest outlier scores produced by the algorithm are considered outliers. The solutions were evaluated on 12 projects from the NASA and PROMISE datasets. The results obtained are comparable to state-of-the-art solutions; for some projects, they range from acceptable to good compared to the results of other studies.
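The Pareto step described above can be illustrated with a minimal sketch. It assumes that an outlier score per sample is already available (for example, from a COF implementation) and only shows the thresholding. The function name `pareto_outliers`, the `fraction` parameter, the rounding-up of the cut-off, and the example scores are assumptions made for illustration and are not taken from the paper.

```python
import math


def pareto_outliers(scores, fraction=0.20):
    """Flag the `fraction` of samples with the highest outlier scores.

    `scores` is a list of outlier scores, one per sample (higher means
    more outlying). Returns a list of booleans of the same length.
    """
    n = len(scores)
    if n == 0:
        return []
    # Number of samples to flag; rounding up is an assumption of this sketch.
    k = math.ceil(fraction * n)
    # Sample indices ordered by descending score (ties keep original order).
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    flagged = set(ranked[:k])
    return [i in flagged for i in range(n)]


# Hypothetical scores for 10 software modules (illustrative values only).
scores = [1.0, 1.1, 0.9, 3.5, 1.05, 0.95, 2.8, 1.0, 1.2, 0.98]
labels = pareto_outliers(scores)
```

In an SDP setting, the samples flagged `True` would be treated as predicted defective components and the rest as non-defective.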
Citation in Harvard Style:
Moldovan A. and Vescan A. (2024). Outlier Detection Through Connectivity-Based Outlier Factor for Software Defect Prediction. In Proceedings of the 19th International Conference on Evaluation of Novel Approaches to Software Engineering - Volume 1: ENASE; ISBN 978-989-758-696-5, SciTePress, pages 474-483. DOI: 10.5220/0012683400003687
