Cross-Version Defect Prediction: Does Excessive Train-Test Similarity Affect the Reliability of Evaluation?

Zsuzsanna Oneț-Marian; Diana-Lucia Hotea

doi:10.5220/0013481700003928

2025

Abstract

Software Defect Prediction is defined as the automated identification of defective components within a software system. Its significance and applicability are extensive. The most realistic way of performing defect prediction is in the cross-version scenario. However, although emerging, this scenario is still relatively understudied. The prevalent approach in the cross-version defect prediction literature is to consider two successive software versions as the train-test pair, expecting them to be similar to each other. Some approaches even propose to increase this similarity by augmenting or, on the contrary, filtering the training set derived from historical data. In this paper, we analyze in detail the similarity between the instances in 28 pairs of successive software versions and perform a comparative supervised machine learning study to assess its impact on the reliability of cross-version defect prediction evaluation. We employ three ensemble learning models, Random Forest, AdaBoost and XGBoost, and evaluate them in different scenarios. The experimental results indicate that the soundness of the evaluation is questionable, since excessive train-test similarity, in terms of identical or highly similar instances, inflates the measured performance.
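The leakage concern raised in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes each instance is a tuple of numeric metric values, it checks only for exactly identical instances (not the "highly similar" ones the abstract also mentions), and the names `overlap_ratio` and `drop_seen` are hypothetical helpers introduced here for illustration.

```python
from typing import List, Sequence, Tuple

# Assumption: one instance = one tuple of software metric values.
Instance = Tuple[float, ...]


def overlap_ratio(train: Sequence[Instance], test: Sequence[Instance]) -> float:
    """Fraction of test instances that also appear verbatim in the training set."""
    if not test:
        return 0.0
    seen = set(train)
    return sum(1 for x in test if x in seen) / len(test)


def drop_seen(train: Sequence[Instance], test: Sequence[Instance]) -> List[Instance]:
    """Keep only the test instances that do not appear in the training set."""
    seen = set(train)
    return [x for x in test if x not in seen]


# Toy data standing in for two successive versions (hypothetical values).
train = [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)]
test = [(1.0, 2.0), (7.0, 8.0), (3.0, 4.0), (9.0, 1.0)]

print(overlap_ratio(train, test))  # 0.5: two of the four test instances were seen in training
print(drop_seen(train, test))      # the two unseen instances
```

Comparing a model's score on the full test set with its score on the `drop_seen` subset is one simple way to see how much of the measured performance may come from instances the model has already encountered.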

Paper Citation

in Harvard Style

Oneț-Marian Z. and Hotea D. (2025). Cross-Version Defect Prediction: Does Excessive Train-Test Similarity Affect the Reliability of Evaluation? In Proceedings of the 20th International Conference on Evaluation of Novel Approaches to Software Engineering - Volume 1: ENASE; ISBN 978-989-758-742-9, SciTePress, pages 304-315. DOI: 10.5220/0013481700003928

in Bibtex Style

@conference{enase25,
author={Zsuzsanna Oneț-Marian and Diana-Lucia Hotea},
title={Cross-Version Defect Prediction: Does Excessive Train-Test Similarity Affect the Reliability of Evaluation?},
booktitle={Proceedings of the 20th International Conference on Evaluation of Novel Approaches to Software Engineering - Volume 1: ENASE},
year={2025},
pages={304-315},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013481700003928},
isbn={978-989-758-742-9},
}

in EndNote Style

TY - CONF
JO - Proceedings of the 20th International Conference on Evaluation of Novel Approaches to Software Engineering - Volume 1: ENASE
TI - Cross-Version Defect Prediction: Does Excessive Train-Test Similarity Affect the Reliability of Evaluation?
SN - 978-989-758-742-9
AU - Oneț-Marian Z.
AU - Hotea D.
PY - 2025
SP - 304
EP - 315
DO - 10.5220/0013481700003928
PB - SciTePress