Feature Extraction, Learning and Selection in Support of Patch Correctness Assessment

Viktor Csuvik; Dániel Horváth; László Vidács

doi:10.5220/0012746900003753

2024

Abstract

Automated Program Repair (APR) strives to minimize the expense associated with manual bug fixing by developing methods where patches are generated automatically and then validated against an oracle, such as a test suite. However, because the oracle may be imperfect, patches validated by it may still be incorrect. A significant portion of the literature on APR focuses on this issue, usually referred to as Patch Correctness Check (PCC). Several approaches have been proposed that use a variety of information from the project under repair, such as diverse manually designed heuristics or learned embedding vectors. In this study, we explore various features obtained from previous studies and assess their effectiveness in identifying incorrect patches. We also evaluate the potential for accurately classifying correct patches by combining and selecting learned embeddings with engineered features, using various Machine Learning (ML) models. Our experiments demonstrate that not all features are equally important, and that selecting the right ML model also has a large impact on overall performance. For instance, using all 490 features with a decision tree classifier achieves a mean F1 value of 64% over 10 independent trainings, while after in-depth feature and model selection, the MLP classifier with the selected 43 features achieves a better mean F1 of 81%. The empirical evaluation shows that, on a dataset containing 903 labeled patches, this model classifies samples with 100% precision and 97% recall at its peak, a performance that is complementary to state-of-the-art methods. We also show that independent trainings can exhibit varying outcomes, and propose how to improve the stability of model trainings.
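The metrics quoted in the abstract can be illustrated with a minimal sketch. It assumes the standard definitions of precision, recall and F1 (the harmonic mean of the two), and summarizes F1 over repeated trainings by its mean and sample standard deviation. The function names and the per-run scores below are hypothetical placeholders chosen for illustration; they are not taken from the paper's experiments.

```python
from statistics import mean, stdev


def precision_recall(tp, fp, fn):
    """Return (precision, recall) from true-positive, false-positive
    and false-negative counts; a ratio with an empty denominator is 0.0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def summarize(scores):
    """Return (mean, sample standard deviation) of per-run scores."""
    return mean(scores), stdev(scores)


# F1 implied by the peak precision and recall quoted in the abstract.
peak_f1 = f1_score(1.0, 0.97)

# ASSUMPTION: made-up F1 values for 10 independent trainings,
# used only to show how run-to-run variation can be reported.
hypothetical_f1 = [0.70, 0.75, 0.80, 0.72, 0.78, 0.74, 0.79, 0.71, 0.77, 0.74]

if __name__ == "__main__":
    avg, spread = summarize(hypothetical_f1)
    print(f"peak F1: {peak_f1:.3f}")
    print(f"mean F1 over {len(hypothetical_f1)} runs: {avg:.3f} (std {spread:.3f})")
```

Reporting the spread alongside the mean makes the variation between independent trainings visible, whereas a single best run hides it.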

Paper Citation

in Harvard Style

Csuvik V., Horváth D. and Vidács L. (2024). Feature Extraction, Learning and Selection in Support of Patch Correctness Assessment. In Proceedings of the 19th International Conference on Software Technologies - Volume 1: ICSOFT; ISBN 978-989-758-706-1, SciTePress, pages 23-34. DOI: 10.5220/0012746900003753

in Bibtex Style

@conference{icsoft24,
author={Viktor Csuvik and Dániel Horváth and László Vidács},
title={Feature Extraction, Learning and Selection in Support of Patch Correctness Assessment},
booktitle={Proceedings of the 19th International Conference on Software Technologies - Volume 1: ICSOFT},
year={2024},
pages={23-34},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012746900003753},
isbn={978-989-758-706-1},
}

in EndNote Style

TY - CONF

JO - Proceedings of the 19th International Conference on Software Technologies - Volume 1: ICSOFT
TI - Feature Extraction, Learning and Selection in Support of Patch Correctness Assessment
SN - 978-989-758-706-1
AU - Csuvik V.
AU - Horváth D.
AU - Vidács L.
PY - 2024
SP - 23
EP - 34
DO - 10.5220/0012746900003753
PB - SciTePress