Feature Extraction, Learning and Selection in Support of Patch Correctness Assessment
Viktor Csuvik, Dániel Horváth, László Vidács
2024
Abstract
Automated Program Repair (APR) strives to minimize the expense associated with manual bug fixing by developing methods where patches are generated automatically and then validated against an oracle, such as a test suite. However, because the oracle may be imperfect, patches that it validates may still be incorrect. A significant portion of the literature on APR focuses on this issue, usually referred to as Patch Correctness Check (PCC). Several approaches have been proposed that use a variety of information from the project under repair, such as diverse manually designed heuristics or learned embedding vectors. In this study, we explore various features obtained from previous studies and assess their effectiveness in identifying incorrect patches. We also evaluate the potential for accurately classifying correct patches by combining and selecting learned embeddings with engineered features, using various Machine Learning (ML) models. Our experiments demonstrate that not all features are equally important, and that selecting the right ML model also has a large impact on the overall performance. For instance, using all 490 features with a decision tree classifier achieves a mean F1 value of 64% over 10 independent trainings, while after in-depth feature and model selection, the MLP classifier with the selected 43 features reaches a better mean F1 of 81%. The empirical evaluation shows that this model is able to correctly classify samples on a dataset containing 903 labeled patches with 100% precision and 97% recall at its peak, which is complementary to the performance of state-of-the-art methods. We also show that independent trainings can exhibit varying outcomes, and propose how to improve the stability of model trainings.
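The metrics quoted in the abstract (precision, recall, F1, and a mean over independent trainings) can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names, the toy labels, and the choice of label 1 as the positive class are assumptions made purely for illustration.

```python
from statistics import mean, stdev


def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for the given positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def summarize_runs(f1_scores):
    """Mean and sample standard deviation of F1 over independent trainings."""
    spread = stdev(f1_scores) if len(f1_scores) > 1 else 0.0
    return mean(f1_scores), spread


# Hypothetical toy labels (not data from the paper); the positive class
# is assumed to be label 1.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)

# Hypothetical F1 values from three independent trainings.
avg_f1, f1_spread = summarize_runs([0.80, 0.82, 0.78])
```

Reporting the spread alongside the mean is one simple way to make the run-to-run variation mentioned in the abstract visible; a small spread suggests a more stable training setup.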
Paper Citation
in Harvard Style
Csuvik V., Horváth D. and Vidács L. (2024). Feature Extraction, Learning and Selection in Support of Patch Correctness Assessment. In Proceedings of the 19th International Conference on Software Technologies - Volume 1: ICSOFT; ISBN 978-989-758-706-1, SciTePress, pages 23-34. DOI: 10.5220/0012746900003753
in Bibtex Style
@conference{icsoft24,
author={Viktor Csuvik and Dániel Horváth and László Vidács},
title={Feature Extraction, Learning and Selection in Support of Patch Correctness Assessment},
booktitle={Proceedings of the 19th International Conference on Software Technologies - Volume 1: ICSOFT},
year={2024},
pages={23-34},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012746900003753},
isbn={978-989-758-706-1},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 19th International Conference on Software Technologies - Volume 1: ICSOFT
TI - Feature Extraction, Learning and Selection in Support of Patch Correctness Assessment
SN - 978-989-758-706-1
AU - Csuvik V.
AU - Horváth D.
AU - Vidács L.
PY - 2024
SP - 23
EP - 34
DO - 10.5220/0012746900003753
PB - SciTePress