Multi-Output Learning for Predicting Evaluation and Reopening of GitHub Pull Requests on Open-Source Projects

Peerachai Banyongrakkul; Suronapee Phoomvuthisarn


Topics: Data Mining and Data Analysis; Data-driven Software Engineering; Empirical Software Engineering; Open Source Development; Software and Systems Modeling

In Proceedings of the 18th International Conference on Software Technologies ICSOFT - Volume 1, 163-174, 2023, Rome, Italy

Authors: Peerachai Banyongrakkul and Suronapee Phoomvuthisarn

Affiliation: Department of Statistics, Chulalongkorn University, Bangkok, Thailand

Keyword(s): Pull-Based Development, Pull Request, GitHub, Deep Learning, Multi-Output Learning, Classification.

Abstract: GitHub’s pull-based development model is widely used by software development teams to manage software complexity. Contributors create pull requests to merge changes into the main codebase, and integrators review these requests to maintain quality and stability. However, a high volume of pull requests can overwhelm integrators, causing feedback delays. Previous studies have built predictive models using traditional machine learning techniques on tabular data, but these may lose meaningful information. Additionally, relying solely on acceptance and latency predictions may not be sufficient for integrators, because reopened pull requests can add maintenance costs and burden already-busy developers. This paper proposes a novel multi-output deep learning-based approach that predicts the acceptance, latency, and reopening of pull requests at an early stage, effectively handling various data sources, including tabular and textual data. Our approach also applies SMOTE and VAE techniques to address the highly imbalanced nature of pull request reopening. We evaluate our approach on 143,886 pull requests from 54 open-source projects across four well-known programming languages. The experimental results show that our approach significantly outperforms the randomized baseline. Moreover, compared with the existing approach, our approach improves accuracy by 8.68%, precision by 1.01%, recall by 11.49%, and F1-score by 6.77% in acceptance prediction; MMAE by 6.07% in latency prediction; and balanced accuracy by 9.43%, AUC by 9.37%, and TPR by 30.07% in reopening prediction.
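The abstract reports balanced accuracy and TPR for reopening prediction, where reopened pull requests are a small minority. The sketch below illustrates why those metrics are more informative than plain accuracy on imbalanced labels. It assumes the standard definitions (TPR = TP / (TP + FN), balanced accuracy = mean of TPR and TNR); the paper's exact formulas are not given on this page, and the function names and toy labels are hypothetical, not data from the study.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = reopened, 0 = not reopened)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def reopening_metrics(y_true, y_pred):
    """Return accuracy, TPR, TNR, and balanced accuracy (standard definitions assumed)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0  # share of reopened PRs that were caught
    tnr = tn / (tn + fp) if (tn + fp) else 0.0  # share of non-reopened PRs correctly passed
    return {
        "accuracy": (tp + tn) / len(y_true),
        "tpr": tpr,
        "tnr": tnr,
        "balanced_accuracy": (tpr + tnr) / 2,
    }


# Made-up toy labels: 95 pull requests never reopened, 5 reopened.
y_true = [0] * 95 + [1] * 5

# A degenerate model that never predicts "reopened".
always_negative = [0] * 100
metrics_naive = reopening_metrics(y_true, always_negative)
# accuracy is 0.95, yet TPR is 0.0 and balanced accuracy is 0.5

# A hypothetical model that catches 3 of the 5 reopened PRs at the cost of 10 false alarms.
some_recall = [1] * 10 + [0] * 85 + [1] * 3 + [0] * 2
metrics_recall = reopening_metrics(y_true, some_recall)
```

In this toy setting the degenerate model has the higher plain accuracy but never flags a reopened pull request, whereas the second model trades some accuracy for a non-zero TPR and a higher balanced accuracy.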

CC BY-NC-ND 4.0

Paper citation in several formats:

Banyongrakkul, P. and Phoomvuthisarn, S. (2023). Multi-Output Learning for Predicting Evaluation and Reopening of GitHub Pull Requests on Open-Source Projects. In Proceedings of the 18th International Conference on Software Technologies - ICSOFT; ISBN 978-989-758-665-1; ISSN 2184-2833, SciTePress, pages 163-174. DOI: 10.5220/0012125200003538

@conference{icsoft23,
author={Peerachai Banyongrakkul and Suronapee Phoomvuthisarn},
title={Multi-Output Learning for Predicting Evaluation and Reopening of GitHub Pull Requests on Open-Source Projects},
booktitle={Proceedings of the 18th International Conference on Software Technologies - ICSOFT},
year={2023},
pages={163-174},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012125200003538},
isbn={978-989-758-665-1},
issn={2184-2833},
}

TY - CONF
JO - Proceedings of the 18th International Conference on Software Technologies - ICSOFT
TI - Multi-Output Learning for Predicting Evaluation and Reopening of GitHub Pull Requests on Open-Source Projects
SN - 978-989-758-665-1
IS - 2184-2833
AU - Banyongrakkul, P.
AU - Phoomvuthisarn, S.
PY - 2023
SP - 163
EP - 174
DO - 10.5220/0012125200003538
PB - SciTePress