Subject Classification of Software Repository
Abdelhalim Dahou, Brigitte Mathiak
2023
Abstract
Software categorization organizes software into groups based on behavior or domain. Categorization has long been crucial for software maintenance: it helps programmers locate programs, identify features, and find similar programs within extensive code repositories. Manual categorization is expensive, tedious, and labor-intensive, which makes automatic categorization approaches increasingly important. However, existing datasets primarily focus on technical categorization for the most common programming language, leaving a gap in other areas. This paper addresses the problem of classifying software repositories that contain R code. The objective is to develop a classification model that categorizes these repositories into predefined classes accurately and efficiently, using less training data. The contribution of this research is twofold. First, we propose a model that categorizes software repositories focused on R programming, even with a limited amount of training data. Second, we conduct a comprehensive empirical evaluation of the impact of repository features and data augmentation on automatic repository categorization. This research aims to advance the field of software categorization and to facilitate better use of software repositories in research across diverse domains.
Paper Citation
in Harvard Style
Dahou A. and Mathiak B. (2023). Subject Classification of Software Repository. In Proceedings of the 15th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR; ISBN 978-989-758-671-2, SciTePress, pages 30-38. DOI: 10.5220/0012159600003598
in Bibtex Style
@conference{kdir23,
author={Abdelhalim Dahou and Brigitte Mathiak},
title={Subject Classification of Software Repository},
booktitle={Proceedings of the 15th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR},
year={2023},
pages={30-38},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012159600003598},
isbn={978-989-758-671-2},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 15th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR
TI - Subject Classification of Software Repository
SN - 978-989-758-671-2
AU - Dahou A.
AU - Mathiak B.
PY - 2023
SP - 30
EP - 38
DO - 10.5220/0012159600003598
PB - SciTePress