
techniques. We identified fourteen studies that used
datasets based on open-source repositories (S03, S07,
S24, S32, S43, S44, S45, S46, S47, S61, S62, S68,
S70). EzzatiKarami and Madhavji (S20) use a pub-
lic requirements document (PURE dataset). Reahimi
et al. (S54) specify that the dataset was obtained in
a previous real-world project. Gu et al. (S22) se-
lected a dataset of an experiment concerning 91 ef-
fective cases of KLK2 elevator, among others.
Considering the type of document used as a dataset for research purposes, we found papers based on SRS documents (S09, S10, S20, S31) (5%), textual documents (S44, S61, S62) (4%), mobile app data (S04, S12, S46) (4%), review data (S25, S70) (3%), and books (S60) (1%). Once a dataset is gathered, it has to be processed to prepare the data for the ML algorithms. Most studies use the PROMISE dataset (26%); other datasets include iTrust (S24, S65), PURE (S02, S20), SQuAD (S03), and Aurora 2 (S47).
Regarding dataset size, several works use datasets ranging between 14 and 10,000 requirements (S04, S06, S07, S11, S12, S18, S19, S20, S21, S22, S24, S25, S28, S30, S31, S34, S35, S36, S43, S44, S50, S52, S53, S54, S55, S61, S62, S63, S64, S66, S68, S70, S73) (45%), while other works range from 10,000 to 7 million and more (S02, S03, S11, S39, S45, S57) (8%). Some methods consider continuous and categorical variables: five papers use categorical variables (S22, S24, S36, S44, S45), and one uses continuous variables (S22).
4.2.2 RQ2 Findings
In research on the classification of requirements using ML techniques, evaluating the results is an essential part of the process. In this regard, various metrics widely used in the analyzed works have been identified: Accuracy, Precision, Recall, and F-measure. This confirms that these four metrics are the most commonly used, as they are considered the most representative for evaluating an ML model.
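For reference, these four metrics follow their standard definitions in terms of the confusion-matrix counts (true positives TP, false positives FP, true negatives TN, false negatives FN):

\[
\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad
\text{Precision} = \frac{TP}{TP + FP},
\]
\[
\text{Recall} = \frac{TP}{TP + FN}, \qquad
\text{F-measure} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}
\]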
According to the results table, most papers use requirements repositories that are freely accessible. Regarding the size of the datasets, most papers work with fewer than 10,000 requirements, and papers with a minimal number of requirements mainly use them to perform a proof of concept of the proposed approach. Only a small number of papers have datasets with more than 10,000 requirements, which is why deep learning algorithms are used to a lesser extent. This situation may be due to the lack of freely accessible datasets with large numbers of requirements.
These results suggest that the investigation of applying ML techniques to automate RE activities has several open avenues, such as (1) considering other types of requirements specification formats, (2) replicating these works to confirm their results and provide benchmarks of different approaches, (3) extending the existing techniques to larger datasets and other application domains, and (4) paying attention to the efficiency of the approaches.
4.3 RQ3. What Are the Tools Used in
the Field of RE that Apply ML?
4.3.1 RQ3 Results
Important topics in applying ML algorithms to software requirements relate to the technologies and tools used. Several tools and technologies have been applied in the studies involving ML and RE: 20% of the works used open-source tools (S12, S13, S24, S43, S44, S45, S46, S47, S61, S62, S64, S68), 5% indicated visualization tools for presenting the results of ML analyses in a user-friendly way (e.g., dashboards) (S12, S13, S61, S62), and 3% specified chatbots (S03, S12). Dabrowski et al. (S12) do not mention using visualization or chatbot technologies; rather, they focus on evaluating techniques for opinion mining and for searching for feature-related reviews in app reviews. Some authors named their proposals, for example, Review with Categorized Requirements (ReCaRe) (S49), Heuristic Requirements Assistant (HeRA) (S62), retraining Bidirectional Encoder Representations from Transformers (REBERT4RE) (S02), ReqVec (S66), and NOMEN (S26), among others.
Concerning how the proposals were developed, the results indicate that ten works use Python (S04, S08, S18, S20, S28, S46, S54, S72), four use Java (S22, S24, S34, S63), and two use Weka (S04, S56). The remaining papers use MySQL (S22), R (S52), Google Colab (S06), a wiki (S61), the Semi-supervised approach for Feature Extraction (SAFE), Group MAsking (GuMa), Relation-based Unsupervised Summarization (ReUS), and Mining App Review using Association Rule Mining (MARAM) (S12), and APIs of the Apple App Store (S70).
4.3.2 RQ3 Findings
After the analysis, we identified that most approaches use open-source technology. Several of them assign names to the resulting tools, which range from visual tools to chatbots. Regarding programming languages, those commonly used in data science in general, such as Python, R, and Java, predominate. Visual frameworks, such as Weka, are also used. It is important to note that several papers use cloud software to execute their experiments.