so we decide to compare our proposed method with
these two studies. We use the original results reported
by the authors (Zanoni et al., 2015; Barbudo et al.,
2021) and compare them with the SPD results. Because
the results in their works were obtained on the full
DPB repository, we also use all samples in this corpus,
labelling every existing pattern other than SP as none.
As the general-purpose configuration for MARPLE
corresponds to Random Forest, we use the same
algorithm for the test. Comparative results are given
in Table 9.
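The relabelling step described above amounts to a one-vs-rest mapping of the corpus labels. The following is a minimal sketch only: the function name, the label strings, and the sample list are hypothetical and are not taken from the DPB repository or from the SPD implementation.

```python
def to_binary_labels(pattern_labels, target="SP"):
    """Keep `target` as is and map every other pattern label to "none"."""
    return [label if label == target else "none" for label in pattern_labels]


# Hypothetical corpus labels, used only to illustrate the mapping.
corpus_labels = ["SP", "Adapter", "SP", "Composite", "none"]
binary_labels = to_binary_labels(corpus_labels)
print(binary_labels)
```

With labels in this form, any binary classifier (Random Forest in the comparison above) can be trained and scored on the whole corpus.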
Table 9: Comparing SPD with MARPLE and GEML results.

              SP Corpus: Labelled DPB
Classifiers   Accuracy (%)   F1-score (%)
MARPLE        93             90
GEML          95.61          94.11
SPD           99.86          99.63
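For readers who want to reproduce the two metrics in Table 9, the sketch below computes accuracy and F1 score from true and predicted labels. It assumes that F1 is measured on the SP class treated as the positive class; the text does not state the averaging scheme, so this is an assumption, and the function name and sample labels are hypothetical.

```python
def accuracy_and_f1(y_true, y_pred, positive="SP"):
    """Return (accuracy, F1) with `positive` treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return accuracy, f1


# Hypothetical labels, used only to illustrate the computation.
y_true = ["SP", "SP", "none", "none", "SP"]
y_pred = ["SP", "none", "none", "SP", "SP"]
acc, f1 = accuracy_and_f1(y_true, y_pred)
```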
As can be observed, even though this is only a test
of recovering the ST implementation, the SPD
outperforms GEML by 4.25 and 5.52 percentage points
and MARPLE by 6.86 and 9.63 percentage points in terms
of accuracy and F1 score, respectively. Training the
model on specific and more complete data makes the SPD
better at recovering any instance. MARPLE and GEML use
limited data to train their classifiers, so their
ability to recover SP instances depends heavily on the
instances present in the training dataset. Furthermore,
the detailed analysis of the SP and the use of relevant
features make the classifier better able to identify
any implementation variant, even when it appears in a
combined form.
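The gaps between the SPD and the two baselines can be recomputed directly from the values in Table 9. The sketch below uses exact decimal arithmetic; the dictionary layout and variable names are illustrative assumptions, and the differences are expressed in percentage points.

```python
from decimal import Decimal

# (accuracy %, F1-score %) as reported in Table 9.
table9 = {
    "MARPLE": (Decimal("93"), Decimal("90")),
    "GEML": (Decimal("95.61"), Decimal("94.11")),
    "SPD": (Decimal("99.86"), Decimal("99.63")),
}

spd_acc, spd_f1 = table9["SPD"]

# Difference between SPD and each baseline, in percentage points.
gaps = {
    name: (spd_acc - acc, spd_f1 - f1)
    for name, (acc, f1) in table9.items()
    if name != "SPD"
}

for name, (acc_gap, f1_gap) in gaps.items():
    print(f"SPD vs {name}: accuracy +{acc_gap}, F1 +{f1_gap}")
```

Computed this way, the accuracy gap over GEML is 4.25 points and the F1 gap is 5.52 points; for MARPLE the gaps are 6.86 and 9.63 points.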
6 CONCLUSION
In this work, we propose a novel approach to SP
detection based on features and ML techniques. This
work is the first to recover not only the typical
implementation but also the incorrect structure that
inhibits the SP intent and the implicit structure of
this pattern. The goal of this detection is to improve
source code quality, correct incoherent structures,
and make it possible to inject the SP automatically
by discovering the corresponding context.
Based on a detailed analysis of the SP, we identified
implementation variants of SI. We then proposed 9
features to define them and created specific data for
each one, containing snippets of code. We added the
newly proposed features to those proposed by Nacef et
al. (2022) and used the same method to extract their
values from the Java program. In the next step, based
on different combinations of feature values, we tried
to create data containing the greatest number of
implementations. This data, named SDD, is used to
train the SPD, which is built using different ML
techniques. Evaluating the SPD on SED, a set of
labelled data collected from the GitHub Java corpus,
and on the DPB corpus confirms the performance of all
the techniques used, with SVM achieving more than 99%
in terms of precision, recall, and F1 score. The
proposed approach outperforms MARPLE and GEML by more
than 4 percentage points in terms of accuracy and F1
score.
In this work, we have focused on the SP with a
detailed analysis and taken the first step toward
refactoring. In future work, we will try to apply the
same method to recover other DPs and, as a next step,
attempt to inject them.
REFERENCES
Barbudo, R., Ramírez, A., Servant, F., and Romero, J. R.
(2021). GEML: A grammar-based evolutionary machine
learning approach for design-pattern detection.
J. Syst. Softw., 175:110919.
Chihada, A., Jalili, S., Hasheminejad, S. M. H., and
Zangooei, M. H. (2015). Source code and design
conformance, design pattern detection from source code
by classification approach. Appl. Soft Comput.,
26:357–367.
Fontana, F. A., Caracciolo, A., and Zanoni, M. (2012).
DPB: A benchmark for design pattern detection tools.
In 16th European Conference on Software Maintenance
and Reengineering, CSMR 2012, Szeged, Hungary,
March 27-30, 2012, pages 235–244. IEEE Computer
Society.
Gamma, E., Helm, R., Johnson, R., and Vlissides, J. M.
(1994). Design Patterns: Elements of Reusable
Object-Oriented Software. Addison-Wesley Professional,
1st edition.
Lucia, A. D., Deufemia, V., Gravino, C., and Risi, M.
(2009). Design pattern recovery through visual
language parsing and source code analysis. J. Syst.
Softw., 82(7):1177–1193.
Mayvan, B. B., Rasoolzadegan, A., and Yazdi, Z. G. (2017).
The state of the art on design patterns: A systematic
mapping of the literature. J. Syst. Softw., 125:93–118.
Nacef, A., Khalfallah, A., Bahroun, S., and Ben Ahmed, S.
(2022). Defining and extracting singleton design
pattern information from object-oriented software
program. In Advances in Computational Collective
Intelligence, pages 713–726, Cham. Springer
International Publishing.
Nazar, N., Aleti, A., and Zheng, Y. (2022). Feature-based
software design pattern detection. J. Syst. Softw.,
185:111179.
Rasool, G. and Mäder, P. (2011). Flexible design pattern
detection based on feature types. In 26th IEEE/ACM
Automatic Detection of Implicit and Typical Implementation of Singleton Pattern Based on Supervised Machine Learning