class level. The results may differ for projects of different sizes or maturity, for projects written in other languages, and when the analysis is performed at the method or file level.
5 CONCLUSIONS AND FUTURE WORK
In this paper we investigated the predictive power of two data flow metrics: dep-degree (DD) and dep-degree density (DDD). DDD is only weakly correlated with the other metrics considered, which suggests that it measures different aspects of the code; DD shows significantly stronger correlations. However, using DDD in SDP models increases model performance only slightly. DD, on the other hand, achieves much better results, and the best results were obtained by models that used both DD and DDD as independent variables.
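As a reminder of how these metrics relate, the following minimal sketch computes DD and DDD for a toy use-def graph. The function names, the edge-list representation, and the assumption that DDD normalizes DD by the number of nodes are ours for illustration, following Beyer and Fararooy's (2010) definition of dep-degree as a count of use-def dependencies; they are not taken from the paper's tooling.

```python
# Illustrative sketch (assumed definitions, not the paper's implementation):
# DD counts the use-def dependency edges of a unit; DDD normalizes DD by
# the number of nodes (statements) in the unit's use-def graph.

def dep_degree(use_def_edges):
    """DD: total number of use-def dependency edges."""
    return len(use_def_edges)

def dep_degree_density(use_def_edges, nodes):
    """DDD: DD divided by the number of nodes (assumed normalization)."""
    return dep_degree(use_def_edges) / len(nodes) if nodes else 0.0

# Toy example: three statements, where s3 uses values defined in s1 and s2.
nodes = ["s1", "s2", "s3"]
edges = [("s3", "s1"), ("s3", "s2")]

print(dep_degree(edges))                 # 2
print(dep_degree_density(edges, nodes))  # 0.666...
```

Under these assumptions, two units with the same DD can have very different DDD values, which is one way the two metrics can capture complementary information in an SDP model.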
To conclude, DD and DDD seem to be an interesting choice as defect predictors in SDP models, as well as objects of future SDP research. They may also serve as useful code complexity metrics, indicating how difficult the code is for developers to understand. It would also be interesting to investigate their predictive power in just-in-time defect prediction models, which have recently gained a lot of attention from researchers.
Data Availability. The source files (bug data, R script, and Open Static Analyzer metrics definitions) can be found at https://github.com/Software-Engineering-Jagiellonian/DepDegree-ENASE2023.
REFERENCES
Akimova, E., Bersenev, A., Deikov, A., Kobylkin, K.,
Konygin, A., Mezentsev, I., and Misilov, V. (2021). A
survey on software defect prediction using deep learn-
ing. Mathematics, 9:1180.
Alon, U., Zilberstein, M., Levy, O., and Yahav, E. (2019).
Code2vec: Learning distributed representations of
code. Proc. ACM Program. Lang., 3(POPL).
Ammann, P. and Offutt, J. (2016). Introduction to Software
Testing. Cambridge University Press, Cambridge.
Beyer, D. and Fararooy, A. (2010). A simple and effec-
tive measure for complex low-level dependencies. In
2010 IEEE 18th International Conference on Program
Comprehension, pages 80–83.
Beyer, D. and Häring, P. (2014). A formal evaluation of DepDegree based on Weyuker's properties. In Proceedings of the 22nd International Conference on Program Comprehension, ICPC 2014, pages 258–261, New York, NY, USA. Association for Computing Machinery.
Bowes, D., Hall, T., and Petrić, J. (2018). Software defect prediction: do different classifiers find the same defects? Software Quality Journal, 26:525–552.
Efron, B. (1983). Estimating the error rate of a prediction
rule: Improvement on cross-validation. Journal of the
American Statistical Association, 78:316–331.
Ferenc, R., Tóth, Z., Ladányi, G., et al. (2020). A public unified bug dataset for Java and its assessment regarding metrics and bug prediction. Software Quality Journal, 28:1447–1506.
Hall, T., Beecham, S., Bowes, D., Gray, D., and Counsell,
S. (2012). A systematic literature review on fault pre-
diction performance in software engineering. IEEE
Transactions on Software Engineering, 38(6):1276–
1304.
Hellhake, D., Schmid, T., and Wagner, S. (2019). Using
data flow-based coverage criteria for black-box inte-
gration testing of distributed software systems. In
2019 12th IEEE Conference on Software Testing, Val-
idation and Verification (ICST), pages 420–429.
Henry, S. and Kafura, D. (1981). Software structure met-
rics based on information flow. IEEE Transactions on
Software Engineering, SE-7(5):510–518.
Jiang, Y., Cukic, B., and Menzies, T. (2008). Can data transformation help in the detection of fault-prone modules? In Proceedings of the 2008 Workshop on Defects in Large Software Systems, DEFECTS '08, pages 16–20, New York, NY, USA. Association for Computing Machinery.
Jiarpakdee, J., Tantithamthavorn, C., and Hassan, A. E.
(2021). The impact of correlated metrics on the in-
terpretation of defect models. IEEE Transactions on
Software Engineering, 47(02):320–331.
Jiarpakdee, J., Tantithamthavorn, C., and Treude, C. (2018).
Autospearman: Automatically mitigating correlated
software metrics for interpreting defect models. In
2018 IEEE International Conference on Software
Maintenance and Evolution (ICSME), pages 92–103.
Kamei, Y., Shihab, E., Adams, B., Hassan, A. E., Mockus,
A., Sinha, A., and Ubayashi, N. (2013). A large-
scale empirical study of just-in-time quality assur-
ance. IEEE Transactions on Software Engineering,
39(6):757–773.
Katzmarski, B. and Koschke, R. (2012). Program complex-
ity metrics and programmer opinions. In 2012 20th
IEEE International Conference on Program Compre-
hension (ICPC), pages 17–26.
Kennedy, K. (1979). A survey of data flow analysis tech-
niques. Technical report, IBM Thomas J. Watson Re-
search Division.
Kolchin, A., Potiyenko, S., and Weigert, T. (2021). Ex-
tending data flow coverage with redefinition analysis.
In 2021 International Conference on Information and
Digital Technologies (IDT), pages 293–296.
Kumar, C. and Yadav, D. (2017). Software defects estimation using metrics of early phases of software development life cycle. International Journal of System Assurance Engineering and Management, 8:2109–2117.
ENASE 2023 - 18th International Conference on Evaluation of Novel Approaches to Software Engineering