An Extensive Analysis of Data Clumps in UML Class Diagrams
Nils Baumgartner, Elke Pulvermüller
2024
Abstract
This study investigated the characteristics of data clumps in UML class diagrams. Data clumps are groups of variables that appear together in multiple locations. We compared the characteristics of data clumps in UML class diagrams with those in source code projects. Analyzing the extensive Lindholmen and GenMyModel datasets, which are known for their real-world applicability and diversity and together contain more than 100,000 class diagrams, revealed significant differences in the distribution and nature of data clumps. Approximately 19% of the analyzed class diagrams contained data clumps. Field–field data clumps predominated in UML class diagrams, particularly in the GenMyModel dataset, while parameter–parameter data clumps were less frequent. Moreover, in contrast to their distribution in source code projects, data clumps in UML class diagrams were typically distributed across multiple classes or interfaces, forming larger chains. In source code projects, parameter–parameter data clumps were predominant, indicating more detailed implementation of methods in these projects. These findings reflect different modeling approaches and paradigms among the respective user groups. The study provides important insights for the development of UML modeling tools, teaching methods, and design practices in software development.
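To make the notion of a field–field data clump concrete, the following minimal sketch flags pairs of classes that share a group of fields. It is an illustration only, not the detection procedure used in the study: the function name `find_field_field_clumps`, the matching of fields by identical name and type, the threshold of three shared fields, and the example diagram are all assumptions introduced here.

```python
from itertools import combinations


def find_field_field_clumps(classes, min_shared=3):
    """Return (class_a, class_b, shared_fields) for every pair of classes
    that share at least `min_shared` fields.

    `classes` maps a class name to a set of (field_name, field_type) pairs.
    The threshold of three is an assumed convention, not a value taken
    from the abstract.
    """
    clumps = []
    # Compare every unordered pair of classes exactly once.
    for name_a, name_b in combinations(sorted(classes), 2):
        shared = classes[name_a] & classes[name_b]
        if len(shared) >= min_shared:
            clumps.append((name_a, name_b, sorted(shared)))
    return clumps


# Hypothetical class diagram, reduced to class names and their fields.
diagram = {
    "Customer": {("street", "String"), ("city", "String"),
                 ("zip", "String"), ("name", "String")},
    "Supplier": {("street", "String"), ("city", "String"),
                 ("zip", "String"), ("vat_id", "String")},
    "Invoice": {("number", "int"), ("total", "float")},
}

clumps = find_field_field_clumps(diagram)
for class_a, class_b, fields in clumps:
    print(class_a, class_b, [name for name, _ in fields])
```

In this example, `Customer` and `Supplier` share the fields `street`, `city`, and `zip`, so the pair is reported as one clump. A parameter–parameter variant would compare method parameter lists instead of field sets.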
Paper Citation
in Harvard Style
Baumgartner N. and Pulvermüller E. (2024). An Extensive Analysis of Data Clumps in UML Class Diagrams. In Proceedings of the 19th International Conference on Evaluation of Novel Approaches to Software Engineering - Volume 1: ENASE; ISBN 978-989-758-696-5, SciTePress, pages 15-26. DOI: 10.5220/0012550500003687
in Bibtex Style
@conference{enase24,
author={Nils Baumgartner and Elke Pulvermüller},
title={An Extensive Analysis of Data Clumps in UML Class Diagrams},
booktitle={Proceedings of the 19th International Conference on Evaluation of Novel Approaches to Software Engineering - Volume 1: ENASE},
year={2024},
pages={15-26},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012550500003687},
isbn={978-989-758-696-5},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 19th International Conference on Evaluation of Novel Approaches to Software Engineering - Volume 1: ENASE
TI - An Extensive Analysis of Data Clumps in UML Class Diagrams
SN - 978-989-758-696-5
AU - Baumgartner N.
AU - Pulvermüller E.
PY - 2024
SP - 15
EP - 26
DO - 10.5220/0012550500003687
PB - SciTePress