Towards Statistical Comparison and Analysis of Models

Önder Babur, Loek Cleophas, Tom Verhoeff, Mark van den Brand


Model comparison is an important challenge in model-driven engineering, with many application areas such as model versioning and domain model recovery. The literature offers numerous techniques that address this challenge, ranging from graph-based to linguistic ones. Most of these involve pairwise comparison, which may suffice for tasks such as model versioning, where only a small number of models is considered. However, they mostly ignore the case where a large number of models must be compared, such as recovering a common domain model or metamodel from multiple models. In this paper we present a generic approach for model comparison and analysis as an exploratory first step for model recovery. We propose representing models in a vector space model and applying clustering techniques to compare and analyse a large set of models. We demonstrate our approach on a synthetic dataset of models generated via genetic algorithms.
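To make the idea of a vector space representation followed by clustering concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not state the term weighting, distance measure, or clustering algorithm, so the sketch assumes raw term counts over element names, cosine distance, and average-linkage agglomerative clustering with a distance threshold. The toy models and the helper names `to_vectors`, `cosine_distance`, and `cluster` are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def to_vectors(models):
    """Map each model (a list of element names) to a term-count vector."""
    vocabulary = sorted({term for terms in models.values() for term in terms})
    vectors = {}
    for name, terms in models.items():
        counts = Counter(terms)
        vectors[name] = [counts[term] for term in vocabulary]
    return vocabulary, vectors


def cosine_distance(u, v):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def cluster(vectors, threshold):
    """Average-linkage agglomerative clustering (assumed, not from the paper).

    Repeatedly merges the two closest clusters and stops once the closest
    pair is farther apart than `threshold`.
    """
    clusters = [[name] for name in sorted(vectors)]

    def linkage(a, b):
        total = sum(cosine_distance(vectors[x], vectors[y]) for x in a for y in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        if linkage(clusters[i], clusters[j]) > threshold:
            break
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Hypothetical toy dataset: each model is reduced to its element names.
models = {
    "bank_v1": ["Account", "Customer", "Bank", "Transaction"],
    "bank_v2": ["Account", "Customer", "Bank", "Loan"],
    "library_v1": ["Book", "Member", "Library", "Loan"],
    "library_v2": ["Book", "Member", "Library", "Author"],
}

vocabulary, vectors = to_vectors(models)
groups = cluster(vectors, threshold=0.5)
print(groups)
```

With these toy inputs the two bank models and the two library models end up in separate groups, because models within a group share three of four element names while models across groups share at most one. The point of the design is that every model is compared against all others at once through a shared vocabulary, rather than through a sequence of pairwise matchings.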



Paper Citation

in Harvard Style

Babur Ö., Cleophas L., Verhoeff T. and van den Brand M. (2016). Towards Statistical Comparison and Analysis of Models. In Proceedings of the 4th International Conference on Model-Driven Engineering and Software Development - Volume 1: MODELSWARD, ISBN 978-989-758-168-7, pages 361-367. DOI: 10.5220/0005799103610367
