Software Architecture Reconstruction through Clustering: Finding the Right Similarity Factors

Ioana Șora

Department of Computer and Software Engineering, Politehnica University of Timișoara, Timișoara, Romania

Abstract. Clustering is very often used for the purpose of automatic software

architecture reconstruction. This work investigates the importance of taking into

account different factors for the similarity metric, besides the traditional factor

based on direct coupling/cohesion: indirect coupling as computed from the topol-

ogy of the dependency graph, and global architectural layering resulting from the

orientation of dependencies. We experiment with using these factors, individu-

ally or combined, for deﬁning the similarity metrics within a set of clustering

algorithms.

1 Introduction

Software architecture is a model of the software system expressed at a high level of

abstraction, concentrating on the interaction of “black box” elements. Knowing and

having an explicit representation of the system architecture is essential for understand-

ing, evaluating and maintaining a large software application. Often, the documentation

is incomplete, outdated or is completely missing, only the code being available. Recon-

structing the architectural model from the available code remains the saving alternative

in these cases.

The reverse engineering community developed many techniques to help reconstruct

the architecture of software systems, as they are surveyed and classiﬁed in [8]. Auto-

matic reconstruction techniques aim at ﬁnding the logical cluster structure of software

systems, with as little user intervention as possible and with minimal prior knowledge.

Software clustering refers to the decomposition of a software system into meaningful

subsystems. To be meaningful, the automatic approach must produce clusterings that

can help developers to understand the system, grouping together parts that relate to

each other from a logical design point of view.

Our goal is to improve automatic reconstruction techniques, in order to obtain a

reconstructed architectural model of a better quality - one which is evaluated to be

better by a human expert.

This article is organized as follows: Section 2 summarizes the state of the art and builds

the motivation for our approach. Section 3 states the goals of this work and introduces

our reconstruction approach. Experimental results regarding the inﬂuence of different

similarity factors on the quality of the reconstructed model are described in Section 4.

Sora I..

Software Architecture Reconstruction through Clustering: Finding the Right Similarity Factors.

DOI: 10.5220/0004599600450054

In Proceedings of the 1st International Workshop in Software Evolution and Modernization (SEM-2013), pages 45-54

ISBN: 978-989-8565-66-2

 2013 SCITEPRESS (Science and Technology Publications, Lda.)

2 Background

Reconstructing the architecture of a software system can take one of the following ap-

proaches:

– The top-down approach, when certain assumptions of the overall system organiza-

tion are known and they are validated by examining the existing artifacts with help

of interactive tools in a human-controlled reconstruction process.

– The bottom-up approach, when (quasi-)automatic unsupervised tools build hypotheses

starting from the examination of the existing artifacts.

In the category of top-down, human-controlled or interactive approaches there are

notable tools such as: Rigi by H. Müller et al. [14]; the Reflexion Model technique of

Murphy, Notkin, and Sullivan [15]; the Reﬂexion model combined with clustering [7];

ACDC - the pattern driven approach of Tzerpos and Holt [20].

In the category of bottom-up, automatic or quasi-automatic approaches for architec-

tural reconstruction, techniques have been imported from the domain of data mining.

Clustering algorithms have been largely used in data mining to identify groups of ob-

jects whose members are similar in some way. Clustering algorithms group together

entities into groups, by maximizing the sum of relationships between entities grouped

together and minimizing the sum of relationships between different groups. In reverse

software engineering, clustering is used for architectural reconstruction, by grouping

together into subsystems the modules (classes, functions, etc.) that relate to each other.

There are several research approaches in this domain, which differ by:

– the graph clustering algorithm which is used [22]

– the software-engineering deﬁned criteria used for grouping modules together (the

similarity metric)

The basic assumption driving this software clustering approach is that software systems

are organized into subsystems characterized by internal cohesion and loose coupling

with each other. A reference tool of this category is Bunch, developed by Mitchell

and Mancoridis [12], using a search based algorithm (hill climbing) and a modulariza-

tion quality metric MQ deﬁned as a formula on coupling and cohesion.

As observed by many researchers, clustering software based on a similarity and
dissimilarity metric derived only from coupling and cohesion does not provide
satisfactory results [9]. Various researchers have tried to cluster software by taking
into account other categories of information as similarity metrics. A form of indirect
coupling is taken into account by Chiricota et al. [6]. The LIMBO approach of Andritsos
and Tzerpos [1] considers even non-software information, such as historical data (time of
last modification, author) held by version control system repositories, or the physical
organization of applications in terms of files and folders. Anquetil and Lethbridge [3]
use for clustering the symbolic textual information available in the identifiers used as
names. Recent research ([2], [5], [13], [10]) agrees that unsupervised clustering
approaches based only on a coupling/cohesion criterion tend to produce results that are
not acceptable to domain experts, and proposes different measures for improvement.

3 Our Work

3.1 Goal and Approach

Our goal is to investigate ways of improving the quality of bottom-up, automatic,
unsupervised reconstruction. We have built the Architecture Reconstruction Tool Suite
(ARTs) as an extensible tool chain for experimenting with different clustering methods.
The architectural reconstruction community has developed many different approaches and
methods, but they are used and studied in isolation. Our goal is to integrate the
different partial solutions into one architectural reconstruction toolsuite, in order to
compare their relative efficiency and to study how they can be combined. We also propose
a new approach of including extracted architectural information in the grouping criteria.

The architecture of ARTs is depicted in Figure 1.

[Figure 1. Components shown: Structural Model (UNIQ-ART), DSM Builder, General Preprocess,
Cluster Finder, General Postprocess, Compare, Reference Architectural Decomposition,
Clustering Method Assessment. Data passed along the chain: weighted graph, weighted graph,
clusters. Plug-in points: factors, algorithms, methods.]

Fig. 1. The Architectural Reconstruction Toolsuite (ARTs).

The input for ARTs is a primary structural and dependency model extracted from

code by static analysis and represented according to the UNIQ-ART meta-model [17].

The primary models represent relationships between the Units (classes or modules) and

their parts, and may describe systems implemented in object oriented (Java, C#) or

procedural (C) languages.

Each tool of this chain accepts different plug-ins in order to customize it.

The central tool is the ClusterFinder, which operates on a weighted graph repre-

senting an abstraction of the system in order to produce its decomposition into clusters.

It may be implemented by different clustering algorithms. Currently we have imple-

mented both ﬂat decomposition algorithms as well as hierarchical decomposition algo-

rithms. The algorithms are:

– Minimum Spanning Tree based algorithms (MST [23] and MMST [4])

– Metric based [6]

– Search based (hill climbing)

– Hierarchical clustering: Single Linkage, Complete Linkage, Weighted Average,

Unweighted Average
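To illustrate the first family in the list above, the following minimal sketch shows one simplified way an MST-based flat decomposition can work: build a maximum-similarity spanning tree, drop the tree edges whose weight falls below a threshold, and report the remaining connected components as clusters. The function name `mst_clusters`, the absolute `threshold` parameter and the toy graph are assumptions made for illustration; they are not taken from the ARTs implementation or from [23].

```python
from collections import defaultdict


def mst_clusters(units, edges, threshold):
    """Cluster units by cutting weak edges of a maximum spanning tree.

    units: iterable of unit names
    edges: list of (unit_a, unit_b, similarity) tuples, treated as undirected
    threshold: tree edges with similarity below this value are removed
    """
    units = list(units)
    parent = {u: u for u in units}

    def find(u):
        # Union-find lookup with path halving.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    # Kruskal's algorithm on descending similarity gives a maximum spanning tree.
    tree = []
    for a, b, w in sorted(edges, key=lambda e: e[2], reverse=True):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
            tree.append((a, b, w))

    # Reset the forest, keep only the strong tree edges, collect the components.
    parent = {u: u for u in units}
    for a, b, w in tree:
        if w >= threshold:
            parent[find(a)] = find(b)

    clusters = defaultdict(set)
    for u in units:
        clusters[find(u)].add(u)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical toy system with four units.
toy_edges = [("A", "B", 0.9), ("B", "C", 0.2), ("C", "D", 0.8), ("A", "C", 0.1)]
print(mst_clusters(["A", "B", "C", "D"], toy_edges, threshold=0.5))
```

Raising the threshold splits the system into more and smaller clusters, which is why such a parameter has to be tuned per algorithm.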

The DSMBuilder selects information from the primary model and creates the ab-

stract weighted graph. Different grouping criteria can be used (in isolation or composed)

as factors leading to the weights values, as detailed in subsection 3.2.

In order to improve the clustering process, some preprocessings and postprocess-

ings can be performed, optionally, independent of the chosen clustering algorithm. We

implemented elimination of omnipresent modules [14] and orphan adoption [19].

A clustering method, deﬁned by the combination of grouping factors, clustering

algorithm, pre- and postprocessings, is evaluated by comparing its result with a given

authoritative decomposition. The evaluation methods (the Comparer) are detailed in

subsection 3.3.

3.2 Grouping Criteria

The following three criteria can be used (in isolation or composed) to build the similarity

metrics:

– the strength of the direct dependencies coupling (DC)

– indirect coupling (IC)

– global architectural information regarding the architectural layer (LA)

The direct coupling factor, which is the baseline grouping criterion, can be adjusted

by applying factors derived from the indirect coupling or global architectural informa-

tion. The similarity metric value between two units A and B is given by aggregating the

individual factors:

Similarity(A, B) = DC(A, B) · IC(A, B) · LA(A, B)

Also, in future work new grouping criteria could be added, for example introduc-

ing another factor derived from the symbolic textual information extracted from the

identiﬁer names.
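The multiplicative aggregation can be sketched as follows. This is a minimal illustration, assuming that a factor which is not in use simply contributes the neutral value 1; the function name `build_weighted_graph`, the callable-based interface and the sample values are hypothetical and do not describe the actual DSMBuilder plug-in API.

```python
def build_weighted_graph(dc, ic=None, la=None):
    """Compose edge weights from the grouping factors.

    dc: dict mapping (unit_a, unit_b) -> direct coupling value (baseline)
    ic, la: optional callables (unit_a, unit_b) -> factor value;
            passing None switches the factor off (neutral value 1)
    """
    graph = {}
    for (a, b), dc_value in dc.items():
        weight = dc_value
        if ic is not None:
            weight *= ic(a, b)
        if la is not None:
            weight *= la(a, b)
        graph[(a, b)] = weight
    return graph


# Hypothetical values: the pair (A, C) spans distant layers, so its weight is damped.
dc_values = {("A", "B"): 2.0, ("A", "C"): 4.0}
layer_factor = lambda a, b: 0.5 if (a, b) == ("A", "C") else 1.0

print(build_weighted_graph(dc_values))                   # DC only
print(build_weighted_graph(dc_values, la=layer_factor))  # DC + LA
```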

The Direct Coupling Factor (DC). The main factor is the direct coupling factor which

quantifies the static dependencies between units. A unit A depends on a unit B if

there are explicit references in A to elements of B.

In previous work [18] we empirically defined 11 different dependency types, each
characterized by a dependency type weight $w_{DepType}$. The values of the weights have
been empirically fine-tuned in order to reflect the relative importance of the different
dependency types for the strength of the coupling.

The value of the direct coupling factor between A and B is given by the sum of all

dependency types that exist between them:

$$DC(A, B) = \sum_{DepType} w_{DepType} \cdot count_{DepType}(A, B)$$

For certain dependency types such as function calls or variable accesses, the specific
weight $w_{DepType}$ is adjusted with a counter $count_{DepType}(A, B)$ representing the
number of accesses from A to B relative to the total number of possible accesses.
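The weighted sum can be sketched in a few lines. The dependency type names and weight values below are hypothetical placeholders chosen for illustration; the 11 dependency types and their tuned weights are defined in [18] and are not reproduced here.

```python
def direct_coupling(counts, weights):
    """Direct coupling factor DC(A, B) as a weighted sum over dependency types.

    counts: dict dependency type -> count value for the pair (A, B);
            for calls or accesses this is a relative count in [0, 1]
    weights: dict dependency type -> weight of that dependency type
    """
    return sum(weights[dep_type] * count for dep_type, count in counts.items())


# Hypothetical weights, NOT the values tuned in [18].
example_weights = {"inheritance": 4.0, "call": 2.0, "field_access": 1.0}

# A inherits from B and performs a quarter of its possible calls on B.
example_counts = {"inheritance": 1, "call": 0.25}

print(direct_coupling(example_counts, example_weights))
```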

The Indirect Coupling Factor (IC). We start from the observation that if two units A

and B have neighbors (units they interact with) which also interact with each other, this

corresponds to a form of indirect coupling. In this case, the two units A and B have a

higher probability to be part of the same subsystem (cluster).

First we calculate the ESM (Edge Strength Metric, deﬁned in [6]) value for each

edge of the given dependency graph.

To determine the importance of ESM value, a conﬁdence level cl ∈ [0; 1] is intro-

duced when computing the indirect coupling factor IC:

IC(A, B) = ESM (A, B) · (1 − cl)

Thus the higher the pre-given confidence level, the higher the impact of the IC factor

and the higher the importance given to cycles, with 0 meaning it will have no impact

in the algorithm used and 1 meaning it will have maximum impact (and as some of the

edges will have an ESM value of 0, it will practically cut some of the edges before the

algorithm).

The Architectural Layer Distance Factor (LA). One of the advantages of top-down

reconstruction approaches is that they start with some general assumptions about the

global architecture. In a bottom-up unsupervised approach we may not have such a-

priori global architectural knowledge. We propose a new approach of including ex-

tracted architectural information in the grouping criteria.

One kind of architectural information which may be extracted in a bottom-up ap-

proach is layering information. Units belonging to a layer may depend only on units

belonging to lower layers. Layers are determined by applying a partitioning algorithm

like [16] on the directed graph of dependencies. In future implementations, an algo-

rithm such as [11] may improve the determination of layers also in the presence of

cyclic dependencies.
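The layering rule stated above (a unit may depend only on units in lower layers) can be illustrated with a minimal sketch that assigns each unit the length of its longest dependency chain. This is a simple stand-in written for illustration, assuming an acyclic dependency graph; it is not the partitioning algorithm of [16], and the names `compute_layers` and the toy units are hypothetical.

```python
def compute_layers(dependencies):
    """Assign a layer index to every unit of an acyclic dependency graph.

    dependencies: dict unit -> set of units it depends on
    Returns dict unit -> layer index, where 0 is the lowest layer and every
    unit sits strictly above all the units it depends on.
    """
    units = set(dependencies)
    for targets in dependencies.values():
        units.update(targets)

    layers = {}

    def layer_of(unit, visiting=()):
        if unit in layers:
            return layers[unit]
        if unit in visiting:
            raise ValueError(f"cyclic dependency involving {unit!r}")
        targets = dependencies.get(unit, ())
        if not targets:
            value = 0
        else:
            value = 1 + max(layer_of(t, visiting + (unit,)) for t in targets)
        layers[unit] = value
        return value

    for unit in units:
        layer_of(unit)
    return layers


# Hypothetical four-unit system.
toy_dependencies = {
    "ui": {"logic"},
    "logic": {"db", "util"},
    "db": {"util"},
    "util": set(),
}
print(compute_layers(toy_dependencies))
```

The sketch rejects cyclic graphs with an error, which reflects the limitation mentioned above: handling cycles needs a dedicated algorithm such as [11].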

We make the observation that two units which are situated in layers of very different

levels are highly unlikely to be part of the same architectural subsystem, even if there

is a strong dependency between them. On the other hand, two units that are situated on

the same or on close layers have a higher chance to be part of the same architectural

subsystem. This observation is reﬂected in the architectural layer distance factor.

We define δ as the absolute value of the difference between the layers of A and B,
divided by the total number of layers in order to normalize the value:

$$\delta(A, B) = \frac{|Layer(A) - Layer(B)|}{TotalLayers}$$

The similarity metric is proportional to the architectural layer distance factor LA,
defined as:

LA(A, B) = Ladjustment(δ(A, B))

The layer distance adjustment is a decreasing function

Ladjustment : [0, 1] → [0, 1]

We experimented with layer distance adjustment functions decreasing at different
rates, such as linear or exponential.

When applying any of the adjustment functions, units that are mutually dependent
and are situated on the same layer have δ = 0, and the value of the linear or
exponential adjustment function is 1, thus the similarity is given only by the
dependency strength. In any other case, the bigger the distance, the smaller the value
of the adjustment function, which reduces the dependency strength accordingly.
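A minimal sketch of the layer distance factor follows. The text only requires the adjustment to be a decreasing function on [0, 1] with value 1 at distance 0, so the concrete forms used here (1 − δ for the linear case, exp(−rate · δ) for the exponential case) and the `rate` value are assumptions made for illustration, not the functions used in ARTs.

```python
import math


def layer_distance(layer_a, layer_b, total_layers):
    """Normalized layer distance: |Layer(A) - Layer(B)| / TotalLayers."""
    return abs(layer_a - layer_b) / total_layers


def linear_adjustment(delta):
    """Assumed linear form: decreases from 1 at delta = 0 to 0 at delta = 1."""
    return 1.0 - delta


def exponential_adjustment(delta, rate=5.0):
    """Assumed exponential form; `rate` is a hypothetical steepness parameter."""
    return math.exp(-rate * delta)


def la_factor(layer_a, layer_b, total_layers, adjustment=exponential_adjustment):
    """Architectural layer distance factor LA(A, B)."""
    return adjustment(layer_distance(layer_a, layer_b, total_layers))


# Units on the same layer keep their full dependency strength.
print(la_factor(2, 2, total_layers=4))
# Two layers apart out of four: the exponential form damps harder than the linear one.
print(la_factor(0, 2, total_layers=4, adjustment=linear_adjustment))
print(la_factor(0, 2, total_layers=4))
```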

3.3 Evaluation Approach

In our case, a clustering method is deﬁned by the combination of: grouping factors,

clustering algorithm, pre- and postprocessings. The existing approaches of evaluating

clustering methods can be divided into two categories: approaches which rely on an

authoritative decomposition and approaches which do not. Evaluation criteria

which do not rely on reference decompositions, such as the MQ metric [12], are

not suitable for our purpose because they already quantify coupling and cohesion as

main grouping criteria. Since our work investigates the importance of different group-

ing criteria, the only way to evaluate the results of a clustering method is to measure

how close they are to the decomposition indicated by a human expert.

A clustering method is evaluated by comparing the results it produces for a set of test
systems with the corresponding authoritative decompositions of these systems. It may
be argued that different experts may indicate different decompositions, at different
granularity levels, but this can be handled if the reference decompositions are
specified hierarchically.

Different strategies for comparing the similarity degree of two decompositions of
the same system have been proposed [21]. In this work we have so far used the MoJo
metric, but other metrics (such as Precision/Recall, EdgeSim, etc.) could also be used
in the Comparer. The MoJo metric counts the minimum number of operations (moves
and joins) one needs to perform in order to transform one decomposition $C_1$ into
another decomposition $C_2$. The direct MoJo metric is actually a dissimilarity measure,
since a big value of the metric indicates that the decompositions are not similar. In
order to have a similarity measure, we use another quality measurement based on MoJo,
the MoJo similarity measurement, which is defined as:

$$similarityMoJo(C_1, C_2) = \left[ 1 - \frac{MoJo(C_1, C_2)}{N} \right] \times 100\%$$

This metric describes the normalized similarity degree of two clusterings, $C_1$ and
$C_2$, of a system with N units. Since the MoJo metric is not symmetric, for a pair
$(C_1, C_2)$ the metric is applied in both directions and the maximum value is taken.
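The normalization step can be sketched as below. The sketch takes the two directional MoJo distances as given inputs (computing MoJo itself is outside its scope) and assumes that "the maximum value" refers to the larger of the two similarity values. The function names and the sample distances are hypothetical; only the system size of 360 units is borrowed from the ARTs test system described later.

```python
def mojo_similarity(mojo_distance, n_units):
    """Normalized MoJo similarity in percent: [1 - MoJo / N] * 100."""
    return (1.0 - mojo_distance / n_units) * 100.0


def symmetric_mojo_similarity(mojo_forward, mojo_backward, n_units):
    """Apply the measure in both directions and keep the larger similarity."""
    return max(
        mojo_similarity(mojo_forward, n_units),
        mojo_similarity(mojo_backward, n_units),
    )


# Hypothetical distances for a system of 360 units: 90 move/join operations
# in one direction, 108 in the other.
print(symmetric_mojo_similarity(90, 108, n_units=360))
```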

4 Results

4.1 Tuning of Algorithms

First, all implemented algorithms required a tuning process in order to establish the

ranges of optimal values for their speciﬁc parameters.

Each algorithm has its own set of specific parameters. The MST algorithm has as
parameter a Threshold value that is used as a decision factor when edge removal is
considered. The MMST algorithm has as parameter a Closeness factor value that represents
the threshold used as a decision factor when uniting two clusters is considered. The
Metric Based algorithm has as parameter a Threshold value that is used, together with
the ESM metric value, as a decision factor when considering removing an edge. The Hill
Climbing algorithm has as parameters the climbDegree, which specifies how many of the
possible variations should be considered at each step, and the generationMethod. The
Hierarchical algorithms have as parameter a granularity factor which determines the
point of cutting off the final clusters.

In order to determine the optimal parameter values, we proceeded as follows: We
chose a set of test systems to be clustered and we determined their reference
decompositions, either by detailed code inspection or by requesting the opinion of their
developers. For each algorithm, several runs have been made with different values for
the specific parameters, for all test systems. We noticed that the parameter values for
which the obtained decomposition is closest to the reference (the maximum of the MoJo
similarity) may vary from one system to another, thus some average values have been
determined as the recommended values for the parameters of each algorithm. Discussing
the exact parameter values obtained by tuning for each algorithm is not relevant for the
main goal of this paper; for example, an analysis of parameter values for the MST and
MMST algorithms has been included in our previous work [18].
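One possible reading of this tuning procedure, sketched for a single numeric parameter: for each test system pick the parameter value that gave the highest MoJo similarity, then average these per-system optima. The function name `recommend_parameter`, the system names and all numbers below are hypothetical; the text does not specify exactly how the averages were formed.

```python
from statistics import mean


def recommend_parameter(results):
    """Average the per-system best parameter values.

    results: dict system name -> dict parameter value -> MoJo similarity (%)
    """
    best_per_system = [max(runs, key=runs.get) for runs in results.values()]
    return mean(best_per_system)


# Hypothetical tuning runs of one algorithm on two test systems.
tuning_runs = {
    "systemA": {0.2: 60.0, 0.4: 72.0, 0.6: 65.0},
    "systemB": {0.2: 58.0, 0.4: 61.0, 0.6: 70.0},
}
print(recommend_parameter(tuning_runs))
```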

Also, tuning has shown that general pre- and postprocessings such as elimination of

omnipresent modules (library classes) and orphan adoption have a clear positive impact

and have been included by default in all further experiments.

4.2 Evaluation of the Impact of Different Grouping Criteria

After the step of tuning each algorithm, we carried out experiments in order to compare

the results when composing the grouping criteria from different factors: Direct coupling

only (DC) which represents the baseline of other comparisons, Direct coupling and

Layer architecture (DC + LA), Direct coupling and Indirect coupling (DC + IC), Direct

coupling, Indirect coupling and Layer architecture (DC + IC + LA).

We carried out these experiments looking for the impact of using different grouping

criteria on the quality of the automatic decomposition, measured by its closeness to the

authoritative decomposition.

Table 1 contains the results obtained when applying the different clustering algo-

rithms, with different grouping criteria, for the clustering of a test system. The test

system analyzed in Table 1 is the ARTs toolsuite implementation, a medium-sized sys-

tem of 360 classes, and its architecture is well known to the experimenters. The table

presents the maximum values of the MoJo similarity metric, obtained for any speciﬁc

parameter settings for each algorithm. Columns ∆1, ∆2 and ∆3 compute the differ-

ences in MoJo similarity, obtained when using different additional factors vs. the base-

line factor.

Table 1. Experimental results: influence of different grouping factors on the clustering results (MoJo similarity, in %).

Algorithm   DC     DC+LA   ∆1        DC+IC   ∆2        DC+IC+LA   ∆3
            [0]    [1]     [1]-[0]   [2]     [2]-[0]   [3]        [3]-[0]
MST         64.2   75.8    11.6      55.6    -8.6      71.1        6.9
MMST        57.5   65.6     8.1      50.7    -6.8      60.3        2.8
Metric      70.8   74.6     3.8      76.2     5.4      72.2        1.4
HillClimb   47.8   61.2    13.4      49.1     1.3      59.1       11.3
SL          71.3   82.7    11.4      71.5     0.2      81.2        9.9
WA          66.9   76.5     9.6      64.3    -2.6      73.8        6.9
Average improvement         9.65             -1.85                 6.53

As the table shows, including an Architectural Layer factor in all clustering algo-

rithms always produces decompositions that are closer to the reference solution. Includ-

ing an Indirect Coupling factor, however, does not have a clear positive impact on the

quality of the resulting decomposition. Including both Architectural Layer and Indirect

Coupling factors is not better than using only the Architectural Layer factor.
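The difference columns and the average improvements of Table 1 can be recomputed from the raw MoJo similarity values with a few lines of arithmetic. The sketch below copies the table values; the names `TABLE_1`, `deltas` and `average_improvements` are introduced here only for illustration.

```python
# MoJo similarity values from Table 1, per algorithm:
# (DC, DC + LA, DC + IC, DC + IC + LA)
TABLE_1 = {
    "MST": (64.2, 75.8, 55.6, 71.1),
    "MMST": (57.5, 65.6, 50.7, 60.3),
    "Metric": (70.8, 74.6, 76.2, 72.2),
    "HillClimb": (47.8, 61.2, 49.1, 59.1),
    "SL": (71.3, 82.7, 71.5, 81.2),
    "WA": (66.9, 76.5, 64.3, 73.8),
}


def deltas(row):
    """Differences of each factor combination against the DC baseline."""
    baseline = row[0]
    return tuple(round(value - baseline, 1) for value in row[1:])


def average_improvements(table):
    """Column-wise mean of the three difference columns, rounded to 2 decimals."""
    per_algorithm = [deltas(row) for row in table.values()]
    count = len(per_algorithm)
    return tuple(round(sum(column) / count, 2) for column in zip(*per_algorithm))


for name, row in TABLE_1.items():
    print(name, deltas(row))
print("average improvements:", average_improvements(TABLE_1))
```

In this data the first difference column (DC + LA) is positive for every algorithm and is larger than the third (DC + IC + LA) in every row, which is what the two observations above rest on.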

We have used several other test systems, some open source software such as junit,

xercesImpl, jEdit, Ant and some developed as our university projects. We determined

their reference decompositions either by performing detailed analysis of their code or

by asking their developers. The sizes of the test systems go from 110 classes up to 1400

classes. By experimenting also with these systems, we obtained average improvement

values for ∆1, ∆2, ∆3 in ranges similar to those presented in Table 1.

We conclude that, in our experiments, the architectural layer factor always improves
the quality of the clustering result, and the exponential adjustment function works
better than the linear one. From a quantitative point of view, the improvements are
biggest for systems with many classes that have many dependencies spanning big layer
distances.

From our experiments we concluded that the Indirect Coupling factor does not bring
real improvements; in many cases it even has a negative effect. Although this may seem
surprising, we can explain this finding by the following facts: the Indirect Coupling as
defined by the Edge Strength Metric hampers the grouping of inheritance hierarchies;
also, in the case of smaller systems, the Indirect Coupling metric tends to agglomerate
everything into a few very big clusters. The granularity of the selected reference model
also affects the results: positive results were obtained on large and/or complex systems
or when using a more coarse-grained reference model.

Also, the experiments pointed out another aspect which is worth investigating

in future work - how the different factors of the similarity metric may have an inﬂuence

on the stability of the clustering algorithms, by increasing the range of parameter values

that lead to optimal results and thus simplifying the tuning of the algorithms.

5 Conclusions

Taking into account global architectural information is essential for improving the re-

sults of coupling/cohesion guided software architecture reconstruction. In the case of

unsupervised automatic software clustering, we propose to make such global architec-

tural information available in form of the Architectural Layer distance factor, which can

be computed at the reconstruction time in a bottom-up manner and used as part of the

grouping criteria. Our experiments show that this way of taking into account the global

topology of the whole dependency graph in form of the Architectural Layer distance

factor is more effective than taking into account only local topologies of the depen-

dency graph in form of the Indirect Coupling factor. This conclusion applies to all the

investigated clustering algorithms, thus it demonstrates that the improvement is due to

the grouping criteria.

Acknowledgements

The author thanks all the students who, in recent years, have participated in the im-

plementation of parts of ARTs: Gabriel Glodean, Mihai Gligor, Adrian Oros, Bogdan

Zavada, Diana Brata.

References

1. Periklis Andritsos and Vassilios Tzerpos. Information-theoretic software clustering. IEEE

Trans. Software Eng., 31(2):150–165, 2005.

2. N. Anquetil and J. Laval. Legacy software restructuring: Analyzing a concrete case. In

Software Maintenance and Reengineering (CSMR), 2011 15th European Conference on,

pages 279–286, 2011.

3. Nicolas Anquetil and Timothy C. Lethbridge. Recovering software architecture from the

names of source ﬁles. Journal of Software Maintenance, 11(3):201–221, May 1999.

4. Markus Bauer and Mircea Trifu. Architecture-aware adaptive clustering of OO systems.

Software Maintenance and Reengineering, European Conference on, 0:3, 2004.

5. Fabian Beck and Stephan Diehl. On the congruence of modularity and code coupling. In

Proceedings of the 19th ACM SIGSOFT symposium and the 13th European conference on

Foundations of software engineering, ESEC/FSE ’11, pages 354–364, New York, NY, USA,

2011. ACM.

6. Yves Chiricota, Fabien Jourdan, and Guy Melancon. Software components capture using

graph clustering. In Proceedings IWPC, pages 217–226, 2003.

7. Andreas Christl, Rainer Koschke, and Margaret-Anne D. Storey. Automated clustering to

support the reﬂexion method. Information & Software Technology, 49(3):255–274, 2007.

8. S. Ducasse and D. Pollet. Software architecture reconstruction: A process-oriented taxon-

omy. Software Engineering, IEEE Transactions on, 35(4):573–591, 2009.

9. Fernando Brito e Abreu and Miguel Goulão. Coupling and cohesion as modularization

drivers: Are we being over-persuaded? In Proceedings of the Fifth Conference on Software

Maintenance and Reengineering, CSMR, pages 47–57, 2001.

10. M. Hall, N. Walkinshaw, and P. McMinn. Supervised software modularisation. In Software

Maintenance (ICSM), 2012 28th IEEE International Conference on, pages 472–481, 2012.

11. Jannik Laval, Nicolas Anquetil, Usman Bhatti, and Stéphane Ducasse. oZone: Layer

identification in the presence of cyclic dependencies. Science of Computer Programming, (0):–,

2012.

12. Brian S. Mitchell and Spiros Mancoridis. On the automatic modularization of software

systems using the bunch tool. IEEE Trans. Software Eng., 32(3):193–208, 2006.

13. S. Muhammad. Evaluating relationship categories for clustering object-oriented software

systems. IET Software, 6:260–274(14), June 2012.

14. Hausi A. Müller, Scott R. Tilley, and Kenny Wong. Understanding software systems using

reverse engineering technology perspectives from the rigi project. In Proceedings of the

1993 conference of the Centre for Advanced Studies on Collaborative research: software

engineering - Volume 1, CASCON ’93, pages 217–226. IBM Press, 1993.

15. G.C. Murphy, D. Notkin, and K.J. Sullivan. Software reﬂexion models: bridging the gap be-

tween design and implementation. Software Engineering, IEEE Transactions on, 27(4):364–

380, 2001.

16. Neeraj Sangal, Ev Jordan, Vineet Sinha, and Daniel Jackson. Using dependency models

to manage complex software architecture. In OOPSLA ’05: Proceedings of the 20th an-

nual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and

applications, pages 167–176, New York, NY, USA, 2005. ACM.

17. Ioana Sora. A meta-model for representing language-independent primary dependency struc-

tures. In Joaquim Filipe and Leszek A. Maciaszek, editors, ENASE 2012 - Proceedings of the

7th International Conference on Evaluation of Novel Approaches to Software Engineering,

pages 65–74. SciTePress, 2012.

18. Ioana Sora, Gabriel Glodean, and Mihai Gligor. Software architecture reconstruction: An

approach based on combining graph clustering and partitioning. In Computational Cyber-

netics and Technical Informatics (ICCC-CONTI), 2010 International Joint Conference on,

pages 259–264, 2010.

19. V. Tzerpos and R.C. Holt. The orphan adoption problem in architecture maintenance. In

Reverse Engineering, 1997. Proceedings of the Fourth Working Conference on, pages 76–

82, 1997.

20. V. Tzerpos and R.C. Holt. ACDC: an algorithm for comprehension-driven clustering. In

Reverse Engineering, 2000. Proceedings. Seventh Working Conference on, pages 258–267,

2000.

21. Zhihua Wen and V. Tzerpos. Evaluating similarity measures for software decompositions.

In Software Maintenance, 2004. Proceedings. 20th IEEE International Conference on, pages

368–377, 2004.

22. T.A. Wiggerts. Using clustering algorithms in legacy systems remodularization. In

Proceedings of the Fourth Working Conference on Reverse Engineering, pages 33–43, 1997.

23. C.T. Zahn. Graph-theoretical methods for detecting and describing gestalt clusters. IEEE

Transactions on Computers, C(20):68–86, 1971.