3 DECOMPOSITION TECHNIQUES
3.1 Generalities on Decomposition Techniques
The objective of a decomposition method is to split a large problem into a collection of interconnected but easier sub-problems. Decomposition techniques apply to a wide range of problems, and a large strand of research is therefore dedicated to them. The decomposition process depends on the nature of the problem and on how it is modelled (Schaeffer, 2007). This study focuses on graph decompositions, such as graph partitioning and graph clustering, which are particularly well suited to optimization problems modelled by graphs.
This section uses the terms clustering and partitioning interchangeably and proposes methods to build a k-partition $\{C_1, C_2, \ldots, C_k\}$ of a given weighted graph $G = \langle V, E \rangle$. The clusters of the partition share no variable and are connected by a set of edges; the endpoints of these edges constitute the cut of the decomposition. Building such a k-partition can be done in many ways. Each method depends on the expected structure of the clusters, the expected properties of the cut and the main goal of the resulting partition. Moreover, decomposition techniques can be global or local (Schaeffer, 2007). Local decompositions have been discarded in this study because they assign a cluster to only some of the variables of the problem, whereas in global decomposition methods every variable is assigned to one cluster of the resulting partition.
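For concreteness, the short Python sketch below extracts the cut of a given k-partition, i.e. the edges whose endpoints belong to different clusters; the edge-list representation and the function name cut_of_partition are illustrative assumptions and are not part of the original formulation.

from typing import Dict, Hashable, List, Set, Tuple

Edge = Tuple[Hashable, Hashable, float]

def cut_of_partition(edges: List[Edge],
                     clusters: List[Set[Hashable]]) -> List[Edge]:
    """Return the edges whose two endpoints lie in different clusters."""
    # Map each vertex to the index of the cluster containing it.
    cluster_of: Dict[Hashable, int] = {}
    for idx, cluster in enumerate(clusters):
        for v in cluster:
            cluster_of[v] = idx
    # An edge belongs to the cut iff its endpoints are in distinct clusters.
    return [(u, v, w) for (u, v, w) in edges
            if cluster_of[u] != cluster_of[v]]

# Example: a 3-partition of a small weighted graph.
edges = [("a", "b", 1.0), ("b", "c", 2.0), ("c", "d", 1.0),
         ("d", "e", 3.0), ("e", "a", 1.0)]
partition = [{"a", "b"}, {"c", "d"}, {"e"}]
print(cut_of_partition(edges, partition))
# -> [('b', 'c', 2.0), ('d', 'e', 3.0), ('e', 'a', 1.0)]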
The approach proposed in this paper is completely generic: it is not tied to any particular decomposition method. Its performance should therefore be assessed with several decomposition methods having different properties. However, since the aim of this first work is rather to validate the new $AGAGD_{x,y}$ algorithm, the well-known and powerful clustering algorithm due to Newman (Newman, 2004) is taken as the target decomposition method.
3.2 Newman Algorithm
In recent years, with the development of research on the web, many clustering algorithms for data mining, information retrieval or knowledge mining have been proposed. A property common to all these algorithms is the notion of community structure: the nodes of the network are grouped into clusters with high internal density, and the clusters are sparsely connected to each other. To detect community structure in networks, an algorithm based on the iterative removal of edges is proposed in (Girvan and Newman, 2002).
The main drawback of this algorithm is its computational time: its worst-case time complexity is $O(m \times n^2)$ on a network with $m$ edges and $n$ nodes, or $O(n^3)$ on a sparse graph. This limits the use of this algorithm to problems with a few thousand nodes at most. A more efficient algorithm for detecting community structure is presented in (Newman, 2004), with a worst-case time complexity of $O((m+n) \times n)$, or $O(n^2)$ on a sparse graph. In practice, this algorithm runs in a reasonable time on current computers for networks of up to a million vertices, instances that were previously intractable. The principle of this new algorithm (denoted the Newman algorithm) is based on the idea of modularity. The first algorithm, presented in (Girvan and Newman, 2002), (Newman, 2004), splits the network into communities regardless of whether the network naturally has such a division.
To measure the meaningfulness of a decomposition, a quality function Q, called the modularity, is associated with it. Given a network $G = \langle V, E \rangle$, let $e_{ij}$ be the fraction of edges in $G$ that connect the nodes in cluster $i$ to those in cluster $j$, and let $a_i = \sum_j e_{ij}$; then
$$Q = \sum_i \left( e_{ii} - a_i^2 \right).$$
In practice, values of Q greater than about 0.3 indicate a significant community structure. In (Newman, 2004), an alternative approach is suggested to find community structures: Q is simply optimized instead of iteratively removing edges. However, the exact optimization of Q is very expensive: exhaustively examining all possible divisions takes at least an exponential amount of time and is infeasible for networks larger than 20 or 30 nodes. Heuristic or metaheuristic algorithms must therefore be used to approximate the optimum.
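As a concrete reading of the formula above, the following sketch computes Q for a given partition of an unweighted graph; the edge-list representation and the helper names are illustrative assumptions, and each edge is counted as two edge ends so that the values $e_{ij}$ sum to 1.

from typing import Hashable, List, Set, Tuple

def modularity(edges: List[Tuple[Hashable, Hashable]],
               clusters: List[Set[Hashable]]) -> float:
    """Modularity Q of the partition 'clusters' of an unweighted graph."""
    k = len(clusters)
    cluster_of = {v: i for i, c in enumerate(clusters) for v in c}
    m = len(edges)
    # e[i][j]: fraction of edge ends joining cluster i to cluster j,
    # built so that the e[i][j] sum to 1 (each edge contributes twice).
    e = [[0.0] * k for _ in range(k)]
    for u, v in edges:
        i, j = cluster_of[u], cluster_of[v]
        e[i][j] += 1.0 / (2 * m)
        e[j][i] += 1.0 / (2 * m)
    a = [sum(row) for row in e]               # a_i = sum_j e_ij
    return sum(e[i][i] - a[i] ** 2 for i in range(k))

# Two triangles joined by a single edge: a clearly significant structure.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(modularity(edges, [{0, 1, 2}, {3, 4, 5}]))   # ~ 0.36 > 0.3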
Newman uses an iterative agglomerative algorithm, i.e. a bottom-up hierarchical one. The algorithm starts with n clusters or communities, each containing a single node. The communities are then repeatedly joined in pairs, choosing at each step the join that results in the largest increase (or smallest decrease) of Q. The successive joins can be represented as a dendrogram, and cutting this dendrogram at different levels gives divisions of the graph into different numbers of communities of different sizes. The best cut is chosen by looking for the maximal value of Q. This version of the algorithm runs in $O(n^2)$ on sparse graphs.
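To make the agglomerative scheme explicit, the following deliberately naive sketch reuses the modularity function given above: it starts from singleton communities, performs at each step the join with the highest resulting Q, and returns the level of the dendrogram with maximal modularity. Recomputing Q from scratch for every candidate join makes it far slower than the $O((m+n) \times n)$ implementation of (Newman, 2004); the names and structure are illustrative only.

from itertools import combinations

def newman_greedy(vertices, edges):
    """Greedy agglomerative modularity maximization (naive version)."""
    clusters = [{v} for v in vertices]             # n singleton communities
    best_q, best_partition = modularity(edges, clusters), list(clusters)
    while len(clusters) > 1:
        # Try every pairwise join and keep the one giving the highest Q
        # (i.e. the largest increase or smallest decrease of Q).
        q_join, joined = float("-inf"), None
        for i, j in combinations(range(len(clusters)), 2):
            candidate = [c for idx, c in enumerate(clusters)
                         if idx not in (i, j)]
            candidate.append(clusters[i] | clusters[j])
            q = modularity(edges, candidate)
            if q > q_join:
                q_join, joined = q, candidate
        clusters = joined
        if q_join > best_q:                        # best cut of the dendrogram
            best_q, best_partition = q_join, clusters
    return best_partition, best_q

# Reusing the two-triangle example above, the greedy scheme recovers the
# two communities together with the maximal modularity along the dendrogram.
print(newman_greedy(list(range(6)), edges))

Off-the-shelf implementations of this greedy scheme are also available, for instance greedy_modularity_communities in the networkx Python library.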