will formulate the problem using 2-clubs in which a shortest
path connecting each RD patient with a specialist center or
staff member (e.g., a care center, a hospital, or clinical staff)
has to pass through at least one EP patient who has already
been followed up by the specialist "point" under consideration.
Formally, for any pair (d, h) composed of a recently
diagnosed patient d and, e.g., a health care center h, the
social platform should suggest that patient d compare
experiences with (or even meet) an identified, available
experienced patient x. More specifically, we seek, within the
input "social graph", a 2-club G[D ∪ X ∪ H], where D, X and
H represent the sets of recently diagnosed patients,
experienced patients, and care centers, respectively. Note
that when such a structure (i.e., a maximum-size 2-club)
exists within the starting social space, any pair of its
vertices must be connected by a path of length at most 2,
i.e., a path composed of at most three vertices. This, in
turn, also holds for any pair (d, h) with d ∈ D, h ∈ H.
Indeed, our goal is to find a largest 2-club with the further
property of providing, for any pair (d, h), a shortest path
given by a triple of vertices (d, x, h) ∈ D × X × H.
In this case, the set of edges modeling the starting social
network is defined as follows.
• Edges between patients with similar profiles.
• Edges expressing the fact that an experienced
patient x has already been properly followed up
by a specialist or care center h. In this case,
the edges in X × H are constructed from the
clinical history of each experienced patient x,
which identifies the clinical staff or hospital h
that has already properly followed up x.
• Edges between two vertices h_1, h_2 ∈ H, for
example because two care centers are similar
(they offer similar services or are part of the
same institution).
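As a minimal sketch of how such an edge set could be assembled, the following Python fragment builds an undirected adjacency map from three lists of pairs, one per edge type listed above. The function name `build_social_graph`, its three parameters, and the vertex labels are hypothetical and serve only as an illustration; the source does not prescribe how similarity or follow-up relations are computed.

```python
from collections import defaultdict


def build_social_graph(similar_patients, followed_up_by, similar_centers):
    """Return an undirected adjacency map {vertex: set of neighbours}.

    All three arguments are hypothetical inputs: iterables of pairs that a
    platform would derive from patient profiles and clinical histories.
    """
    adj = defaultdict(set)

    def add_edge(u, w):
        adj[u].add(w)
        adj[w].add(u)

    for p, q in similar_patients:      # patients with similar profiles
        add_edge(p, q)
    for x, h in followed_up_by:        # experienced patient x followed up by h
        add_edge(x, h)
    for h1, h2 in similar_centers:     # similar or co-institutional centers
        add_edge(h1, h2)
    return dict(adj)


# Toy instance with one recently diagnosed patient (d1), two experienced
# patients (x1, x2) and two care centers (h1, h2).
graph = build_social_graph(
    similar_patients=[("d1", "x1"), ("d1", "x2"), ("x1", "x2")],
    followed_up_by=[("x1", "h1"), ("x2", "h2")],
    similar_centers=[("h1", "h2")],
)
```

In this toy instance no edge joins d1 directly to a care center, so any connection between d1 and h1 has to go through an experienced patient.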
With these edges, the simple path given by the
triple of vertices (d, x, h) ∈ D × X × H in the
2-club G would suggest that patient d ∈ D contact
patient x ∈ X about the health care center (or
specialist staff) h ∈ H. For the sake of clarity,
before defining the problem formally, we refer
to any pair of vertices (d, h) ∈ D × H such that
a shortest path connecting d to h is given by
the vertex sequence (d, x, h), for some x ∈ X, as a
"feasible pair". Considering the above discussion,
we can define the following variant of the 2-club
maximization problem:
Problem 1. Maximum 2-Club (Max-2-club)
Input: a graph G = (D ∪ X ∪ H, R ∪ F).
Output: a set V' ⊆ D ∪ X ∪ H such that G[V'] is a
2-club of maximum size and, for each pair of vertices
(d, h) ∈ D × H in G[V'], a shortest path connecting
d to h is given by the vertex sequence (d, x, h)
for some x ∈ X (i.e., (d, h) is feasible).
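The two conditions of Problem 1 can be checked for a candidate vertex set with a few lines of Python. This is only a sketch under stated assumptions: the graph is given as an adjacency map `adj` (every vertex is a key mapping to its set of neighbours), the names `bfs_distances` and `is_feasible_2club` are hypothetical, and feasibility is read as "d and h are not adjacent and share a neighbour x ∈ X inside the candidate set".

```python
from collections import deque


def bfs_distances(adj, source, allowed):
    """Hop distances from `source` inside the subgraph induced by `allowed`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w in allowed and w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def is_feasible_2club(adj, D, X, H, subset):
    """True if G[subset] is a 2-club and every (d, h) pair in it is feasible."""
    subset = set(subset)
    # 2-club condition: every pair of chosen vertices is within distance 2
    # in the induced subgraph (unreachable vertices count as distance 3).
    for u in subset:
        dist = bfs_distances(adj, u, subset)
        if any(dist.get(w, 3) > 2 for w in subset):
            return False
    # Feasibility (assumed reading): the shortest d-h path has length 2
    # and its middle vertex is an experienced patient in the subset.
    for d in subset & D:
        for h in subset & H:
            if h in adj[d]:
                return False
            if not (adj[d] & adj[h] & X & subset):
                return False
    return True


# Toy instance: d1 knows x1 and x2, but only x1 was followed up by h1.
adj = {
    "d1": {"x1", "x2"},
    "x1": {"d1", "x2", "h1"},
    "x2": {"d1", "x1"},
    "h1": {"x1"},
}
D, X, H = {"d1"}, {"x1", "x2"}, {"h1"}
ok = is_feasible_2club(adj, D, X, H, {"d1", "x1", "h1"})
bad = is_feasible_2club(adj, D, X, H, {"d1", "x2", "h1"})
```

Such a checker only verifies a given candidate; it does not search for a maximum-size solution, which is the hard part of the problem.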
2.2 Computational Hardness
The complexity of Maximum s-Club has been extensively
studied in the literature and, unfortunately, the problem
turns out to be NP-hard for each s ≥ 1 (Bourjolly
et al., 2002); Maximum s-Club is NP-hard even if the
input graph has diameter s + 1, for each s ≥ 1
(Balasundaram et al., 2005). The same hardness holds for
our variant of Maximum 2-Club. Indeed, computing a
2-club of maximum size containing a specific vertex v
is also NP-hard. By defining D = {v}, X = N(v) and H as
the set of remaining vertices, every vertex h ∈ H in a
2-club containing v is at distance exactly 2 from v,
through a neighbor of v, so the "feasibility" property holds.
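The partition used in this argument can be illustrated on a small graph. The sketch below assumes an adjacency map `adj` as input; the function name `partition_around` and the toy vertex labels are hypothetical. It shows that, in a 2-club containing v, each vertex of H reaches v through a common neighbour that necessarily belongs to X = N(v).

```python
def partition_around(adj, v):
    """Split the vertices as in the argument: D = {v}, X = N(v), H = the rest."""
    D = {v}
    X = set(adj[v])
    H = set(adj) - D - X
    return D, X, H


# Toy graph: v is adjacent to a and b; c is adjacent to a only.
adj = {
    "v": {"a", "b"},
    "a": {"v", "b", "c"},
    "b": {"v", "a"},
    "c": {"a"},
}
D, X, H = partition_around(adj, "v")

# The whole vertex set is a 2-club here (a is adjacent to everything).
club = set(adj)
# For each h in H, collect the middle vertices of the length-2 paths v-x-h.
middles = {h: adj["v"] & adj[h] & club for h in H & club}
```

Since no vertex of H is adjacent to v by construction, every such pair is at distance exactly 2 and its middle vertices lie in X.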
Given an input graph G = (V, E), Maximum s-Club
is not approximable within factor |V|^{1/2−ε}, for
any ε > 0 and s ≥ 2 (Asahiro et al., 2010). On the
positive side, polynomial-time approximation
algorithms (Asahiro et al., 2010) have been given, with
factor |V|^{1/2} for every even s ≥ 2, and factor |V|^{2/3}
for every odd s ≥ 3. The parameterized complexity
of Maximum s-Club has also been studied, leading
to fixed-parameter algorithms (Schäfer et al., 2012;
Komusiewicz and Sorge, 2015; Chang et al., 2013).
Maximum 2-Club has also been considered for
specific graph classes (Hartung et al., 2015; Golovach
et al., 2014).
3 A GENETIC ALGORITHM
The complexity of the problems introduced so far
makes exact optimization potentially impractical. For
this reason, we designed a Genetic Algorithm
(GA) to find approximate solutions faster; see,
e.g., (Mitchell, 1996) for details.
In particular, given an input graph G = (V, E), the
proposed GA represents a solution (a subset V' ⊆ V
such that G[V'] is a 2-club of G with the property
discussed above) as a binary chromosome c of size
n = |V|, whose i-th component is defined as follows:
c[i] = 1 if v_i ∈ V', and c[i] = 0 otherwise. During
offspring generation, chromosomes are interpreted as
hypotheses of feasible solutions, although they may also
represent unfeasible solutions (e.g., s-clubs with s > 2,
disconnected graphs, or "unfeasible pairs", as defined