algorithm proposed in (Liu, 2008), which was originally
designed for finding maximal quasi-cliques, for comparison
with our proposed method.
We also propose an algorithm that
efficiently solves the special case γ = 1. Two
synthetic datasets and one real dataset are used to
test the proposed methods, and the experimental
results show that our methods outperform
the modified Quick algorithm in most cases.
The remainder of this paper is organized as
follows. Related work is reviewed in Section
2. The preliminaries are given in Section 3.
The modified Quick algorithm and our methods are
detailed in Section 4. The performance
evaluation of the proposed methods is presented in
Section 5. Finally, Section 6 concludes this work.
2 RELATED WORKS
Dense graph problems arise in a
variety of applications, such as finding thematic
groups, organizing social events, and tag suggestion
(Sozio, 2010), (Tsourakakis, 2013). A clique, also
known as a complete graph, is a typical dense graph
in which all vertices are connected to each other.
The problem of finding a clique of a given size k
in a graph is NP-complete, and enumerating all
maximal cliques is even more difficult. (Du, 2006)
study techniques for enumerating all
maximal cliques in a complex network. For general
undirected graphs, Xiang et al. propose a color-based
technique to compute an upper bound on the
size of cliques in (Xiang, 2013). Vertices are colored so that
adjacent vertices receive different colors; hence, if two vertices have
the same color, no edge exists between
them. Since cliques are complete
graphs, all vertices of a clique must have distinct colors, so the
number of colors used in the graph is an upper bound on
the size of any clique that can be found.
A partitioning algorithm is also designed in (Xiang,
2013), which computes the maximum clique on
MapReduce using a branch-and-bound search. (Zou,
2010) combine the maximal clique problem with
top-k queries. They assume that real-world graph data is
generally noisy; such graphs are
called uncertain graphs. In an uncertain graph,
vertices and edges carry weights
representing their probabilities of existence. Once
a sub-graph is confirmed to be a clique, its
score is calculated from the weights
of its vertices and edges. The score is then used to
prune vertices that cannot form
cliques with larger scores.
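
The color-based bound described above can be illustrated with a minimal Python sketch. It is not the algorithm of (Xiang, 2013); it assumes a simple greedy coloring with a highest-degree-first vertex order, and the names greedy_coloring, clique_size_upper_bound, and the toy graph are hypothetical. Because every vertex of a clique needs its own color, the number of colors used by any proper coloring bounds the clique size from above.

```python
def greedy_coloring(adj):
    """Give each vertex the smallest color not used by its colored neighbors.

    `adj` maps each vertex to the set of its neighbors (undirected graph).
    The highest-degree-first order is an assumption of this sketch.
    """
    color = {}
    for v in sorted(adj, key=lambda x: len(adj[x]), reverse=True):
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color


def clique_size_upper_bound(adj):
    """The number of colors in a proper coloring bounds the size of any clique."""
    coloring = greedy_coloring(adj)
    return len(set(coloring.values()))


# Toy graph: a triangle {a, b, c} plus a pendant vertex d attached to c.
example = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
bound = clique_size_upper_bound(example)
```

The bound is not tight in general: a greedy coloring may use more colors than the size of the largest clique, so it can only be used to discard candidates, never to confirm a clique.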
On the other hand, researchers have also considered
quasi-cliques, another type of dense graph, whose
definition varies across studies.
(Tsourakakis, 2013) define a threshold on the
number of edges in a quasi-clique, and note that
each vertex needs to connect to most other vertices in a
quasi-clique. (Brunato, 2007) use two
parameters to define a quasi-clique: the first
determines the number of neighbors of each vertex
in the quasi-clique, and the second determines the
number of edges in the quasi-clique. (Abello, 2002)
and (Liu, 2008) use similar definitions of
quasi-cliques, based on the degree of each vertex
within the same sub-graph. (Abello, 2002) propose an
algorithm for finding a single maximal quasi-clique.
(Liu, 2008) propose the Quick algorithm for finding
maximal quasi-cliques in a graph. The basic idea of
the Quick algorithm is to explore the search space in
depth-first order and to apply several
pruning techniques to reduce the execution time. We
describe the detailed steps of the Quick algorithm in
Section 4 for comparison with our method.
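
To make the depth-first exploration concrete, the following Python sketch enumerates vertex sets in depth-first order and keeps the maximal degree-based quasi-cliques. It is deliberately naive and is not the Quick algorithm: it applies none of the pruning techniques of (Liu, 2008) and takes exponential time. It assumes the degree threshold ⌈γ(|S| − 1)⌉ from Definition 1 of this paper; all function names and the min_size parameter are hypothetical.

```python
from math import ceil


def is_connected(adj, members):
    """Check that the sub-graph induced by the non-empty set `members` is connected."""
    start = next(iter(members))
    seen = {start}
    stack = [start]
    while stack:
        x = stack.pop()
        for y in adj[x] & members:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return seen == members


def is_quasi_clique(adj, members, gamma):
    """Degree-based test: every vertex has >= ceil(gamma * (|S| - 1)) neighbors in S."""
    members = set(members)
    if not members:
        return False
    # Note: floating-point gamma may round slightly; exact fractions avoid this.
    need = ceil(gamma * (len(members) - 1))
    if any(len(adj[x] & members) < need for x in members):
        return False
    return is_connected(adj, members)


def maximal_quasi_cliques(adj, gamma, min_size=2):
    """Enumerate all vertex sets depth-first, then keep the maximal quasi-cliques."""
    order = sorted(adj)
    found = []

    def dfs(current, next_index):
        if len(current) >= min_size and is_quasi_clique(adj, current, gamma):
            found.append(frozenset(current))
        # A superset of a non-quasi-clique can still be a quasi-clique,
        # so this naive search cannot stop extending when the test fails.
        for i in range(next_index, len(order)):
            current.append(order[i])
            dfs(current, i + 1)
            current.pop()

    dfs([], 0)
    return [s for s in found if not any(s < t for t in found)]


# Toy graph: the 4-cycle 1-2-3-4-1.
cycle = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}}
cycle_half = maximal_quasi_cliques(cycle, 0.5)
cycle_full = maximal_quasi_cliques(cycle, 1.0)
```

With γ = 0.5 the whole cycle is the only maximal quasi-clique, while with γ = 1 the test reduces to the clique condition and only the four edges remain. The comment inside dfs points at why pruning is non-trivial: the quasi-clique property is not inherited by subsets.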
3 PRELIMINARIES
In this section, we describe the notation and terms
used in this paper, and formally define the
problem of finding maximal quasi-cliques for a
target vertex in a graph. Let G =
(V, E) be a simple graph, where V denotes a set of vertices and E
denotes a set of edges, representing objects and the
relationships among objects, respectively. That is, if
two objects have a relationship, an edge exists between
the two corresponding vertices. An edge is
denoted by (u, v), where u, v ∈ V. |V|
and |E| denote the number of vertices and the
number of edges in the graph, respectively.
N_G(v) = {u | (u, v) ∈ E} denotes the set of neighbors of a vertex v in
G; |N_G(v)| therefore denotes the degree of v in G.
dist_G(u, v) denotes the distance between the vertices u
and v, which equals the minimum number
of edges traversed on a path from u to v in G. G' = (V', E') is
a sub-graph of G = (V, E) when V' ⊂ V, E' ⊂ E, and
for any u and v in V', if (u, v) ∈ E, then (u, v) ∈ E'.
In the following discussion, we also use a set of
vertices to represent the corresponding sub-graph.
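
The notation above maps directly onto an adjacency-set representation. The sketch below is only an illustration under that assumption; the helper names build_adjacency, dist, and induced_subgraph and the toy path graph are hypothetical and do not come from the paper.

```python
from collections import deque


def build_adjacency(vertices, edges):
    """Return N_G as a dict mapping each vertex v to its neighbor set N_G(v)."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def dist(adj, u, v):
    """dist_G(u, v): minimum number of edges from u to v, or None if unreachable."""
    if u == v:
        return 0
    seen = {u}
    queue = deque([(u, 0)])
    while queue:
        x, d = queue.popleft()
        for y in adj[x]:
            if y == v:
                return d + 1
            if y not in seen:
                seen.add(y)
                queue.append((y, d + 1))
    return None


def induced_subgraph(adj, subset):
    """Sub-graph on V' = subset that keeps every edge of G with both ends in V'."""
    subset = set(subset)
    return {x: adj[x] & subset for x in subset}


# Toy graph: the path a - b - c - d.
adj = build_adjacency(["a", "b", "c", "d"], [("a", "b"), ("b", "c"), ("c", "d")])
degree_b = len(adj["b"])                     # |N_G(b)|
d_ad = dist(adj, "a", "d")                   # dist_G(a, d)
sub = induced_subgraph(adj, {"a", "b", "d"})
```

Because a sub-graph in this paper keeps every edge of G between its vertices, a vertex set alone determines the sub-graph, which is why induced_subgraph needs only the set V'.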
Definition 1 (Quasi-Clique): Given a sub-graph G' =
(V', E') of G, where V' ⊂ V and E' ⊂ E, G' is defined
as a quasi-clique of v with respect to a parameter γ,
denoted QC(γ, v), where v ∈ V and 0 < γ ≤ 1, if G'
satisfies the following three conditions. 1) v ∈ V'. 2)
G' is connected, which means that at least one path exists
between any two vertices in V'. 3) |N_{G'}(v)| needs to
equal or exceed ⌈(|V'| − 1) × γ⌉, ∀v ∈ V', where ⌈(|V'|
DATA 2015 - 4th International Conference on Data Management Technologies and Applications