Authors:
Sousuke Takami and Akihiro Inokuchi
Affiliation:
Kwansei Gakuin University, Japan
Keyword(s):
Graph Edit Distance, Graph Relabeling, Graph Classification.
Related Ontology Subjects/Areas/Topics:
Classification; Graphical and Graph-Based Models; Pattern Recognition; Similarity and Distance Learning; Theory and Methods
Abstract:
The graph edit distance, a well-known metric for determining the similarity between two graphs, is commonly
used for analyzing large sets of structured data, such as those used in chemoinformatics, document analysis,
and malware detection. Because computing the exact graph edit distance is computationally expensive and may
be intractable for large-scale datasets, various approximation techniques have been developed. In this paper,
we present a method based on graph relabeling that is both faster and more accurate than the conventional
approach. We use unfolded subtrees to denote the potential relabeling of local structures around a given vertex.
These subtree representations are concatenated as a vector, and the distance between different vectors is used
to characterize the distance between the corresponding graphs. This avoids the need for multiple calculations
of the exact graph edit distance between local structures. Simulation experiments on two real-world chemical
datasets are reported. Compared with the conventional technique, the proposed method gives a more accurate
approximation of the graph edit distance and is significantly faster on both datasets. This suggests the proposed
method could be applicable in the analysis of larger and more complex graph-like datasets.
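
The general idea in the abstract, describing the local structure around each vertex by an unfolded subtree, collecting these descriptions into a vector, and comparing vectors instead of computing exact edit distances, can be illustrated with a small sketch. This is not the authors' algorithm: it is a simplified, Weisfeiler-Lehman-style illustration. The graph encoding (adjacency dictionaries plus a vertex-label dictionary), the function names, the string signatures, and the use of an L1 distance between count vectors are all assumptions made for this example.

```python
from collections import Counter


def unfolded_subtree(adj, labels, v, height):
    """Canonical string for the unfolded subtree of the given height rooted at v.

    Children are sorted so that the signature does not depend on neighbor order.
    """
    if height == 0:
        return labels[v]
    children = sorted(
        unfolded_subtree(adj, labels, u, height - 1) for u in adj[v]
    )
    return labels[v] + "(" + ",".join(children) + ")"


def subtree_vector(adj, labels, height):
    """Sparse count vector of subtree signatures for heights 0..height.

    Hypothetical representation: keys are (height, signature) pairs.
    """
    vec = Counter()
    for h in range(height + 1):
        for v in adj:
            vec[(h, unfolded_subtree(adj, labels, v, h))] += 1
    return vec


def vector_distance(x, y):
    """L1 distance between two sparse count vectors (an assumed choice)."""
    return sum(abs(x[k] - y[k]) for k in set(x) | set(y))


# Two three-vertex path graphs that differ only in the label of one end vertex.
adj = {0: [1], 1: [0, 2], 2: [1]}
labels_a = {0: "C", 1: "C", 2: "O"}
labels_b = {0: "C", 1: "C", 2: "N"}

vec_a = subtree_vector(adj, labels_a, height=1)
vec_b = subtree_vector(adj, labels_b, height=1)
print(vector_distance(vec_a, vec_b))
```

In this toy setting, a single relabeled vertex changes its own signature and those of its neighbors, so the vector distance grows with the size of the affected neighborhood. Each graph is processed once to build its vector, and pairs of graphs are then compared without any exact edit-distance computation on local structures.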