nodes, while our approach is more scalable since the
workers always see only a small portion of the
complete graph.
The pattern matching approach presented in (Dörr,
1995) can match multiple patterns simultaneously
when they share isomorphic sub-patterns. The
algorithm runs in single-core environments, but it
identifies the parts that the patterns have in common
and matches them only once. This yields a noticeable
performance gain, since only the differing parts of
the patterns must be matched separately afterwards.
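The idea of matching shared parts only once can be illustrated with a deliberately simplified sketch. It is not the algorithm of (Dörr, 1995): it assumes that patterns are plain label sequences (paths) and that the shared part is their common prefix. The names `common_prefix`, `extend`, `match_path`, and `match_shared`, as well as the dictionary-based graph encoding, are hypothetical and chosen only for illustration.

```python
def common_prefix(a, b):
    """Return the longest shared leading run of labels of two path patterns."""
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return a[:n]


def extend(paths, labels, graph, node_label):
    """Extend every partial path by one node for each remaining label."""
    for lab in labels:
        paths = [
            p + [m]
            for p in paths
            for m in graph.get(p[-1], [])
            if node_label[m] == lab and m not in p
        ]
    return paths


def match_path(pattern, graph, node_label):
    """Match one non-empty path pattern from scratch."""
    seeds = [[n] for n in node_label if node_label[n] == pattern[0]]
    return extend(seeds, pattern[1:], graph, node_label)


def match_shared(p1, p2, graph, node_label):
    """Match two path patterns, matching their common prefix only once."""
    shared = common_prefix(p1, p2)
    if not shared:
        # Nothing in common: fall back to two independent matches.
        return (match_path(p1, graph, node_label),
                match_path(p2, graph, node_label))
    partial = match_path(shared, graph, node_label)  # shared work, done once
    k = len(shared)
    return (extend(partial, p1[k:], graph, node_label),
            extend(partial, p2[k:], graph, node_label))
```

In this sketch the saving comes from computing `partial` once and reusing it for both patterns; only the differing suffixes are matched separately.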
6 CONCLUSIONS AND FUTURE WORK
In this paper, we have introduced a new pattern
specification language. The presented concept makes it
possible to define inexact patterns in a concise way.
We have also presented the algorithm MRMM, a
MapReduce-based method for detecting inexact
patterns in large graphs. The algorithm finds all
subgraphs corresponding to the defined pattern in the
host graph.
The MapReduce framework is designed for processing
large data sets. It can therefore be suitable for
graph-related algorithms, provided that the graphs are
represented as text files. In this paper, we have also
described the data structure we applied.
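As a minimal illustration of processing a text-based graph in the MapReduce style, the following sketch assumes a hypothetical line format, `node_id<TAB>label<TAB>comma-separated neighbours`, which is not necessarily the data structure used by MRMM. The functions `parse_line`, `mapper`, `reducer`, and `run_job` are illustrative names. The job is simulated in a single process and derives the predecessor list of every node.

```python
from collections import defaultdict


def parse_line(line):
    """Parse 'node_id<TAB>label<TAB>n1,n2,...' (assumed format) into its parts."""
    node_id, label, nbrs = line.rstrip("\n").split("\t")
    return node_id, label, [n for n in nbrs.split(",") if n]


def mapper(line):
    """Emit (target, source) for every outgoing edge of the node on this line."""
    node_id, _label, nbrs = parse_line(line)
    for target in nbrs:
        yield target, node_id


def reducer(key, values):
    """Collect all predecessors of one node into a sorted list."""
    return key, sorted(values)


def run_job(lines):
    """Simulate the map, shuffle (group by key), and reduce phases locally."""
    groups = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            groups[key].append(value)
    return dict(reducer(k, v) for k, v in groups.items())
```

Each mapper call sees only one line, that is, a single node and its outgoing edges, so no worker needs the complete graph in memory.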
Owing to space limitations, this paper focuses on the
description of the new language concept and the new
matcher algorithm. We plan to present a detailed
performance evaluation, together with the experience
gained while applying the approach, in a separate
publication.
Further future work includes the analysis of the
presented algorithms. Since the MapReduce framework
is not optimized for I/O operations, the sizes of the
produced outputs are critical. To evaluate the
efficiency of the algorithms, we also intend to perform
various measurements.
REFERENCES
Amazon Web Services (2013). http://aws.amazon.com.
Apache Hadoop (2011). Apache Hadoop Project.
http://hadoop.apache.org/.
Berry, J. W. (2010). Practical heuristics for inexact
subgraph isomorphism.
Berry, J. W., Hendrickson, B., Kahan, S., and Konecny, P.
(2007). Software and algorithms for graph queries on
multithreaded architectures. In International Parallel
and Distributed Processing Symposium, pages 1–14.
IEEE.
Coffman, T., Greenblatt, S., and Marcus, S. (2004).
Graph-based technologies for intelligence analysis.
Communications of the ACM, 47(3):45–47.
Dean, J. and Ghemawat, S. (2008). MapReduce: simplified
data processing on large clusters. Communications of
the ACM, 51(1):107–113.
Dörr, H. (1995). Efficient Graph Rewriting and Its
Implementation. Springer-Verlag New York, Inc., Secaucus,
NJ, USA.
Ehrig, H., Ehrig, K., Prange, U., and Taentzer, G. (2006).
Fundamentals of Algebraic Graph Transformation
(Monographs in Theoretical Computer Science. An
EATCS Series). Springer-Verlag New York, Inc.,
Secaucus, NJ, USA.
Fehér, P., Vajk, T., Charaf, H., and Lengyel, L. (2013).
MapReduce algorithm for finding st-connectivity. In
4th IEEE International Conference on Cognitive
Infocommunications - CogInfoCom 2013.
Karloff, H., Suri, S., and Vassilvitskii, S. (2010). A model
of computation for MapReduce. In Proceedings of the
Twenty-First Annual ACM-SIAM Symposium on
Discrete Algorithms, pages 938–948. Society for
Industrial and Applied Mathematics.
Kim, S.-H., Lee, K.-H., Choi, H., and Lee, Y.-J. (2013).
Parallel processing of multiple graph queries using
MapReduce. In DBKDA 2013, The Fifth International
Conference on Advances in Databases, Knowledge,
and Data Applications, pages 33–38.
Liu, Y., Jiang, X., Chen, H., Ma, J., and Zhang, X. (2009).
MapReduce-based pattern finding algorithm applied
in motif detection for prescription compatibility
network. In Advanced Parallel Processing Technologies,
pages 341–355. Springer.
Mezei, G., Levendovszky, T., Meszaros, T., and Madari, I.
(2009). Towards truly parallel model transformations:
A distributed pattern matching approach. In
EUROCON 2009, EUROCON ’09. IEEE, pages 403–410.
Plantenga, T. (2012). Inexact subgraph isomorphism in
MapReduce. Journal of Parallel and Distributed
Computing.
Plump, D. (1998). Termination of graph rewriting is
undecidable. Fundamenta Informaticae, 33(2):201–209.
Tong, H., Faloutsos, C., Gallagher, B., and Eliassi-Rad,
T. (2007). Fast best-effort pattern matching in large
attributed graphs. In Proceedings of the 13th ACM
SIGKDD international conference on Knowledge
discovery and data mining, pages 737–746. ACM.
Windows Azure (2013).
http://www.windowsazure.com/en-us/.
MODELSWARD 2015 - 3rd International Conference on Model-Driven Engineering and Software Development