# ADAPTATION AND ENHANCEMENT OF EVALUATION MEASURES TO OVERLAPPING GRAPH CLUSTERINGS

### Tatiana Gossen, Michael Kotzyba, Andreas Nürnberger

#### Abstract

Quality measures are essential for evaluating graph clustering algorithms because they provide a means to assess the quality of a derived cluster structure. In this paper, we focus on overlapping graph structures, as many real-world networks have a structure of highly overlapping cohesive groups. We propose three methods to adapt existing crisp quality measures so that they handle graph overlaps correctly while preserving their properties when they are applied to a crisp cluster structure. We demonstrate our methods on the measures Density, Newman's modularity and Conductance. We also propose an enhancement of an existing modularity measure for networks with overlapping structure. The newly proposed measures are analysed in experiments on artificial graphs that possess overlapping structure. For this evaluation, we apply a graph generation model that creates clustered graphs with overlaps that are similar to real-world networks, i.e., their node degree and cluster size distributions follow a power law.
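For orientation, the following is a minimal Python sketch of two of the crisp measures named above, Newman's modularity and Conductance, in their commonly used textbook forms for undirected, unweighted graphs. It is not the adaptation to overlapping clusterings proposed in the paper, and the exact definitions used in the paper may differ. The function names and the toy graph are assumptions made for illustration only.

```python
from collections import defaultdict


def modularity(edges, clusters):
    """Newman-Girvan modularity of a crisp (non-overlapping) clustering.

    Q = sum over clusters c of m_c / m - (d_c / (2 m))^2, where m is the
    number of edges, m_c the number of edges inside cluster c, and d_c the
    sum of the degrees of the nodes in c.
    """
    m = len(edges)
    # Each node belongs to exactly one cluster (crisp assumption).
    label = {node: i for i, c in enumerate(clusters) for node in c}
    intra = defaultdict(int)
    degree = defaultdict(int)
    for u, v in edges:
        degree[label[u]] += 1
        degree[label[v]] += 1
        if label[u] == label[v]:
            intra[label[u]] += 1
    return sum(
        intra[i] / m - (degree[i] / (2 * m)) ** 2 for i in range(len(clusters))
    )


def conductance(edges, cluster):
    """Conductance of one cluster: cut edges / min(volume inside, volume outside)."""
    cluster = set(cluster)
    cut = vol_in = vol_out = 0
    for u, v in edges:
        # Every edge endpoint adds one unit of degree to its side's volume.
        for node in (u, v):
            if node in cluster:
                vol_in += 1
            else:
                vol_out += 1
        if (u in cluster) != (v in cluster):
            cut += 1
    denom = min(vol_in, vol_out)
    return cut / denom if denom else 0.0


# Hypothetical toy graph: two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
clusters = [{0, 1, 2}, {3, 4, 5}]

print(modularity(edges, clusters))      # 5/14, about 0.357
print(conductance(edges, clusters[0]))  # 1/7, about 0.143
```

Both functions assume that every node belongs to exactly one cluster; this is the crisp assumption that has to be relaxed before such measures can be applied to overlapping clusterings.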

#### Paper Citation

#### in Harvard Style

Gossen T., Kotzyba M. and Nürnberger A. (2012). **ADAPTATION AND ENHANCEMENT OF EVALUATION MEASURES TO OVERLAPPING GRAPH CLUSTERINGS**. In *Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM*, ISBN 978-989-8425-98-0, pages 5-14. DOI: 10.5220/0003706400050014

#### in Bibtex Style

@conference{icpram12,

author={Tatiana Gossen and Michael Kotzyba and Andreas Nürnberger},

title={ADAPTATION AND ENHANCEMENT OF EVALUATION MEASURES TO OVERLAPPING GRAPH CLUSTERINGS},

booktitle={Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM},

year={2012},

pages={5-14},

publisher={SciTePress},

organization={INSTICC},

doi={10.5220/0003706400050014},

isbn={978-989-8425-98-0},

}

#### in EndNote Style

TY - CONF

JO - Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM

TI - ADAPTATION AND ENHANCEMENT OF EVALUATION MEASURES TO OVERLAPPING GRAPH CLUSTERINGS

SN - 978-989-8425-98-0

AU - Gossen T.

AU - Kotzyba M.

AU - Nürnberger A.

PY - 2012

SP - 5

EP - 14

DO - 10.5220/0003706400050014