5.3 Community Visualization
Visualization is a significant part of KANGAROO,
and also one of its most difficult parts. In this
section, we show the community visualization, which
is the toughest task for the presentation layer.
After community detection on the Karate graph, the
resulting communities are visualized in Figure 9,
where the nodes of different communities can be
clearly distinguished. Figure 10 shows a more
visually striking graph, which contains 12
communities drawn in distinct colors.
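The coloring step behind such a figure can be sketched as follows. This is a minimal illustration rather than the KANGAROO implementation: the function name community_colors, the membership dictionary and the toy data are assumptions made for the example, and community labels are assumed to be integers. Given the output of community detection (a node-to-community mapping), each community receives one evenly spaced hue, so that 12 communities yield 12 clearly different colors.

```python
import colorsys


def community_colors(membership):
    """Map every node to a hex color, using one hue per community.

    `membership` is a hypothetical dict {node: community_label}; it stands
    in for whatever the community detection step actually returns.
    """
    communities = sorted(set(membership.values()))
    k = len(communities)
    palette = {}
    for i, label in enumerate(communities):
        # Spread hues evenly around the color wheel; fixed saturation/value.
        r, g, b = colorsys.hsv_to_rgb(i / k, 0.65, 0.95)
        palette[label] = "#{:02x}{:02x}{:02x}".format(
            round(r * 255), round(g * 255), round(b * 255)
        )
    # Nodes in the same community share the same color.
    return {node: palette[label] for node, label in membership.items()}


# Toy input (assumed, not taken from the paper): 34 nodes, 12 communities.
toy_membership = {node: node % 12 for node in range(34)}
node_colors = community_colors(toy_membership)
```

The resulting node-to-color mapping can then be handed to any graph-drawing library that accepts per-node colors.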
6 CONCLUSIONS AND FUTURE WORK
Motivated by the recently increasing demand for
large-scale social network analysis, this paper
introduces our solution, KANGAROO, which is based
on a distributed system. We report how to construct
it using the open-source Hadoop project and describe
its advantages, including the ability to analyze
huge-scale social networks and high performance
with a linear speedup ratio.
We use KANGAROO to analyze a huge-scale real-world
network with ten million nodes and edges, covering
basic statistics, community detection and community
visualization. In these cases, we show not only the
results of our algorithms but also the speedup ratio
of our system.
At present, KANGAROO is an ongoing system. Our
future work will continue its construction, focusing
especially on computing modes and algorithms for
large-scale social networks, as well as scalability
and performance. Fundamentally, we hope KANGAROO
can serve as a practical social network analysis
framework for huge-scale real-world networks.
ACKNOWLEDGEMENTS
This work is supported by the National Science
Foundation of China (Grant Nos. 60905025, 90924029,
61074128), the National High Technology Research
and Development Program of China
(No. 2009AA04Z136), the National Key Technology
R&D Program of China (No. 2006BAH03B05) and the
Fundamental Research Funds for the Central
Universities.