Lastly, we conduct a similar comparison on a much
larger, more realistic data set, data-UCI, provided by the
University of California, Irvine (Gjoka; Gjoka,
Kurant et al. 2011). The graph has 1,189,768 nodes and
29,760,300 edges in a single component, with an average
degree of 50 and a maximal degree of 4,411. The
results are shown in Figures 8 and 9. Similar
observations can be made, and the advantages of ROBE
are even more evident.
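As a quick sanity check on the data-UCI figures, the reported average degree can be recovered from the node and edge counts. The following is a minimal sketch, assuming the graph is undirected and that the edge count lists each edge once; the variable names are illustrative and not part of ROBE.

```python
# Figures quoted in the text for the data-UCI graph.
num_nodes = 1_189_768
num_edges = 29_760_300

# Assumption: the graph is undirected and each edge is counted once,
# so every edge contributes to the degree of two nodes.
average_degree = 2 * num_edges / num_nodes

print(f"average degree = {average_degree:.2f}")
```

Under these assumptions the value rounds to the average degree of 50 quoted in the text.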
7 CONCLUSIONS
In this work we have proposed and investigated a
technique called ROBE, which is based on a hub
containing representative nodes, bridge nodes, and
extension nodes. Graph compression techniques
including clique compression, chain collapsing, and
tentacle retracting are exploited in order to reduce
the size and overall computation for the hub.
If all eligible representative nodes are chosen,
our scheme has a zero miss rate. Otherwise, its miss
rate is still very low. It also enjoys a low error rate,
in addition to its short construction time and low
cost for shortest-path queries. We have detailed our
design, evaluated ROBE extensively against related
schemes, and experimented on two real data sets.
The results suggest that ROBE
can serve as a good candidate for shortest-path
computation in large social networks.
REFERENCES
Agrawal, R. and R. Srikant (1994). Fast Algorithms for
Mining Association Rules in Large Databases.
VLDB'94, Proceedings of 20th International
Conference on Very Large Data Bases, September 12-
15, 1994, Santiago de Chile. J. B. Bocca, M. Jarke and
C. Zaniolo, Morgan Kaufmann: 487-499.
Akiba, T., Y. Iwata and Y. Yoshida (2013). Fast Exact
Shortest-Path Distance Queries on Large Networks by
Pruned Landmark Labeling. SIGMOD. New York.
Apostolico, A. and G. Drovandi (2009). "Graph
Compression by BFS." Algorithms 2: 1031-1044.
Backstrom, L., P. Boldi, M. Rosa, J. Ugander and S.
Vigna (2012). Four degrees of separation. WebSci.
Evanston, IL, USA: 33-42.
Baswana, S. and S. Sen (2006). "Approximate Distance
Oracles for Unweighted Graphs in Expected O(n^2)
Time." ACM Transactions on Algorithms 2(4): 557-577.
Bellman, R. (1958). "On a routing problem." Quarterly of
Applied Mathematics (16): 87–90.
Chen, Z., Y. Chen, C. Ding, B. Deng and X. Li (2011).
Pomelo: accurate and decentralized shortest-path
distance estimation in social graphs. ACM SIGCOMM
Conference. Toronto, ON, Canada: 406-407.
Cohen, E., E. Halperin, H. Kaplan and U. Zwick (2003).
"Reachability and distance queries via 2-hop labels."
SIAM J. Comput. 32(5): 1338–1355.
Dijkstra, E. W. (1959). "A note on two problems in
connexion with graphs." Numerische Mathematik 1(1):
269–271.
Fan, W., J. Li, X. Wang and Y. Wu (2012). Query
preserving graph compression. SIGMOD. Scottsdale,
Arizona, USA: 157-168.
Feder, T. and R. Motwani (1995). "Clique partitions,
graph compression and speeding-up algorithms."
Journal of Computer And System Sciences 51: 261-
272.
Gao, J., R. Jin, J. Zhou, J. X. Yu, X. Jiang and T. Wang
(2011). Relational approach for shortest path
discovery over large graphs. Proc. VLDB Endow.
Gjoka, M. from http://odysseas.calit2.uci.edu/doku.php/
public:online_social_networks.
Gjoka, M., M. Kurant, C. T. Butts and A. Markopoulou
(2011). "Practical Recommendations on Crawling
Online Social Networks." IEEE J. Sel. Areas Commun.
on Measurement of Internet Topologies 29(9): 1872-
1892.
Gubichev, A., S. Bedathur, S. Seufert and G. Weikum
(2010). Fast and Accurate Estimation of Shortest
Paths in Large Graphs. CIKM, Toronto, Ontario,
Canada.
Jin, R., N. Ruan, Y. Xiang and V. E. Lee (2012). A
highway-centric labeling approach for answering
distance queries on large sparse graphs. SIGMOD:
445-456.
Jin, R., Y. Xiang, N. Ruan and D. Fuhry (2009).
3-hop: a high-compression indexing scheme for
reachability query. SIGMOD.
Karande, C., K. Chellapilla and R. Andersen (2009).
Speeding up algorithms on compressed web graphs.
Proceedings of the Second ACM International
Conference on Web Search and Data Mining,
Barcelona, Spain.
Potamias, M., F. Bonchi, C. Castillo and A. Gionis (2009).
Fast shortest path distance estimation in large
networks. Proceedings of the 18th ACM conference on
Information and knowledge management. Hong Kong,
China: 867-876.
Qiao, M., H. Cheng, L. Chang and J. X. Yu (2012).
Approximate Shortest Distance Computing: A Query-
Dependent Local Landmark Scheme. ICDE,
Washington, DC, USA (Arlington, Virginia), 1-5
April, 2012.
Ruan, N., R. Jin and Y. Huang (2011). Distance
Preserving Graph Simplification. ICDM. Vancouver,
Canada: 1200-1205.
Sarma, A. D., S. Gollapudi, M. Najork and R. Panigrahy
(2010). A Sketch-Based Distance Oracle for Web-
Scale Graphs. WSDM, New York, USA.
SNAP. (2009). from http://snap.stanford.edu/data/.
Thorup, M. and U. Zwick (2005). "Approximate Distance
Oracles." Journal of the ACM 52(1): 1-24.
ICEIS 2015 - 17th International Conference on Enterprise Information Systems