# ROBE - Knitting a Tight Hub for Shortest Path Discovery in Large Social Graphs

### Lixin Fu, Jing Deng

#### Abstract

Scalable and efficient algorithms are needed to compute shortest paths between any pair of vertices in large social graphs. In this work, we propose a novel scheme, ROBE, to estimate shortest distances. ROBE is based on a hub that serves as the skeleton of the large graph. To stretch the hub into every corner of the network, we first choose as representative nodes the highest-degree nodes that are at least two hops away from each other. Bridge nodes are then selected to connect the representative nodes. Extension nodes are also added to the hub to ensure that parts that are connected in the original graph are not separated in the hub graph. To improve performance, we compress the hub through chain collapsing, tentacle retracting, and clique compression. We give a query evaluation algorithm based on the compressed hub. Through extensive simulations, we compare our approach with other state-of-the-art techniques with respect to miss rate, error rate, and construction time. ROBE is shown to be two orders of magnitude faster and to produce more accurate estimates than two recent algorithms, allowing it to scale very well to large social graphs.
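The representative-selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `select_representatives`, the parameter `k`, the adjacency-dictionary format, and the tie-breaking rule are assumptions made for illustration. It reads "at least two hops away from each other" as "not directly adjacent", and it leaves out the bridge-node, extension-node, and compression steps entirely.

```python
def select_representatives(adj, k):
    """Greedily pick up to k representatives (hypothetical helper, not from the paper).

    adj maps each node to the set of its neighbours (undirected graph).
    Nodes are visited in order of decreasing degree; a node is skipped if it
    is adjacent to an already chosen representative, so every pair of chosen
    nodes is at least two hops apart.
    """
    chosen = []
    blocked = set()  # chosen nodes and their direct neighbours
    # Assumption: ties in degree are broken by node id for determinism.
    for node in sorted(adj, key=lambda u: (-len(adj[u]), u)):
        if node in blocked:
            continue
        chosen.append(node)
        blocked.add(node)
        blocked.update(adj[node])
        if len(chosen) == k:
            break
    return chosen


# Toy graph: a triangle a-b-c with a tail a-d-e-f.
example_graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a", "e"},
    "e": {"d", "f"},
    "f": {"e"},
}

representatives = select_representatives(example_graph, 3)
print(representatives)  # ['a', 'e']
```

In the toy graph, `a` is taken first because it has the highest degree, which blocks `b`, `c`, and `d`; `e` is the next unblocked node, and after that no eligible node remains, so fewer than `k` representatives are returned.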

#### Paper Citation

#### in Harvard Style

Fu L. and Deng J. (2015). **ROBE - Knitting a Tight Hub for Shortest Path Discovery in Large Social Graphs**. In *Proceedings of the 17th International Conference on Enterprise Information Systems - Volume 1: ICEIS*, ISBN 978-989-758-096-3, pages 97-107. DOI: 10.5220/0005353500970107

#### in Bibtex Style

@conference{iceis15,

author={Lixin Fu and Jing Deng},

title={ROBE - Knitting a Tight Hub for Shortest Path Discovery in Large Social Graphs},

booktitle={Proceedings of the 17th International Conference on Enterprise Information Systems - Volume 1: ICEIS},

year={2015},

pages={97-107},

publisher={SciTePress},

organization={INSTICC},

doi={10.5220/0005353500970107},

isbn={978-989-758-096-3},

}

#### in EndNote Style

TY - CONF

JO - Proceedings of the 17th International Conference on Enterprise Information Systems - Volume 1: ICEIS

TI - ROBE - Knitting a Tight Hub for Shortest Path Discovery in Large Social Graphs

SN - 978-989-758-096-3

AU - Fu L.

AU - Deng J.

PY - 2015

SP - 97

EP - 107

DO - 10.5220/0005353500970107