Authors: Joo Yong Lee (1), Sang Ho Lee (1) and Yanggon Kim (2)

Affiliations: (1) School of Computing, Soongsil University, Republic of Korea; (2) Computer and Information Sciences, Towson University, United States

Keyword(s): Web crawler, Parallel crawler, Scalability, Web database.

Related Ontology Subjects/Areas/Topics: Cloud Computing; Collaboration and e-Services; Data Engineering; e-Business; Enterprise Information Systems; Mobile Software and Services; Ontologies and the Semantic Web; Services Science; Software Agents and Internet Computing; Software Engineering; Software Engineering Methods and Techniques; Telecommunications; Web Services; Wireless Information Networks and Systems

Abstract: As the Web grows, it becomes increasingly important to parallelize the crawling process so that pages can be downloaded in a reasonable amount of time. This paper presents the design and implementation of an effective parallel web crawler. We first present various design choices and strategies for a parallel web crawler, and then describe our crawler’s architecture and implementation techniques. In particular, we investigate the URL distributor, which balances URLs across crawling processes, and the scalability of our crawler.

CC BY-NC-ND 4.0

Paper citation in several formats:
Lee, J. Y.; Lee, S. H. and Kim, Y. (2007). SCRAWLER: A SEED-BY-SEED PARALLEL WEB CRAWLER. In Proceedings of the Second International Conference on e-Business (ICETE 2007) - ICE-B; ISBN 978-989-8111-11-1, SciTePress, pages 151-156. DOI: 10.5220/0002108701510156

@conference{ice-b07,
author={Joo Yong Lee and Sang Ho Lee and Yanggon Kim},
title={SCRAWLER: A SEED-BY-SEED PARALLEL WEB CRAWLER},
booktitle={Proceedings of the Second International Conference on e-Business (ICETE 2007) - ICE-B},
year={2007},
pages={151-156},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002108701510156},
isbn={978-989-8111-11-1},
}

TY - CONF
JO - Proceedings of the Second International Conference on e-Business (ICETE 2007) - ICE-B
TI - SCRAWLER: A SEED-BY-SEED PARALLEL WEB CRAWLER
SN - 978-989-8111-11-1
AU - Lee, J. Y.
AU - Lee, S. H.
AU - Kim, Y.
PY - 2007
SP - 151
EP - 156
DO - 10.5220/0002108701510156
PB - SciTePress