

Author: Khaled Nagi

Affiliation: Faculty of Engineering, Alexandria University, Egypt

Keyword(s): Search Engine, Scalability, Fault Tolerance, Open-Source, Lucene, Solr, NoSQL, Hadoop.

Related Ontology Subjects/Areas/Topics: Artificial Intelligence ; Business Analytics ; Business Intelligence Applications ; Data Analytics ; Data Engineering ; Information Extraction ; Knowledge Discovery and Information Retrieval ; Knowledge-Based Systems ; Symbolic Systems

Abstract: The use of search engines now extends to intelligent analytics over petabytes of data. With Lucene at the heart of the vast majority of information retrieval systems, several attempts have been made to bring it to the cloud in order to scale to big data. Efforts include implementing scalable distribution of the search indices over the file system, storing them in NoSQL databases, and porting them to inherently distributed ecosystems, such as Hadoop. We evaluate the existing efforts in terms of distribution, high availability, fault tolerance, manageability, and high performance. We believe that search indexing capabilities for big data can only be supported through common open-source technology deployed on standard cloud platforms such as Amazon EC2 and Microsoft Azure. For each approach, we build a benchmarking system by indexing the whole Wikipedia content and submitting hundreds of simultaneous search requests. We measure the performance of both indexing and searching operations. We simulate node failures and monitor the recoverability of the system. We show that a system built on top of Solr and Hadoop has the best stability and manageability, while systems based on NoSQL databases present an attractive alternative in terms of performance.

CC BY-NC-ND 4.0


Paper citation in several formats:
Nagi, K. (2015). Bringing Search Engines to the Cloud using Open Source Components. In Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2015) - KDIR; ISBN 978-989-758-158-8; ISSN 2184-3228, SciTePress, pages 116-126. DOI: 10.5220/0005632701160126

@conference{kdir15,
author={Khaled Nagi},
title={Bringing Search Engines to the Cloud using Open Source Components},
booktitle={Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2015) - KDIR},
year={2015},
pages={116-126},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005632701160126},
isbn={978-989-758-158-8},
issn={2184-3228},
}

TY - CONF

JO - Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2015) - KDIR
TI - Bringing Search Engines to the Cloud using Open Source Components
SN - 978-989-758-158-8
IS - 2184-3228
AU - Nagi, K.
PY - 2015
SP - 116
EP - 126
DO - 10.5220/0005632701160126
PB - SciTePress