Authors: Michele Ianni 1 ; Elio Masciari 2 ; Giuseppe M. Mazzeo 3 and Carlo Zaniolo 4

Affiliations: 1 DIMES, University of Calabria, Rende (CS), Italy ; 2 ICAR-CNR, Rende (CS), Italy ; 3 Facebook, Menlo Park, U.S.A. ; 4 UCLA, Los Angeles, U.S.A.

Keyword(s): Clustering, Big Data, Spark.

Abstract: The need to support advanced analytics on Big Data is driving data scientists' interest toward massively parallel distributed systems and the software platforms, such as Map-Reduce and Spark, that make their scalable use possible. However, when complex data mining algorithms are required, their fully scalable deployment on such platforms faces a number of technical challenges that grow with the complexity of the algorithms involved. Thus, algorithms that were originally designed for sequential execution must often be redesigned in order to use distributed computational resources effectively. In this paper, we explore these problems and then propose a solution that has proven to be very effective on the complex hierarchical clustering algorithm CLUBS+. Using four stages of successive refinement, CLUBS+ delivers high-quality clusters of data grouped around their centroids, working in a totally unsupervised fashion. Experimental results confirm the accuracy and scalability of CLUBS+ on Map-Reduce platforms.

CC BY-NC-ND 4.0

Paper citation in several formats:
Ianni, M.; Masciari, E.; Mazzeo, G. M. and Zaniolo, C. (2018). Clustering Big Data. In Proceedings of the 7th International Conference on Data Science, Technology and Applications - DATA; ISBN 978-989-758-318-6; ISSN 2184-285X, SciTePress, pages 276-282. DOI: 10.5220/0006858702760282

@conference{data18,
author={Michele Ianni and Elio Masciari and Giuseppe M. Mazzeo and Carlo Zaniolo},
title={Clustering Big Data},
booktitle={Proceedings of the 7th International Conference on Data Science, Technology and Applications - DATA},
year={2018},
pages={276-282},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006858702760282},
isbn={978-989-758-318-6},
issn={2184-285X},
}

TY - CONF

JO - Proceedings of the 7th International Conference on Data Science, Technology and Applications - DATA
TI - Clustering Big Data
SN - 978-989-758-318-6
IS - 2184-285X
AU - Ianni, M.
AU - Masciari, E.
AU - Mazzeo, G. M.
AU - Zaniolo, C.
PY - 2018
SP - 276
EP - 282
DO - 10.5220/0006858702760282
PB - SciTePress