Authors: Philipp Baumann, Dorit S. Hochbaum and Quico Spaen
Affiliation: University of California, United States
Keyword(s): Large-Scale Data Mining, Classification, Data Reduction, Supervised Normalized Cut.
Related Ontology Subjects/Areas/Topics: Classification; Embedding and Manifold Learning; ICA, PCA, CCA and other Linear Models; Pattern Recognition; Sparsity; Theory and Methods
Abstract:
Machine learning techniques that rely on pairwise similarities have proven to be leading algorithms for classification. Despite their good and robust performance, similarity-based techniques are rarely chosen for large-scale data mining because the time required to compute all pairwise similarities grows quadratically with the size of the data set. To address this issue of scalability, we introduced a method called sparse computation, which efficiently generates a sparse similarity matrix that contains only significant similarities. Sparse computation achieves significant reductions in running time with minimal and often no loss in accuracy. However, for massively large data sets, even such a sparse similarity matrix may lead to considerable running times. In this paper, we propose an extension of sparse computation, called sparse-reduced computation, that not only avoids computing very low similarities but also avoids computing similarities between highly similar or identical objects by compressing them into a single object. Our computational results show that sparse-reduced computation allows highly accurate classification of data sets with millions of objects in seconds.
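To illustrate the idea, the following Python sketch combines the two ingredients named in the abstract: near-identical objects are compressed into a single representative, and similarities are computed only between representatives that are close to each other. The specific choices here (a PCA projection to a low-dimensional space, a uniform grid for binning, centroids as representatives, and a Gaussian similarity) are assumptions for the sake of a self-contained example, not the authors' reference implementation.

import numpy as np
from itertools import product

def sparse_reduced_similarities(X, grid_resolution=10, sigma=1.0, n_dims=3):
    """Sketch of sparse-reduced computation (assumed details, see above).

    Collapses objects that fall into the same grid cell of a
    low-dimensional projection into one representative, then computes
    similarities only between representatives of the same or adjacent
    cells. All other pairwise similarities are treated as zero.
    """
    # Project onto the top principal components (assumed choice of projection).
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Xc @ Vt[:n_dims].T

    # Assign each object to a cell of a uniform grid over the projection.
    lo, hi = P.min(axis=0), P.max(axis=0)
    cells = np.floor((P - lo) / (hi - lo + 1e-12) * grid_resolution).astype(int)
    cells = np.minimum(cells, grid_resolution - 1)

    # Data reduction: collapse each occupied cell to a single representative,
    # keeping the running sum, member count, and member indices.
    reps = {}
    for i, c in enumerate(map(tuple, cells)):
        s, n, idx = reps.get(c, (np.zeros(X.shape[1]), 0, []))
        reps[c] = (s + X[i], n + 1, idx + [i])

    # Sparsity: compute similarities only between centroids of neighboring
    # cells; pairs of distant cells are never evaluated at all.
    key_set = set(reps)
    offsets = list(product((-1, 0, 1), repeat=n_dims))
    sims = {}
    for ka in reps:
        ca = reps[ka][0] / reps[ka][1]
        for off in offsets:
            kb = tuple(x + o for x, o in zip(ka, off))
            if kb in key_set and kb > ka:  # visit each unordered pair once
                cb = reps[kb][0] / reps[kb][1]
                sims[(ka, kb)] = np.exp(-np.sum((ca - cb) ** 2) / (2 * sigma**2))
    return reps, sims

# Example usage on synthetic data:
# X = np.random.rand(1_000_000, 20)
# reps, sims = sparse_reduced_similarities(X)

Under these assumptions, the cost of the similarity computation is driven by the number of occupied grid cells rather than by the number of objects, which is how compressing highly similar objects can keep even very large data sets tractable.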