Authors: Cláudia M. V. Silvestre 1 ; Margarida M. G. Cardoso 2 and Mario A. T. Figueiredo 3

Affiliations: 1 Escola Superior de Comunicação Social, Portugal ; 2 Lisbon University Institute, Portugal ; 3 Instituto Superior Técnico, Portugal

Keyword(s): Cluster analysis, Finite mixture models, Feature selection, EM algorithm, Categorical variables.

Related Ontology Subjects/Areas/Topics: Artificial Intelligence ; Business Analytics ; Clustering and Classification Methods ; Data Analytics ; Data Engineering ; Knowledge Discovery and Information Retrieval ; Knowledge-Based Systems ; Symbolic Systems

Abstract: There has been relatively little research on feature/variable selection in unsupervised clustering. Feature selection for clustering is a challenging task because there are no class labels to guide the search for relevant features. The methods proposed for addressing this problem are mostly focused on numerical data. In this work, we propose an approach to selecting categorical features in clustering. We assume that the data come from a finite mixture of multinomial distributions and implement a new expectation-maximization (EM) algorithm that estimates the parameters of the model and selects the relevant variables. The results obtained on synthetic data clearly illustrate the capability of the proposed approach to select the relevant features.
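The modelling assumption in the abstract can be made concrete with a small sketch. The code below is not the authors' algorithm: it is a plain EM fit of a finite mixture of independent categorical (multinomial, one trial) features, with no feature-selection step. The function name `em_categorical_mixture`, its parameters, and the synthetic data are all assumptions introduced for illustration.

```python
import numpy as np


def em_categorical_mixture(X, n_components, n_categories, n_iter=100, seed=0, eps=1e-10):
    """Plain EM for a mixture of independent categorical features.

    Illustrative only: this does NOT perform feature selection.
    X is an (n, d) integer array with entries in {0, ..., n_categories - 1}.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    onehot = np.eye(n_categories)[X]  # shape (n, d, c)

    # Mixing proportions and per-component, per-feature category probabilities.
    weights = np.full(n_components, 1.0 / n_components)
    theta = rng.dirichlet(np.ones(n_categories), size=(n_components, d))  # (k, d, c)

    for _ in range(n_iter):
        # E-step: posterior probability of each component for each observation.
        log_p = np.log(weights)[None, :] + np.einsum(
            "ndc,kdc->nk", onehot, np.log(theta + eps)
        )
        log_p -= log_p.max(axis=1, keepdims=True)  # numerical stability
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: re-estimate mixing proportions and category probabilities.
        weights = resp.mean(axis=0)
        counts = np.einsum("nk,ndc->kdc", resp, onehot) + eps
        theta = counts / counts.sum(axis=2, keepdims=True)

    return weights, theta, resp


# Hypothetical synthetic data: two relevant features and one pure-noise feature.
rng = np.random.default_rng(1)
labels = rng.integers(0, 2, size=400)
relevant = np.where(
    labels[:, None] == 0,
    rng.choice(3, size=(400, 2), p=[0.8, 0.1, 0.1]),
    rng.choice(3, size=(400, 2), p=[0.1, 0.1, 0.8]),
)
noise = rng.integers(0, 3, size=(400, 1))
X = np.hstack([relevant, noise])

weights, theta, resp = em_categorical_mixture(X, n_components=2, n_categories=3)
print(np.round(theta, 2))  # theta[k, j] is component k's distribution for feature j
```

In a run like this, one would expect the estimated distributions of the third, uninformative feature to look similar across components, which is the kind of signal a feature-selection procedure can exploit. The approach described in the abstract selects features inside EM itself, which this sketch does not attempt.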

CC BY-NC-ND 4.0

Paper citation in several formats:
Silvestre, C. M. V.; Cardoso, M. M. G. and Figueiredo, M. A. T. (2009). SELECTING CATEGORICAL FEATURES IN MODEL-BASED CLUSTERING. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval (IC3K 2009) - KDIR; ISBN 978-989-674-011-5; ISSN 2184-3228, SciTePress, pages 303-306. DOI: 10.5220/0002303203030306

@conference{kdir09,
author={Cláudia M. V. Silvestre and Margarida M. G. Cardoso and Mario A. T. Figueiredo},
title={SELECTING CATEGORICAL FEATURES IN MODEL-BASED CLUSTERING},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval (IC3K 2009) - KDIR},
year={2009},
pages={303-306},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002303203030306},
isbn={978-989-674-011-5},
issn={2184-3228},
}

TY - CONF

JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval (IC3K 2009) - KDIR
TI - SELECTING CATEGORICAL FEATURES IN MODEL-BASED CLUSTERING
SN - 978-989-674-011-5
IS - 2184-3228
AU - Silvestre, C. M. V.
AU - Cardoso, M. M. G.
AU - Figueiredo, M. A. T.
PY - 2009
SP - 303
EP - 306
DO - 10.5220/0002303203030306
PB - SciTePress