Authors: Wanthanee Prachuabsupakij and Nuanwan Soonthornphisaj

Affiliation: Faculty of Science, Kasetsart University, Thailand

ISBN: 978-989-8425-79-9

Keyword(s): Imbalanced dataset, Multi-class classification, Machine learning, Decision tree.

Related Ontology Subjects/Areas/Topics: Artificial Intelligence ; Clustering and Classification Methods ; Computational Intelligence ; Evolutionary Computing ; Knowledge Discovery and Information Retrieval ; Knowledge-Based Systems ; Machine Learning ; Soft Computing ; Symbolic Systems

Abstract: Two important challenges in machine learning are the imbalanced class problem and multi-class classification, because many real-world applications have imbalanced class distributions and require classifying data into multiple classes. The primary problem when classifying imbalanced datasets concerns the measurement of performance: standard learning algorithms tend to be biased towards the majority class and to ignore the minority class. This paper presents a new approach (KSAMPLING), which combines k-means clustering with sampling methods. The k-means algorithm is used to split the dataset into two clusters. After that, we combine two types of sampling technique, over-sampling and under-sampling, to re-balance the class distribution. We conducted experiments on five highly imbalanced datasets from the UCI repository, using decision trees as the classifier. The experimental results show that KSAMPLING achieves better prediction performance than state-of-the-art methods in terms of AUC, and that the F-measure is also improved.
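The abstract names the building blocks of KSAMPLING (two-cluster k-means, then combined over- and under-sampling) but not how they are wired together. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes resampling is applied separately inside each cluster and that every class is resampled to the mean class size of its cluster. The names `two_means`, `rebalance`, and `ksampling_sketch` are hypothetical, and the decision-tree training step is left out.

```python
import random
from collections import Counter, defaultdict


def two_means(points, iters=20, seed=0):
    """Plain k-means with k=2; returns a cluster id (0 or 1) for each point."""
    rng = random.Random(seed)
    centers = rng.sample(points, 2)  # two distinct data points as initial centers
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to the nearest center (squared Euclidean distance).
        labels = [
            min((0, 1), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of its members.
        for c in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def rebalance(rows, rng):
    """Resample every class in `rows` to the mean class size (an assumption)."""
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[1]].append(row)
    target = max(1, round(len(rows) / len(by_class)))
    out = []
    for members in by_class.values():
        if len(members) > target:
            # Under-sampling: random subset without replacement.
            out.extend(rng.sample(members, target))
        else:
            # Over-sampling: keep all rows, then duplicate random ones.
            out.extend(members)
            out.extend(rng.choices(members, k=target - len(members)))
    return out


def ksampling_sketch(X, y, seed=0):
    """Cluster into two groups, then re-balance the classes inside each group."""
    rng = random.Random(seed)
    clusters = two_means(X, seed=seed)
    balanced = []
    for c in (0, 1):
        rows = [(x, lab) for x, lab, k in zip(X, y, clusters) if k == c]
        if rows:
            balanced.extend(rebalance(rows, rng))
    return balanced


# Toy data: two well-separated groups, each with a skewed class distribution.
X = [[float(i), 0.0] for i in range(10)] + [[1000.0 + i, 0.0] for i in range(8)]
y = ["maj"] * 8 + ["min"] * 2 + ["maj"] * 6 + ["other"] * 2
balanced = ksampling_sketch(X, y)
print(Counter(label for _, label in balanced))
```

The re-balanced rows would then be passed to a decision-tree learner. Random duplication and random removal are used here only because they are the simplest forms of over- and under-sampling; the abstract does not say which variants the paper uses.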

CC BY-NC-ND 4.0

Paper citation in several formats:
Prachuabsupakij, W. and Soonthornphisaj, N. (2011). MULTI-CLASS DATA CLASSIFICATION FOR IMBALANCED DATA SET USING COMBINED SAMPLING APPROACHES. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2011) ISBN 978-989-8425-79-9, pages 158-163. DOI: 10.5220/0003635201660171

@conference{kdir11,
author={Wanthanee Prachuabsupakij and Nuanwan Soonthornphisaj},
title={MULTI-CLASS DATA CLASSIFICATION FOR IMBALANCED DATA SET USING COMBINED SAMPLING APPROACHES},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2011)},
year={2011},
pages={158-163},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003635201660171},
isbn={978-989-8425-79-9},
}

TY  - CONF
JO  - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2011)
TI  - MULTI-CLASS DATA CLASSIFICATION FOR IMBALANCED DATA SET USING COMBINED SAMPLING APPROACHES
SN  - 978-989-8425-79-9
AU  - Prachuabsupakij, W.
AU  - Soonthornphisaj, N.
PY  - 2011
SP  - 158
EP  - 163
DO  - 10.5220/0003635201660171
ER  - 
