KSAMPLING still has some drawbacks: its accuracy rate can drop when the training set size is small.
ACKNOWLEDGEMENTS
This research is supported by the Faculty of Science, Kasetsart University, and by the National Science and Technology Development Agency under the Ministry of Science and Technology of Thailand.