Distributed Optimization of Classifier Committee Hyperparameters
Sanzhar Aubakirov, Paulo Trigo, Darhan Ahmed-Zaki
2018
Abstract
In this paper, we propose an optimization workflow to predict classifier accuracy by exploring the space composed of different data features and the configurations of the classification algorithms. The overall process is described in the context of the text classification problem. We take three main features that affect text classification and therefore the accuracy of classifiers. The first feature considers the words that comprise the input text; here we use the N-gram concept with different N values. The second feature considers the adoption of textual pre-processing steps such as stop-word filtering and stemming. The third feature considers the hyperparameters of the classification algorithms. We take the well-known classifiers K-Nearest Neighbors (KNN) and Naive Bayes (NB), where K (from KNN) and the a priori probabilities (from NB) are hyperparameters that influence accuracy. As a result, we explore the feature space (the correlation among textual and classifier aspects) and present an approximation model that is able to predict classifier accuracy.
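As a rough illustration of the search space the abstract describes, the following minimal sketch enumerates combinations of N-gram size, pre-processing switches, and classifier hyperparameters. The value ranges, the names `word_ngrams` and `configurations`, and the two prior labels are assumptions made for illustration only; the abstract does not specify them, and this is not the authors' implementation.

```python
from itertools import product


def word_ngrams(text, n):
    """Return the word-level n-grams of `text` as space-joined strings."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical value ranges: the abstract names the dimensions, not the values.
N_VALUES = (1, 2, 3)                 # N-gram sizes
STOPWORD_FILTERING = (False, True)   # pre-processing switch
STEMMING = (False, True)             # pre-processing switch
KNN_K = (1, 3, 5)                    # K for K-Nearest Neighbors
NB_PRIORS = ("uniform", "empirical") # placeholder labels for NB a priori probabilities


def configurations():
    """Yield one dict per point in the (text features x classifier settings) space."""
    for n, stop, stem in product(N_VALUES, STOPWORD_FILTERING, STEMMING):
        text_part = {"ngram_n": n, "stopwords": stop, "stemming": stem}
        for k in KNN_K:
            yield {"classifier": "KNN", "k": k, **text_part}
        for prior in NB_PRIORS:
            yield {"classifier": "NB", "prior": prior, **text_part}


configs = list(configurations())
bigrams = word_ngrams("the quick brown fox", 2)
```

Each configuration would then be evaluated (for example by cross-validated accuracy), and the resulting (configuration, accuracy) pairs could serve as training data for an approximation model of the kind the abstract mentions.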
Paper Citation
in Harvard Style
Aubakirov S., Trigo P. and Ahmed-Zaki D. (2018). Distributed Optimization of Classifier Committee Hyperparameters. In Proceedings of the 7th International Conference on Data Science, Technology and Applications - Volume 1: DATA, ISBN 978-989-758-318-6, pages 171-179. DOI: 10.5220/0006884101710179
in Bibtex Style
@conference{data18,
author={Sanzhar Aubakirov and Paulo Trigo and Darhan Ahmed-Zaki},
title={Distributed Optimization of Classifier Committee Hyperparameters},
booktitle={Proceedings of the 7th International Conference on Data Science, Technology and Applications - Volume 1: DATA},
year={2018},
pages={171-179},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006884101710179},
isbn={978-989-758-318-6},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 7th International Conference on Data Science, Technology and Applications - Volume 1: DATA
TI - Distributed Optimization of Classifier Committee Hyperparameters
SN - 978-989-758-318-6
AU - Aubakirov S.
AU - Trigo P.
AU - Ahmed-Zaki D.
PY - 2018
SP - 171
EP - 179
DO - 10.5220/0006884101710179