Impact of Thresholds of Univariate Filters for Predicting Species Distribution

Yousra Cherif, Ali Idri, Omar El Alaoui

2023

Abstract

Researchers rely on species distribution models (SDMs) to establish a correlation between species occurrence records and environmental data. These models offer insights into the ecological and evolutionary aspects of the species studied. Feature selection (FS) aims to retain useful, interrelated features and remove unnecessary or redundant ones, which reduces model cost and storage needs and makes the induced model easier to understand. To predict the distribution of three bird species, this study compares five filter-based univariate feature selection methods, each applied at five thresholds, in combination with four classifiers: Support Vector Machine (SVM), Light Gradient-Boosting Machine (LGBM), Decision Tree (DT), and Random Forest (RF). The empirical evaluation relies on several techniques, including 5-fold cross-validation, the Scott-Knott (SK) test, and the Borda count, and uses three performance criteria: accuracy, kappa, and F1-score. Experiments showed that the 40% and 50% thresholds were the best choices for the classifiers, with RF outperforming LGBM, DT, and SVM. Finally, the best combination for each classifier is as follows: RF and LGBM using mutual information with a 40% threshold, DT using ReliefF with a 50% threshold, and SVM using the ANOVA F-value with a 40% threshold.
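
As an illustration of what a thresholded univariate filter does, the following minimal sketch scores each feature independently with a one-way ANOVA F-value and keeps the top 40% of features. It is not the authors' implementation: the toy data, the feature names, the helper names `anova_f` and `select_top_fraction`, and the choice to round the number of kept features up are all assumptions made for this example.

```python
import math
from statistics import mean


def anova_f(values, labels):
    """One-way ANOVA F-statistic of a single feature across class labels."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two classes and more samples than classes")
    grand = mean(values)
    # Variation explained by class membership vs. variation left inside classes.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if ss_within == 0:
        return math.inf if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def select_top_fraction(scores, fraction):
    """Return the names of the highest-scoring features, keeping `fraction` of them.

    Assumption: the count is rounded up and at least one feature is kept.
    """
    n_keep = max(1, math.ceil(fraction * len(scores)))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n_keep]


# Toy presence (1) / absence (0) data, made up for illustration only.
labels = [1, 1, 1, 0, 0, 0]
features = {
    "temperature": [20.1, 21.3, 19.8, 10.2, 11.0, 9.5],
    "rainfall": [100, 300, 200, 150, 250, 210],
    "elevation": [500, 520, 480, 900, 950, 880],
    "slope": [3, 7, 5, 4, 6, 5],
    "ndvi": [0.61, 0.40, 0.55, 0.52, 0.45, 0.58],
}

scores = {name: anova_f(vals, labels) for name, vals in features.items()}
selected = select_top_fraction(scores, 0.40)  # 40% threshold: 2 of 5 features
print(selected)
```

Because each feature is scored on its own, the filter is independent of the classifier; the reduced feature set would then be passed to a classifier such as RF or SVM and evaluated with cross-validation.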

Paper Citation


in Harvard Style

Cherif Y., Idri A. and El Alaoui O. (2023). Impact of Thresholds of Univariate Filters for Predicting Species Distribution. In Proceedings of the 15th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR; ISBN 978-989-758-671-2, SciTePress, pages 86-97. DOI: 10.5220/0012203000003598


in Bibtex Style

@conference{kdir23,
author={Yousra Cherif and Ali Idri and Omar El Alaoui},
title={Impact of Thresholds of Univariate Filters for Predicting Species Distribution},
booktitle={Proceedings of the 15th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR},
year={2023},
pages={86-97},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012203000003598},
isbn={978-989-758-671-2},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 15th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR
TI - Impact of Thresholds of Univariate Filters for Predicting Species Distribution
SN - 978-989-758-671-2
AU - Cherif Y.
AU - Idri A.
AU - El Alaoui O.
PY - 2023
SP - 86
EP - 97
DO - 10.5220/0012203000003598
PB - SciTePress