higher variance but can reduce the bias of the model
(Zhou 2012).
In the context of species distribution modelling,
several studies have investigated the use of
heterogeneous ensembles to enhance model
performance. Kaky et al. (2020) proposed a
heterogeneous ensemble of eight base learners,
combined using weighted voting, to predict the
distribution of some medicinal plants in Egypt. Früh
et al. (2018) first trained four ML models (RF, SVC,
DT, LR) and then constructed 11 heterogeneous
ensembles, combined using soft voting, to predict the
potential distribution of mosquito species in
Germany. These studies (Kaky et al. 2020; Früh et al.
2018), together with (Grenouillet et al. 2011; Samal
et al. 2022; Dong et al. 2020), showed that ensembles
generally outperform single models. However, they
also have certain limitations: (1) they did not cover
all the necessary pre-processing steps, and
inadequate pre-processing of data can lead to biased
results due to overfitting; (2) the evaluation process
used to compare ensembles and single models was
insufficient because appropriate statistical tests
were lacking; and (3) their experimental design was
unclear and did not provide a comprehensive
modelling framework for using heterogeneous
ensembles. Therefore, this study aims to address
these limitations by presenting a comprehensive
modelling framework for using heterogeneous
ensembles and by offering better insight into their
performance compared to single models.
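To make the two combination rules mentioned above
concrete, the following is a minimal sketch of soft
voting and weighted voting for presence/absence
prediction. It is an illustration under stated
assumptions, not the implementation used in the cited
studies: the function name soft_vote, the 0.5
threshold, and the example probabilities and weights
are all hypothetical.

```python
from typing import List, Optional, Sequence


def soft_vote(probas: Sequence[Sequence[float]],
              weights: Optional[Sequence[float]] = None,
              threshold: float = 0.5) -> List[int]:
    """Combine per-model presence probabilities into 0/1 labels.

    probas[m][i] is model m's predicted probability of presence at site i.
    With weights=None every model counts equally (plain soft voting);
    otherwise a weighted mean of the probabilities is used (weighted voting).
    The 0.5 threshold is an assumption made for this sketch.
    """
    n_models = len(probas)
    if weights is None:
        weights = [1.0] * n_models
    if len(weights) != n_models:
        raise ValueError("one weight per model is required")
    total = sum(weights)
    labels = []
    for i in range(len(probas[0])):
        # Weighted mean of the presence probabilities at site i.
        mean_p = sum(w * p[i] for w, p in zip(weights, probas)) / total
        labels.append(1 if mean_p >= threshold else 0)
    return labels


# Made-up probabilities from three hypothetical models at three sites.
example_probas = [
    [0.9, 0.4, 0.2],
    [0.6, 0.7, 0.1],
    [0.3, 0.2, 0.8],
]
print(soft_vote(example_probas))                   # equal weights
print(soft_vote(example_probas, [0.2, 0.2, 0.6]))  # third model favoured
```

With equal weights the first site is labelled as a
presence, whereas favouring the third model shifts
the predicted presence to the third site, which shows
how the choice of weights can change the ensemble
output.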
This paper aims to model the distribution of three
redstart species (P. moussieri, P. ochruros, and
P. phoenicurus) in Morocco using single machine
learning algorithms and heterogeneous ensembles.
Initially, eight ML algorithms (KNN, SVM, MLP, GB,
DT, RF, AB, and QDA) were trained as base learners.
Then, based on the performance of these base
learners, seven heterogeneous ensembles of two to
eight models were constructed for each species
dataset. The study also assesses the effect of the
selection strategy used on the performance of the
ensembles. The proposed approach was evaluated
using five classification metrics (accuracy,
sensitivity, specificity, AUC, and Kappa), the SK test
to compare the performance of the models, and the
Borda Count voting method to rank the
best-performing models across multiple performance
criteria. To this end, the present study poses and
discusses the following research questions:
(RQ1): How effective are the eight machine
learning techniques in modelling the
distribution of the three redstart species?
(RQ2): Do the heterogeneous ensembles
constructed using the three selection strategies
perform significantly better than their
constituent single models?
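The five classification metrics used to answer these
questions can be sketched as follows. This is a
minimal illustration, assuming binary labels (1 for
presence, 0 for absence) and at least one sample of
each class; the function names and the toy data are
hypothetical and are not results from this study.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int],
                           y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity, and Cohen's Kappa for 0/1 labels.

    Assumes both classes occur in y_true; otherwise a division by zero occurs.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    n = tp + tn + fp + fn
    accuracy = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    p_exp = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    return {
        "accuracy": accuracy,
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "kappa": (accuracy - p_exp) / (1 - p_exp),
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (presence, absence) pairs ranked correctly.

    Ties between a presence score and an absence score count as half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy data for illustration only.
toy_true = [1, 1, 1, 1, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 0, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.6]
print(classification_metrics(toy_true, toy_pred))
print(auc(toy_true, toy_scores))
```

The pairwise AUC computation is quadratic in the
number of samples, which is acceptable for a sketch;
a rank-based formulation would be preferable for
large datasets.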
The main contributions of this research are:
1. Assessing the performance of eight ML techniques
(KNN, SVM, MLP, GB, DT, RF, AB, and QDA)
in modelling the distribution of the three redstart
species.
2. Constructing seven heterogeneous ensembles based
on the performance of the base learners.
3. Evaluating whether the heterogeneous ensembles
outperform their constituent single models.
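As a rough illustration of contributions 2 and 3, the
sketch below builds ensembles of increasing size from
ranked base learners and ranks models with a Borda
count. It rests on assumptions that go beyond the text
above: base learners are ranked by a single score and
the k best are kept for each size k, all metrics are
"higher is better", and ties are broken by input
order. The names top_k_ensembles and borda_points, the
model labels, and all scores are hypothetical
placeholders, not values from this study.

```python
from typing import Dict, List


def top_k_ensembles(scores: Dict[str, float],
                    k_min: int = 2) -> Dict[int, List[str]]:
    """Return, for each size k, the k best-scoring base learners.

    With eight base learners and k_min=2 this yields seven ensembles
    (sizes 2 to 8). Selecting the top k is an assumed strategy.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    return {k: ranked[:k] for k in range(k_min, len(ranked) + 1)}


def borda_points(table: Dict[str, Dict[str, float]]) -> Dict[str, int]:
    """Borda count over several metrics (higher metric value is better).

    For each metric, the worst of n models gets 0 points and the best
    gets n - 1; points are summed over metrics. Ties are broken by
    input order, which is a simplification.
    """
    models = list(table)
    metrics = list(next(iter(table.values())))
    points = {m: 0 for m in models}
    for metric in metrics:
        worst_first = sorted(models, key=lambda m: table[m][metric])
        for pts, model in enumerate(worst_first):
            points[model] += pts
    return points


# Placeholder scores for eight hypothetical base learners m1..m8.
placeholder_scores = {"m1": 0.81, "m2": 0.84, "m3": 0.79, "m4": 0.88,
                      "m5": 0.75, "m6": 0.90, "m7": 0.83, "m8": 0.70}
ensembles = top_k_ensembles(placeholder_scores)
print(len(ensembles), ensembles[2], ensembles[3])

# Placeholder metric table for three hypothetical models.
placeholder_table = {
    "A": {"accuracy": 0.9, "kappa": 0.7, "auc": 0.95},
    "B": {"accuracy": 0.8, "kappa": 0.8, "auc": 0.90},
    "C": {"accuracy": 0.7, "kappa": 0.6, "auc": 0.85},
}
points = borda_points(placeholder_table)
print(sorted(points, key=points.get, reverse=True))
```

The Borda count aggregates rankings rather than raw
metric values, so a model that is consistently near
the top on every criterion can be ranked above one
that is best on a single criterion only.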
The rest of this paper is organized as follows.
Section 2 reviews the literature related to this
work. Section 3 presents the materials and methods
used in this study. Section 4 presents and discusses
the results obtained. Section 5 covers the threats to
the validity of this research design. Lastly,
Section 6 outlines the conclusion and future work.
2 RELATED WORKS
This section presents the main findings of studies that
have investigated the use of ensemble learning for
species distribution modelling. Hao et al. (2019)
conducted a structured literature review to examine
the performance and application of species
distribution modelling ensembles built with the
BIOMOD platform. The review found that: (1) on
average, six individual models were employed in
ensembles, with GLMs, BRTs, RFs, and GAMs being
the most frequently used and BIOCLIM the least
frequently used; MaxEnt, a widely used algorithm in
SDM, was not integrated into BIOMOD until 2012;
and (2) regarding combination methods, the most
frequently used was the weighted mean, employed in
113 studies (50.4%), followed by the unweighted
mean with 58 studies (25.8%), committee averaging
with 20 studies (9%), and other methods such as
PCA, median, and mode accounting for 12.9%.
Among more specific studies, (Hosni et al. 2019)
aimed to analyze the effects of geographical and
environmental ranges on the performance of SDMs.
They trained three statistical models (GLM, GAM,
MARS) and five machine learning algorithms (ANN,
RF, ABT, FDA, CTA) to model the distribution of 35
fish species at 1110 stream sections in France.
Thereafter, they built a heterogeneous ensemble by
averaging the predictions of these 8