Sentinel 2 High-Resolution Land Cover Mapping in Sub-Saharan
Africa with Google Earth Engine
Elena Belcore (a) and Marco Piras (b)
DIATI, Politecnico di Torino, Corso Duca degli Abruzzi 24, 10129 Torino, Italy
Keywords: Land Cover, Machine Learning, Sentinel-2, Google Earth Engine, Sub-Saharan, Natural Hazard, Climate
Change.
Abstract: This work aims to develop an efficient methodology for producing land cover maps of sub-Saharan areas
with high spatial and thematic resolution, based on Sentinel-2 data. Land cover (LC) mapping in these areas is complicated by their land
morphology, climatic conditions and the homogeneity of surface spectral responses. Two pixel-based supervised
classification approaches are compared in Google Earth Engine. The aggregated method classifies each image separately
and then aggregates the results at pixel level on a frequency basis. The stacked method classifies all the images
together in a single stacked dataset. Additionally, the influence of linear atmospheric correction models on
the overall accuracy (OA) is assessed, and the best-performing approach is compared to existing
LC maps of the area. Sixteen Sentinel-2 images (Level-1C) acquired between 2017 and 2019 were atmospherically and
topographically corrected and classified into nine classes. The results show similar performances for the
analysed approaches, with a slightly higher OA for the aggregated classification (0.97). The atmospheric
correction has little impact on the results.
1 INTRODUCTION
Sub-Saharan Africa is exceptionally vulnerable to
Climate Change-induced phenomena, such as floods,
erosion, and droughts, which have dramatically
increased in the past years. The Dosso region in
southwest Niger is no exception (Bigi et al., 2018;
Oguntunde et al., 2018; Teodoro and Duarte, 2022).
In 2021, 200,000 people were affected by floods in
Niger. One of the most recent disastrous events
occurred in October 2022, when flooding caused by
heavy rains claimed nearly 200 lives and affected
more than a quarter of a million people.
In such scenarios, planning against natural disasters requires
continuous and detailed monitoring of the area
(Tiepolo et al., 2018), and updated
information regarding the Land Cover (LC) is crucial
to land management, as these maps provide users with
information related to terrestrial ecosystems and
livelihoods (Li et al., 2020). Classifying Sub-Saharan
areas is particularly challenging because of
the landscape complexity and the low spectral
variability within the covers. Moreover, the sand and dust
particulates in the atmosphere may alter the spectral
response of the Earth's surface and further increase
the difficulty of the classification. One of the most
significant problems in the optical remote sensing of
Sub-Saharan regions is that reflectance from soil and
rock during the dry season is often much greater than
that of the sparse vegetation, making the vegetation difficult to
separate. Specific problems
in the remote sensing of arid vegetation
include multiple scattering of light (nonlinear mixing)
between vegetation and soil (Huete, 1988). Moreover,
the local architecture, such as Nigerienne
architecture, consists of small houses with flat roofs
built from unplastered local clays, which makes
built-up areas and bare soils hard to separate spectrally,
even in very high resolution (VHR) imagery
(Belcore et al., 2022) (Figure 1).
(a) https://orcid.org/0000-0002-3592-9384
(b) https://orcid.org/0000-0001-8000-2388
Similarly, in some villages and suburbs, the roads
are unpaved. As a consequence, the spectral response
of buildings is nearly the same as that of roads and bare soil. The
strong seasonality adds further complexity to the
classification, as the frequent cloud cover during the
rainy season and the presence of sand in the air may alter the
spectral values of the sensed data.
Belcore, E. and Piras, M.
Sentinel 2 High-Resolution Land Cover Mapping in Sub-Saharan Africa with Google Earth Engine.
DOI: 10.5220/0011746500003473
In Proceedings of the 9th International Conference on Geographical Information Systems Theory, Applications and Management (GISTAM 2023), pages 27-36
ISBN: 978-989-758-649-1; ISSN: 2184-500X
Copyright © 2023 by SCITEPRESS – Science and Technology Publications, Lda. Under CC license (CC BY-NC-ND 4.0)
These reasons contribute to the LC data scarcity
in Sub-Saharan Africa. Indeed, to date, few very high-
resolution LC maps of sub-Saharan areas exist.
Examples are the Africa LC by ESA and FROM-GLC10
(Li et al., 2020). Although these LC maps are
extremely complex to produce, most do not provide
fine thematic detail. Moreover, their
training datasets are hard to update because of the large
amount of time and manual work this task requires.
Figure 1: Aerial view of local architecture acquired by a
drone system in 2021.
Niger's southern territories are the study area of
the ANADIA 2.0 project (Anadia 2.0, 2023). The project aims
to create an Early Warning System to face climate
change effects in the Sirba River Basin in the Tillabery
region, enhance local technicians' knowledge
of flood forecasting, and create an adaptation
strategy plan for two villages along the Sirba River.
In this area, climate planning and the
development of local adaptation strategies to climate
change are essential (Tiepolo et
al., 2018). Despite this undeniable need for climate
planning, there is no appropriate risk mapping of the
area; indeed, subnational risk mapping lacks detail
(Tiepolo et al., 2018). The data gathered and the
information provided by this work are directly
involved in ANADIA 2.0 by feeding the adaptation
strategy plan and investigating the cause-effect
relation of floods.
Aiming to map and monitor the areas potentially
flooded by the Sirba River, a workflow to produce an
updated, highly detailed LC map of the site is proposed
and validated. In this application, the LC map of
southwest Niger was produced using nine classes
and 16 features. The entire process was completed in
the Google Earth Engine (GEE) platform, and two
multi-temporal approaches for classification were
tested and compared.
2 MATERIALS AND METHODS
2.1 Study Area
The Sirba River is a tributary of the Niger River and
crosses Burkina Faso and Niger (Figure 2).
Its basin is prone to floods, and villages along the
river are vulnerable to loss of life and economic losses
(Massazza et al., 2018, 2019; Tamagnone et al.,
2019).
Figure 2: Footprint of the classification (red) and detail of
the Nigerienne branch of the Sirba River.
2.2 Identification of Classes
Nine classes describe the classification system, as
Table 1 illustrates.
2.3 Satellite Imagery Filtering and
Pre-Processing
The images acquired by Sentinel-2 (both the Sentinel-2A
and Sentinel-2B satellites) were filtered by location
and date. The study area includes the segment of the
Sirba River that lies in Niger, about 100
km long. The period covers all the acquisitions between
2017 and 2019. An additional filtering parameter
is the cloud cover percentage, which must be
less than 10% over a single scene. Only images
sensed during the rainy season (from August to
October) were selected to maximise the classes'
spectral variability, especially to better distinguish
between the vegetation classes and the bare soils and
to identify water. The Sentinel-2 Level-1C dataset was used
for this classification since only one image from the
atmospherically corrected Sentinel-2 Level-2A dataset satisfied the
filter parameters mentioned above. Sixteen Sentinel-2
Level-1C images were selected (Table 2). The
dataset was atmospherically corrected by applying
Dark Object Subtraction (DOS) (Chavez, 1988),
which is a linear atmospheric correction model that
performs similarly to radiative transfer models on
homogeneous surfaces (Lantzanakis et al., 2017).
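As an illustration of why DOS is a linear correction, the following minimal sketch subtracts one constant per band from a synthetic image. It is not the GEE implementation used in this work: the function name, the use of a low percentile as the dark-object estimate, and the synthetic reflectance values are assumptions made only for this example.

```python
import numpy as np

def dark_object_subtraction(image, percentile=1.0):
    """Subtract a per-band dark-object value from a (bands, rows, cols) array.

    The dark-object value is estimated here as a low percentile of each band
    (an assumption of this sketch); negative results are clipped to zero.
    """
    image = np.asarray(image, dtype=float)
    # One dark-object estimate per band, kept broadcastable over rows and cols.
    dark = np.percentile(image, percentile, axis=(1, 2), keepdims=True)
    return np.clip(image - dark, 0.0, None)

# Synthetic top-of-atmosphere reflectance for three bands (hypothetical values).
rng = np.random.default_rng(0)
toa = rng.uniform(0.05, 0.40, size=(3, 50, 50))
corrected = dark_object_subtraction(toa)
```

Because every pixel of a band is shifted by the same offset, the relative ordering of pixel values within a band is preserved, which may help explain why a tree-based classifier is only mildly affected by the correction.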
Table 1: Classes of the classification in South Niger.
N | Class | Description
1 | Water | Internal waters.
2 | Plateaux | Elevated areas over the dry savannah. They influence the water catchment and the erosion process and present peculiar plant species.
3 | Riparian vegetation | The thickly vegetated area along the rivers. It is usually composed of trees and bushes.
4 | Urban areas | Villages and main roads.
5 | Red bare soils | Red soils rich in ferric oxides that characterise the savannah soil landscape.
6 | Sandy bare soils | Natural sand deposits.
7 | Vegetation of the plateaux | Vegetation on the plateaux. It grows along the drainage canals and is mainly composed of herbaceous species.
8 | Irrigated agricultural lands | Areas of intense agricultural activity that require tillage and are generally irrigated through channel systems.
9 | Non-irrigated agricultural lands and pastures | Areas of moderate agricultural activity that require tillage, or pastures.
Table 2: List of Sentinel-2 images used in the classification.
Year | No. | Sentinel Image Identification Code
2017 | 0 | 20170815T102021_20170815T102513_T31PCR
2017 | 1 | 20170924T102021_20170924T102649_T31PCR
2017 | 2 | 20170926T101009_20170926T102049_T31PCR
2018 | 3 | 20180815T102019_20180815T102918_T31PCR
2018 | 4 | 20180820T102021_20180820T103538_T31PCR
2018 | 5 | 20180911T101019_20180911T101438_T31PCR
2018 | 6 | 20180911T101019_20180911T102702_T31PCR
2018 | 7 | 20180916T101021_20180916T101512_T31PCR
2018 | 8 | 20180921T101019_20180921T101647_T31PCR
2018 | 9 | 20180924T102019_20180924T102602_T31PCR
2018 | 10 | 20180929T102021_20180929T103112_T31PCR
2019 | 11 | 20190812T101031_20190812T102016_T31PCR
2019 | 12 | 20190911T101021_20190911T102116_T31PCR
2019 | 13 | 20190921T101031_20190921T102426_T31PCR
2019 | 14 | 20190926T101029_20190926T102551_T31PCR
2019 | 15 | 20190929T102029_20190929T102700_T31PCR
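The date filter can be checked directly against the identification codes of Table 2. The sketch below assumes that the first timestamp of each code is the sensing time, which is the usual convention for Sentinel-2 identifiers in GEE; the helper name `sensing_date` is hypothetical, and the cloud cover filter is not reproduced because the codes do not carry that information.

```python
from datetime import datetime

# Identification codes as listed in Table 2.
IMAGE_IDS = [
    "20170815T102021_20170815T102513_T31PCR",
    "20170924T102021_20170924T102649_T31PCR",
    "20170926T101009_20170926T102049_T31PCR",
    "20180815T102019_20180815T102918_T31PCR",
    "20180820T102021_20180820T103538_T31PCR",
    "20180911T101019_20180911T101438_T31PCR",
    "20180911T101019_20180911T102702_T31PCR",
    "20180916T101021_20180916T101512_T31PCR",
    "20180921T101019_20180921T101647_T31PCR",
    "20180924T102019_20180924T102602_T31PCR",
    "20180929T102021_20180929T103112_T31PCR",
    "20190812T101031_20190812T102016_T31PCR",
    "20190911T101021_20190911T102116_T31PCR",
    "20190921T101031_20190921T102426_T31PCR",
    "20190926T101029_20190926T102551_T31PCR",
    "20190929T102029_20190929T102700_T31PCR",
]

def sensing_date(image_id):
    """Parse the first timestamp of the code (assumed to be the sensing time)."""
    return datetime.strptime(image_id.split("_")[0], "%Y%m%dT%H%M%S")

dates = [sensing_date(image_id) for image_id in IMAGE_IDS]

# Rainy-season filter used in this work: August to October.
in_rainy_season = [8 <= d.month <= 10 for d in dates]

# Number of selected images per year.
per_year = {}
for d in dates:
    per_year[d.year] = per_year.get(d.year, 0) + 1
```

Under this reading, all 16 images fall in August or September, and more than half of them were acquired in 2018.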
Knowing that DOS can affect the classification's
results differently depending on the geographical area
(and land cover), the classification model was trained
and applied over the DOS-corrected and non-corrected
datasets. Then the Overall Accuracy (OA) values
of the classifications were compared to check the
influence of DOS on the final result.
The topographic correction of the images was
initially applied to reduce the effects of elevation over
the plateaux areas (Dorren et al., 2003). The
topographic correction compensates for the variation in the
reflectance caused by the terrain's inclination and
the sun elevation (Poortinga et al., 2019; Shepherd
and Dymond, 2010). Nevertheless, the correction
introduced noise in the dataset because the plateaux
slopes do not interfere with the soil's spectral
response, and the Digital Terrain Model (DTM) applied,
USGD 30m DTM, is neither resolved nor precise enough.
Thus, the dataset was not topographically corrected.
Figure 3: (a) RGB mosaic of Sentinel-2 Level-1C data, DOS
applied; (b) RGB mosaic of Sentinel-2 Level-1C data, DOS applied
and topographically corrected. The correction excessively
alters the data over the plateaux and the flat areas.
2.4 Features Extraction and Selection
The feature extraction consisted of the computation
of 6 spectral features, 4 histogram-based features, 18
textural features (computed on Sentinel-2 band 8A), 2 elevation-derived
features, and 1 edge-detector feature, added to
the 12 spectral bands. Specifically, the Gray Level
Co-occurrence Matrix (GLCM) texture metrics were
calculated over a 9x9 neighbourhood (Haralick et al.,
1973), while the histogram-based features were computed with a
3x3 filter (Conners et al., 1984; GEE, 2022) to help
discriminate the classes (Drzewiecki et al., 2013;
Kukawska et al., 2017). Table 3 lists the extracted
features.
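To make the GLCM metrics concrete, the sketch below builds a co-occurrence matrix for a single 9x9 patch and derives four of the Haralick measures. It is a simplified stand-in, not the GEE texture function: it uses one horizontal offset only, two grey levels, and hypothetical function names, whereas an operational implementation would typically combine several directions.

```python
import numpy as np

def glcm(patch, levels, dr=0, dc=1):
    """Symmetric, normalised grey-level co-occurrence matrix for one offset."""
    patch = np.asarray(patch)
    rows, cols = patch.shape
    counts = np.zeros((levels, levels), dtype=float)
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[patch[r, c], patch[r2, c2]] += 1
    counts += counts.T  # count each pixel pair in both directions
    return counts / counts.sum()

def texture_metrics(p):
    """Angular second moment, entropy, inertia and inverse difference moment."""
    i, j = np.indices(p.shape)
    nonzero = p[p > 0]
    return {
        "asm": float((p ** 2).sum()),
        "entropy": float(-(nonzero * np.log(nonzero)).sum()),
        "inertia": float(((i - j) ** 2 * p).sum()),
        "idm": float((p / (1.0 + (i - j) ** 2)).sum()),
    }

# Two synthetic 9x9 patches: a homogeneous one and a checkerboard.
uniform = np.zeros((9, 9), dtype=int)
checker = np.indices((9, 9)).sum(axis=0) % 2

m_uniform = texture_metrics(glcm(uniform, levels=2))
m_checker = texture_metrics(glcm(checker, levels=2))
```

The homogeneous patch yields the maximum angular second moment and zero entropy, while the checkerboard yields higher inertia and entropy, which is the kind of contrast the textural features are meant to capture between spectrally similar covers.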
Table 3: Derivative features calculated for Sentinel-2.
Feature | Formula/note
Chlorophyll Index RedEdge, CRE | (B9/B5) - 1
Enhanced Vegetation Index, EVI | 2.5*((B9 - B5)/((B9 + 6*B5 - 7.5*B1) + 1))
HUE | arctan(((2*B5 - B3 - B1)/30.5)*(B3 - B1))
Soil Composition Index, SCI | (B11 - B8)/(B11 + B8)
Wetness Index, WET | (0.1509*B2)+(0.1973*B3)+(0.3279*B4)+(0.03406*B8)-(0.7112*B11)-(0.4572*B12)
Triangular Vegetation Index, TVI | 0.5*(120*(B8-B3))-(200*(B4-B3))
Sob | Sobel edge extractor
Var | Variance
Mean | Mean
Skew | Skewness
Kurt | Kurtosis
Entr | Entropy
Asm | Angular Second Moment; measures the number of repeated pairs
Corr | Correlation; measures the correlation between pairs of pixels
Var | Variance; measures how spread out the distribution of gray-levels is
Idm | Inverse Difference Moment; measures the homogeneity
Savg | Sum Average
Svar | Sum Variance
Sent | Sum Entropy
Ent | Entropy; measures the randomness of a gray-level distribution
Dvar | Difference variance
Dent | Difference entropy
Imcorr1 | Information Measure of Corr. 1
Imcorr2 | Information Measure of Corr. 2
Maxcorr | Max Corr. Coefficient
Diss | Dissimilarity
Inertia | Inertia
Shade | Cluster Shade
Prom | Cluster prominence
DSM | Digital Surface Model (NASA SRTM 30m)
Height model, HM | DSM-DTM (Global Multi-resolution Terrain Elevation 2010)
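Two of the spectral features of Table 3 can be sketched as simple band arithmetic. The example assumes that the SCI entry is a normalised difference, (B11 - B8)/(B11 + B8), and uses the WET coefficients exactly as printed in the table; the function names and the reflectance values of the 2x2 patch are hypothetical.

```python
import numpy as np

def soil_composition_index(b11, b8):
    """Normalised difference of bands B11 and B8 (assumed reading of the SCI entry)."""
    return (b11 - b8) / (b11 + b8)

def wetness_index(b2, b3, b4, b8, b11, b12):
    """Weighted band sum with the coefficients printed in Table 3."""
    return (0.1509 * b2 + 0.1973 * b3 + 0.3279 * b4
            + 0.03406 * b8 - 0.7112 * b11 - 0.4572 * b12)

# Hypothetical reflectance values for a 2x2 patch (not taken from the paper).
b2 = np.full((2, 2), 0.10)
b3 = np.full((2, 2), 0.10)
b4 = np.full((2, 2), 0.10)
b8 = np.full((2, 2), 0.10)
b11 = np.full((2, 2), 0.30)
b12 = np.full((2, 2), 0.20)

sci = soil_composition_index(b11, b8)
wet = wetness_index(b2, b3, b4, b8, b11, b12)
```

Each function works element-wise, so the same expressions apply unchanged to full image bands stored as arrays.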
The feature selection phase is fundamental to
reducing the computational time of the classification
without losing accuracy (Belcore et al., 2020).
In mid-2020, the function SmileRandomForest
was introduced in the GEE coding platform. Unlike
its predecessor, the RandomForest function, it allows the
computation of the layer importance, which is based
on the GINI impurity criterion. The Random Forest
algorithm creates multiple decision trees (i.e. forest)
using bootstrapped data samples and selects the final
output based on the majority vote of the individual
trees (Breiman, 2001). The Gini impurity criterion
measures the purity or randomness of a set of items
(Breiman, 2001). It determines the quality of a split
between classes in a decision tree. The goal is to split
the data in a way that results in the purest possible
subset of the classes or values. The Gini impurity of
a set is the probability that a random observation from the set
would be incorrectly classified if it were labelled at random
according to the class distribution of the set. A split with a
lower Gini impurity is considered a better split, as it
results in purer subsets of the classes or values.
The feature that results in the lowest Gini impurity is
selected as the splitting feature, and the process is
repeated until a stopping criterion is reached. The
GINI importance is calculated for each variable of the classifier.
The variables with a high GINI gain (i.e. those that reduce the
impurity the most) are the most important.
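The Gini criterion can be illustrated with a small worked example. The sketch below uses the standard definition, one minus the sum of squared class proportions, and compares a pure split with an uninformative one; the class labels and function names are hypothetical and serve only the illustration.

```python
from collections import Counter

def gini_impurity(labels):
    """Gini impurity of a set: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def split_impurity(left, right):
    """Size-weighted Gini impurity of the two subsets produced by a split."""
    n = len(left) + len(right)
    return (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)

# Hypothetical parent node with two equally represented classes.
parent = ["water"] * 4 + ["soil"] * 4

# A split that separates the classes perfectly, and one that does not help.
pure_split = split_impurity(["water"] * 4, ["soil"] * 4)
mixed_split = split_impurity(["water", "water", "soil", "soil"],
                             ["water", "water", "soil", "soil"])

# Impurity decrease (gain) obtained by each split.
gain_pure = gini_impurity(parent) - pure_split
gain_mixed = gini_impurity(parent) - mixed_split
```

A feature whose splits repeatedly produce large impurity decreases across the trees accumulates a high importance, which is the quantity thresholded in the feature selection.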
The features with a GINI gain below 50 were
removed from the input dataset, following an
iterative comparison of 5 scenarios (Table 5). The
parameters considered for the best scenario
evaluation are the out-of-bag error (oob) and the
overall accuracy (OA). The threshold value was
selected as the one yielding the maximum
accuracy. Finally, the features were scaled
according to their minimum-maximum values.
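The two steps of this paragraph, importance thresholding and min-max scaling, can be sketched as follows. The importance values and sample matrix are invented for the illustration, the function names are hypothetical, and keeping features exactly at the threshold is an assumption of this sketch.

```python
import numpy as np

def select_features(importance, threshold):
    """Names of the features whose importance is not below the threshold."""
    return [name for name, value in importance.items() if value >= threshold]

def min_max_scale(samples):
    """Scale each column of a (samples, features) array to the range [0, 1]."""
    samples = np.asarray(samples, dtype=float)
    lo = samples.min(axis=0)
    hi = samples.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero on constant columns
    return (samples - lo) / span

# Hypothetical importance values (not the ones shown in Figure 5).
importance = {"B8A": 120.0, "EVI": 75.0, "Sob": 50.0, "Kurt": 12.0}
kept = select_features(importance, threshold=50)

# Hypothetical samples: three observations of two retained features.
scaled = min_max_scale([[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]])
```

Scaling after selection keeps the minimum and maximum values tied to the features that actually enter the classifier.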
2.5 Classification and Multi-Temporal
Strategies Comparison
The training dataset comprises 2500 points: 300
points for each class except for the urban areas class,
which has 100 points. The training set was
deliberately unbalanced because urban areas
cover the smallest portion of the study area and the
Random Forest (RF) classifier tends to favour the
classes that are more represented in training; therefore, fewer samples of
the urban areas were used to train the classifier. The
validation dataset is composed of 1800 points, 200 for
each class. The training and validation datasets were
manually created using a 2017 map as a ground
reference.
Two different multi-temporal approaches were
compared: aggregated multi-temporal and stacked
multi-temporal methods. In the aggregated multi-
temporal, each image was separately classified using
the machine learning algorithm Random Forest (RF)
with 100 Rifle decision trees per class and 2 as the
minimum size for terminal nodes.
The same training dataset was used for each
image. The results are 16 classifications that were
aggregated according to the modal value. Only the
more accurate classifications (those with an OA equal to or greater than
the OA modal value) were used in the final
aggregation.
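The aggregation step can be sketched as a per-pixel majority vote over the retained classification maps. The example is a toy version with three 2x2 maps: the OA values, the `min_oa` parameter and the tie-breaking rule (lowest class number wins) are assumptions of this sketch, not choices documented for the GEE workflow.

```python
import numpy as np

def modal_aggregate(class_maps, n_classes):
    """Per-pixel most frequent class over a stack of maps (ties go to the lowest class)."""
    stack = np.asarray(class_maps)
    # votes[k - 1] counts, for every pixel, how many maps assign class k.
    votes = np.stack([(stack == k).sum(axis=0) for k in range(1, n_classes + 1)])
    return votes.argmax(axis=0) + 1

def aggregate_accurate_maps(class_maps, oa_values, min_oa, n_classes):
    """Modal aggregation restricted to the maps whose OA is not below min_oa."""
    kept = [m for m, oa in zip(class_maps, oa_values) if oa >= min_oa]
    return modal_aggregate(kept, n_classes)

# Three hypothetical single-date classifications of a 2x2 area (classes 1 to 4).
maps = [
    [[1, 2], [3, 3]],
    [[1, 2], [3, 4]],
    [[2, 2], [4, 4]],
]
oa = [0.97, 0.95, 0.85]  # hypothetical overall accuracies

all_maps = modal_aggregate(maps, n_classes=4)
best_maps = aggregate_accurate_maps(maps, oa, min_oa=0.94, n_classes=4)
```

In the toy example, dropping the least accurate map changes the label of one pixel, which shows how the accuracy filter can influence the aggregated result.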
In contrast, the stacked multi-temporal approach
consisted of one classification over a dataset
combining all the features from the different epochs.
In this case, the images were stacked
together and classified with the RF algorithm (200 Rifle
decision trees per class and 4 as the minimum size for
terminal nodes). Due to GEE's limited available
memory, only 2018 and 2019 data were considered.
Figure 4 shows the classification workflow.
Figure 4: The workflow of the classification. Red arrows
indicate the DOS processing, while the blue ones indicate
the processing without DOS correction.
2.6 Accuracy Assessment
The classifications' accuracy was assessed based on the
error matrix-derived measures: the overall accuracy,
the producer's accuracy, the user's accuracy, and the
F1 score (Congalton, 1991).
2.7 Comparison to Existing LC
Classifications
Today there is no official LC and land use product that
can be considered a shared and trusted reference.
Despite the large availability of satellite source data,
no high-resolution harmonised LC product exists. In
2017 ESA created a land cover classification map of
Africa at 20m resolution using 180000 Copernicus
Sentinel-2 images captured between December 2015
and December 2016 (ESA, 2016). The map is still a
prototype, and only eleven classes are described (i.e.
Trees, Shrubs, Grasslands, Croplands, Aquatic
Vegetation, Sparse Vegetation, Bare Areas, Built-Up,
Snow, and Open Waters). The lack of thematic detail
is compensated by the 20m spatial resolution, which
makes it unique among the LC data of Niger. Although it
is still a partially validated prototype, it was used as
a reference for the validation of the LC map generated
in this work with the stacked approach. The classes of
the two LC systems are not directly comparable; thus, the
translation required the creation of a shared target
classification for the ESA Africa LC and the LC of the Sirba
area, as Table 4 shows.
Table 4: Conversion classes between ESA Africa LC and
here developed LC map.
Common classes | ESA Africa LC | Present LC
1 - Vegetation | 1 - Trees, 2 - Shrubs, 6 - Sparse vegetation | 3 - Forest and bushes, 8 - Plateaux vegetation
2 - Grassland | 3 - Grasslands | 10 - Non-irrigated agricultural lands and pastures
3 - Cropland | 4 - Cropland | 9 - Irrigated agricultural lands
4 - Bare areas | 7 - Bare areas | 5 - Red bare soils, 6 - Sandy soils, 2 - Plateaux
5 - Built-up | 8 - Built-up | 4 - Urban areas
6 - Waters | 5 - Aquatic vegetation, 10 - Open waters | 1 - Water bodies
3 RESULTS
3.1 Feature Extraction and Selection
Figure 5 shows the results of the GINI importance
analysis. The bands per image (initially 44) were
reduced to 16 according to the maximum achievable
accuracy (Table 5), computed by considering five
scenarios with reduced input features. Scenario 3
yielded the best OA (0.85) and a low out-of-bag error
(0.070).
Table 5: Tests run over five scenarios that differ in the
number of input features in the classification selected
according to their importance value (see Figure 5).
Scenario | GINI threshold | oob | OA
Scenario 1 | none | 0.081 | 0.846
Scenario 2 | >40 | 0.069 | 0.846
Scenario 3 | >50 | 0.070 | 0.854
Scenario 4 | >60 | 0.084 | 0.849
Scenario 5 | >80 | 0.081 | 0.842
3.2 Classification and Multi-Temporal
Strategies Comparison
The aggregated multi-temporal classification was
performed separately on 16 images with 16 features
each. Table 6 provides the OA values calculated
for each classification. Classifications with an OA lower
than 0.94 were not used for the modal aggregation.
The stacked multi-temporal classification was
performed using the features from the
different epochs together as the input dataset.
The period considered was 2018 and 2019. The
stacked multi-temporal classification had 76 input
features.
Figure 5: GINI importance (y) of the extracted features (x).
The GINI importance was computed for the 76
bands to optimise the process further. Four scenarios
for feature reduction were considered (Table 7). Still,
the best results in terms of OA are provided by
scenario number 1, which does not remove any
feature from the classification.
Table 6: OA achieved on each classification. The classifications
marked with * (OA below 0.94) were excluded from the aggregation.
Classification no. | OA
1 | 0.97
2 | 0.95
3 | 0.96
4 | 0.85 *
5 | 0.94
6 | 0.83 *
7 | 0.85 *
8 | 0.92 *
9 | 0.95
10 | 0.95
11 | 0.92 *
12 | 0.95
13 | 0.95
14 | 0.94
15 | 0.95
16 | 0.96
Table 7: Tests run over four scenarios that differ in the number
of input features in the classification selected according to
their importance value. The parameters considered for the
best scenario evaluation are the out-of-bag error (oob) and
the overall accuracy (OA).
Scenario | GINI threshold | oob | OA
Scenario 1 | none | 0.019 | 0.960
Scenario 2 | >9 | 0.020 | 0.958
Scenario 3 | >10 | 0.020 | 0.955
Scenario 4 | >20 | 0.023 | 0.951
The atmospheric correction's influence on the
classifications was checked by comparing the
accuracy of the same classification model applied to
corrected and non-corrected input datasets. The DOS
has little impact on the accuracy of the aggregated multi-temporal
classification: it shifts the OA from 0.971
(non-corrected dataset) to 0.975 (corrected dataset).
Similarly, the DOS showed little influence on the
stacked method: it shifts the OA from
0.955 (non-corrected) to 0.960 (corrected) (Table 8).
Table 8: DOS influences over the classifications.
Approach | No DOS | DOS
OA of Aggregated multi-temporal | 0.971 | 0.975
OA of Stack multi-temporal | 0.955 | 0.960
3.3 Accuracy Assessment
The error matrices give the performance of the
classification. Both multi-temporal approaches
resulted in high accuracy values. Concerning
the aggregated approach, Table 9 shows that the
User's and Producer's accuracies are never below
0.94. The Plateaux class is among the less accurate ones, although its
F1 score reaches 0.95. The model identifies the
sandy bare soils and irrigated land classes particularly well.
Table 9: Error matrix of the aggregated multi-temporal
classification.
Class | Water | Plateaux | Forest bushes | Urban areas | Red bare soils | Sandy bare soil | Vegetation (plateaux) | Irrigated agricultural lands | Non-irrigated lands and pastures
Water | 200 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Plateaux | 0 | 189 | 0 | 0 | 0 | 0 | 11 | 0 | 0
Forest bushes | 0 | 0 | 200 | 0 | 0 | 0 | 0 | 0 | 0
Urban areas | 0 | 0 | 7 | 193 | 0 | 0 | 0 | 0 | 0
Red bare soils | 0 | 8 | 0 | 0 | 188 | 0 | 0 | 0 | 4
Sandy bare soil | 0 | 0 | 0 | 0 | 1 | 197 | 0 | 0 | 2
Vegetation (plateaux) | 0 | 0 | 0 | 0 | 0 | 0 | 200 | 0 | 0
Irrigated agricultural lands | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 200 | 0
Non-irrigated lands and pastures | 0 | 0 | 6 | 5 | 1 | 0 | 0 | 0 | 188
TOT | 200 | 197 | 213 | 198 | 190 | 197 | 211 | 200 | 194
PA | 1.00 | 0.96 | 0.94 | 0.98 | 0.99 | 1.00 | 0.94 | 1.00 | 0.97
UA | 1.00 | 0.94 | 1.00 | 0.96 | 0.94 | 0.98 | 1.00 | 1.00 | 0.94
F1 | 1.00 | 0.95 | 0.97 | 0.97 | 0.96 | 0.99 | 0.97 | 1.00 | 0.95
OA = 0.98
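The measures derived from the error matrix can be recomputed from the counts of Table 9. In the sketch below, the per-class ratios are named after how they are computed (diagonal over column total, diagonal over row total) rather than as producer's or user's accuracy; in this table the column-based ratios correspond approximately to the row labelled PA and the row-based ratios to the row labelled UA. The function and variable names are hypothetical.

```python
import numpy as np

# Counts of Table 9 (rows and columns in the order of the table).
TABLE_9 = np.array([
    [200,   0,   0,   0,   0,   0,   0,   0,   0],
    [  0, 189,   0,   0,   0,   0,  11,   0,   0],
    [  0,   0, 200,   0,   0,   0,   0,   0,   0],
    [  0,   0,   7, 193,   0,   0,   0,   0,   0],
    [  0,   8,   0,   0, 188,   0,   0,   0,   4],
    [  0,   0,   0,   0,   1, 197,   0,   0,   2],
    [  0,   0,   0,   0,   0,   0, 200,   0,   0],
    [  0,   0,   0,   0,   0,   0,   0, 200,   0],
    [  0,   0,   6,   5,   1,   0,   0,   0, 188],
])

def accuracy_measures(matrix):
    """Overall accuracy, per-class ratios by column and by row, and F1 score."""
    matrix = np.asarray(matrix, dtype=float)
    diagonal = np.diag(matrix)
    by_column = diagonal / matrix.sum(axis=0)  # correct / column total
    by_row = diagonal / matrix.sum(axis=1)     # correct / row total
    f1 = 2 * by_column * by_row / (by_column + by_row)
    overall = diagonal.sum() / matrix.sum()
    return overall, by_column, by_row, f1

overall, by_column, by_row, f1 = accuracy_measures(TABLE_9)
```

The recomputed overall accuracy is 1755/1800 = 0.975, which matches the value reported for the DOS-corrected aggregated classification.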
Table 10: Error matrix of the stacked multi-temporal
classification.
Class | Water | Plateaux | Forest bushes | Urban areas | Red bare soils | Sandy bare soil | Vegetation (plateaux) | Irrigated agricultural lands | Non-irrigated lands and pastures
Water | 183 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Plateaux | 0 | 180 | 0 | 0 | 0 | 0 | 10 | 0 | 0
Forest bushes | 1 | 0 | 191 | 0 | 0 | 0 | 1 | 3 | 0
Urban areas | 0 | 0 | 6 | 161 | 0 | 0 | 0 | 0 | 0
Red bare soils | 0 | 9 | 0 | 2 | 174 | 0 | 0 | 0 | 8
Sandy bare soil | 4 | 0 | 0 | 0 | 0 | 178 | 0 | 0 | 1
Vegetation (plateaux) | 0 | 1 | 0 | 0 | 0 | 0 | 195 | 0 | 0
Irrigated agricultural lands | 3 | 0 | 6 | 0 | 0 | 0 | 0 | 171 | 0
Non-irrigated lands and pastures | 0 | 0 | 8 | 4 | 0 | 0 | 0 | 0 | 177
TOT | 191 | 190 | 211 | 167 | 174 | 178 | 206 | 174 | 186
PA | 0.96 | 0.95 | 0.91 | 0.96 | 1.00 | 1.00 | 0.95 | 0.98 | 0.95
UA | 1.00 | 0.95 | 0.97 | 0.96 | 0.90 | 0.97 | 0.99 | 0.95 | 0.94
F1 | 0.98 | 0.95 | 0.94 | 0.96 | 0.95 | 0.99 | 0.97 | 0.97 | 0.94
OA = 0.96
Regarding the stacked classification, the accuracy
values are slightly lower than those of the
aggregated multi-temporal classification. Table 10
shows that the plateaux class reaches an F1 score of 0.947,
which makes it one of the less accurate classes along with
non-irrigated lands and pastures. The overall accuracy is
0.96, only 0.015 lower than that of the
aggregated method.
Despite the high accuracy values, some salt-and-pepper
effect is present all over the scene; thus, some
post-processing operations were carried out for
aesthetic reasons. Specifically, erosion (size 4) and
dilation (size 3) were applied to the Urban areas class
(Figure 6).
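The effect of erosion followed by dilation on a class mask can be sketched on a small binary array. The example uses a square 3x3 window for both operations, which is an assumption of this sketch and does not reproduce the sizes (4 and 3) or the kernel shape applied in GEE; the function names are hypothetical.

```python
import numpy as np

def _window_reduce(mask, size, reducer):
    """Apply a reducer (np.all or np.any) over a size x size window around each pixel."""
    pad = size // 2
    padded = np.pad(mask, pad, constant_values=False)
    rows, cols = mask.shape
    shifted = [padded[r:r + rows, c:c + cols]
               for r in range(size) for c in range(size)]
    return reducer(np.stack(shifted), axis=0)

def erode(mask, size=3):
    """Keep a pixel only if every pixel in its window belongs to the class."""
    return _window_reduce(np.asarray(mask, dtype=bool), size, np.all)

def dilate(mask, size=3):
    """Set a pixel if any pixel in its window belongs to the class."""
    return _window_reduce(np.asarray(mask, dtype=bool), size, np.any)

# Hypothetical urban mask: one compact 3x3 village and one isolated noisy pixel.
urban = np.zeros((7, 7), dtype=bool)
urban[2:5, 2:5] = True
urban[0, 6] = True

cleaned = dilate(erode(urban, size=3), size=3)
```

In this toy case the isolated pixel is removed while the compact block is restored to its original extent, which is the intended effect on salt-and-pepper noise.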
Figure 6: Example of the aggregated multi-temporal
classification (left) and the stacked multi-temporal
classification (right).
3.4 Comparison to Existing LC
Classifications
The pixel-by-pixel comparison reveals an overall
accuracy of 0.203; the pixel-based confusion
matrix is reported in Table 11. From a visual
interpretation of the results, the water class
(specifically the Sirba River) is better identified by the
aggregated LC than by the ESA LC (Figure 7).
Table 11: Error matrix of the ESA LC Africa (reference)
and Sirba Classification. To facilitate the reading, the
values are reported in square kilometres.
Class | Vegetation | Grassland | Cropland | Bare areas | Built-up | Waters | Total
Vegetation | 1068 | 1497 | 101 | 1568 | 56 | 97 | 4387
Grassland | 2451 | 8908 | 105 | 34992 | 1103 | 787 | 48348
Cropland | 7819 | 29140 | 1503 | 10711 | 1792 | 722 | 51686
Bare areas | 36 | 1407 | 2 | 11037 | 19 | 113 | 12614
Built-up | 58 | 4 | 66 | 59 | 584 | 12 | 783
Waters | 10 | 0 | 101 | 4 | 0 | 1109 | 1224
Total | 11442 | 40956 | 1879 | 58369 | 3554 | 2841 |
PA | 0.093 | 0.218 | 0.800 | 0.189 | 0.164 | 0.390 |
UA | 0.244 | 0.184 | 0.029 | 0.875 | 0.745 | 0.906 |
F1 | 0.135 | 0.200 | 0.056 | 0.311 | 0.269 | 0.546 |
OA = 0.203
Figure 7: Sirba River area classified according to the Sirba LC
(top) and the ESA LC (bottom).
4 DISCUSSION
Both classification methods show very positive
results. The little influence of DOS on the results
might be caused by the short time span and the very similar
meteorological conditions of the analysed dataset.
Also, the RF, which is only slightly sensitive to non-normalised
datasets, might contribute to such results.
Although DOS has little influence on the results, it
was maintained in the classification workflow
because of its short processing time. More
complex atmospheric correction models can require
more computational power and processing time.
Thus, further and more detailed analysis is needed
in this direction.
The GINI importance analysis lightens
the classification process and improves
the classification's performance. However, this was
not true for the stacked classification, where reducing the
dataset reduced the OA. It is worth underlining that
the importance analysis was applied twice in this
case. The DOS correction has little influence on the
final accuracy results for both multi-temporal
approaches. This is an unexpected result since most
relevant literature underlines the importance of
atmospheric correction in multi-temporal approaches,
especially in stacked ones.
Little difference also emerges from the comparison
of the two multi-temporal approaches. The
aggregated multi-temporal classification outperforms
the stacked one by only 0.015 of OA (regardless of
the atmospheric correction). The F1 score of some
classes of the aggregated multi-temporal approach is
1 (water and irrigated lands). In the stacked multi-temporal
approach, the F1 score shows some
differences: the irrigated land class is not one of the most
accurate classes, but bare soil is. This method tends
to misclassify the irrigated class, which is often
confused with forest or water. Despite some small
differences, the two approaches are
interchangeable for this specific application.
Likewise, there is little difference in the time needed to
apply one or the other.
Nevertheless, with the stacked approach there is a strong possibility of
running out of memory when additional features or a vast
area are considered. Indeed, data from 2017 had to be left out. In this
specific application, the aggregated multi-temporal
classification method was applied because of the
slightly higher accuracy and the less pronounced salt-and-pepper
effect all over the scene.
The pixel-by-pixel comparison between the LC
ESA map and the aggregated LC reveals low overall
accuracy (0.203) despite the two classifications
having similar spatial resolutions (10m and 20m).
The primary issue regards the confusion between
Grassland, Cropland and Bare areas. Most pixels
classified as Grassland in Sirba LC are considered
Bare areas in the ESA Africa LC (Table 11).
Similarly, most of the croplands of the Sirba LC are
classified as Grassland in the ESA LC. Indeed the F1
score of Cropland is only 0.056. Such results are
ascribable to the nature of the definitions of pastures,
grasslands and bare soils. In fact, pastures are
considered Agricultural land in ESA LC and
Grassland in Sirba LC. This is clearly detectable from
the visual comparison between the classifications
(Figure 7). Sirba River and most seasonal ponds and
lakes are detected in Sirba LC and not in ESA LC
because of the dataset of the classifications. Sirba LC
is a rainy season LC (only summer months in 2017-
2019), while ESA LC is based on one-year
observations. This also influences the vegetation
class, which is captured at its maximum during the
rainy season. A good overlap is present between the
other classes. It is worth underlining that the ESA
Africa LC is a prototype, and it was validated using
Crowdsourcing only for Kenya, Gabon, Ivory Coast
and South Africa. The analysis and the methodology
applied for the classification demonstrate that the
textural information facilitates segmentation and
classification.
Similarly, the aggregated multi-temporal approach
proposed to reduce the variability of the images led to
high-accuracy classification. Selecting a limited
period for the satellite classification allowed the
maximisation of the seasonal characterisation. It
increased the separability of some hard-to-map
classes (e.g. Nigerienne urban areas from bare soils
and pastures and water).
5 CONCLUSIONS
Regardless of the application of atmospheric
correction, the classification provides a suitable LC
map for flood planning. It follows that, with some
specific actions, it is possible to overcome the main
mapping difficulties and obtain LC maps with high
thematic detail in sub-Saharan areas. The model
proposed in this paper can be applied to classify other
sub-Saharan river areas semi-automatically since it is
developed in GEE.
REFERENCES
Anadia 2.0. climateservices.it CNR-IBE. URL
https://climateservices.it/progetto/anadia/ (accessed 2.8.23).
Belcore, E., Piras, M., Pezzoli, A., 2022. Land Cover
Classification from Very High-Resolution UAS Data
for Flood Risk Mapping. Sensors 22, 5622. https://
doi.org/10.3390/s22155622
Belcore, E., Piras, M., Wozniak, E., 2020. Specific alpine
environment land cover classification methodology:
Google Earth Engine processing for Sentinel-2 data, in:
Volume XLIII-B3-2020. Copernicus Publications, pp.
663–670. https://doi.org/10.5194/isprs-archives-XLIII-
B3-2020-663-2020
Bigi, V., Pezzoli, A., Rosso, M., 2018. Past and Future
Precipitation Trend Analysis for the City of Niamey
(Niger): An Overview. Climate 6, 73. https://doi.org/
10.3390/cli6030073
Breiman, L., 2001. Random Forests. Machine Learning 45,
5–32. https://doi.org/10.1023/A:1010933404324
Chavez, P.S., 1988. An improved dark-object subtraction
technique for atmospheric scattering correction of
multispectral data. Remote Sensing of Environment 24,
459–479. https://doi.org/10.1016/0034-4257(88)90019-3
Congalton, R.G., 1991. A review of assessing the accuracy
of classifications of remotely sensed data. Remote
Sensing of Environment 37, 35–46. https://doi.org/
10.1016/0034-4257(91)90048-B
Conners, R. W., Trivedi, M.M., Harlow, C.A., 1984.
Segmentation of a high-resolution urban scene using
texture operators. Computer Vision, Graphics, and
Image Processing 25, 273–310. https://doi.org/
10.1016/0734-189X(84)90197-X
Dorren, L.K.A., Maier, B., Seijmonsbergen, A.C., 2003.
Improved Landsat-based forest mapping in steep
mountainous terrain using object-based classification.
Forest Ecology and Management 183, 31–46.
https://doi.org/10.1016/S0378-1127(03)00113-0
Drzewiecki, W., Wawrzaszek, A., Aleksandrowicz, S.,
Krupiński, M., Bernat, K., 2013. Comparison of
selected textural features as global content-based
descriptors of VHR satellite image, in: 2013 IEEE
International Geoscience and Remote Sensing
Symposium - IGARSS. Presented at the 2013 IEEE
International Geoscience and Remote Sensing
Symposium - IGARSS, pp. 4364–4366.
https://doi.org/10.1109/IGARSS.2013.6723801
ESA, 2016. ESA CCI LAND COVER – S2 prototype Land
Cover 20m map of Africa 2016 [WWW Document].
URL http://2016africalandcover20m.esrin.esa.int/
(accessed 4.12.21).
GEE, n.d. Google Earth Engine [WWW Document]. URL
https://earthengine.google.com/platform/ (accessed
3.9.20).
Haralick, R.M., Shanmugam, K., Dinstein, I., 1973.
Textural Features for Image Classification. IEEE
Transactions on Systems, Man, and Cybernetics
SMC-3, 610–621.
https://doi.org/10.1109/TSMC.1973.4309314
Huete, A.R., 1988. A soil-adjusted vegetation index
(SAVI). Remote Sensing of Environment 25, 295–309.
https://doi.org/10.1016/0034-4257(88)90106-X
Kukawska, E., Lewiński, S., Krupiński, M., Malinowski,
R., Nowakowski, A., Rybicki, M., Kotarba, A., 2017.
Multitemporal Sentinel-2 data - remarks and
observations, in: 2017 9th International Workshop on
the Analysis of Multitemporal Remote Sensing Images
(MultiTemp). Presented at the 2017 9th International
Workshop on the Analysis of Multitemporal Remote
Sensing Images (MultiTemp), pp. 1–4.
https://doi.org/10.1109/Multi-Temp.2017.8035212
Lantzanakis, G., Mitraka, Z., Chrysoulakis, N., 2017.
Comparison of Physically and Image Based
Atmospheric Correction Methods for Sentinel-2
Satellite Imagery, in: Karacostas, T., Bais, A., Nastos,
P.T. (Eds.), Perspectives on Atmospheric Sciences,
Springer Atmospheric Sciences. Springer International
Publishing, pp. 255–261.
Li, Q., Qiu, C., Ma, L., Schmitt, M., Zhu, X.X., 2020.
Mapping the Land Cover of Africa at 10 m Resolution
from Multi-Source Remote Sensing Data with Google
Earth Engine. Remote Sensing 12, 602.
https://doi.org/10.3390/rs12040602
Massazza, G., Tamagnone, P., Pezzoli, A., Housseini, M.,
Belcore, E., Tiepolo, M., Rosso, M., 2018.
Améliorations sur le système d’observation du bassin de la
Rivière Sirba pour la gestion des risques naturels, in:
Colloque International AMMA-CATCH. AMMA
CATCH.
Massazza, G., Tamagnone, P., Wilcox, C., Belcore, E.,
Pezzoli, A., Vischel, T., Panthou, G., Housseini
Ibrahim, M., Tiepolo, M., Tarchiani, V., Rosso, M.,
2019. Flood Hazard Scenarios of the Sirba River
(Niger): Evaluation of the Hazard Thresholds and
Flooding Areas. Water 11, 1018.
https://doi.org/10.3390/w11051018
Oguntunde, P.G., Lischeid, G., Abiodun, B.J., 2018.
Impacts of climate variability and change on drought
characteristics in the Niger River Basin, West Africa.
Stoch Environ Res Risk Assess 32, 1017–1034.
https://doi.org/10.1007/s00477-017-1484-y
Poortinga, A., Tenneson, K., Shapiro, A., Nquyen, Q., San
Aung, K., Chishtie, F., Saah, D., 2019. Mapping
Plantations in Myanmar by Fusing Landsat-8,
Sentinel-2 and Sentinel-1 Data along with Systematic
Error Quantification. Remote Sensing 11, 831.
https://doi.org/10.3390/rs11070831
Shepherd, J.D., Dymond, J.R., 2010. Correcting satellite
imagery for the variance of reflectance and illumination
with topography. International Journal of Remote
Sensing. https://doi.org/10.1080/01431160210154029
Tamagnone, P., Massazza, G., Pezzoli, A., Rosso, M.,
2019. Hydrology of the Sirba River: Updating and
Analysis of Discharge Time Series. Water 11, 156.
https://doi.org/10.3390/w11010156
Teodoro, A.C., Duarte, L., 2022. Chapter 10 - The role of
satellite remote sensing in natural disaster management,
in: Denizli, A., Alencar, M.S., Nguyen, T.A., Motaung,
D.E. (Eds.), Nanotechnology-Based Smart Remote
Sensing Networks for Disaster Prevention, Micro and
Nano Technologies. Elsevier, pp. 189–216.
https://doi.org/10.1016/B978-0-323-91166-5.00015-X
Tiepolo, M., Bacci, M., Braccio, S., 2018. Multihazard Risk
Assessment for Planning with Climate in the Dosso
Region, Niger. Climate 6, 67.
https://doi.org/10.3390/cli6030067
GISTAM 2023 - 9th International Conference on Geographical Information Systems Theory, Applications and Management