The spatially interpolated results using Inverse
Distance Weighting (IDW) for the observed
Turbidity and TDS are presented in
Appendix (a) and Appendix (d), respectively. The water
quality parameters predicted from the drone data using XGBoost,
the best regression model, are presented in
Appendix (b) and Appendix (e).
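The IDW interpolation referred to above can be illustrated with a minimal sketch. It assumes planar station coordinates and a distance power of 2; the power used in this study is not stated here, and the function names and station values below are hypothetical.

```python
import math

def idw(stations, x, y, power=2.0):
    """Inverse Distance Weighting estimate at (x, y).

    stations: list of (xs, ys, value) tuples for the sampling stations.
    power: distance exponent (2.0 is an assumed, commonly used default).
    """
    num = 0.0
    den = 0.0
    for xs, ys, value in stations:
        d = math.hypot(x - xs, y - ys)
        if d == 0.0:
            # The query point coincides with a station: return its value.
            return value
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den

def idw_grid(stations, xs, ys, power=2.0):
    """Evaluate IDW on a regular grid; rows follow ys, columns follow xs."""
    return [[idw(stations, x, y, power) for x in xs] for y in ys]

# Hypothetical stations: (x, y, observed turbidity), not data from the study.
stations = [(0.0, 0.0, 4.0), (2.0, 0.0, 8.0), (0.0, 2.0, 6.0)]
grid = idw_grid(stations, xs=[0.0, 1.0, 2.0], ys=[0.0, 1.0, 2.0])
```

Because IDW is a weighted average with positive weights, every interpolated value lies between the minimum and maximum observed values, which is one reason sparse sampling stations limit the detail of the interpolated surface.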
From the visualization of the IDW interpolation
results in the Appendix, it is inferred that the use of a
single ML model may not always give accurate
prediction results. This is attributed in part to the
complexity of the bio-optical responses of the water
quality parameters and to the small number of sampling
stations, and it calls for the development of ensemble ML
approaches that combine the advantages of the
optimal machine learning algorithms for a given
WQP (Satish et al., 2024).
To minimize overfitting, not only
should the number of samples be increased, but the ensemble
ML can also be structured so that the inputs of the second
stage contain both the spectral indices and the
prediction results from the first-stage ML method.
The results in Appendix (c) and Appendix (f) show
the improvements in the prediction of Turbidity and
TDS with the ensemble ELR-XGBoost using the
DJI-PH4 drone data.
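The two-stage structure described above can be sketched as follows. This is an illustration under stated assumptions, not the implementation used in the study: a straight-line least-squares fit stands in for the ELR first stage, a small gradient-boosting model built from regression stumps stands in for XGBoost, and all function names and sample values are hypothetical.

```python
from statistics import fmean

def rmse(pred, obs):
    return (sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs)) ** 0.5

def fit_line(x, y):
    """Stage 1 stand-in for ELR: least-squares line y ~ a*x + b."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, my - a * mx

def fit_stump(rows, residuals):
    """Best single-split regression stump: (feature, threshold, left, right).

    Assumes at least one feature takes more than one distinct value.
    """
    best = None
    for j in range(len(rows[0])):
        # Candidate thresholds: every distinct value except the largest,
        # so both sides of the split are non-empty.
        for t in sorted({r[j] for r in rows})[:-1]:
            left = [e for r, e in zip(rows, residuals) if r[j] <= t]
            right = [e for r, e in zip(rows, residuals) if r[j] > t]
            lm, rm = fmean(left), fmean(right)
            sse = (sum((e - lm) ** 2 for e in left)
                   + sum((e - rm) ** 2 for e in right))
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    return best[1:]

def predict_stump(stump, row):
    j, t, lm, rm = stump
    return lm if row[j] <= t else rm

def fit_boosting(rows, y, rounds=50, lr=0.3):
    """Plain gradient boosting on squared error (stand-in for XGBoost)."""
    base = fmean(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(rows, residuals)
        stumps.append(stump)
        pred = [pi + lr * predict_stump(stump, r) for pi, r in zip(pred, rows)]
    return base, lr, stumps

def predict_boosting(model, row):
    base, lr, stumps = model
    return base + lr * sum(predict_stump(s, row) for s in stumps)

def fit_two_stage(indices, y, primary=0):
    """Stage 2 inputs = spectral indices plus the stage-1 prediction."""
    a, b = fit_line([r[primary] for r in indices], y)
    stage1 = [a * r[primary] + b for r in indices]
    stage2_rows = [list(r) + [p] for r, p in zip(indices, stage1)]
    return a, b, primary, fit_boosting(stage2_rows, y)

def predict_two_stage(model, row):
    a, b, primary, booster = model
    stage1 = a * row[primary] + b
    return predict_boosting(booster, list(row) + [stage1])

# Synthetic spectral indices and turbidity values, for illustration only.
indices = [[0.10, 0.52], [0.15, 0.48], [0.22, 0.41],
           [0.30, 0.39], [0.36, 0.30], [0.45, 0.27]]
turbidity = [2.1, 2.9, 4.8, 5.1, 7.9, 8.4]
model = fit_two_stage(indices, turbidity)
fitted = [predict_two_stage(model, r) for r in indices]
```

The sketch only reports the fit on the training samples. Judging whether the second stage reduces overfitting would require evaluating it on held-out stations.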
4 CONCLUSIONS
In this study, spectral indices with different band
combinations were constructed from the spectral
reflectances of DJI-PH4 Multispectral UAV-Drone
and Sentinel-2 satellite data for the retrieval of
concentrations of Turbidity and TDS water
parameters in a dam reservoir. For the case study of
Gaborone dam (Botswana), the sensor spectral
reflectance and the in-situ measured WQPs were
modelled using univariate Empirical Linear
Regression (ELR), XGBoost and RFR machine
learning models. For both WQPs, XGBoost
performed better in the model training phase;
however, the third-order polynomial ELR gave good
results in both training and testing on the drone and
satellite reflectance data. Turbidity prediction results
from the drone and satellite data showed that the ELR
multivariate regression model outperformed
XGBoost in testing and was also better than RFR
in both the training and testing phases. For the prediction
of TDS, XGBoost gave the best results for both the
drone and satellite data. The XGBoost and ELR
ensemble algorithm demonstrated the ability to
improve water quality parameter inversion, as the
ensemble WQP prediction accuracies were higher than
those from the single ML models. While the absolute accuracy
for the retrieval of WQPs still requires improvements,
such as including seasonal variability
measurements and increasing the number of sampling
stations, the current WQP prediction results
obtained with machine learning algorithms demonstrate the
potential of drone and satellite sensors for the
spatial retrieval of Turbidity and TDS in dam
reservoirs. The proposed histogram equalization and
linear bias adjustment of the drone spectral
reflectances, based respectively on Sentinel-2 MSI
and Landsat-9 OLI-2 satellite data, were found to provide
suitable results. Based on the comparatively similar
WQP estimation results from the drone and satellite
sensors, the two can be integrated to exploit the
high temporal resolution of drone sensors and the
dynamic spectral band wavelengths of the Sentinel-2
MSI for improved water quality monitoring in dam
reservoirs.
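The two reflectance adjustments mentioned above can be sketched under stated assumptions. Treating histogram equalization against a satellite reference as empirical histogram (quantile) matching is an interpretation, and the linear bias adjustment is assumed to be a least-squares gain and offset fitted on co-located drone and satellite reflectances. The function names and values are hypothetical.

```python
from statistics import fmean

def fit_linear_bias(drone, reference):
    """Least-squares gain and offset so that gain*drone + offset ~ reference."""
    md, mr = fmean(drone), fmean(reference)
    var = sum((d - md) ** 2 for d in drone)
    cov = sum((d - md) * (r - mr) for d, r in zip(drone, reference))
    gain = cov / var
    return gain, mr - gain * md

def match_histogram(values, reference):
    """Quantile mapping: give `values` the empirical distribution of `reference`.

    Each value is replaced by the reference quantile at the same rank,
    with linear interpolation between sorted reference samples.
    """
    ref_sorted = sorted(reference)
    order = sorted(range(len(values)), key=lambda i: values[i])
    n, m = len(values), len(ref_sorted)
    out = [0.0] * n
    for rank, i in enumerate(order):
        q = rank / (n - 1) if n > 1 else 0.0
        pos = q * (m - 1)
        lo = int(pos)
        hi = min(lo + 1, m - 1)
        out[i] = ref_sorted[lo] + (pos - lo) * (ref_sorted[hi] - ref_sorted[lo])
    return out

# Synthetic co-located reflectances, for illustration only.
drone_band = [0.05, 0.08, 0.12, 0.20]
satellite_band = [0.030, 0.045, 0.065, 0.105]
gain, offset = fit_linear_bias(drone_band, satellite_band)
adjusted = [gain * d + offset for d in drone_band]
matched = match_histogram([3, 1, 2], [10, 20, 30])
```

The linear adjustment changes only the scale and offset of the drone reflectances, whereas quantile matching reshapes their whole distribution while preserving the rank order of the pixels.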
ACKNOWLEDGEMENTS
The authors acknowledge the Water Utilities
Corporation (WUC) of Botswana for providing the
in-situ measured water quality data used in this study.
This research project was funded by the USAID
Partnerships for Enhanced Engagement in Research
(PEER) program under cooperative
agreement number AID-OAA-A-11-00012.
REFERENCES
Asadollahfardi, G., Taklify, A., Ghanbari, A. (2012).
Application of artificial neural network to predict TDS
in Talkheh Rud River. Journal of Irrigation and
Drainage Engineering, 138(4), 363–370.
Breiman, L. (2001). Random forests. Machine Learning, 45,
5–32.
Chen, B., Mu, X., Chen, P., Wang, B., Choi, J., Park, H.,
Xu, S., Wu, Y., Yang, H. (2021). Machine learning-
based inversion of water quality parameters in typical
reach of the urban river by UAV multispectral data.
Ecological Indicators, 133, 108434.
Le, N.Q.K., Do, D.T., Le, Q.A. (2021). A sequence-based
prediction of Kruppel-like factors proteins using
XGBoost and optimized features. Gene, 787, 145643.
Lotfi, G., Ahmadi, N.M., Abolhasani, M. (2019). The
feasibility of using Landsat OLI images for water
turbidity estimation in Gandoman wetland, Iran.
Journal of Radar and Optical Remote Sensing, 2(2),
49–62.
Ouma, Y.O., Keitsile, A., Lottering, L., Nkwae, B.,
Odirile, P. (2024). Spatiotemporal empirical analysis of
particulate matter PM2.5 pollution and air quality