A Proactive Approach for the Sustainable Management of Water
Distribution Systems
Sarah Di Grande (1,a), Mariaelena Berlotti (1,b), Salvatore Cavalieri (1,c) and Roberto Gueli (2,d)
(1) Department of Electrical Electronic and Computer Engineering, University of Catania, Viale A. Doria n.6, Catania, Italy
(2) EHT, Viale Africa n.31, Catania, Italy
(a) https://orcid.org/0009-0008-8895-2175
(b) https://orcid.org/0009-0007-6564-704X
(c) https://orcid.org/0000-0001-9077-3688
(d) https://orcid.org/0000-0002-8014-0243
Keywords: Water Distribution System, Water 4.0, Water Demand Forecasting, Energy Consumption Forecasting,
Machine Learning.
Abstract: Today, water distribution systems need to supply water to consumers in a sustainable way. This is connected
to the concept of Watergy, which means the satisfaction of user demand with the least possible use of water
and energy resources. Thanks to modern technologies, the forecasting of water and energy demand can help
achieve this goal. In particular, water demand forecasting allows water distribution companies to know in
advance how water resources will be allocated, helps identify anomalies in water consumption, and
is essential for pump scheduling. Energy consumption forecasting, in turn, supports other important
tasks, such as energy optimization, identification of anomalous consumption, and planning of energy load.
The present paper aims to develop short-term water demand and energy consumption forecasting models through
machine learning-based methodologies that are innovative for the water distribution sector: global forecasting models,
the N-Beats machine learning algorithm, and transfer learning approaches. These tools performed very
well in building the models mentioned above.
1 INTRODUCTION
Today water distribution systems (WDSs) are
responsible for water delivery with the required
quality, pressure, and quantity, but with the lowest
possible water and energy waste (Adedeji et al.,
2022), (Mesalie et al., 2021). This goal is linked to the
concept of Watergy efficiency, which means the
satisfaction of user demand with the least possible use
of water and energy resources (Bolognesi et al.,
2014).
Water scarcity poses a great threat to humans. It is
predicted that by 2025, 1.8 billion people may face
severe water shortages, and about two-thirds of the
world's population could be experiencing water stress
(Hans et al., 2014). This scenario of decreasing water
availability is the result of the amplification of
various factors, such as climate change, population
growth, increased urbanization rates, and industrial
development (Patil et al., 2022), (Leitão et al., 2019),
(de Souza Groppo et al., 2019), (Esen et al., 2020).
These phenomena have caused a significant increase
in water consumption, reducing the available water
resources (Hussain et al., 2022), (Stańczyk et al.,
2022). Indeed, the increase in water consumption is
not accompanied by an increase in water resources.
Water demand forecasting can help in identifying
wasteful behavior or leakages in the system, which
lead not only to higher water consumption but also
to higher energy consumption (Kofinas et al., 2016).
Furthermore, water demand forecasting helps prevent
energy waste by enabling pump
scheduling, as explained later in this
section.
Di Grande, S., Berlotti, M., Cavalieri, S. and Gueli, R.
A Proactive Approach for the Sustainable Management of Water Distribution Systems.
DOI: 10.5220/0012121200003541
In Proceedings of the 12th International Conference on Data Science, Technology and Applications (DATA 2023), pages 115-125
ISBN: 978-989-758-664-4; ISSN: 2184-285X
Copyright © 2023 by SCITEPRESS Science and Technology Publications, Lda. Under CC license (CC BY-NC-ND 4.0)
Indeed, concerning energy consumption, a water
distribution system incurs high energy costs in all of
its operations (water extraction, treatment, and
distribution), but pumping systems are the biggest
cause of consumption (Sarmas et al., 2022). Fortunately,
the optimal management of pumps’ operations, called
pump scheduling, can be a source of savings (Gan et
al., 2022). Figure 1 helps to explain this concept.
Figure 1 is a simplified representation of how the
water is distributed to final users in a water
distribution network. In this network, each zone is
served by a pump system composed of 3
collaborating pumps (parallel pumps) that distribute
water collected in a reservoir. After each pump
system, there is a single pipe in which the water flows
to a network zone. Parallel pump systems create
opportunities for energy savings because pumps can be
turned on or off based on the water
demand and energy consumption (Gan et al., 2022).
In other words, not all pumps need to operate
simultaneously at all times, for example during the night, when
the water demand is usually lower than during
working hours.
Figure 1: A simple representation of a WDS.
In this context, forecasting the aggregated water
demand of each network zone plays a very important
role in scheduling pumps’ operations. Accurate
forecasting helps in deciding how many and which
pumps to turn on at a given moment of the day based
on water demand. Furthermore, forecasting the
energy consumption of pump systems can also be
useful for different purposes: energy optimization,
identification of anomalous consumption, and
planning of energy load (Yi et al., 2022), (Alhendi et
al., 2022). In particular, planning the energy
load to be supported is fundamental for predicting the
costs to be incurred and whether the system
will be able to cope with the required power.
In conclusion, today Watergy efficiency is the
main goal that water distribution companies want to
achieve and the digitalization that has pervaded the
management of water resources permits them to get
as close as possible to this goal. Indeed, the
sustainability of the water supply system would not
be possible without the industrial revolution of the
water sector, called Water 4.0 (Adedeji et al., 2022).
Automation, increased integration of sensors, Internet
of Things, big data analysis, and artificial intelligence
are some of the features of Water 4.0. In particular,
the most famous applications of artificial intelligence
in the management of water supply systems are
anomaly detection and water demand forecasting
(Adedeji et al., 2022). The present paper is focused
on the latter, together with the forecasting of pumps’ energy
consumption. In particular, forecasts are
produced over a 24-hour horizon with a
time step of 1 hour.
The proposal presented in the paper has been
developed inside a research project funded by “Italian
Ministry of Enterprises and Made in Italy”
(https://www.mimit.gov.it/en/); details about the
research project are given in the Acknowledgment
section. One of the partners of the project, EHT, is an
enterprise involved in the digital transformation of
water distribution systems. The goals of the research
here presented were specified by this enterprise
together with sponsors of the project, fully involved
in the same business area. Moreover, the enterprises
involved in the project evaluated the results achieved
in order to understand to what extent these results
could be valuable in practical terms; the evaluation of
these results was successful as the impact of the
proposal on the real management of water distribution
systems was considered important.
The paper is organized as follows. In Section 2
the authors introduce related studies about machine
learning (ML) models used for water demand
forecasting and pumps’ energy forecasting. In Section
3 the authors describe the proposed approach to solve
the previously cited forecasting problems, explaining
how the dataset was simulated, the preprocessing
steps applied before training the machine learning
algorithm, the proposed forecasting models, and
finally, the performance metrics used to evaluate the
models. Section 4 reports the obtained
results, and Section 5 presents the conclusions and
future work.
2 RELATED WORKS
Water demand forecasting has been addressed with
different machine learning forecasting models in the
literature. In (Niknam et al., 2022) and (de Souza
Groppo et al., 2019), a detailed review of the methods
employed and of important future challenges is given.
In particular, the three most used methods are, in
order: traditional time series (e.g., autoregressive
integrated moving averages, exponential smoothing),
different types of artificial neural networks (e.g., long
short-term memory, radial basis function ANN, gated
recurrent units), and support vector machines.
Traditional time series models cannot reach the
forecast accuracy of machine learning models,
because they are not able to learn complex, non-linear
patterns in demand data. Despite this, they are
among the most used models due to their ease of use
and interpretation (Niknam et al., 2022). Among the
most recent papers, AutoRegressive Integrated
Moving Averages (ARIMA) and Exponential
Smoothing (ES) are the most used time series models.
In (Ebrahim Banihabib et al., 2019), two forecasting
methods for daily urban water consumption
forecasting are used; one of them is ARIMA. In
(Karamaziotis et al., 2020) different methods, including
ARIMA and ES, are used to produce a mid-term
forecast. In (Ristow et al., 2021) the ARIMA and ES
methods are used to forecast monthly urban water
demand.
Regarding ANN, in (Salloom et al., 2021)
hourly water demand forecasting with the Gated
Recurrent Unit (GRU) method is presented. In (Hu et
al., 2021), the GRU method is also used for
hourly water demand forecasting,
demonstrating the superiority of this method
compared to Support Vector Machines (SVM).
Regarding the SVM method, in (Shabani et al., 2017)
it is used with a polynomial kernel function to predict
monthly water demand. In (Candelieri et al.,
2019), the authors used Support Vector Regression (SVR)
with a parallel global
optimization tuning of hyperparameters, which
allowed them to increase the accuracy of the
short-term forecast.
The pumps’ energy consumption forecasting of a
water distribution system is instead a less explored
area of study. In (Yi et al., 2022), the authors used
four algorithms (multiple linear regression, random
forest, deep neural network, and support vector
regression) to forecast the energy consumption of an
entire water system and the three subsystems of
conveyance, treatment, and distribution. They
pointed out the lack of similar papers.
Based partly on the literature search carried out, and
partly on what is stated in
(Niknam et al., 2022), some topics remain largely
unexplored in the literature.
In water distribution systems, water
demand time series are usually forecasted individually,
meaning that one model is developed for each
time series. Global models, instead, allow the
creation of a single model for all the time series
(Montero-Manso et al., 2021). This is essential
considering that, thanks to the
available technologies, the number of available series
will keep increasing over the years, and new tools are
needed to manage them.
Furthermore, global models, learning from
multiple time series simultaneously, allow the use of
transfer learning approaches (Bandara et al., 2021). If
the model is trained on fairly heterogeneous series,
transfer learning should allow forecasting on series
never seen before.
Finally, to the best of the authors' knowledge,
one algorithm has
never been used for water demand and pumps’ energy
consumption forecasting of an entire water
distribution system: N-Beats (Oreshkin et al., 2020).
This algorithm was developed specifically for time
series forecasting.
In summary, this paper aims to fill these gaps in the
literature by developing short-term water demand and
pumps’ energy consumption forecasting models
through global models based on the N-Beats algorithm
and a transfer learning approach.
3 DESCRIPTION OF THE
APPROACH
In this section, the authors describe the tools used for
data simulation, the dataset preprocessing phases, the
proposed forecasting method, and the performance
metrics for the models' evaluation.
3.1 Data Simulation
As is well known, data plays a strategic role in a machine
learning approach. For the problem
presented in this paper, information about the WDS,
such as pipe flowrates and pumps’ energy consumption,
is required. Data for water distribution
systems are often not available or of poor quality
(Maira et al., 2014). As the main aim of the paper was
to study the feasibility of the proposed approach, the data
needed to run the machine learning-based solution
were synthetically generated.
Data were simulated through the Water Network Tool
for Resilience (WNTR), version 0.5.0, a Python package
based upon the EPANET software and designed for the
simulation of water distribution networks (WDNs)
(Klise et al., 2020). The WNTR simulator takes as
input an .inp file containing the network
characteristics (e.g., pipes, pumps, valves, junctions,
tanks, reservoirs, water demand patterns, pump
curves) and returns different time series with the
simulation results (e.g., pipe flowrates, node
pressures, pumps’ energy consumption).
The network used for the simulation is a
simplified version of the WDN of the city of Milano
(Italy); it is made up of 12,354 nodes, 17,548 links,
128 patterns, and 95 pump curves (one for each
pump). A water demand pattern has been assigned to
each node representing a user. The demand
pattern was extrapolated from water consumption
data collected every minute on particular dates. In more
detail, these consumption data were aggregated in
a dataset, where each column represented a specific
zone, and each row was the water consumption of a
particular minute of the day. In order to extract the
coefficients of the general demand pattern, a column
with the total consumption per minute was added.
Finally, the column with the coefficients per minute
was obtained by dividing each observation of the total
consumption column by the average value of this
column. Then, the pattern was aggregated with the
mean operator to obtain 1-hour interval observations,
useful for the subsequent step of hourly
forecasting. The true 24-hour pattern is represented
by a black line in Figure 2.
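The extraction of the pattern coefficients described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' code; the function name `demand_pattern` and the input layout (one row per minute, one value per zone) are assumptions made for the example.

```python
from statistics import mean


def demand_pattern(zone_consumption, minutes_per_step=60):
    """Return hourly demand-pattern coefficients.

    zone_consumption: list of rows, one row per minute; each row holds the
    consumption of every zone in that minute (hypothetical layout).
    """
    # Total consumption per minute (sum over all zone columns).
    total = [sum(row) for row in zone_consumption]
    # Coefficient per minute: total consumption divided by its average.
    average = mean(total)
    coefficients = [value / average for value in total]
    # Aggregate with the mean operator to obtain 1-hour observations.
    return [
        mean(coefficients[start:start + minutes_per_step])
        for start in range(0, len(coefficients), minutes_per_step)
    ]


# Two zones, two hours of per-minute data: low demand, then high demand.
rows = [[1.0, 1.0]] * 60 + [[3.0, 3.0]] * 60
hourly = demand_pattern(rows)
```

By construction the per-minute coefficients average to 1, so a coefficient above 1 marks an hour with higher-than-average demand.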
From this real demand pattern, 4 other patterns
were obtained by adding random noise to each value
of the true one. In order to create similar but different
patterns, a quantity that randomly increased or
decreased the original value was added to each value
of the true pattern. To calculate this quantity, first of all,
a random noise between -1.5 and 1.5 was generated
by multiplying a random value between 0 and 1 by 3
and subtracting 1.5. Then, this value was multiplied
by 0.05 to obtain a number between -0.075 and 0.075
(5% of the range from -1.5 to 1.5). For each value of the
real pattern, this process was repeated 4 times to obtain
the 4 similar patterns. The synthetic 24-hour patterns are
represented in Figure 2 and compared with the true
values.
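A minimal sketch of this noise procedure is given below. The text does not state whether the perturbation is applied additively or as a relative factor; the example assumes it is added directly to the coefficient, and the function name `synthetic_patterns` and the `seed` parameter are hypothetical.

```python
import random


def synthetic_patterns(true_pattern, n_patterns=4, seed=None):
    """Create noisy copies of a demand pattern (illustrative only)."""
    rng = random.Random(seed)
    patterns = []
    for _ in range(n_patterns):
        pattern = []
        for value in true_pattern:
            # Random value in [0, 1) mapped to a noise in [-1.5, 1.5).
            noise = rng.random() * 3 - 1.5
            # Scale by 0.05: perturbation in [-0.075, 0.075).
            delta = noise * 0.05
            # Assumption: the perturbation is added to the true value.
            pattern.append(value + delta)
        patterns.append(pattern)
    return patterns


true_pattern = [1.0] * 24
noisy = synthetic_patterns(true_pattern, seed=0)
```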
Figure 2: True water demand pattern and synthetic ones.
Although this is the general water demand pattern,
in the network .inp file each node demanding water
has a specific base demand. This different base
demand makes the demand pattern of
each node in the network unique.
Furthermore, the pumps’ speed was set constant
because WNTR did not support variable speed pumps
at the time of the study.
The water distribution was simulated using this
network to obtain 5 days of hourly time series (120
observations). The simulation was a demand-driven
simulation, meaning that the pressure in the system
depends on the node demands, and that node demands
are always satisfied (Klise et al., 2020). The time
series collected to proceed with the analysis are:
flowrates series for each pipe (120 rows, 17548
columns), and energy consumption of each pump
(120 rows, 95 columns).
Table 1 and Table 2 provide examples of the
flowrate and energy consumption datasets,
respectively, where f indicates the flowrate and e
the energy, while the row index represents the
hour (120 hours in total, since the data cover 5 days
at hourly resolution).
Table 1: Simulated flowrates time series example.
Hour   Flow pipe id 1   Flow pipe id 2   ...   Flow pipe id 17548
1      f_{1,1}          f_{1,2}          ...   f_{1,17548}
2      f_{2,1}          f_{2,2}          ...   f_{2,17548}
...    ...              ...              ...   ...
120    f_{120,1}        f_{120,2}        ...   f_{120,17548}
Table 2: Simulated energy consumptions time series
example.
Hour   Energy pump id 1   Energy pump id 2   ...   Energy pump id 95
1      e_{1,1}            e_{1,2}            ...   e_{1,95}
2      e_{2,1}            e_{2,2}            ...   e_{2,95}
...    ...                ...                ...   ...
120    e_{120,1}          e_{120,2}          ...   e_{120,95}
3.2 Feature Selection
A feature selection procedure was carried out for both
the flowrates dataset and the pumps’ energy consumption
dataset to obtain the best input setting for the machine
learning algorithms.
As represented by Figure 1, after each group of
pumps there is a pipe from which water flows to a
zone based on the zone's aggregate water demand.
After this pipe, there would be many other pipes that
allow water to reach all the end users. Since the
authors were interested only in the aggregated
demand of a network area, for each served zone, the
pipe represented in Figure 1 was selected through its
id, and the flowrates dataset was reduced from 17,548
to 27 columns (one pipe for each network zone).
Among the 27 time series, 7 showed an
almost constant simulated daily pattern and were
therefore excluded from the analysis, reducing the
dataset from 27 to 20 columns.
The names of the selected zones of the Milano
network are: Anfossi, Armi, Assiano, Baggio,
Cantore, Chiusabella, Cimabue, Comasina,
Crescenzago, Feltre, Gorla, Italia, Lambro, Novara,
Padova, S. Siro, Salemi, Suzzani nord, Suzzani sud,
Vialba.
The final dataset was composed of 20 time series,
each one representing the 5-day aggregated water
demand of a network zone with an hourly interval.
The pumps’ energy consumption dataset needed a
first reduction of features to contain only the pumps
of the previously selected
network zones (from 95 to 72 pumps). To make this
selection, a file associating each pump with its area
was consulted. In each area, the pumps
collaborate to deliver water to the specific
zone. The number of collaborating pumps for each
area ranges from a minimum of 2 to a maximum of 5.
Furthermore, the consumption of the pumps in the same
zone was aggregated with the sum operator to prepare
the dataset for the pumps’ energy consumption
forecasting of a network zone.
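The selection and aggregation step can be illustrated with a short sketch. The data structures (`pump_energy`, `pump_to_zone`) and the pump identifiers are hypothetical stand-ins for the simulated dataset and the pump-to-area file mentioned above.

```python
def aggregate_energy_by_zone(pump_energy, pump_to_zone, selected_zones):
    """Sum the hourly energy series of the pumps serving each selected zone.

    pump_energy: dict mapping pump id -> list of hourly energy values.
    pump_to_zone: dict mapping pump id -> zone name (hypothetical mapping).
    selected_zones: set of zone names kept after feature selection.
    """
    zone_energy = {}
    for pump_id, series in pump_energy.items():
        zone = pump_to_zone.get(pump_id)
        if zone not in selected_zones:
            continue  # pump belongs to an excluded zone
        if zone not in zone_energy:
            zone_energy[zone] = list(series)
        else:
            # Element-wise sum with the pumps already seen for this zone.
            zone_energy[zone] = [a + b for a, b in zip(zone_energy[zone], series)]
    return zone_energy


pump_energy = {"p1": [1.0, 2.0], "p2": [0.5, 0.5], "p3": [3.0, 1.0], "p4": [9.0, 9.0]}
pump_to_zone = {"p1": "Baggio", "p2": "Baggio", "p3": "Gorla", "p4": "Excluded"}
energy = aggregate_energy_by_zone(pump_energy, pump_to_zone, {"Baggio", "Gorla"})
```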
The final dataset was composed of 20 time series,
each one representing the 5-day aggregated pumps’
energy consumption of a network zone with an hourly
interval.
3.3 Model Definition
The aim of the paper is to develop machine learning
(ML) models for the aggregated water demand and
pumps’ energy consumption forecasting of 20
different water distribution network zones.
Each time series was normalized with the Min-
Max scaling so that the range of each variable
becomes 0-1. More specifically, given max the
maximum value of a variable, and min its minimum
value, each observation x is transformed according to
this formula:
𝑥𝑚𝑖𝑛
𝑚𝑎𝑥
𝑚𝑖𝑛
(1)
Then, to prepare the datasets for the machine
learning algorithms, each time series was divided into
a training set (first 4 days, 96 hours of observations)
and a test set (last day, 24 hours of observations).
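The scaling and the train/test split can be sketched in a few lines of plain Python. The function names are illustrative; the order of operations (scale the whole series, then split) follows the description above, and the 120-point series is a placeholder for one simulated zone.

```python
def min_max_scale(series):
    """Rescale a series to the 0-1 range using its own minimum and maximum."""
    lowest, highest = min(series), max(series)
    if highest == lowest:
        raise ValueError("constant series cannot be min-max scaled")
    return [(x - lowest) / (highest - lowest) for x in series]


def split_train_test(series, test_hours=24):
    """Keep the last `test_hours` observations as the test set."""
    return series[:-test_hours], series[-test_hours:]


# Placeholder for 5 days of hourly observations (120 points).
series = [float(hour % 24) for hour in range(120)]
scaled = min_max_scale(series)
train, test = split_train_test(scaled)
```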
The proposed forecasting approach is shown
below in Figure 3 and Figure 4.
All machine learning models were implemented
with Darts (version 0.23.1), a Python machine
learning library specific for time series analysis, in
particular for time series forecasting (Herzen et al.,
2022). A key strength of Darts is that it provides
modern machine learning functionalities with a
user-friendly and easy-to-use API design (Herzen et al.,
2022). Furthermore, all deep learning forecasting
models implemented in Darts are global forecasting
models. A global forecasting model has great
potential because it can be trained with multiple time
series and can produce forecasts not only for these
time series but also for unseen series (transfer
learning approach).
Other important information about the training of
global forecasting models is provided in
Section 3.4, together with detailed information on the
architecture of the model used.
Figure 3: Water demand and pumps’ energy consumption
forecasting through a single ML model trained with
flowrates series.
Figure 4: Water demand forecasting for one zone through an
ML model trained with all the flowrates series excluding
the forecasted one.
As depicted by Figure 3 and Figure 4, the water
demand forecasting of each flowrate time series was
addressed with two different methodologies.
The first consists in the creation of a global
forecasting model taking as input all the flowrates
series training sets, and giving 24 hours of forecast
for all of these series (Figure 3). The Darts model
used for forecasting is the N-Beats model. The default
hyperparameters were maintained except for two of
them: the input_chunk_length and the
output_chunk_length. The former specifies the length
of the time series portion taken as input by the internal
neural network and was set to 72, meaning that the
neural network looks 72 hours into the past. The latter
represents the length of the forecast of the model and
was set to 24, meaning that the neural network
produces 24 points of forecast.
The second method consists in the creation of a
global forecasting model for each flowrate series
taking as input all the flowrates series training sets
except for this one, and giving 24 hours of forecast
for this series. This approach exploits the knowledge
gained with the training of some series to forecast
unseen series (transfer learning). Figure 4 represents
an example of this method applied for the forecasting
of a single flowrate series.
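The data organization behind the second method amounts to a leave-one-out loop over the zones. The sketch below shows only how the training collections would be assembled; model fitting is omitted, and the function name and placeholder series are assumptions for illustration.

```python
def leave_one_out_splits(series_by_zone):
    """Yield (held_out_zone, training_series) pairs, one per zone.

    The training collection contains every series except the held-out one,
    which is then forecast by the model trained on the others.
    """
    for held_out in series_by_zone:
        training = {
            zone: series
            for zone, series in series_by_zone.items()
            if zone != held_out
        }
        yield held_out, training


zones = [
    "Anfossi", "Armi", "Assiano", "Baggio", "Cantore", "Chiusabella",
    "Cimabue", "Comasina", "Crescenzago", "Feltre", "Gorla", "Italia",
    "Lambro", "Novara", "Padova", "S. Siro", "Salemi", "Suzzani nord",
    "Suzzani sud", "Vialba",
]
# Placeholder series: 96 training hours per zone.
series_by_zone = {zone: [0.0] * 96 for zone in zones}
splits = list(leave_one_out_splits(series_by_zone))
```

With 20 zones this produces 20 models, each trained on 19 series.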
A transfer learning approach was also applied to
pumps’ energy consumption
forecasting. The energy consumption of pumps
strictly depends on the flowrate pattern. The WNTR
simulator calculates it considering the pump flowrate,
the node head, and the global efficiency of the pump
(set to 75%). Therefore, each pump’s energy
consumption time series has almost the same pattern
as the flowrate time series in the same network zone.
Instead of creating an additional model, the first
global model trained with all the flowrates time series
was tested on the energy consumption forecasting of
each zone (see Figure 3).
3.4 N-Beats Global Forecasting Model
This section provides insights into the model used
(N-Beats) and how the training of global models works
in the Darts library.
Recently, the authors of (Oreshkin et al., 2020)
proposed a neural network architecture designed for
time series forecasting called N-Beats (Neural basis
expansion analysis for interpretable time series
forecasting). In the following, the architecture of the
model is described, as shown in Figure 5; more
details can be found in (Oreshkin et al., 2020).
Given a forecast horizon (or forecast period) of
length H and an observed series history (or lookback
period) of length T (where T = n × H), the model takes
as input the lookback period to learn the behavior of
the time series, and it predicts the behavior of the
same time series in the forecast period (upper part of
Figure 5). There are different stacks (right part of
Figure 5), and at the end, the outputs of the stacks are
combined to obtain a global forecasting output. In
each stack, there are multiple blocks (middle part of
Figure 5), and each block has a fully connected stack
with 4 layers that does both forecasting and backcasting
(left part of Figure 5). The difference between
forecasting and backcasting is the direction of
predictions: the former predicts future values by
looking back at historical data, and the latter
extrapolates past values from future data (forecasting
backward in time). Non-linearities are
provided by the ReLU (rectified linear unit) activation
function. Activation functions are important because they
introduce non-linearities in the network, which allow it
to learn complex patterns in the data and thus help in
the resolution of real-world problems. Different
activation functions can be used in the
network, but ReLU is the
most popular because it is simple and fast (Nair et al.,
2010).
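The way blocks cooperate can be illustrated with a toy sketch. It follows the doubly residual scheme of (Oreshkin et al., 2020) as commonly described: each block's backcast is subtracted from the input passed to the next block, and the block forecasts are summed. The blocks here are hand-written placeholder functions, not trained fully connected layers, and all names are hypothetical.

```python
def run_blocks(blocks, lookback, horizon):
    """Combine blocks: subtract backcasts from the input, sum the forecasts."""
    residual = list(lookback)
    forecast = [0.0] * horizon
    for block in blocks:
        backcast, block_forecast = block(residual)
        # The next block only sees what this block could not explain.
        residual = [r - b for r, b in zip(residual, backcast)]
        # Partial forecasts are added up to form the final forecast.
        forecast = [f + bf for f, bf in zip(forecast, block_forecast)]
    return residual, forecast


def make_mean_block(horizon):
    """Toy block: explains its input by the mean and forecasts that mean."""
    def block(inputs):
        level = sum(inputs) / len(inputs)
        return [level] * len(inputs), [level] * horizon
    return block


H = 2
blocks = [make_mean_block(H), make_mean_block(H)]
residual, forecast = run_blocks(blocks, [1.0, 2.0, 3.0], H)
```

In this toy case the first block captures the level of the series, so the second block receives a zero-mean residual and contributes nothing to the forecast.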
Figure 5: N-Beats architecture.
This model has been used through the Darts
Python library. Darts implements many
forecasting models, but only a subset of them can be
trained with multiple time series (N-Beats among
them). These models are called global forecasting
models. Time series can be divided into two classes:
target time series (series to be forecasted) and
covariate time series (series not to be forecasted but
to be taken into consideration to help target time
series forecasting). The present paper does not
consider covariates, therefore the models
take as input only target series. This choice is based on the
fact that water demand history is the main factor
influencing future demand, therefore it is sufficient
for developing accurate models (Hu et al., 2021),
(Bakker et al., 2013).
When a model with multiple target time series
needs to be trained (as in our case), Darts creates a
dataset aggregating multiple input/output pairs from
the provided time series. The length of the input is
equal to the input_chunk_length hyperparameter,
while the length of the output depends on the
output_chunk_length hyperparameter.
Figure 6 shows the training phase of a model with
two example series of different lengths and different
time stamps in input.
In this example, the input_chunk_length is equal
to 4, while the output_chunk_length is equal to 2. The
number of samples used for training is calculated by
subtracting from the time series length the sum of the
input_chunk_length and output_chunk_length, and
adding 1 to this result. Therefore, the first series has a
number of samples used for training equal to 9, while
the number of samples of the second series is 7. A
training epoch in multiple series models consists of
the complete pass over all the samples of all the
series. Finally, the most important things to point out
are that series do not need to have the same length,
the same time stamps, or the same frequency
(although this is not our case).
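The sample count above can be checked with a short sliding-window sketch. It reproduces the formula stated in the text rather than calling Darts itself; the series lengths 14 and 12 are inferred from the reported counts (9 and 7) and are an assumption about Figure 6.

```python
def count_training_samples(series_length, input_chunk_length, output_chunk_length):
    """Number of input/output pairs: length - (input + output) + 1."""
    window = input_chunk_length + output_chunk_length
    return max(series_length - window + 1, 0)


def sliding_windows(series, input_chunk_length, output_chunk_length):
    """Build every (input, output) pair obtainable from one series."""
    window = input_chunk_length + output_chunk_length
    return [
        (
            series[start:start + input_chunk_length],
            series[start + input_chunk_length:start + window],
        )
        for start in range(len(series) - window + 1)
    ]


first_series = list(range(14))   # assumed length giving 9 samples
second_series = list(range(12))  # assumed length giving 7 samples
pairs = sliding_windows(first_series, 4, 2)
samples_first = count_training_samples(len(first_series), 4, 2)
samples_second = count_training_samples(len(second_series), 4, 2)
```

Applying the same formula to the settings of Section 3.3 (96 training hours, input_chunk_length 72, output_chunk_length 24) gives 96 - 96 + 1 = 1 sample per series, assuming the windows are built exactly as described here.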
Figure 6: Training of global models in Darts library.
3.5 Performance Metrics
Different metrics were considered for the
performance evaluation of forecasting models: mean
absolute error (MAE), symmetric mean absolute
percentage error (SMAPE), mean squared error
(MSE), and root mean square error (RMSE).
MAE is a measure of error between predicted and
true values, and it is calculated as an arithmetic
average of the absolute errors (Hyndman et al., 2006).
SMAPE is a measure of accuracy based on
relative errors, therefore it is a percentage value
(Hyndman et al., 2006), (Bandara et al., 2021).
MSE is a measure indicating the average squared
difference between predicted values and actual values
(Hyndman et al., 2006).
RMSE is calculated as the square root of the mean
of the square of all the errors (Hyndman et al., 2006).
For all of these measures, the lower the values, the
better the model performance.
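The four metrics can be written compactly as below. The SMAPE variant shown (absolute error divided by the mean of the absolute actual and predicted values, expressed as a percentage) is one common definition; the paper does not state which variant was used, so this is an assumption.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(mse(actual, predicted))


def smape(actual, predicted):
    """Symmetric mean absolute percentage error (assumed variant, 0-200%)."""
    total = 0.0
    for a, p in zip(actual, predicted):
        denominator = abs(a) + abs(p)
        # A pair of zeros contributes no error.
        if denominator > 0:
            total += 2 * abs(a - p) / denominator
    return 100 * total / len(actual)


actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 2.0]
```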
4 RESULTS
In this work, innovative machine learning-based
methodologies are proposed to develop short-term
water demand and energy forecasting models. This
task was addressed with the use of global
forecasting models and transfer learning approaches.
First of all, four different global models were
tested to select the one that provided the most
accurate water demand forecasting. Among the
models available within the Darts Python library, the
ones tested are: N-Beats, RNN, BlockRNN,
and Transformer.
Performance metrics are reported in Table 3.
Table 3: Performance metrics of N-Beats, RNN,
BlockRNN, and Transformer global models.
Metric   N-Beats   RNN      BlockRNN   Transformer
MAE      0.031     0.188    0.23       0.16
SMAPE    21.322    52.15    56.174     45.762
MSE      0.003     0.06     0.075      0.043
RMSE     0.043     0.237    0.265      0.194
Among the models used, the N-Beats model
outperforms the others according to all metrics
considered, confirming its superiority in terms of
forecasting accuracy. The other algorithms obtained
results similar to one another, and the following
ranking was obtained in decreasing order of
performance: N-Beats, Transformer, RNN,
BlockRNN. Taking the SMAPE
metric as an example, the value obtained with the N-Beats
model is less than half
that of all the other models. The
better performance of this model may be attributed to its
ability to do both backcasting and forecasting, which
is a property that greatly differentiates it from the
other algorithms, as described in Section 3.4.
The performance metrics of the N-Beats global
forecasting model (Table 3, first column) have been
compared to the ones obtained from
single models trained with one series at a time; in this
case, the following values have been achieved:
MAE=0.030, SMAPE=21.839, MSE=0.003, RMSE=0.043.
The results are almost identical.
This may have
arisen because a restricted amount of data was
used; a global forecasting model trained
with a large amount of real data over a longer period of time may
benefit from learning from more patterns at the same
time. Indeed, related time series could improve the
overall predictions with respect to the result obtained
with a collection of local models (Hewamalage et al.,
2022). Moreover, the time spent training one
model with multiple time series was lower than the
total time needed to train a model for each series (36
seconds and 476 seconds, respectively). This could be
justified considering that the model complexity of
local models grows proportionally to the number of
time series in the dataset, and it can be higher than the
constant complexity of the global model
(Hewamalage et al., 2022). As said in Section 1, in
the Water 4.0 era, water distribution systems are
characterized by increased integration of sensors,
Internet of Things, and big data. As a result, much
more data can be collected, analyzed, and exploited
in the decision-making and planning phases.
However, having access to all this information also
means learning how to manage it properly. The
results obtained suggest that global
models meet these needs, as forecasting can be
performed by training a single model on multiple
series in less time than creating multiple local
models.
Pumps’ energy consumption patterns strictly
depend on the water demand of the respective zone.
For this reason, the previously created model trained
with water demand time series has been used
for pumps’ energy consumption forecasting.
The performance metrics demonstrate the
effectiveness of the approach (MAE: 0.032; SMAPE:
21.305; MSE: 0.003; RMSE: 0.043).
Finally, the capacity of global forecasting models
to forecast previously unseen series was explored.
For each flowrate time series, a model was trained
on all the other series and tested on the excluded
one. Of the 20 forecasts, 18 produced very good
results, demonstrating the ability of the models to
generalize well (MAE: 0.079; SMAPE: 25.141; MSE:
0.012; RMSE: 0.1), while the other two were
considered unacceptable (MAE: 0.203; SMAPE: 97.746;
MSE: 0.057; RMSE: 0.238). The worse performance of
transfer learning on these two series may be
attributed to their patterns being too different from
those of the series on which the model was trained.
This suggests that a global model trained on series
that are as heterogeneous as possible may perform
better in transfer learning.
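The evaluation protocol described above (train on all series but one, then forecast the excluded one) can be sketched as follows. The forecaster used here is a deliberately simple stand-in, an average daily shape rescaled to the level of the unseen series, and not the N-Beats model of the study; the zone names and the synthetic sinusoidal data are likewise hypothetical.

```python
import math
from typing import Dict, List

HORIZON = 24  # 24-h forecast horizon, as in the study


def mean(values: List[float]) -> float:
    return sum(values) / len(values)


def fit_daily_shape(train_series: List[List[float]]) -> List[float]:
    """Stand-in for a global model: average daily shape of the training series.

    Each series is divided by its own mean so that only the shape is pooled.
    Assumes every series starts at hour 0 and has a positive mean.
    """
    sums = [0.0] * 24
    counts = [0] * 24
    for series in train_series:
        level = mean(series)
        for t, value in enumerate(series):
            sums[t % 24] += value / level
            counts[t % 24] += 1
    return [s / c for s, c in zip(sums, counts)]


def forecast_unseen(shape: List[float], history: List[float]) -> List[float]:
    """Apply the pooled shape to a new series, rescaled to that series' level."""
    level = mean(history)
    start = len(history)
    return [level * shape[(start + h) % 24] for h in range(HORIZON)]


def leave_one_series_out(all_series: Dict[str, List[float]]) -> Dict[str, float]:
    """For each series: fit on all the others, forecast its last 24 h, return MAE."""
    scores: Dict[str, float] = {}
    for held_out, values in all_series.items():
        train = [v for name, v in all_series.items() if name != held_out]
        shape = fit_daily_shape(train)
        history, target = values[:-HORIZON], values[-HORIZON:]
        prediction = forecast_unseen(shape, history)
        scores[held_out] = mean([abs(a - p) for a, p in zip(target, prediction)])
    return scores


def toy_zone(level: float, phase_hours: float, days: int = 4) -> List[float]:
    """Synthetic hourly demand with a daily cycle (illustrative only)."""
    return [level * (1 + 0.5 * math.sin(2 * math.pi * (h - phase_hours) / 24))
            for h in range(24 * days)]


# Three zones with similar daily patterns and one (zone_d) with an opposite pattern.
zones = {
    "zone_a": toy_zone(1.0, 0),
    "zone_b": toy_zone(1.2, 1),
    "zone_c": toy_zone(0.8, 0),
    "zone_d": toy_zone(1.0, 12),
}
scores = leave_one_series_out(zones)
```

In this toy setting the series whose daily pattern is opposite to the others (zone_d) obtains the largest error when held out, which mirrors qualitatively the behaviour reported for the two poorly forecast series.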
A few examples comparing the forecasts of the
global forecasting models with those of the transfer
learning models are reported in Figure 7, Figure 8,
and Figure 9.
Finally, an essential step in the water and
energy forecasting approach would be to compare the
results of this study with similar research in the
literature. However, no comparable research was
found, as the use of global models, the N-Beats
algorithm, and transfer learning techniques is
being explored for the first time in the water
distribution sector.
Figure 7: Water demand forecasting for the zone named
Cimabue with the global forecasting model and the pre-
trained model.
Figure 8: Water demand forecasting for the zone named
Comasina with the global forecasting model and the pre-
trained model.
DATA 2023 - 12th International Conference on Data Science, Technology and Applications
Figure 9: Water demand forecasting for the zone named
Feltre with the global forecasting model and the pre-trained
model.
5 CONCLUSIONS
In this study, the authors developed short-term
water demand and pumps’ energy consumption
forecasting models using simulated data from the
Milano water distribution network. In particular,
hourly data were used to make 24-h horizon forecasts.
To the best of the authors' knowledge, the
forecasting approaches proposed differ from
previously published studies in several respects.
First of all, both water and energy forecasts are
investigated together for the same water distribution
network.
Then, it is the first time that global models are
used in the water sector, and this has made it possible
to quickly create a single, general model able to
generalize to unseen time series (transfer learning).
Finally, although N-Beats had never been used before
for water demand and pumps’ energy consumption
forecasting of an entire water distribution system, the
results achieved by the authors showed that it
offered the best performance; on account of these
results, it seems well suited to this field.
In future studies, the authors plan to test this
methodology with real data covering a longer period
and to create more complex models able to capture
weekday, weekend, and yearly patterns, or trends if
present. Furthermore, mid-term and long-term
forecasts could also be developed.
ACKNOWLEDGEMENTS
The research results presented in this paper have been
achieved within the Water 4.0 project, named
“Technologies for the convergence between industry
4.0 and the integrated water cycle”. This research
project is ongoing and is funded by the
Ministry of Enterprises and Made in Italy
(https://www.mimit.gov.it/en/).
REFERENCES
Adedeji, K. B., Ponnle, A. A., Abu-Mahfouz, A. M.,
Kurien, A. M. (2022). Towards Digitalization of Water
Supply Systems for Sustainable Smart City
Development—Water 4.0. Applied Sciences, 12, 9174.
MDPI. https://doi.org/10.3390/app12189174
Alhendi, A. A., Al-Sumaiti, A. S., Elmay, F. K., Wescaot,
J., Kavousi-Fard, A., Heydarian-Forushani, E.,
Alhelou, H. H. (2022). Artificial intelligence for water–
energy nexus demand forecasting: a review.
International Journal of Low-Carbon Technologies, 17,
730–744. Oxford University Press.
https://doi.org/10.1093/ijlct/ctac043
Bakker, M., Vreeburg, J. H. G., van Schagen, K. M.,
Rietveld, L. C. (2013). A fully adaptive forecasting
model for short-term drinking water demand.
Environmental Modelling & Software, 48, 141-151.
ELSEVIER. https://doi.org/10.1016/j.envsoft.2013.06.012
Bandara, K., Hewamalage, H., Liu, Y.-H., Kang, Y.,
Bergmeir, C. (2021). Improving the accuracy of global
forecasting models using time series data augmentation.
Pattern Recognition, 120, 108148. ELSEVIER.
https://doi.org/10.1016/j.patcog.2021.108148
Bolognesi, A., Bragalli, C., Lenzi, C., Artina, S. (2014).
Energy efficiency optimization in water distribution
systems. Procedia Engineering, 70, 181-190.
ELSEVIER. https://doi.org/10.1016/j.proeng.2014.02.021
Candelieri, A., Giordani, I., Archetti, F., Barkalov, K.,
Meyerov, I., Polovinkin, A., Sysoyev, A., & Zolotykh,
N. (2019). Tuning hyperparameters of a SVM-based
water demand forecasting system through parallel
global optimization. Computers & Operations
Research, 106, 202-209. ELSEVIER.
https://doi.org/10.1016/j.cor.2018.01.013
de Souza Groppo, G., Costa, M. A., Libânio, M. (2019).
Predicting water demand: a review of the methods
employed and future possibilities. Water Supply, 19(8),
2179-2198. IWA PUBLISHING.
https://doi.org/10.2166/ws.2019.122
Ebrahim Banihabib, M., Mousavi-Mirkalaei, P. (2019).
Extended linear and non-linear auto-regressive models
for forecasting the urban water consumption of a fast-
growing city in an arid region. Sustainable Cities and
Society, 48, 101585. ELSEVIER.
https://doi.org/10.1016/j.scs.2019.101585
Esen, Ö., Yıldırım, D. Ç., Yıldırım, S. (2020). Threshold
effects of economic growth on water stress in the
Eurozone. Environmental Science and Pollution
Research, 27, 31427–31438. Springer.
https://doi.org/10.1007/s11356-020-09383-y
Gan, X., Pei, J., Pavesi, G., Yuan, S., Wang, W. (2022).
Application of intelligent methods in energy efficiency
enhancement of pump system: A review. Energy
Reports, 8, 11592–11606. ELSEVIER.
https://doi.org/10.1016/j.egyr.2022.09.016
Hans, A., Bharat, D. A. (2014). Water as a Resource:
Different Perspectives in Literature. International
Journal of Engineering Research, 3(10). IJERT. DOI:
10.17577/IJERTV3IS100054.
Herzen, J., Lässig, F., Piazzetta, S. G., Neuer, T., Tafti, L.,
Raille, G., Van Pottelbergh, T., Pasieka, M., Skrodzki,
A., Huguenin, N., Dumonal, M., Kościsz, J., Bader, D.,
Gusset, F., Benheddi, M., Williamson, C., Kosinski,
M., Petrik, M., Grosch, G. (2023). Darts: User-friendly
modern machine learning for time series. The Journal
of Machine Learning Research, 23, 1-6. arXiv.
https://doi.org/10.48550/arXiv.2110.03224
Hewamalage, H., Bergmeir, C., Bandara, K. (2022). Global
models for time series forecasting: A Simulation study.
Pattern Recognition, 124, 108441. ELSEVIER.
https://doi.org/10.1016/j.patcog.2021.108441
Hu, S., Gao, J., Zhong, D., Deng, L., Ou, C., Xin, P. (2021).
An Innovative Hourly Water Demand Forecasting
Preprocessing Framework with Local Outlier
Correction and Adaptive Decomposition Techniques.
Water, 13(5), 582. MDPI.
https://doi.org/10.3390/w13050582
Hussain, Z., Wang, Z., Wang, J., Yang, H., Arfan, M.,
Hassan, D., Wang, W., Azam, M. I., Faisal, M. (2022).
A comparative Appraisal of Classical and Holistic
Water Scarcity Indicators. Water Resources
Management, 36, 931-950. Springer.
https://doi.org/10.1007/s11269-022-03061-z
Hyndman, R. J., Koehler, A. B. (2006). Another look at
measures of forecast accuracy. International Journal of
Forecasting, 22(4), 679-688. ELSEVIER.
https://doi.org/10.1016/j.ijforecast.2006.03.001
Karamaziotis, P. I., Raptis, A., Nikolopoulos, K., Litsiou,
K., Assimakopoulos, V. (2020). An empirical
investigation of water consumption forecasting
methods. International Journal of Forecasting, 36(2),
588-606. ELSEVIER.
https://doi.org/10.1016/j.ijforecast.2019.07.009
Klise, K., Hart, D., Bynum, M., Hogge, J., Haxton, T.,
Murray, R., Burkhardt, J. (2020). Water Network Tool
for Resilience (WNTR). User Manual, Version 0.2.3.
EPA. https://doi.org/10.2172/1660790
Kofinas, D., Papageorgiou, E., Laspidou, C., Mellios, N.,
Kokkinos, K. (2016). Daily Multivariate Forecasting of
Water Demand in a Touristic Island with the Use of
Artificial Neural Network and Adaptive Neuro-Fuzzy
Inference System. 2016 International Workshop on
Cyber-Physical Systems for Smart Water Networks
(CySWater), Vienna, Austria. IEEE.
https://doi.org/10.1109/CySWater.2016.7469061
Leitão, J., Simões, N., Marques, J. A., Gil, P., Ribeiro,
B., Cardoso, A. (2019). Detecting urban water
consumption patterns: a time-series clustering
approach. Water Supply, 19(8), 2323-2329. IWA
PUBLISHING. https://doi.org/10.2166/ws.2019.113
Mair, M., Rauch, W., Sitzenfrei, R. (2014). Improving
incomplete water distribution system data. 12th
International Conference on Computing and Control
for the Water Industry (CCWI2013), Procedia
Engineering 70 (2014) 1055 – 1062.
Mesalie, R. A., Aklog, D., Kifelew, M. S. (2021). Failure
assessment for drinking water distribution system in the
case of Bahir Dar institute of technology, Ethiopia.
Applied Water Science, 11, 138. Springer.
https://doi.org/10.1007/s13201-021-01465-7
Montero-Manso, P., Hyndman, R. J. (2021). Principles and
Algorithms for Forecasting Groups of Time Series:
Locality and Globality. International Journal of
Forecasting, 37, 1632-1653. arXiv.
https://doi.org/10.1016/j.ijforecast.2021.03.004
Nair, V., Hinton, G. E. (2010). Rectified Linear Units
Improve Restricted Boltzmann Machines. Proceedings
of the 27th International Conference on Machine
Learning, Haifa, Israel, 2010. ACM.
Niknam, A., Zare, H. K., Hosseininasab, H., Mostafaeipour,
A., Herrera, M. (2022). A Critical Review of Short-
Term Water Demand Forecasting Tools—What
Method Should I Use? Sustainability, 14(9), 5412.
MDPI. https://doi.org/10.3390/su14095412
Oreshkin, B. N., Carpov, D., Chapados, N., Bengio, Y.
(2020). N-BEATS: Neural basis expansion analysis
for interpretable time series forecasting.
(arXiv:1905.10437). arXiv.
https://doi.org/10.48550/arXiv.1905.10437
Ristow, D. C. M., Henning, E., Kalbusch, A., Petersen, C.
E. (2021). Models for forecasting water demand using
time series analysis: A case study in Southern Brazil.
Journal of Water, Sanitation and Hygiene for
Development, 11(2), 231-240. IWA PUBLISHING.
https://doi.org/10.2166/washdev.2021.208
Salloom, T., Kaynak, O., He, W. (2021). A novel deep
neural network architecture for real-time water demand
forecasting. Journal of Hydrology, 599, 126353.
ELSEVIER. https://doi.org/10.1016/j.jhydrol.2021.126353
Sarmas, E., Spiliotis, E., Marinakis, V., Tzanes, G.,
Kaldellis, J. K., Doukas, H. (2022). ML-based energy
management of water pumping systems for the
application of peak shaving in small-scale islands.
Sustainable Cities and Society, 82, 103873.
ELSEVIER. https://doi.org/10.1016/j.scs.2022.103873
Shabani, S., Yousefi, P., Naser, G. (2017). Support Vector
Machines in Urban Water Demand Forecasting Using
Phase Space Reconstruction. Procedia Engineering,
186, 537–543. ELSEVIER.
https://doi.org/10.1016/j.proeng.2017.03.267
Stańczyk, J., Kajewska-Szkudlarek, J., Lipiński, P.,
Rychlikowski, P. (2022). Improving short-term water
demand forecasting using evolutionary algorithms.
Scientific Reports, 12, 13522. Nature publishing group.
https://doi.org/10.1038/s41598-022-17177-0
Yi, S., Kondolf, G. M., Sandoval-Solis, S., Dale, L. (2022).
Application of Machine Learning-based Energy Use
Forecasting for Inter-basin Water Transfer Project.
Water Resources Management, 36, 5675–5694.
Springer. https://doi.org/10.1007/s11269-022-03326-7.