Multi-Step Simulation Improvement for Time Series Using Exogenous
State Variables
Esmaeel Mohammadi^{1,2,a}, Daniel Ortiz-Arroyo^{3,b}, Mikkel Stokholm-Bjerregaard^{1,c} and Petar Durdevic^{3,d}
1 Krüger A/S, Indkildevej 6C, Aalborg, 9210, North Jutland, Denmark
2 Department of Chemistry and Bioscience, Aalborg University, Fredrik Bajers Vej 7H, Aalborg, 9220, North Jutland, Denmark
3 AAU Energy, Aalborg University, Niels Bohrs vej 8, Esbjerg, 6700, South Jutland, Denmark
a https://orcid.org/0000-0001-7109-7944  b https://orcid.org/0000-0002-1297-3702  c https://orcid.org/0000-0002-3714-5137  d https://orcid.org/0000-0003-2701-9257
Keywords:
Deep Reinforcement Learning, Dynamic Model, Simulator, LSTM, Exogenous, Phosphorus.
Abstract:
Accurate simulation of wastewater treatment systems is essential for optimizing control strategies and ensuring
efficient operation. This study focuses on enhancing the predictive accuracy of a Long Short-Term Memory
(LSTM)-based simulator by incorporating exogenous state variables, such as temperature, flow, and process
phases, that are independent of output and control variables. The experimental results demonstrate that in-
cluding these variables significantly reduces prediction errors, measured by Mean Squared Errors (MSE) and
Dynamic Time Warping (DTW) metrics. The improved model, particularly the version that uses actual val-
ues of exogenous state variables at each simulation step, showed robust performance across different seasons,
reducing MSE by 55% and DTW by 34% compared to the model that did not include exogenous state variables. This approach addresses the compounding error issue in multi-step simulations, leading to more reliable predictions and enhanced operational efficiency in wastewater treatment.
Glossary of Terms and Acronyms
LSTM: Long Short-Term Memory
MSE: Mean Squared Error
DTW: Dynamic Time Warping
WWTP: Wastewater Treatment Plant
DAD: DATA AS DEMONSTRATOR
1 INTRODUCTION
Accurate simulation of wastewater treatment systems' behavior is essential for optimizing control strategies and ensuring efficient operation. Traditional simulation models, based on mathematical and statistical approaches, often face challenges in prediction accuracy due to the non-linear, stochastic, and non-stationary nature of system dynamics (Gujer et al., 1995; Hansen et al., 2024; Hansen et al., 2022). Factors such as varying influent characteristics and operational conditions add complexity, making accurate predictions challenging (Gujer et al., 1995; Mohammadi et al., 2024c; Hansen et al., 2022).
Recent advancements in deep learning, particu-
larly Long Short-Term Memory (LSTM) networks
(Hochreiter and Schmidhuber, 1997a), have shown
promise in modeling complex time series data (Mo-
hammadi et al., 2024c; Hansen et al., 2022). LSTMs
can learn long-term dependencies, making them suit-
able for predicting the behavior of complex systems
like wastewater treatment plants. However, one sig-
nificant challenge in developing a multi-step simula-
tion environment using deep learning models is the
accumulation of errors, known as compounding er-
rors (Mohammadi et al., 2024c; Mohammadi et al.,
2024a).
Previous work by Gao et al. (2023) incorporated exogenous variables within LSTM models for
multivariate time series prediction, focusing on using
Neural ODEs to enhance prediction smoothness and
interpretability. However, their approach did not di-
rectly address the critical issue of compounding errors
Mohammadi, E., Ortiz-Arroyo, D., Stokholm-Bjerregaard, M. and Durdevic, P.
Multi-Step Simulation Improvement for Time Series Using Exogenous State Variables.
DOI: 10.5220/0012927300003822
Paper published under CC license (CC BY-NC-ND 4.0)
In Proceedings of the 21st International Conference on Informatics in Control, Automation and Robotics (ICINCO 2024) - Volume 1, pages 651-659
ISBN: 978-989-758-717-7; ISSN: 2184-2809
Proceedings Copyright © 2024 by SCITEPRESS Science and Technology Publications, Lda.
in multi-step forecasting, which is particularly rele-
vant in dynamic systems like wastewater treatment.
This study differentiates itself by specifically tar-
geting the reduction of compounding errors in multi-
step simulations by strategically including exogenous
variables. Building on the iterative improvement
method proposed in (Mohammadi et al., 2024a), our
approach integrates exogenous variables directly into
the model’s predictive framework. Unlike previous
studies, we explicitly evaluate and compare scenar-
ios with and without exogenous variables, quantifying
their impact on prediction accuracy and error propa-
gation over time.
An improved LSTM-based simulator was devel-
oped to train deep reinforcement learning algorithms
for wastewater treatment, focusing on the phospho-
rus removal process. The LSTM model was used as a
benchmark to investigate the improvement of the sim-
ulation using the proposed method. The core contri-
bution is incorporating the exogenous state variables
into the LSTM model and using their actual values at
each step of the multi-step simulation. This method
aims to enhance prediction accuracy by preventing
the accumulation of prediction errors over multiple
steps. The experimental results demonstrated that incorporating the exogenous variables reduces the simulation error by 55% in terms of mean squared error (MSE) and by 34% in terms of dynamic time warping (DTW).
1.1 Contributions
This paper introduces several key contributions to the
field of time series prediction and multi-step simula-
tion:
- An Improved Multi-step Forecasting Model: We incorporate exogenous variables like temperature, flow, and process phases, enhancing the accuracy of long-term predictions by reducing compounding errors in multi-step simulations.
- Extensive Experimental Validation: Our method, validated with real-world wastewater treatment data, shows significant improvements over state-of-the-art models in MSE and DTW metrics.
- Improved Simulation for Reinforcement Learning: The enhanced LSTM-based simulator improves the training of deep reinforcement learning algorithms, enabling more effective control strategies in wastewater treatment processes.
2 METHODS
This section provides an overview of our research
data and methods. It details the wastewater treat-
ment plant data, including exogenous state variables,
and presents a mathematical proof of reduced mul-
tistep error with these variables. Finally, it outlines
the LSTM model structure, training procedure, exper-
imental design, and the hardware and software used.
2.1 The Plant and Dataset
This study focuses on the data from Kolding Cen-
tral WWTP in Agtrup, Denmark. The time-series
dataset for two years was collected through the Hubgrade™ Performance Plant system, designed by Krüger/Veolia (Krüger A/S, 2023). Data preprocessing played a crucial role in enhancing model performance. The raw wastewater treatment data was initially normalized using the Min-Max technique, scaling the features from 0 to 1. Feature selection was
guided by principal component and correlation anal-
ysis. Variables of the system that demonstrated the
highest correlation with the target variable, Phosphate
concentration, were selected as inputs for the model,
in conjunction with the target variable itself and the
action variable, Metal dosage. Details about the plant,
dataset, and preprocessing are provided in (Moham-
madi et al., 2024b; Mohammadi et al., 2024c).
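The preprocessing steps above can be sketched as follows; the toy data, the column roles, and the 0.5 correlation threshold are illustrative assumptions, not values reported for the plant dataset.

```python
import numpy as np

def min_max_scale(x: np.ndarray) -> np.ndarray:
    """Scale each column of x to the [0, 1] range (Min-Max technique)."""
    x_min, x_max = x.min(axis=0), x.max(axis=0)
    return (x - x_min) / (x_max - x_min)

def select_by_correlation(features: np.ndarray, target: np.ndarray,
                          threshold: float = 0.5) -> np.ndarray:
    """Return indices of feature columns whose absolute Pearson
    correlation with the target exceeds the threshold."""
    corrs = np.array([np.corrcoef(features[:, j], target)[0, 1]
                      for j in range(features.shape[1])])
    return np.where(np.abs(corrs) > threshold)[0]

# Toy data standing in for the plant measurements.
rng = np.random.default_rng(0)
raw = rng.normal(size=(100, 4))
scaled = min_max_scale(raw)
target = raw[:, 0] + 0.1 * rng.normal(size=100)  # correlated with column 0
selected = select_by_correlation(raw, target)
```

In the study, the selected features (those most correlated with the phosphate concentration) are combined with the target and the action variable to form the model input.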
2.2 Incorporation of Exogenous State
Variables
Different variables in the plant and their impact on the
accuracy of the phosphorus removal process simula-
tion were investigated. To do this, two dataset varia-
tions, named IOPPo and IOPTQCfFiFoPo, were cre-
ated, each incorporating a unique combination of ex-
ogenous and state variables (detailed in Table 1). The
primary distinction between these datasets is the in-
clusion or exclusion of the exogenous state variables.
This approach analyzed how including the exogenous
state variables affects the simulation’s accuracy.
In control theory, the state-space equations consist
of two main components: the state and output. The
state equation is given by (Hespanha, 2018; Goodwin
et al., 2001):
x(t + 1) = Ax(t) + Bu(t) + Ew(t) (1)
In this equation, x(t) represents the state vector,
which encapsulates the internal condition of the sys-
tem at time t. The matrix A is known as the state ma-
trix and defines the dynamics of the state vector x(t)
Table 1: Notations for the variables used in datasets. The type describes a control variable (C), exogenous variable (E), and objective variable (O) (Mohammadi et al., 2024b).

Symbol  Type  Description                                  Unit
I       C     Flow of the iron to the biology tanks        L/h
O       C     Dissolved oxygen                             mg/L
P       C     Flow of the PAX to the settler               L/h
T       E     Temperature of the biology tank              °C
Q       E     Flow of the wastewater to the biology tank   m³/h
Cf      E     Maximum critical function percentage         %
Fi      E     Process phase at the inlet (tank 1 or 2)     -
Fo      E     Process phase at the outlet (tank 1 or 2)    -
Po      O     Phosphate concentration in the biology tank  mg/L
without any external input. The vector u(t) is the con-
trol input, representing the actions taken to control the
system, and B is the input matrix describing how the
control inputs affect the state vector. The term Ew(t)
includes the exogenous state variables w(t), which are
external disturbances or inputs that affect the system,
with E being the matrix that shows how these distur-
bances impact the state vector. The output equation is
given by:
y(t) = Cx(t) + Dν(t) (2)
Here, y(t) is the output vector, representing the
measurable outputs of the system. The matrix C is
the output matrix, which maps the state vector x(t) to
the output y(t). The vector ν(t), which is the output
noise, also influences the output through the feedfor-
ward matrix D.
Assume a system where the state, control, and exogenous state variables at each time step t are represented by x(t) ∈ R^{n_x}, w(t) ∈ R^{n_w}, and u(t) ∈ R^{n_u}. The state vector x(t) can be partitioned into two components:

x(t) = [x_c(t); x_e(t)]  (3)
where x_c(t) ∈ R^{n_c} are the objective, or controllable, state variables affected by the control inputs u(t). Moreover, x_e(t) ∈ R^{n_e} are the exogenous state variables, not affected by u(t) but affecting the objective variables. The state-space equations considering exogenous state variables can be written as:

x_c(t+1) = A_c x_c(t) + B_c u(t) + E_c x_e(t)
x_e(t+1) = A_e x_e(t) + w(t)  (4)
In these equations, A_c ∈ R^{n_c×n_c} is the state matrix for the objective variables, defining their dynamics. The matrix B_c ∈ R^{n_c×n_u} is the input matrix describing how the control inputs affect the objective variables. The term E_c ∈ R^{n_c×n_e} represents the influence of the exogenous state variables on the objective variables. For the exogenous state variables, A_e ∈ R^{n_e×n_e} is the state matrix that defines their dynamics. Lastly, w(t) ∈ R^{n_e} represents the external disturbances that impact the exogenous state variables.
The output equation considering the effect of exogenous state variables is:

y(t) = C_c x_c(t) + C_e x_e(t) + D u(t)  (5)

where y(t) ∈ R^{n_y} represents the output vector, which contains the objective variables. The matrix C_c ∈ R^{n_y×n_c} maps the objective variables to the output, while C_e ∈ R^{n_y×n_e} maps the exogenous state variables to the output. Additionally, D ∈ R^{n_y×n_u} is the feedforward matrix that describes the direct influence of the control inputs on the output. Finally, the exogenous state variables are those components of the state vector that influence the objective variables but are not influenced by the control inputs. They capture the effects of external disturbances and are crucial for accurately modeling and controlling complex systems.
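As a concrete illustration of Equations 4 and 5, the sketch below simulates a toy system with one controllable state, one exogenous state, and one input; all matrix entries are arbitrary assumptions chosen for illustration.

```python
import numpy as np

# Toy dimensions: one controllable state, one exogenous state, one input.
A_c, B_c, E_c = np.array([[0.9]]), np.array([[0.1]]), np.array([[0.2]])
A_e = np.array([[0.95]])
C_c, C_e, D = np.array([[1.0]]), np.array([[0.5]]), np.array([[0.0]])

def step(x_c, x_e, u, w):
    """One step of the partitioned state-space model (Eq. 4):
    x_e evolves independently of u; x_c is driven by u and x_e."""
    x_c_next = A_c @ x_c + B_c @ u + E_c @ x_e
    x_e_next = A_e @ x_e + w
    return x_c_next, x_e_next

def output(x_c, x_e, u):
    """Output equation (Eq. 5)."""
    return C_c @ x_c + C_e @ x_e + D @ u

x_c, x_e = np.array([0.0]), np.array([1.0])
u, w = np.array([1.0]), np.array([0.0])
for _ in range(3):
    x_c, x_e = step(x_c, x_e, u, w)
y = output(x_c, x_e, u)
```

Note that the control input u enters only the x_c update: replaying the loop with a different u changes x_c and y but leaves the x_e trajectory untouched, which is exactly the property the proof in this section exploits.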
Two learned non-linear, sequence-to-sequence models f for the prediction of time series data D, which output the state of the system at time t+1, are defined as follows:

f_A (Exogenous State Variables in the Model's Prediction):

ŝ(t+1) = x̂_c(t+1) + x̂_e(t+1) = f_A(X̂_c, X̂_e, U(t))  (6)

f_B (No Exogenous State Variables in the Model's Prediction):

ŝ(t+1) = x̂_c(t+1) = f_B(X̂_c, X_e(t), U(t))  (7)
where ŝ(t+1) ∈ R^{n_out} is the prediction of the model, with n_out being the number of variables in the prediction, which is n_c + n_e for the model f_A and n_c for the model f_B. Moreover, x̂_c(t+1) ∈ R^{n_c} and x̂_e(t+1) ∈ R^{n_e} are the predictions of the model for the state and exogenous state variables. The terms X_c(t) ∈ R^{l×n_c}, X_e(t) ∈ R^{l×n_e}, and U(t) ∈ R^{l×n_u} define the historical record of the system's state, exogenous, and control variables at time t. When using these models as simulators over a horizon h ∈ Z^+, if the prediction error at each time step is defined using the Euclidean distance, expressed as e_t = ‖ŝ_t − s_t‖, then the system's state returned by the models at time step t+h can be described as (Mohammadi et al., 2024c):

ŝ_{t+h} = f(... f(f(s_t, a_t) + e_t, a_{t+1}) + e_{t+1} ..., a_{t+h}) + e_{t+h}  (8)
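The recursive rollout of Equation 8 can be sketched with a generic one-step model; the linear toy dynamics and the constant per-step error are illustrative assumptions, chosen only to make the compounding visible.

```python
import numpy as np

def rollout(f, s0, actions, noise):
    """Recursively apply a one-step model f over a horizon, feeding each
    (possibly erroneous) prediction back in as the next input (Eq. 8)."""
    s = s0
    trajectory = []
    for a, e in zip(actions, noise):
        s = f(s, a) + e          # prediction error e_t enters the next input
        trajectory.append(s)
    return np.array(trajectory)

# Toy one-step dynamics and a constant per-step prediction error.
f = lambda s, a: 0.9 * s + 0.1 * a
true = rollout(f, s0=1.0, actions=[0.0] * 5, noise=[0.0] * 5)
pred = rollout(f, s0=1.0, actions=[0.0] * 5, noise=[0.05] * 5)
gap = np.abs(pred - true)       # the gap grows: errors compound over steps
```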
The accuracy of each model as a simulator depends on the prediction errors from the beginning of the simulation until the end, because the error at each step accumulates and results in lower accuracy as the simulation continues. This compounding error issue is addressed in (Mohammadi et al., 2024a), where an improvement method was implemented to reduce the error at each simulation step, minimizing the effect of compounding error. The prediction error at time step t+k, with k ∈ Z^+ and 0 < k ≤ h, for f̂_A is:

e_{t+k}^{f̂_A} = e_{t+k,c} + e_{t+k,e}  (9)

while the prediction error for f̂_B is defined as:

e_{t+k}^{f̂_B} = e_{t+k,c}  (10)
The trained models can be improved to minimize
multi-step simulation errors by employing an itera-
tive method that incorporates the model’s predictions
back into the input at each training step (Mohammadi
et al., 2024a). This approach, inspired by the work
of (Venkatraman et al., 2015), is specifically adapted
for recursive multi-step forecasting. During training,
the model forecasts across the designated prediction
horizon for each input-output pair from the dataset.
At each step, the model’s output is fed back into the
input for the subsequent step’s prediction, continuing
this process until the horizon is reached. The loss
is calculated for each one-step prediction and used
for learning, enabling the model to correct errors at
each step before proceeding, thereby preventing er-
ror propagation throughout the forecasting or simula-
tion horizon. Assuming f̂_A and f̂_B to be the improved versions of the learned f_A and f_B based on (Mohammadi et al., 2024a), it can be concluded that:
Theorem 1. If e_{t+k}^{f̂_A} and e_{t+k}^{f̂_B} are the prediction errors of the simulators based on f_A and f_B at time step t+k, then ∀ f̂_A ∈ {f_A, f̂_A} and ∀ f̂_B ∈ {f_B, f̂_B} we have e_{t+k}^{f̂_B} ≤ e_{t+k}^{f̂_A}.
Proof. Exogenous state variables are unaffected by the system's state and control variables, so their values, as part of the model's input at each simulation step, can be sampled from the offline data D. With the same state and control variables at each step, the input at time step t+k for model f̂_A is:

S(t) = X̂_c + X̂_e(t) + U(t)  (11)

while the input to the model f̂_B is:

S(t) = X̂_c + X_e(t) + U(t)  (12)

The error term e_{t+k,e} for the exogenous state variables in the model f̂_A is calculated as:

e_{t+k,e}^{f̂_A} = Σ_{m=1}^{n_e} ‖x̂_{e,{t+k,m}} − x_{e,{t+k,m}}‖ ≥ 0  (13)

while for f̂_B, as the exogenous state variables are sampled from the real values in D, the error term e_{t+k,e} for them is equal to zero:

e_{t+k,e}^{f̂_B} = Σ_{m=1}^{n_e} ‖x_{e,{t+k,m}} − x_{e,{t+k,m}}‖ = 0  (14)

It can be concluded from Equations 13 and 14 that the error term for the exogenous state variables, which will affect the model's prediction for the next step, in the model f̂_B will always be lower than or equal to the error term in the model f̂_A:

e_{t+k}^{f̂_B} ≤ e_{t+k}^{f̂_A}  (15)
2.3 The LSTM Model
The LSTM architecture as shown in Figure 1 was de-
signed with multiple layers to capture complex pat-
terns in wastewater treatment data. Specifically, the
network comprises two LSTM layers, each consist-
ing of 256 units, as described in (Mohammadi et al.,
2024c). We employed the ’tanh’ activation function
in the LSTM layers to facilitate non-linear learning.
The model also included a dropout rate of 0.15 to pre-
vent overfitting. The input to the model at each step
consisted of a history of time steps, including all of
the system’s state variables, while the output was a
single-step prediction of the system’s state. The train-
ing and validation procedure of the base LSTM model
is explained in (Mohammadi et al., 2024c).
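A minimal PyTorch sketch of the architecture described above (two LSTM layers of 256 units, tanh cell activation, dropout of 0.15, and a fully connected single-step head); the input and output dimensions below are placeholder assumptions, not the exact configuration reported in the cited work.

```python
import torch
import torch.nn as nn

class SimulatorLSTM(nn.Module):
    """Two stacked LSTM layers (256 units each; tanh is the default cell
    activation) followed by a fully connected single-step prediction head,
    with a dropout of 0.15 applied between the LSTM layers."""
    def __init__(self, n_in: int, n_out: int, hidden: int = 256):
        super().__init__()
        self.lstm = nn.LSTM(input_size=n_in, hidden_size=hidden,
                            num_layers=2, dropout=0.15, batch_first=True)
        self.fc = nn.Linear(hidden, n_out)

    def forward(self, s):
        # s: (batch, lookback l, n) with n = n_c + n_e + n_u.
        out, _ = self.lstm(s)
        return self.fc(out[:, -1, :])  # single-step prediction

# Placeholder dimensions: 9 input variables, 6 predicted variables.
model = SimulatorLSTM(n_in=9, n_out=6)
x = torch.randn(4, 240, 9)   # batch of 4, lookback window of 240 steps
y_hat = model(x)             # shape (4, 6)
```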
The input provided to the LSTM model consists of a historical record of the system's state X_c(t) ∈ R^{l×n_c}, exogenous state X_e(t) ∈ R^{l×n_e}, and control variables U(t) ∈ R^{l×n_u}, detailed as follows:

S(t) = X_c(t) + X_e(t) + U(t)
     = [x_c(t); x_c(t−1); …; x_c(t−l)] + [x_e(t); x_e(t−1); …; x_e(t−l)] + [u(t); u(t−1); …; u(t−l)]  (16)

where S(t) ∈ R^{l×n} is the input to the LSTM model at time t, with n = n_c + n_e + n_u representing the number of the input variables. The output of the model at each time step t will be as follows:
Ŝ(t+1) = [x̂(t+1); x̂(t+2); …; x̂(t+p)]  (17)

where p ∈ Z^+ represents the model's output sequence length, which in this study is set to 1, consequently leading to Ŝ(t+1) = x̂(t+1). The model's output can be determined from Equations 6 and 7,
[Figure 1: The structure of the LSTM (Hochreiter and Schmidhuber, 1997b) model for time series forecasting tasks, where (x_0, ..., x_t) and (h_0, ..., h_t) represent the input and the hidden state (output) of each LSTM cell (Mohammadi et al., 2024c).]
depending on whether exogenous state variables are
included in the prediction. The prediction error of
the LSTM model at time t+1 can be calculated as:

L_{t+1} = (1/n_x) Σ_{d=1}^{n_x} ‖x̂_{t+1,d} − x_{t+1,d}‖²  (18)
Common training methods, known as teacher-forcing or supervised learning, utilize backpropagation and minimize the single-step loss function for each training batch. Each training batch B contains z ∈ Z^+ input-output pairs sampled from the dataset D, where B_i = (S(t)_i, x(t+1)_i). The optimization is performed over the model parameters θ as follows (Mohammadi et al., 2024a):

θ* = argmin_θ Σ_{i=1}^{z} Σ_{d=1}^{n_x} ‖(x̂_{t+1,d})_i − (x_{t+1,d})_i‖²  (19)

where θ represents the parameters of the model, and (x̂_{t+1,d})_i and (x_{t+1,d})_i are the predicted and true values of the d-th dimension of the system's state at time t+1 for the i-th pair in the batch B.
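The iterative improvement described in Section 2.2, which feeds predictions back into the input window during training rather than relying purely on the teacher-forcing objective of Eq. 19, can be sketched as follows; the OneStep toy model, the window shapes, and the hyperparameters are illustrative assumptions.

```python
import torch
import torch.nn as nn

def iterative_rollout_step(model, S, targets, optimizer, loss_fn):
    """DAD-style improvement (Mohammadi et al., 2024a): roll the model over
    the horizon, feed each prediction back into the input window, and take
    a gradient step on the one-step loss at every step."""
    total = 0.0
    for x_true in targets:
        optimizer.zero_grad()
        x_hat = model(S)                    # one-step prediction
        loss = loss_fn(x_hat, x_true)
        loss.backward()
        optimizer.step()
        total += loss.item()
        # Shift the lookback window: drop the oldest step, append the
        # (detached) prediction as the newest step.
        S = torch.cat([S[:, 1:, :], x_hat.detach().unsqueeze(1)], dim=1)
    return total / len(targets)

# Toy usage: a linear one-step model over a lookback window of 5 steps.
class OneStep(nn.Module):
    def __init__(self, n: int = 3):
        super().__init__()
        self.fc = nn.Linear(n, n)
    def forward(self, s):                   # s: (batch, lookback, n)
        return self.fc(s[:, -1, :])

model = OneStep()
opt = torch.optim.SGD(model.parameters(), lr=0.01)
S = torch.zeros(2, 5, 3)
targets = [torch.zeros(2, 3) for _ in range(4)]  # horizon of 4 steps
avg_loss = iterative_rollout_step(model, S, targets, opt, nn.MSELoss())
```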
2.4 Experiments Design
To investigate the experimental validation of Theo-
rem 1, three distinct LSTM models were trained to
evaluate the impact of incorporating additional vari-
ables. Table 2 details these models, highlighting their
differences. While the models share the same struc-
ture and hyperparameters, they vary in including ex-
ogenous state variables, such as temperature, flow,
maximum critical function value, and process phases,
in the model output. The base model was trained with teacher-forcing, and the improved versions (E1, E2, E3, and E4) were obtained with iterative training, as described in (Mohammadi et al., 2024a), which employed a method similar to the "DATA AS DEMONSTRATOR" (DAD) approach. The DILATE loss function (Le Guen and
the results were compared to those obtained using the
Mean Squared Errors (MSE) loss function.
This study compares scenarios where only control and state variables are present in the model input (f_0) and where exogenous state variables are included (f_A and f_B). Furthermore, it compares two cases regarding the model output: in one case, the predicted values of the exogenous state variables are used at each simulation step (f_A), while in the other case, the actual values of the exogenous state variables are utilized in the simulation (f_B).
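The difference between the f_A-style and f_B-style rollouts can be illustrated with a toy simulator; the dynamics, the drifting exogenous "prediction", and all constants are assumptions chosen only to show the mechanism.

```python
import numpy as np

def simulate(h, exog_actual, use_actual_exog, exog_model_error=0.1):
    """Roll a toy one-step model over horizon h. With use_actual_exog=True
    (the f_B setting), the true exogenous value is read from the data at
    every step; otherwise (the f_A setting) an imperfect exogenous
    prediction is fed back and drifts away from the truth."""
    x_c, x_e = 0.0, exog_actual[0]
    xs = []
    for t in range(h):
        x_c = 0.8 * x_c + 0.2 * x_e            # objective dynamics
        x_e = (exog_actual[t] if use_actual_exog
               else x_e + exog_model_error)     # predicted exog drifts
        xs.append(x_c)
    return np.array(xs)

exog = np.ones(20)                               # true exogenous trajectory
ref  = simulate(20, exog, use_actual_exog=True)  # f_B-style rollout
f_a  = simulate(20, exog, use_actual_exog=False) # f_A-style rollout
mse_gap = np.mean((f_a - ref) ** 2)              # f_A accrues extra error
```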
2.5 Software and Hardware
All of the tests for the simulation environment are implemented in the Python programming language using the Gym (Brockman et al., 2016) and PyTorch (Paszke et al., 2019) libraries. The AI Cloud service
from Aalborg University is used for GPU-based computations. Each compute node is equipped
with 2 × 24-core Intel Xeon CPUs, 1.5 TB of system
RAM, and one NVIDIA Tesla V100 GPU with 32 GB
of RAM, all connected via NVIDIA NVLink.
3 RESULTS
This section presents the experimental results of in-
corporating exogenous state variables into the multi-
step simulation of time series data for wastewater
treatment.
3.1 Experimental Results
The LSTM model, as described in Section 2.3, was
trained on various data combinations. To assess its
effectiveness, we compared its performance with sev-
eral state-of-the-art time series prediction models, as
reported in (Mohammadi et al., 2024c). The experi-
ments utilized three dataset variations (f_0, f_A, and f_B),
Table 2: The explanations of the trained models.

Models               Name  Explanation
IOPPo-10i1o          f_0   Without the exogenous state variables
IOPTQCfFiFoPo-15i6o  f_A   Exogenous state variables in the model's prediction
IOPTQCfFiFoPo-15i1o  f_B   No exogenous state variables in the model's prediction
each incorporating different combinations of exoge-
nous state variables as detailed in Table 2. Moreover,
Table 3 displays the average MSE and DTW metrics
for the base model and its improved versions (E1, E2,
E3, and E4) over one year. The best MSE and DTW
values for each model are highlighted in bold, while
the best values for each version and dataset variation
are underlined.
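For reference, the DTW metric reported in Tables 3 and 5 can be computed with the standard dynamic-programming recursion; a minimal sketch using the absolute difference as the local cost:

```python
import numpy as np

def dtw_distance(a, b):
    """Classic Dynamic Time Warping distance between two 1-D sequences,
    using |a_i - b_j| as the local cost of aligning a_i with b_j."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)   # cumulative-cost matrix
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Best of match, insertion, and deletion predecessors.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```

Unlike MSE, DTW tolerates small temporal misalignments between the simulated and measured trajectories, which is why the two metrics are reported together.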
After analyzing the various model versions and
calculating the average MSE and DTW over a year-
long daily simulation, the versions with the lowest
DTW were selected for further comparison. The hy-
perparameters of the best-improved versions for each
model are listed in Table 4. Additionally, Figure 2 vi-
sually represents the daily DTW values over the year,
illustrating the error changes. Each point in Figure 2
represents the average loss of a simulation sequence
that begins with the data at the start of the day and
recursively predicts the output until the end of the
day. The initial input to the model for each simula-
tion point consists of a system history, incorporating
all input variables over a specific lookback window.
For this study, the lookback window was set to 240
steps for all models. The blue color in Figure 2 in-
dicates the number of time steps that include actual
values from the dataset used as model input at each
step. Once the simulation reaches the point where the
number of simulation steps equals the lookback win-
dow, it completely runs out of actual time steps. Be-
yond this point, all inputs to the model are predictions
generated by the model itself.
Finally, the best versions of the models were used
to simulate a period of 720 steps (24 hours) for differ-
ent points of the year. The results of these simulations
are shown in Figure 3, and the metrics for each point
can be found in Table 5.
3.1.1 Improvement Performance
The average MSE and DTW in Table 3 are presented
by columns to compare the performance of different
versions of improved models across the entire dataset.
The f_0 model, which does not include exogenous state variables, showed significantly higher MSE and DTW values than those that incorporated exogenous state variables. The best version of the f_0 model achieves an MSE of 0.4848 and a DTW of 1.7716. Using the actual values of exogenous state variables at each step (f_B) results in lower error values, with the best version achieving an MSE of 0.2165 and a DTW of 1.1646. This improvement represents a 64.49% reduction in MSE and a 32.37% reduction in DTW compared to the best f_A version, and a 55.35% reduction in MSE and a 34.27% reduction in DTW compared to the best version of the f_0 model. These results highlight the significant benefits of incorporating exogenous state variables and using their actual values to enhance the predictive accuracy of the LSTM-based simulator.
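The reported reductions follow directly from the best-version values in Table 3; a quick arithmetic check:

```python
best_f0 = {"mse": 0.4848, "dtw": 1.7716}   # best f_0 version (Table 3)
best_fa = {"mse": 0.6098, "dtw": 1.7219}   # best f_A version
best_fb = {"mse": 0.2165, "dtw": 1.1646}   # best f_B version

def reduction(old, new):
    """Percentage reduction from old to new."""
    return 100.0 * (old - new) / old

mse_vs_fa = reduction(best_fa["mse"], best_fb["mse"])   # ≈ 64.5 %
dtw_vs_fa = reduction(best_fa["dtw"], best_fb["dtw"])   # ≈ 32.4 %
mse_vs_f0 = reduction(best_f0["mse"], best_fb["mse"])   # ≈ 55.3 %
dtw_vs_f0 = reduction(best_f0["dtw"], best_fb["dtw"])   # ≈ 34.3 %
```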
4 DISCUSSION
Incorporating exogenous variables like temperature,
flow, maximum critical function value, and process
phases into the LSTM model significantly enhanced
the predictive accuracy and robustness of the wastew-
ater treatment simulator, as evidenced by lower MSE
and DTW values. These exogenous variables pro-
vided a more comprehensive representation of the
system, capturing critical aspects that state variables
alone could not, such as the impact of temperature
and flow on phosphorus removal. This allowed the
model to predict system behavior more accurately un-
der varying conditions.
Including exogenous state variables in the model
led to a significant reduction in MSE and DTW. The
enhanced model, especially when using actual exoge-
nous values at each simulation step, demonstrated ro-
bust performance, cutting MSE by 55% and DTW
by 34% compared to the model without these vari-
ables. This reduction indicates that the model’s abil-
ity to capture complex dynamics of the WWT process
improved, resulting in more precise simulations. The f_A model, which included exogenous state variables in the output, consistently showed improved performance across different months and seasons. This highlights the model's ability to predict these additional factors, enhancing its accuracy. Moreover, the f_B model, which incorporated exogenous state variables in the input but not in the output and used their actual values at each simulation step, outperformed the f_A model in all versions. This underscores the importance of exogenous state variables in representing the system's state and their independence from con-
Table 3: The average Mean Squared Error and Dynamic Time Warping data for the base model and improved versions during different months of the year. The best MSE and DTW values for each model and experiment are highlighted in bold and underlined, respectively.

Models   Base MSE  Base DTW  E1 MSE  E1 DTW  E2 MSE  E2 DTW  E3 MSE  E3 DTW  E4 MSE  E4 DTW
f_0      449.4486  56.1453   0.5684  1.8893  0.4848  1.8369  0.5027  1.7716  0.5096  1.9028
f_A      28.6279   11.1544   0.6536  1.7978  0.6981  1.8361  0.6098  1.7219  0.8137  2.0372
f_B      13.1450   9.8183    0.2165  1.1646  0.2436  1.2566  0.3496  1.5075  0.2788  1.3412
Average  163.7405  25.7060   0.4795  1.6172  0.4755  1.6432  0.4874  1.6670  0.5340  1.7604
Figure 2: The loss for the best experiments of all dataset types with or without the exogenous state variables.
Table 4: The parameters of the best improved checkpoints for each experiment. Ex.: Experiment number, Ep.: Improvement epochs, Min EL and Max EL: Minimum and Maximum episode length during the improvement, Loss F.: The improvement loss function, and Alpha: Alpha in the DILATE loss function.

Models  Ex.  Ep.  Min EL  Max EL  Loss F.  Alpha
f_0     E3   80   240     240     DILATE   0.6
f_A     E3   80   240     240     DILATE   0.6
f_B     E1   80   240     240     MSE      -
trol and objective variable changes. Using indepen-
dent exogenous state variables allows the model to
leverage real-time actual values from the dataset, re-
sulting in better simulation and accurately capturing
the system’s dynamics.
Using actual values of exogenous state variables
at each simulation step (model f_B) mitigated the ac-
Table 5: The average Mean Squared Error and Dynamic Time Warping data for each model in the different points of the year. The best MSE and DTW values for each point and model are highlighted in bold and underlined, respectively.

Points     f_0 MSE  f_0 DTW  f_A MSE  f_A DTW  f_B MSE  f_B DTW
September  0.9570   21.0640  1.3240   17.2590  0.8260   15.1950
December   0.6600   13.3860  0.7580   13.1930  0.2960   9.4360
March      0.5800   7.2210   0.5510   6.2090   0.1770   5.6840
June       0.7510   19.2260  1.0440   15.5430  0.4770   13.6430
Average    0.7370   14.8400  0.9190   13.0510  0.4440   11.3740
cumulation of prediction errors over multiple steps,
reducing MSE by 55% and DTW by 34%. This ap-
proach kept the model’s input closely aligned with re-
ality, enhancing overall simulation accuracy. By cor-
recting itself at each step, the model prevented errors
from propagating throughout the simulation.
Performance improvement was consistent
throughout one year in the collected dataset, with
Figure 3: The simulation for the best experiments of each model. The number of past actual values in the input shown by the
blue color decreases as the simulation reaches a point where all the input history is the predictions.
some variations observed. Notably, the model’s
performance improved in March, when the lowest
MSE and DTW values were recorded. This suggests
that the model can adapt to seasonal changes in the
wastewater treatment process. Seasonal variations
can affect inflow characteristics and process dynam-
ics. The model’s ability to maintain accuracy across
these variations is a testament to its effectiveness.
Finally, the iterative improvement method re-
ported in (Mohammadi et al., 2024a) enhanced the
accuracy of the models, bringing them closer to the
system’s dynamics. Including exogenous state vari-
ables at each step assists the model in reducing pre-
diction errors, which are often compounded from pre-
vious steps.
While including exogenous variables has clear
benefits, it also presents challenges. These variables
may not always be accurately measured or available in
real-time, potentially impacting model performance.
Additionally, the increased computational complex-
ity and need for large datasets could limit this ap-
proach’s applicability in smaller systems with limited
data. Despite these challenges, the method signifi-
cantly improves wastewater treatment simulations.
5 CONCLUSIONS
The study successfully demonstrated the advantages
of incorporating exogenous state variables into an
LSTM-based simulator for wastewater treatment. The
improved model, particularly the f_B model, significantly enhanced prediction accuracy and robustness. The key
conclusions are:
- Enhanced Accuracy: Including exogenous state variables markedly improved the model's accuracy, as evidenced by lower MSE and DTW values across the year.
- Error Mitigation: Using actual values of exogenous state variables at each simulation step reduced MSE by 55% and DTW by 34%, effectively mitigating the compounding of prediction errors and leading to more reliable simulations.
- Broad Applicability: The model demonstrated robust performance across different seasonal conditions, highlighting its potential applicability in diverse operational settings.
- Future Work: Future research could explore integrating more external factors and applying similar methods to other aspects of wastewater treatment. Additionally, attention-based models and GPU training, as suggested by (Mohammadi et al., 2024c), could enhance efficiency and reduce computational time.
The improved LSTM-based simulator represents
a significant advancement in wastewater treatment
modeling. It offers a powerful tool for optimiz-
ing control strategies and enhancing operational ef-
ficiency.
ACKNOWLEDGEMENTS
The RecaP project has received funding from the Eu-
ropean Union’s Horizon 2020 research and innovation
programme under the Marie Skłodowska-Curie grant
agreement No 956454. Disclaimer: This publication
reflects only the author’s view; the Research Execu-
tive Agency of the European Union is not responsible
for any use that may be made of this information.
REFERENCES
Brockman, G., Cheung, V., Pettersson, L., Schneider, J.,
Schulman, J., Tang, J., and Zaremba, W. (2016). OpenAI Gym. arXiv preprint arXiv:1606.01540.
Gao, P., Yang, X., Zhang, R., Guo, P., Goulermas, J. Y.,
and Huang, K. (2023). Egpde-net: Building contin-
uous neural networks for time series prediction with
exogenous variables.
Goodwin, G. C., Graebe, S. F., Salgado, M. E., et al. (2001).
Control system design, volume 240. Prentice Hall Up-
per Saddle River.
Gujer, W., Henze, M., Mino, T., Matsuo, T., Wentzel, M. C.,
and Marais, G. (1995). The activated sludge model no.
2: biological phosphorus removal. Water science and
technology, 31(2):1–11.
Hansen, L. D., Stentoft, P. A., Ortiz-Arroyo, D., and Dur-
devic, P. (2024). Exploring data quality and sea-
sonal variations of n2o in wastewater treatment: a
modeling perspective. Water Practice & Technology,
19(3):1016–1031.
Hansen, L. D., Stokholm-Bjerregaard, M., and Durdevic, P.
(2022). Modeling phosphorous dynamics in a wastew-
ater treatment process using bayesian optimized lstm.
Computers & Chemical Engineering, 160:107738.
Hespanha, J. P. (2018). Linear systems theory. Princeton
university press.
Hochreiter, S. and Schmidhuber, J. (1997a). Long Short-
Term Memory. Neural Computation, 9(8):1735–1780.
Hochreiter, S. and Schmidhuber, J. (1997b). Long short-
term memory. Neural Computation, 9:1735–1780.
Krüger A/S (2023). Hubgrade performance plant. https://www.kruger.dk/english/hubgrade-advanced-online-control. Accessed: 2023-11-30.
Le Guen, V. and Thome, N. (2019). Shape and time dis-
tortion loss for training deep time series forecasting
models. Advances in neural information processing
systems, 32.
Mohammadi, E., Ortiz-Arroyo, D., Stokholm-Bjerregaard,
M., Hansen, A. A., and Durdevic, P. (2024a). Im-
proved long short-term memory-based wastewater
treatment simulators for deep reinforcement learning.
arXiv preprint arXiv:2403.15091.
Mohammadi, E., Rani, A., Stokholm-Bjerregaard, M.,
Ortiz-Arroyo, D., and Durdevic, P. (2024b). Wastew-
ater treatment plant data for nutrient removal system.
Mohammadi, E., Stokholm-Bjerregaard, M., Hansen, A. A.,
Nielsen, P. H., Ortiz-Arroyo, D., and Durdevic, P.
(2024c). Deep learning based simulators for the
phosphorus removal process control in wastewater
treatment via deep reinforcement learning algorithms.
Engineering Applications of Artificial Intelligence,
133:107992.
Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J.,
Chanan, G., Killeen, T., Lin, Z., Gimelshein, N.,
Antiga, L., et al. (2019). Pytorch: An imperative style,
high-performance deep learning library. Advances in
neural information processing systems, 32.
Venkatraman, A., Hebert, M., and Bagnell, J. (2015). Im-
proving multi-step prediction of learned time series
models. In Proceedings of the AAAI Conference on
Artificial Intelligence, volume 29.