Multi-Step Simulation Improvement for Time Series Using Exogenous
State Variables
Esmaeel Mohammadi^{1,2,a}, Daniel Ortiz-Arroyo^{3,b}, Mikkel Stokholm-Bjerregaard^{1,c} and Petar Durdevic^{3,d}
1 Krüger A/S, Indkildevej 6C, Aalborg, 9210, North Jutland, Denmark
2 Department of Chemistry and Bioscience, Aalborg University, Fredrik Bajers Vej 7H, Aalborg, 9220, North Jutland, Denmark
3 AAU Energy, Aalborg University, Niels Bohrs vej 8, Esbjerg, 6700, South Jutland, Denmark
a https://orcid.org/0000-0001-7109-7944  b https://orcid.org/0000-0002-1297-3702  c https://orcid.org/0000-0002-3714-5137  d https://orcid.org/0000-0003-2701-9257
Keywords:
Deep Reinforcement Learning, Dynamic Model, Simulator, LSTM, Exogenous, Phosphorus.
Abstract:
Accurate simulation of wastewater treatment systems is essential for optimizing control strategies and ensuring
efficient operation. This study focuses on enhancing the predictive accuracy of a Long Short-Term Memory
(LSTM)-based simulator by incorporating exogenous state variables, such as temperature, flow, and process
phases, that are independent of output and control variables. The experimental results demonstrate that in-
cluding these variables significantly reduces prediction errors, measured by Mean Squared Errors (MSE) and
Dynamic Time Warping (DTW) metrics. The improved model, particularly the version that uses actual val-
ues of exogenous state variables at each simulation step, showed robust performance across different seasons,
reducing MSE by 55% and DTW by 34% compared to the model that did not include exogenous state variables. This approach addresses the compounding error issue in multi-step simulations, leading to more reliable predictions and enhanced operational efficiency in wastewater treatment.
Glossary of Terms and Acronyms
LSTM: Long Short-Term Memory
MSE: Mean Squared Error
DTW: Dynamic Time Warping
WWTP: Wastewater Treatment Plant
DAD: DATA AS DEMONSTRATOR
1 INTRODUCTION
Accurate simulation of wastewater treatment systems' behavior is essential for optimizing control strategies and ensuring efficient operation. Traditional simulation models, based on mathematical and statistical approaches, often face challenges in prediction accuracy due to the non-linear, stochastic, and non-stationary nature of system dynamics (Gujer et al., 1995; Hansen et al., 2024; Hansen et al., 2022). Factors such as varying influent characteristics and operational conditions add complexity, making accurate predictions challenging (Gujer et al., 1995; Mohammadi et al., 2024c; Hansen et al., 2022).
Recent advancements in deep learning, particu-
larly Long Short-Term Memory (LSTM) networks
(Hochreiter and Schmidhuber, 1997a), have shown
promise in modeling complex time series data (Mo-
hammadi et al., 2024c; Hansen et al., 2022). LSTMs
can learn long-term dependencies, making them suit-
able for predicting the behavior of complex systems
like wastewater treatment plants. However, one sig-
nificant challenge in developing a multi-step simula-
tion environment using deep learning models is the
accumulation of errors, known as compounding er-
rors (Mohammadi et al., 2024c; Mohammadi et al.,
2024a).
Previous work by Gao et al. (2023) incorporated exogenous variables within LSTM models for
multivariate time series prediction, focusing on using
Neural ODEs to enhance prediction smoothness and
interpretability. However, their approach did not di-
rectly address the critical issue of compounding errors
Mohammadi, E., Ortiz-Arroyo, D., Stokholm-Bjerregaard, M. and Durdevic, P.
Multi-Step Simulation Improvement for Time Series Using Exogenous State Variables.
DOI: 10.5220/0012927300003822
Paper published under CC license (CC BY-NC-ND 4.0)
In Proceedings of the 21st International Conference on Informatics in Control, Automation and Robotics (ICINCO 2024) - Volume 1, pages 651-659
ISBN: 978-989-758-717-7; ISSN: 2184-2809
Proceedings Copyright © 2024 by SCITEPRESS Science and Technology Publications, Lda.
in multi-step forecasting, which is particularly rele-
vant in dynamic systems like wastewater treatment.
This study differentiates itself by specifically tar-
geting the reduction of compounding errors in multi-
step simulations by strategically including exogenous
variables. Building on the iterative improvement
method proposed in (Mohammadi et al., 2024a), our
approach integrates exogenous variables directly into
the model’s predictive framework. Unlike previous
studies, we explicitly evaluate and compare scenar-
ios with and without exogenous variables, quantifying
their impact on prediction accuracy and error propa-
gation over time.
An improved LSTM-based simulator was devel-
oped to train deep reinforcement learning algorithms
for wastewater treatment, focusing on the phospho-
rus removal process. The LSTM model was used as a
benchmark to investigate the improvement of the sim-
ulation using the proposed method. The core contri-
bution is incorporating the exogenous state variables
into the LSTM model and using their actual values at
each step of the multi-step simulation. This method
aims to enhance prediction accuracy by preventing
the accumulation of prediction errors over multiple
steps. The experimental results demonstrated that incorporating the exogenous variables reduces the simulation error by 55% in terms of mean squared error (MSE) and by 34% in terms of dynamic time warping (DTW).
1.1 Contributions
This paper introduces several key contributions to the
field of time series prediction and multi-step simula-
tion:
- An Improved Multi-step Forecasting Model: We incorporate exogenous variables like temperature, flow, and process phases, enhancing the accuracy of long-term predictions by reducing compounding errors in multi-step simulations.
- Extensive Experimental Validation: Our method, validated with real-world wastewater treatment data, shows significant improvements over state-of-the-art models in MSE and DTW metrics.
- Improved Simulation for Reinforcement Learning: The enhanced LSTM-based simulator improves the training of deep reinforcement learning algorithms, enabling more effective control strategies in wastewater treatment processes.
2 METHODS
This section provides an overview of our research
data and methods. It details the wastewater treat-
ment plant data, including exogenous state variables,
and presents a mathematical proof of reduced mul-
tistep error with these variables. Finally, it outlines
the LSTM model structure, training procedure, exper-
imental design, and the hardware and software used.
2.1 The Plant and Dataset
This study focuses on the data from Kolding Cen-
tral WWTP in Agtrup, Denmark. The time-series
dataset for two years was collected through the Hubgrade™ Performance Plant system, designed by Krüger/Veolia (Krüger A/S, 2023). Data preprocessing played a crucial role in enhancing model performance. The raw wastewater treatment data was initially normalized using the Min-Max technique, scaling the features from 0 to 1. Feature selection was
guided by principal component and correlation anal-
ysis. Variables of the system that demonstrated the
highest correlation with the target variable, Phosphate
concentration, were selected as inputs for the model,
in conjunction with the target variable itself and the
action variable, Metal dosage. Details about the plant,
dataset, and preprocessing are provided in (Moham-
madi et al., 2024b; Mohammadi et al., 2024c).
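The preprocessing steps above can be sketched as follows; the toy data, the column roles, and the 0.5 correlation threshold are illustrative assumptions, not values reported for the plant dataset.

```python
import numpy as np

def min_max_scale(x: np.ndarray) -> np.ndarray:
    """Scale each column of x to the [0, 1] range (Min-Max technique)."""
    x_min, x_max = x.min(axis=0), x.max(axis=0)
    return (x - x_min) / (x_max - x_min)

def select_by_correlation(features: np.ndarray, target: np.ndarray,
                          threshold: float = 0.5) -> np.ndarray:
    """Return indices of feature columns whose absolute Pearson
    correlation with the target exceeds the threshold."""
    corrs = np.array([np.corrcoef(features[:, j], target)[0, 1]
                      for j in range(features.shape[1])])
    return np.where(np.abs(corrs) > threshold)[0]

# Toy data standing in for the plant measurements.
rng = np.random.default_rng(0)
raw = rng.normal(size=(100, 4))
scaled = min_max_scale(raw)
target = raw[:, 0] + 0.1 * rng.normal(size=100)  # correlated with column 0
selected = select_by_correlation(raw, target)
```

In the study, the selected features (those most correlated with the phosphate concentration) are combined with the target and the action variable to form the model input.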
2.2 Incorporation of Exogenous State
Variables
Different variables in the plant and their impact on the
accuracy of the phosphorus removal process simula-
tion were investigated. To do this, two dataset varia-
tions, named IOPPo and IOPTQCfFiFoPo, were cre-
ated, each incorporating a unique combination of ex-
ogenous and state variables (detailed in Table 1). The
primary distinction between these datasets is the in-
clusion or exclusion of the exogenous state variables.
This approach analyzed how including the exogenous
state variables affects the simulation’s accuracy.
In control theory, the state-space equations consist
of two main components: the state and output. The
state equation is given by (Hespanha, 2018; Goodwin
et al., 2001):
x(t + 1) = Ax(t) + Bu(t) + Ew(t) (1)
In this equation, x(t) represents the state vector,
which encapsulates the internal condition of the sys-
tem at time t. The matrix A is known as the state ma-
trix and defines the dynamics of the state vector x(t)
Table 1: Notations for the variables used in datasets. The type describes a control variable (C), exogenous variable (E), and objective variable (O) (Mohammadi et al., 2024b).

Symbol  Type  Description                                  Unit
I       C     Flow of the iron to the biology tanks        L/h
O       C     Dissolved oxygen                             mg/L
P       C     Flow of the PAX to the settler               L/h
T       E     Temperature of the biology tank              °C
Q       E     Flow of the wastewater to the biology tank   m³/h
Cf      E     Maximum critical function percentage         %
Fi      E     Process phase at the inlet (tank 1 or 2)     -
Fo      E     Process phase at the outlet (tank 1 or 2)    -
Po      O     Phosphate concentration in the biology tank  mg/L
without any external input. The vector u(t) is the con-
trol input, representing the actions taken to control the
system, and B is the input matrix describing how the
control inputs affect the state vector. The term Ew(t)
includes the exogenous state variables w(t), which are
external disturbances or inputs that affect the system,
with E being the matrix that shows how these distur-
bances impact the state vector. The output equation is
given by:
y(t) = Cx(t) + Dν(t) (2)
Here, y(t) is the output vector, representing the
measurable outputs of the system. The matrix C is
the output matrix, which maps the state vector x(t) to
the output y(t). The vector ν(t), which is the output
noise, also influences the output through the feedfor-
ward matrix D.
Assume a system where the state, control, and exogenous state variables at each time step t are represented by x(t) ∈ R^{n_x}, w(t) ∈ R^{n_w}, and u(t) ∈ R^{n_u}. The state vector x(t) can be partitioned into two components:

x(t) = [x_c(t); x_e(t)]  (3)
where x_c(t) ∈ R^{n_c} are the objective, or controllable, state variables affected by the control inputs u(t). Moreover, x_e(t) ∈ R^{n_e} are the exogenous state variables, not affected by u(t) but affecting the objective variables. The state-space equations considering exogenous state variables can be written as:

x_c(t+1) = A_c x_c(t) + B_c u(t) + E_c x_e(t)
x_e(t+1) = A_e x_e(t) + w(t)  (4)
In these equations, A_c ∈ R^{n_c×n_c} is the state matrix for the objective variables, defining their dynamics. The matrix B_c ∈ R^{n_c×n_u} is the input matrix describing how the control inputs affect the objective variables. The term E_c ∈ R^{n_c×n_e} represents the influence of the exogenous state variables on the objective variables. For the exogenous state variables, A_e ∈ R^{n_e×n_e} is the state matrix that defines their dynamics. Lastly, w(t) ∈ R^{n_e} represents the external disturbances that impact the exogenous state variables.
The output equation considering the effect of exogenous state variables is:

y(t) = C_c x_c(t) + C_e x_e(t) + D u(t)  (5)

where y(t) ∈ R^{n_y} represents the output vector, which contains the objective variables. The matrix C_c ∈ R^{n_y×n_c} maps the objective variables to the output, while C_e ∈ R^{n_y×n_e} maps the exogenous state variables to the output. Additionally, D ∈ R^{n_y×n_u} is the feedforward matrix that describes the direct influence of the control inputs on the output. Finally, the exogenous state variables are those components of the state vector that influence the objective variables but are not influenced by the control inputs. They capture the effects of external disturbances and are crucial for accurately modeling and controlling complex systems.
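As a concrete illustration of Equations 4 and 5, the sketch below simulates a toy system with one controllable state, one exogenous state, and one input; all matrix entries are arbitrary assumptions chosen for illustration.

```python
import numpy as np

# Toy dimensions: one controllable state, one exogenous state, one input.
A_c, B_c, E_c = np.array([[0.9]]), np.array([[0.1]]), np.array([[0.2]])
A_e = np.array([[0.95]])
C_c, C_e, D = np.array([[1.0]]), np.array([[0.5]]), np.array([[0.0]])

def step(x_c, x_e, u, w):
    """One step of the partitioned state-space model (Eq. 4):
    x_e evolves independently of u; x_c is driven by u and x_e."""
    x_c_next = A_c @ x_c + B_c @ u + E_c @ x_e
    x_e_next = A_e @ x_e + w
    return x_c_next, x_e_next

def output(x_c, x_e, u):
    """Output equation (Eq. 5)."""
    return C_c @ x_c + C_e @ x_e + D @ u

x_c, x_e = np.array([0.0]), np.array([1.0])
u, w = np.array([1.0]), np.array([0.0])
for _ in range(3):
    x_c, x_e = step(x_c, x_e, u, w)
y = output(x_c, x_e, u)
```

Note that the control input u enters only the x_c update: replaying the loop with a different u changes x_c and y but leaves the x_e trajectory untouched, which is exactly the property the proof in this section exploits.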
Two learned non-linear, sequence-to-sequence models f for the prediction of time series data D, which output the state of the system at time t+1, are defined as follows:

f_A (Exogenous State Variables in the Model's Prediction):

ŝ(t+1) = x̂_c(t+1) + x̂_e(t+1) = f_A(X̂_c, X̂_e, U(t))  (6)

f_B (No Exogenous State Variables in the Model's Prediction):

ŝ(t+1) = x̂_c(t+1) = f_B(X̂_c, X_e(t), U(t))  (7)
where ŝ(t+1) ∈ R^{n_out} is the prediction of the model, with n_out being the number of variables in the prediction, which is n_c + n_e for the model f_A and n_c for the model f_B. Moreover, x̂_c(t+1) ∈ R^{n_c} and x̂_e(t+1) ∈ R^{n_e} are the predictions of the model for the state and exogenous state variables. The terms X_c(t) ∈ R^{l×n_c}, X_e(t) ∈ R^{l×n_e}, and U(t) ∈ R^{l×n_u} define the historical record of the system's state, exogenous, and control variables at time t. When using these models as simulators over a horizon h ∈ Z^+, if the prediction error at each time step is defined using the Euclidean distance, expressed as e_t = ‖ŝ_t − s_t‖, then the system's state returned by the models at time step t+h can be described as (Mohammadi et al., 2024c):

ŝ_{t+h} = f(... f(f(s_t, a_t) + e_t, a_{t+1}) + e_{t+1} ..., a_{t+h}) + e_{t+h}  (8)
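The recursive rollout of Equation 8 can be sketched with a generic one-step model; the linear toy dynamics and the constant per-step error are illustrative assumptions, chosen only to make the compounding visible.

```python
import numpy as np

def rollout(f, s0, actions, noise):
    """Recursively apply a one-step model f over a horizon, feeding each
    (possibly erroneous) prediction back in as the next input (Eq. 8)."""
    s = s0
    trajectory = []
    for a, e in zip(actions, noise):
        s = f(s, a) + e          # prediction error e_t enters the next input
        trajectory.append(s)
    return np.array(trajectory)

# Toy one-step dynamics and a constant per-step prediction error.
f = lambda s, a: 0.9 * s + 0.1 * a
true = rollout(f, s0=1.0, actions=[0.0] * 5, noise=[0.0] * 5)
pred = rollout(f, s0=1.0, actions=[0.0] * 5, noise=[0.05] * 5)
gap = np.abs(pred - true)       # the gap grows: errors compound over steps
```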
The accuracy of each model as a simulator depends on the prediction errors from the beginning of the simulation until the end, because the error at each step accumulates and results in lower accuracy as the simulation continues. This compounding error issue is addressed in (Mohammadi et al., 2024a), where an improvement method was implemented to reduce the error at each simulation step, minimizing the effect of compounding error. The prediction error at time step t+k, with k ∈ Z^+ and 0 < k ≤ h, for f̂_A is:

e_{t+k}^{f̂_A} = e_{t+k,c} + e_{t+k,e}  (9)

while the prediction error for f̂_B is defined as:

e_{t+k}^{f̂_B} = e_{t+k,c}  (10)
The trained models can be improved to minimize
multi-step simulation errors by employing an itera-
tive method that incorporates the model’s predictions
back into the input at each training step (Mohammadi
et al., 2024a). This approach, inspired by the work
of (Venkatraman et al., 2015), is specifically adapted
for recursive multi-step forecasting. During training,
the model forecasts across the designated prediction
horizon for each input-output pair from the dataset.
At each step, the model’s output is fed back into the
input for the subsequent step’s prediction, continuing
this process until the horizon is reached. The loss
is calculated for each one-step prediction and used
for learning, enabling the model to correct errors at
each step before proceeding, thereby preventing er-
ror propagation throughout the forecasting or simula-
tion horizon. Assuming f̂_A and f̂_B to be the improved versions of the learned f_A and f_B based on (Mohammadi et al., 2024a), it can be concluded that:
Theorem 1. If e_{t+k}^{f̂_A} and e_{t+k}^{f̂_B} are the prediction errors of the simulators based on f_A and f_B at time step t+k, then ∀ f̂_A ∈ {f_A, f̂_A} and ∀ f̂_B ∈ {f_B, f̂_B} we have e_{t+k}^{f̂_B} ≤ e_{t+k}^{f̂_A}.
Proof. Exogenous state variables are unaffected by the system's state and control variables, so their values, as part of the model's input at each simulation step, can be sampled from the offline data D. With the same state and control variables at each step, the input at time step t+k for model f̂_A is:

S(t) = X̂_c + X̂_e(t) + U(t)  (11)

while the input to the model f̂_B is:

S(t) = X̂_c + X_e(t) + U(t)  (12)

The error term e_{t+k,e} for the exogenous state variables in the model f̂_A is calculated as:

e_{t+k,e}^{f̂_A} = Σ_{m=1}^{n_e} ‖x̂_{e,{t+k,m}} − x_{e,{t+k,m}}‖ ≥ 0  (13)

while for f̂_B, as the exogenous state variables are sampled from the real values in D, the error term e_{t+k,e} for them is equal to zero:

e_{t+k,e}^{f̂_B} = Σ_{m=1}^{n_e} ‖x_{e,{t+k,m}} − x_{e,{t+k,m}}‖ = 0  (14)

It can be concluded from Equations 13 and 14 that the error term for the exogenous state variables, which will affect the model's prediction for the next step, in the model f̂_B will always be lower than or equal to the error term in the model f̂_A:

e_{t+k}^{f̂_B} ≤ e_{t+k}^{f̂_A}  (15)
2.3 The LSTM Model
The LSTM architecture as shown in Figure 1 was de-
signed with multiple layers to capture complex pat-
terns in wastewater treatment data. Specifically, the
network comprises two LSTM layers, each consist-
ing of 256 units, as described in (Mohammadi et al.,
2024c). We employed the ’tanh’ activation function
in the LSTM layers to facilitate non-linear learning.
The model also included a dropout rate of 0.15 to pre-
vent overfitting. The input to the model at each step
consisted of a history of time steps, including all of
the system’s state variables, while the output was a
single-step prediction of the system’s state. The train-
ing and validation procedure of the base LSTM model
is explained in (Mohammadi et al., 2024c).
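A minimal PyTorch sketch of the architecture described above (two LSTM layers of 256 units, tanh cell activation, dropout of 0.15, and a fully connected single-step head); the input and output dimensions below are placeholder assumptions, not the exact configuration reported in the cited work.

```python
import torch
import torch.nn as nn

class SimulatorLSTM(nn.Module):
    """Two stacked LSTM layers (256 units each; tanh is the default cell
    activation) followed by a fully connected single-step prediction head,
    with a dropout of 0.15 applied between the LSTM layers."""
    def __init__(self, n_in: int, n_out: int, hidden: int = 256):
        super().__init__()
        self.lstm = nn.LSTM(input_size=n_in, hidden_size=hidden,
                            num_layers=2, dropout=0.15, batch_first=True)
        self.fc = nn.Linear(hidden, n_out)

    def forward(self, s):
        # s: (batch, lookback l, n) with n = n_c + n_e + n_u.
        out, _ = self.lstm(s)
        return self.fc(out[:, -1, :])  # single-step prediction

# Placeholder dimensions: 9 input variables, 6 predicted variables.
model = SimulatorLSTM(n_in=9, n_out=6)
x = torch.randn(4, 240, 9)   # batch of 4, lookback window of 240 steps
y_hat = model(x)             # shape (4, 6)
```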
The input provided to the LSTM model consists of a historical record of the system's state X_c(t) ∈ R^{l×n_c}, exogenous state X_e(t) ∈ R^{l×n_e}, and control variables U(t) ∈ R^{l×n_u}, detailed as follows:

S(t) = X_c(t) + X_e(t) + U(t)
     = [x_c(t); x_c(t−1); …; x_c(t−l)] + [x_e(t); x_e(t−1); …; x_e(t−l)] + [u(t); u(t−1); …; u(t−l)]  (16)

where S(t) ∈ R^{l×n} is the input to the LSTM model at time t, with n = n_c + n_e + n_u representing the number of the input variables. The output of the model at each time step t will be as follows:
Ŝ(t+1) = [x̂(t+1); x̂(t+2); …; x̂(t+p)]  (17)

where p ∈ Z^+ represents the model's output sequence length, which in this study is set to 1, consequently leading to Ŝ(t+1) = x̂(t+1). The model's output can be determined from Equations 6 and 7,
[Figure 1: The structure of the LSTM (Hochreiter and Schmidhuber, 1997b) model for time series forecasting tasks, where (x_0, ..., x_t) and (h_0, ..., h_t) represent the input and the hidden state (output) of each LSTM cell (Mohammadi et al., 2024c).]
depending on whether exogenous state variables are
included in the prediction. The prediction error of
the LSTM model at time t+1 can be calculated as:

L_{t+1} = (1/n_x) Σ_{d=1}^{n_x} ‖x̂_{t+1,d} − x_{t+1,d}‖²  (18)
Common training methods, known as teacher-forcing or supervised learning, utilize backpropagation and minimize the single-step loss function for each training batch. Each training batch B contains z ∈ Z^+ input-output pairs sampled from the dataset D, where B_i = (S(t)_i, x(t+1)_i). The optimization is performed over the model parameters θ as follows (Mohammadi et al., 2024a):

θ* = argmin_θ Σ_{i=1}^{z} Σ_{d=1}^{n_x} ‖(x̂_{t+1,d})_i − (x_{t+1,d})_i‖²  (19)

where θ represents the parameters of the model, and (x̂_{t+1,d})_i and (x_{t+1,d})_i are the predicted and true values of the d-th dimension of the system's state at time t+1 for the i-th pair in the batch B.
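The iterative improvement described in Section 2.2, which feeds predictions back into the input window during training rather than relying purely on the teacher-forcing objective of Eq. 19, can be sketched as follows; the OneStep toy model, the window shapes, and the hyperparameters are illustrative assumptions.

```python
import torch
import torch.nn as nn

def iterative_rollout_step(model, S, targets, optimizer, loss_fn):
    """DAD-style improvement (Mohammadi et al., 2024a): roll the model over
    the horizon, feed each prediction back into the input window, and take
    a gradient step on the one-step loss at every step."""
    total = 0.0
    for x_true in targets:
        optimizer.zero_grad()
        x_hat = model(S)                    # one-step prediction
        loss = loss_fn(x_hat, x_true)
        loss.backward()
        optimizer.step()
        total += loss.item()
        # Shift the lookback window: drop the oldest step, append the
        # (detached) prediction as the newest step.
        S = torch.cat([S[:, 1:, :], x_hat.detach().unsqueeze(1)], dim=1)
    return total / len(targets)

# Toy usage: a linear one-step model over a lookback window of 5 steps.
class OneStep(nn.Module):
    def __init__(self, n: int = 3):
        super().__init__()
        self.fc = nn.Linear(n, n)
    def forward(self, s):                   # s: (batch, lookback, n)
        return self.fc(s[:, -1, :])

model = OneStep()
opt = torch.optim.SGD(model.parameters(), lr=0.01)
S = torch.zeros(2, 5, 3)
targets = [torch.zeros(2, 3) for _ in range(4)]  # horizon of 4 steps
avg_loss = iterative_rollout_step(model, S, targets, opt, nn.MSELoss())
```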
2.4 Experiments Design
To investigate the experimental validation of Theo-
rem 1, three distinct LSTM models were trained to
evaluate the impact of incorporating additional vari-
ables. Table 2 details these models, highlighting their
differences. While the models share the same struc-
ture and hyperparameters, they vary in including ex-
ogenous state variables, such as temperature, flow,
maximum critical function value, and process phases,
in the model output. The base model was trained with teacher-forcing, and the improved versions (E1, E2, E3, and E4) were obtained with iterative training, as described in (Mohammadi et al., 2024a), which employed a method similar to the "DATA AS DEMONSTRATOR" (DAD) approach. The DILATE loss function (Le Guen and
the results were compared to those obtained using the
Mean Squared Errors (MSE) loss function.
This study compares scenarios where only control and state variables are present in the model input (f_0) and where exogenous state variables are included (f_A and f_B). Furthermore, it compares two cases regarding the model output: in one case, the predicted values of the exogenous state variables are used at each simulation step (f_A), while in the other case, the actual values of the exogenous state variables are utilized in the simulation (f_B).
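The difference between the f_A-style and f_B-style rollouts can be illustrated with a toy simulator; the dynamics, the drifting exogenous "prediction", and all constants are assumptions chosen only to show the mechanism.

```python
import numpy as np

def simulate(h, exog_actual, use_actual_exog, exog_model_error=0.1):
    """Roll a toy one-step model over horizon h. With use_actual_exog=True
    (the f_B setting), the true exogenous value is read from the data at
    every step; otherwise (the f_A setting) an imperfect exogenous
    prediction is fed back and drifts away from the truth."""
    x_c, x_e = 0.0, exog_actual[0]
    xs = []
    for t in range(h):
        x_c = 0.8 * x_c + 0.2 * x_e            # objective dynamics
        x_e = (exog_actual[t] if use_actual_exog
               else x_e + exog_model_error)     # predicted exog drifts
        xs.append(x_c)
    return np.array(xs)

exog = np.ones(20)                               # true exogenous trajectory
ref  = simulate(20, exog, use_actual_exog=True)  # f_B-style rollout
f_a  = simulate(20, exog, use_actual_exog=False) # f_A-style rollout
mse_gap = np.mean((f_a - ref) ** 2)              # f_A accrues extra error
```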
2.5 Software and Hardware
All of the tests for the simulation environment are implemented in the Python programming language using the Gym (Brockman et al., 2016) and PyTorch (Paszke et al., 2019) libraries. The AI Cloud service
from Aalborg University is used for GPU-based computations. Each compute node is equipped
with 2 × 24-core Intel Xeon CPUs, 1.5 TB of system
RAM, and one NVIDIA Tesla V100 GPU with 32 GB
of RAM, all connected via NVIDIA NVLink.
3 RESULTS
This section presents the experimental results of in-
corporating exogenous state variables into the multi-
step simulation of time series data for wastewater
treatment.
3.1 Experimental Results
The LSTM model, as described in Section 2.3, was
trained on various data combinations. To assess its
effectiveness, we compared its performance with sev-
eral state-of-the-art time series prediction models, as
reported in (Mohammadi et al., 2024c). The experi-
ments utilized three dataset variations (f_0, f_A, and f_B),
Table 2: The explanations of the trained models.

Models               Name  Explanation
IOPPo-10i1o          f_0   Without the exogenous state variables
IOPTQCfFiFoPo-15i6o  f_A   Exogenous state variables in the model's prediction
IOPTQCfFiFoPo-15i1o  f_B   No exogenous state variables in the model's prediction
each incorporating different combinations of exoge-
nous state variables as detailed in Table 2. Moreover,
Table 3 displays the average MSE and DTW metrics
for the base model and its improved versions (E1, E2,
E3, and E4) over one year. The best MSE and DTW
values for each model are highlighted in bold, while
the best values for each version and dataset variation
are underlined.
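For reference, the DTW metric reported in Tables 3 and 5 can be computed with the standard dynamic-programming recursion; a minimal sketch using the absolute difference as the local cost:

```python
import numpy as np

def dtw_distance(a, b):
    """Classic Dynamic Time Warping distance between two 1-D sequences,
    using |a_i - b_j| as the local cost of aligning a_i with b_j."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)   # cumulative-cost matrix
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Best of match, insertion, and deletion predecessors.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```

Unlike MSE, DTW tolerates small temporal misalignments between the simulated and measured trajectories, which is why the two metrics are reported together.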
After analyzing the various model versions and
calculating the average MSE and DTW over a year-
long daily simulation, the versions with the lowest
DTW were selected for further comparison. The hy-
perparameters of the best-improved versions for each
model are listed in Table 4. Additionally, Figure 2 vi-
sually represents the daily DTW values over the year,
illustrating the error changes. Each point in Figure 2
represents the average loss of a simulation sequence
that begins with the data at the start of the day and
recursively predicts the output until the end of the
day. The initial input to the model for each simula-
tion point consists of a system history, incorporating
all input variables over a specific lookback window.
For this study, the lookback window was set to 240
steps for all models. The blue color in Figure 2 in-
dicates the number of time steps that include actual
values from the dataset used as model input at each
step. Once the simulation reaches the point where the
number of simulation steps equals the lookback win-
dow, it completely runs out of actual time steps. Be-
yond this point, all inputs to the model are predictions
generated by the model itself.
Finally, the best versions of the models were used
to simulate a period of 720 steps (24 hours) for differ-
ent points of the year. The results of these simulations
are shown in Figure 3, and the metrics for each point
can be found in Table 5.
3.1.1 Improvement Performance
The average MSE and DTW in Table 3 are presented
by columns to compare the performance of different
versions of improved models across the entire dataset.
The f_0 model, which does not include exogenous state variables, showed significantly higher MSE and DTW values than those that incorporated exogenous state variables. The best version of the f_0 model achieves an MSE of 0.4848 and a DTW of 1.7716. Using the actual values of exogenous state variables at each step (f_B) results in lower error values, with the best version achieving an MSE of 0.2165 and a DTW of 1.1646. This improvement represents a 64.49% reduction in MSE and a 32.37% reduction in DTW compared to the best f_A version, and a 55.35% reduction in MSE and a 34.27% reduction in DTW compared to the best version of the f_0 model. These results highlight the significant benefits of incorporating exogenous state variables and using their actual values to enhance the predictive accuracy of the LSTM-based simulator.
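The reported reductions follow directly from the best-version values in Table 3; a quick arithmetic check:

```python
best_f0 = {"mse": 0.4848, "dtw": 1.7716}   # best f_0 version (Table 3)
best_fa = {"mse": 0.6098, "dtw": 1.7219}   # best f_A version
best_fb = {"mse": 0.2165, "dtw": 1.1646}   # best f_B version

def reduction(old, new):
    """Percentage reduction from old to new."""
    return 100.0 * (old - new) / old

mse_vs_fa = reduction(best_fa["mse"], best_fb["mse"])   # ≈ 64.5 %
dtw_vs_fa = reduction(best_fa["dtw"], best_fb["dtw"])   # ≈ 32.4 %
mse_vs_f0 = reduction(best_f0["mse"], best_fb["mse"])   # ≈ 55.3 %
dtw_vs_f0 = reduction(best_f0["dtw"], best_fb["dtw"])   # ≈ 34.3 %
```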
4 DISCUSSION
Incorporating exogenous variables like temperature,
flow, maximum critical function value, and process
phases into the LSTM model significantly enhanced
the predictive accuracy and robustness of the wastew-
ater treatment simulator, as evidenced by lower MSE
and DTW values. These exogenous variables pro-
vided a more comprehensive representation of the
system, capturing critical aspects that state variables
alone could not, such as the impact of temperature
and flow on phosphorus removal. This allowed the
model to predict system behavior more accurately un-
der varying conditions.
Including exogenous state variables in the model
led to a significant reduction in MSE and DTW. The
enhanced model, especially when using actual exoge-
nous values at each simulation step, demonstrated ro-
bust performance, cutting MSE by 55% and DTW
by 34% compared to the model without these vari-
ables. This reduction indicates that the model’s abil-
ity to capture complex dynamics of the WWT process
improved, resulting in more precise simulations. The f_A model, which included exogenous state variables in the output, consistently showed improved performance across different months and seasons. This highlights the model's ability to predict these additional factors, enhancing its accuracy. Moreover, the f_B model, which incorporated exogenous state variables in the input but not in the output and used their actual values at each simulation step, outperformed the f_A model in all versions. This underscores the importance of exogenous state variables in representing the system's state and their independence from con-
Table 3: The average Mean Squared Error and Dynamic Time Warping data for the base model and improved versions during different months of the year. The best MSE and DTW values for each model and experiment are highlighted in bold and underlined, respectively.

Models   Base MSE  Base DTW  E1 MSE  E1 DTW  E2 MSE  E2 DTW  E3 MSE  E3 DTW  E4 MSE  E4 DTW
f_0      449.4486  56.1453   0.5684  1.8893  0.4848  1.8369  0.5027  1.7716  0.5096  1.9028
f_A      28.6279   11.1544   0.6536  1.7978  0.6981  1.8361  0.6098  1.7219  0.8137  2.0372
f_B      13.1450   9.8183    0.2165  1.1646  0.2436  1.2566  0.3496  1.5075  0.2788  1.3412
Average  163.7405  25.7060   0.4795  1.6172  0.4755  1.6432  0.4874  1.6670  0.5340  1.7604
Figure 2: The loss for the best experiments of all dataset types with or without the exogenous state variables.
Table 4: The parameters of the best improved checkpoints for each experiment. Ex.: Experiment number, Ep.: Improvement epochs, Min EL and Max EL: Minimum and Maximum episode length during the improvement, Loss F.: The improvement loss function, and Alpha: Alpha in the DILATE loss function.

Models  Ex.  Ep.  Min EL  Max EL  Loss F.  Alpha
f_0     E3   80   240     240     DILATE   0.6
f_A     E3   80   240     240     DILATE   0.6
f_B     E1   80   240     240     MSE      -
trol and objective variable changes. Using indepen-
dent exogenous state variables allows the model to
leverage real-time actual values from the dataset, re-
sulting in better simulation and accurately capturing
the system’s dynamics.
Using actual values of exogenous state variables
at each simulation step (model f_B) mitigated the ac-
Table 5: The average Mean Squared Error and Dynamic Time Warping data for each model in the different points of the year. The best MSE and DTW values for each point and model are highlighted in bold and underlined, respectively.

Points     f_0 MSE  f_0 DTW  f_A MSE  f_A DTW  f_B MSE  f_B DTW
September  0.9570   21.0640  1.3240   17.2590  0.8260   15.1950
December   0.6600   13.3860  0.7580   13.1930  0.2960   9.4360
March      0.5800   7.2210   0.5510   6.2090   0.1770   5.6840
June       0.7510   19.2260  1.0440   15.5430  0.4770   13.6430
Average    0.7370   14.8400  0.9190   13.0510  0.4440   11.3740
cumulation of prediction errors over multiple steps,
reducing MSE by 55% and DTW by 34%. This ap-
proach kept the model’s input closely aligned with re-
ality, enhancing overall simulation accuracy. By cor-
recting itself at each step, the model prevented errors
from propagating throughout the simulation.
Performance improvement was consistent
throughout one year in the collected dataset, with
Figure 3: The simulation for the best experiments of each model. The number of past actual values in the input shown by the
blue color decreases as the simulation reaches a point where all the input history is the predictions.
some variations observed. Notably, the model’s
performance improved in March, when the lowest
MSE and DTW values were recorded. This suggests
that the model can adapt to seasonal changes in the
wastewater treatment process. Seasonal variations
can affect inflow characteristics and process dynam-
ics. The model’s ability to maintain accuracy across
these variations is a testament to its effectiveness.
Finally, the iterative improvement method re-
ported in (Mohammadi et al., 2024a) enhanced the
accuracy of the models, bringing them closer to the
system’s dynamics. Including exogenous state vari-
ables at each step assists the model in reducing pre-
diction errors, which are often compounded from pre-
vious steps.
While including exogenous variables has clear
benefits, it also presents challenges. These variables
may not always be accurately measured or available in
real-time, potentially impacting model performance.
Additionally, the increased computational complex-
ity and need for large datasets could limit this ap-
proach’s applicability in smaller systems with limited
data. Despite these challenges, the method signifi-
cantly improves wastewater treatment simulations.
5 CONCLUSIONS
The study successfully demonstrated the advantages
of incorporating exogenous state variables into an
LSTM-based simulator for wastewater treatment. The
improved model, particularly the f_B model, significantly enhanced prediction accuracy and robustness. The key
conclusions are:
- Enhanced Accuracy: Including exogenous state variables markedly improved the model's accuracy, as evidenced by lower MSE and DTW values across the year.
- Error Mitigation: Using actual values of exogenous state variables at each simulation step reduced MSE by 55% and DTW by 34%, effectively mitigating the compounding of prediction errors and leading to more reliable simulations.
- Broad Applicability: The model demonstrated robust performance across different seasonal conditions, highlighting its potential applicability in diverse operational settings.
- Future Work: Future research could explore integrating more external factors and applying similar methods to other aspects of wastewater treatment. Additionally, attention-based models and GPU training, as suggested by (Mohammadi et al., 2024c), could enhance efficiency and reduce computational time.
The improved LSTM-based simulator represents
a significant advancement in wastewater treatment
modeling. It offers a powerful tool for optimiz-
ing control strategies and enhancing operational ef-
ficiency.
ACKNOWLEDGEMENTS
The RecaP project has received funding from the Eu-
ropean Union’s Horizon 2020 research and innovation
programme under the Marie Skłodowska-Curie grant
agreement No 956454. Disclaimer: This publication
reflects only the author’s view; the Research Execu-
tive Agency of the European Union is not responsible
for any use that may be made of this information.
REFERENCES
Brockman, G., Cheung, V., Pettersson, L., Schneider, J.,
Schulman, J., Tang, J., and Zaremba, W. (2016). OpenAI Gym. arXiv preprint arXiv:1606.01540.
Gao, P., Yang, X., Zhang, R., Guo, P., Goulermas, J. Y.,
and Huang, K. (2023). Egpde-net: Building contin-
uous neural networks for time series prediction with
exogenous variables.
Goodwin, G. C., Graebe, S. F., Salgado, M. E., et al. (2001).
Control system design, volume 240. Prentice Hall Up-
per Saddle River.
Gujer, W., Henze, M., Mino, T., Matsuo, T., Wentzel, M. C.,
and Marais, G. (1995). The activated sludge model no.
2: biological phosphorus removal. Water science and
technology, 31(2):1–11.
Hansen, L. D., Stentoft, P. A., Ortiz-Arroyo, D., and Dur-
devic, P. (2024). Exploring data quality and sea-
sonal variations of n2o in wastewater treatment: a
modeling perspective. Water Practice & Technology,
19(3):1016–1031.
Hansen, L. D., Stokholm-Bjerregaard, M., and Durdevic, P.
(2022). Modeling phosphorous dynamics in a wastew-
ater treatment process using bayesian optimized lstm.
Computers & Chemical Engineering, 160:107738.
Hespanha, J. P. (2018). Linear systems theory. Princeton
university press.
Hochreiter, S. and Schmidhuber, J. (1997a). Long Short-
Term Memory. Neural Computation, 9(8):1735–1780.
Hochreiter, S. and Schmidhuber, J. (1997b). Long short-
term memory. Neural Computation, 9:1735–1780.
Krüger A/S (2023). Hubgrade performance plant. https://www.kruger.dk/english/hubgrade-advanced-online-control. Accessed: 2023-11-30.
Le Guen, V. and Thome, N. (2019). Shape and time dis-
tortion loss for training deep time series forecasting
models. Advances in neural information processing
systems, 32.
Mohammadi, E., Ortiz-Arroyo, D., Stokholm-Bjerregaard,
M., Hansen, A. A., and Durdevic, P. (2024a). Im-
proved long short-term memory-based wastewater
treatment simulators for deep reinforcement learning.
arXiv preprint arXiv:2403.15091.
Mohammadi, E., Rani, A., Stokholm-Bjerregaard, M.,
Ortiz-Arroyo, D., and Durdevic, P. (2024b). Wastew-
ater treatment plant data for nutrient removal system.
Mohammadi, E., Stokholm-Bjerregaard, M., Hansen, A. A.,
Nielsen, P. H., Ortiz-Arroyo, D., and Durdevic, P.
(2024c). Deep learning based simulators for the
phosphorus removal process control in wastewater
treatment via deep reinforcement learning algorithms.
Engineering Applications of Artificial Intelligence,
133:107992.
Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J.,
Chanan, G., Killeen, T., Lin, Z., Gimelshein, N.,
Antiga, L., et al. (2019). Pytorch: An imperative style,
high-performance deep learning library. Advances in
neural information processing systems, 32.
Venkatraman, A., Hebert, M., and Bagnell, J. (2015). Im-
proving multi-step prediction of learned time series
models. In Proceedings of the AAAI Conference on
Artificial Intelligence, volume 29.