From ML2 to ML2+: Integrating Time Series Forecasting in
Model-Driven Engineering of Smart IoT Applications
Zahra Mardani Korani (1,4,a), Moharram Challenger (2,b), Armin Moin (3,c), João Carlos Ferreira (4,d),
Alberto Rodrigues da Silva (5,e), Gonçalo Vitorino Jesus (1,f), Elsa Lourenço Alves (1,g) and Ricardo Correia (6)
(1) LNEC, Portugal
(2) Department of Computer Science, University of Antwerp & Flanders Make, Belgium
(3) Department of Computer Science, University of Colorado, Colorado Springs, CO, U.S.A.
(4) ISCTE, Instituto Universitário de Lisboa (ISCTE-IUL), ISTAR, 1649-026 Lisbon, Portugal
(5) INESC-ID, Instituto Superior Técnico, Universidade de Lisboa, Portugal
(6) BioGHP, Portugal
{zmardani, gjesus, ealves}@lnec.pt, moharram.challenger@uantwerpen.be, amoin@uccs.edu,
Keywords:
Model-Driven Engineering, Machine Learning, IoT, Time Series Forecasting.
Abstract:
Time-series forecasting is essential for anomaly detection, predictive maintenance, and real-time
optimization in IoT environments, where sensor data is inherently sequential. However, most
model-driven engineering (MDE) frameworks lack specialized mechanisms to capture temporal
dependencies, which restricts the creation of intelligent and adaptive IoT systems. This paper
presents ML2+, an enhanced version of the ML-Quadrat framework that integrates software engineering
(SE) with machine learning (ML) in model-driven engineering. ML2+ allows users to define models,
things, and messages for time-series forecasting. We evaluated ML2+ through two IoT use cases,
focusing on development time, performance metrics, and lines of code (LOC). Results show that ML2+
maintains prediction accuracy similar to manual coding while reducing development time (from 14 to
11 hours and from 11.5 to 4 hours in the two use cases) by automating tedious tasks for developers.
By automating feature engineering, model training, and evaluation for time-series data, ML2+
streamlines forecasting and improves scalability. ML2+ supports various forecasting models,
including deep learning, statistical, and hybrid models. It offers preprocessing capabilities such
as handling missing data, creating lagged features, and detecting data seasonality. The tool
automatically generates code for time-series forecasting, making it easier for developers to train
and deploy ML models without writing the forecasting code by hand.
1 INTRODUCTION
Model-driven engineering for the Internet of Things (MDE4IoT) has gained significant attention in
recent years due to its ability to improve the efficiency, predictability, and maintainability of
IoT system development. Using models as the main artifacts for software design and implementation,
MDE4IoT offers a high-level abstract representation of the system, out of which the source code and
other artifacts can automatically be generated (Da Silva, 2015; Alulema et al., 2020). To further
enhance its capabilities, data analytics and machine learning (ML) techniques have been
incorporated into the development process (Moin et al., 2022b; Moin et al., 2022a; Kirchhof
et al., 2022). This integration has enabled the creation of more intelligent and adaptive IoT
systems. However, despite these advances, existing MDE4IoT frameworks often lack native,
out-of-the-box support for time series forecasting, an essential component for applications that
rely on sequential or temporal data.

a https://orcid.org/0000-0001-9144-7964
b https://orcid.org/0000-0002-5436-6070
c https://orcid.org/0000-0002-8484-7836
d https://orcid.org/0000-0002-6662-0806
e https://orcid.org/0000-0002-7900-9846
f https://orcid.org/0000-0002-8431-3877
g https://orcid.org/0000-0003-0937-7237

Korani, Z. M., Challenger, M., Moin, A., Ferreira, J. C., Rodrigues da Silva, A., Jesus, G. V., Alves, E. L. and Correia, R.
From ML2 to ML2+: Integrating Time Series Forecasting in Model-Driven Engineering of Smart IoT Applications.
DOI: 10.5220/0013443200003896
Paper published under CC license (CC BY-NC-ND 4.0)
In Proceedings of the 13th International Conference on Model-Based Software and Systems Engineering (MODELSWARD 2025), pages 458-465
ISBN: 978-989-758-729-0; ISSN: 2184-4348
Proceedings Copyright © 2025 by SCITEPRESS Science and Technology Publications, Lda.

Time-series forecasting focuses on predicting future values from historical observations indexed
in time. As most IoT applications produce time series data (for example, evolving sensor
readings), the ability to accurately model and forecast these trends is critical for real-time
monitoring, predictive maintenance, anomaly detection, and trend analysis (Adi et al., 2020;
Cruz-Nájera et al., 2022; Mardani Korani et al., 2023). Although previous research has explored
integrating ML into MDE4IoT workflows, a notable gap remains in the specialized support for
advanced time-series forecasting models. Previous works, such as GreyCat (Hartmann et al., 2019),
have integrated ML into domain models for the IoT. However, they lack dedicated mechanisms to
address the complexities inherent in sequential data, such as capturing temporal dependencies and
managing seasonality. Bridging this gap requires a systematic approach to modeling temporal
relationships, automating feature engineering for sequential data, and integrating these processes
into Model-Driven Engineering (MDE) workflows.

This paper introduces ML2+, an extension of the ML-Quadrat (ML2) framework (Moin et al., 2022a)
that tackles these challenges by embedding time-series forecasting capabilities into MDE4IoT.
Building upon ML2, an open-source Model-Driven Software Engineering (MDSE) tool that integrates ML
with IoT development, ML2+ augments the framework with functionalities for time-series
forecasting. It supports diverse models, including deep learning architectures like LSTM,
statistical models such as ARIMA and Prophet, machine learning models such as XGBoost, and hybrid
approaches, providing a versatile toolkit for handling temporal data within IoT applications. It
automatically generates Python code for data preprocessing (e.g., handling missing values,
creating lagged features, detecting seasonality), model training, and evaluation, thus reducing
the need for developers to have specialized expertise in time-series analysis. This automation
shortens development workflows, improves scalability, and lets developers integrate forecasting
features into their IoT solutions. By reducing manual coding, ML2+ allows developers to focus on
higher-level analytical tasks rather than low-level implementation details. The framework supports
configurations for multivariate analysis, seasonality detection, and stationarity checks, so that
generated models can be tuned to the data characteristics. As an open-source, community-driven
platform, ML2+ fosters collaboration and knowledge exchange, bridging a key research gap in
MDE4IoT and supporting time-series analysis for a wide range of IoT systems.

This paper is structured as follows: Section 2 reviews related work, while Section 3 presents our
proposed approach. Section 4 describes the validation and evaluation. Finally, Section 5
summarizes the study and outlines future work.
2 LITERATURE REVIEW
We briefly reviewed several related works in the
area of MDE for the IoT (MDE4IoT). ML-Quadrat
(ML2) (Moin et al., 2018; Moin et al., 2020; Moin,
2021; Moin et al., 2022a) integrated supervised, un-
supervised, and semi-supervised learning, improving
the development of IoT applications. However, it did
not explicitly address time-series forecasting, requiring developers to manually manage complex
temporal dependencies. Hartmann et al. introduced GreyCat (Hartmann et al., 2019), which
integrated ML techniques into domain models for IoT. Although GreyCat enabled predictions within
domain models, it did not emphasize the specialized models and preprocessing steps necessary for
time-series forecasting. Similarly, MoSIoT (Meliá et al., 2021) incorporated learning features
into the IoT scenario
models, but lacked dedicated support for advanced
time-series forecasting tasks. The Stratum plat-
form (Bhattacharjee et al., 2019) focused on an-
alytics management in IoT contexts, and Hartsell
et al. (Hartsell et al., 2019) proposed model-based
design techniques for cyber-physical systems. Al-
though these approaches showcased integrated ana-
lytics or active learning methods, they did not ad-
dress the unique requirements of time series fore-
casting, such as handling seasonality, performing sta-
tionarity checks, and performing lag feature engineer-
ing. Beyond MDE4IoT-specific tools, a range of ML
frameworks (e.g., TensorFlow (Abadi et al., 2015),
Keras (Chollet et al., 2015)) and workflow design-
ers (e.g., KNIME (Berthold et al., 2009), Rapid-
Miner (Mierswa et al., 2006)) offered high-level APIs
and flexible interfaces for ML. However, these did
not align fully with the model-driven engineering
paradigm and lacked direct mechanisms to integrate
time-series forecasting models into MDE4IoT arti-
facts. Interoperability standards such as PMML,
PFA, and ONNX (Bai et al., 2019) enhanced model
exchange but did not seamlessly incorporate temporal
modeling capabilities within domain-specific model-
ing languages (DSMLs) for IoT. In particular, while
the existing version of ML-Quadrat provided a strong
foundation for general ML tasks, it did not fully inte-
grate time-series forecasting. This gap remained a key
barrier for developers who needed to leverage fore-
casting within an MDE4IoT ecosystem. Our research
addressed this gap by enhancing ML-Quadrat with
ML2+. By incorporating advanced time-series fore-
casting models (e.g., LSTM, ARIMA, Prophet, XG-
Boost), automating data preprocessing, and stream-
lining model training, ML2+ enabled more effective
and efficient development of IoT applications. Fu-
ture work will validate ML2+ through extensive IoT
case studies, focusing on improvements in develop-
ment time, forecast accuracy, and overall scalability.
3 PROPOSED APPROACH
We enhanced the ML-Quadrat (ML2) (Moin et al.,
2022a) framework to ML2+ by integrating time-
series forecasting. This extension introduces time-
series modeling constructs, automated preprocessing
pipelines, and advanced forecasting methods within
the Model-Driven Engineering for the Internet of
Things (MDE4IoT) workflow, enabling effective han-
dling of sequential data essential for IoT applications
requiring temporal pattern analysis.
3.1 The Foundational Software Model
(SM)
The foundational Software Model (SM) (Moin et al.,
2022a) for IoT/CPS systems is formally defined as:
$SM = (A, \Psi, B, C)$ (1)
Where:
$A$: Annotations for external libraries and protocols.
$\Psi$: Structural Elements (e.g., Thing, Port, Message).
$B$: Behavioral Elements (e.g., finite-state machines or statecharts).
$C$: Configurations that manage settings, component instantiations, and connections.
Based on SM, we construct the Smart Software
Model (SSM) by incorporating a Domain Model (DM)
that specifies machine learning (ML) tasks in behav-
ioral elements:
$SSM = (A, \Psi, f_B(DM), C)$ (2)
Here, DM stands for Domain Model, which encompasses the relevant ML artifacts (such as algorithms,
hyperparameters, and data schemas). The function $f_B(DM)$ denotes the modifications to the
finite-state machines of SM that arise from the incorporation of ML. In practical terms, these
modifications can include transitions or states that depend on the predictions of the ML model
(e.g., switching to a warning state if a forecasted sensor value exceeds a threshold). This
approach ensures that the system's run-time behavior adapts to the results of data-driven models
(Moin et al., 2022a).
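As a minimal sketch of such a prediction-dependent transition, the guard can be written as a plain function. The state names, the `next_state` helper, and the threshold of 100.0 are hypothetical choices for illustration and are not part of ML2 or ML2+:

```python
from enum import Enum


class State(Enum):
    NORMAL = "normal"
    WARNING = "warning"


def next_state(forecast: float, threshold: float) -> State:
    """Return the state the machine moves to for a given forecast value.

    Hypothetical helper: the transition guard compares the ML forecast
    against a fixed threshold, as in the warning-state example in the text.
    """
    return State.WARNING if forecast > threshold else State.NORMAL


# Illustrative values only: three forecasts checked against threshold 100.0.
states = [next_state(f, threshold=100.0) for f in (80.0, 100.0, 120.5)]
```

Only a forecast strictly above the threshold triggers the warning state here; whether the boundary value itself should trigger it is a design choice left open by the text.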
To integrate time-series forecasting into the SSM, we refine $f_B(DM)$ to:
$f_B(DM_2, DM_{TS})$
Where:
$DM_2$: Standard ML tasks (e.g., classification or regression).
$DM_{TS}$: Time-series-specific constructs (e.g., ARIMA, LSTM, Prophet).
Thus, the SSM now becomes:
$SSM = (A, \Psi, f_B(DM_2, DM_{TS}), C)$ (3)
3.1.1 Time-Series Domain Model ($DM_{TS}$)
The time-series domain model, $DM_{TS}$, contains essential artifacts for forecasting tasks, such
as model architectures, parameters, features, hyperparameters, and metadata. Notably, the SSM is
initially untrained, meaning $DM_{TS}$ includes constructs for training but not the trained model
itself. Our implementation uses two separate datasets to train these ML models (see Section 4).
$DM_{TS} = (\upsilon_{TS}, P_{TS}, \Phi_{TS}, H_{TS}, I_{TS})$ (4)
Where:
$\upsilon_{TS}$: Model Architecture (e.g., ARIMA, LSTM) defining the structure of the forecasting approach.
$P_{TS}$: Model Parameters (e.g., ARIMA (p,d,q), LSTM units) detailing specific configuration values per architecture.
$\Phi_{TS}$: Time-Series Features derived from historical data (e.g., lagged observations, rolling windows).
$H_{TS}$: Hyperparameters (e.g., learning rate, epochs) guiding the training process.
$I_{TS}$: Metadata (e.g., data frequency, forecast horizon) providing contextual information for the time-series data.
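To make the five-tuple concrete, the following sketch mirrors it as a Python data class. The class name, field names, and the `is_complete` check are assumptions made for illustration; they are not taken from the ML2+ metamodel or its generated code:

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class TimeSeriesDomainModel:
    """Illustrative container mirroring the five components of DM_TS."""

    architecture: str                      # model architecture, e.g. "ARIMA" or "LSTM"
    parameters: Dict[str, Any]             # model parameters, e.g. ARIMA (p, d, q)
    features: List[str] = field(default_factory=list)              # time-series features
    hyperparameters: Dict[str, Any] = field(default_factory=dict)  # training settings
    metadata: Dict[str, Any] = field(default_factory=dict)         # frequency, horizon

    def is_complete(self) -> bool:
        """Hypothetical check: architecture, parameters, and horizon must be set."""
        return (
            bool(self.architecture)
            and bool(self.parameters)
            and "forecast_horizon" in self.metadata
        )


dm_ts = TimeSeriesDomainModel(
    architecture="ARIMA",
    parameters={"order": (1, 1, 1)},
    features=["lag_1", "lag_2", "lag_3"],
    metadata={"frequency": "daily", "forecast_horizon": 3},
)
```

The instance holds only the constructs needed for training, which matches the statement that the model is initially untrained.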
MBSE-AI Integration 2025 - 2nd Workshop on Model-based System Engineering and Artificial Intelligence
By incorporating $DM_{TS}$ into the SSM, we enable time-series forecasting within MDE4IoT
workflows. The following sections detail how ML2+ automates preprocessing, training, and
deployment for these models, minimizing manual effort while preserving flexibility for developers
and data scientists.
3.1.2 Model Validation and Code Generation
The validated and complete SSM models are converted into executable source code:
$V(SSM) \wedge C(SSM) \Rightarrow \mathcal{T}(SSM) = \text{full source code}$ (5)
Where:
$V(SSM)$: Validates the SSM.
$C(SSM)$: Checks the completeness of the SSM.
$\mathcal{T}$: The model-to-code transformation, which automates code generation and reduces manual effort.
The generated code supports statsmodels, xgboost, prophet, and pytorch, and integrates with
scikit-learn, keras-tensorflow, and weka.
3.1.3 Configurations Management
Configurations manage component instantiations and connections (Moin et al., 2022a):
$C_i = (A_{C_i}, \Theta, \Xi)$ (6)
where:
$A_{C_i}$: Annotations for component requirements.
$\Theta$: Instantiated components (e.g., data streams, preprocessing units, models).
$\Xi$: Communication connectors for data flow.
By maintaining abstraction, ML2+ ensures that
complex IoT architectures with forecasting capabili-
ties are manageable and maintainable.
3.1.4 Proposed Architecture Design
This study extends the ML-Quadrat (ML2) framework (Moin et al., 2022a) by integrating time-series
forecasting into Internet of Things (IoT) systems. Figure 1 illustrates the metamodel of our
architecture, highlighting the new time-series forecasting components in yellow, along with their
preprocessing steps (in red, within the Data Analytics component), assessment methods, and model
parameters. Green-marked components indicate enhancements and expanded machine learning
functionalities of ML-Quadrat (ML2). Using domain-specific modeling languages (DSMLs) and
model-driven engineering (MDE), ML2+ manages system complexity and automates development tasks
such as code generation and deployment (Mardani Korani et al., 2023).
3.1.5 Abstract Syntax of the DSML
The DSML for ML2+ is built on an Ecore metamodel
within the MDE4IoT framework, tailored for time-
series forecasting. It defines datasets with features,
labels, and temporal attributes, and includes prepro-
cessing configurations such as imputation and out-
lier removal. The language supports models such
as ARIMA, SARIMA, LSTM, and hybrids with cus-
tomizable parameters, and incorporates evaluation
metrics that include RMSE, MAE, and MSE.
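For reference, the three metrics have their standard definitions. Here $y_i$ denotes the observed value, $\hat{y}_i$ the forecast, and $n$ the number of test points; this notation is chosen for the sketch and is not taken from the ML2+ metamodel:

```latex
\begin{align*}
\mathrm{MSE}  &= \frac{1}{n}\sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2, \\
\mathrm{RMSE} &= \sqrt{\mathrm{MSE}}, \\
\mathrm{MAE}  &= \frac{1}{n}\sum_{i=1}^{n} \left\lvert y_i - \hat{y}_i \right\rvert .
\end{align*}
```

RMSE and MAE are expressed in the unit of the target variable, whereas MSE is in squared units.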
3.1.6 Concrete Syntax and Model Editors
The DSML's concrete syntax is designed for user-friendliness, facilitating interaction with
time-series forecasting models within the MDE4IoT framework.
Developed using Xtext grammar in Eclipse, it enables
structured model definitions. A cloud-based web ed-
itor has replaced the previous ML2 editor, offering
features such as syntax highlighting, autocompletion,
and customizable templates to enhance user experi-
ence.
3.1.7 Semantics and Model-to-Code
Transformations
The semantics of the DSML define the meaning and
behavior of models created using the language. It
includes model-to-code transformations that convert
high-level model descriptions into executable Python
and Java code. These transformations automatically
generate deployable scripts such as processing.py,
training.py, and predict.py, ensuring that the
generated code aligns with the model logic. Xtend-
based templates streamline this process, enabling
easy deployment on IoT devices and edge platforms.
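The idea of template-based generation can be sketched in a few lines. ML2+ itself uses Xtend templates; the Python version below, including the template text, the `generate_training_script` helper, and the dictionary keys, is a hypothetical stand-in that only illustrates how a model description is filled into a script skeleton:

```python
from string import Template

# Hypothetical template for a generated training script. The real ML2+
# generator uses Xtend templates; this only mimics the idea in Python.
TRAINING_TEMPLATE = Template(
    "# training.py (generated)\n"
    "MODEL = '$model'\n"
    "EPOCHS = $epochs\n"
    "FORECAST_HORIZON = $horizon\n"
)


def generate_training_script(model_spec: dict) -> str:
    """Fill the template from a model specification (a plain dict here)."""
    return TRAINING_TEMPLATE.substitute(
        model=model_spec["model"],
        epochs=model_spec["epochs"],
        horizon=model_spec["horizon"],
    )


script = generate_training_script({"model": "MLP", "epochs": 50, "horizon": 3})
```

Because `substitute` raises `KeyError` for a missing placeholder, an incomplete model description fails at generation time rather than producing a broken script.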
3.1.8 Supported Time-Series Forecasting
Methods and Techniques
ML2+ provides a comprehensive framework for
time-series forecasting, preprocessing, and eval-
uation. It supports datasets with features, labels,
and temporal configurations, using parameters like
common_period_threshold to align multivariate
IoT datasets by specifying the maximum allowable
missing values. Preprocessing capabilities include
imputation, outlier removal, and transforma-
tions such as lag features, rolling windows,
and resampling. Temporal data handling is con-
figurable, with options for forecasting horizons,
lag settings, seasonality detection, and
stationarity checks. The framework supports
deep learning models, including MLP, GRU, CNN,
LSTM, RNN, TCN, and Transformers, as well as
statistical models like ARIMA, SARIMA, Holt-
Winters, and ETS. Machine learning models such
as SVR, RFR, GBM, and XGBoost, along with
hybrid approaches like ARIMA_GARCH and Prophet,
are also included. Hyperparameter tuning tech-
niques, including grid search, random search,
and Bayesian optimization, ensure optimal model
performance. Model evaluation employs metrics
like RMSE, MAE, and MSE, with support for context-
specific performance analysis in domains such as
RiverFlow prediction and DataCenterManagement.
Domain-specific use cases, including RiverFlow,
ServerMonitoring, and IoTDataCenter, are
defined for specialized applications. The framework
integrates seamlessly with widely used libraries such
as statsmodels, xgboost, prophet, pytorch,
scikit-learn, keras-tensorflow, and weka,
enabling efficient data processing and accurate
predictions without extensive DSML or modeling
expertise.
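Two of the transformations named above, lag features and rolling windows, can be sketched with the standard library alone. The function names are hypothetical, and the code ML2+ generates may rely on other libraries:

```python
from statistics import mean
from typing import List, Tuple


def make_lagged_rows(series: List[float], n_lags: int) -> List[Tuple[List[float], float]]:
    """Turn a series into (lag features, target) pairs.

    For target y[t], the features are [y[t-1], ..., y[t-n_lags]].
    """
    rows = []
    for t in range(n_lags, len(series)):
        lags = [series[t - k] for k in range(1, n_lags + 1)]
        rows.append((lags, series[t]))
    return rows


def rolling_mean(series: List[float], window: int) -> List[float]:
    """Mean over each full trailing window of the given length."""
    return [
        mean(series[i - window + 1 : i + 1])
        for i in range(window - 1, len(series))
    ]


values = [1.0, 2.0, 3.0, 4.0, 5.0]
rows = make_lagged_rows(values, n_lags=2)   # first row: ([2.0, 1.0], 3.0)
smoothed = rolling_mean(values, window=3)   # [2.0, 3.0, 4.0]
```

The first `n_lags` observations have no complete lag vector and are dropped, which is why the supervised dataset is shorter than the original series.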
4 VALIDATION AND
EVALUATION
4.1 River Flow Forecasting
A case study was implemented to validate the proposed river flow prediction framework using real
multivariate datasets. The system uses the ML2+ framework and its Data Analytics and Machine
Learning (DAML) components to predict river discharge and support flood risk management in an IoT
environment. For example, to predict river flow for the next three days at the target station
Açude Ponte de Coimbra (12G/01AE), daily river flow data are collected from multiple monitoring
stations: Albufeira da Aguieira (11H/01A), Albufeira da Raiva (12H/01A), Albufeira de Fronhas
(12I/01A), and Açude Ponte de Coimbra (12G/01AE). The dataset spans from January 1, 1984, to
November 18, 2024, and consists of daily average effluent flow measurements (in m³/s) provided by
the Portuguese National Water Resources Information System (SNIRH) (System, ; Jesus et al., 2025).
Each station transmits its data daily to a central server, which is subsequently accessed by the
DAML server.

The workflow, orchestrated by the DAML components, automates the critical stages of the pipeline.
The data preprocessing step, executed using DA_Preprocess, prepares the collected multivariate
data by interpolating missing values, for gaps of at most 10 consecutive values, to ensure data
continuity. The dataset is resampled to align with a common period threshold and converted into a
supervised learning format by generating lagged features that capture temporal dependencies
across multiple stations, which is essential for accurate river flow forecasting. These
preprocessing steps are implemented using the Scikit-Learn library as part of the software model.
The preprocessed data are split chronologically into 80% for training and 20% for testing, so that
the sequential nature of the multivariate time-series data is preserved.

The training process is executed using DA_Train, which deploys a Multilayer Perceptron (MLP)
model configured with 50 neurons, 50 epochs, a batch size of 16, L2 regularization with a value of
0.01, a dropout rate of 0.2, and the ReLU activation function, with Mean Squared Error (MSE) as
the loss function. The model is implemented using the Keras library within the ML2+ framework.
Following the training phase, predictions are generated for the next three days at the target
station using DA_Predict, a component in the ML2+ framework that retrieves the latest input
features, performs predictions, and stores the forecasted values in predefined properties. The
predicted river flow values are compared against a predefined threshold of 150 m³/s. If the
predicted flow exceeds this threshold, a flood alert is triggered, allowing proactive
decision-making to mitigate potential risks.
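The gap-capped interpolation, the chronological split, and the alert rule described in this use case can be sketched as follows. This is a standard-library illustration under stated assumptions: the function names are hypothetical, linear interpolation is assumed because the text does not name the interpolation method, and the generated ML2+ code may differ:

```python
from typing import List, Optional, Tuple

FLOOD_THRESHOLD = 150.0  # m^3/s, the alert threshold stated in the text


def interpolate_capped(values: List[Optional[float]], max_gap: int = 10) -> List[Optional[float]]:
    """Linearly interpolate runs of None no longer than max_gap.

    Longer runs, and runs touching either end of the series, stay None.
    """
    out = list(values)
    i = 0
    while i < len(out):
        if out[i] is not None:
            i += 1
            continue
        start = i
        while i < len(out) and out[i] is None:
            i += 1
        gap = i - start
        if start > 0 and i < len(out) and gap <= max_gap:
            left, right = out[start - 1], out[i]
            for k in range(gap):
                out[start + k] = left + (right - left) * (k + 1) / (gap + 1)
    return out


def chronological_split(rows: list, train_fraction: float = 0.8) -> Tuple[list, list]:
    """Split without shuffling so the test set follows the training set in time."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def flood_alerts(forecasts: List[float]) -> List[bool]:
    """Flag each forecast that exceeds the flood threshold."""
    return [f > FLOOD_THRESHOLD for f in forecasts]


filled = interpolate_capped([100.0, None, None, 130.0])  # [100.0, 110.0, 120.0, 130.0]
train, test = chronological_split(list(range(10)))       # first 8 rows, last 2 rows
alerts = flood_alerts([120.0, 151.5, 149.9])             # [False, True, False]
```

Leaving long gaps unfilled keeps the model from training on long stretches of synthetic values; rows that still contain missing entries would then need to be dropped before training.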
4.2 Energy Consumption Forecasting
A case study was conducted to evaluate the proposed energy consumption forecasting system using
real-world data from the UCI Machine Learning Repository (UCI Machine Learning Repository, ). The
goal is to predict the total energy consumption for the next three hours in a smart home setting.
The dataset spans from December 2006 to November 2010 and provides second-level measurements of
electric power consumption in kilowatts (kW). The data include total energy usage as the target
variable and time-indexed features, such as appliance-specific consumption (e.g., dishwasher,
refrigerator), and additional contextual factors like temperature when available. In this setup,
data are transmitted to a central server every second, supporting real-time monitoring and
prediction. However, to align with the goal of forecasting total energy consumption over the next
three hours, the second-level data are aggregated into hourly intervals during preprocessing.
Hourly resampling reduces noise, highlights longer-term trends, and lowers the computational cost.
The aggregated data are then accessed by the DAML server, which orchestrates the workflow to
automate the critical stages of data preparation, model training, and forecasting.

[Figure 1: Meta-Model Diagram of the Enhanced ML2+ Framework, highlighting new components in
yellow. Classes shown in the diagram: ThingMLModel, Import, PlatformAnnotation, Thing, Protocol,
Property, Message, Parameter, Port, DataAnalytics (with preprocessing attributes such as
lagged_features, rolling_window_features, resampling, seasonality_detection, and
hyperparameter_tuning), DataAnalyticsModelAlgorithm, Time_series_ModelAlgorithm (deep learning,
statistical, machine learning, and hybrid variants), GRU, LSTM, ARIMA, XGBoost, ARIMA_GARCH,
Prophet, ML2_ModelAlgorithm, PMML_ModelAlgorithm, PFA_ModelAlgorithm, NN_MultilayerPerceptron,
Action (DASaveAction, DAPreprocessAction, DATrainAction, DAPredictAction,
DAPreTrainedPredictAction, DAForecastAction), Configuration, and ConfigPropertyAssign.]

The workflow begins with DA_Preprocess, which automates preprocessing tasks, including
resampling, interpolation of missing data, generation of lagged features to capture temporal
dependencies, and scaling of variables to ensure consistency. During resampling, the dataset is
transformed from second-level granularity to hourly intervals by taking the mean energy usage
within each hour. Missing values were imputed using interpolation to ensure data continuity.
Lagged versions of the use [kW] variable were generated for the last three hours (t−1, t−2, t−3)
to incorporate temporal dependencies. Features were normalized using Min-Max scaling to bring
values into the range [0, 1], and outliers were removed using the Z-score method to improve model
robustness. Stationarity of the time series was verified using the Augmented Dickey-Fuller (ADF)
test, and non-stationary series were differenced as needed to remove trends and stabilize the
mean. The dataset was divided into 80% training data and 20% testing data, maintaining the
temporal order of the time series to preserve sequence relationships.

For model training, the DA_Train component was used to train two forecasting models: Holt-Winters
(Triple Exponential Smoothing) and SARIMA. The Holt-Winters model was configured to capture the
level, trend, and seasonality of energy usage, with additive components and a seasonal period of
24 (a daily cycle in hourly data). The SARIMA model, which extends ARIMA by incorporating seasonal
patterns, was configured with a non-seasonal order of (1,1,1) and a seasonal order of (1,1,1,24).
Seasonal decomposition was used to identify suitable seasonal parameters for the SARIMA model, and
hyperparameter tuning was conducted to optimize model performance. A library such as Statsmodels
was used within the ML2+ framework for model implementation and evaluation.

Once the models were trained, the DA_Predict component was used to generate predictions for the
next three hours. The DAML server retrieved the forecast input data, aggregated it into hourly
intervals, fed it into the trained models, and stored the predicted energy usage in predefined
properties. For proactive energy management, an alert threshold of 5 kW was defined. If the
predicted total energy consumption exceeded this threshold, an alert was triggered, allowing
real-time adjustments or interventions in the smart home environment.
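The hourly aggregation and Min-Max scaling steps of this use case can be sketched with the standard library. The helper names and the sample readings are hypothetical and serve only to show the arithmetic; they are not taken from the dataset or from the generated code:

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean
from typing import Dict, List, Tuple


def hourly_mean(readings: List[Tuple[datetime, float]]) -> Dict[datetime, float]:
    """Average second-level readings (kW) within each clock hour."""
    buckets = defaultdict(list)
    for ts, kw in readings:
        buckets[ts.replace(minute=0, second=0, microsecond=0)].append(kw)
    return {hour: mean(vals) for hour, vals in sorted(buckets.items())}


def min_max_scale(values: List[float]) -> List[float]:
    """Scale values into [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Made-up readings for illustration: two in the first hour, one in the second.
readings = [
    (datetime(2010, 1, 1, 0, 0, 1), 1.0),
    (datetime(2010, 1, 1, 0, 30, 0), 3.0),
    (datetime(2010, 1, 1, 1, 15, 0), 6.0),
]
hourly = hourly_mean(readings)                   # hourly means: 2.0 and 6.0
scaled = min_max_scale(list(hourly.values()))    # [0.0, 1.0]
```

In a chronological train/test setting, the minimum and maximum used for scaling would normally be taken from the training portion only, so that no information from the test period leaks into the features.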
4.3 Comparative Analysis and Metrics
The evaluation presented in this paper was conducted by a user with medium-level experience in
Python coding and a good understanding of how to write model instances. Forecast performance was
assessed using Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE). RMSE penalizes larger
errors, making it suitable for identifying significant deviations, while MAE provides a
straightforward measure of the average error magnitude. For the River Flow Forecasting use case,
ML2+ achieved RMSE values of 50.75 for t+0, 72.66 for t+1, and 86.65 for t+2, identical to those
obtained through manual coding. This shows that automation using ML2+ does not compromise
accuracy. Additionally, ML2+ reduced the development time from 14 to 11 hours, although the lines
of code (LOC) increased from 80 to 210 due to the generation of a complete software model
instance. In the Energy Consumption Forecasting use case, RMSE values were 0.2301 for Holt-Winters
and 0.1941 for SARIMA, with corresponding MAE values of 0.1882 and 0.1553. Using ML2+, the
development time decreased from 11.5 hours to 4 hours, while the LOC increased from 64 and 60 to
200 and 205 for the respective models.

ML2+ automates the generation of a complete software model instance, encompassing all system
elements such as Things, Ports, Messages, Properties, Statecharts, and Configurations. The user
defines the model instance through a web-based editor and a template with predefined structures
for Things, Ports, Messages, and workflows. This template-based approach simplifies the process:
the user modifies specific parameters and attributes, such as hyperparameters, model types, or
message formats, rather than constructing the entire model from scratch. The built-in automation
in ML2+ manages repetitive tasks such as preprocessing steps, model training, and evaluation
workflows, allowing the user to focus on adapting the provided templates to their specific
requirements. Consequently, ML2+ not only reduces development time but also minimizes manual
effort, making it a valuable tool for users aiming to streamline the model development process.
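The reported development times translate into relative savings that are easy to check. The sketch below also includes plain implementations of RMSE and MAE; the helper names are chosen for illustration, and only the hour figures come from the text:

```python
from math import sqrt
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error of paired observations and forecasts."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error of paired observations and forecasts."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def relative_reduction(before: float, after: float) -> float:
    """Percentage by which `after` is lower than `before`."""
    return 100.0 * (before - after) / before


# Development times reported in the text (hours).
river_flow_saving = relative_reduction(14.0, 11.0)  # about 21.4 %
energy_saving = relative_reduction(11.5, 4.0)       # about 65.2 %
```

The two use cases therefore differ considerably in how much time the templates saved, which is worth keeping in mind when generalizing from two data points.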
5 CONCLUSION AND FUTURE
WORK
This paper introduced ML2+, an enhanced frame-
work that integrates comprehensive time-series fore-
casting functionalities into the MDE4IoT paradigm.
By automating data preprocessing, feature engineering, model training, and evaluation, ML2+
addresses a crucial gap in existing MDE4IoT frameworks, allowing developers to handle sequential
data without specialized time-series expertise. The evaluation on river flow and energy
consumption scenarios shows that ML2+ maintains predictive accuracy comparable to manual coding
approaches while reducing development time. The framework's automation and abstractions reduce
the probability of human error, improve reliability, and simplify maintenance. ML2+ thus bridges
the gap between MDE and ML for time series forecasting, positioning itself as a valuable tool for
developing intelligent, scalable, and adaptable IoT applications. As an open-source, community-driven
platform, ML2+ will evolve with user feedback and
emerging technologies, such as advanced ML mod-
els, ensuring it remains a key enabler of intelligent
IoT ecosystems. ML2+ has demonstrated its potential
to automate time-series forecasting in AIoT contexts.
Future improvements will focus on enhanced visual-
ization capabilities that allow users to inspect prepro-
cessing steps, monitor training progress, and interpret
forecast results interactively. We will gather feed-
back from IoT developers, data scientists, and MDE
practitioners to refine ML2+, ensuring that it aligns
closely with user needs and workflows. Broader evaluations
are planned across additional domains, including
smart cities, healthcare, and industrial IoT, to validate
ML2+ under diverse data conditions. Furthermore,
we aim to integrate ML2+ with edge computing
strategies, enabling efficient model updates on the
fly without extensive code rewrites. This adaptability
is vital for continuously evolving IoT environments,
ensuring that the models remain accurate and rele-
vant. Community involvement and open collabora-
tion are also priorities. By fostering a user commu-
nity that shares best practices, templates, and exten-
sions, ML2+ can continuously evolve and remain at
the forefront of MDE4IoT advancements.
ACKNOWLEDGMENTS
This work is supported by several projects, in-
cluding Blockchain PT (PRR–RE-C05-i01.02:
Agendas/Alianças Verdes para a Inovação Empresarial),
CONNECT 22050 (Local Coastal Monitoring
Service for Portugal), and ATTRACT-DIH (Digital
Innovation Hub for Artificial Intelligence and High-
Performance Computing, funded under the Digital
European Programme, Grant 101083770).
REFERENCES
Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean,
J., Devin, M., Ghemawat, S., Irving, G., Isard, M.,
et al. (2015). Tensorflow: Large-scale machine learn-
ing on heterogeneous systems. Software available
from tensorflow.org.
Adi, E., Anwar, A., Baig, Z., and Zeadally, S. (2020). Ma-
chine learning and data analytics for the iot. Neural
Computing and Applications, 32(20):16205–16233.
Alulema, D., Criado, J., Iribarne, L., Fern
´
andez-Garc
´
ıa,
A. J., and Ayala, R. (2020). A model-driven engineer-
MBSE-AI Integration 2025 - 2nd Workshop on Model-based System Engineering and Artificial Intelligence
464
ing approach for the service integration of iot systems.
Cluster Computing, 23(3):1937–1954.
Bai, H., Breuel, T. M., et al. (2019). Onnx: Open neural
network exchange. GitHub Repository. Available at:
https://onnx.ai/.
Berthold, M. R., Cebron, N., Dill, F., Gabriel, T. R.,
K
¨
otter, T., Meinl, T., Ohl, P., Sieb, C., Thiel, K., and
Wiswedel, B. (2009). Knime: The konstanz informa-
tion miner. In Data Analysis, Machine Learning and
Applications, pages 319–326. Springer.
Bhattacharjee, A., Barve, Y., Khare, S., Bao, S., Kang, Z.,
Gokhale, A., and Damiano, T. (2019). Stratum: A
bigdata-as-a-service for lifecycle management of iot
analytics applications. In 2019 IEEE International
Conference on Big Data (Big Data), pages 1607–
1612. IEEE.
Bredereke, J., Morin, B., et al. (2013). Models@runtime
to support dynamic adaptation. Software and Systems
Modeling, 12:159–168.
Chollet, F. et al. (2015). Keras.
Cruz-N
´
ajera, M. A., Trevi
˜
no-Berrones, M. G., Ponce-
Flores, M. P., Ter
´
an-Villanueva, J. D., Cast
´
an-Rocha,
J. A., Ibarra-Mart
´
ınez, S., Santiago, A., and Laria-
Menchaca, J. (2022). Short time series forecasting:
Recommended methods and techniques. Symmetry,
14(6):1231.
Da Silva, A. R. (2015). Model-driven engineering: A sur-
vey supported by the unified conceptual model. Com-
puter Languages, Systems & Structures, 43:139–155.
Guazzelli, A., Zeller, M., Lin, W.-C., and Williams, G.
(2009). Pmml: An open standard for sharing models.
The R Journal, 1(1):60–65.
Harrand, N., Fleurey, F., Morin, B., and Husa, K. (2016).
Thingml: A language and code generation frame-
work for heterogeneous targets. Proceedings of the
ACM SIGPLAN International Conference on Model
Driven Engineering Languages and Systems (MOD-
ELS), pages 125–135.
Hartmann, T., Moawad, A., Fouquet, F., and Le Traon, Y.
(2019). The next evolution of mde: a seamless in-
tegration of machine learning into domain modeling.
Software & Systems Modeling, 18(2):1285–1304.
Hartsell, C., Mahadevan, N., Ramakrishna, S., Dubey, A.,
Bapty, T., Johnson, T., Koutsoukos, X., Sztipanovits,
J., and Karsai, G. (2019). Model-based design for cps
with learning-enabled components. In Proceedings of
the Workshop on Design Automation for CPS and IoT,
pages 1–9.
Jesus, G., Mardani, Z., Alves, E., and Oliveira, A. (2025).
Using deep learning for tejo river flow forecasting.
Submitted to Sensors. Under review.
Kirchhof, J. C., Kusmenko, E., Ritz, J., Rumpe, B., Moin,
A., Badii, A., G
¨
unnemann, S., and Challenger, M.
(2022). Mde for machine learning-enabled software
systems: a case study and comparison of montianna
& ml-quadrat. In Proceedings of the 25th Interna-
tional Conference on Model Driven Engineering Lan-
guages and Systems: Companion Proceedings, MOD-
ELS ’22, page 380–387, New York, NY, USA. Asso-
ciation for Computing Machinery.
Mardani Korani, Z., Moin, A., Rodrigues da Silva, A., and
Ferreira, J. C. (2023). Model-driven engineering tech-
niques and tools for machine learning-enabled iot ap-
plications: A scoping review. Sensors, 23(3).
Meli
´
a, S., Nasabeh, S., Luj
´
an-Mora, S., and Cachero,
C. (2021). Mosiot: Modeling and simulating iot
healthcare-monitoring systems for people with dis-
abilities. International Journal of Environmental Re-
search and Public Health, 18(12):6357.
Mierswa, I., Wurst, M., Klinkenberg, R., Scholz, M., and
Euler, T. (2006). Rapidminer: An open source sys-
tem for knowledge discovery in large data sets. In
Proceedings of the NIPS ML Open Source Software
Workshop.
Minka, T. P., Winn, J., Guiver, J., and Knowles, D. (2018).
Infer.net: A framework for running bayesian inference
in graphical models. Journal of Machine Learning
Research, 18:1–5.
Moin, A. (2021). Data analytics and machine learning
methods, techniques and tool for model-driven engi-
neering of smart iot services. In 2021 IEEE/ACM 43rd
International Conference on Software Engineering:
Companion Proceedings (ICSE-Companion), pages
287–292.
Moin, A., Challenger, M., Badii, A., and G
¨
unnemann, S.
(2022a). A model-driven approach to machine learn-
ing and software modeling for the iot. Software and
Systems Modeling, 21(3):987–1014.
Moin, A., Mituca, A., Challenger, M., Badii, A., and
G
¨
unnemann, S. (2022b). Ml-quadrat & driotdata: a
model-driven engineering tool and a low-code plat-
form for smart iot services. In Proceedings of the
ACM/IEEE 44th International Conference on Soft-
ware Engineering: Companion Proceedings, ICSE
’22, page 144–148, New York, NY, USA. Association
for Computing Machinery.
Moin, A., R
¨
ossler, S., and G
¨
unnemann, S. (2018).
Thingml+: Augmenting model-driven software engi-
neering for the internet of things with machine learn-
ing. In Proceedings of MODELS 2018 Workshops,
Copenhagen, Denmark, October, 14, 2018, volume
2245 of CEUR Workshop Proceedings, pages 521–
523. CEUR-WS.org.
Moin, A., R
¨
ossler, S., Sayih, M., and G
¨
unnemann, S.
(2020). From things’ modeling language (thingml) to
things’ machine learning (thingml2). MODELS ’20,
New York, NY, USA. ACM.
Morin, B., Barais, O., Fleurey, F., et al. (2016). Heads: A
holistic approach for the development of distributed
heterogeneous and adaptive systems. In International
Conference on Model Driven Engineering Languages
and Systems (MODELS), pages 92–101. Springer.
Open Data Group (2016). Portable format for analytics
(pfa). Available at: http://dmg.org/pfa/.
System, P. N. W. R. I. Snirh portal. https://snirh.
apambiente.pt/snirh/. Accessed on [2024].
UCI Machine Learning Repository. Individual
household electric power consumption dataset.
https://archive.ics.uci.edu/ml/datasets/individual+
household+electric+power+consumption. Accessed:
[2024].