From ML2 to ML2+: Integrating Time Series Forecasting in Model-Driven Engineering of Smart IoT Applications

Zahra Mardani Korani [1,4; a], Moharram Challenger [2; b], Armin Moin [3; c], João Carlos Ferreira [4; d], Alberto Rodrigues da Silva [5; e], Gonçalo Vitorino Jesus [1; f], Elsa Lourenço Alves [1; g] and Ricardo Correia

[1] LNEC, Portugal
[2] Department of Computer Science, University of Antwerp & Flanders Make, Belgium
[3] Department of Computer Science, University of Colorado, Colorado Springs, CO, U.S.A.
[4] ISCTE, Instituto Universitário de Lisboa (ISCTE-IUL), ISTAR, 1649-026 Lisbon, Portugal
[5] INESC-ID, Instituto Superior Técnico, Universidade de Lisboa, Portugal
[6] BioGHP, Portugal

{zmardani, gjesus, ealves}@lnec.pt, moharram.challenger@uantwerpen.be, amoin@uccs.edu,

Keywords:

Model-Driven Engineering, Machine Learning, IoT, Time Series Forecasting.

Abstract:

Time-series forecasting is essential for anomaly detection, predictive maintenance, and real-time optimization in IoT environments, where sensor data is inherently sequential. However, most model-driven engineering (MDE) frameworks lack specialized mechanisms to capture temporal dependencies and do not support time-series forecasting, which restricts the creation of intelligent and adaptive IoT systems. This paper presents ML2+, an enhanced version of the ML-Quadrat framework that integrates software engineering (SE) with machine learning (ML) in model-driven engineering. ML2+ allows users to define models, things, and messages for time-series forecasting. We evaluated ML2+ through two IoT use cases, focusing on development time, performance metrics, and lines of code (LOC). Results show that ML2+ maintains prediction accuracy similar to manual coding while reducing development time (from 14 to 11 hours and from 11.5 to 4 hours in the two use cases) by automating tedious tasks for developers. By automating feature engineering, model training, and evaluation for time-series data, ML2+ streamlines forecasting and improves scalability. ML2+ supports various forecasting models, including deep learning, statistical, and hybrid models. It offers preprocessing capabilities such as handling missing data, creating lagged features, and detecting data seasonality. The tool automatically generates code for time-series forecasting, making it easier for developers to train and deploy ML models without writing that code by hand.

1 INTRODUCTION

Model-driven engineering for the Internet of Things (MDE4IoT) has gained significant attention in recent years due to its ability to improve the efficiency, predictability, and maintainability of IoT system development. Using models as the main artifacts for software design and implementation, MDE4IoT offers a high-level abstract representation of the system, out of which the source code and other artifacts can automatically be generated (Da Silva, 2015; Alulema et al., 2020). To further enhance its capabilities, data analytics and machine learning (ML) techniques have been incorporated into the development process (Moin et al., 2022b; Moin et al., 2022a; Kirchhof et al., 2022). This integration has enabled the creation of more intelligent and adaptive IoT systems. However, despite these advances, existing MDE4IoT frameworks often lack native, out-of-the-box support for time series forecasting, an essential component for applications that rely on sequential or temporal data.

[a] https://orcid.org/0000-0001-9144-7964
[b] https://orcid.org/0000-0002-5436-6070
[c] https://orcid.org/0000-0002-8484-7836
[d] https://orcid.org/0000-0002-6662-0806
[e] https://orcid.org/0000-0002-7900-9846
[f] https://orcid.org/0000-0002-8431-3877
[g] https://orcid.org/0000-0003-0937-7237


Korani, Z. M., Challenger, M., Moin, A., Ferreira, J. C., Rodrigues da Silva, A., Jesus, G. V., Alves, E. L. and Correia, R.

From ML2 to ML2+: Integrating Time Series Forecasting in Model-Driven Engineering of Smart IoT Applications.

DOI: 10.5220/0013443200003896

Paper published under CC license (CC BY-NC-ND 4.0)

In Proceedings of the 13th International Conference on Model-Based Software and Systems Engineering (MODELSWARD 2025), pages 458-465

ISBN: 978-989-758-729-0; ISSN: 2184-4348

Time-series forecasting focuses on predicting future values from historical observations indexed in time. As most IoT applications produce time series data (for example, evolving sensor readings), the ability to accurately model and forecast these trends is critical for real-time monitoring, predictive maintenance, anomaly detection, and trend analysis (Adi et al., 2020; Cruz-Nájera et al., 2022; Mardani Korani et al., 2023). Although previous research has explored integrating ML into MDE4IoT workflows, a notable gap remains in the specialized support for advanced time-series forecasting models. Previous works, such as GreyCat (Hartmann et al., 2019), have attempted to integrate ML into domain models for the IoT. However, they lack dedicated support for the complexities inherent in sequential data, such as capturing temporal dependencies and managing seasonality. Bridging this gap presents significant challenges, necessitating a systematic approach to modeling temporal relationships, automating feature engineering for sequential data, and seamlessly integrating these processes into Model-Driven Engineering (MDE) workflows.

This paper introduces ML2+, an extension of the ML-Quadrat (ML2) framework (Moin et al., 2022a) that directly tackles these challenges by embedding comprehensive time-series forecasting capabilities into MDE4IoT. Building upon ML2, an open-source Model-Driven Software Engineering (MDSE) tool that integrates ML with IoT development, ML2+ augments the framework with functionalities for time-series forecasting. It leverages diverse models, including deep learning architectures like LSTM, statistical models such as ARIMA and Prophet, and hybrid approaches such as XGBoost, providing a versatile toolkit for developers to handle temporal data within IoT applications. It automatically generates Python code for data preprocessing (e.g., handling missing values, creating lagged features, detecting seasonality), model training, and evaluation, thus eliminating the need for developers to have specialized expertise in time-series analysis. This automation optimizes development workflows, improves scalability, and enables developers to integrate advanced forecasting features into their IoT solutions. By reducing manual coding, ML2+ allows developers to focus on higher-level analytical tasks rather than low-level implementation details. The framework supports configurations for multivariate analysis, seasonality detection, and stationarity checks, so that generated models are tuned to the data characteristics. As an open-source, community-driven platform, ML2+ fosters collaboration and knowledge exchange, ultimately bridging a key research gap in MDE4IoT and supporting sophisticated time-series analysis for a wide range of IoT systems.

This paper is structured as follows: Section 2 examines the latest developments in the field, while Section 3 presents our proposed solution. Section 4 describes the validation and evaluation. Finally, Section 5 summarizes the study and outlines future work.

2 LITERATURE REVIEW

We briefly reviewed several related works in the area of MDE for the IoT (MDE4IoT). ML-Quadrat (ML2) (Moin et al., 2018; Moin et al., 2020; Moin, 2021; Moin et al., 2022a) integrated supervised, unsupervised, and semi-supervised learning, improving the development of IoT applications. However, it did not explicitly address time-series forecasting, requiring developers to manually manage complex temporal dependencies. Hartmann et al. introduced GreyCat (Hartmann et al., 2019), which integrated ML techniques into domain models for IoT. Although GreyCat enabled predictions within domain models, it did not emphasize the specialized models and preprocessing steps necessary for time-series forecasting. Similarly, MoSIoT (Meliá et al., 2021) incorporated learning features into the IoT scenario models, but lacked dedicated support for advanced time-series forecasting tasks. The Stratum platform (Bhattacharjee et al., 2019) focused on analytics management in IoT contexts, and Hartsell et al. (Hartsell et al., 2019) proposed model-based design techniques for cyber-physical systems. Although these approaches showcased integrated analytics or active learning methods, they did not address the unique requirements of time series forecasting, such as handling seasonality, performing stationarity checks, and engineering lag features.

Beyond MDE4IoT-specific tools, a range of ML frameworks (e.g., TensorFlow (Abadi et al., 2015), Keras (Chollet et al., 2015)) and workflow designers (e.g., KNIME (Berthold et al., 2009), RapidMiner (Mierswa et al., 2006)) offered high-level APIs and flexible interfaces for ML. However, these did not align fully with the model-driven engineering paradigm and lacked direct mechanisms to integrate time-series forecasting models into MDE4IoT artifacts. Interoperability standards such as PMML, PFA, and ONNX (Bai et al., 2019) enhanced model exchange but did not incorporate temporal modeling capabilities within domain-specific modeling languages (DSMLs) for IoT.

In particular, while the existing version of ML-Quadrat provided a strong foundation for general ML tasks, it did not fully integrate time-series forecasting. This gap remained a key barrier for developers who needed to leverage forecasting within an MDE4IoT ecosystem. Our research addressed this gap by enhancing ML-Quadrat with ML2+. By incorporating advanced time-series forecasting models (e.g., LSTM, ARIMA, Prophet, XGBoost), automating data preprocessing, and streamlining model training, ML2+ enabled more effective and efficient development of IoT applications. Future work will validate ML2+ through extensive IoT case studies, focusing on improvements in development time, forecast accuracy, and overall scalability.

3 PROPOSED APPROACH

We enhanced the ML-Quadrat (ML2) (Moin et al., 2022a) framework to ML2+ by integrating time-series forecasting. This extension introduces time-series modeling constructs, automated preprocessing pipelines, and advanced forecasting methods within the Model-Driven Engineering for the Internet of Things (MDE4IoT) workflow, enabling effective handling of sequential data essential for IoT applications requiring temporal pattern analysis.

3.1 The Foundational Software Model (SM)

The foundational Software Model (SM) (Moin et al., 2022a) for IoT/CPS systems is formally defined as:

\[ SM = (A, \Psi, B, C) \tag{1} \]

Where:

• A: Annotations for external libraries and protocols.

• Ψ: Structural Elements (e.g., Thing, Port, Message).

• B: Behavioral Elements (e.g., finite-state machines or statecharts).

• C: Configurations that manage settings, component instantiations, and connections.

Based on SM, we construct the Smart Software Model (SSM) by incorporating a Domain Model (DM) that specifies machine learning (ML) tasks in behavioral elements:

\[ SSM = (A, \Psi, f(DM), C) \tag{2} \]

Here, DM stands for Domain Model, which encompasses the relevant ML artifacts (such as algorithms, hyperparameters, and data schemas). The function f(DM) indicates the behavior modifications to the finite-state machines of SM that arise from the incorporation of ML. In practical terms, these modifications can include transitions or states that depend on the predictions of the ML model (e.g., switching to a warning state if a forecast sensor value exceeds a threshold). This approach ensures that the system's run-time behavior is adapted based on the results of data-driven models (Moin et al., 2022a).
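The effect of f(DM) on a finite-state machine can be illustrated with a minimal Python sketch. It is not ML2+ code: the class `MonitorStateMachine`, the method `on_forecast`, and the threshold value are hypothetical names chosen for illustration, and the ML model is stood in for by a plain forecast number.

```python
from dataclasses import dataclass


@dataclass
class MonitorStateMachine:
    """Hypothetical two-state machine whose transitions depend on a forecast."""

    threshold: float
    state: str = "NORMAL"

    def on_forecast(self, forecast_value: float) -> str:
        # Transition guard: the predicted value decides the next state.
        if forecast_value > self.threshold:
            self.state = "WARNING"
        else:
            self.state = "NORMAL"
        return self.state


# Feed three illustrative forecast values into the machine.
machine = MonitorStateMachine(threshold=30.0)
states = [machine.on_forecast(value) for value in (25.0, 31.5, 28.0)]
print(states)
```

The guard on the transition is the only place where the prediction enters the behavior, which mirrors the idea that ML modifies the statechart rather than replacing it.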

To integrate time-series forecasting into the SSM, we refine f(DM) to:

\[ f'(DM, DM_{TS}) \]

Where:

• DM: Standard ML tasks (e.g., classification or regression).

• DM_TS: Time-series-specific constructs (e.g., ARIMA, LSTM, Prophet).

Thus, the SSM now becomes:

\[ SSM = (A, \Psi, f'(DM, DM_{TS}), C) \tag{3} \]

3.1.1 Time-Series Domain Model (DM_TS)

The time-series domain model, DM_TS, contains essential artifacts for forecasting tasks, such as model architectures, parameters, features, hyperparameters, and metadata. Notably, the SSM is initially untrained, meaning DM_TS includes constructs for training but not the trained model itself. Our implementation uses two separate datasets to train these ML models.

\[ DM_{TS} = (\upsilon_{TS}, P_{TS}, \Phi_{TS}, H_{TS}, I_{TS}) \tag{4} \]

Where:

• υ_TS: Model Architecture (e.g., ARIMA, LSTM) defining the structure of the forecasting approach.

• P_TS: Model Parameters (e.g., ARIMA (p, d, q), LSTM units) detailing specific configuration values per architecture.

• Φ_TS: Time-Series Features derived from historical data (e.g., lagged observations, rolling windows).

• H_TS: Hyperparameters (e.g., learning rate, epochs) guiding the training process.

• I_TS: Metadata (e.g., data frequency, forecast horizon) providing contextual information for the time-series data.
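The five components listed above can be pictured as a simple record. The following Python sketch is only an illustration under assumed names: `TimeSeriesDomainModel` and its field names are hypothetical and do not come from the ML2+ metamodel.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class TimeSeriesDomainModel:
    """Illustrative container for DM_TS; all field names are assumptions."""

    architecture: str                       # model architecture, e.g. "ARIMA" or "LSTM"
    parameters: Dict[str, Any]              # model parameters, e.g. ARIMA order (p, d, q)
    features: List[str] = field(default_factory=list)              # time-series features
    hyperparameters: Dict[str, Any] = field(default_factory=dict)  # training settings
    metadata: Dict[str, Any] = field(default_factory=dict)         # frequency, horizon
    trained_model: Any = None               # stays None until training has run

    def is_trained(self) -> bool:
        return self.trained_model is not None


dm_ts = TimeSeriesDomainModel(
    architecture="ARIMA",
    parameters={"order": (1, 1, 1)},
    features=["lag_1", "lag_2", "lag_3"],
    metadata={"frequency": "daily", "forecast_horizon": 3},
)
print(dm_ts.architecture, dm_ts.is_trained())
```

Keeping `trained_model` empty at construction time reflects the statement that the domain model carries the constructs needed for training, not the trained model itself.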

MBSE-AI Integration 2025 - 2nd Workshop on Model-based System Engineering and Artiﬁcial Intelligence


By incorporating DM_TS into the SSM, we enable time-series forecasting within MDE4IoT workflows. The following sections detail how ML2+ automates preprocessing, training, and deployment for these models, minimizing manual effort while preserving flexibility for developers and data scientists.

3.1.2 Model Validation and Code Generation

The validated and complete SSM models are converted into executable source code:

\[ V(SSM) \land C(SSM) \implies \Delta(SSM) = \text{full source code} \tag{5} \]

Where:

• V(SSM): Validates the SSM.

• C(SSM): Checks the completeness of the SSM.

• ∆: Automates the transformation, reducing manual effort.

The generated code supports statsmodels, xgboost, prophet, and pytorch, and integrates with scikit-learn, keras-tensorflow, and weka.
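The gating expressed by Equation (5) can be sketched in a few lines of Python. This is a simplified stand-in, not the ML2+ generator: the dictionary layout, the section names in `REQUIRED_SECTIONS`, and the functions `validate`, `is_complete`, and `generate` are all assumptions made for the example.

```python
from typing import Dict, List

# Hypothetical sections a model instance must contain to count as complete.
REQUIRED_SECTIONS = ("things", "messages", "data_analytics", "configuration")


def validate(ssm: Dict) -> List[str]:
    """V(SSM): return a list of rule violations (empty means valid)."""
    errors = []
    horizon = ssm.get("data_analytics", {}).get("forecast_horizon", 1)
    if horizon < 1:
        errors.append("forecast_horizon must be at least 1")
    return errors


def is_complete(ssm: Dict) -> bool:
    """C(SSM): every required section is present and non-empty."""
    return all(ssm.get(section) for section in REQUIRED_SECTIONS)


def generate(ssm: Dict) -> str:
    """Delta(SSM): emit code only when V(SSM) and C(SSM) both hold."""
    errors = validate(ssm)
    if errors or not is_complete(ssm):
        raise ValueError(f"cannot generate code: {errors or 'model is incomplete'}")
    algorithm = ssm["data_analytics"]["algorithm"]
    return f"# generated training script\nMODEL = {algorithm!r}\n"


ssm = {
    "things": ["Sensor"],
    "messages": ["Reading"],
    "data_analytics": {"algorithm": "ARIMA", "forecast_horizon": 3},
    "configuration": {"instances": ["sensor1"]},
}
source = generate(ssm)
print(source)
```

Raising an error instead of emitting partial code keeps the implication one-directional: source code exists only for models that passed both checks.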

3.1.3 Conﬁgurations Management

Configurations manage component instantiations and connections (Moin et al., 2022a):

\[ C = (A, \Theta, \Xi) \tag{6} \]

where:

• A: Annotations for component requirements.

• Θ: Instantiated components (e.g., data streams, preprocessing units, models).

• Ξ: Communication connectors for data flow.

By maintaining abstraction, ML2+ ensures that complex IoT architectures with forecasting capabilities are manageable and maintainable.

3.1.4 Proposed Architecture Design

This study improves the ML-Quadrat (ML2) framework (Moin et al., 2022a) by integrating time-series forecasting into Internet of Things (IoT) systems. Figure 1 illustrates the metamodel of our architecture, highlighting new time series forecasting components in yellow, along with their preprocessing steps in red in the Data Analytics component, assessment methods, and model parameters. Green-marked components indicate enhancements and expanded machine learning functionalities of ML-Quadrat (ML2). Using domain-specific modeling languages (DSML) and model-driven engineering (MDE), ML2+ manages system complexity and automates development tasks such as code generation and deployment (Mardani Korani et al., 2023).

3.1.5 Abstract Syntax of the DSML

The DSML for ML2+ is built on an Ecore metamodel within the MDE4IoT framework, tailored for time-series forecasting. It defines datasets with features, labels, and temporal attributes, and includes preprocessing configurations such as imputation and outlier removal. The language supports models such as ARIMA, SARIMA, LSTM, and hybrids with customizable parameters, and incorporates evaluation metrics that include RMSE, MAE, and MSE.
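For reference, the three evaluation metrics named above follow their standard definitions, sketched here in plain Python. The function names and the sample values are illustrative and are not taken from the code generated by ML2+.

```python
import math
from typing import Sequence


def mse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean Squared Error: average of squared differences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root Mean Squared Error: square root of MSE, in the unit of the data."""
    return math.sqrt(mse(actual, predicted))


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean Absolute Error: average of absolute differences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


actual = [3.0, 5.0, 7.0]
predicted = [2.0, 5.0, 9.0]
print(mse(actual, predicted), rmse(actual, predicted), mae(actual, predicted))
```

Because the errors are squared before averaging, the single error of 2 contributes four times as much to MSE as the error of 1, which is why RMSE reacts more strongly to large deviations than MAE.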

3.1.6 Concrete Syntax and Model Editors

The DSML's concrete syntax is designed for user-friendliness, facilitating interaction with time-series forecasting models within the MDE4IoT framework. Developed using Xtext grammar in Eclipse, it enables structured model definitions. A cloud-based web editor has replaced the previous ML2 editor, offering features such as syntax highlighting, autocompletion, and customizable templates to enhance user experience.

3.1.7 Semantics and Model-to-Code Transformations

The semantics of the DSML define the meaning and behavior of models created using the language. They include model-to-code transformations that convert high-level model descriptions into executable Python and Java code. These transformations automatically generate deployable scripts such as processing.py, training.py, and predict.py, ensuring that the generated code aligns with the model logic. Xtend-based templates streamline this process, enabling easy deployment on IoT devices and edge platforms.
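The idea of a template-based model-to-code transformation can be illustrated with Python's standard `string.Template`, used here as a stand-in for the Xtend templates mentioned above. The template text, the function `render_training_script`, and the model dictionary are hypothetical; the actual ML2+ templates are not reproduced here.

```python
from string import Template

# Hypothetical template for a small training script; $-placeholders are filled
# from the model description.
TRAINING_TEMPLATE = Template(
    '"""Generated by a model-to-code sketch; do not edit by hand."""\n'
    "ALGORITHM = '$algorithm'\n"
    "FORECAST_HORIZON = $horizon\n"
    "LAGS = $lags\n"
)


def render_training_script(model: dict) -> str:
    """Fill the template with values taken from the model description."""
    return TRAINING_TEMPLATE.substitute(
        algorithm=model["algorithm"],
        horizon=model["forecast_horizon"],
        lags=list(model["lags"]),
    )


model = {"algorithm": "SARIMA", "forecast_horizon": 3, "lags": (1, 2, 3)}
script = render_training_script(model)
print(script)
```

The output is itself valid Python, so a generator of this kind can be checked by compiling or executing what it emits.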

3.1.8 Supported Time-Series Forecasting Methods and Techniques

ML2+ provides a comprehensive framework for time-series forecasting, preprocessing, and evaluation. It supports datasets with features, labels, and temporal configurations, using parameters like common period threshold to align multivariate IoT datasets by specifying the maximum allowable missing values. Preprocessing capabilities include imputation, outlier removal, and transformations such as lag features, rolling windows, and resampling. Temporal data handling is configurable, with options for forecasting horizons, lag settings, seasonality detection, and stationarity checks.

The framework supports deep learning models, including MLP, GRU, CNN, LSTM, RNN, TCN, and Transformers, as well as statistical models like ARIMA, SARIMA, Holt-Winters, and ETS. Machine learning models such as SVR, RFR, GBM, and XGBoost, along with hybrid approaches like ARIMA_GARCH and Prophet, are also included. Hyperparameter tuning techniques, including grid search, random search, and Bayesian optimization, are available to tune model performance. Model evaluation employs metrics like RMSE, MAE, and MSE, with support for context-specific performance analysis in domains such as RiverFlow prediction and DataCenterManagement. Domain-specific use cases, including RiverFlow, ServerMonitoring, and IoTDataCenter, are defined for specialized applications. The framework integrates with widely used libraries such as statsmodels, xgboost, prophet, pytorch, scikit-learn, keras-tensorflow, and weka, enabling efficient data processing and accurate predictions without extensive DSML or modeling expertise.
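Two of the transformations mentioned in this section, lag features and rolling windows, can be sketched without any library. The functions `make_supervised` and `rolling_mean` and the sample series are illustrative assumptions, not the code ML2+ generates.

```python
from typing import List, Sequence, Tuple


def make_supervised(
    series: Sequence[float], n_lags: int, horizon: int = 1
) -> Tuple[List[List[float]], List[List[float]]]:
    """Turn a series into (inputs, targets) pairs using lagged observations.

    Each input row holds the n_lags most recent values (oldest first);
    each target row holds the next `horizon` values.
    """
    inputs, targets = [], []
    for t in range(n_lags, len(series) - horizon + 1):
        inputs.append(list(series[t - n_lags:t]))
        targets.append(list(series[t:t + horizon]))
    return inputs, targets


def rolling_mean(series: Sequence[float], window: int) -> List[float]:
    """Mean of each full window of `window` consecutive values."""
    return [
        sum(series[i - window:i]) / window
        for i in range(window, len(series) + 1)
    ]


flow = [10.0, 12.0, 11.0, 13.0, 15.0, 14.0]
X, y = make_supervised(flow, n_lags=3, horizon=2)
smoothed = rolling_mean(flow, window=3)
print(X, y, smoothed)
```

With three lags and a horizon of two, a series of six values yields two training rows, which shows how quickly lagging and multi-step targets consume the available history.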

4 VALIDATION AND EVALUATION

4.1 River Flow Forecasting

A case study was implemented to validate the proposed framework for river flow prediction using real multivariate datasets. The system leverages the ML2+ framework and the Data Analytics and Machine Learning (DAML) components to predict river discharge and support flood risk management in an IoT environment. For example, to predict river flow for the next three days at the target station Açude Ponte de Coimbra (12G/01AE), daily river flow data are collected from multiple monitoring stations, including Albufeira da Aguieira (11H/01A), Albufeira da Raiva (12H/01A), Albufeira de Fronhas (12I/01A) and Açude Ponte de Coimbra (12G/01AE). The dataset spans from January 1, 1984, to November 18, 2024, consisting of daily average effluent flow measurements (in m³/s), provided by the Portuguese National Water Resources Information System (SNIRH) (System, ; Jesus et al., 2025). Each station transmits its data daily to a central server, which is subsequently accessed by the DAML server.

The workflow, orchestrated by the DAML components, automates the critical stages of the pipeline. The data preprocessing step, executed using DA Preprocess, prepares the collected multivariate data by interpolating missing values, capped at 10 consecutive gaps, to ensure data continuity. The dataset is resampled to align with a common period threshold and converted into a supervised learning format by generating lagged features that capture temporal dependencies across multiple stations, essential for accurate river flow forecasting. These preprocessing steps are implemented using the Scikit-Learn library as part of the software model. The preprocessed data are split into 80% for training and 20% for testing, preserving the sequential order of the multivariate time-series data.

The training process is executed using DA Train, which deploys a Multilayer Perceptron (MLP) model configured with 50 neurons, 50 epochs, a batch size of 16, L2 regularization with a value of 0.01, a dropout rate of 0.2, and ReLU activation, with Mean Squared Error (MSE) as the loss function. This configuration keeps training efficient while maintaining model accuracy and is implemented using the Keras library within the ML2+ framework. Following the training phase, predictions are generated for the next three days at the target station using DA Predict, a component in the ML2+ framework that retrieves the latest input features, performs predictions, and stores the forecasted values in predefined properties. The predicted river flow values are compared against a predefined threshold of 150 m³/s. If the predicted flow exceeds this threshold, a flood alert is triggered, allowing proactive decision-making to mitigate potential risks.
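Three steps of this use case, gap-limited interpolation, the chronological 80/20 split, and the threshold check, can be sketched in plain Python. The sketch makes one interpretive assumption: "capped at 10 consecutive gaps" is read as filling only runs of at most 10 missing values and leaving longer runs untouched. The function names and the sample values are hypothetical and do not come from the ML2+ code base.

```python
from typing import List, Optional, Sequence, Tuple


def interpolate_gaps(
    values: Sequence[Optional[float]], max_gap: int = 10
) -> List[Optional[float]]:
    """Linearly fill runs of missing values (None) no longer than max_gap.

    Longer runs and gaps at either end of the series are left untouched.
    """
    filled = list(values)
    i = 0
    while i < len(filled):
        if filled[i] is not None:
            i += 1
            continue
        start = i
        while i < len(filled) and filled[i] is None:
            i += 1
        gap = i - start
        if start > 0 and i < len(filled) and gap <= max_gap:
            left, right = filled[start - 1], filled[i]
            for k in range(gap):
                filled[start + k] = left + (right - left) * (k + 1) / (gap + 1)
    return filled


def chronological_split(rows: Sequence, train_fraction: float = 0.8) -> Tuple[list, list]:
    """Split without shuffling so that the test set follows the training set in time."""
    cut = int(len(rows) * train_fraction)
    return list(rows[:cut]), list(rows[cut:])


def flood_alerts(forecast: Sequence[float], threshold: float = 150.0) -> List[bool]:
    """Flag each forecast day whose predicted flow (m3/s) exceeds the threshold."""
    return [value > threshold for value in forecast]


filled = interpolate_gaps([100.0, None, None, 130.0])
train, test = chronological_split(list(range(10)))
alerts = flood_alerts([120.0, 155.5, 149.9])
print(filled, train, test, alerts)
```

Splitting by position rather than at random is what keeps future observations out of the training set, which matters for any forecast evaluation.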

4.2 Energy Consumption Forecasting

A case study was conducted to evaluate the proposed energy consumption forecasting system, leveraging real-world data from the UCI Machine Learning Repository (UCI Machine Learning Repository, ). The goal is to predict the total energy consumption for the next three hours in a smart home setting. The dataset spans from December 2006 to November 2010 and provides second-level measurements of electric power consumption in kilowatts (kW). The data include total energy usage as the target variable and time-indexed features, such as appliance-specific consumption (e.g., dishwasher, refrigerator), and additional contextual factors like temperature when available. In this setup, data are transmitted to a central server every second, supporting real-time monitoring and prediction. However, to align with the forecasting goal of predicting total energy consumption over the next three hours, the second-level data are aggregated into hourly intervals during preprocessing. Hourly resampling reduces noise, highlights long-term trends, and lowers the computational cost. The aggregated data are then accessed by the DAML server, which orchestrates the workflow to automate the critical stages of data preparation, model training, and forecasting.

[Figure 1: Meta-Model Diagram of the Enhanced ML2+ Framework, highlighting new components in yellow. The diagram includes the metaclasses ThingMLModel, Thing, Port, Message, Property, and Configuration, the DataAnalytics element with its preprocessing attributes, the model algorithms (e.g., GRU, LSTM, ARIMA, XGBoost, ARIMA_GARCH, Prophet), and actions such as DATrainAction, DAPredictAction, and DAForecastAction.]

The workflow begins with DA Preprocess, which automates preprocessing tasks, including resampling, interpolation of missing data, generation of lagged features to capture temporal dependencies, and scaling of variables to ensure consistency. During resampling, the dataset is transformed from second-level granularity to hourly intervals by taking the mean energy usage within each hour. Missing values were imputed using interpolation to ensure data continuity. Lagged versions of the use [kW] variable were generated for the last three hours (t − 1, t − 2, t − 3) to incorporate temporal dependencies. Features were normalized using Min-Max Scaling to bring values into the range [0, 1], and outliers were removed using the Z-score method to improve model robustness. Stationarity of the time series was verified using the Augmented Dickey-Fuller (ADF) test, and non-stationary series were differenced as needed to remove trends. The dataset was divided into 80% training data and 20% testing data, maintaining the temporal order of the time series to preserve sequence relationships.

For model training, the DA Train component was used to train two forecasting models: Holt-Winters (Triple Exponential Smoothing) and SARIMA. The Holt-Winters model was configured to capture the level, trend, and seasonality of energy usage, with additive components and a seasonal period of 24 (corresponding to a daily cycle in hourly data). The SARIMA model, which extends ARIMA by incorporating seasonal patterns, was configured with the parameters (1, 1, 1) for ARIMA and (1, 1, 1, 24) for the seasonal components. Seasonal decomposition was used to identify suitable seasonal parameters for the SARIMA model. Hyperparameter tuning was conducted to optimize model performance. The Statsmodels library was used within the ML2+ framework for model implementation and evaluation.

Once the models were trained, the DA Predict component was used to generate predictions for the next three hours. The DAML server retrieved the forecast input data, aggregated it into hourly intervals, fed it into the trained models, and stored the predicted energy usage in predefined properties. For proactive energy management, an alert threshold of 5 kW was defined. If the predicted total energy consumption exceeded this threshold, an alert was triggered, allowing real-time adjustments or interventions in the smart home environment.
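The preprocessing steps described for this use case (hourly mean resampling, Min-Max scaling, and Z-score outlier removal) can be illustrated with the Python standard library. This is a sketch under stated assumptions: the function names, the Z-score cut-off of 3.0, and the sample readings are hypothetical, and the paper does not state which cut-off was used.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, pstdev
from typing import List, Sequence, Tuple


def resample_hourly_mean(
    readings: Sequence[Tuple[datetime, float]]
) -> List[Tuple[datetime, float]]:
    """Average (timestamp, kW) readings within each clock hour."""
    buckets = defaultdict(list)
    for timestamp, kw in readings:
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(kw)
    return [(hour, mean(values)) for hour, values in sorted(buckets.items())]


def min_max_scale(values: Sequence[float]) -> List[float]:
    """Scale values linearly into the range [0, 1]."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


def remove_outliers_zscore(values: Sequence[float], z_max: float = 3.0) -> List[float]:
    """Drop values whose absolute Z-score exceeds z_max (assumed cut-off)."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return list(values)
    return [v for v in values if abs(v - mu) / sigma <= z_max]


readings = [
    (datetime(2010, 1, 1, 0, 0, 0), 1.0),
    (datetime(2010, 1, 1, 0, 30, 0), 3.0),
    (datetime(2010, 1, 1, 1, 0, 0), 4.0),
]
hourly = resample_hourly_mean(readings)
scaled = min_max_scale([2.0, 4.0, 6.0])
cleaned = remove_outliers_zscore([1.0] * 10 + [100.0])
print(hourly, scaled, cleaned)
```

In a real pipeline the scaling bounds would be computed on the training portion only and reused for the test portion, so that no information from the test period leaks into training.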

4.3 Comparative Analysis and Metrics

The evaluation presented in this paper was conducted by a user with medium-level experience in Python coding and a good understanding of writing model instances. Forecast performance was assessed using Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE). RMSE penalizes larger errors, making it suitable for identifying significant deviations, while MAE provides a straightforward measure of the average error magnitude.

For the River Flow Forecasting use case, ML2+ achieved RMSE values of 50.75 for t + 0, 72.66 for t + 1, and 86.65 for t + 2, identical to those obtained through manual coding. This shows that automation using ML2+ does not compromise accuracy. Additionally, ML2+ reduced the development time from 14 to 11 hours, although the lines of code (LOC) increased from 80 to 210 due to the generation of a complete software model instance. In the Energy Consumption Forecasting use case, RMSE values were 0.2301 for Holt-Winters and 0.1941 for SARIMA, with corresponding MAE values of 0.1882 and 0.1553. Using ML2+, the development time decreased from 11.5 hours to 4 hours, while the LOC increased from 64 and 60 to 200 and 205 for the respective models.

ML2+ automates the generation of a complete software model instance, encompassing all system elements such as Things, Ports, Messages, Properties, Statecharts, and Configurations. The user defines the model instance through a web-based editor and a template with predefined structures for Things, Ports, Messages, and workflows. This template-based approach simplifies the process by requiring the user to modify specific parameters and attributes, such as hyperparameters, model types, or message formats, rather than constructing the entire model from scratch. The built-in automation in ML2+ manages repetitive tasks such as preprocessing steps, model training, and evaluation workflows, allowing the user to focus on adapting the provided templates to their specific requirements. Consequently, ML2+ not only reduces development time but also minimizes manual effort, making it a valuable tool for users aiming to streamline the model development process.
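To put the reported figures in relative terms, the short calculation below derives percentage changes from the development times and LOC counts given in this section. The helper `relative_change` is an illustrative name; the input numbers are the ones reported above.

```python
def relative_change(before: float, after: float) -> float:
    """Signed relative change in percent; negative means a reduction."""
    return (after - before) / before * 100.0


river_time = relative_change(14.0, 11.0)    # development time in hours
energy_time = relative_change(11.5, 4.0)    # development time in hours
river_loc = relative_change(80, 210)        # lines of code

print(f"River flow development time: {river_time:.1f}%")
print(f"Energy development time:     {energy_time:.1f}%")
print(f"River flow LOC:              {river_loc:.1f}%")
```

The development time therefore drops by roughly a fifth in the river flow case and by roughly two thirds in the energy case, while the LOC of the river flow model instance grows to more than two and a half times the manual script.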

5 CONCLUSION AND FUTURE WORK

This paper introduced ML2+, an enhanced framework that integrates comprehensive time-series forecasting functionalities into the MDE4IoT paradigm. By automating data preprocessing, feature engineering, model training, and evaluation, ML2+ addresses a crucial gap in existing MDE4IoT frameworks, allowing developers to handle sequential data without specialized time-series expertise. The evaluation on river flow and energy consumption scenarios shows that ML2+ maintains predictive accuracy comparable to manual coding approaches while reducing development time. The framework's automation and abstractions reduce the probability of human error, improve reliability, and simplify maintenance. ML2+ thus bridges the gap between MDE and ML for time series forecasting, positioning itself as a valuable tool for developing intelligent, scalable, and adaptable IoT applications. As an open-source, community-driven platform, ML2+ will evolve with user feedback and emerging technologies, such as advanced ML models, so that it remains a key enabler of intelligent IoT ecosystems.

ML2+ has demonstrated its potential to automate time-series forecasting in AIoT contexts. Future improvements will focus on enhanced visualization capabilities that allow users to inspect preprocessing steps, monitor training progress, and interpret forecast results interactively. We will gather feedback from IoT developers, data scientists, and MDE practitioners to refine ML2+, ensuring that it aligns closely with user needs and workflows. Broader evaluations are planned across various domains, including smart cities, healthcare, and industrial IoT, validating ML2+ under various data conditions. Furthermore, we aim to integrate ML2+ with edge computing strategies, enabling efficient model updates on the fly without extensive code rewrites. This adaptability is vital for continuously evolving IoT environments, where models must remain accurate and relevant. Community involvement and open collaboration are also priorities. By fostering a user community that shares best practices, templates, and extensions, ML2+ can continuously evolve and remain at the forefront of MDE4IoT advancements.

ACKNOWLEDGMENTS

This work is supported by several projects, including Blockchain PT (PRR–RE-C05-i01.02: Agendas/Alianças Verdes para a Inovação Empresarial), CONNECT 22050 (Local Coastal Monitoring Service for Portugal), and ATTRACT-DIH (Digital Innovation Hub for Artificial Intelligence and High-Performance Computing, funded under the Digital European Programme, Grant 101083770).

REFERENCES

Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J., Devin, M., Ghemawat, S., Irving, G., Isard, M., et al. (2015). Tensorflow: Large-scale machine learning on heterogeneous systems. Software available from tensorflow.org.

Adi, E., Anwar, A., Baig, Z., and Zeadally, S. (2020). Ma-

chine learning and data analytics for the iot. Neural

Computing and Applications, 32(20):16205–16233.

Alulema, D., Criado, J., Iribarne, L., Fernández-García, A. J., and Ayala, R. (2020). A model-driven engineering approach for the service integration of iot systems. Cluster Computing, 23(3):1937–1954.

Bai, H., Breuel, T. M., et al. (2019). Onnx: Open neural

network exchange. GitHub Repository. Available at:

https://onnx.ai/.

Berthold, M. R., Cebron, N., Dill, F., Gabriel, T. R., Kötter, T., Meinl, T., Ohl, P., Sieb, C., Thiel, K., and Wiswedel, B. (2009). Knime: The konstanz information miner. In Data Analysis, Machine Learning and Applications, pages 319–326. Springer.

Bhattacharjee, A., Barve, Y., Khare, S., Bao, S., Kang, Z.,

Gokhale, A., and Damiano, T. (2019). Stratum: A

bigdata-as-a-service for lifecycle management of iot

analytics applications. In 2019 IEEE International

Conference on Big Data (Big Data), pages 1607–

1612. IEEE.

Bredereke, J., Morin, B., et al. (2013). Models@runtime

to support dynamic adaptation. Software and Systems

Modeling, 12:159–168.

Chollet, F. et al. (2015). Keras.

Cruz-Nájera, M. A., Treviño-Berrones, M. G., Ponce-Flores, M. P., Terán-Villanueva, J. D., Castán-Rocha, J. A., Ibarra-Martínez, S., Santiago, A., and Laria-Menchaca, J. (2022). Short time series forecasting: Recommended methods and techniques. Symmetry, 14(6):1231.

Da Silva, A. R. (2015). Model-driven engineering: A survey supported by the unified conceptual model. Computer Languages, Systems & Structures, 43:139–155.

Guazzelli, A., Zeller, M., Lin, W.-C., and Williams, G.

(2009). Pmml: An open standard for sharing models.

The R Journal, 1(1):60–65.

Harrand, N., Fleurey, F., Morin, B., and Husa, K. (2016).

Thingml: A language and code generation frame-

work for heterogeneous targets. Proceedings of the

ACM SIGPLAN International Conference on Model

Driven Engineering Languages and Systems (MOD-

ELS), pages 125–135.

Hartmann, T., Moawad, A., Fouquet, F., and Le Traon, Y.

(2019). The next evolution of mde: a seamless in-

tegration of machine learning into domain modeling.

Software & Systems Modeling, 18(2):1285–1304.

Hartsell, C., Mahadevan, N., Ramakrishna, S., Dubey, A.,

Bapty, T., Johnson, T., Koutsoukos, X., Sztipanovits,

J., and Karsai, G. (2019). Model-based design for cps

with learning-enabled components. In Proceedings of

the Workshop on Design Automation for CPS and IoT,

pages 1–9.

Jesus, G., Mardani, Z., Alves, E., and Oliveira, A. (2025). Using deep learning for tejo river flow forecasting. Submitted to Sensors. Under review.

Kirchhof, J. C., Kusmenko, E., Ritz, J., Rumpe, B., Moin, A., Badii, A., Günnemann, S., and Challenger, M. (2022). Mde for machine learning-enabled software systems: a case study and comparison of montianna & ml-quadrat. In Proceedings of the 25th International Conference on Model Driven Engineering Languages and Systems: Companion Proceedings, MODELS ’22, page 380–387, New York, NY, USA. Association for Computing Machinery.

Mardani Korani, Z., Moin, A., Rodrigues da Silva, A., and

Ferreira, J. C. (2023). Model-driven engineering tech-

niques and tools for machine learning-enabled iot ap-

plications: A scoping review. Sensors, 23(3).

Meliá, S., Nasabeh, S., Luján-Mora, S., and Cachero, C. (2021). Mosiot: Modeling and simulating iot healthcare-monitoring systems for people with disabilities. International Journal of Environmental Research and Public Health, 18(12):6357.

Mierswa, I., Wurst, M., Klinkenberg, R., Scholz, M., and

Euler, T. (2006). Rapidminer: An open source sys-

tem for knowledge discovery in large data sets. In

Proceedings of the NIPS ML Open Source Software

Workshop.

Minka, T. P., Winn, J., Guiver, J., and Knowles, D. (2018).

Infer.net: A framework for running bayesian inference

in graphical models. Journal of Machine Learning

Research, 18:1–5.

Moin, A. (2021). Data analytics and machine learning

methods, techniques and tool for model-driven engi-

neering of smart iot services. In 2021 IEEE/ACM 43rd

International Conference on Software Engineering:

Companion Proceedings (ICSE-Companion), pages

287–292.

Moin, A., Challenger, M., Badii, A., and Günnemann, S. (2022a). A model-driven approach to machine learning and software modeling for the iot. Software and Systems Modeling, 21(3):987–1014.

Moin, A., Mituca, A., Challenger, M., Badii, A., and Günnemann, S. (2022b). Ml-quadrat & driotdata: a model-driven engineering tool and a low-code platform for smart iot services. In Proceedings of the ACM/IEEE 44th International Conference on Software Engineering: Companion Proceedings, ICSE ’22, page 144–148, New York, NY, USA. Association for Computing Machinery.

Moin, A., Rössler, S., and Günnemann, S. (2018). Thingml+: Augmenting model-driven software engineering for the internet of things with machine learning. In Proceedings of MODELS 2018 Workshops, Copenhagen, Denmark, October, 14, 2018, volume 2245 of CEUR Workshop Proceedings, pages 521–523. CEUR-WS.org.

Moin, A., Rössler, S., Sayih, M., and Günnemann, S. (2020). From things’ modeling language (thingml) to things’ machine learning (thingml2). MODELS ’20, New York, NY, USA. ACM.

Morin, B., Barais, O., Fleurey, F., et al. (2016). Heads: A

holistic approach for the development of distributed

heterogeneous and adaptive systems. In International

Conference on Model Driven Engineering Languages

and Systems (MODELS), pages 92–101. Springer.

Open Data Group (2016). Portable format for analytics

(pfa). Available at: http://dmg.org/pfa/.

System, P. N. W. R. I. Snirh portal. https://snirh.apambiente.pt/snirh/. Accessed on [2024].

UCI Machine Learning Repository. Individual household electric power consumption dataset. https://archive.ics.uci.edu/ml/datasets/individual+household+electric+power+consumption. Accessed: [2024].
