Platform-Agnostic MLOps on Edge, Fog and Cloud Platforms in Industrial IoT

Alexander Keusch¹, Thomas Blumauer-Hiessl², Alireza Furutanpey¹, Daniel Schall² and Schahram Dustdar¹

¹TU Vienna, Austria
²Siemens Technology, Austria
Keywords:
IoT, Edge Intelligence, Machine Learning, MLOps.
Abstract:
The proliferation of edge computing systems drives the need for comprehensive frameworks that can seam-
lessly deploy machine learning models across edge, fog, and cloud layers. This work presents a platform-
agnostic Machine Learning Operations (MLOps) framework tailored for industrial applications. A novel
framework enables data scientists in an industrial setting to develop and deploy AI solutions across diverse de-
ployment modes while providing a consistent experience. We evaluate our framework on real-world industrial
data by collecting performance metrics and energy measurements on training and prediction runs of two ML
workflows. Then, we compare edge, fog, and cloud deployments and highlight the advantages and limitations
of each deployment mode. Our results emphasize the relevance of the introduced platform-agnostic MLOps
framework in enabling flexible and efficient AI deployments.
1 INTRODUCTION
The vast amounts of data constantly generated by industrial systems (Trabesinger et al., 2020), which are often analyzed using Artificial Intelligence (AI) or Machine Learning (ML) methods, together with the emergence of edge computing systems and AI accelerators, have led to the rise of edge intelligence (Rausch and Dustdar, 2019; Ding et al., 2022).
Edge computing meets the demands of industrial
systems by offering low latency and a privacy-aware
approach for processing and analyzing the data (Calo
et al., 2017). Despite the availability of powerful edge devices, computational resources such as memory and processing power are limited (Mendez et al., 2022).
The constrained resources limit the volume of acces-
sible data for analysis on edge devices, extend the du-
ration of the analysis process, and restrict the com-
plexity of applicable algorithms. In contrast, signifi-
cantly more computational resources are available at
the cloud and fog levels. However, task offloading re-
quires sufficient bandwidth and reduces privacy (Nas-
tic et al., 2022; Mendez et al., 2022). These obser-
vations underscore the inherent advantages and limi-
tations of edge, cloud, and fog deployments. Conse-
quently, there is a need for a comprehensive frame-
work capable of facilitating the development and
deployment of AI solutions in a seamless end-to-
end Machine Learning Operations (MLOps) lifecycle
across these diverse layers. Several frameworks en-
compassing the entire MLOps lifecycle—from data
preprocessing and model training and validation to
model execution and monitoring—have been devel-
oped by leading cloud companies such as AWS and
Microsoft Azure. Nonetheless, these tools predomi-
nantly center around cloud-based workflows, leaving
a gap in the seamless integration of the fog and edge
layers (Kemnitz et al., 2023; Rausch et al., 2019).
To fill this gap between the cloud, fog, and edge layers, this paper presents an MLOps framework that provides an end-to-end MLOps workflow on all of these layers. The design goals of this framework are:
Platform Agnosticism: The framework must
function seamlessly across edge, fog, and cloud
environments. This extends the capabilities of
MLOps lifecycle management, which are typi-
cally confined to frameworks provided by major
cloud providers, to also encompass edge and fog
layers.
Consistent Experience: The framework must en-
sure a uniform user experience for data scientists
and operators across all layers. This entails that all
layers should be utilized in the same manner, re-
gardless of the specific deployment environment.
Data Management: The framework should sup-
port a wide variety of data sources as input for
models.
Model Management: Models should be encap-
sulated in deployment packages that can be up-
loaded, utilized, and managed consistently across
all layers.
Integration with Existing Systems: The frame-
work should seamlessly integrate with existing
systems by allowing the results to be pushed back
into these systems.
We present a framework that meets these criteria and
evaluate its performance through experiments con-
ducted in setups that simulate real-life environments.
Specifically, the framework is deployed in all three deployment modes, and two ML-based data analysis workflows are executed using real-world industrial data. Metrics are then collected for each run of the workflows and used to compare edge, fog, and cloud executions and to discuss the advantages and limitations of each deployment mode.
In summary, our contributions are:
introducing a novel platform-agnostic MLOps
framework
demonstrating the practicability of the general so-
lution approach with two representative industrial
ML workflows
contrasting the differences between deployment
on edge, fog, and cloud layers
Section 2 discusses relevant related work. Section 3
introduces the core MLOps framework. Section 4 de-
scribes the evaluation methodology and experiments.
Lastly, Section 5 concludes the work with a brief dis-
cussion on limitations and future outlook.
2 RELATED WORK
Kreuzberger et al. (Kreuzberger et al., 2023) discuss
the challenges in automating and operationalizing
machine learning projects for production, leading to
many ML endeavors failing to meet expectations.
To address these challenges, the concept of Machine
Learning Operations (MLOps) is introduced, focus-
ing on automating and managing ML workflows ef-
fectively.
MLOps emphasizes collaboration, automation
and continuous integration and delivery, drawing par-
allels with the DevOps paradigm in software engi-
neering. By leveraging insights from DevOps expe-
riences, MLOps seeks to improve the efficiency and
reliability of ML systems. Their work concludes by
emphasizing the importance of MLOps in bridging
the gap between ML research and practical deploy-
ment, ultimately increasing the number of ML proofs
of concept that transition to production.
This work extends the MLOps paradigm by
providing a framework that employs the key princi-
ples in a platform-agnostic manner.
John et al. (John et al., 2020) present an architectural framework for edge computing in AI. The authors present five architectures that facilitate the de-
ployment of AI models on edge devices using edge-
cloud collaboration. They conducted qualitative interviews to evaluate the proposed architectures. The
validation study reveals insights from experts in dif-
ferent industries regarding the applicability and chal-
lenges of each architectural alternative. Companies
prefer architectures that balance centralized and de-
centralized approaches with data collection, model
optimization, and cost-effectiveness considerations.
However, the authors do not provide a platform-
agnostic MLOps framework, which is the focus of our
work.
Raj et al. (Raj et al., 2021) propose an edge MLOps framework capable of automatically deploying and managing machine learning models on edge devices. They focus on scalability and automation of the deployment process, again emphasizing the synergy of edge and cloud computing. This paper, in contrast, focuses on providing a framework that can be deployed on the cloud, fog, and edge levels individually, without the need to use cloud resources at all.
Kemnitz et al. (Kemnitz et al., 2023) introduce a
framework for building and operating AI models at
the industrial edge, addressing challenges in deploy-
ing and managing AI applications in such scenarios.
It introduces the concept of model artifacts and dis-
cusses three industrial AI model use cases: energy ef-
ficiency monitoring, predictive maintenance, and pa-
rameter forecasting. The framework aims to stream-
line AI model deployment, operation, and manage-
ment, catering to various user roles, including data
scientists, automation engineers, and service techni-
cians. Key contributions include requirements elicita-
tion based on user roles, framework design, and qual-
itative evaluation based on implemented use cases.
The framework seeks to enable seamless integration
of edge resources into AI workflows, focusing on
scalability, ease of deployment, and operationaliza-
tion without requiring software engineering expertise.
While their work focuses exclusively on processing at
the edge, our approach includes intermediate fog and
cloud nodes to consider an edge-cloud compute con-
tinuum.
3 SYSTEM DESIGN
This section describes the architecture of the platform-agnostic MLOps framework. It shows how the framework can be used to deploy machine learning models on different platforms while providing a consistent interface for data scientists and operators.
3.1 System Architecture
Fig. 1 shows the system architecture. The MLOps framework can be deployed on Docker-capable devices, which can be a cloud server, a fog node, or an industrial edge device.

Figure 1: System Architecture (assets on the field level connect directly or via an IE connector to an MQTT broker; edge services such as visualization and control systems consume the model output; MLOps framework instances run at the edge, fog, and cloud levels; operators manage the system and data scientists deploy models).
The asset, whose data is used for training and later
analyzed, is typically located on the field level. The
edge device is located on this level as well.
The asset can either output its data (examples of such data are described in Section 4.2) directly to an MQTT broker, or an industrial edge connector can be used to access the asset data and convert it to a standardized format, which is then forwarded to the MQTT broker. Each deployment has an MQTT broker that must be accessible from the field level in order to connect the framework to the asset. MQTT is chosen as the protocol because it offers a lightweight and easy-to-use messaging system; however, other messaging solutions could be used as well.
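To make this ingestion path concrete, the following minimal sketch shows how an asset, or a connector acting on its behalf, could publish measurements to the broker. It assumes the paho-mqtt client library (1.x constructor style); the broker address, topic, and payload fields are illustrative and not the framework's actual configuration.

# A minimal sketch of an asset (or connector) publishing measurements to
# the deployment's MQTT broker via paho-mqtt. Broker address, topic, and
# payload fields are illustrative assumptions.
import json
import time

import paho.mqtt.client as mqtt

BROKER_HOST = "broker.field.local"          # hypothetical field-level broker
TOPIC = "assets/pcb-line-1/measurements"    # hypothetical topic

client = mqtt.Client()
client.connect(BROKER_HOST, 1883)
client.loop_start()                          # handle network traffic in the background

while True:
    payload = {
        "timestamp": time.time(),
        "values": {"height": 0.112, "area": 1.93, "offset_x": 0.01},
    }
    # QoS 1: the broker acknowledges every measurement
    client.publish(TOPIC, json.dumps(payload), qos=1)
    time.sleep(1.0)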
Once the asset data is pushed to the MQTT broker, it is ingested by the MLOps framework, which then uses the data for training a machine learning model or for generating predictions with the trained model. These predictions might assign classes to data points, as described in Section 4.2, predict future anomalies in the asset's operation, or perform any other kind of analysis a data scientist might devise.
These predictions can be pushed back to the MQTT broker, which makes them available on the field level, for example, to integrate them into the Siemens Industrial Edge Platform, which offers, e.g., a dashboard for visualizing the data or connectors to use the data for further control of the asset.
The MLOps framework also offers a consistent
view of the system to data scientists and operators on
all levels. Operators can use the web interface to con-
figure the data ingest via MQTT and monitor the sys-
tem’s status. Data scientists can use it to deploy their
machine learning models, packaged as zip files con-
taining the Python sources and the required libraries.
After uploading the model, the data scientists can configure the model's training, monitor its performance, and manage and monitor the prediction stage.
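For illustration, such a deployment package could expose training and prediction entry points along the following lines. This is a hypothetical contract: the module name workflow.py, the function signatures, and the logistic-regression model are assumptions, as the paper does not specify the internals of the package format.

# workflow.py -- hypothetical entry-point module inside the uploaded zip.
# All names and signatures here are illustrative assumptions.
from typing import Any, Iterable

from sklearn.linear_model import LogisticRegression


def train(data: Iterable[dict]) -> Any:
    """Train on the selected asset data and return a model artifact."""
    X = [list(point["values"].values()) for point in data]
    y = [point["label"] for point in data]
    return LogisticRegression(max_iter=1000).fit(X, y)


def predict(model: Any, data_point: dict) -> dict:
    """Classify a single ingested data point."""
    label = model.predict([list(data_point["values"].values())])[0]
    return {"class": str(label)}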
3.2 Application Architecture
Fig. 2 shows the internal structure of the MLOps
framework. On the left are the assets connected to the industrial edge platform; they push their data to the MQTT broker either directly or via an industrial
edge connector. Industrial edge connectors are part
of the Siemens Industrial Edge Platform and exist for
many commonly used industrial protocols. The con-
nectors interface with the asset and read its data via the respective protocol, convert the data to a standardized format¹, and push it to the MQTT broker.

¹ https://github.com/industrial-edge/common-databus-payload-format/blob/main/docs/payload-format/PayloadFormat.md

Figure 2: Application Architecture (assets connect via IE connectivity or High-Level Drivers, e.g. OPCUA and GENICAM, to the backend-controller; an internal MQTT broker, model management, and an AI runtime hosting the workflow models form the core; a web UI provides connectivity setup, model deployment, monitoring, and data visualization).
Data in this standardized format can be ingested
directly by the backend-controller service of the
MLOps framework. To directly connect assets that push data to the MQTT broker without using an industrial edge connector, High-Level Drivers (HLDs) are used. These custom modules can be implemented and added to the MLOps framework. They manage data in any arbitrary format the asset provides and modify it for ingestion by the backend-controller. The connections can be configured via the web interface of the MLOps framework. Examples of such High-Level Drivers are the GENICAM HLD, which enables communication with cameras supporting the GenICam protocol, or the OPCUA HLD, which accesses data via OPC UA.
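As a sketch of what such a custom module might look like, the following assumes that HLDs implement a small translate-and-forward interface. The class and method names are hypothetical; the framework's real HLD API is not documented in this paper.

# Hypothetical High-Level Driver interface; all names are illustrative.
class HighLevelDriver:
    """Converts arbitrary asset payloads into the standardized format."""

    def translate(self, raw: bytes) -> dict:
        raise NotImplementedError


class CsvHld(HighLevelDriver):
    """Example driver for an asset that emits comma-separated values."""

    FIELDS = ["height", "area", "offset_x"]

    def translate(self, raw: bytes) -> dict:
        values = [float(v) for v in raw.decode().split(",")]
        # Map positional CSV values onto named variables so the
        # backend-controller can ingest them like connector data
        return {"vals": dict(zip(self.FIELDS, values))}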
The backend-controller is the core of the MLOps
framework. As described above, it handles the data
ingest before the data is stored in an internal database.
Furthermore, the backend-controller handles deploy-
ment, training, monitoring, and management of ma-
chine learning workflows. Workflows can be up-
loaded in a package format in the web interface, mak-
ing them available for use in the MLOps framework.
After uploading the ML workflows, training can be
started via the web interface. For this, data is se-
lected from the previously collected asset data, which
is then sent together with the ML workflow package
to the runtime, which executes the training routine
defined by the workflow on the selected data. The
trained model artifact is then returned to the backend-
controller, and the workflow is ready for use in the
prediction stage. The prediction stage can be initi-
ated via the web interface. In this stage, the backend-
controller sends the ML workflow package and the
trained model artifact to the runtime. The runtime ex-
ecutes the routine defined by the workflow package
for each relevant data point and generates a predic-
tion using the trained model. These predictions are collected by the backend-controller and visualized via the web interface. The predictions are also pushed back to the MQTT broker, where they are accessible to the industrial edge platform. The industrial edge platform uses the predictions, e.g., for further visualizations or for controlling the assets.
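The runtime-side interaction just described can be sketched as follows. The zip unpacking, the workflow entry module, and the pickle-based artifact format are assumptions carried over from the hypothetical package sketch in Section 3.1; the framework's actual runtime interface is internal.

# Conceptual sketch of the prediction stage on the runtime side; the
# entry-point module name and artifact format are assumptions.
import importlib
import pickle
import sys
import zipfile


def run_prediction_stage(package_path, artifact_path, data_points, publish):
    workdir = "/tmp/workflow"
    with zipfile.ZipFile(package_path) as pkg:
        pkg.extractall(workdir)          # unpack the uploaded workflow package
    sys.path.insert(0, workdir)
    workflow = importlib.import_module("workflow")  # hypothetical entry module

    with open(artifact_path, "rb") as f:
        model = pickle.load(f)           # trained artifact from the training stage

    for point in data_points:
        # Score each relevant data point and hand the result back, e.g. to
        # the backend-controller, which pushes it to the MQTT broker
        publish(workflow.predict(model, point))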
4 EVALUATION
This section shows how different deployment modes
of the MLOps framework perform on real-world in-
dustrial data. For this, we deploy the application to
machines on the cloud, fog and edge layers and ex-
ecute two ML workflows using real-world industrial
data. The experiments should illustrate the differ-
ences between individual deployment modes. Our hy-
pothesis is that the three deployment modes—cloud,
fog, and edge—each have their own advantages and
disadvantages, with no clear overall winner, which
emphasizes the necessity for a platform-agnostic ap-
proach to MLOps.
4.1 Experiment Setup
For running the MLOps framework, a representative machine exhibiting typical performance characteristics was selected for each designated location. The edge level is represented by a Siemens in-
dustrial PC (SIMATIC IPC427E), equipped with an
Intel Xeon CPU E3-1505L v5 @ 2.00GHz and 16
GiB of RAM. The fog and cloud locations use vir-
tual machines based on OpenStack and AWS. The
fog VM features 8 vCPUs based on an Intel Xeon SP Gold 6230 (20C/40T, 2.1 GHz/3.9 GHz) and 8 GiB of RAM, representing a medium-sized OpenStack instance (c1.medium). On the cloud level, an AWS
EC2 instance with 2 vCPUs based on the AMD EPYC
7000 series and 8 GiB of RAM corresponds to a large
instance (t3a.large). A large instance was chosen on
the cloud level, compared to a medium instance on
the fog level, to represent the greater availability of
computing resources in the cloud.
4.2 ML Workflows
Two ML workflows that analyze industrial data are
used to run the experiments and collect metrics. The
following sections provide a quick overview of these workflows:
4.2.1 PCB Quality Inspection
This ML-based data analysis workflow analyzes data
from a printed circuit board (PCB) manufacturing
process. In this process, solder paste is placed on
PCBs, a procedure prone to error. During this process, 16 numerical measurements are collected, including, e.g., height, area, offset, and x/y positions. The ML model uses these measurements to classify the PCBs into the classes OK, error, and pseudo-error, using a supervised learning approach with labels provided by human inspectors for training.
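The paper does not disclose the concrete model used for this workflow. As a minimal sketch, a standard supervised classifier over the 16 measurements could look as follows, with random placeholder data standing in for the real dataset and labels.

# Sketch of the PCB classification task; the random forest choice and the
# placeholder data are assumptions, not the paper's actual model or dataset.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

CLASSES = ["OK", "error", "pseudo-error"]

X = np.random.rand(1000, 16)           # stand-in for the 16 measurements
y = np.random.choice(CLASSES, 1000)    # stand-in for inspector labels

clf = RandomForestClassifier(n_estimators=100).fit(X, y)
print(clf.predict(X[:3]))              # classify three sample boards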
4.2.2 Inertial Measurement Classification
This ML workflow detects human actions in an indus-
trial setting from sequenced events. These actions in-
clude hammering, screwing, sawing, etc. To classify a
person's actions, timeseries data is collected using inertial measurement units (IMU) placed on the left and right hands of workers. The timeseries data is manually labeled for training, while the prediction stage should assign a category to the timeseries data automatically.
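The feature set and model of this workflow are likewise not disclosed. A common pattern, sketched below under that assumption, is to summarize fixed-length IMU windows with simple statistics and feed them to a standard classifier; window length, channel count, and model choice are illustrative.

# Sketch of windowed IMU action classification; window length, channel
# count, features, and model are assumptions for illustration only.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

ACTIONS = ["hammering", "screwing", "sawing"]

def window_features(window: np.ndarray) -> np.ndarray:
    # window shape: (samples, channels), e.g. two IMUs x 6 axes = 12 channels
    return np.concatenate([window.mean(axis=0), window.std(axis=0)])

windows = np.random.randn(300, 100, 12)             # stand-in sensor windows
X = np.stack([window_features(w) for w in windows])
y = np.random.choice(ACTIONS, len(windows))         # stand-in manual labels

clf = RandomForestClassifier().fit(X, y)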
4.3 Experiment Runs
For evaluation, both ML workflows described above
are executed on the edge, fog and cloud deployments
of the MLOps framework. During each of two runs,
measurements are collected and used to compare the
layers. The collected measurements are as follows:
Training Time: the time a machine needs to com-
plete the training procedure of the respective ML
workflow
Prediction Time: the time needed to ingest and
process a data sample to generate a prediction/-
classification based on the previous training. This
is split into the time required for the system to in-
gest the data point and the time that it needs to
calculate the result.
Communication Latency: latency of the asset to
the respective machine
Memory Utilization: maximum memory utiliza-
tion of the MLOps framework while executing the
ML workflows
CPU Utilization: maximum CPU usage of the ML
workflow
Energy Consumption: the amount of energy the machines use for training and generating predictions. This is generated by CodeCarbon², a Python package that estimates the amount of energy a device uses to execute a program. The values generated by CodeCarbon are estimates based on CPU utilization, CPU type, and duration of the calculations, as well as CPU power tracking if available³.
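Collecting such an estimate with CodeCarbon takes only a few lines; in the sketch below, train_workflow() is a hypothetical stand-in for the routine being measured.

# Wrapping a measured routine with CodeCarbon's tracker; train_workflow()
# is a hypothetical placeholder for the actual training or prediction run.
from codecarbon import EmissionsTracker

tracker = EmissionsTracker()
tracker.start()
train_workflow()             # the routine whose energy use is being estimated
emissions = tracker.stop()   # returns the CO2 estimate in kg
# The energy estimate in kWh is written to CodeCarbon's emissions.csv output.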
The experiment setup can be seen in Fig. 3. The edge device is located directly on a shop floor in Vienna (AUT), the fog instance (OpenStack VM) is hosted in a nearby data center in the same building, and the cloud instance (EC2 VM) is hosted in the nearest AWS data center in Frankfurt (GER).
Figure 3: Experiment Design (a testing framework on the shop floor, consisting of a virtual asset and a metrics collector, sends data over the network to the MLOps framework instances on the edge device, the fog-layer OpenStack VM, and the cloud-layer EC2 VM, and collects metrics from each deployment).
² https://codecarbon.io/
³ https://mlco2.github.io/codecarbon/methodology.html#power-usage
A testing framework we developed is executed on a machine located on the same shop floor as the edge device. The framework provides a virtual asset corresponding to the type of asset expected by the respective ML workflow (e.g., PCB data or IMU data). The testing framework also automatically starts training runs on all deployments and collects training time, CPU and memory utilization, and energy consumption as described above. After the training, a set of ten prediction cycles is started. For each cycle, a data point is generated by the virtual asset and ingested into the system. A prediction is then generated, and metrics like prediction time, energy consumption, and CPU and memory consumption are calculated. Finally, communication latency is measured by performing a series of pings.
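The latency measurement can be sketched as follows, parsing the output of the system ping utility; the host names are illustrative assumptions.

# Sketch of the latency measurement via a series of pings, parsing the
# Linux ping utility's output; host names are illustrative.
import re
import statistics
import subprocess

def ping_rtts(host: str, count: int = 10) -> list:
    out = subprocess.run(["ping", "-c", str(count), host],
                         capture_output=True, text=True).stdout
    # Extract the "time=12.3 ms" round-trip values from the output
    return [float(m) for m in re.findall(r"time=([\d.]+)", out)]

for host in ["edge-device.local", "fog-vm.local", "cloud-vm.example.com"]:
    rtts = ping_rtts(host)
    if rtts:
        print(f"{host}: mean RTT {statistics.mean(rtts):.1f} ms")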
4.4 Results
For each workflow, two training sessions and 20 pre-
diction cycles were performed. The collected metrics
are as follows:
Table 1: System measurements.

platform    latency [ms]    CPU utilization [%]    memory consumption [MiB]
edge             8.5              313.5                     1144
fog            173.8              150.7                     1138
cloud          241.9              146                       1169

Table 1 shows the system measurements that have been collected. The first column shows the communication latency from the asset on the shop floor to
the respective machines. This clearly shows the ad-
vantage of the edge device being in the same room, causing a latency of only 8.5 ms. In contrast, the fog and cloud machines exhibit a much greater latency.
The second column shows the maximum CPU
utilization of a training/prediction cycle of the PCB
workflow. Here 100% corresponds to the complete
utilization of a single CPU core. The measurements
show a high utilization of the edge device due to the
weaker hardware, whereas fog and cloud have a lower
utilization due to more powerful hardware.
The last column shows the maximum memory uti-
lization of the PCB workflow runs. These do not show
any significant differences. As long as the application
does not exceed the available memory on the given
machine, no swapping occurs, which could influence
the memory footprint.
CPU and memory utilization were only collected
for the PCB workflow. All further measurements are
collected for both workflows and are shown in the fol-
lowing figures.
4.4.1 Training
Fig. 4 shows the training time of the PCB and IMU
workflows on the cloud, fog and edge machines. The
bar plots show the mean of the two executed runs, and
the error bars show the standard deviation.
Figure 4: Training Time.

The green bars show the training time of the PCB workflow. Despite the different hardware equipment, the training time of the PCB workflow shows almost
no differences between the deployment modes. This
is because the training algorithm of the PCB work-
flow requires minimal effort. Most of the time comes
from deploying the workflow via a zip file (e.g. from
the framework extracting the file and adding it to the
AI runtime). The duration of this process seems to
depend very little on the machine’s hardware, as the
compression implementation does not utilize the full
hardware potential.
The blue bar shows the training time of the IMU
workflow. The training algorithm takes a much larger
share of the overall training effort in this workflow.
The training on the edge device takes the longest,
which is due to the weaker hardware. The training on the fog and cloud machines is faster; however, there is only a minimal difference between the two. That the difference is not greater is possibly due to the training algorithm not being fully parallelized. Because of this lack of parallelization, the time depends primarily on the single-core performance of the machines, which, while similar on both machines, still differs due to the different architectures of the AMD and Intel CPUs.
Figure 5: Training Energy Consumption.
Fig. 5 shows the energy consumption of the train-
ing runs of the PCB and IMU workflows. Again, the
bar plots show the mean of the two executed runs and
the error bars show the standard deviation. The plot is
split into two diagrams due to the different scales of each workflow. The left plot shows the energy con-
sumption of the PCB workflow in kWh. The energy
consumption is very low because the training algo-
rithm is not very demanding. Nevertheless, it shows
that the edge device consumes the least energy, due
to the weaker hardware, while the fog machine con-
sumes a little more than the edge device. The cloud machine consumes the most energy, as it has the most powerful hardware.
The right plot shows the energy consumption of
the IMU workflow. Here the difference between the
edge device, which consumes the least energy, and
the fog and cloud machines, which consume more
and roughly the same amount of energy, is less pro-
nounced.
This is due to the weaker hardware of the edge
device, which on the one hand consumes less energy,
but on the other hand also takes longer to complete
the training, which in turn raises the total amount of energy spent.
The cloud and fog machines consume roughly the
same amount of energy, as they also consume a very
similar amount of time for training.
4.4.2 Prediction
Figure 6: Prediction Time.
Fig. 6 shows the prediction time of the PCB and IMU
workflows. The bar plots show the mean of 20 runs
that we executed, and the error bars show the standard
deviation.
The bars are further split into two sections. The
darker section shows the time that the respective sys-
tem uses for the calculation of the model output, while
the lighter section on top shows the time that is added
by the system to ingest the data point, therefore rep-
resenting the communication latency. Both sections
combined show the total time that is needed to gener-
ate the model output.
Both workflows show a very similar pattern. The
measurements of the edge device show that the la-
tency is very low but due to the weaker hardware, the
time that is needed for the calculation of the model
output is the highest. Therefore, the total time needed
for the prediction is the highest, as the low latency
has only a minuscule impact on the overall prediction
time. On the fog machine, the time needed for calcu-
lation of the model output is lower, due to the more
powerful hardware, but the latency is higher due to
the greater physical distance to the asset. So the im-
pact of the latency on the overall prediction time is
higher. However, the overall prediction time is still
lower than on the edge device. The cloud machine
offers the lowest overall prediction time, even though
the latency has a greater relative impact on the overall
time than on the other devices. Due to the much faster calculation of the model output, the overall prediction is nevertheless the fastest.
Figure 7: Prediction Energy Consumption.
Fig. 7 shows the energy consumption of the pre-
diction runs of both workflows. The bar plots show
the mean of the energy consumption of 20 runs in
kWh and the error bars show the standard deviation.
Both workflows show a similar pattern. However, on all platforms the IMU workflow shows approximately double the energy consumption of the PCB workflow, due to its more demanding algorithm. The edge device consumes the least energy, due to the weaker hardware, while the fog and cloud machines consume more energy, with the cloud machine having the highest energy consumption.
4.5 Discussion
As demonstrated in Section 4, there are several differ-
ences when running machine learning tasks on vari-
ous deployment modes. The individual implications
of the results provided are discussed as follows:
Resource Availability: The measurements of the
training, as well as the prediction time show that the
higher levels in the deployment hierarchy can benefit
significantly from the computational resources avail-
able on these levels. The resources available on the
cloud and fog level greatly exceed the capabilities
of typical edge devices. This can benefit particu-
larly computationally demanding tasks such as ma-
chine learning. We showed that the performance ad-
vantage of cloud and fog machines can even exceed
the latency advantage provided by edge devices due
to their proximity to the assets.
Energy Consumption: The measurements conducted
within this paper show that due to the weaker hard-
ware resources of edge devices, they can execute
the ML tasks with lower overall power consumption.
This makes the usage of edge devices economical and
environmentally friendly.
The energy measurements are, however, limited,
as they rely on a software package that tracks CPU
and memory usage. This provides reasonably accu-
rate energy consumption estimates, but the concrete
kWh values might not be entirely correct. However,
as the same method is used for all measurements, the
relative difference between the measurements is still
valid.
Qualitative Aspects: Other differences between the
deployment levels are qualitative aspects, such as se-
curity or privacy. With edge computing and possi-
bly fog computing, the data is processed on the edge
device or fog instance, which can be beneficial for
privacy, as the data does not have to be transmit-
ted to a remote server possibly located in a different country (Mendez et al., 2022; Satyanarayanan, 2017).
This can be especially important for industrial appli-
cations, where the data might be sensitive and strict
regulations may apply.
Relevance of Platform-Agnostic MLOps: These find-
ings underline the relevance of a platform-agnostic
MLOps framework that enables data scientists and
operators to deploy machine learning models on dif-
ferent platforms while providing a consistent experi-
ence. Platform-agnostic MLOps frameworks enable
users to choose the platform that best fits their require-
ments without having to adapt their workflows to the
selected platform.
Feedback for the Operator: The experiments above
show that the Testing Framework provides useful
metrics for deploying an ML task. An operator can
use the Testing Framework to get crucial insight for
assessing the performance and viability of a deploy-
ment location for a specific ML scenario. These in-
sights can be used to inform optimal deployment de-
cisions tailored to the needs of the given ML task.
Generalizability: The results of our experiments are
expected to be generalizable across various machine
learning tasks. Performance differences among the
cloud, fog, and edge layers are anticipated to exist for most ML tasks, indicating that different tasks
may be better suited to different layers depending on
specific requirements and settings. This suggests that
a flexible, platform-agnostic approach is crucial for
optimizing MLOps across diverse scenarios.
Hybrid Deployments: Deployment is often not a bi-
nary choice; for instance, training might be best per-
formed in the cloud or fog, while inference is more
suitable for the edge. Our experimental results high-
light differences between the layers, suggesting that
hybrid deployments might be advantageous. How-
ever, these hybrid approaches are not considered in
the current study and are suggested for future re-
search.
5 CONCLUSION
This work presented a platform-agnostic MLOps
framework, which can be deployed on the cloud,
fog and edge level. The framework enables users to
choose the platform that fits their requirements best,
without the need to adapt workflows and practices
to a specific deployment mode. We also evaluated
the framework on real-world ML tasks and discussed
the implications that the deployment mode has on
the given task. These results underline the need for
platform-agnostic MLOps.
Future research should explore the potential of
hybrid deployments, where different stages of the
MLOps lifecycle are distributed across cloud, fog,
and edge environments. Additionally, incorporating
automatic orchestration mechanisms for detecting the
most suitable platform for each task and deploying models accordingly would enhance efficiency and perfor-
mance. This would involve developing intelligent
systems capable of dynamically optimizing deploy-
ment strategies based on task requirements and re-
source availability. These advancements could sig-
nificantly improve the flexibility and effectiveness of
MLOps frameworks.
REFERENCES
Calo, S. B., Touna, M., Verma, D. C., and Cullen, A. (2017).
Edge computing architecture for applying AI to IoT.
In 2017 IEEE International Conference on Big Data
(Big Data), pages 3012–3016.
Ding, A. Y., Peltonen, E., Meuser, T., Aral, A., Becker,
C., Dustdar, S., Hiessl, T., Kranzlmüller, D., Liyan-
age, M., Maghsudi, S., Mohan, N., Ott, J., Reller-
meyer, J. S., Schulte, S., Schulzrinne, H., Sol-
maz, G., Tarkoma, S., Varghese, B., and Wolf, L.
(2022). Roadmap for edge AI: A Dagstuhl perspec-
tive. ACM SIGCOMM Computer Communication Re-
view, 52(1):28–33.
John, M. M., Holmström Olsson, H., and Bosch, J. (2020).
AI on the Edge: Architectural Alternatives. In 2020
46th Euromicro Conference on Software Engineering
and Advanced Applications (SEAA), pages 21–28.
Kemnitz, J., Weissenfeld, A., Schoeffl, L., Stiftinger, A.,
Rechberger, D., Prangl, B., Kaufmann, T., Hiessl, T.,
Holly, S., Heistracher, C., and Schall, D. (2023). An
Edge Deployment Framework to Scale AI in Indus-
trial Applications. In 2023 IEEE 7th International
Conference on Fog and Edge Computing (ICFEC),
pages 24–32.
Kreuzberger, D., Kühl, N., and Hirschl, S. (2023). Machine
Learning Operations (MLOps): Overview, Definition,
and Architecture. IEEE Access, 11:31866–31879.
Mendez, J., Bierzynski, K., Cuéllar, M. P., and Morales,
D. P. (2022). Edge Intelligence: Concepts, Ar-
chitectures, Applications, and Future Directions.
ACM Transactions on Embedded Computing Systems,
21(5):48:1–48:41.
Nastic, S., Raith, P., Furutanpey, A., Pusztai, T., and Dust-
dar, S. (2022). A Serverless Computing Fabric for
Edge & Cloud. In 2022 IEEE 4th International Con-
ference on Cognitive Machine Intelligence (CogMI),
pages 1–12.
Raj, E., Buffoni, D., Westerlund, M., and Ahola, K. (2021).
Edge MLOps: An Automation Framework for AIoT
Applications. In 2021 IEEE International Conference
on Cloud Engineering (IC2E), pages 191–200.
Rausch, T. and Dustdar, S. (2019). Edge Intelligence: The
Convergence of Humans, Things, and AI. In 2019
IEEE International Conference on Cloud Engineering
(IC2E), pages 86–96.
Rausch, T., Hummer, W., Muthusamy, V., Rashed, A., and
Dustdar, S. (2019). Towards a Serverless Platform for
Edge AI. In 2nd USENIX Workshop on Hot Topics
in Edge Computing (HotEdge 19).
Satyanarayanan, M. (2017). The Emergence of Edge Com-
puting. Computer, 50(1):30–39.
Trabesinger, S., Butzerin, A., Schall, D., and Pichler, R.
(2020). Analysis of High Frequency Data of a Ma-
chine Tool via Edge Computing. Procedia Manufac-
turing, 45:343–348.