Beneﬁts of Dynamic Computational Ofﬂoading for Mobile Devices

Vinay Yadhav, Andrew Williams, Ondrej Smid, Jimmy Kjällman, Raihan Ul Islam⁴,∗, Joacim Halén⁵,∗ and Wolfgang John

¹ Ericsson Research, Bangalore, India
² Ericsson Research, Stockholm, Sweden
³ Ericsson Research, Jorvas, Finland
⁴ East West University, Dhaka, Bangladesh
⁵ Independent Researcher, Stockholm, Sweden

Keywords:

Computational Ofﬂoading, Edge Computing, WebAssembly, Edge-Native Application Development, 5G, 6G.

Abstract:

The proliferation of applications across mobile devices, coupled with fast mobile broadband, has led to

expectations of better application performance, user experience, and extended device battery life. To address

this, we propose a dynamic computational ofﬂoading solution that migrates critical application tasks to re-

mote compute sites within mobile networks. Ofﬂoading is particularly advantageous for lightweight devices,

as it enables access to more capable processing hardware. Application developers can also leverage the of-

ﬂoading service to customize features, address privacy concerns, and optimize performance based on user

requirements. Moreover, the solution facilitates local synchronization among collaborating users. Our so-

lution focuses on ad-hoc deployment and dynamic scheduling of ﬁne-grained application tasks triggered by

changes in device metrics, without extensive development efforts. It extends application functionality from

mobile devices to remote compute environments, complementing the cloud-to-edge paradigm. We introduce

a distributed execution framework based on portable, lightweight, and secure WebAssembly runtimes. Ad-

ditionally, we present a programming model to simplify ad-hoc deployment and dynamic invocation of task

modules during runtime. We demonstrate the beneﬁts of our solution, showing signiﬁcant performance im-

provements of the application, and reduced energy consumption and heat generation on the mobile device.

1 INTRODUCTION

With the advent of a global app economy, most users

have a large variety of applications running on mo-

bile phones and increasingly also on other types of

user equipment like XR headsets, cars, or drones, all

connected via cellular networks. Many of these con-

nected, mobile devices offer limited computational

capabilities and/or energy capacity to run certain ap-

plications with high quality of experience for more

than a limited time. Computational Ofﬂoading is

a subset of the distributed computing paradigm that

concerns the ability to migrate a component of a run-

ning application from a mobile User-Equipment (UE)

to a remote ofﬂoading site. The end-user of an ap-

plication would beneﬁt from the service’s automated

capabilities to balance compute and energy tradeoffs

between the UE and the offloading site. Executing certain critical parts of an application in an external compute infrastructure can increase the application’s quality of experience by giving offloaded tasks remote access to domain-specific hardware accelerators such as GPUs. At the same time, offloading can prolong battery life by decreasing the UE’s power utilization and device heat. For very lightweight devices (e.g., head-mounted devices, IoT sensors), this might actually be the only option to offer a richer application experience. Application developers can also leverage an offloading service to customize features, address privacy concerns, and optimize performance based on user requirements and contextual factors.

∗ Was with Ericsson when writing this paper.

Yadhav, V., Williams, A., Smid, O., Kjällman, J., Islam, R., Halén, J. and John, W.
Benefits of Dynamic Computational Offloading for Mobile Devices.
DOI: 10.5220/0012719800003711
Paper published under CC license (CC BY-NC-ND 4.0)
In Proceedings of the 14th International Conference on Cloud Computing and Services Science (CLOSER 2024), pages 265-276
ISBN: 978-989-758-701-6; ISSN: 2184-5042

The core idea behind the dynamic computational offloading solution described in this paper is to expand application functionality from connected user equipment to a remote compute environment, e.g., located within the cellular network. This differs from typical edge computing solutions, which target the edge from the opposite direction, i.e., moving application functionality traditionally running in the cloud to

edge or on-premises compute facilities using largely

static, pre-deployed application servers targeting ver-

tical use-cases with big groups of users. In contrast,

our solution aims at dynamic ofﬂoad and deploy-

ment of highly granular application tasks, triggered

by the application based on situational changes such

as changes in the device (e.g., battery levels), network

(e.g., radio quality) or application needs. This implies that this solution is use-case agnostic, offering sandboxed remote computational resources to any application on a UE, able to execute ad-hoc deployed application tasks, invoked dynamically essentially anytime and anywhere. We believe that such a solution has the potential to target even the long tail of regional enterprises and developers that usually would not engage in the heavy burdens of deployment, management, and contract handling related to edge solutions like MEC (ETSI, 2023). In this paper we evaluate such a solution based upon two novel features:

• A distributed execution framework that takes ad-

vantage of the light weight and portability of stan-

dalone WebAssembly (Wasm) runtimes

• A programming model and associated develop-

ment toolchain that abstracts away the complex-

ities of communication between a device applica-

tion and an ofﬂoaded component

This paper is structured as follows: Section 2 de-

scribes the requirements and design of our proposed

ofﬂoading solution. Section 3 describes the demo

setup and proof-of-concept (PoC) implementation of

the solution concept. In Section 4 we present and dis-

cuss our measurement results, showing when ofﬂoad-

ing is beneﬁcial for the application or the mobile de-

vice. Section 5 summarizes and relates current state

of the art. Finally, Section 6 concludes the paper and

points out next steps towards our vision of dynamic

computational ofﬂoading as a network service.

2 SYSTEM DESIGN

This section describes details about the design of our

proposed ofﬂoading framework, including require-

ments, the basic system architecture, the proposed

programming model and development toolchain for

optimal usage of the ofﬂoading framework, as well as

crucial security considerations.

2.1 Requirements

Figure 1: System architecture with communication interfaces (control plane in solid, user plane in dashed lines).

In order to design a dynamic computational offloading solution according to our vision and idea, we state the following qualitative requirements:

• Platform independent, to allow an ofﬂoaded task

to execute on a variety of different hardware ar-

chitecture and operating system combinations.

• Lightweight so that the framework does not im-

pose an excessive overhead and negatively affect

ofﬂoaded component load time.

• Low footprint so that the framework size is mini-

mized compared to the ofﬂoaded component.

• Secure, providing isolation between tenants and

between tenants and infrastructure.

• Language independent and open, allowing devel-

oper ﬂexibility and extensibility.

As a result of these requirements, we selected stan-

dalone WebAssembly (Wasm) runtimes as the basic

execution environment in our solution. While Wasm is

relatively immature compared to more conventional

portable bytecode formats such as Java bytecode, or to more

heavyweight distributed computing runtime platforms

such as containers or virtual machines, we believe that

Wasm is a promising technology for scenarios that re-

quire portable and secure runtimes.

2.2 System Architecture

Based on our objectives and requirements, we derived

a system architecture for an ofﬂoading service frame-

work as depicted in Figure 1. It has two major com-

munication interfaces, one for the control plane and

one for the user plane of the ofﬂoading service.

The service control plane is split into two parts:

(i) User-Equipment (UE) side and (ii) remote ofﬂoad-

ing cluster side. The latter consists of functionality to

manage ofﬂoad requests and a collection of resources

handling and executing ofﬂoaded modules. For the

application developer and the application running in

the UE, the framework manifests as:

A Notiﬁer is a local helper functionality which gath-

ers device metrics and exposes them via an API to the

running application.

An Offload Handler is responsible for handling the interaction with the application, discovering the closest offloading cluster, requesting and negotiating an offload of a specified application module, assembling and packaging the module together with all metadata necessary to offload it to the cluster, and handshaking the migration of the module with its counterpart on an allocated host in an edge cluster. It also maintains a list of all offloaded modules and their corresponding applications as well as a reference to the offloading hosts. The application communicates with the UE Offload Handler using an API which is implemented as a programming-language-specific library and contains functions to request and recall an offload, select an offloading host from an offload resource offer, and synchronize the migration of a module’s local state.
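As a rough illustration of how an application might use metrics exposed by the Notifier to decide when to ask the Offload Handler for an offload, consider the sketch below. The `DeviceMetrics` fields, the `OffloadTrigger` class, and all threshold values are hypothetical and chosen only for illustration; they are not part of the framework API described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceMetrics:
    """Snapshot of device state (hypothetical field names)."""
    battery_percent: float
    cpu_temp_celsius: float
    cpu_utilization: float  # fraction in [0, 1]


@dataclass(frozen=True)
class OffloadTrigger:
    """Illustrative thresholds; the values are assumptions, not measurements."""
    min_battery_percent: float = 20.0
    max_cpu_temp_celsius: float = 75.0
    max_cpu_utilization: float = 0.85

    def should_offload(self, m: DeviceMetrics) -> bool:
        # Any single exceeded threshold is enough to request an offload.
        return (
            m.battery_percent < self.min_battery_percent
            or m.cpu_temp_celsius > self.max_cpu_temp_celsius
            or m.cpu_utilization > self.max_cpu_utilization
        )


trigger = OffloadTrigger()
healthy = DeviceMetrics(battery_percent=80.0, cpu_temp_celsius=50.0, cpu_utilization=0.30)
hot = DeviceMetrics(battery_percent=80.0, cpu_temp_celsius=79.0, cpu_utilization=0.90)
print(trigger.should_offload(healthy))  # False
print(trigger.should_offload(hot))      # True
```

Keeping the trigger logic in the application, rather than in the helper that gathers metrics, matches the description above in which the application itself decides when to request or recall an offload.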

An ofﬂoading cluster instance handles and offers

remote compute to UEs and has two main functional

parts, depicted in Figure 1: (i) management of ofﬂoad

requests, mapping them to hosts (workers) based on

resource requirements; and (ii) the set of hosts where

each host synchronizes the code migration and

executes the offloaded module. The remote compute

part of the control plane consists of the following:

An Ofﬂoad Manager acts as the ﬁrst remote point of

contact for the Offload Handler of a UE that wishes

to ofﬂoad. It coordinates activities such as authenti-

cation, authorization, resource management, etc. for

each request.

A Resource Manager is responsible for the allocated

resources (i.e., hosts) assigned to the ofﬂoading clus-

ter. It selects a (virtual) host with capabilities that

match the requested requirements as speciﬁed in an

ofﬂoad request for a module.

A Remote Ofﬂoad Handler is responsible for syn-

chronizing an ofﬂoading event with its counterpart in

the UE, including replication of the runtime execu-

tion environment for the ofﬂoaded module as well as

re-establishing the internal and external ﬂows. Each

(virtual) host will have its own Remote Ofﬂoad Han-

dler instance.

An Execution handler is responsible for executing the offloaded task. We base the execution of tasks on Wasm, an instruction format designed to be executed on a memory-safe and sandboxed stack-based virtual machine (Haas et al., 2017). An application running on a host machine embeds a WebAssembly runtime, allowing it to load Wasm modules and call exported functions contained within the module, known as guest functions. Though sandboxed, the Wasm module can optionally access host machine functionality either through the standardized WASI system interface or through non-standard host function calls to host resources like networking, databases, or access to HW accelerators. The offloading framework execution handler acts as the remote application and is extendable through the use of both common and developer-supplied host functions, which provides a lightweight programmatic method of providing additional capabilities to the offloaded module. For communication, the execution handler exposes a RESTful API to the Remote Offload Handler that allows for the invocation of Wasm modules and the calling of exposed Wasm functions. It also exposes an abstracted interface to the Wasm runtime, allowing various implementations such as WasmEdge or Wasmtime to be selected at runtime depending on application requirements. Once instantiated, the offloaded module communicates with the parent application directly rather than through the service framework.

Module: a part of the application that can be offloaded, such as a single function, a task, or the whole application.

Wasm: https://webassembly.org
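One possible shape for the abstracted runtime interface of the execution handler is sketched below. It assumes a hypothetical `WasmBackend` protocol with `load` and `call` methods and a registry keyed by backend name. The `FakeBackend` class is a stand-in for a real engine binding: it maps exported function names to plain Python callables, so no actual Wasm is executed.

```python
from typing import Callable, Dict, Protocol


class WasmBackend(Protocol):
    """Hypothetical minimal interface an execution handler could program against."""

    def load(self, module_bytes: bytes) -> None: ...
    def call(self, export: str, *args: int) -> int: ...


class FakeBackend:
    """Stand-in engine: 'loads' a module by enabling pre-registered callables.

    A real backend would hand module_bytes to a Wasm engine instead.
    """

    def __init__(self, exports: Dict[str, Callable[..., int]]):
        self._available = exports
        self._loaded: Dict[str, Callable[..., int]] = {}

    def load(self, module_bytes: bytes) -> None:
        if not module_bytes:
            raise ValueError("empty module")
        self._loaded = dict(self._available)

    def call(self, export: str, *args: int) -> int:
        if export not in self._loaded:
            raise KeyError(f"no exported function named {export!r}")
        return self._loaded[export](*args)


# Registry of backend factories, keyed by a name chosen at run time.
BACKENDS: Dict[str, Callable[[], WasmBackend]] = {
    "fake-a": lambda: FakeBackend({"add": lambda a, b: a + b}),
    "fake-b": lambda: FakeBackend({"add": lambda a, b: a + b, "neg": lambda a: -a}),
}


def make_backend(name: str) -> WasmBackend:
    return BACKENDS[name]()


backend = make_backend("fake-b")
backend.load(b"\x00asm")  # placeholder bytes, not a valid module
print(backend.call("add", 2, 3))  # 5
print(backend.call("neg", 7))     # -7
```

Because callers depend only on the protocol, swapping one engine for another would reduce to registering a different factory under a new name.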

2.3 Programming Model

One of the ambitions of this service is to hide the

distributed nature of an application with ofﬂoadable

components from the developer. The programming

models for ofﬂoading should therefore guide a de-

veloper in the task of dividing the application func-

tionality in a simple way. They should feel intuitive

and as natural as possible for the developer, that is,

they should preferably be modiﬁcations or extensions

to well-known existing models. We have focused on

two models to realize the dynamic ofﬂoading solu-

tion, namely channels and functions. Below, one can

ﬁnd brief descriptions of both.

Firstly, we developed a point-to-point communication method, known as an elastic channel. Normally, channel endpoints are created and connected at the location of the threads or processes and exist throughout their lifetime. An elastic channel is an extension of the channel model where endpoints can transparently change location during their lifetime. In the proposed offloading service framework, elastic channels are asynchronous and bidirectional, supporting both one-to-one and one-to-many communication. In the current implementation, one endpoint remains stationary and fixed in the UE device, while the other endpoint is flexible, implying that it can freely move around, e.g., to the closest offloading cluster at the time. Whether an endpoint is flexible or static is explicitly declared by the developer, enabling elastic channels to be constructed as a library without the need for external resources.

WASI: https://github.com/WebAssembly/WASI/

WasmEdge: https://wasmedge.org/

Wasmtime: https://wasmtime.dev/
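The elastic channel idea can be illustrated with a toy, in-process model. The sketch below is not the framework's implementation: the class name, method names, and location labels are hypothetical, and the network transport is replaced by two in-memory queues. It only shows that the flexible endpoint can change location while the fixed endpoint keeps using the same channel object.

```python
import queue
from dataclasses import dataclass, field


@dataclass
class ElasticChannel:
    """Toy in-process model of an elastic channel (names are hypothetical).

    The fixed endpoint stays on the UE; the flexible endpoint has a location
    that may change while messages are queued.
    """
    flexible_location: str = "ue"
    _to_flexible: queue.Queue = field(default_factory=queue.Queue)
    _to_fixed: queue.Queue = field(default_factory=queue.Queue)

    # Fixed (UE) endpoint
    def send_from_fixed(self, msg: bytes) -> None:
        self._to_flexible.put(msg)

    def recv_at_fixed(self) -> bytes:
        return self._to_fixed.get_nowait()

    # Flexible endpoint
    def send_from_flexible(self, msg: bytes) -> None:
        self._to_fixed.put(msg)

    def recv_at_flexible(self) -> bytes:
        return self._to_flexible.get_nowait()

    def move_flexible(self, new_location: str) -> None:
        # Only the location changes; queued messages and the fixed
        # endpoint's handle on the channel are left untouched.
        self.flexible_location = new_location


ch = ElasticChannel()
ch.send_from_fixed(b"frame-1")      # sent while the peer is still local
ch.move_flexible("edge-cluster-1")  # peer is offloaded
print(ch.flexible_location)         # edge-cluster-1
print(ch.recv_at_flexible())        # b'frame-1'
ch.send_from_flexible(b"detections-1")
print(ch.recv_at_fixed())           # b'detections-1'
```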

Secondly, a form of Remote Procedure Call (RPC) API known as elastic functions was developed, based upon elastic channel network functionality. Functions and procedures are the fundamental way to abstract an expression or a set of statements in most programming languages. As functions and procedures are so fundamental, it would be good if they could be made elastic in a similar way to the elastic channels. That is, a call (invocation) should look the same both when a module containing the implementation of the function or procedure resides in the same execution runtime and when it has been moved to an execution runtime at a different location. In contrast to RPC, here the remote end is not fixed, and the function can be executed locally or remotely (offloaded). To do that, we must be able to identify the functions and procedures that could be moved during an offload and transform them into functions and procedures that have the same interface and behaviour but are movable in a transparent way. We realize this via an extra pre-processing step in the compiler toolchain that permits the developer to tag functions as offloadable. Functions with such a tag have their interfaces rewritten prior to compilation as elastic channel functions. In this case two versions of the function are built, one that will be natively run on the device, and another compiled to WebAssembly format that can be readily run remotely, independent of the exact hardware and software platform choices at the offloading cluster.
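The effect of tagging a function as offloadable can be approximated in a few lines. Note that the sketch below swaps in a Python decorator for the compiler pre-processing step described above, and that `offloadable`, `PLACEMENT`, and `fake_remote_invoke` are hypothetical names. The remote path simply calls the same Python function, whereas the described toolchain would invoke a separately compiled Wasm build over an elastic channel.

```python
import functools
from typing import Any, Callable, Dict, List

# Current placement per function name: "local" or "remote" (hypothetical registry).
PLACEMENT: Dict[str, str] = {}
# Record of calls routed to the remote side, kept only for illustration.
REMOTE_CALLS: List[str] = []


def fake_remote_invoke(name: str, func: Callable[..., Any], *args: Any) -> Any:
    """Stand-in for sending the call over an elastic channel.

    A real system would serialize the arguments and run the Wasm build of
    the function on the offloading host; here the same Python function is
    called so that the example stays self-contained.
    """
    REMOTE_CALLS.append(name)
    return func(*args)


def offloadable(func: Callable[..., Any]) -> Callable[..., Any]:
    """Tag a function so that call sites look the same wherever it runs."""
    PLACEMENT[func.__name__] = "local"

    @functools.wraps(func)
    def wrapper(*args: Any) -> Any:
        if PLACEMENT[func.__name__] == "remote":
            return fake_remote_invoke(func.__name__, func, *args)
        return func(*args)

    return wrapper


@offloadable
def count_objects(frame: list) -> int:
    return sum(1 for pixel in frame if pixel > 0)


frame = [0, 3, 0, 7]
print(count_objects(frame))            # 2, executed locally
PLACEMENT["count_objects"] = "remote"  # simulate an offload
print(count_objects(frame))            # 2, same call site, routed remotely
print(REMOTE_CALLS)                    # ['count_objects']
```

The call site `count_objects(frame)` is identical before and after the placement change, which is the property the elastic function model aims for.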

2.4 Ofﬂoad Sequence

We assume the following pre-conditions when operating this service: the UE and Offload Manager are aware of each other; there is a pre-established communication link between the UE Offload Handler and the Offload Manager; and the application is monitoring offload triggers. The first two of these conditions could, e.g., be established by adaptations to existing control plane functions of existing or future cellular networks (i.e., 5G/6G).

Given these pre-conditions, a module ofﬂoad se-

quence (using the control plane in Fig. 1) includes

the following steps:

1. Information from the notiﬁer results in an ofﬂoad-

ing trigger threshold being reached.

2. The application, via the ofﬂoad handler, informs

the remote ofﬂoad manager that there is a pending

ofﬂoad request.

3. The ofﬂoad manager responds with a set of offers

of resources obtained from the resource manager.

4. The application selects one of the offers and in-

forms the ofﬂoad manager, which instructs the rel-

evant remote ofﬂoad handler to assign resources.

5. The local and remote ofﬂoad handlers communi-

cate directly to transfer the module.

6. The execution handler initializes the module, and

communication with the local application com-

mences using elastic channels or functions.
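The six steps above can be traced with a small simulation. This is a minimal sketch under stated assumptions: the class and function names, the capability fields (`cpu_cores`, `has_gpu`), the battery-based trigger, and the "pick the host with the most cores" policy are all hypothetical, and steps 5 and 6 are only logged rather than modeled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Host:
    name: str
    cpu_cores: int
    has_gpu: bool


@dataclass(frozen=True)
class Requirements:
    min_cpu_cores: int
    needs_gpu: bool


class ResourceManager:
    def __init__(self, hosts: List[Host]):
        self.hosts = hosts

    def offers(self, req: Requirements) -> List[Host]:
        # Keep only hosts whose capabilities match the request.
        return [
            h for h in self.hosts
            if h.cpu_cores >= req.min_cpu_cores and (h.has_gpu or not req.needs_gpu)
        ]


class OffloadManager:
    def __init__(self, rm: ResourceManager):
        self.rm = rm
        self.assigned: Optional[Host] = None

    def request(self, req: Requirements) -> List[Host]:  # steps 2-3
        return self.rm.offers(req)

    def accept(self, host: Host) -> None:                # step 4
        self.assigned = host


def offload(battery_percent: float, threshold: float, manager: OffloadManager,
            req: Requirements, log: List[str]) -> Optional[Host]:
    if battery_percent >= threshold:                     # step 1: no trigger
        return None
    log.append("trigger")
    offers = manager.request(req)
    log.append(f"offers:{len(offers)}")
    if not offers:
        return None
    chosen = max(offers, key=lambda h: h.cpu_cores)      # application-specific choice
    manager.accept(chosen)
    log.append(f"accepted:{chosen.name}")
    log.append("module-transferred")                     # step 5 (not modeled)
    log.append("module-running")                         # step 6 (not modeled)
    return chosen


hosts = [Host("edge-a", 8, False), Host("edge-b", 40, True), Host("edge-c", 16, True)]
manager = OffloadManager(ResourceManager(hosts))
log: List[str] = []
chosen = offload(15.0, 20.0, manager, Requirements(4, True), log)
print(chosen.name)  # edge-b
print(log)
```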

2.5 Security Considerations

We identiﬁed an important requirement in security

and isolation, especially for a solution that executes

code imported from untrusted sources (i.e., the UEs

and their users) in a remote compute environment.

Here we consider security in the dynamic ofﬂoading

architecture from the point of view of the users and

applications on one hand, and the system itself on the

other. We outline a few key technologies we see as

building blocks for securing our solution.

In addition to utilizing WebAssembly-based sand-

boxing, we support ofﬂoading modules into Trusted

Execution Environments (TEEs), such as Conﬁden-

tial Virtual Machines (Guanciale et al., 2022), in cases

where offloaded modules and data need to be protected also from the offloading host. The UE side attests

these TEEs (Ménétrey et al., 2022) before modules

are migrated to them.

We also need to ensure that users are authenticated

and authorized to perform ofﬂoading and use remote

resources. In a scenario where the ofﬂoading system

is deployed as a service in a mobile network, the Au-

thentication and Key Management for Applications

(AKMA) (Huang et al., 2021) mechanism can per-

form authentication and authorization of the UE side

based on network subscription credentials, as well as

distribute shared symmetric keys to the UE and the

remote ofﬂoading system. These keys can be used

within Transport Layer Security (TLS) handshakes, in

which we also perform TEE attestation when needed.

3 DEMO PoC AND EXPERIMENTAL SETUP

In order to validate the functionality of the described

ofﬂoading solution and assess the resulting perfor-

mance, we developed an experiment setup based on

a search and rescue scenario involving hazardous en-

vironment exploration.

3.1 Use-Case Scenario

We assume a scenario in which a rescue team arrives

at a disaster site to explore the potentially dangerous

unknown environment using battery-driven robot ve-

hicles or drones. The goal is to discover the location

of speciﬁc objects such as human survivors or haz-

ardous materials. In such a critical situation it would

be advantageous to use more than one robot to simul-

taneously scout the area, each contributing to a full

picture of the search area by exploring a part of the

total environment. As such, this scenario requires col-

laboration between multiple robots on the same task.

To prolong the operation time of the battery-driven

robots and potentially even increase their scouting

performance, each robot can choose to ofﬂoad some

of its functions to a remote ofﬂoading site.

As the disaster site was a priori unknown, this scenario takes advantage of ad-hoc deployment of specific application tasks at exactly the place and time when needed. These tasks - here either computationally heavy functions or collaborative tasks - can then be scheduled dynamically based on situational changes, seamlessly turning the applications initially run on each robot into a distributed application.

3.2 Demo Application

The application written to support the proposed sce-

nario consists of three main tasks:

• The Navigation function, which was designed to

capture a video stream from a vehicle’s onboard

camera and forward it to the Object Detection function.

• The Object Detection function, which receives

video frames and uses a trained image detection

library to identify objects within a scene.

• The Map Server function, which acts as a com-

munication hub between vehicles, constructing a

common view of the region and objects provided

by the partial views supplied by each vehicle.

The Navigation function runs permanently onboard

the device due to its dependency upon the onboard

camera. The Map Server is ofﬂoaded by the ﬁrst ve-

hicle entering the scenario, with subsequent arrivals

connecting to it rather than ofﬂoading their own copy.

The resource-heavy Object Detection task initially

executes onboard each robot device, but may be of-

ﬂoaded to a local edge site once certain trigger thresh-

olds are met. It is this latter task that we use to mea-

sure both the beneﬁts and drawbacks in application

performance and resource utilization when migrating

a task between onboard and remote execution.

3.3 PoC Implementation

In our proof of concept (PoC) implementation, several

network-connected, battery-driven four-wheeled robots

were deployed on a flat area with opaque obstructions

representing buildings or rocky landscape.

Scattered across the environment we placed a number

of hazardous objects that must be identiﬁed by the ve-

hicles. In this layout, a robot entering from one side

of the region could only see a limited number of these

objects. In order to detect all the objects, a vehicle

would need to roam in the area, or multiple vehicles

could collaborate by pooling their partial view of the

environment to create a complete picture in less time.

Figure 2 describes the deployment of the appli-

cation as a vehicle arrives on site. Initially the three

components of the application are running in natively

compiled binary format on the resource-constrained

UE robot hardware under the supervision of the of-

ﬂoading framework. In this scenario, the ﬁrst robot

arriving on site triggers an ofﬂoad of the collaborative

Map Server in portable Wasm format to the ofﬂoad-

ing cluster co-located with the network access point.

It then proceeds to attempt to identify objects through

the Object Detection function, which makes use of

the YOLOv4 neural network software for object detection

(Bochkovskiy et al., 2020) based upon OpenCV 4.5.4.

Subsequent vehicles will attempt to deploy

their own Map Server, but are redirected to connect to

the existing server instead. This collaborative task is

shared among all robot UEs. Under certain conditions,

e.g., a low battery level at a robot, offloading

of the computationally heavy Object Detection function

is triggered as well. In this case, every robot will of-

ﬂoad its own individual instance of this function in

portable Wasm format, which can then be scheduled

dynamically during operations, depending on the cur-

rent situation. As shown in Figure 2, the computa-

tional ofﬂoading framework is deployed across both

the UEs and the ofﬂoading site, and allows communi-

cation between the UE device and the ofﬂoading site

through elastic functions as described in Section 2.3.
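The deploy-or-connect behaviour of the shared Map Server described above can be modeled in a few lines. The `OffloadingCluster` class and its `deploy_shared` method are hypothetical names used only for this illustration; the sketch tracks which UE triggered the deployment and which ones were redirected to the existing instance.

```python
from typing import Dict, List


class OffloadingCluster:
    """Toy model of deploy-or-connect for a shared module (hypothetical API)."""

    def __init__(self) -> None:
        self.shared: Dict[str, List[str]] = {}  # module name -> connected UEs

    def deploy_shared(self, module: str, ue_id: str) -> bool:
        """Return True if this call deployed the module, False if redirected."""
        deployed_now = module not in self.shared
        self.shared.setdefault(module, []).append(ue_id)
        return deployed_now


cluster = OffloadingCluster()
print(cluster.deploy_shared("map_server", "robot-1"))  # True: first arrival deploys
print(cluster.deploy_shared("map_server", "robot-2"))  # False: redirected to existing
print(cluster.shared["map_server"])                    # ['robot-1', 'robot-2']
```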

3.4 Experiment Setup

We conducted a number of experiments to examine the performance characteristics of offloading a particular function, Object Detection, using the described offloading framework. To collect relevant measurements from the devices, we developed a telemetry service. It permitted the gathering of metrics such as CPU/GPU utilization, CPU/GPU temperature, network utilization, and power consumption from both the UE device and the host at the offloading site.

Figure 2: Application configuration showing tasks running both on the UE (left) and the edge cluster (right) within the computational offloading framework. Offloaded tasks communicate with the parent application using elastic channels.

Robots: built from a Freenove 4WD Smart Car Kit.

OpenCV: https://opencv.org

In order to reflect different UE capabilities in our experiments, we employed two different types of devices as the base platform for the robot vehicle kits. The first one is a Raspberry Pi 3, which is an example of a device with limited computational capabilities. The second device is an Nvidia Jetson Nano, which is more powerful and comes with a 128-core Maxwell GPU. Wireless network connectivity was provided by a third-party 5G adapter compatible with both the Nvidia Jetson and the Raspberry Pi, which connected to a private 5G base station and a 5G standalone (SA) core network with access to a lab environment acting as the edge site. As a reference, we also performed offloading via a wired LAN connected directly to the same edge site. Note that, compared to the LAN, the 5G connection included an experimental core network and three additional IP hops to reach the edge site. We list detailed specifications of the hardware and operating systems for both devices in Table 1. The table also includes information about the offloading cluster.

We have given several live demos of the search & rescue scenario with multiple robots, serving as successful functional validation of the framework. For the following measurements, however, we constrained the operation to provide a higher level of consistency. Specifically, the car kit was removed so as not to interfere with the energy readings, and we used a recorded image stream taken from the camera whilst under testing. Furthermore, only one UE was used at a time during the measurement campaigns in order to avoid network and processor congestion.

Once all application components on the UE were

started, a spin-up time of 150 sec was allowed for the

system to reach a stable state. Subsequently, the ob-

ject detection module was locally executed for 150

sec before being ofﬂoaded to the ofﬂoading cluster.

Remote execution of the object detection task was

then performed for 150 sec. This local and remote

execution was repeated three times for each device.
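The measurement procedure above amounts to a fixed timeline per device, which the following sketch reconstructs. The constant and function names are illustrative, and the assumption that phases follow each other without gaps is ours.

```python
from typing import List, Tuple

SPIN_UP_S = 150  # settle time before measuring
PHASE_S = 150    # duration of each local or remote phase
CYCLES = 3       # local + remote repetitions per device


def build_schedule() -> List[Tuple[str, int, int]]:
    """Return (phase, start_s, end_s) tuples for one device run."""
    schedule = [("spin-up", 0, SPIN_UP_S)]
    t = SPIN_UP_S
    for _ in range(CYCLES):
        for phase in ("local", "remote"):
            schedule.append((phase, t, t + PHASE_S))
            t += PHASE_S
    return schedule


schedule = build_schedule()
for phase, start, end in schedule:
    print(f"{phase:8s} {start:5d} s -> {end:5d} s")
print(schedule[-1][2])  # 1050: total run length in seconds
```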

Table 1: Experiment equipment.

Mobile devices
          Raspberry Pi 3B                                Nvidia Jetson Nano
CPU       Quad Core 1.2GHz Broadcom BCM2837 64bit CPU    Quad-core ARM A57 @ 1.43 GHz
GPU       -                                              128-core Maxwell
RAM       1 GB                                           4 GB
Adapter   Waveshare 500Q-GL 5G HAT                       Waveshare 500Q-GL 5G HAT
OS        Ubuntu 22.04 server                            Ubuntu 20.04 Desktop

Offloading Cluster
CPU       Intel(R) Xeon(R) Silver 4210R CPU @ 2.40GHz, 40 Core
RAM       62 GB
GPU       2560-core NVIDIA Tesla T4
OS        Ubuntu 22.04 server

4 RESULTS AND DISCUSSIONS

In our experiments, we focused on the measurable

advantages of ofﬂoading compute-intensive tasks on

two examples of mobile devices: the Raspberry Pi and

Nvidia Jetson Nano. These measurements encompass

various aspects and performance metrics of the de-

vices, shedding light on the efﬁciency and behavior of

the Object Detection module and the ofﬂoading solu-

tion throughout the experimentation process.

4.1 Raspberry Pi

Figure 3 presents the measurements of CPU, tempera-

ture, network and power usage of a Raspberry Pi dur-

ing local and remote execution of the object detection

module, represented as graphs of resource usage ver-

sus time. We have conducted a cycle of three local and

three remote executions of this module, each of 150

sec duration. Those regions of the graphs are shaded

dark (local) and light grey (remote execution), respec-

tively. During the ﬁrst instance of ofﬂoading, the ob-

ject detection module is transferred from the device

to the remote host and instantiated, thus imposing an

additional resource burden equivalent to a cold start.

Subsequent ofﬂoad events simply activate the already

transferred but dormant module code, more closely

resembling a warm start event.

It can be observed from Figure 3[a] that CPU

utilization rises from idle to around 90% during local

execution and returns to on average below 10% during

ofﬂoad execution of the computationally heavy mod-

ule. When executed locally, the object detection task

constitutes the bulk of CPU usage. On the other hand,

when ofﬂoaded the application handles data transmis-

sion and reception over the network, along with asso-

ciated serialization and de-serialization of this infor-

mation. During the initial idle period, two spikes in

CPU utilization are observed. These spikes occur as a

result of the messaging involved in the initial setup of

the ofﬂoading procedure and loading the application

into memory. We conclude that for compute-intensive

AI tasks like object detection, the ofﬂoading to a re-

mote site signiﬁcantly decreases CPU utilization of a

resource constrained device, clearly outweighing the

additional networking burden.

Figure 3[b] illustrates the temperature of the CPU

during the experiments. During onboard execution

the processor temperature rises signiﬁcantly before

dropping off sharply when ofﬂoading commences.

Local execution was limited in time in order to prevent

CPU throttling, which sets in at around 80 degrees

Celsius on a Raspberry Pi, eventually rendering

the device unresponsive.

Power usage is shown in Figure 3[c], clearly

showing the relation between CPU usage and power

consumption. When ofﬂoaded, power usage is

slightly elevated over the idle value of approximately

2 Watts due to the energy drain of the 5G adapter,

but still signiﬁcantly below the approximately 5 Watts

consumed during local execution.

Figures 3[d] and 3[e] demonstrate the increase in

data transmission between the application and the of-

ﬂoaded component as a consequence of remote exe-

cution of this task. During local execution the net-

work usage is negligible. A spike in resource usage

during initial local execution is visible, representing

the transfer of the code module.

Figure 4 shows the Object Detection response time when it is executed locally, remotely over 5G, and remotely over LAN, respectively. This consists of the total time taken to detect an object on a supplied video frame, including data transmission time when offloaded. The observed variability can be partially attributed to the fact that the time taken to perform the object detection depends on parameters other than resource availability, such as the complexity of the video frame in question and the number of objects represented. The order-of-magnitude performance improvement when executed remotely, from almost 1300 ms locally on the device to about 90 ms (5G) or 65 ms (LAN), clearly demonstrates the performance benefits of offloading from a resource-constrained device.

Table 2: Power vs Performance for Raspberry Pi.

                          Energy (in mWh per frame)
Local Execution           2.02
Remote Execution - 5G     0.09
Remote Execution - LAN    0.06

To further analyze the UE energy consumption of an object detection task, we calculated the energy required to receive a reply from the object detection task for each frame of the simulated video stream, measured in milliwatt-hours (mWh). This metric, referred to as mWh per frame, provides a normalized measure of the energy efficiency of the object detection process from a UE point of view (i.e., high CPU utilization when processing onboard vs. lower CPU utilization but additional data transfer when offloaded). Table 2 presents the results, indicating that the energy consumption per frame is over an order of magnitude lower for remote execution than for local execution on this device. This suggests that offloading the object detection task to a remote host results in more energy-efficient execution. Note also the slight difference between 5G and LAN connectivity. This gives an indication of the additional energy usage of the 5G HAT, which is negligible compared to the energy used to power the CPU at full capacity.
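The mWh-per-frame metric can be illustrated with a minimal sketch. The exact calculation used in the measurements is not spelled out in the text, so the approach below is an assumption: device power is sampled at a fixed interval, integrated over the measurement period with the trapezoidal rule, and divided by the number of frames answered in that period. The function names, the sampling interval, and the example trace are hypothetical.

```python
from typing import Sequence

JOULES_PER_MWH = 3.6  # 1 mWh = 0.001 W * 3600 s = 3.6 J


def energy_mwh(power_w: Sequence[float], sample_period_s: float) -> float:
    """Integrate evenly spaced power samples (in watts) into energy (in mWh)
    using the trapezoidal rule."""
    if len(power_w) < 2:
        return 0.0
    joules = sum(
        (a + b) / 2.0 * sample_period_s
        for a, b in zip(power_w, power_w[1:])
    )
    return joules / JOULES_PER_MWH


def mwh_per_frame(
    power_w: Sequence[float], sample_period_s: float, frames_completed: int
) -> float:
    """Normalize the energy of a measurement period by the number of frames
    for which a detection reply was received in that period."""
    if frames_completed <= 0:
        raise ValueError("frames_completed must be positive")
    return energy_mwh(power_w, sample_period_s) / frames_completed


# Hypothetical trace: a constant 5 W draw sampled once per second for 13 s,
# during which 10 frames (about 1.3 s each) were processed locally.
trace = [5.0] * 14
print(f"{mwh_per_frame(trace, 1.0, 10):.2f} mWh per frame")
```

With these rounded inputs (about 5 W and about 1.3 s per frame, as quoted for local execution on the Raspberry Pi), the sketch yields roughly 1.8 mWh per frame, which is of the same order as the 2.02 mWh reported in Table 2.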

In summary, by offloading the object detection module and executing it remotely, the CPU usage, power consumption, and temperature of the Raspberry Pi decreased notably compared to local execution. Network usage during remote execution increased correspondingly, but this did not lead to a significant increase in power consumption. We conclude that offloading computationally heavy tasks is beneficial: it reduces CPU usage, heat, and power consumption, which extends the device's battery life.

4.2 Nvidia Jetson Nano

Figure 5 illustrates the measurements of CPU/GPU

utilization, temperature, network utilization, and

power usage for the Nvidia Jetson Nano device, using onboard and offloading periods of 150 seconds.

Figure 3: Measurement results on a Raspberry Pi. Panels: (a) CPU usage; (b) CPU temperature; (e) network data receiving.

The CPU utilization of the Jetson Nano is illustrated in Figure 5[a]. During local execution of the object detection module, CPU utilization stays below 30%, which is low compared to the Raspberry Pi. This can be attributed to the greater utilization of the integrated GPU, as evident from Figure 5[b]. A proportion of the CPU utilization can be attributed to data transmission from CPU memory to GPU memory. The same spikes due to messaging, setup, and module transfer as in the case of the Raspberry Pi can be observed.

CLOSER 2024 - 14th International Conference on Cloud Computing and Services Science


Figure 4: Object detection response time for Raspberry Pi.

Figure 5[c] shows CPU/GPU temperature during

the experiments. The Jetson Nano utilizes a heat sink,

which effectively dissipates heat and prevents signif-

icant temperature increases. As a result, the temper-

atures remain relatively stable throughout the experi-

ments compared to the Raspberry Pi.

Figure 5[d] illustrates the power utilization of the Jetson Nano device. Onboard and offloaded power consumption follow the same pattern as on the Raspberry Pi despite the increased local performance, in this case approximately 6 and 2 Watts respectively. Here, too, the results show a significant reduction in device power consumption as a benefit of offloading. As expected, Figures 5[e] and 5[f] confirm the significant increase in network traffic caused by offloading.

Figure 6 illustrates the results of the object de-

tection response times for local and remote execu-

tion, using the same setup as in the earlier measure-

ments. The average local execution time for object

detection on the Jetson Nano device is measured at

60.01 milliseconds (ms), which is comparable to the

average execution times for remote execution over 5G

and LAN, which are 84.40 ms and 63.01 ms, respectively. This is due to the more capable hardware on the Nvidia Jetson, including a GPU. In addition, the extra delay of the experimental 5G network compared to the direct LAN connectivity is more noticeable on these timescales.

We note that energy efficiency from a UE perspective is still significantly improved when the object detection task is offloaded, see Table 3. These results indicate that computational offloading of compute-intensive tasks may not result in meaningful performance improvements when the UE has compute capabilities similar to those of the remote site. However, the reduced power drain and heat dissipation on the UE may prove to be a decisive factor when choosing whether to offload or not.

Table 3: Power vs Performance for Jetson Nano.

Execution mode            Energy (in mWh per frame)
Local Execution           0.101
Remote Execution - 5G     0.058
Remote Execution - LAN    0.050
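To make the contrast between the two devices concrete, the following short sketch derives response-time and energy ratios from the figures reported above. The energy values are taken from Tables 2 and 3. The Jetson Nano response times are the reported averages, while the Raspberry Pi response times are the approximate values quoted in the text (almost 1300 ms, about 90 ms, about 65 ms), so those ratios are indicative only. The dictionary layout and function name are illustrative choices.

```python
# Reported values: response time in ms, energy in mWh per frame.
# Raspberry Pi response times are the rounded figures quoted in the text.
RESULTS = {
    "Raspberry Pi": {
        "local": {"ms": 1300.0, "mwh": 2.02},
        "5G": {"ms": 90.0, "mwh": 0.09},
        "LAN": {"ms": 65.0, "mwh": 0.06},
    },
    "Jetson Nano": {
        "local": {"ms": 60.01, "mwh": 0.101},
        "5G": {"ms": 84.40, "mwh": 0.058},
        "LAN": {"ms": 63.01, "mwh": 0.050},
    },
}


def offload_ratios(device: str) -> dict:
    """Return (speedup, energy_reduction) per link, both relative to local
    execution. A value above 1 means offloading is better on that axis."""
    local = RESULTS[device]["local"]
    ratios = {}
    for link in ("5G", "LAN"):
        remote = RESULTS[device][link]
        ratios[link] = (local["ms"] / remote["ms"], local["mwh"] / remote["mwh"])
    return ratios


for device in RESULTS:
    for link, (speedup, energy_gain) in offload_ratios(device).items():
        print(f"{device:12s} {link:3s} speedup x{speedup:.2f}  energy x{energy_gain:.2f}")
```

On these numbers, the Raspberry Pi gains more than an order of magnitude on both axes, whereas the Jetson Nano has a response-time ratio below 1 (offloading is slightly slower) while its energy per frame still drops by a factor of roughly 1.7 to 2.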

5 RELATED WORK

Concepts related to computational offloading have been a significant subject of research in the last decade. An early survey on mobile edge computing (MEC, later re-labelled multi-access edge computing) (Mach and Becvar, 2017) analyzed more than 100 papers, focusing on the decision mechanisms for computational offloading as well as the allocation of computing resources. The virtualization technology and granularity assumed in these papers is VMs, which do not meet our requirements of lightweightness and portability. However, a few general lessons learned from this survey are also relevant for our proposed solution, e.g., that VM migration is impractical if a sufficiently large amount of data needs to be transmitted, and that partial offloading can save significantly more energy at the UE compared to full offloading.

The survey by Lin et al. (Lin et al., 2019) reviews research on computation offloading and identifies three main groups of challenges: application partitioning; task allocation and resource management; and distributed task execution. In terms of application partitioning, challenges related to lightweight programming models as well as the partitioning granularity are pointed out. In our solution, we address these challenges with a novel programming model (Section 2.3) and the flexibility to offload code at arbitrary granularity, including both whole tasks and components down to the level of methods/functions. Task allocation and resource management with an optimal strategy are not part of the present paper, but we are working on a related publication. With respect to distributed task execution, our solution deviates from the VM- and container-based solutions discussed in this survey by building on portable, lightweight, and secure Wasm runtimes.

Another survey focused on service migration strategies in the context of MEC (Wang et al., 2018). As part of this survey, execution environments to host MEC applications are compared. The conclusions are that VMs have good isolation properties but are large in terms of footprint and slow to boot and run. Containers, on the other hand, have a smaller footprint and faster startup times, but perform sub-optimally across operating systems and hardware platforms. Finally, agent-based systems have a very small footprint and rapid boot and runtime performance, with the promise of convenient administration. However, these systems are still experimental in nature, and no stable contemporary frameworks exist. Together with our review of several concepts related to VM or container migration (Junior et al., 2020; Machen et al., 2018; Benjaponpitak et al., 2020), we conclude that the literature confirms our choice of Wasm as a promising execution environment.

Figure 5: Measurement results on a Nvidia Jetson Nano. Panels: (a) CPU usage; (b) GPU usage; (e) network data transmission; (f) network data receiving.

While the above surveys focus on mobile edge computing based on classical cloud technologies such as VMs and containers, there is more recent work applying lightweight, portable runtimes such as Wasm for computational offloading. The authors of (Hoque and Harras, 2022) argue along these lines, whereas (Ménétrey et al., 2022) point out benefits of Wasm for faster migration time due to its smaller memory footprint, and (Long et al., 2021) show better performance of Wasm runtimes than OS containers.

Figure 6: Object detection response time for Jetson Nano.

The authors of (Tachibana et al., 2022) suggested application-driven dynamic task offloading by task classification on a real-time basis. Application-driven offloading allows application developers to decide when and where to offload, but this might not always be optimal. The network or computation provider has to be able to offer only edge hosts or nodes that are capable of executing the offloaded task, and the developers and/or users should be able to choose the node. The ability to choose arbitrary nodes would be especially needed in a multi-provider environment where providers might compete for users.

Host-to-host migration of Wasm-based modules has been proposed in (Nieke et al., 2021), where the authors migrate mobile agents, i.e., application server instances. Host-to-host migration is very important for load balancing or for compliance with the QoS of the application-module communication during user mobility, but the task has to be already represented as a Wasm module in order to be migrated, which is not optimal for UE-to-host migration. In this case, it is a natural choice for the task to be executed on the original device in native code. In Section 2.3 we proposed a UE-to-host migration method addressing this issue.

As mentioned, Wasm modules have benefits for compute migration, but are not fully protected from threats from other malicious tenants or the service provider. Security measures such as Trusted Execution Environments and Wasm enclaves are addressed in (Ménétrey et al., 2021; Pop et al., 2022).

6 CONCLUSION AND NEXT STEPS

We presented a solution that enables an application

running on a mobile device to dynamically ofﬂoad

critical parts of its functionality, as identiﬁed by the

developer, to a remote site, seamlessly during application runtime. We identified a number of requirements, and designed, built, and verified a corresponding solution. Our solution consists of a novel programming

model and associated toolchain, with particular focus

on developer ease of use, as well as a distributed ex-

ecution framework based on portable and lightweight

standalone Wasm runtimes.

We demonstrated and evaluated the viability of the presented solution, providing a compelling case for performance improvements through ad-hoc computational offloading in resource-constrained devices and reduced power consumption in general, despite the increased network usage. We realized and demonstrated the presented solution using two examples of UE devices with varying computational capacities (Raspberry Pi and Nvidia Jetson Nano). We also applied two different categories of connection between the device and the edge servers: wired LAN as the baseline reference, and standalone (SA) 5G as a realistic, future-proof wide-area wireless connectivity example. For both devices, module offloading reduces device power usage. The Raspberry Pi extends its compute capability for computation-heavy tasks, whereas the Jetson Nano does not see this benefit due to its onboard GPU. However, in both cases, device energy efficiency is improved.

We assert that this type of use-case agnostic so-

lution - offering sandboxed remote computational re-

sources to any application on a mobile device basi-

cally anytime and anywhere - has the potential to tar-

get even the long tail of regional enterprises and de-

velopers that usually would not engage in the heavy

burdens of deployment, management, and contract

handling related to edge solutions. We are aware that

there are still numerous open issues to reach this vi-

sion, and we conclude this article by pointing out a

few open research questions we plan to address as

next steps. For instance, an important building block

to make this solution viable is a comprehensive cost

function service which can intelligently decide when

and where to ofﬂoad in order to provide the best over-

all performance for the application, device and user.

As security and isolation are of utmost importance in remote computation scenarios, we are also working on extending our solution with a more comprehensive security architecture based on a detailed analysis of security and privacy threats. Moreover, we plan to further explore specific features in the areas described in this work, such as granular isolation of Wasm modules into TEEs similar to (Nieke et al., 2021). While we had very positive experiences using Wasm runtimes when building our prototype, there are still issues with proprietary system interfaces to host functions related to efficient realization of the execution handler. We thus encourage the networking community to engage in consortia like the W3C WASI working group to make sure that future Wasm-related standards meet the requirements of advanced use cases like the one proposed in this paper. Finally, we are exploring alternatives for offering computational offloading as a service via cellular mobile broadband networks, e.g., 5G Advanced or 6G (Wikstrom et al., 2022). This includes integration points like service registration and discovery, management of the offloading service user plans and associated mobility support, control plane handling of offload triggers, as well as crucial authentication and billing mechanisms.
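As a purely illustrative sketch of the cost function idea mentioned above, the snippet below scores each execution option by a weighted sum of normalized response time and device energy and picks the cheapest one. This is not the cost function service we plan to build; the class and function names, the weights, and the normalization references are all hypothetical, and a real service would need to account for further factors such as network conditions, remote load, and user preferences. The example inputs reuse the Jetson Nano figures from Section 4.2.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Option:
    name: str          # e.g. "local" or an identifier of a remote site
    latency_ms: float  # expected response time per task
    energy_mwh: float  # expected device energy per task


def cost(opt: Option, w_latency: float = 1.0, w_energy: float = 1.0,
         ref_latency_ms: float = 100.0, ref_energy_mwh: float = 0.1) -> float:
    """Weighted sum of latency and energy, each divided by a reference value
    so that the two terms are dimensionless and comparable (assumed scales)."""
    return (w_latency * opt.latency_ms / ref_latency_ms
            + w_energy * opt.energy_mwh / ref_energy_mwh)


def choose(options: Iterable[Option], w_latency: float = 1.0,
           w_energy: float = 1.0) -> Option:
    """Return the option with the lowest cost for the given weights."""
    return min(options, key=lambda o: cost(o, w_latency, w_energy))


# Jetson Nano figures reported in Section 4.2 (local vs. remote over 5G).
candidates = [
    Option("local", 60.01, 0.101),
    Option("remote-5G", 84.40, 0.058),
]
print(choose(candidates).name)                # equal weights
print(choose(candidates, w_energy=0.0).name)  # latency only
```

With equal weights, the lower energy of the remote option outweighs its higher latency; with the energy weight set to zero, local execution is preferred. This mirrors the observation that energy savings may be the deciding factor when the UE and the remote site have similar compute capabilities.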

REFERENCES

Benjaponpitak, T., Karakate, M., and Sripanidkulchai, K.

(2020). Enabling live migration of containerized ap-

plications across clouds. In IEEE INFOCOM 2020 -

IEEE Conference on Computer Communications.

Bochkovskiy, A., Wang, C.-Y., and Liao, H.-Y. M. (2020).

Yolov4: Optimal speed and accuracy of object detec-

tion. arXiv preprint arXiv:2004.10934.

ETSI (2023). ETSI TS 123 558 V17.7.0 5G; Architecture

for enabling Edge Applications (3GPP TS 23.558 ver-

sion 17.7.0 Release 17). Technical Report TS 123 558,

European Telecommunications Standards Institute.

Guanciale, R., Paladi, N., and Vahidi, A. (2022). Sok:

Conﬁdential quartet - comparison of platforms for

virtualization-based conﬁdential computing. In IEEE

Int. Sym. on Secure and Private Execution Environ-

ment Design (SEED).

Haas, A., Rossberg, A., Schuff, D. L., Titzer, B. L., Hol-

man, M., Gohman, D., Wagner, L., Zakai, A., and

Bastien, J. (2017). Bringing the web up to speed with

webassembly. SIGPLAN Not., 52(6):185–200.

Hoque, M. N. and Harras, K. A. (2022). Webassembly

for edge computing: Potential and challenges. IEEE

Communications Standards Magazine, 6(4):68–73.

Huang, X., Tsiatsis, V., Palanigounder, A., Su, L., and

Yang, B. (2021). 5g authentication and key manage-

ment for applications. IEEE Communications Stan-

dards Magazine, 5(2):142–148.

Junior, P. S., Miorandi, D., and Pierre, G. (2020). Stateful

container migration in geo-distributed environments.

In IEEE Int. Conf. on Cloud Computing Technology

and Science (CloudCom).

Lin, L., Liao, X., Jin, H., and Li, P. (2019). Computation

ofﬂoading toward edge computing. Proceedings of the

IEEE, 107(8):1584–1607.

Long, J., Tai, H.-Y., Hsieh, S.-T., and Yuan, M. J. (2021). A

lightweight design for serverless function as a service.

IEEE Software, 38(1):75–80.

Mach, P. and Becvar, Z. (2017). Mobile edge comput-

ing: A survey on architecture and computation of-

ﬂoading. IEEE Communications Surveys & Tutorials,

19(3):1628–1656.

Machen, A., Wang, S., Leung, K. K., Ko, B. J., and Saloni-

dis, T. (2018). Live service migration in mobile edge

clouds. IEEE Wireless Communications, 25(1).

Ménétrey, J., Pasin, M., Felber, P., and Schiavoni, V. (2022). Webassembly as a common layer for the cloud-edge continuum. In Workshop on Flexible Resource and Application Management on the Edge: FRame.

Ménétrey, J., Göttel, C., Khurshid, A., Pasin, M., Felber, P., Schiavoni, V., and Raza, S. (2022). Attestation mechanisms for trusted execution environments demystified. In IFIP Int. Conf. on Distributed Applications and Interoperable Systems: DAIS.

Ménétrey, J., Pasin, M., Felber, P., and Schiavoni, V. (2021). Twine: An embedded trusted runtime for webassembly. In 2021 IEEE 37th International Conference on Data Engineering (ICDE), pages 205–216.

Nieke, M., Almstedt, L., and Kapitza, R. (2021).

Edgedancer: Secure mobile webassembly services on

the edge. In Int. Workshop on Edge Systems, Analytics

and Networking, EdgeSys ’21.

Pop, V. A. B., Niemi, A., Manea, V., Rusanen, A., and Ek-

berg, J.-E. (2022). Towards securely migrating we-

bassembly enclaves. Proceedings of the 15th Euro-

pean Workshop on Systems Security.

Tachibana, T., Sawada, K., Fujii, H., Maruyama, R., Ya-

mada, T., Fujii, M., and Fukuda, T. (2022). Open

multi-access network platform with dynamic task of-

ﬂoading and intelligent resource monitoring. IEEE

Communications Magazine, 60(8):52–58.

Wang, S., Xu, J., Zhang, N., and Liu, Y. (2018). A survey

on service migration in mobile edge computing. IEEE

Access, 6:23511–23528.

Wikstrom, G., Persson, P., Parkvall, S., Mildh, G.,

Dahlman, E., Rune, G., Arkko, J., John, W., et al.

(2022). 6g—connecting a cyber-physical world. Er-

icsson White Paper, 28.
