Towards Optimizing the Edge-to-Cloud Continuum Resource Allocation
Igor Ferrazza Capeletti³, Ariel Goes de Castro³, Daniel Chaves Temp¹,³, Paulo Silas Severo de Souza², Arthur Francisco Lorenzon⁴, Fábio Diniz Rossi¹,³ and Marcelo Caggiani Luizelli³

¹Federal Institute Farroupilha, Alegrete, Brazil
²Pontifical Catholic University of Rio Grande do Sul, Porto Alegre, Brazil
³Federal University of Pampa, Alegrete, Brazil
⁴Federal University of Rio Grande do Sul, Porto Alegre, Brazil
Keywords: Cloud Continuum, Resource Allocation, Heuristics, Simulation.
Abstract: The IT community has witnessed a transition towards the cooperation of two major paradigms, Cloud Computing and Edge Computing, paving the way to a Cloud Continuum, where computation can be performed at the various network levels. While this model widens the provisioning possibilities, choosing the most cost-efficient processing location is not trivial. In addition, network bottlenecks between end users and the computing facilities assigned to carry out processing can undermine application performance. To overcome this challenge, this paper presents a novel algorithm that leverages a path-aware heuristic approach to opportunistically process application requests on compute devices along the network path. Once intermediate hosts process the information, requests are sent back to users, alleviating the demand on the network core and minimizing end-to-end application latency. Simulated experiments demonstrate that our approach outperforms baseline routing strategies by a factor of 24x in terms of network saturation reduction without sacrificing application latency.
1 INTRODUCTION
Cloud Continuum is an emerging paradigm that extends the traditional Cloud to the Edge, Fog, and in-between. In other words, it is an aggregation of heterogeneous resources of other computing facilities, such as micro-data centers and intermediate computing nodes along the path between the user's requests and larger computing premises (Moreschini et al., 2022b; Liao et al., 2022).
By extending the computing capabilities, the Cloud Continuum allows the in-transit processing of application requests and expands the computing granularity of the infrastructure (Baresi et al., 2019). There are many advantages to this computing paradigm. For instance, application requests can be processed while
being routed to the following processing hop (e.g., an Edge node). By doing that, the Cloud Continuum can reduce communication latency and improve the resource utilization of Cloud and Edge nodes by freeing up their resources. This is imperative for emerging applications with stringent performance requirements in a highly connected and dynamic environment. Examples of such applications include multi-sensory extended reality, brain-computer interfaces, haptic interaction, metaverses, and flying vehicles (Yeh et al., 2022).
Figure 1 illustrates a Cloud Continuum example. In the traditional Cloud Computing approach, application requests (e.g., R3) are processed by a centralized Cloud server. In the Cloud Continuum, however, computing resources are spread over the network infrastructure. That includes, for instance, Edge Cloud nodes and micro-servers placed closer to the access network (e.g., base stations). However, the resources available at these computing premises are usually quite limited. In the example, request R2 could be processed entirely by an Edge Cloud server (in case it has enough resources) or, depending on the request's requirements, be processed partially by different Edge Cloud servers. In turn, request R1 is totally processed by a micro-server located at the base station, one hop from the users.
Figure 1: Overview of the Cloud-Continuum Computing (requests R1-R3 served across the Access, Edge, and Core layers by micro servers, Edge Cloud nodes, and Cloud servers).
Despite the extra layer of available resources provided by the aggregation of heterogeneous resources in the infrastructure, it is challenging to use distributed computing resources efficiently without proper orchestration. For instance, a Cloud platform can route requests between Edge and Cloud nodes by using consolidated routing protocols (e.g., OSPF (Fortz and Thorup, 2000), based on the shortest path). Even applying customized routing schemes to improve other network metrics (e.g., to balance network link load) might under-use the available computing resources in the network. An efficient solution would be to route application requests using a path with higher access to computation power. In this case, a request could be processed opportunistically by a computing premise along the path. In the best case, the request is processed at the first hop. Otherwise, it is routed to its final destination (e.g., the Cloud), passing through nodes that have direct access to more powerful computing premises (e.g., Edge Cloud servers).
To fill this gap, this paper proposes a path-aware heuristic approach to efficiently route application requests to computing nodes. The main advantage of our heuristic approach is that it considers the available resources along the network path in order to process the requests as early as possible. If a computing premise along the path processes a request, the request is routed back to the source device. Our heuristic approach is based on the k-shortest path. However, the alternative paths obtained by our search algorithm try not to detour much from the original shortest path, which is achieved by incrementally expanding the neighborhood of nodes in the original path. Results show that our solution outperforms shortest-path-based solutions by a factor of 24x in terms of the number of processed applications while not imposing significant delay. Our contributions can be summarized as follows:
• We formalize an orchestration model for in-transit processing in Cloud Continuum environments.
• We propose a novel algorithm that opportunistically processes application requests on compute devices along the Cloud Continuum.
• We present an evaluation showing that our algorithm reduces network saturation by 24x compared to traditional routing strategies.
• We disclose the dataset and source code of our approach to foster reproducibility.
The remainder of this paper is organized as follows. In Section 2, we provide a brief background on Cloud Continuum aspects. Then, we discuss the related literature in the area. In Sections 3 and 4, we introduce our model and present the proposed heuristic. In Section 5, we discuss the obtained results. Last, in Section 6, we conclude the paper with final remarks and perspectives for future work.
2 BACKGROUND AND RELATED WORK
In this section, we start by discussing cloud computing concepts. Next, we overview the most prominent work related to Cloud Continuum computing.
2.1 Cloud Computing and Beyond
Cloud computing is a paradigm that allows users to move their data and applications from local computing to a remote "cloud" (Wang et al., 2010). There are many benefits to this approach. First, estimating the cost of acquiring new equipment (e.g., switches and servers) is not trivial. Certain services may have different demands throughout the day (i.e., peak hours). In this case, a budget can end up being (i) overestimated, wasting resources, or (ii) underestimated, lacking the resources (such as enough bandwidth or processing capacity) to deal with an excessive amount of requests. Second, the equipment takes up space and is costly: considerable time must be dedicated to guaranteeing the correct functioning of the equipment and its running services, and qualified staff must be kept on the payroll for unforeseen problems. Finally, planning for the above goals is time-consuming, and managers could use this time for other company tasks.
Figure 2: Illustration of the behavior of our heuristic approach when finding near-optimal solutions (panels (a)-(f)).
Despite that, the cloud computing model struggles to keep up with the constraints of current services, such as ultra-low latency services in 5G cellular networks (Chiu et al., 2016). To tackle this problem, new computation paradigms such as edge computing (Satyanarayanan, 2019) have arisen to satisfy these demands. More specifically, edge computing allows performing computation and storage closer to the end user with minimal to no intervention from cloud nodes, incurring less latency. Despite that, edge computing still has limited processing power and may exhibit high response times, and it may require cooperation from remote resources, such as those in cloud provider infrastructures, which do not cope well with these requirements.
More recently, the cloud continuum has drawn attention from academia and industry as a candidate to overcome these limitations, and it has several definitions (Moreschini et al., 2022a). In this paper, we consider an early definition by Gupta et al. (Gupta et al., 2016), which defines the cloud continuum as "a continuum of resources available from the network edge to the cloud/datacenter". That said, with the advent of programmable network devices (e.g., SmartNICs and programmable switches), offloading applications to the data plane delivers superior performance in terms of high-throughput, low-latency computing. This capability may be combined with the existing edge infrastructure to avoid accessing high-latency computation resources located in centralized clouds and to guarantee that SLAs are met.
2.2 Related Work
This section discusses the most prominent studies related to in-transit or in-path computing strategies.
NetCache (Jin et al., 2017) leverages switch ASICs to perform on-path network caching to store key-value data. Similarly, Wang et al. (Wang et al., 2019) are the first to design and implement packet aggregation/disaggregation entirely in switching ASICs, while PMNet (Seemakhupt et al., 2021) persistently stores and updates data in network devices with sub-RTT latency. SwitchTree (Lee and Singh, 2020) estimates flow-level stateful features, such as RTT and per-flow bitrate. FlexMon (Wang et al., 2022) presents a network-wide traffic measurement scheme that optimally deploys measurement nodes and uses these nodes to measure the flows collaboratively.
Tokusashi et al. (Tokusashi et al., 2019) selectively offload services to the data plane according to changes in workload. Similarly, Mai et al. (Mai et al., 2020) partially offload the lightweight critical tasks to the data plane devices and leave the rest to the mobile edge computing (MEC) nodes. In contrast, Saquetti et al. (Saquetti et al., 2021) distribute the neuron computation of an Artificial Neural Network to multiple switches. Friday et al. (Friday et al., 2020) introduce an engine to detect attacks in real time by analyzing one-way ingress traffic on the switch. Similarly, INDDoS (Ding et al., 2021) can reduce DDoS detection time: it identifies as attack targets the destination IPs contacted by a number of source IPs greater than a threshold within a given time interval, entirely in the data plane. SmartWatch (Panda et al., 2021) leverages advances in switch-based network telemetry platforms to process the bulk of the traffic and only forwards suspicious traffic subsets to the SmartNIC, which has more processing power to provide finer-grained analysis.
Kannan et al. (Kannan et al., 2019) are the first to perform time synchronization in the data plane, enabling the addition of high-resolution timing information to packets at line rate. Tang et al. (Tang et al., 2020) and pHeavy (Zhang et al., 2021) make efforts to reduce the time to detect heavy hitters in the data plane. (Tang et al., 2020) propose a compact sketch statically allocated in switch memory, while (Zhang et al., 2021) introduces a machine learning-based scheme to mitigate latency overhead on SDN controllers. (Sankaran et al., 2021) increases data plane security by restricting modifications to a persistent network switch state, enabling ML decision-making computation to be offloaded to industrial network elements. Similarly, Kottur et al. (Kottur et al., 2022) propose crypto externs for Netronome Agilio SmartNICs for authentication and confidentiality directly in the data plane.
Most recent initiatives have focused on offloading computing mechanisms to specific computing premises, such as programmable network devices. In this work, we take a further step toward efficiently using distributed resources in the Edge-to-Cloud Continuum to increase the network capability regarding the number of processed application requests.
3 SYSTEM MODEL
This section describes the resource allocation model proposed in this work for the Edge-to-Cloud Continuum approach. Next, we describe the inputs and outputs of our model, as well as the constraints and objective function. Table 1 summarizes the notation used hereafter.
3.1 Model Description and Notation
Input. The optimization model considers as input a physical network infrastructure G = (D, L) and a set of application requests A. Set D in network G represents routing/forwarding devices D = {1, ..., |D|}, while set L consists of unidirectional links interconnecting pairs of devices (i, j) ∈ (D × D). We assume that at most one computing premise (e.g., an Edge node) is connected to each forwarding device in D. The subset D_c ⊆ D represents the devices with attached computing premises. Each computing premise d ∈ D_c has a computational capacity defined by C : D_c → N⁺. Conversely, each application i ∈ A has a computing requirement defined by R : A → N⁺. We denote the routing taken by application i ∈ A as function P : A → {D_1 × ... × D_{|D|-1}}. We assume the path given by function P is simple.
For simplicity, we assume that distributed computing platforms can partially provide the computing power required by an application i ∈ A. As a simplification, we assume that partially computed values are embedded into the packet that transports the request. Examples of similar strategies that utilize packet encapsulation to carry information include In-Band Network Telemetry (INT) (Marques et al., 2019; Hohemberger et al., 2019).
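For concreteness, the snippet below sketches one possible in-memory representation of these inputs in Python using networkx; the attribute and variable names (latency_ms, capacity, requirement, endpoints) are illustrative assumptions rather than part of the model or the released dataset.

import networkx as nx

# Physical infrastructure G = (D, L): forwarding devices and unidirectional links.
G = nx.DiGraph()
G.add_nodes_from(range(6))  # D = {0, ..., 5}
G.add_edges_from([(0, 1), (1, 2), (2, 3), (1, 4), (4, 3), (3, 5)], latency_ms=10)

# D_c ⊆ D: devices with an attached computing premise and its capacity C : D_c → N⁺.
capacity = {1: 300, 2: 250, 4: 400}

# Each application i ∈ A has a requirement R(i) and a (source, destination) pair.
requirement = {"app0": 150, "app1": 120}
endpoints = {"app0": (0, 5), "app1": (0, 3)}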
Variables. Our model considers a variable set X = {x_{i,j} : i ∈ A, j ∈ D_c}, which indicates the amount of resources used from computing premise j ∈ D_c to process application i ∈ A:

$x_{i,j} = \begin{cases} n \in \mathbb{N}^{+} & \text{if application } i \in A \text{ is processed by } j \in D_c,\\ 0 & \text{otherwise.}\end{cases}$  (1)
Constraints. Next, we describe the main feasibility constraints related to the optimization problem. The problem is subject to two main constraints: (i) path computing capacity and (ii) route connectivity.
(i) Path Computing Capacity: Application i ∈ A has a computing requirement that is attended along the path taken by the application. Therefore, the routing path (or computing path) establishes an upper bound on the computing power. In other words, the amount of computing power in the path cannot be lower than the application requirement. Thus, in Equation set (2), we sum the available capacity along the route taken by i and ensure it is equal to the application's requirement. Similarly, Equation set (3) ensures that the computing power of each computing premise j ∈ D_c is not violated.

$\sum_{j \in P(i) :\, j \in D_c} C(j) \cdot x_{i,j} = R(i) \quad \forall i \in A$  (2)

$\sum_{i \in A :\, j \in P(i)} R(i) \cdot x_{i,j} \leq C(j) \quad \forall j \in D_c$  (3)

(ii) Route Connectivity: we assume all devices in P(i) are pairwise strongly connected, i.e., any pair of consecutive devices in P(i) is connected by a link (i, j) ∈ L. To describe this property, we recall an auxiliary function δ : (P × D × D) → {true, false} that returns true in case there exists a path between nodes k and l in path P(i), i.e., $P(i)_{[k]} \rightarrow \dots \rightarrow P(i)_{[|P(i)|]}$, where $(P(i)_{[k]}, P(i)_{[k+1]}) \in L$; otherwise, function δ returns false. Note that one could use other constraints to describe route connectivity, such as flow conservation constraints. Equation set (4) ensures all applications i ∈ A have a computing path, while Equation set (5) ensures the paths are valid (or connected).

$|P(i)| \neq 0 \quad \forall i \in A$  (4)

$\delta(P(i), k, l) = \text{true} \quad \forall i \in A,\ \forall (k, l) \in P(i)$  (5)
Table 1: Summary of symbols.

Symbol           Definition
G = (D, L)       Physical infrastructure G.
D                Set of forwarding devices.
D_c              Set of computing premises.
L                Set of physical links.
A                Set of applications.
P(i)             Computing path given to application i ∈ A.
C : D_c → N⁺     Computing capacity of premise j ∈ D_c.
R : A → N⁺       Computing requirement of application i ∈ A.
Given the feasibility constraints defined above, we assume there exists an assignment function 𝒜 : (G, i) → P(i), i ∈ A, that, given a network infrastructure G and a set of applications A, returns a feasible computing path P(i) with respect to constraint sets (i) and (ii).
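As a sanity check of the model, the sketch below shows how constraints (i) and (ii) could be verified for a single candidate path in Python; the function and argument names are illustrative assumptions, not part of the authors' formulation.

def path_is_feasible(G, path, capacity, used, req):
    """Check route connectivity and path computing capacity for one application.

    path: ordered list of devices P(i); capacity: C(j) per premise;
    used: capacity already consumed per premise; req: requirement R(i).
    """
    # (ii) Route connectivity: consecutive devices in P(i) must share a link in L.
    for u, v in zip(path, path[1:]):
        if not G.has_edge(u, v):
            return False
    # (i) Path computing capacity: free capacity along the path must cover R(i).
    free = sum(max(capacity.get(d, 0) - used.get(d, 0), 0) for d in path)
    return free >= req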
4 PROPOSED HEURISTIC
We propose a heuristic procedure that builds computation-aware paths to tackle the above problem efficiently and provide a quality-wise solution. Next, we overview the ideas behind our proposed heuristic and discuss its pseudo-code.
Consider Figure 2a, which illustrates a network topology G(D, L), where D is a set of devices and L is the set of network links. Every time an end user submits an application request, it must be routed somehow to reach its resources (e.g., the Cloud). If we wanted to orchestrate how to route an application from a host connected at the edge (e.g., S8) to a remote application (e.g., S15), there would be several strategies to reach this goal. Figures 2b to 2f summarize our proposal to solve this problem. First, we find the shortest path from the origin of the request (S8) to the application server (S15) – which is the reference path – leveraging existing programmable switches to reduce server computation overhead. In a binary manner, we check whether the devices along the shortest path together have enough resources to process the request entirely in the data plane (Figure 2b). Otherwise, the reference path (S8, S7, S11, S15) is iteratively modified according to the neighborhood size – controlled by the amp variable – as seen in Figure 2c. More specifically, we fix a node in the reference path and explore its adjacent neighbors. Then, detours are performed by applying the shortest path on sub-graphs that do not contain previously explored nodes (Figures 2c and 2e). Finally, if none of the generated paths can fully offload the application request, we start the procedure all over again by fixing another node in the reference path, i.e., S7 as in Figure 2f.
Algorithm 1 summarizes the main procedure. For each application (app), we store the shortest path (line 3) from a host connected to the source switch (s) to the destination server (d). If the switches in the path between s and d are unable to completely satisfy the application's computing request (req), we set the weights according to the hop distance between the shortest path and the remaining graph (line 6) and narrow the search field for alternative switches based on the amp value (line 8) – the higher the value, the broader the search. Then, an optimization procedure (line 10) is invoked to try to find a path that satisfies the computation request (see Algorithm 2 for details). Finally, if the optimization procedure succeeds, we store the new path; otherwise, we keep the default path.
The core of our proposal is presented in Algorithm 2. It iteratively tries to completely offload the server computation to the data plane by modifying the original path. For a given application app, we need its previous computation path cam_old_app.
Algorithm 1: Overview of the procedure.
Input: G(D, L): topology graph, A: set of applications, amp: length of the subgraph.
1:  graph_old ← graph
2:  for app in A do
3:      cam_old_app ← dijkstra_mod(s, d, amp, graph)
4:      if sum_cap(cam_old_app) < req then
5:          for i ∈ cam_old_app do
6:              set_weights(amp, graph)
7:          end for
8:          graph ← gen_subgraph(graph, amp)
9:      end if
10:     cam_app ← opt(app, graph, cam_old_app)
11:     if cam_app ≠ NULL then
12:         res_add(cam_app)
13:     else
14:         res_add(cam_old_app)
15:     end if
16:     graph ← graph_old
17: end for
The procedure works as follows: first, we (i) mark each node of the reference path (line 1) as the reference ref, one at a time; then (ii) we perform detours by deleting the edge (line 2) from the current reference node to its next neighbor in the original path. Then, we repeat the process (lines 5-15) for the remaining adjacent nodes (line 4). Finally, if the detours taken accumulate enough capacity for the application request, the modified path is returned (line 6).
Algorithm 2: Overview of the optimization procedure.
Input: G'(D, L): topology subgraph, app: application, amp: length of the subgraph, cam_old_app: reference application computation path.
1:  for ref ∈ cam_old_app do
2:      del(link(ref, ref_next))
3:      adj_list_ref ← get_adj_nodes(ref)
4:      cam_alt ← dijkstra_mod(s, d, amp)
5:      while adj_list_ref ≠ NULL and cam_alt ≠ NULL do
6:          if sum_cap(cam_alt) ≥ req then return cam_alt
7:          else
8:              for j ∈ cam_alt do
9:                  if j ∈ adj_list_ref then
10:                     del(link(ref, j))
11:                     cam_alt ← dijkstra_mod(s, d, amp)
12:                 end if
13:             end for
14:         end if
15:     end while
16: end for
return NULL
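To make the procedure concrete, the following Python sketch approximates the combined behavior of Algorithms 1 and 2 on top of networkx shortest paths. It is a simplified illustration under stated assumptions: the amp-bounded subgraph step is omitted, and the helper names (path_capacity, detour_search) are ours, not those of the released implementation.

import networkx as nx

def path_capacity(path, capacity):
    # Aggregate computing capacity available along a candidate path.
    return sum(capacity.get(d, 0) for d in path)

def detour_search(G, s, d, req, capacity):
    # Reference (anchor) path: plain shortest path from source to destination.
    ref_path = nx.shortest_path(G, s, d, weight="latency_ms")
    if path_capacity(ref_path, capacity) >= req:
        return ref_path                          # anchor path already suffices
    for idx, ref in enumerate(ref_path[:-1]):    # fix one reference node at a time
        H = G.copy()
        H.remove_edge(ref, ref_path[idx + 1])    # force a detour after `ref`
        adj = set(H.neighbors(ref))
        while adj:
            try:
                alt = nx.shortest_path(H, s, d, weight="latency_ms")
            except nx.NetworkXNoPath:
                break
            if path_capacity(alt, capacity) >= req:
                return alt                       # a detour can fully serve the request
            # Remove links toward explored adjacent nodes and recompute
            # (mirrors Algorithm 2, lines 8-13).
            progressed = False
            for j in list(alt):
                if j in adj:
                    adj.discard(j)
                    if H.has_edge(ref, j):
                        H.remove_edge(ref, j)
                    progressed = True
            if not progressed:
                break
    return None                                  # caller keeps the reference path

A full implementation would additionally restrict each shortest-path recomputation to the amp-hop neighborhood of the anchor path, as Algorithm 1 does through gen_subgraph.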
5 EVALUATION AND DISCUSSION
This section describes the experiments carried out to
evaluate the proposed algorithm. First, we detail our
setup and methodology (§5.1). Then, we discuss the
achieved results (§5.2).
5.1 Setup
To evaluate and assess the performance metrics of our proposed heuristic algorithm, we implemented it in Python. All experiments were conducted on an AMD Threadripper 3990X with 64 physical cores and 32 GB of RAM, running the Ubuntu GNU/Linux Server 22.04 x86-64 operating system. For our experiments, we generated physical network infrastructures with 100 routers. We consider that each routing device in our network has a computing premise attached to it. Physical networks are generated randomly with link connectivity probability ranging from 10% to 50%. Each network link has a latency value that varies between 10 ms and 100 ms. All computing premises have a processing capacity between 200 and 500, while 100 applications are requested, demanding processing power ranging from 100 to 200. The source and destination nodes of each request are generated randomly. In our approach, we varied the parameter amp between 1 and 2; this parameter controls the search depth, as already discussed.
We repeated each experiment 30 times to obtain averages and ensure a confidence level of at least 95%.
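The sketch below illustrates how one such random instance could be generated in Python; using gnp_random_graph and uniform draws is our assumption about the generation process, not necessarily the authors' exact methodology.

import random
import networkx as nx

def generate_instance(n_routers=100, link_prob=0.10, n_apps=100, seed=0):
    rng = random.Random(seed)
    # Random physical topology: a link exists between each pair of routers
    # with probability link_prob.
    G = nx.gnp_random_graph(n_routers, link_prob, seed=seed)
    for u, v in G.edges():
        G[u][v]["latency_ms"] = rng.randint(10, 100)      # link latency: 10-100 ms
    # One computing premise per router, with capacity between 200 and 500.
    capacity = {d: rng.randint(200, 500) for d in G.nodes}
    # Applications demanding 100-200 units, with random source/destination nodes.
    apps = []
    for i in range(n_apps):
        s, d = rng.sample(list(G.nodes), 2)
        apps.append({"id": i, "req": rng.randint(100, 200), "src": s, "dst": d})
    return G, capacity, apps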
Baseline. We compare our approach against an OSPF-based approach, in which all requests follow the shortest path between source and destination nodes. We varied the order in which the applications' requests are processed based on the computing power requested: random (rnd), ascending (asc), and descending (dsc).
5.2 Results
Neighborhood Search. Figure 3 illustrates how the value of amp impacts the shortest path computation in the search for distinct alternative paths. In the baseline (amp equal to 0), fewer applications are entirely offloaded to the computing premises for both categories (1-to-1 and n-to-1) because no alternative routes besides the anchor path are allowed. On average, our approach offloaded 5% of the applications in the first category (Figure 3a). In contrast, this increases to 75% in the second strategy (Figure 3b), i.e., multi-source, because there are proportionally more paths for different applications. Also, for the amp equal to 1 and 2 scenarios, our approach offloaded at least 1.2x more applications, up to 24.85x in the best case.
Impact of Ordering Strategies. Figure 4 illustrates the impact of applying different strategies to distribute the computation of a set of cloud applications onto computing premises in the infrastructure. The asc algorithm orders application priority based on higher computation resources needed. On the other hand, the dsc strategy prioritizes less computation-intensive requests. Finally, the rnd strategy offloads requests as they arrive. We varied the source location (in the topology) from which the requests were issued. On the left (Figure 4a), the applications are limited to a single origin node, while on the right (Figure 4b), every node may be chosen randomly for each application request. We can observe that as the probability of the existence of a link between each pair of nodes increases (x-axis), the overall number of successfully processed applications also increases, reaching up to 173 running applications for the rnd strategy, i.e., applications offloaded as they arrive. The effect is stronger when we spread compute sources across the network (Figure 4b): even in the worst case (5% link coverage topology), we already have 193 covered applications (i.e., 97% coverage) with the dsc strategy. This is because new shortest reference paths are created for each source-destination pair, and nodes from different reference paths can reach neighbors that are unreachable when the value of amp is too small for a single reference path.
Figure 3: Impact of neighborhood search on the number of running applications. (a) Single source, single destination; (b) multiple sources, single destination.

Figure 4: Impact of the order in which applications are processed on the number of running applications. (a) Single source, single destination; (b) multiple sources, single destination.
Path Length and Path Latency. Figure 5 summarizes the average size of computation paths with the search radius on neighboring nodes fixed at up to 1 node (i.e., amp equal to 1) for single (Figure 5a) and multiple sources (Figure 5b). We can observe that, in both scenarios, the more link connections there are, the shorter the paths. As the nodes become more connected, they gain direct access to nodes with greater computing capacity, which reduces the total path length. For example, when the link probability is doubled (from 5% to 10%), the path length decreases by 23.7% on average. This is even more evident when we have multiple paths, because multiple anchor paths increase the probability that different neighbors also have a connection to a node with greater computing capacity. Also, the impact on the alternative path size is perceived in both scenarios. In the single-source scenario, even with 50% link probability, the alternative path length is 25.7% longer than its anchor path. In parallel, this difference does not exceed 14.8% (with 50% link probability) in the multi-source scenario. On average, some applications have a chance of being resolved with fewer hops than using just one path. Similarly, Figure 5e indicates the cumulative latency for each path in milliseconds for both the single- and multi-source runs. On average, single-source instances have more cumulative latency than multi-source runs. Finally, a CDF gives a detailed look at the impact of latency in the single-source scenario (Figure 5c). Similarly, a CDF shows resource availability decreasing as applications are allocated. In all strategies, network resources remain available after the allocation of applications. In this case, the random approach outperformed the others, with 171 offloaded applications out of 200 in the single-source scenario (Figure 5f).
6 CONCLUSION AND FUTURE WORK
Cloud Computing has been in the spotlight for providing flexible and robust computing capabilities through the Internet (Buyya et al., 2009). The core component in the traditional Cloud model is the consolidation of computing resources in large-scale data centers, which comprise dedicated networks and specialized power supply and cooling mechanisms for maintaining the infrastructure.
Figure 5: Average path latency and number of hops per computing path. (a) Path size (single source, single destination); (b) path size (multiple sources, single destination); (c) path latency (single source, single destination); (d) path latency (multiple sources, single destination); (e) accumulated latency (single source, single destination); (f) resource utilization (single source, single destination).
As large-scale data centers require a complex and resource-consuming infrastructure, they typically cannot be deployed inside urban centers, where data sources are located (Satyanarayanan et al., 2019). On top of that, the emergence of applications with tight latency and bandwidth requirements has called into question the Cloud's prominence, highlighting the need for alternative approaches for processing the high data influx in reduced time. This challenge gave birth to the Cloud Continuum paradigm, which merges various paradigms, such as Cloud Computing and Edge Computing, to get best-of-breed performance in terms of latency and bandwidth.
There has been considerable prior work toward optimizing Cloud Continuum provisioning at its endpoints (i.e., on Cloud and Edge). However, we make a case for leveraging in-transit optimizations throughout the Cloud Continuum to mitigate performance issues. Despite a few initiatives in that line of reasoning, to the best of our knowledge, none of the existing approaches coordinates in-transit application routing with the location of computing premises.
This paper presents a heuristic algorithm that orchestrates the application routing throughout the Cloud Continuum, looking for suitable hosts for processing the requests along the way to reduce the applications' end-to-end delay. Simulated experiments demonstrate that the proposed solution outperforms baseline strategies by 24x in terms of served application requests without sacrificing the applications' delay. In future work, we intend to explore meta-heuristics and other optimization techniques to find optimal in-transit application schedules within a bounded time.
ACKNOWLEDGEMENTS
This work was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001. Also, this work was partially funded by the National Council for Scientific and Technological Development (CNPq 404027/2021-0), the Foundation for Research of the State of São Paulo (FAPESP 2021/06981-0, 2021/00199-8, 2020/05183-0), and the Foundation for Research of the State of Rio Grande do Sul (19/2551-0001266-7, 19/2551-0001224-1, 19/2551-0001689-1, 21/2551-0000688-9).
REFERENCES
Baresi, L., Mendonça, D. F., Garriga, M., Guinea, S., and Quattrocchi, G. (2019). A unified model for the mobile-edge-cloud continuum. ACM Transactions on Internet Technology (TOIT), 19(2):1–21.
Buyya, R., Yeo, C. S., Venugopal, S., Broberg, J., and
Brandic, I. (2009). Cloud computing and emerging
it platforms: Vision, hype, and reality for delivering
computing as the 5th utility. Future Generation com-
puter systems, 25(6):599–616.
Chiu, T.-C., Chung, W.-H., Pang, A.-C., Yu, Y.-J., and Yen,
P.-H. (2016). Ultra-low latency service provision in
5g fog-radio access networks. In 2016 IEEE 27th
Annual International Symposium on Personal, Indoor,
and Mobile Radio Communications (PIMRC), pages
1–6. IEEE.
Ding, D., Savi, M., Pederzolli, F., Campanella, M., and
Siracusa, D. (2021). In-network volumetric ddos
victim identification using programmable commodity
switches. IEEE Transactions on Network and Service
Management, 18(2):1191–1202.
Fortz, B. and Thorup, M. (2000). Internet traffic engineer-
ing by optimizing ospf weights. In Proceedings IEEE
INFOCOM 2000. Conference on Computer Commu-
nications. Nineteenth Annual Joint Conference of the
IEEE Computer and Communications Societies (Cat.
No.00CH37064), volume 2, pages 519–528 vol.2.
Friday, K., Kfoury, E., Bou-Harb, E., and Crichigno, J.
(2020). Towards a unified in-network ddos detection
and mitigation strategy. In 2020 6th IEEE Conference
on Network Softwarization (NetSoft), pages 218–226.
IEEE.
Gupta, H., Nath, S. B., Chakraborty, S., and Ghosh, S. K.
(2016). Sdfog: A software defined computing archi-
tecture for qos aware service orchestration over edge
devices. arXiv preprint arXiv:1609.01190.
Hohemberger, R., Castro, A. G., Vogt, F. G., Mansilha,
R. B., Lorenzon, A. F., Rossi, F. D., and Luizelli,
M. C. (2019). Orchestrating in-band data plane
telemetry with machine learning. IEEE Communica-
tions Letters, 23(12):2247–2251.
Jin, X., Li, X., Zhang, H., Soulé, R., Lee, J., Foster, N., Kim, C., and Stoica, I. (2017). Netcache: Balancing key-value stores with fast in-network caching. In Proceedings of the 26th Symposium on Operating Systems Principles, pages 121–136.
Kannan, P. G., Joshi, R., and Chan, M. C. (2019). Pre-
cise time-synchronization in the data-plane using pro-
grammable switching asics. In Proceedings of the
2019 ACM Symposium on SDN Research, pages 8–20.
Kottur, S. Z., Kadiyala, K., Tammana, P., and Shah, R.
(2022). Implementing chacha based crypto primitives
on programmable smartnics. In Proceedings of the
ACM SIGCOMM Workshop on Formal Foundations
and Security of Programmable Network Infrastruc-
tures, pages 15–23.
Lee, J.-H. and Singh, K. (2020). Switchtree: in-network
computing and traffic analyses with random forests.
Neural Computing and Applications, pages 1–12.
Liao, Q., Marchenko, N., Hu, T., Kulics, P., and Ewe, L.
(2022). Haru: Haptic augmented reality-assisted user-
centric industrial network planning. In 2022 IEEE
Globecom Workshops (GC Wkshps), pages 389–394.
IEEE.
Mai, T., Yao, H., Guo, S., and Liu, Y. (2020). In-network
computing powered mobile edge: Toward high perfor-
mance industrial iot. IEEE network, 35(1):289–295.
Marques, J. A., Luizelli, M. C., Da Costa, R. I. T., and Gas-
pary, L. P. (2019). An optimization-based approach
for efficient network monitoring using in-band net-
work telemetry. Journal of Internet Services and Ap-
plications, (1):16.
Moreschini, S., Pecorelli, F., Li, X., Naz, S., Hästbacka, D., and Taibi, D. (2022a). Cloud continuum: the definition. IEEE Access.
Moreschini, S., Pecorelli, F., Li, X., Naz, S., Hästbacka, D., and Taibi, D. (2022b). Cloud continuum: The definition. IEEE Access, 10:131876–131886.
Panda, S., Feng, Y., Kulkarni, S. G., Ramakrishnan, K.,
Duffield, N., and Bhuyan, L. N. (2021). Smartwatch:
accurate traffic analysis and flow-state tracking for in-
trusion prevention using smartnics. In Proceedings of
the 17th International Conference on emerging Net-
working EXperiments and Technologies, pages 60–75.
Sankaran, G. C., Sivalingam, K. M., and Gondaliya, H.
(2021). P4 and netfpga based secure in-network com-
puting architecture for ai-enabled industrial internet of
things. IEEE Internet of Things Journal.
Saquetti, M., Canofre, R., Lorenzon, A. F., Rossi, F. D.,
Azambuja, J. R., Cordeiro, W., and Luizelli, M. C.
(2021). Toward in-network intelligence: running dis-
tributed artificial neural networks in the data plane.
IEEE Communications Letters, 25(11):3551–3555.
Satyanarayanan, M. (2019). How we created edge comput-
ing. Nature Electronics, 2(1):42–42.
Satyanarayanan, M., Klas, G., Silva, M., and Mangiante,
S. (2019). The seminal role of edge-native applica-
tions. In International Conference on Edge Comput-
ing, pages 33–40. IEEE.
Seemakhupt, K., Liu, S., Senevirathne, Y., Shahbaz, M.,
and Khan, S. (2021). Pmnet: in-network data persis-
tence. In 2021 ACM/IEEE 48th Annual International
Symposium on Computer Architecture (ISCA), pages
804–817. IEEE.
Tang, L., Huang, Q., and Lee, P. P. (2020). A fast and com-
pact invertible sketch for network-wide heavy flow
detection. IEEE/ACM Transactions on Networking,
28(5):2350–2363.
Tokusashi, Y., Dang, H. T., Pedone, F., Soulé, R., and Zilberman, N. (2019). The case for in-network computing on demand. In Proceedings of the Fourteenth EuroSys Conference 2019, pages 1–16.
Wang, L., Von Laszewski, G., Younge, A., He, X., Kunze,
M., Tao, J., and Fu, C. (2010). Cloud comput-
ing: a perspective study. New generation computing,
28(2):137–146.
Wang, S.-Y., Wu, C.-M., Lin, Y.-B., and Huang, C.-C.
(2019). High-speed data-plane packet aggregation and
disaggregation by p4 switches. Journal of Network
and Computer Applications, 142:98–110.
Wang, Y., Wang, X., Xu, S., He, C., Zhang, Y., Ren, J., and
Yu, S. (2022). Flexmon: A flexible and fine-grained
traffic monitor for programmable networks. Journal
of Network and Computer Applications, 201:103344.
Yeh, C., Do Jo, G., Ko, Y.-J., and Chung, H. K. (2022).
Perspectives on 6g wireless communications. ICT Ex-
press.
Zhang, X., Cui, L., Tso, F. P., and Jia, W. (2021). pheavy:
Predicting heavy flows in the programmable data
plane. IEEE Transactions on Network and Service
Management, 18(4):4353–4364.