Towards Optimizing the Edge-to-Cloud Continuum Resource Allocation
Igor Ferrazza Capeletti³, Ariel Goes de Castro³, Daniel Chaves Temp¹,³, Paulo Silas Severo de Souza², Arthur Francisco Lorenzon⁴, Fábio Diniz Rossi¹,³ and Marcelo Caggiani Luizelli³

¹Federal Institute Farroupilha, Alegrete, Brazil
²Pontifical Catholic University of Rio Grande do Sul, Porto Alegre, Brazil
³Federal University of Pampa, Alegrete, Brazil
⁴Federal University of Rio Grande do Sul, Porto Alegre, Brazil
Keywords: Cloud Continuum, Resource Allocation, Heuristics, Simulation.
Abstract: The IT community has witnessed a transition towards the cooperation of two major paradigms, Cloud Computing and Edge Computing, paving the way to a Cloud Continuum, where computation can be performed at the various network levels. While this model widens the provisioning possibilities, choosing the most cost-efficient processing location is not trivial. In addition, network bottlenecks between end users and the computing facilities assigned to carry out processing can undermine application performance. To overcome this challenge, this paper presents a novel algorithm that leverages a path-aware heuristic approach to opportunistically process application requests on compute devices along the network path. Once intermediate hosts process the information, requests are sent back to users, alleviating the demand on the network core and minimizing end-to-end application latency. Simulated experiments demonstrate that our approach outperforms baseline routing strategies by a factor of 24x in terms of network saturation reduction without sacrificing application latency.
1 INTRODUCTION
Cloud Continuum is an emerging paradigm that extends the traditional Cloud to the Edge, Fog, and in-between. In other words, it is an aggregation of heterogeneous resources of other computing facilities, such as micro-data centers and intermediate computing nodes along the path between the user's requests and larger computing premises (Moreschini et al., 2022b; Liao et al., 2022).
By extending the computing capabilities, the Cloud Continuum allows the in-transit processing of application requests and expands the computing granularity of the infrastructure (Baresi et al., 2019). There are many advantages to this computing paradigm. For instance, application requests can be processed while
being routed to the following processing hop (e.g., an Edge node). By doing that, the Cloud Continuum can reduce communication latency and improve the resource utilization of Cloud and Edge nodes by freeing up their resources. This is imperative for emerging applications with stringent performance requirements in a highly connected and dynamic environment. Examples of such applications include multi-sensory extended reality, brain-computer interfaces, haptic interaction, metaverses, and flying vehicles (Yeh et al., 2022).
Figure 1 illustrates a Cloud Continuum example. In the traditional Cloud Computing approach, application requests (e.g., R3) are processed by a centralized Cloud server. In the Cloud Continuum, however, computing resources are spread over the network infrastructure. That includes, for instance, Edge Cloud nodes and micro-servers placed closer to the access network (e.g., base stations). However, the resources available at these computing premises are usually quite limited. In the example, request R2 could be processed entirely by an Edge Cloud server (in case it has enough resources) or, depending on the request's requirements, be processed partially by different Edge Cloud servers. In turn, request R1 is totally processed by a micro-server located at the base station, one hop from the users.
Figure 1: Overview of the Cloud-Continuum Computing (requests R1-R3 served across the Access, Edge, and Core layers by micro servers, Edge Cloud nodes, and Cloud servers).
Despite the extra layer of available resources provided by the aggregation of heterogeneous resources in the infrastructure, it is challenging to use distributed computing resources efficiently without proper orchestration. For instance, a Cloud platform can route requests between Edge and Cloud nodes by using consolidated routing protocols (e.g., OSPF (Fortz and Thorup, 2000), based on the shortest path). Even applying customized routing schemes to improve other network metrics (e.g., to balance network link load) might under-use the available computing resources in the network. An efficient solution would be to route application requests using a path with higher access to computation power. In this case, a request could be processed opportunistically by a computing premise along the path. In the best case, the request is processed at the first hop. Otherwise, it is routed to its final destination (e.g., the Cloud), passing through nodes that have direct access to more powerful computing premises (e.g., Edge Cloud servers).
To fill this gap, this paper proposes a path-aware heuristic approach to efficiently route application requests to computing nodes. The main advantage of our heuristic approach is that it considers the available resources along the network path in order to process the requests as early as possible. If a computing premise along the path processes a request, the request is routed back to the source device. Our heuristic approach is based on the k-shortest path. However, the alternative paths obtained by our search algorithm try not to detour much from the original shortest path, which is achieved by incrementally expanding the neighborhood of nodes in the original path. Results show that our solution outperforms shortest-path-based solutions by a factor of 24x in terms of the number of processed applications while not imposing significant delay. Our contributions can be summarized as follows:
• We formalize an orchestration model for in-transit processing in Cloud Continuum environments.
• We propose a novel algorithm that opportunistically processes application requests on compute devices along the Cloud Continuum.
• We present an evaluation showing that our algorithm reduces network saturation by 24x compared to traditional routing strategies.
• We disclose the dataset and source code of our approach to foster reproducibility.
The remainder of this paper is organized as follows. In Section 2, we provide a brief background on Cloud Continuum aspects. Then, we discuss the related literature in the area. In Sections 3 and 4, we introduce our model and present the proposed heuristic. In Section 5, we discuss the obtained results. Last, in Section 6, we conclude the paper with final remarks and perspectives for future work.
2 BACKGROUND AND RELATED WORK
In this section, we start by discussing cloud computing concepts. Next, we overview the most prominent work related to Cloud Continuum computing.
2.1 Cloud Computing and Beyond
Cloud computing is a paradigm that allows users to move their data and applications from local computing to a remote "cloud" (Wang et al., 2010). There are many benefits to this approach. First, estimating the cost of acquiring new equipment (e.g., switches and servers) is not trivial. Certain services may have different demands throughout the day (i.e., peak hours). In this case, a budget can end up being (i) overestimated, wasting resources, or (ii) underestimated, lacking the resources (such as enough bandwidth or processing capacity) to deal with an excessive amount of requests. Second, the equipment takes up space and is costly: considerable time must be dedicated to guaranteeing the correct functioning of the equipment and its running services, and qualified staff must be kept on the payroll for unforeseen problems. Finally, planning for the above goals is time-consuming, and managers could use this time for other company tasks.
Figure 2: Illustration of the behavior of our heuristic approach when finding near-optimal solutions (panels (a)-(f)).
Despite that, the cloud computing model struggles to keep up with the constraints of current services, such as ultra-low latency services in 5G cellular networks (Chiu et al., 2016). To tackle this problem, new computation paradigms such as edge computing (Satyanarayanan, 2019) have arisen to satisfy these demands. More specifically, edge computing allows performing computation and storage closer to the end user with minimal to no intervention from cloud nodes, incurring less latency. Despite that, edge computing still has limited processing power and may exhibit high response times, and it may require cooperation from remote resources, such as those in cloud provider infrastructures, which do not cope well with these requirements.
More recently, the cloud continuum has drawn attention from academia and industry as a candidate to overcome these limitations, and it has several definitions (Moreschini et al., 2022a). In this paper, we consider an early definition by Gupta et al. (Gupta et al., 2016), which defines the cloud continuum as "a continuum of resources available from the network edge to the cloud/datacenter". That said, with the advent of programmable network devices (e.g., SmartNICs and programmable switches), offloading applications to the data plane delivers superior performance in terms of high-throughput, low-latency computing. This capability may be combined with the existing edge infrastructure to avoid accessing high-latency computation resources located in centralized clouds and to guarantee that SLAs are met.
2.2 Related Work
This section discusses the most prominent studies related to in-transit or in-path computing strategies.
NetCache (Jin et al., 2017) leverages switch ASICs to perform on-path network caching to store key-value data. Similarly, Wang et al. (Wang et al., 2019) are the first to design and implement packet aggregation/disaggregation entirely in switching ASICs, while PMNet (Seemakhupt et al., 2021) persistently stores and updates data in network devices with sub-RTT latency. SwitchTree (Lee and Singh, 2020) estimates flow-level stateful features, such as RTT and per-flow bitrate. FlexMon (Wang et al., 2022) presents a network-wide traffic measurement scheme that optimally deploys measurement nodes and uses these nodes to measure the flows collaboratively.
Tokusashi et al. (Tokusashi et al., 2019) selectively offload services to the data plane according to changes in workload. Similarly, Mai et al. (Mai et al., 2020) partially offload the lightweight critical tasks to the data plane devices and leave the rest to the mobile edge computing (MEC) nodes. In contrast, Saquetti et al. (Saquetti et al., 2021) distribute the neuron computation of an Artificial Neural Network to multiple switches. Friday et al. (Friday et al., 2020) introduce an engine to detect attacks in real time by analyzing one-way ingress traffic on the switch. Similarly, INDDoS (Ding et al., 2021) can reduce DDoS detection time: it identifies as attack targets the destination IPs contacted by a number of source IPs greater than a threshold within a given time interval, entirely in the data plane. SmartWatch (Panda et al., 2021) leverages advances in switch-based network telemetry platforms to process the bulk of the traffic and only forwards suspicious traffic subsets to the SmartNIC, which has more processing power to provide finer-grained analysis.
Kannan et al. (Kannan et al., 2019) are the first to perform time synchronization in the data plane, enabling the addition of high-resolution timing information to packets at line rate. Tang et al. (Tang et al., 2020) and pHeavy (Zhang et al., 2021) make efforts to reduce the time to detect heavy hitters in the data plane. (Tang et al., 2020) propose a compact sketch statically allocated in switch memory, while (Zhang et al., 2021) introduces a machine learning-based scheme to mitigate latency overhead on SDN controllers. (Sankaran et al., 2021) increases data plane security by restricting modifications to a persistent network switch state, enabling ML decision-making computation to be offloaded to industrial network elements. Similarly, Kottur et al. (Kottur et al., 2022) propose crypto externs for Netronome Agilio SmartNICs for authentication and confidentiality directly in the data plane.
Most recent initiatives have focused on offloading computing mechanisms to specific computing premises, such as programmable network devices. In this work, we take a further step toward efficiently using distributed resources in the Edge-to-Cloud Continuum to increase the network capability regarding the number of processed application requests.
3 SYSTEM MODEL
This section describes the resource allocation model proposed in this work for the Edge-to-Cloud Continuum approach. Next, we describe the inputs and outputs of our model, as well as the constraints and objective function. Table 1 summarizes the notation used hereafter.
3.1 Model Description and Notation
Input. The optimization model considers as input a physical network infrastructure G = (D, L) and a set of application requests A. Set D in network G represents routing/forwarding devices D = {1, ..., |D|}, while set L consists of unidirectional links interconnecting pairs of devices (i, j) ∈ (D × D). We assume that at most one computing premise (e.g., an Edge node) is connected to each forwarding device in D. The subset D_c ⊆ D represents the devices with attached computing premises. Each computing premise d ∈ D_c has a computational capacity defined by C : D_c → N⁺. Conversely, each application i ∈ A has a computing requirement defined by R : A → N⁺. We denote the routing taken by application i ∈ A as function P : A → {D_1 × ... × D_{|D|-1}}. We assume the path given by function P is simple.
For simplicity, we assume that distributed computing platforms can partially provide the computing power required by an application i ∈ A. As a simplification, we assume that partially computed values are embedded into the packet that transports the request. Examples of similar strategies that utilize packet encapsulation to carry information include In-Band Network Telemetry (INT) (Marques et al., 2019; Hohemberger et al., 2019).
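For concreteness, the snippet below sketches one possible in-memory representation of these inputs in Python using networkx; the attribute and variable names (latency_ms, capacity, requirement, endpoints) are illustrative assumptions rather than part of the model or the released dataset.

import networkx as nx

# Physical infrastructure G = (D, L): forwarding devices and unidirectional links.
G = nx.DiGraph()
G.add_nodes_from(range(6))  # D = {0, ..., 5}
G.add_edges_from([(0, 1), (1, 2), (2, 3), (1, 4), (4, 3), (3, 5)], latency_ms=10)

# D_c ⊆ D: devices with an attached computing premise and its capacity C : D_c → N⁺.
capacity = {1: 300, 2: 250, 4: 400}

# Each application i ∈ A has a requirement R(i) and a (source, destination) pair.
requirement = {"app0": 150, "app1": 120}
endpoints = {"app0": (0, 5), "app1": (0, 3)}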
Variables. Our model considers a variable set X = {x_{i,j} : i ∈ A, j ∈ D_c}, which indicates the amount of resources used from computing premise j ∈ D_c to process application i ∈ A:

$x_{i,j} = \begin{cases} n \in \mathbb{N}^{+} & \text{if application } i \in A \text{ is processed by } j \in D_c,\\ 0 & \text{otherwise.}\end{cases}$  (1)
Constraints. Next, we describe the main feasibility constraints related to the optimization problem. The problem is subject to two main constraints: (i) path computing capacity and (ii) route connectivity.
(i) Path Computing Capacity: Application i ∈ A has a computing requirement that is attended along the path taken by the application. Therefore, the routing path (or computing path) establishes an upper bound on the computing power. In other words, the amount of computing power in the path cannot be lower than the application requirement. Thus, in Equation set (2), we sum the available capacity along the route taken by i and ensure it is equal to the application's requirement. Similarly, Equation set (3) ensures that the computing power of each computing premise j ∈ D_c is not violated.

$\sum_{j \in P(i) :\, j \in D_c} C(j) \cdot x_{i,j} = R(i) \quad \forall i \in A$  (2)

$\sum_{i \in A :\, j \in P(i)} R(i) \cdot x_{i,j} \leq C(j) \quad \forall j \in D_c$  (3)

(ii) Route Connectivity: we assume all devices in P(i) are pairwise strongly connected, i.e., any pair of consecutive devices in P(i) is connected by a link (i, j) ∈ L. To describe this property, we recall an auxiliary function δ : (P × D × D) → {true, false} that returns true in case there exists a path between nodes k and l in path P(i), i.e., $P(i)_{[k]} \rightarrow \dots \rightarrow P(i)_{[|P(i)|]}$, where $(P(i)_{[k]}, P(i)_{[k+1]}) \in L$; otherwise, function δ returns false. Note that one could use other constraints to describe route connectivity, such as flow conservation constraints. Equation set (4) ensures all applications i ∈ A have a computing path, while Equation set (5) ensures the paths are valid (or connected).

$|P(i)| \neq 0 \quad \forall i \in A$  (4)

$\delta(P(i), k, l) = \text{true} \quad \forall i \in A,\ \forall (k, l) \in P(i)$  (5)
Table 1: Summary of symbols.

Symbol           Definition
G = (D, L)       Physical infrastructure G.
D                Set of forwarding devices.
D_c              Set of computing premises.
L                Set of physical links.
A                Set of applications.
P(i)             Computing path given to application i ∈ A.
C : D_c → N⁺     Computing capacity of premise j ∈ D_c.
R : A → N⁺       Computing requirement of application i ∈ A.
Given the feasibility constraints defined above, we assume there exists an assignment function 𝒜 : (G, i) → P(i), i ∈ A, that, given a network infrastructure G and a set of applications A, returns a feasible computing path P(i) with respect to constraint sets (i) and (ii).
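As a sanity check of the model, the sketch below shows how constraints (i) and (ii) could be verified for a single candidate path in Python; the function and argument names are illustrative assumptions, not part of the authors' formulation.

def path_is_feasible(G, path, capacity, used, req):
    """Check route connectivity and path computing capacity for one application.

    path: ordered list of devices P(i); capacity: C(j) per premise;
    used: capacity already consumed per premise; req: requirement R(i).
    """
    # (ii) Route connectivity: consecutive devices in P(i) must share a link in L.
    for u, v in zip(path, path[1:]):
        if not G.has_edge(u, v):
            return False
    # (i) Path computing capacity: free capacity along the path must cover R(i).
    free = sum(max(capacity.get(d, 0) - used.get(d, 0), 0) for d in path)
    return free >= req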
4 PROPOSED HEURISTIC
We propose a heuristic procedure that builds computation-aware paths to tackle the above problem efficiently and provide a quality-wise solution. Next, we overview the ideas behind our proposed heuristic and discuss its pseudo-code.
Consider Figure 2a, which illustrates a network topology G(D, L), where D is a set of devices and L is the set of network links. Every time an end user submits an application request, it must be routed somehow to reach its resources (e.g., the Cloud). If we wanted to orchestrate how to route an application from a host connected at the edge (e.g., S8) to a remote application (e.g., S15), there would be several strategies to reach this goal. Figures 2b to 2f summarize our proposal to solve this problem. First, we find the shortest path from the origin of the request (S8) to the application server (S15) – which is the reference path – leveraging existing programmable switches to reduce server computation overhead. In a binary manner, we check whether the devices along the shortest path together have enough resources to process the request entirely in the data plane (Figure 2b). Otherwise, the reference path (S8, S7, S11, S15) is iteratively modified according to the neighborhood size – controlled by the amp variable – as seen in Figure 2c. More specifically, we fix a node in the reference path and explore its adjacent neighbors. Then, detours are performed by applying the shortest path on sub-graphs that do not contain previously explored nodes (Figures 2c and 2e). Finally, if none of the generated paths can fully offload the application request, we start the procedure all over again by fixing another node in the reference path, i.e., S7 as in Figure 2f.
Algorithm 1 summarizes the main procedure. For each application (app), we store the shortest path (line 3) from a host connected to the source switch (s) to the destination server (d). If the switches in the path between s and d are unable to completely satisfy the application's computing request (req), we set the weights according to the hop distance between the shortest path and the remaining graph (line 6) and narrow the search field for alternative switches based on the amp value (line 8) – the higher the value, the broader the search. Then, an optimization procedure (line 10) is invoked to try to find a path that satisfies the computation request (see Algorithm 2 for details). Finally, if the optimization procedure succeeds, we store the new path; otherwise, we keep the default path.
The core of our proposal is presented in Algorithm 2. It iteratively tries to completely offload the server computation to the data plane by modifying the original path. For a given application app, we need its previous computation path cam_old_app.
Algorithm 1: Overview of the procedure.
Input: G(D, L): topology graph, A: set of applications, amp: length of the subgraph.
1:  graph_old ← graph
2:  for app in A do
3:      cam_old_app ← dijkstra_mod(s, d, amp, graph)
4:      if sum_cap(cam_old_app) < req then
5:          for i ∈ cam_old_app do
6:              set_weights(amp, graph)
7:          end for
8:          graph ← gen_subgraph(graph, amp)
9:      end if
10:     cam_app ← opt(app, graph, cam_old_app)
11:     if cam_app ≠ NULL then
12:         res_add(cam_app)
13:     else
14:         res_add(cam_old_app)
15:     end if
16:     graph ← graph_old
17: end for
The procedure works as follows: first, we (i) mark each node of the reference path (line 1) as the reference ref, one at a time; then (ii) we perform detours by deleting the edge (line 2) from the current reference node to its next neighbor in the original path. Then, we repeat the process (lines 5-15) for the remaining adjacent nodes (line 4). Finally, if the detours taken accumulate enough capacity for the application request, the modified path is returned (line 6).
Algorithm 2: Overview of the optimization procedure.
Input: G'(D, L): topology subgraph, app: application, amp: length of the subgraph, cam_old_app: reference application computation path.
1:  for ref ∈ cam_old_app do
2:      del(link(ref, ref_next))
3:      adj_list_ref ← get_adj_nodes(ref)
4:      cam_alt ← dijkstra_mod(s, d, amp)
5:      while adj_list_ref ≠ NULL and cam_alt ≠ NULL do
6:          if sum_cap(cam_alt) ≥ req then return cam_alt
7:          else
8:              for j ∈ cam_alt do
9:                  if j ∈ adj_list_ref then
10:                     del(link(ref, j))
11:                     cam_alt ← dijkstra_mod(s, d, amp)
12:                 end if
13:             end for
14:         end if
15:     end while
16: end for
return NULL
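To make the procedure concrete, the following Python sketch approximates the combined behavior of Algorithms 1 and 2 on top of networkx shortest paths. It is a simplified illustration under stated assumptions: the amp-bounded subgraph step is omitted, and the helper names (path_capacity, detour_search) are ours, not those of the released implementation.

import networkx as nx

def path_capacity(path, capacity):
    # Aggregate computing capacity available along a candidate path.
    return sum(capacity.get(d, 0) for d in path)

def detour_search(G, s, d, req, capacity):
    # Reference (anchor) path: plain shortest path from source to destination.
    ref_path = nx.shortest_path(G, s, d, weight="latency_ms")
    if path_capacity(ref_path, capacity) >= req:
        return ref_path                          # anchor path already suffices
    for idx, ref in enumerate(ref_path[:-1]):    # fix one reference node at a time
        H = G.copy()
        H.remove_edge(ref, ref_path[idx + 1])    # force a detour after `ref`
        adj = set(H.neighbors(ref))
        while adj:
            try:
                alt = nx.shortest_path(H, s, d, weight="latency_ms")
            except nx.NetworkXNoPath:
                break
            if path_capacity(alt, capacity) >= req:
                return alt                       # a detour can fully serve the request
            # Remove links toward explored adjacent nodes and recompute
            # (mirrors Algorithm 2, lines 8-13).
            progressed = False
            for j in list(alt):
                if j in adj:
                    adj.discard(j)
                    if H.has_edge(ref, j):
                        H.remove_edge(ref, j)
                    progressed = True
            if not progressed:
                break
    return None                                  # caller keeps the reference path

A full implementation would additionally restrict each shortest-path recomputation to the amp-hop neighborhood of the anchor path, as Algorithm 1 does through gen_subgraph.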
5 EVALUATION AND DISCUSSION
This section describes the experiments carried out to
evaluate the proposed algorithm. First, we detail our
setup and methodology (§5.1). Then, we discuss the
achieved results (§5.2).
5.1 Setup
To evaluate and assess the performance metrics of our proposed heuristic algorithm, we implemented it in Python. All experiments were conducted on an AMD Threadripper 3990X with 64 physical cores and 32 GB of RAM, running the Ubuntu GNU/Linux Server 22.04 x86-64 operating system. For our experiments, we generated physical network infrastructures with 100 routers. We consider that each routing device in our network has a computing premise attached to it. Physical networks are generated randomly with link connectivity probability ranging from 10% to 50%. Each network link has a latency value that varies between 10 ms and 100 ms. All computing premises have a processing capacity between 200 and 500, while 100 applications are requested, demanding processing power ranging from 100 to 200. The source and destination nodes of each request are generated randomly. In our approach, we varied the parameter amp between 1 and 2; this parameter controls the search depth, as already discussed.
We repeated each experiment 30 times to obtain averages and ensure a confidence level of at least 95%.
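The sketch below illustrates how one such random instance could be generated in Python; using gnp_random_graph and uniform draws is our assumption about the generation process, not necessarily the authors' exact methodology.

import random
import networkx as nx

def generate_instance(n_routers=100, link_prob=0.10, n_apps=100, seed=0):
    rng = random.Random(seed)
    # Random physical topology: a link exists between each pair of routers
    # with probability link_prob.
    G = nx.gnp_random_graph(n_routers, link_prob, seed=seed)
    for u, v in G.edges():
        G[u][v]["latency_ms"] = rng.randint(10, 100)      # link latency: 10-100 ms
    # One computing premise per router, with capacity between 200 and 500.
    capacity = {d: rng.randint(200, 500) for d in G.nodes}
    # Applications demanding 100-200 units, with random source/destination nodes.
    apps = []
    for i in range(n_apps):
        s, d = rng.sample(list(G.nodes), 2)
        apps.append({"id": i, "req": rng.randint(100, 200), "src": s, "dst": d})
    return G, capacity, apps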
Baseline. We compare our approach against an OSPF-based approach, in which all requests follow the shortest path between source and destination nodes. We varied the order in which the applications' requests are processed based on the computing power requested: random (rnd), ascending (asc), and descending (dsc).
5.2 Results
Neighborhood Search. Figure 3 illustrates how the value of amp impacts the shortest path computation in the search for distinct alternative paths. In the baseline (amp equal to 0), fewer applications are entirely offloaded to the computing premises for both categories (1-to-1 and n-to-1) because no alternative routes besides the anchor path are allowed. On average, our approach offloaded 5% of the applications in the first category (Figure 3a). In contrast, this increases to 75% in the second strategy (Figure 3b), i.e., multi-source, because there are proportionally more paths for different applications. Also, for the amp equal to 1 and 2 scenarios, our approach offloaded at least 1.2x more applications, up to 24.85x in the best case.
Impact of Ordering Strategies. Figure 4 illustrates the impact of applying different strategies to distribute the computation of a set of cloud applications onto computing premises in the infrastructure. The asc algorithm orders application priority based on higher computation resources needed. On the other hand, the dsc strategy prioritizes less computation-intensive requests. Finally, the rnd strategy offloads requests as they arrive. We varied the source location (in the topology) from which the requests were issued. On the left (Figure 4a), the applications are limited to a single origin node, while on the right (Figure 4b), every node may be chosen randomly for each application request. We can observe that as the probability of the existence of a link between each pair of nodes increases (x-axis), the overall number of successfully processed applications also increases, reaching up to 173 running applications for the rnd strategy, i.e., applications offloaded as they arrive. The effect is stronger when we spread compute sources across the network (Figure 4b): even in the worst case (5% link coverage topology), we already have 193 covered applications (i.e., 97% coverage) with the dsc strategy. This is because new shortest reference paths are created for each source-destination pair, and nodes from different reference paths can reach neighbors that are unreachable when the value of amp is too small for a single reference path.
Figure 3: Impact of neighborhood search on the number of running applications. (a) Single source, single destination; (b) multiple sources, single destination.

Figure 4: Impact of the order in which applications are processed on the number of running applications. (a) Single source, single destination; (b) multiple sources, single destination.
Path Length and Path Latency. Figure 5 summarizes the average size of computation paths with the search radius on neighboring nodes fixed at up to 1 node (i.e., amp equal to 1) for single (Figure 5a) and multiple sources (Figure 5b). We can observe that, in both scenarios, the more link connections there are, the shorter the paths. As the nodes become more connected, they gain direct access to nodes with greater computing capacity, which reduces the total path length. For example, when the link probability is doubled (from 5% to 10%), the path length decreases by 23.7% on average. This is even more evident when we have multiple paths, because multiple anchor paths increase the probability that different neighbors also have a connection to a node with greater computing capacity. Also, the impact on the alternative path size is perceived in both scenarios. In the single-source scenario, even with 50% link probability, the alternative path length is 25.7% longer than its anchor path. In parallel, this difference does not exceed 14.8% (with 50% link probability) in the multi-source scenario. On average, some applications have a chance of being resolved with fewer hops than using just one path. Similarly, Figure 5e indicates the cumulative latency for each path in milliseconds for both the single- and multi-source runs. On average, single-source instances have more cumulative latency than multi-source runs. Finally, a CDF gives a detailed look at the impact of latency in the single-source scenario (Figure 5c). Similarly, a CDF shows resource availability decreasing as applications are allocated. In all strategies, network resources remain available after the allocation of applications. In this case, the random approach outperformed the others, with 171 offloaded applications out of 200 in the single-source scenario (Figure 5f).
6 CONCLUSION AND FUTURE WORK
Cloud Computing has been in the spotlight for providing flexible and robust computing capabilities through the Internet (Buyya et al., 2009). The core component in the traditional Cloud model is the consolidation of computing resources in large-scale data centers, which comprise dedicated networks and specialized power supply and cooling mechanisms for maintaining the infrastructure.
Figure 5: Average path latency and number of hops per computing path. (a) Path size (single source, single destination); (b) path size (multiple sources, single destination); (c) path latency (single source, single destination); (d) path latency (multiple sources, single destination); (e) accumulated latency (single source, single destination); (f) resource utilization (single source, single destination).
As large-scale data centers require a complex and resource-consuming infrastructure, they typically cannot be deployed inside urban centers, where data sources are located (Satyanarayanan et al., 2019). On top of that, the emergence of applications with tight latency and bandwidth requirements has called into question the Cloud's prominence, highlighting the need for alternative approaches for processing the high data influx in reduced time. This challenge gave birth to the Cloud Continuum paradigm, which merges various paradigms, such as Cloud Computing and Edge Computing, to get best-of-breed performance in terms of latency and bandwidth.
There has been considerable prior work toward optimizing Cloud Continuum provisioning at its endpoints (i.e., on Cloud and Edge). However, we make a case for leveraging in-transit optimizations throughout the Cloud Continuum to mitigate performance issues. Despite a few initiatives in that line of reasoning, to the best of our knowledge, none of the existing approaches coordinates in-transit application routing with the location of computing premises.
This paper presents a heuristic algorithm that orchestrates the application routing throughout the Cloud Continuum, looking for suitable hosts for processing the requests along the way to reduce the applications' end-to-end delay. Simulated experiments demonstrate that the proposed solution outperforms baseline strategies by 24x in terms of served application requests without sacrificing the applications' delay. In future work, we intend to explore meta-heuristics and other optimization techniques to find optimal in-transit application schedules within a bounded time.
ACKNOWLEDGEMENTS
This work was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001. Also, this work was partially funded by the National Council for Scientific and Technological Development (CNPq 404027/2021-0), the Foundation for Research of the State of São Paulo (FAPESP 2021/06981-0, 2021/00199-8, 2020/05183-0), and the Foundation for Research of the State of Rio Grande do Sul (19/2551-0001266-7, 19/2551-0001224-1, 19/2551-0001689-1, 21/2551-0000688-9).
REFERENCES
Baresi, L., Mendonça, D. F., Garriga, M., Guinea, S., and Quattrocchi, G. (2019). A unified model for the mobile-edge-cloud continuum. ACM Transactions on Internet Technology (TOIT), 19(2):1–21.
Buyya, R., Yeo, C. S., Venugopal, S., Broberg, J., and
Brandic, I. (2009). Cloud computing and emerging
it platforms: Vision, hype, and reality for delivering
computing as the 5th utility. Future Generation com-
puter systems, 25(6):599–616.
Chiu, T.-C., Chung, W.-H., Pang, A.-C., Yu, Y.-J., and Yen,
P.-H. (2016). Ultra-low latency service provision in
5g fog-radio access networks. In 2016 IEEE 27th
Annual International Symposium on Personal, Indoor,
and Mobile Radio Communications (PIMRC), pages
1–6. IEEE.
Ding, D., Savi, M., Pederzolli, F., Campanella, M., and
Siracusa, D. (2021). In-network volumetric ddos
victim identification using programmable commodity
switches. IEEE Transactions on Network and Service
Management, 18(2):1191–1202.
Fortz, B. and Thorup, M. (2000). Internet traffic engineer-
ing by optimizing ospf weights. In Proceedings IEEE
INFOCOM 2000. Conference on Computer Commu-
nications. Nineteenth Annual Joint Conference of the
IEEE Computer and Communications Societies (Cat.
No.00CH37064), volume 2, pages 519–528 vol.2.
Friday, K., Kfoury, E., Bou-Harb, E., and Crichigno, J.
(2020). Towards a unified in-network ddos detection
and mitigation strategy. In 2020 6th IEEE Conference
on Network Softwarization (NetSoft), pages 218–226.
IEEE.
Gupta, H., Nath, S. B., Chakraborty, S., and Ghosh, S. K.
(2016). Sdfog: A software defined computing archi-
tecture for qos aware service orchestration over edge
devices. arXiv preprint arXiv:1609.01190.
Hohemberger, R., Castro, A. G., Vogt, F. G., Mansilha,
R. B., Lorenzon, A. F., Rossi, F. D., and Luizelli,
M. C. (2019). Orchestrating in-band data plane
telemetry with machine learning. IEEE Communica-
tions Letters, 23(12):2247–2251.
Jin, X., Li, X., Zhang, H., Soulé, R., Lee, J., Foster, N., Kim, C., and Stoica, I. (2017). Netcache: Balancing key-value stores with fast in-network caching. In Proceedings of the 26th Symposium on Operating Systems Principles, pages 121–136.
Kannan, P. G., Joshi, R., and Chan, M. C. (2019). Pre-
cise time-synchronization in the data-plane using pro-
grammable switching asics. In Proceedings of the
2019 ACM Symposium on SDN Research, pages 8–20.
Kottur, S. Z., Kadiyala, K., Tammana, P., and Shah, R.
(2022). Implementing chacha based crypto primitives
on programmable smartnics. In Proceedings of the
ACM SIGCOMM Workshop on Formal Foundations
and Security of Programmable Network Infrastruc-
tures, pages 15–23.
Lee, J.-H. and Singh, K. (2020). Switchtree: in-network
computing and traffic analyses with random forests.
Neural Computing and Applications, pages 1–12.
Liao, Q., Marchenko, N., Hu, T., Kulics, P., and Ewe, L.
(2022). Haru: Haptic augmented reality-assisted user-
centric industrial network planning. In 2022 IEEE
Globecom Workshops (GC Wkshps), pages 389–394.
IEEE.
Mai, T., Yao, H., Guo, S., and Liu, Y. (2020). In-network
computing powered mobile edge: Toward high perfor-
mance industrial iot. IEEE network, 35(1):289–295.
Marques, J. A., Luizelli, M. C., Da Costa, R. I. T., and Gas-
pary, L. P. (2019). An optimization-based approach
for efficient network monitoring using in-band net-
work telemetry. Journal of Internet Services and Ap-
plications, (1):16.
Moreschini, S., Pecorelli, F., Li, X., Naz, S., Hästbacka, D., and Taibi, D. (2022a). Cloud continuum: the definition. IEEE Access.
Moreschini, S., Pecorelli, F., Li, X., Naz, S., Hästbacka, D., and Taibi, D. (2022b). Cloud continuum: The definition. IEEE Access, 10:131876–131886.
Panda, S., Feng, Y., Kulkarni, S. G., Ramakrishnan, K.,
Duffield, N., and Bhuyan, L. N. (2021). Smartwatch:
accurate traffic analysis and flow-state tracking for in-
trusion prevention using smartnics. In Proceedings of
the 17th International Conference on emerging Net-
working EXperiments and Technologies, pages 60–75.
Sankaran, G. C., Sivalingam, K. M., and Gondaliya, H.
(2021). P4 and netfpga based secure in-network com-
puting architecture for ai-enabled industrial internet of
things. IEEE Internet of Things Journal.
Saquetti, M., Canofre, R., Lorenzon, A. F., Rossi, F. D.,
Azambuja, J. R., Cordeiro, W., and Luizelli, M. C.
(2021). Toward in-network intelligence: running dis-
tributed artificial neural networks in the data plane.
IEEE Communications Letters, 25(11):3551–3555.
Satyanarayanan, M. (2019). How we created edge comput-
ing. Nature Electronics, 2(1):42–42.
Satyanarayanan, M., Klas, G., Silva, M., and Mangiante,
S. (2019). The seminal role of edge-native applica-
tions. In International Conference on Edge Comput-
ing, pages 33–40. IEEE.
Seemakhupt, K., Liu, S., Senevirathne, Y., Shahbaz, M.,
and Khan, S. (2021). Pmnet: in-network data persis-
tence. In 2021 ACM/IEEE 48th Annual International
Symposium on Computer Architecture (ISCA), pages
804–817. IEEE.
Tang, L., Huang, Q., and Lee, P. P. (2020). A fast and com-
pact invertible sketch for network-wide heavy flow
detection. IEEE/ACM Transactions on Networking,
28(5):2350–2363.
Tokusashi, Y., Dang, H. T., Pedone, F., Soulé, R., and Zilberman, N. (2019). The case for in-network computing on demand. In Proceedings of the Fourteenth EuroSys Conference 2019, pages 1–16.
Wang, L., Von Laszewski, G., Younge, A., He, X., Kunze,
M., Tao, J., and Fu, C. (2010). Cloud comput-
ing: a perspective study. New generation computing,
28(2):137–146.
Wang, S.-Y., Wu, C.-M., Lin, Y.-B., and Huang, C.-C.
(2019). High-speed data-plane packet aggregation and
disaggregation by p4 switches. Journal of Network
and Computer Applications, 142:98–110.
Wang, Y., Wang, X., Xu, S., He, C., Zhang, Y., Ren, J., and
Yu, S. (2022). Flexmon: A flexible and fine-grained
traffic monitor for programmable networks. Journal
of Network and Computer Applications, 201:103344.
Yeh, C., Do Jo, G., Ko, Y.-J., and Chung, H. K. (2022).
Perspectives on 6g wireless communications. ICT Ex-
press.
Zhang, X., Cui, L., Tso, F. P., and Jia, W. (2021). pheavy:
Predicting heavy flows in the programmable data
plane. IEEE Transactions on Network and Service
Management, 18(4):4353–4364.