paradigm of edge computing has arisen (Shi et al.,
2016). With edge computing, units placed at various
locations close to the data sources provide computing,
networking and storage capabilities on the spot.
These units, although of lower capacity
compared to cloud infrastructures, serve time-critical
tasks and reduce the load offloaded to the central
cloud. Cloud resources are also utilized in
combination with the edge ones, e.g., serving latency-
tolerant workloads, creating a powerful edge-cloud
continuum. Software-wise, lightweight containers are
the ideal technology to enable the seamless execution
of cloud-native applications, or parts of them, on the
edge (Goethals, n.d.), in contrast to other
virtualization approaches such as virtual machines.
The present work focuses on the development of
a novel mechanism for the appropriate allocation of
the available computing and storage resources in the
various layers of an edge-cloud infrastructure, to
support incoming workload from cloud-native
applications. Our aim is to jointly optimize a
weighted combination of the average delay (per
application) and the average cost of service, while
ensuring that both the delay between dependent
microservices and the resources available on the
infrastructure nodes meet the requirements specified
by the applications. We first model the problem as a
Mixed Integer Linear Programming (MILP) problem.
Then, we construct a fast heuristic algorithm, the
Greedy Resource Allocation Algorithm (GRAA),
which is also utilized by a novel Rollout technique
(the Rollout algorithm based on GRAA) that relies on
Reinforcement Learning (RL) principles to further
improve the generated solution. We
evaluate the results through extensive simulations
under various scenarios and demonstrate the
efficiency of the proposed solutions.
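To illustrate how GRAA and the Rollout technique interact, the following minimal Python sketch captures the generic one-step rollout pattern over a greedy base heuristic: at every placement decision, each feasible candidate node is evaluated by letting the base heuristic complete the remaining assignments, and the candidate whose completed allocation minimizes the weighted delay/cost objective is committed. All identifiers (Placement, place, greedy_complete, fits, delay_for, cost_for) and the weight w are hypothetical placeholders; the exact formulation and algorithms are given in Sections 3 and 4.

from dataclasses import dataclass

@dataclass
class Placement:
    ms: object       # microservice being placed (hypothetical type)
    node: object     # infrastructure node hosting it (hypothetical type)
    delay: float     # delay incurred by this placement
    cost: float      # cost incurred by this placement

def place(ms, node):
    # Hypothetical helpers: delay/cost of serving microservice `ms` on `node`.
    return Placement(ms, node, node.delay_for(ms), node.cost_for(ms))

def objective(allocation, w=0.5):
    # Weighted combination of average delay and average cost of service.
    avg_delay = sum(p.delay for p in allocation) / len(allocation)
    avg_cost = sum(p.cost for p in allocation) / len(allocation)
    return w * avg_delay + (1 - w) * avg_cost

def rollout_allocate(microservices, nodes, greedy_complete, w=0.5):
    # One-step rollout: for each pending microservice, try every feasible
    # node, complete the remaining placements with the greedy base heuristic
    # (here, GRAA would play that role), and commit the candidate that
    # yields the best completed allocation.
    partial = []
    for i, ms in enumerate(microservices):
        best_node, best_value = None, float("inf")
        for node in nodes:
            if not node.fits(ms):          # resource/delay feasibility check
                continue
            candidate = partial + [place(ms, node)]
            completed = greedy_complete(candidate, microservices[i + 1:], nodes)
            value = objective(completed, w)
            if value < best_value:
                best_node, best_value = node, value
        if best_node is None:
            raise ValueError("No feasible node for microservice")
        partial.append(place(ms, best_node))
    return partial

Since the base heuristic is invoked once per candidate node at every decision, the rollout trades extra computation for solution quality, which is why a fast heuristic such as GRAA is a natural building block for it.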
The remainder of this paper is organized as
follows: In Section 2, we present the related work. In
Section 3, we analyze the considered edge-cloud
infrastructure and the cloud-native applications
workload and formulate the resource allocation
problem as a MILP. In Section 4, we present the
heuristic algorithms developed. In Section 5, we
discuss the simulation results and, finally, in
Section 6, we conclude our work.
2 RELATED WORKS
The resource allocation problem in virtualized
environments is a multi-dimensional research area
that has attracted the interest of the research
community. The modeling of the problem among the
different works varies according to the considered
topology and the adopted technologies, while the
proposed solutions employ techniques from the wider
realm of mathematics and computer science.
The authors in (Li et al., 2018) examine the placement of virtual
machines (VMs) on top of physical systems in a cloud
data center to perform big-data analytics from
Internet of Things (IoT) devices. The infrastructure is
modeled as a graph, where nodes represent VMs, and
links represent the network communication between
them. The aim is to minimize the maximum
utilization across the links, so as to make efficient use of
network resources and avoid congestion. A greedy,
first-fit heuristic algorithm is presented that targets
placing as many interacting VMs as possible on the
same physical systems to minimize communication
costs. The authors in (Kiran et al., 2020) introduce the
VNFPRA problem, which focuses on the optimal
placement of VNFs (Virtualized Network Functions)
in SDN-NFV-enabled Multi-Access Edge Computing
(MEC) nodes with the aim of minimizing the
deployment and resource usage cost. The MEC
topology is modeled with a weighted graph, where
each node corresponds to a MEC node, characterized
by its available resources, while each link is a
network link with a given capacity. The problem is
formulated as a Mixed Integer Programming (MIP)
problem, where the objective function is the total
service cost that considers the placement cost,
resource usage cost and link usage/replication cost.
To tackle the time-consuming process of finding the
optimal solution, the authors propose a genetic-based
heuristic algorithm. In (da Silva & da Fonseca, 2018),
the authors develop an algorithm based on Gaussian
Process Regression to predict future traffic and
minimize request blocking, especially in the case of
time-critical requests. A hierarchical infrastructure
that consists of computing layers (near edge, far edge,
cloud) is considered and the objective is to ensure that
the near/far edge resources are sufficient to
serve future time-sensitive demands.
Shifting our attention to works more closely related to
the one presented in this paper, the authors in (Santoro et
al., 2018) develop “Foggy”, an architectural
framework based on open-source tools that handles
requests from end users in a multi-level
heterogeneous fog/edge environment. The requests
arrive in a FIFO queue, and at each stage, the
available nodes are ranked by their processing power
and their network connectivity towards the end user to select
the best match. The authors in (Mutlag et al., 2021)
propose a dynamic resource scheduling scheme for
critical smart-healthcare tasks in a fog/edge-cloud
topology. Their model consists of a multi-agent