Data Driven Meta-Heuristic-Assisted Approach for Placement of
Standard IT Enterprise Systems in Hybrid-Cloud
Andrey Kharitonov, Abdulrahman Nahhas, Hendrik Müller and Klaus Turowski
Faculty of Computer Science, Otto von Guericke University, Magdeburg, Germany
Keywords:
Commercial-off-the-Shelf Enterprise Applications, Capacity Management as a Service, Hybrid-Cloud.
Abstract:
We address the problem of hybrid-cloud placement selection for commercial off-the-shelf IT enterprise appli-
cations with the sizing done based on workload profiles collected from real-world production systems. The
proposed approach leverages techniques based on evolutionary meta-heuristics with a multi-criteria weighted
sum objective function. A placement decision is made between an on-premises data center and a public cloud,
using real pricing information for virtual machines, storage, and networking published by the public cloud
vendor via automation APIs and on-premises cost estimation as a share of expense per service. Additional ob-
jectives, such as expertise and non-functional requirements, are encoded in a numerical form for the objective
function. The evaluation is performed as single and multi-objective optimization by employing genetic algo-
rithm, and non-dominated-sorting genetic-algorithm-III on the case study of an SAP landscape hybrid-cloud
placement on a selected public cloud with real workload data collected during day-to-day business operations,
indicating the viability of the approach.
1 INTRODUCTION
Standard IT enterprise systems encompass software
solutions designed to fulfill the needs of organizations
of varying size (e.g., enterprise resource planning
(ERP) systems). Specifically, in medium and large
enterprises, such systems are often comprised of mul-
tiple semi-independent sub-systems, such as a vari-
ety of application systems and databases. Infrastruc-
ture refresh, expansion, or transformation involves
a decision-making process on the software systems
placement within a heterogeneous infrastructure con-
sisting of private data centers and public cloud place-
ment options (Missbach et al., 2016).
When simply selecting the cheapest available infrastructure
configuration that fulfills the required key performance
indicators (KPIs) isn't enough, other requirements and
relevant indicators must be considered in the
decision-making process. Furthermore,
the efficiency of applying this process in real life is of
utmost importance. It is also important to avoid de-
veloping a process that resembles a black box for the
user expert or a customer receiving placement recom-
mendations generated by this process.
Therefore, the motivation for this work is a need
to develop an easily explainable process that can be
delivered as a service providing support in a hybrid-
cloud transformation process of standard IT enter-
prise applications, with SAP-based IT infrastructure
as a particular case study. The service is to provide
data-driven decision support on the challenge of iden-
tifying the best-suitable operating environments for
given SAP workloads. Based on an arbitrary num-
ber of customer-specific quantifiable requirements,
existing constraints, measured workloads, resource
demands, and cost estimations, a placement recom-
mendation is provided for each assessed SAP sys-
tem (identified by its system ID, or, in short, SID).
In principle, this is a multi-criteria decision problem,
which can be tackled through appropriate approaches
(e.g., the weighted sum model), also in the area of cloud
provider selection (Chauhan et al., 2020).
The search space of possible target environments
for each SID is formed by the variety of available
placement configurations and regions provided by a
public cloud provider and, if applicable, the cus-
tomer’s on-premises data center. In order to measure
what is "best-suitable", each potential solution is associated
with a score that represents its fulfillment degree
of customer requirements, constraints, and the
resulting costs. Therefore, it is a challenge to identify
solutions that maximize the score.
Due to the complexity of the problem, multiple
solution search approaches for similar problems were
Kharitonov, A., Nahhas, A., Müller, H. and Turowski, K.
Data Driven Meta-Heuristic-Assisted Approach for Placement of Standard IT Enterprise Systems in Hybrid-Cloud.
DOI: 10.5220/0011726600003488
In Proceedings of the 13th International Conference on Cloud Computing and Services Science (CLOSER 2023), pages 139-146
ISBN: 978-989-758-650-7; ISSN: 2184-5042
Copyright © 2023 by SCITEPRESS – Science and Technology Publications, Lda. Under CC license (CC BY-NC-ND 4.0)
proposed in the existing literature, including those
based on algorithms that belong to the class of
meta-heuristics (Rahimi et al., 2022), which we also adopt
within this work. Furthermore, in the presented case
study of Microsoft Azure, infrastructure costs are
calculated based on publicly available list prices for
compute, network, and storage components per sys-
tem component placement. In the case of on-premises
environments, a generic data center cost model is
used and parameterized with customer input to esti-
mate cost summary for system components. In both
cases (cloud and on-premises), required components
are sized based on real customer workloads and re-
source demands. For this purpose, a workload data
collection software measures resource demands and
capacity for the defined set of SAP systems over a
defined period of time. The impact of using the work-
load collector software on the overall system perfor-
mance is negligible.
Essentially, the goal of the approach is to answer
the following questions posed by a customer under-
going SAP landscape transformation:
Q1: Which of my SAP workloads are suitable for
cloud operations, and which are suitable for on-
premises?
Q2: What would be the infrastructure costs to operate
my SAP workloads in a public cloud?
Q3: What’s the most cost-effective distribution of my
SAP workloads across on-premises infrastruc-
ture and public cloud regional data centers?
Q4: What’s the right level of computational resources
at which point in time?
Q5: How does Remote Function Call (RFC) network
traffic between my SAP instances affect
pricing?
2 DATA DRIVEN SIZING
Redesign or transformation of the hybrid-cloud in-
frastructure capacities for the fulfillment of the KPIs
can be done based on the real-world workload pro-
files. As mentioned in the previous section, this work-
load profile can be recorded and used for analysis or
alternatively, performance unit tests can be utilized on
the real system (Horký et al., 2015). In this work,
we concentrate on making use of such workload profiles
for standard off-the-shelf enterprise systems, using SAP
as a specific use case that is well suited to data-driven
capacity sizing (Müller et al., 2022).
Alongside the KPIs, pricing plays a crucial role in
planning the hybrid-cloud infrastructure as it’s neces-
sary to find the balance between the costs of system
placements in private data centers and pricing plans in
considered public clouds. Estimating the costs of pri-
vate data centers is not trivial (Patel and Shah, 2005),
especially in cases of hybrid-cloud component distri-
bution of the same system. In such cases we can rely
on system component-based cost estimations (Brogi
et al., 2019), while relying upon the workload profiles
(CPU, memory, storage, network utilization). Simi-
larly, we can rely on the workload profiles to select an
optimal pricing model (Wu et al., 2019) per system
per cloud placement within a specified planning hori-
zon, with automation facilitated via integration with
the pricing APIs of the public cloud providers.
Pre-calculation of all viable placement combina-
tions for a landscape consisting of hundreds of sys-
tems and databases is computationally and memory
expensive. For each pricing component that depends
on the landscape configuration, a matrix with
dimensions $(|K| \cdot |P|) \times (|K| \cdot |P|)$ would have to be
calculated, where $K$ is a set of systems in the landscape,
and $P$ is a set of viable individual placement
configurations. Instead, such costs, which depend on
the configuration of the landscape, are estimated as
part of the target function of a chosen meta-heuristic
during the evaluation of a solution candidate. It is important
to note that the placement decision cannot be
made for each system component in isolation as exist-
ing interdependencies (e.g., network, storage) affect
not only KPIs but also the pricing. Such interdepen-
dencies can be inferred from the corresponding work-
load profile.
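The trade-off between pre-calculation and on-demand estimation can be illustrated with a minimal Python sketch. It assumes a deliberately simplified pricing rule (a flat price per gigabyte whenever two communicating systems are placed at different locations); the names `Traffic`, `pairwise_matrix_cells`, `network_cost`, and `price_per_gb` are hypothetical and are not part of the described approach or of any vendor pricing API.

```python
from typing import Dict, Tuple

# Hypothetical input: average traffic in GB between pairs of systems,
# as it could be inferred from a recorded workload profile.
Traffic = Dict[Tuple[str, str], float]


def pairwise_matrix_cells(num_systems: int, num_placements: int) -> int:
    """Number of cells in a pre-calculated (|K|*|P|) x (|K|*|P|) matrix."""
    side = num_systems * num_placements
    return side * side


def network_cost(candidate: Dict[str, str], traffic: Traffic,
                 price_per_gb: float) -> float:
    """Estimate the network cost of one solution candidate only.

    Assumption: traffic is charged at a flat rate when two communicating
    systems are assigned to different locations and is free otherwise.
    """
    cost = 0.0
    for (source, target), gigabytes in traffic.items():
        if candidate[source] != candidate[target]:
            cost += gigabytes * price_per_gb
    return cost


traffic = {("SID1", "SID2"): 12.0, ("SID2", "SID3"): 3.5}
candidate = {"SID1": "onprem", "SID2": "cloud-a", "SID3": "cloud-a"}

# Cells required per pricing component if everything were pre-calculated.
print(pairwise_matrix_cells(300, 39))
# Cost evaluated only for the candidate currently being assessed.
print(network_cost(candidate, traffic, 0.02))
```

Evaluating the cost inside the objective function touches only the recorded connections of the candidate at hand, which is why the interdependent cost components do not need to be materialized for every combination.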
3 DECISION OBJECTIVES
The problem discussed in this work is a multi-attribute
decision-making problem (Zanakis et al.,
1998), which we address with an approach similar
to the weighted sum method for optimization (Marler
and Arora, 2010). This means that we consider multiple
objectives, which might conflict with one another.
The simplest example of such conflicting objectives is
the maximization of the infrastructure capacities that
stands in direct opposition to price minimization. In
the real world, the decision objectives are more nu-
merous and intricate. Some of these objectives are
universal from project to project, but some can be
unique functional or non-functional business require-
ments. Such objectives would be defined and quanti-
fied for each transformation or infrastructure upgrade
project individually. Therefore, it’s important to have
a mechanism that supports the addition of an arbitrary
number of new objectives in an intuitive way.
CLOSER 2023 - 13th International Conference on Cloud Computing and Services Science
3.1 Objectives Definition
To illustrate this principle, in Table 1 we provide an
example list of such additional objective categories
with values and weights that were considered within
the context of this work. These objectives were de-
fined during a systematic interviewing process of ex-
perts in the field of SAP infrastructure architecture as
well as the customers undergoing a refresh process of
SAP infrastructure.
We asked the interviewees to provide a tuple
of three numeric values for each requirement:
$(v, w_{onprem}, w_{cloud})$. The first value $v$ denotes the
client’s answer to the stated requirement importance
or an assessment question in a numerical representa-
tion. The values v, in principle, can be denoted on any
numerical scale that is logically sound or convenient
for any given requirement, but the upper and lower
bounds of the scale must be known for later normal-
ization. This means that in the end, every requirement
is quantified in the range of [0, 1], which is selected
for simplicity of calculation as any arbitrary numeri-
cal scale can be easily normalized to this range. If a
requirement can’t be represented as a numerical scale,
it can’t be used in the proposed approach directly and
must be either considered in a post-processing logic
or decomposed to simple quantifiable elements.
Values $w_{onprem}$ and $w_{cloud}$ represent the weights
in the range [0, 1], which denote the importance of
the requirement for the on-premises system placement
solution and a public cloud placement solution,
respectively. This means that every objective
is associated with separate importance weights
for on-premises and cloud. The weights are applied
to the value $v$ by simple multiplication during
the assessment phase of the system
placement. While in our short example,
we operate with two types of weights, this principle
can be intuitively extended to vendor-specific clouds
or different types of private data centers.
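As a small illustration of the quantification described above, the following Python sketch normalizes a value from a known scale to [0, 1] and applies the placement-specific weights by multiplication. The function names are hypothetical; the sample tuple corresponds to the row "EX 2" of Table 1 with the [0, 10] scale used in the example.

```python
def normalize(value: float, lower: float, upper: float) -> float:
    """Map a value from a known scale [lower, upper] to the range [0, 1]."""
    if upper <= lower:
        raise ValueError("upper bound must exceed lower bound")
    return (value - lower) / (upper - lower)


def weighted_value(value: float, lower: float, upper: float,
                   weight: float) -> float:
    """Normalize the answer v and scale it by a placement-specific weight."""
    return normalize(value, lower, upper) * weight


# Tuple (v, w_onprem, w_cloud) for "Setting up cloud SAP landscapes".
v, w_onprem, w_cloud = 5, 0.0, 0.25

print(weighted_value(v, 0, 10, w_onprem))  # contribution for on-premises
print(weighted_value(v, 0, 10, w_cloud))   # contribution for cloud
```

Extending the scheme to further placement types would only require one more weight per tuple.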
The short example of possible objectives provided in
Table 1 consists of four categories, distinguished by
two-letter prefixes. The weights for a specific placement
type within a single category are expected to always
sum to 1. Note that the provided values for the
competencies favor on-premises operations because
the values are based on the interviews with customers
and experts. Every category is also associated with
its own weight, as denoted in Table 2. It is important to
note that these criteria are high-level and not meant to
be representative of real-world project complexity.
In this example, even though the values for all
objectives are presented to the experts and the customers
on a scale with the range [0, 10], the intuition
behind these values differs from category to category.
Values $v$ in "Expertise" determine the level of
expertise that is already present within the organization.
The values in the categories "Non-functional
requirements" and "Business strategy" determine the
importance of the objectives for the customer according
to the specifics of the customer's business goals and
strategies. These values are weighted and considered
directly in the decision process.
All of the cost-related values $v$ are subject to additional
processing and are not considered directly in the
decision process. Instead, the value $v$ denotes an additional
level of importance for the specific cost component
and acts as a scaling factor similar to the weights
but independent of the on-premises or cloud placement
type. While in this example such a value is relatively
redundant, as is the explicit separation of
costs between compute and networking, it demonstrates
how such an approach can also serve as a way to
precisely balance the cost impact on the decision process
for more complex pricing models. These costs for
compute and storage are precalculated as described
in Section 2 for every possible placement type for
each system present in the IT landscape and added
to the set denoted $P^{\gamma}$. Pricing in the set is defined
per system, $P^{\gamma}_{sys} \subset P^{\gamma}$. All prices are normalized to
values in the range [0, 1] and subtracted from 1 to enable
the values to be used as part of the maximization
problem:

$$\forall p \in P^{\gamma}_{sys}: \quad P_p = 1 - \frac{P^{\gamma}_{sys,p} - \min(P^{\gamma}_{sys})}{\max(P^{\gamma}_{sys}) - \min(P^{\gamma}_{sys})}$$

where $P$ is the final set of normalized prices, $P_p \in P$.
Estimations for additional pricing points (e.g., networking,
backups) are processed in a similar manner but normalized
based on the landscape-wide minimum and maximum values.
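The inverted min-max normalization of prices can be sketched in a few lines of Python. The placement names and prices below are invented for illustration, and the handling of the degenerate case in which all prices are equal is an assumption of this sketch, since the text does not cover it.

```python
from typing import Dict


def normalize_prices(prices: Dict[str, float]) -> Dict[str, float]:
    """Min-max normalize prices to [0, 1] and invert them for maximization.

    The cheapest option receives 1.0 and the most expensive one 0.0.
    """
    lowest = min(prices.values())
    highest = max(prices.values())
    if highest == lowest:
        # Assumption: identical prices are all treated as the best score.
        return {name: 1.0 for name in prices}
    span = highest - lowest
    return {name: 1.0 - (price - lowest) / span
            for name, price in prices.items()}


# Hypothetical five-year prices for one system at three placements.
prices = {"onprem": 1000.0, "cloud-west": 1500.0, "cloud-north": 2000.0}
scores = normalize_prices(prices)
print(scores)
```

For landscape-wide pricing points such as networking, the same function could be applied to the prices collected across all systems instead of a single system.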
3.2 Constraints
Not every technically possible solution that satisfies
the business requirements mentioned above is, in fact,
viable. For example, if a company processes sensitive
information (e.g., personal, medical, commercial) it
falls under certain regulations concerning data place-
ment and processing (Hippelainen et al., 2017; Sar-
feraz, 2022). Therefore, if we take any public cloud
provider with data centers located all over the world,
only a subset of these data centers and combinations
of offerings within others will be viable for consideration.
For example, a German company that falls
under such regulations (e.g., GDPR, DSGVO) might
want to consider only data centers within the European
Union. This is a hard constraint that cannot be
broken, as the company would otherwise risk legal
action for non-compliance.
Table 1: Example objective values.

| Designation | Description | v | w_cloud | w_onprem |
|---|---|---|---|---|
| CO 1 | Compute and storage costs | 8 | 0.5 | 0.5 |
| CO 2 | Networking costs | 8 | 0.5 | 0.5 |
| EX 1 | Setting up on-prem SAP landscapes | 10 | 0 | 0.5 |
| EX 2 | Setting up cloud SAP landscapes | 5 | 0.25 | 0 |
| EX 3 | Maintaining on-prem SAP landscapes | 10 | 0 | 0.5 |
| EX 4 | Maintaining cloud SAP landscapes | 5 | 0.25 | 0 |
| EX 5 | Cloud Operations | 5 | 0.25 | 0 |
| EX 6 | Cloud Architecture | 4 | 0.25 | 0 |
| NF 1 | Performance | 8 | 0.2 | 0.8 |
| NF 2 | Scalability & elasticity | 5 | 0.4 | 0.1 |
| NF 3 | Innovation degree | 6 | 0.4 | 0.1 |
| ST 1 | Level of control | 5 | 0.1 | 0.4 |
| ST 2 | Carbon Footprint | 2 | 0.1 | 0.2 |
| ST 3 | OPEX importance | 7 | 0.7 | 0.1 |
| ST 4 | IT impact on business model | 8 | 0.1 | 0.3 |

Table 2: Example objective categories.

| Prefix | Objective category | Weight |
|---|---|---|
| CO | Costs | 0.35 |
| EX | Expertise | 0.25 |
| NF | Non-functional requirement | 0.25 |
| ST | Business strategy | 0.15 |
Such constraints can be included as part of the ob-
jectives and processed as part of the objective function
calculation. However, in this particular case, it would
cause needlessly wasteful use of the computational re-
sources as placement hard constraints can easily be
satisfied by simple preliminary filtering of the pos-
sible placement options before any calculation takes
place. In the aforementioned example, the placement
options are not a whole set of all possible public cloud
provider locations but a subset limited to the region of
the European Union.
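The preliminary filtering of hard constraints can be sketched as follows. This is a minimal Python illustration under stated assumptions: the `PlacementOption` type, its `region` label, and the sample option names are hypothetical and do not correspond to identifiers of any cloud provider.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class PlacementOption:
    name: str
    region: str  # hypothetical region label, e.g. "EU" or "US"


def filter_by_region(options: List[PlacementOption],
                     allowed_regions: Set[str]) -> List[PlacementOption]:
    """Drop every option that violates the hard location constraint."""
    return [option for option in options if option.region in allowed_regions]


options = [
    PlacementOption("onprem-dc", "EU"),
    PlacementOption("cloud-eu-1", "EU"),
    PlacementOption("cloud-us-1", "US"),
]

# Only the remaining options enter the search space of the optimization.
viable = filter_by_region(options, {"EU"})
print([option.name for option in viable])
```

Because the filter runs once before the search starts, no objective function evaluation is spent on candidates that could never be accepted.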
Another type of constraint is a soft constraint.
A soft constraint reflects a strong preference that can
be overturned under the weight of the other objective
values. The most obvious examples of such constraints
are co-location and anti-co-location (Jammal
et al., 2015) of different systems within the
IT landscape. For example, there can be two or more
systems that, due to technical requirements, such as
a high amount of data transferred between them, are
preferably placed together. Conversely, two
systems may each be critical to the business
processes, so that losing both at the same
time would disrupt them; such systems
preferably shouldn't be placed together.
Note that the latter example can also be mitigated by
employing high-availability options on-premises and
in the cloud, but these result in significant cost in-
creases, which might not be warranted by the busi-
ness significance of the system. High-availability op-
tions, where needed, are selected as part of the pre-
configuration of the placement for the specific sys-
tem. Soft constraints are associated with a weight,
which determines the significance of the constraint to
the overall assessment of the selected placement.
3.3 Objective Function
We approach the problem of placement decision in
two variants, which can be referred to as
multi-objective solved as mono-objective and pure
multi-objective (Pires and Baran, 2015). Termed alternatively,
the problem is represented as a single-objective
and as a many-objective problem (Helbig and Engelbrecht,
2013). Furthermore, the discussed objectives
must be represented numerically so that the best
available options among the presented alternatives
can be calculated automatically. Because the ranges
of values can differ between questions,
normalization is performed to bring all objectives
to a common scale. Considering the practical orientation
of the presented approach, simplicity of the objective
function was one of the priorities, as it must be
easily explainable and understandable at all levels of
the decision-making process in real-world enterprises.
Objectives are grouped into a set of categories denoted
$C$. Each $c \in C$ is associated with a weight
$w_c \in W$, and every weight is associated with a placement type,
such as "on-premises" or "cloud" in the use case
discussed in this work: $W = W^{onprem} \cup W^{cloud}$. All
weights within a category per placement type
sum to exactly 1. This means that for every
$c \in C$, the on-premises sum of weights is
$\sum W^{onprem}_c = 1.0$ and the cloud sum is
$\sum W^{cloud}_c = 1.0$.

$$\varsigma(c) = \frac{\sum_{x \in c} \zeta(x_v) \cdot x_w}{|c|} \quad (1)$$

$$\zeta(x) = \begin{cases} x, & x \in \mathbb{R} \\ \varsigma(x), & x \neq \emptyset \end{cases} \quad (2)$$

$$S_p = \sum_{c \in C_p} \varsigma(c) \cdot W^{p}_{c} \quad (3)$$
Finally, the overall weighted score is denoted $S_p$,
as presented in Equation 3, with a recursive system
of Equation 1 and Equation 2. The approach taken
in this work is inspired by the weighted sum
model (Chauhan et al., 2020), which we use as a basis
and modify to better suit the problem discussed in this work.
Through this simple, easily explainable calculation,
we receive a numerical representation that includes
an arbitrary number of functional and non-functional
requirements coupled with the pricing components.
This representation theoretically allows a hierarchical
composition of the requirement representation where
each value in each category can be represented either
as a real number R or as a subset of values with its
own weights. However, because every subset, when
weighted and summed up, is normalized to the range
[0, 1], the further down below in the hierarchy the
value is located, the less relevant these values become
to the final score.
Therefore, the maximum possible score for a system
placement configuration is simply the total number of
top-level categories:
$S^{max}_p = |\{c \mid c \in C \wedge \{r \mid r \in C \wedge c \in r\} = \emptyset\}|$.
The minimum score is simply
$S^{min}_p = 0$. The final scoring for the entire landscape of
multiple systems placed and configured at available
locations is simply the sum of scores acquired for all
these systems. This final summation of scores can
be used for the direct comparison of different
landscape placement configurations and the selection
of the most suitable one according to the price and all
of the specified additional requirements. In this case,
the higher the score is, the better. This approach can
be directly used as an objective function in a variety of
heuristic or meta-heuristic algorithms as a score max-
imization problem.
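The recursive scoring of Equations 1 to 3 can be sketched in Python as shown below. The sketch rests on one reading of Equation 1, namely that the weighted sum over the entries of a category is divided by the number of entries $|c|$; the data layout (a category as a list of value and weight pairs) and all function names are assumptions made for illustration. The sample data uses the cloud weights of the categories NF and ST from Table 1 with values already normalized from the [0, 10] scale, so it covers only a part of the example.

```python
from typing import Dict, List, Tuple, Union

# A value x_v is either a number already normalized to [0, 1] or a nested
# category; an entry pairs the value with its weight x_w.
Value = Union[float, "Category"]
Entry = Tuple[Value, float]
Category = List[Entry]


def zeta(value: Value) -> float:
    """Equation 2: use a real number as is, score a nested category."""
    if isinstance(value, (int, float)):
        return float(value)
    return category_score(value)


def category_score(category: Category) -> float:
    """Equation 1, read here as the weighted sum divided by |c|."""
    if not category:
        raise ValueError("a category needs at least one entry")
    total = sum(zeta(value) * weight for value, weight in category)
    return total / len(category)


def system_score(categories: Dict[str, Category],
                 category_weights: Dict[str, float]) -> float:
    """Equation 3: sum of category scores scaled by the category weights."""
    return sum(category_score(entries) * category_weights[name]
               for name, entries in categories.items())


# Cloud placement, categories NF and ST only (values divided by 10).
nf_cloud = [(0.8, 0.2), (0.5, 0.4), (0.6, 0.4)]
st_cloud = [(0.5, 0.1), (0.2, 0.1), (0.7, 0.7), (0.8, 0.1)]
weights = {"NF": 0.25, "ST": 0.15}

score = system_score({"NF": nf_cloud, "ST": st_cloud}, weights)
print(round(score, 4))
```

The landscape-wide score would then be the sum of `system_score` over all systems for their selected placements.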
However, there is a variety of algorithms that are
suitable for solution search in a multi-objective fashion.
In this case, the mechanism described above can
still be applied, but instead of calculating the final
score, we operate with the top-level requirement
categories as multiple objectives directly, with the
corresponding weights and without summing them up. We simply
select all categories $C' \subseteq C$ that aren't children of
any other category, as described in Equation 4, and then apply
the appropriate weights, as seen in Equation 5.

$$C' = \{c \mid c \in C \wedge \{r \mid r \in C \wedge c \in r\} = \emptyset\} \quad (4)$$

$$\phi = \{\varsigma(x) \cdot W_x \mid x \in C'\} \quad (5)$$
Furthermore, the scores mentioned above in
Equation 3 and Equation 5 are calculated per system
in the landscape. For the landscape-wide final score,
a sum of the scores is taken after the placement locations
are selected, $P' \subseteq P$, and the systems are assigned to
these placements, $P' \rightarrow K$ and
$K_p = \{k_p \mid k \in K \wedge k \in P'\}$.
Constraints, if defined, are calculated based on the
entire landscape placement. For each individual soft
constraint, a percentage of satisfaction is calculated
and then normalized to the range [0, 1]. After all of the
constraint satisfaction values are calculated for each
constraint type, a weighted average of these values
is calculated and used for further assessment together
with the scoring function. The resulting value is used
as a multiplier for the calculated score in the single-objective
variant, effectively scaling the resulting score
by the percentage of soft constraint satisfaction. In
the multi-objective variant, this value becomes just an
additional objective.
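A minimal Python sketch of the soft constraint handling follows. It covers only co-location pairs, and the way satisfaction is measured (the share of pairs placed at the same location) is an assumption of this sketch; all function names are hypothetical.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]


def colocation_satisfaction(assignment: Dict[str, str],
                            pairs: List[Pair]) -> float:
    """Share of co-location pairs placed at the same location, in [0, 1]."""
    if not pairs:
        return 1.0
    satisfied = sum(1 for first, second in pairs
                    if assignment[first] == assignment[second])
    return satisfied / len(pairs)


def constraint_multiplier(satisfactions: List[float],
                          weights: List[float]) -> float:
    """Weighted average of the satisfaction values per constraint type."""
    total_weight = sum(weights)
    if total_weight == 0:
        return 1.0
    weighted = sum(value * weight
                   for value, weight in zip(satisfactions, weights))
    return weighted / total_weight


assignment = {"SID1": "onprem", "SID2": "onprem", "SID3": "cloud-a"}
pairs = [("SID1", "SID2"), ("SID2", "SID3")]

satisfaction = colocation_satisfaction(assignment, pairs)
multiplier = constraint_multiplier([satisfaction], [1.0])

# Single-objective variant: the landscape score is scaled by the multiplier.
landscape_score = 10.0
print(landscape_score * multiplier)
```

In the multi-objective variant, `multiplier` would instead be appended to the list of objective values.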
4 OPTIMAL PLACEMENT
Capacity planning and placement of the landscape
that consists of just a couple of systems with pre-
calculated sizing and costs for the compute and stor-
age components of the VMs and just a few possi-
ble placement options is a straightforward problem
that can be solved analytically. However, decision-making
becomes more complicated for more complex
landscapes with a variety of different placement options
and requirements, and the problem then falls into the
category of NP-hard problems (Bichler
et al., 2006). The total number of solutions depends
on the number of placement options in set $P$ and the
services in set $K$ and equates to $|P|^{|K|}$ (Hyser et al.,
2007).
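To give a sense of scale, the size of the search space can be computed directly. The figures used below are an assumption derived from the evaluation setup described later in this work: 38 cloud placement configurations plus a single on-premises option per system, giving 39 options, for landscapes of 9 and 30 systems.

```python
def search_space_size(num_placements: int, num_systems: int) -> int:
    """Total number of landscape configurations, |P| ** |K|."""
    return num_placements ** num_systems


# Assumed 39 options per system: 38 cloud configurations and 1 on-premises.
small_landscape = search_space_size(39, 9)
large_landscape = search_space_size(39, 30)

print(f"{small_landscape:.3e}")
print(f"{large_landscape:.3e}")
```

Even the smaller landscape yields far more configurations than could be enumerated together with their interdependent cost components, which motivates the heuristic search.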
To simplify and speed up the process of decision-
making, we can employ meta-heuristic optimization
algorithms that are known to work well with objec-
tive maximization problems based on objective val-
ues similar to those described in the previous section.
As an example and to validate the scoring function,
we employ a classic meta-heuristic genetic algorithm
(GA), which is traditionally employed for solving op-
timization problems (Back, 1996) for a single objec-
tive function and a Non-Dominated-Sorting-Genetic-
Algorithm-III or, in short, NSGA-III (Deb and Jain,
2014), which is well suited to multi-objective
optimization.
We select the classic genetic algorithm meta-heuristic
for the single-objective problem solution due
to its simplicity and its principle, inspired by
evolutionary theory, which is easy to understand and
implement. The GA meta-heuristic is a technique for solution
search in a problem search space within a number
of generations, directed by a so-called fitness function
that determines the quality of the solution. Solution
candidates in this class of algorithms are named
individuals. A large number of GA variations exist,
but in principle, they share the same four main
functions: initialization, mutation, crossover, and selection.
Most of the hyperparameters are also common
across many GA variations (e.g., population size,
number of evolutionary generations, crossover rate).
We encode individuals, or solution candidates, as
a sequence of whole numbers with a length of |K|,
where K is a set of systems in the landscape. Each
number in the sequence has a range of (0, |P|] where
each number corresponds to an index of a possible
placement configuration specific to the system,
$p_k \in P_k$. In our encoding, every system is allowed to be
placed on every considered on-premises or cloud lo-
cation. If a specific placement isn’t valid for the spe-
cific system, we invalidate the entire individual. Ev-
ery individual is then evaluated directly with a single-
objective function discussed in subsection 3.3.
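The encoding can be illustrated with a short Python sketch. It assumes that placement indices run from 1 to $|P|$, matching the range $(0, |P|]$ given above, and that per-system validity is available as a set of allowed indices; the function names and sample data are hypothetical.

```python
import random
from typing import Dict, List, Sequence, Set


def random_individual(num_systems: int, num_placements: int,
                      rng: random.Random) -> List[int]:
    """One whole number per system, each an index in 1..|P|."""
    return [rng.randint(1, num_placements) for _ in range(num_systems)]


def is_valid(individual: Sequence[int], systems: Sequence[str],
             valid_placements: Dict[str, Set[int]]) -> bool:
    """An individual is invalid if any system has a disallowed placement."""
    return all(gene in valid_placements[system]
               for system, gene in zip(systems, individual))


def decode(individual: Sequence[int], systems: Sequence[str],
           placements: Sequence[str]) -> Dict[str, str]:
    """Translate placement indices back into named placements."""
    return {system: placements[gene - 1]
            for system, gene in zip(systems, individual)}


systems = ["SID1", "SID2"]
placements = ["onprem", "cloud-a", "cloud-b"]
valid_placements = {"SID1": {1, 2, 3}, "SID2": {1, 3}}

individual = [1, 3]
print(decode(individual, systems, placements))
print(is_valid(individual, systems, valid_placements))
print(is_valid([2, 2], systems, valid_placements))
```

An invalid individual could, for example, be assigned the minimum score so that it is discarded during selection; this handling is a design choice of the sketch rather than something specified in the text.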
For multi-objective solution search, we encode a
solution individual in the same way, but the evalua-
tion function used is now based on Equation 5, and
every value is considered as a separate objective. In
our example based on NSGA-III, all of the objective
values are considered equally important by the meta-
heuristic itself, but the category weights mentioned in
subsection 3.1, are applied as part of the aforemen-
tioned equation.
The key parameters in the family of genetic algo-
rithm meta-heuristics are common for all algorithms.
First is the maximum number of generations, denoted
maxGen, controlling the number of iterations the ge-
netic algorithm goes through. Next is the size of the
population, denoted pop, which controls the number
of solution individuals within every generation. The
solution individual mutation probability, $prob_{mut}$,
determines the probability of random changes in the
solution individual. Finally, $\lambda$ is the total number
of children produced during recombination (partially
combining different solution individuals).
Many different approaches exist for the selection of
the best individual within the generated population
(Kruse et al., 2011), but in this work, we rely specifi-
cally on tournament selection.
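How these parameters interact can be shown with a compact, self-contained GA sketch in Python. Only the parameters (`max_gen`, `pop`, `prob_mut`, `lam`) and the use of tournament selection follow the text; the one-point crossover, the tournament size of three, the elitist survivor selection, and the toy fitness function are assumptions made to keep the example runnable, not the configuration used in the evaluation.

```python
import random
from typing import Callable, List

Individual = List[int]
Fitness = Callable[[Individual], float]


def tournament(population: List[Individual], fitness: Fitness,
               size: int, rng: random.Random) -> Individual:
    """Return the fittest of `size` randomly drawn individuals."""
    contenders = rng.sample(population, min(size, len(population)))
    return max(contenders, key=fitness)


def crossover(first: Individual, second: Individual,
              rng: random.Random) -> Individual:
    """One-point crossover (an assumed recombination operator)."""
    if len(first) < 2:
        return list(first)
    point = rng.randint(1, len(first) - 1)
    return first[:point] + second[point:]


def mutate(individual: Individual, num_placements: int,
           prob_mut: float, rng: random.Random) -> Individual:
    """Replace each gene by a random placement index with prob_mut."""
    return [rng.randint(1, num_placements) if rng.random() < prob_mut
            else gene for gene in individual]


def run_ga(fitness: Fitness, num_systems: int, num_placements: int,
           max_gen: int = 50, pop: int = 20, prob_mut: float = 0.1,
           lam: int = 20, seed: int = 0) -> Individual:
    rng = random.Random(seed)
    population = [[rng.randint(1, num_placements)
                   for _ in range(num_systems)] for _ in range(pop)]
    for _generation in range(max_gen):
        children = []
        for _child in range(lam):
            parent_a = tournament(population, fitness, 3, rng)
            parent_b = tournament(population, fitness, 3, rng)
            child = crossover(parent_a, parent_b, rng)
            children.append(mutate(child, num_placements, prob_mut, rng))
        # Assumed survivor selection: keep the `pop` fittest individuals.
        population = sorted(population + children,
                            key=fitness, reverse=True)[:pop]
    return max(population, key=fitness)


def prefer_first_option(individual: Individual) -> float:
    """Toy fitness: counts systems placed at option 1."""
    return float(sum(1 for gene in individual if gene == 1))


best = run_ga(prefer_first_option, num_systems=5, num_placements=4)
print(best, prefer_first_option(best))
```

In the described approach, the toy fitness would be replaced by the landscape-wide score from subsection 3.3, and a multi-objective algorithm such as NSGA-III would receive the vector of category scores instead of a single value.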
5 EVALUATION
The presented approach to answering the five questions
stated in section 1 is evaluated with a real-world
use case based on a partial cloud transformation of
an SAP landscape with target placement environments
located on-premises and in the public cloud
infrastructure of Microsoft Azure. We consider an example
IT landscape consisting of nine SAP systems, each
identified by a System ID (SID), some of which are
independent of others, and some have Remote Func-
tion Call (RFC) network connections recorded in the
workload profiling data. These interconnections are
illustrated in Figure 1. The volume of data transfer
is taken as an average per hour within the recorded
month-long workload profile, and this value is used
for network cost estimations.
Figure 1: Validation example network connections (systems SID 1 to SID 9).
Costs for on-premises and public cloud placement
options are estimated for a planning horizon of
five years. Placement options are generated per system
component based on the recorded workload
profiles and are viable capacity-wise. No potential
workload profile changes or pricing changes are considered.
For cloud-based placements, only the data
centers within the European Union were considered.
In total, 38 cloud placement configuration permutations
per SID distributed across 19 locations were
considered. Only SAP-certified Azure VMs were
considered. For on-premises, we considered a
single location with a capacity limit, which we use
as the basis for the on-premises cost calculation; the limit
is formed by the measured workload plus a parameterized
buffer of 35% for both CPU and main memory.
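The buffered on-premises sizing amounts to simple arithmetic, sketched below in Python. The `Demand` type, its units, and the sample figures are hypothetical; how the measured workload is aggregated before the buffer is applied (for example, as a peak value) is not specified here and is left to the caller.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Demand:
    cpu_units: float   # hypothetical unit of compute demand
    memory_gb: float   # main memory demand in GB


def with_buffer(measured: Demand, buffer_share: float = 0.35) -> Demand:
    """Add a parameterized buffer to both CPU and main memory demand."""
    factor = 1.0 + buffer_share
    return Demand(measured.cpu_units * factor, measured.memory_gb * factor)


# Invented measurement for a single system.
measured = Demand(cpu_units=1000.0, memory_gb=256.0)
sized = with_buffer(measured)
print(sized)
```

Keeping the buffer as a parameter allows the same calculation to be reused with a customer-specific safety margin.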
Pricing for compute resources and an adequately sized
storage of the appropriate type for SAP systems
at the given horizon is estimated, without considering
networking costs for cloud placements, to be on average
16% cheaper for eight out of nine systems placed
on-premises in comparison to the same systems in
cloud placements, including electricity and the cost share
per system for additional expenses (e.g., facility
maintenance, IT staff, networking equipment). The only
system with estimated running costs cheaper in
the cloud is SID 4, with cloud placements being
significantly cheaper, 29% on average.
We investigate two use cases, with and without
constraints. If constraints are present, they are introduced
to steer the optimization algorithm toward
solutions that reduce networking
costs faster by co-locating communicating systems.
Co-location constraints are defined in pairs of systems
with a weight of 1.0.
Both single- and multi-objective experiments are
executed with the weights and additional objective
values specified in subsection 3.1. Due to lower
pricing and additional objectives favoring
on-premises operations, we received the expected results,
which favor on-premises placement for all systems,
including SID 4 due to the network
bandwidth costs. From this, we can directly infer the
answers to Q1-Q5.
This result was achieved within 700 generations
with a population size of 200. The best placement
result for both single-objective experiments is
found on average at generation 428 within 20 experi-
mental runs, while multi-objective NSGA-III required
on average 683 generations to achieve the expected
outcome.
The presence or absence of placement co-location
constraints made no difference in the single-objective
scenario, as the increased expenses for the estimated
network bandwidth already lowered the
objective function's result. However, the multi-objective
scenario without constraints required more generations
to achieve the same results as the scenario
with constraints.
Changing the balance of the category weights presented in
Table 2 to favor expertise over price, with the CO weight
reduced to 0.10 and the EX weight increased to 0.50, also
resulted in an expected final landscape placement outcome
that favors on-premises for all services, because
the additional objectives specified in this specific case
study favor on-premises placement options.
Furthermore, Table 1 shows that the on-premises
configuration is favored by the scoring, and in our
example the pricing is also cheaper on-premises for
most systems. In addition to the aforementioned
evaluation runs, we conducted a series of experiments
to see whether increasing the on-premises pricing
without changing the scoring would affect the
decision-making within realistic price ranges. The
desired re-allocation favoring on-cloud placement
was achieved by increasing the pricing of the
on-premises placement by 41% on average. This
indicates a strong influence of the customer
requirements in the objective function; hence, the
algorithms are not guided strictly by the pricing
difference. It is important to note that the systems
with high volumes of network communication were
also successfully placed in the cloud within the same
locations, thereby avoiding high bandwidth costs
between regions or between on-premises and on-cloud
placement options.
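The price-sensitivity experiment can be sketched as a simple sweep
that raises the on-premises price until the cloud option obtains the
better objective value. The objective below is a simplified
two-term weighted sum chosen for illustration; the weights, prices,
and requirement scores are hypothetical, and the sketch does not
attempt to reproduce the reported 41%.

```python
def objective(price, requirement_score, price_ref, w_price=0.4, w_req=0.6):
    """Lower is better: normalized price plus a penalty for unmet requirements."""
    return w_price * (price / price_ref) + w_req * (1.0 - requirement_score)

def breakeven_markup(onprem_price, cloud_price, onprem_req, cloud_req, limit=200):
    """Smallest on-premises price increase (whole percent) at which cloud wins.

    Returns None if the cloud option does not win within `limit` percent.
    """
    cloud_value = objective(cloud_price, cloud_req, price_ref=cloud_price)
    for percent in range(limit + 1):
        marked_up = onprem_price * (1 + percent / 100)
        onprem_value = objective(marked_up, onprem_req, price_ref=cloud_price)
        if onprem_value > cloud_value:
            return percent
    return None

# Hypothetical system: on-premises is 5% cheaper than the cloud option.
with_requirements = breakeven_markup(950.0, 1000.0, onprem_req=0.9, cloud_req=0.7)
price_only = breakeven_markup(950.0, 1000.0, onprem_req=0.7, cloud_req=0.7)
print("break-even markup with requirement advantage:", with_requirements, "%")
print("break-even markup with equal requirement scores:", price_only, "%")
```

When both options satisfy the requirements equally, a small markup is
enough to flip the decision; a requirement advantage for on-premises
pushes the break-even markup much higher, which is the kind of
influence the experiment above points to.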
We conducted the second set of evaluations on a
larger IT landscape consisting of 30 SAP systems.
This landscape is more heterogeneous in its
composition, with 20% of the systems estimated to be
significantly cheaper when placed in the cloud over
the planning horizon of five years. For the rest of
the systems, the pricing was almost identical to the
first example, with an average margin of difference
of 2%. The additional objective values and weights
were the same as in the previous example. However,
network traffic was not recorded for this landscape,
so we relied on expert estimation of the possible
bandwidth and system interconnections based on the
nature of the systems. Only 6 out of 30 systems were
designated as interconnected.
The results of this evaluation were consistent with
those obtained from the data representing the smaller
landscape. However, the larger problem instance
required a greater computational effort from both of
the employed algorithms. For GA, the best results were
obtained with the population size increased to 320 and
the number of generations scaled up to 1200. NSGA-III
was able to find a suitable solution with a population
size of only 260 and within 950 generations. These
settings are still well within the realm of tolerable
computational effort, indicating the potential of the
described approach to scale with the size of the
landscape.
6 FUTURE WORK
Further investigation is planned on the addition of
more constraint types, specifically those of a
conflicting nature. A mechanism for resolving
constraint conflicts should be investigated.
Furthermore, workload forecasting based on the
workload profile recorded before placement is an
interesting direction for predicting required
capacities according to possible changes in the
business environment. Lastly, incorporating a
bin-packing type of problem for the on-premises
placement within the same placement selection process
might reduce potential inaccuracies in the cost
estimations.
7 CONCLUSION
In this work, we present a simple-to-explain
data-driven approach for processing quantifiable
requirements and pricing components to select the
most suitable placement for commercial off-the-shelf
IT enterprise applications. The case study is based on
SAP, with sizing performed prior to placement based on
the recorded real-world system workload profile. We
note that this problem can be formulated in both a
single-objective and a multi-objective way, which
allows for the potential use of various optimization
algorithms. The validity of the approach is evaluated
with evolutionary meta-heuristics, and the selected
algorithms were able to find a suitable solution while
taking the pricing, the requirements, and the
considered constraints into account. We also note that
using explicit constraints to facilitate the
co-location of interconnected services leads to faster
discovery of a more suitable placement than relying
only on the implicit increased costs. The approach
discussed in this work is suitable for IT landscapes
of varying size. However, the multi-objective NSGA-III
suffers a noticeably smaller performance degradation
on larger problems than the single-objective GA.
REFERENCES
Back, T. (1996). Evolutionary Algorithms in Theory and
Practice: Evolution Strategies, Evolutionary Pro-
gramming, Genetic Algorithms. Oxford University
Press, USA, Oxford.
Bichler, M., Setzer, T., and Speitkamp, B. (2006). Capacity
Planning for Virtualized Servers. Workshop on Infor-
mation Technologies and Systems (WITS), Milwau-
kee, Wisconsin, USA, 2006.
Brogi, A., Corradini, A., and Soldani, J. (2019). Estimat-
ing costs of multi-component enterprise applications.
Formal Aspects of Computing, 31(4):421–451.
Chauhan, N., Agarwal, R., Garg, K., and Choudhury, T.
(2020). Redundant IaaS cloud selection with
consideration of multi criteria decision analysis.
Procedia Computer Science, 167:1325–1333. International
Conference on Computational Intelligence and Data
Science.
Deb, K. and Jain, H. (2014). An evolutionary many-
objective optimization algorithm using reference-
point-based nondominated sorting approach, part i:
Solving problems with box constraints. IEEE Trans-
actions on Evolutionary Computation, 18(4):577–
601.
Helbig, M. and Engelbrecht, A. P. (2013). Analysing
the performance of dynamic multi-objective
optimisation algorithms. In IEEE Congress on Evolutionary
Computation, pages 1531–1539. IEEE.
Hippelainen, L., Oliver, I., and Lal, S. (2017). Towards
dependably detecting geolocation of cloud servers.
pages 643–656. Springer, Cham.
Horký, V., Libič, P., Marek, L., Steinhauser, A., and Tůma,
P. (2015). Utilizing performance unit tests to increase
performance awareness. In John, L. K., editor,
Proceedings of the 6th ACM/SPEC International
Conference on Performance Engineering, pages 289–300,
New York, NY. ACM.
Hyser, C., McKee, B., Gardner, R., and Watson, B. J.
(2007). Autonomic virtual machine placement in the
data center. In Hewlett Packard Laboratories, Tech.
Rep. HPL-2007-189, volume 189.
Jammal, M., Kanso, A., and Shami, A. (2015). High
availability-aware optimization digest for applications
deployment in cloud. In 2015 IEEE International
Conference on Communications (ICC), pages 6822–
6828.
Kruse, R., Borgelt, C., Braune, C., Mostaghim, S., and
Steinbrecher, M. (2011). Computational intelligence:
a methodological introduction. Springer.
Marler, R. T. and Arora, J. S. (2010). The weighted
sum method for multi-objective optimization: new in-
sights. Structural and Multidisciplinary Optimization,
41(6):853–862.
Missbach, M., Staerk, T., Gardiner, C., McCloud, J., Madl,
R., Tempes, M., and Anderson, G. (2016). SAP on
the Cloud. Management for Professionals. Springer
Berlin Heidelberg, Berlin, Heidelberg, 2nd ed. 2016
edition.
Müller, H., Kharitonov, A., Nahhas, A., Bosse, S., and
Turowski, K. (2022). Addressing IT capacity management
concerns using machine learning techniques. SN Computer
Science, 3(1):1–15.
Patel, C. D. and Shah, A. J. (2005). Cost model for plan-
ning, development and operation of a data center. In
Hewlett-Packard Laboratories Technical Report, vol-
ume 107, pages 1–36.
Pires, F. L. and Baran, B. (2015). A virtual machine
placement taxonomy. In 2015 IEEE/ACM 15th In-
ternational Symposium on Cluster, Cloud and Grid
Computing, pages 159–168, Los Alamitos, California.
Conference Publishing Services, IEEE Computer So-
ciety.
Rahimi, M., Jafari Navimipour, N., Hosseinzadeh, M.,
Moattar, M. H., and Darwesh, A. (2022). Toward the
efficient service selection approaches in cloud com-
puting. Kybernetes, 51(4):1388–1412.
Sarferaz, S. (2022). Data protection and privacy. In
Compendium on Enterprise Resource Planning, pages
499–513. Springer, Cham.
Wu, C., Buyya, R., and Ramamohanarao, K. (2019). Cloud
pricing models: Taxonomy, survey, and interdisci-
plinary challenges. 52(6).
Zanakis, S. H., Solomon, A., Wishart, N., and Dublish, S.
(1998). Multi-attribute decision making: A simulation
comparison of select methods. European Journal of
Operational Research, 107(3):507–529.
CLOSER 2023 - 13th International Conference on Cloud Computing and Services Science