Migrating to the Cloud
A Software Product Line based Analysis
Jesús García-Galán¹, Omer Rana², Pablo Trinidad¹ and Antonio Ruiz-Cortés¹
¹ETS Ingeniería Informática, Universidad de Sevilla, Seville, Spain
²School of Computer Science & Informatics, Cardiff University, Cardiff, U.K.
Keywords:
Cloud, IaaS, Variability Management, Feature Model, Automated Analysis, AWS, Modelling.
Abstract:
Identifying which part of a local system should be migrated to a public Cloud environment is often a difficult
and error prone process. With the significant (and increasing) number of commercial Cloud providers, choos-
ing a provider whose capability best meets requirements is also often difficult. Most Cloud service providers
offer large amounts of configurable resources, which can be combined in a number of different ways. In the
case of small and medium companies, finding a suitable configuration with the minimum cost is often an es-
sential requirement to migrate, or even to initiate the decision process for migration. We interpret this need as
a problem associated with variability management and analysis. Variability techniques and models deal with
large configuration spaces, and have been proposed previously to support configuration processes in industrial
cases. Furthermore, this is a mature field which has a large catalog of analysis operations to extract valuable
information in an automated way. Some of these operations can be used and tailored for Cloud environments.
We focus in this work on Amazon Cloud services, primarily due to the large number of possible configurations made available by this service provider, and its popularity. Our approach can also be adapted to other providers
offering similar capabilities.
1 INTRODUCTION
Infrastructure as a Service (IaaS) enables the dy-
namic provisioning of computational & data re-
sources (often on-demand), as an alternative
to private and expensive data centers. Recently, a
number of companies are deciding whether to mi-
grate their internal systems to a Cloud environment,
or to utilise Cloud-based infrastructure directly for
building and deploying new systems. Among other
benefits, IaaS reduces costs (for short term work-
loads), speeds up the start-up process for many com-
panies, and decreases resource and power consump-
tion. However, there are a number of providers currently available; Cloud Harmony (clo, ) identifies over 100 public Cloud providers on the mar-
ket. As each user/company intending to make use of
Cloud computing infrastructure is likely to have their
own (specific) requirements, a process is necessary to
identify the most relevant provider and subsequently
a suitable configuration that must be used on a par-
ticular provider. In this work we focus on the sec-
ond of these requirements, i.e. better understanding
whether a requirement of a user/company can be met
through the offerings of a particular provider.
The IaaS configuration process is often challenging
due to the variety of possible options that many Cloud
providers offer (at different prices). Cloud providers
offer highly configurable IaaS capabilities, like com-
puting instance size, operating system, database type
or storage features (and recently, availability of spe-
cialist accelerators). Therefore, users have to under-
stand and navigate through a very large configura-
tion space to identify whether their particular require-
ments are likely to be met. For instance, Amazon Web
Services (AWS) provides, just for computing services
(EC2), 1758 different configurations (additional information at https://dl.dropbox.com/u/1019151/addinf.pdf). Identifying
compatibility within such a large space of possible op-
tions is a tedious and error-prone task. Although on-
line configuration and cost estimation tools are pro-
vided such as the Amazon.com calculator (ama, ),
they do not let the users specify their preferences to
find a suitable configuration. Often within such sys-
tems a user needs to look through the various pos-
sible offerings and assess whether the capability is
suitable (or not) for their own requirements. Other
more user friendly approaches are also available, such
as (Khajeh-Hosseini et al., 2011), which makes use
of a spreadsheet for assessing suitability of a particu-
lar provider and possible configuration options. Such
an approach relies on a dialogue between the various
individuals within a company responsible for decid-
ing what and when to migrate to the Cloud, involving
managers, system administrators, etc. The assessment
outcome is, however, produced manually, based on the result of this people-based interaction.
We consider an alternative approach in this work,
based on the observation that management of large
configuration spaces is a common task in the domain
of the Software Product Lines (SPLs). SPLs are fam-
ilies of related software products (or configurations),
which share a set of common assets. Variability mod-
els, and specifically Feature Models (FMs), are an
extended way to represent commonalities and differ-
ences of SPLs in terms of functional features. They
can also include additional information, represented
as a set of attributes, in which case the model is known as an Ex-
tended Feature Model (EFM). In addition to repre-
senting variabilities, FMs also contain valuable infor-
mation about the configuration space they depict, e.g.,
the set of configurations they represent or the optimal
configuration given certain criteria. This information
extraction has been defined and organized as a catalog of analysis operations, known as the Automated Analysis of Feature Models (AAFM).
We propose applying Software Product Line-
based techniques to the IaaS configuration process,
in order to overcome some of the challenges asso-
ciated with determining whether a particular Cloud
provider is able to meet a customer requirement. We
model several services of AWS as an FM, focusing on this particular provider because it offers the widest range of configuration options and has the largest user community of any Cloud provider. We
have also identified analysis operations of the AAFM
to automate the search of suitable AWS configura-
tions. Although works like (Dougherty et al., 2012) or
(Schroeter et al., 2012) propose to model Cloud sce-
narios as FMs, this is the first work to use AAFM to
support a user-driven IaaS configuration process. A
prototype, the associated analysis operations, a case
study and the outcome of analysis are presented in
this paper.
Approaching the IaaS configuration process from
a SPL perspective has several benefits. The use of
EFMs provides us with a compact and easy to man-
age representation of the whole configuration space.
It also allows the user to choose by means of abstract
features and non-functional attributes, instead of se-
lecting specific AWS features. For instance, a user
would only need to specify a Linux-based operating
system, without identifying a particular distribution (if this is the only requirement he has); at present, the user would need to choose a specific distribution, such as Suse or RedHat. Moreover, the anal-
ysis operations of AAFM give us support to validate
user choices and to determine suitable configuration
options. Alternatives to FMs also exist, such as the
use of spreadsheets (Khajeh-Hosseini et al., 2012), a
relational database or the development of a model in
the Unified Modelling Language (UML). Both SQL
queries and spreadsheets enable the representation of
costs depending on the required features. They are
a similar approach to the Amazon calculator (ama, ).
Given a configuration, we could identify the associ-
ated cost and other related information. However, the
opposite (and more useful) process is not possible,
i.e. given constraints about functional features and
non-functional properties (based on the requirements
of a user), obtaining suitable configurations given the
feature set and configuration options available. UML
could be an alternative to model the AWS scenario,
using class diagrams. However, UML is not oriented
to variability modelling and information extraction,
and also lacks analysis operations such as those pro-
vided in AAFM.
The rest of the paper is structured as follows: Sec-
tion 2 briefly describes the various Cloud services that
Amazon provides, while in Section 3 we present these
services modelled as a FM. Section 4 describes the
process we propose to extract information from the
AWS model. A case study is presented in Section 5 to
demonstrate how our proposed approach can be used
in practice. Related work is described in Section 6,
and Section 7 concludes the paper with a discussion
of lessons learned and possible future work.
1.1 SPL Background
SPLs (Clements and Northrop, 2002) are a software
engineering and development paradigm focused on
re-utilization and cost optimisation of software prod-
ucts. In essence, a SPL is a family of related soft-
ware products which share a set of common assets
(capabilities), but each of which can also include a
number of variations. It may be viewed as a way to
achieve mass customization, the next step after mass
production in the software industry. SPLs have been ap-
plied to multiple industrial experiences and research
(Rabiser et al., 2010), including Cloud environments
(Dougherty et al., 2012). Assets, also named fea-
tures, of a SPL are represented using variability models.
Figure 1: Example of a FM.
The most common variability model used in SPL
is the FM, proposed in 1990 by K. Kang (Kang et al.,
1990). A FM is a tree-like data structure in which
each node represents a feature. All nodes taken to-
gether represent all the possible types of products
(also named configurations or variants) of the SPL.
However, not every feature has to correspond to a specific functionality. There may be abstract features (Thum et al., 2011), which represent domain decisions rather than concrete functionalities made available within a particular product. A FM may also contain attributes linked to features, representing non-
functional properties. Figure 1 is a FM that represents
general features of an IaaS provider. ComputingIaaS
is the root node, which has two children, an optional
feature (white circle) named Storage, and a manda-
tory feature (filled/black circle) named OS. Both fea-
tures present group relationships, but Storage has an
Or relationship, where one or more features should
be selected, while OS has an Alternative relationship,
where exactly one child must be selected.
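To make the semantics of these relationships concrete, the following sketch (our own illustration in Java, not part of the FODA notation or of any tool) encodes the FM of Figure 1 and checks a configuration against the optional, mandatory, Or and Alternative rules just described. Since the text does not name the leaves below Storage and OS, placeholder feature names are used for them.

    import java.util.Arrays;
    import java.util.Set;

    // Minimal sketch: the FM of Figure 1 as explicit configuration rules.
    // Leaf names below Storage and OS are placeholders.
    public class Figure1FM {

        static boolean isValid(Set<String> cfg) {
            boolean root = cfg.contains("ComputingIaaS");
            boolean storage = cfg.contains("Storage");   // optional child of the root
            boolean os = cfg.contains("OS");             // mandatory child of the root
            long storageChildren = count(cfg, "StorageOptA", "StorageOptB");
            long osChildren = count(cfg, "OsOptA", "OsOptB");

            if (!root) return cfg.isEmpty();                     // only the empty configuration lacks the root
            if (!os) return false;                               // mandatory relationship: OS goes with the root
            if (storage && storageChildren < 1) return false;    // Or group: one or more children of Storage
            if (!storage && storageChildren > 0) return false;   // a child cannot appear without its parent
            return osChildren == 1;                              // Alternative group: exactly one child of OS
        }

        private static long count(Set<String> cfg, String... names) {
            return Arrays.stream(names).filter(cfg::contains).count();
        }

        public static void main(String[] args) {
            System.out.println(isValid(Set.of("ComputingIaaS", "OS", "OsOptA")));                           // true
            System.out.println(isValid(Set.of("ComputingIaaS", "OS", "OsOptA", "Storage")));                // false
            System.out.println(isValid(Set.of("ComputingIaaS", "OS", "OsOptA", "Storage", "StorageOptA"))); // true
        }
    }

A SAT or CSP encoding, as used later in the paper, expresses exactly these rules as clauses or constraints rather than imperative checks.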
FMs contain valuable information about the con-
figurations and the whole product line. From the
FM of a SPL we can deduce a number of possible
outcomes, such as the total list of possible products,
the set of common features of every product, the set
of products that meet given criteria and the product with the minimum associated cost. However, analysing a FM manually is a tedious and error-prone task, and even automated analysis can become computationally intractable for large FMs. Hence, various research efforts have focused on the AAFM (Benavides et al., 2005) (Benavides et al., 2010), using modelling and constraint programming techniques.
Various analysis operations have been proposed and
implemented since 2005 (Benavides et al., 2010), sev-
eral of them usable within our approach.
2 AMAZON WEB SERVICES
Amazon Web Services (AWS, http://aws.amazon.com/products/) provides a number of
capabilities, such as the ability to execute and store
software/data in the Amazon Cloud, on a pay-per-use basis. Amazon generally provides a per in-
stance price and has recently also started to offer a
“spot” price option (to make better use of spare ca-
pacity). Although in the last few years the number
and types of Cloud service providers has increased
(Rackspace/OpenStack, Flexiant, Microsoft Azure
are good examples), AWS still has the largest, most
configurable & mature services available on the mar-
ket – especially for IaaS Clouds. For this reason, sev-
eral PaaS and SaaS providers, like Heroku (www.heroku.com) or Netflix (www.netflix.com), run over AWS. Among others, AWS provides ser-
vices for computation, storage, databases, clusters or
content delivery. Moreover, users can choose options
like datacenter location, resource reservation or man-
aged support.
In this paper, we have focused on four of the most
well-known Amazon services: EC2 (Elastic Compute
Cloud), EBS (Elastic Block Storage) which makes
use of EC2 instances, S3 (Simple Storage Service)
for common storage, and RDS (Relational Database
Service). EC2 provides resizable computing capac-
ity, available at different instance sizes. Instances can
be reserved, or run on demand, and several versions
of Windows and Linux are available as OS. Instance
size can vary in granularity from small to extra large.
Additionally, there are instances for special needs,
which have boosted CPU, RAM, or IO performance,
and even clusters of instances. Options like detailed
monitoring of instances, data center location or load
balancing are also available. EBS provides storage
linked to the EC2 instances. Furthermore, the user can
also configure the storage as on-demand or with provisioned IOPS (Input/Output Operations per Second). For
simple storage, we can configure S3, which presents
two types of storage: standard (more durable) or re-
duced redundancy (less durable than standard). Fi-
nally, RDS provides different DB instances, in a sim-
ilar way to EC2 providing compute/CPU instances.
We can choose between different DB engines, like
Oracle, SQLServer or MySQL; instance sizes, from
a small standard instance to a high memory quadru-
ple extra large instance; deployment type, between
single or multi-availability zone deployment; config-
urable storage and reserved instances.
The two main concerns a user has when relying
on IaaS in general and AWS in particular are cost and
risk of failure. Cost is easier to measure and control.
All the configurable options have an impact on the
cost. That impact could be looked up at the AWS
CLOSER2013-3rdInternationalConferenceonCloudComputingandServicesScience
418
Figure 2: AWS Extended Feature Model: EC2, EBS and S3.
website (http://aws.amazon.com/pricing/). However, risk is harder to measure and
control. AWS provides a Service Level Agreement
for some of its services, which guarantees a monthly
uptime percentage and penalties in case these uptime
guarantees cannot be met. For detailed information
about availability, AWS provides a website (http://status.aws.amazon.com/) with the current and historical status of its services.
3 MODELLING AMAZON WEB
SERVICES
In this work, we have modelled Amazon EC2, EBS,
S3 and RDS as an EFM, as Figure 2 and Figure 3
show. A set of configurable aspects of the previous
services is depicted in Figure 4. Functional aspects,
like OS or instance type, have been modelled as fea-
tures, while non-functional aspects, such as cost or
usage, have been modelled as attributes. Both fea-
tures and attributes are subject to constraints that model the real behaviour of AWS. However, only some of these constraints are represented in the figures, to keep them readable. A working version of the AWS EFM
is presented, in FaMa notation format, in Section 4.1.
The EC2 EFM is shown in Figure 2. Two main
sets of features, instance type and OS, represent most of the variability in EC2. The OS group is composed of
Windows and Linux variants, while instance types
contain features such as standard, high mem, high
CPU, high IO and cluster instances. Both instance
type and OS feature groups are modelled as alterna-
tive features, i.e. only one feature within this group
5
http://aws.amazon.com/pricing/
6
http://status.aws.amazon.com/
must be selected. Detailed monitoring is also present
as an optional feature. The EC2 feature has four attributes:
usage, ram, cu, and data out. Ram represents the memory in GB, usage the percentage of use per month, cu the Amazon Compute Units (one EC2 Compute Unit provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron), and data out
the amount of data transferred to outside AWS. Al-
though all of them are configurable, the value of cost,
ram and cu depends on the instance type and the OS.
Cost also depends on the detailed monitoring, and on
the usage attribute. These dependencies between at-
tributes and features are modelled using constraints.
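To illustrate the shape of these constraints, the fragment below fixes cu, ram and an hourly price from the selected instance type and OS, and derives cost from the usage attribute, mirroring the rule "if a given instance type and OS are selected, then the dependent attribute values follow". The numeric constants are placeholders for illustration only, not actual AWS prices, and the class and method names are ours.

    // Sketch of feature-attribute dependencies in the EC2 EFM.
    // All numeric constants are illustrative placeholders, NOT real AWS prices.
    public class Ec2AttributeRules {

        enum InstanceType { STANDARD_SMALL, HIGH_CPU_MEDIUM }
        enum Os { LINUX, WINDOWS }

        // "instance type + OS determine the hourly price" (one constraint per price-table row)
        static double hourlyPrice(InstanceType t, Os os) {
            switch (t) {
                case STANDARD_SMALL:  return os == Os.LINUX ? 0.06 : 0.09;   // placeholder values
                case HIGH_CPU_MEDIUM: return os == Os.LINUX ? 0.15 : 0.28;   // placeholder values
                default: throw new IllegalArgumentException(t.toString());
            }
        }

        // "instance type determines cu and ram"
        static int cu(InstanceType t)    { return t == InstanceType.STANDARD_SMALL ? 1 : 5; }
        static int ramGb(InstanceType t) { return t == InstanceType.STANDARD_SMALL ? 2 : 4; }

        // "cost depends on the selected features and on the usage attribute" (720 hours per month)
        static double monthlyCost(InstanceType t, Os os, double usage) {
            return hourlyPrice(t, os) * 720 * usage;
        }

        public static void main(String[] args) {
            System.out.printf("%.2f $/month%n", monthlyCost(InstanceType.HIGH_CPU_MEDIUM, Os.LINUX, 0.30));
        }
    }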
EBS and S3 services are also represented in Fig-
ure 2. EBS, a child feature of EC2 in the model, is the
storage linked to EC2 instances. Its features represent
the two types of storage, standard or provisioned. The
associated attributes include cost (in dollars), storage size (in GB) and IOPS (Input/Output Operations Per Second). S3 is a child of the root feature, at the same level as EC2. For this service, the two storage alterna-
tives, standard storage and reduced redundancy stor-
age, are represented. Cost of the S3 service, modelled
as an attribute, depends on the configured storage size,
also an attribute, of each of the alternatives.
The last group of features of Figure 2, named Spe-
cialReqs, shows a set of options to satisfy specific
user needs. They do not correspond directly with any
AWS service instance, but their selection/de-selection
implies addition/removal of other features. Prefer-
ences about IO performance, 64/32 bit OS, Solid State
Disk (SSD) storage or GPU capabilities are provided
within this group.
Figure 3: AWS Extended Feature Model: RDS.
RDS (shown in Figure 3) contains four main configurable parameters: DB engine, instance class, de-
ployment type and storage capacity. Three different
engines (MySQL, Oracle and SQLServer) are available with different licenses, and even a Bring Your
Own License (BYOL) capability has been included.
Instances have been split into two groups: standard
class (small, large and extra large) and high memory
class (extra large, double extra large and quadruple
extra large). Two alternative deployments are avail-
able: a standard deployment, and a multi-availability zone deployment, which keeps the service available during planned or unplanned outages. The storage has the same op-
tions as EBS, standard or provisioned IOPS. For the
non-functional aspects of RDS, we have defined cost,
usage, and data out attributes for the whole RDS ser-
vice, and size and IO operation attributes for the RDS
storage.
We show in Figure 4 the extra-functional aspects
related to several of the Amazon services. Both EC2
and RDS instances can be reserved for periods of one or three years. This implies a fixed cost per year, but a reduced cost per hour of usage. Such a
fixed price option may be suitable for users who have
a heavy computational requirement over a longer time
frame. The location of the data center is another as-
pect that increases or decreases the price of the ser-
vices, and which we use to measure the risk associ-
ated with deployment.
Both cost and risk have been taken into account
when defining this model. Cost is represented by cost
attributes associated with EC2 and RDS. It is calcu-
lated by means of constraints, based on the informa-
tion at the AWS website, and it depends on the se-
lected features and attribute values. Risk is modelled
by considering the location, the service type and the
AWS status website. The status website has a historical
availability status (and therefore the associated down-
time) for the last month for each service and location.
We use this data to infer the values of the EC2, RDS
and S3 issue rate attribute.
Attributes and abstract features (features which do
not correspond directly to any AWS instance) are the
main reasons for making our model suitable for user
configuration. Using abstract features and attributes,
a user can express general preferences without spec-
ifying concrete AWS service instances. For exam-
ple, a user who needs an EC2 instance with high IO
performance can express it by just selecting EC2 and
HighIoPerformance features. He can also express his
RAM and CPU requirements, by identifying suitable
constraints associated with the ram and cu attributes.
In the analysis phase that follows such a requirement
description, we can search and identify the most suit-
able EC2 instance for the user.
3.1 Configuring the AWS Feature
Model
Usually, users need to configure more than one in-
stance of some of the previously presented services,
for example, different EC2 (e.g. standard or cluster) or
RDS instances. Often, just one variant is not enough.
We allow a user to configure several instances, as
many as they need, of the AWS EFM. However, this
implies the total cost is now the sum of the cost asso-
ciated with every instance. We define a new attribute
named global cost, with a global scope, to store
and configure this value.
Figure 4: AWS Extended Feature Model: Reserved instances and location.
To identify a configuration that meets the user requirements, it is necessary to select features, identify constraints associated with attribute values and
define an optimisation criterion. Each configuration
instance is composed of a set of feature selections/de-
selections, and constraints associated with attribute
values. If a certain feature is needed, it should be
marked as selected. If it is forbidden, it should be
marked as removed. However, if a user does not care
about a feature, it can be left unmodified in the model.
Constraints are expressed using relational operators
(<, ≤, >, ≥, =, and ≠) to exclude or force the selec-
tion of certain values. For example, suppose our requirement is to find a Linux cluster with at least 15 cu and a cost below 400 dollars, maximising the usage.
We have to mark as selected the cluster feature, in the
group of instance type, and the linux feature, in the
group of OS type. We must also constrain the cost
to be less than 400 dollars, and set the EC2 usage at-
tribute as the maximisation criterion. If we also want
another EC2 instance (in addition and different to the
linux cluster), we have to configure another instance.
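Expressed programmatically, the partial configuration of this example could look like the sketch below; the record and attribute names are our own illustration and do not correspond to the FaMa programmatic interface.

    import java.util.List;

    // Sketch: the "Linux cluster, at least 15 cu, cost under 400 $, maximise usage" request as data.
    // Names are illustrative only; they are not the FaMa API.
    public class AwsConfigurationRequest {

        record AttributeConstraint(String attribute, String op, int value) {}

        record PartialConfiguration(List<String> selected, List<String> removed,
                                    List<AttributeConstraint> constraints) {}

        public static void main(String[] args) {
            PartialConfiguration linuxCluster = new PartialConfiguration(
                    List.of("EC2", "Cluster", "Linux"),                        // selected features
                    List.of(),                                                  // no forbidden features
                    List.of(new AttributeConstraint("EC2.cu", ">=", 15),
                            new AttributeConstraint("EC2.cost", "<", 400)));

            String objective = "maximise EC2.usage";   // shared optimisation criterion
            // A second, different EC2 instance would simply be another PartialConfiguration,
            // with global_cost aggregating the cost attributes of all configured instances.
            System.out.println(linuxCluster + " | " + objective);
        }
    }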
In addition to the global cost attribute, all in-
stances are linked by the optimisation criteria. Hence,
the optimisation criterion is what determines which
configuration is the most suitable for a user. This
optimisation maximises or minimises the value of a
certain attribute, which could be an instance specific
attribute, like the usage or the cu, or the global cost.
4 ANALYSIS METHODOLOGY
A key objective of this work is to identify the most
suitable AWS configuration that meets a set of user
requirements. When undertaken manually, the user
has to navigate through a very large search space, a
task that becomes untenable as the complexity of the
search space increases. In addition, it is also useful
to identify (as a boolean outcome) whether a given
provider (in this case AWS) is able to support a re-
quired capability. For example, a user requires a Red
Hat Cluster instance, but AWS does not support these at all. Subsequently, it is also necessary to
evaluate functional (features) and non-functional (at-
tributes and optimisation) information, and look for
the variant which best suits user requirements. To overcome these challenges, we propose the use of AAFM
techniques.
(Benavides et al., 2010) define AAFM as the pro-
cess of "extracting information from feature models using automated mechanisms". As input, a FM, an
analysis operation (e.g. filter, list, etc) and optionally
one or more configurations, are provided. Each anal-
ysis operation leads to different types of information
being retrieved from the model. Some examples are
the set of products that a FM represents, checking if a
configuration over the FM is valid, or checking if the
FM contains errors. Subsequently, depending on the
input parameters, the FM is translated into a specific
logical paradigm, like Propositional Logic, Satisfia-
bility problem (SAT) or Constraint Satisfaction Prob-
lem (CSP), and mapped to a solver. Finally, the result
is mapped again to the FM domain to present it in
a comprehensible way. The choice of the solver de-
pends on the type of analysis required over the FM–
for instance a SAT solver would be used to undertake
boolean satisfiability checking, etc.
The AAFM includes over 17 analysis operations, so our first step is to check whether some of these operations are suitable for our analysis require-
ments (which include configuration validation and op-
timisation). Fortunately, Valid partial configuration
and Optimisation analysis operations provide us with the
analysis we need. Valid partial configuration takes a
FM and a partial configuration as input and returns
a value indicating whether the partial configuration
(identified by a user) meets the FM. When the out-
come is negative, we could also identify to the user
what changes need to be made to make the configura-
tion valid, as suggested by (White et al., 2010). The
optimisation operation takes a FM and an objective
function (involving the maximisation or minimisation
of an attribute value such as memory, number of
MigratingtotheCloud-ASoftwareProductLinebasedAnalysis
421
CPUs, etc) as inputs and returns the configuration ful-
filling the criteria identified by the objective function.
Figure 5 shows our proposed approach, based on the
use of AAFM, and with the support of the aforemen-
tioned operations. The inputs are the AWS model,
the user preferences (as partial configurations) and the
optimisation criteria. Using the AAFM operations,
first we ensure the user preferences are valid. If they
are, we use the optimisation operation to obtain the
most suitable configurations. Finally, the model con-
figurations are translated to AWS configurations and
returned to the user.
The standard optimisation process in AAFM takes
as inputs a FM and an optimisation criterion, and option-
ally a partial variant too – and generally returns a sin-
gle outcome. However, a user configuring AWS may
need more than one instance – for example if he needs
two EC2 or RDS instances. It is therefore necessary
to enable the user to configure as many instances as
needed as part of the optimisation operation. These
instances are related by attribute constraints and op-
timisation criteria. We thus have an optimisation problem where the user provides several related partial instances that must be resolved at the same time, in the same search space.
Due to the relatively large size of the search space
(involving 1758 possible configurations in AWS), the
performance of the optimisation operation is a key is-
sue. We propose a pre-processing step to obtain the
subset of variants which are able to support user re-
quirements in terms of features. Those variants which
cannot fulfill user needs are removed. For this task,
we use the filter AAFM operation, which takes as in-
put a FM and a partial variant, and returns the set of variants derivable from the model that include the input partial variant. This subset is then used within the
optimisation operation.
We propose using Boolean Satisfiability Prob-
lems (SAT) techniques for the pre-processing phase
and Constraint Satisfaction Optimization Problems
(CSOP) techniques for the optimisation; both paradigms have been extensively used in AAFM. SAT provides a better per-
formance than CSP, but is limited to the use of
boolean variables. For the optimisation, we use
CSOP, a variant of the classic Constraint Satisfaction
Problem (CSP) with optimisation capabilities. A CSP
is a three-tuple composed of a finite set of variables, domains (one for each variable) and constraints. CSOP has been chosen because it offers a high degree of expressivity regarding variable domains and operators, and broad tool support. Additional discussion and map-
ping over SAT and CSP can be found in (Benavides
et al., 2005) and (Benavides et al., 2010).
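As a flavour of the CSOP behind the optimisation step, the self-contained sketch below solves a toy version of the running example by enumerating a three-row domain; the variant rows and prices are placeholders, and in the prototype this search is performed by the Choco CSP solver over the full AWS model rather than by hand-written enumeration.

    import java.util.List;

    // Toy CSOP: one variable (the chosen EC2 variant), a small domain, two user constraints
    // and an objective. Rows and prices are placeholders, not real AWS data.
    public class TinyEc2Csop {

        record Variant(String name, String os, int cu, double hourlyPrice) {}

        public static void main(String[] args) {
            List<Variant> domain = List.of(
                    new Variant("StandardSmall", "Linux", 1, 0.06),
                    new Variant("HighCpuMedium", "Linux", 5, 0.15),
                    new Variant("Cluster4XL",    "Linux", 33, 1.30));

            int minCu = 15;            // user constraint: cu >= 15
            double maxCost = 400;      // user constraint: monthly cost < 400 $
            Variant best = null;
            double bestUsage = -1;     // objective: maximise the usage attribute

            for (Variant v : domain) {
                if (v.cu() < minCu) continue;                                    // filter step (feature constraints)
                double usage = Math.min(1.0, maxCost / (v.hourlyPrice() * 720)); // highest usage the budget allows
                if (usage > bestUsage) { bestUsage = usage; best = v; }          // optimisation step
            }
            System.out.println(best == null ? "no variant satisfies the constraints"
                    : best.name() + " at " + Math.round(bestUsage * 100) + "% usage");
        }
    }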
4.1 Prototype
We have developed a prototype of the proposed pro-
cess, supporting the Valid Partial Configuration and
Optimization operations. To develop this prototype,
we have extended FaMa (www.isa.us.es/fama) (Trinidad et al., 2008), a
Java-based tool for AAFM. FaMa includes several
plugins around a central core. The tool supports dif-
ferent variability metamodels (FMs, EFMs, Orthogo-
nal Variability Models, etc) and also reasoners, which
implement most of the defined AAFM analysis oper-
ations.
Valid Partial Configuration for EFMs is supported
by default in FaMa; we have therefore developed the
optimization operation (without pre-processing) for
multiple variants. This analysis operation receives the
AmazonEFM, a set of partial variants, an optimisa-
tion criterion, and optionally an execution time limit as
inputs, and returns the set of optimal AWS variants
found. Choco (http://www.emn.fr/z-info/choco-solver/), one of the FaMa reasoners, is the
CSP solver we have used for the implementation. It
provides a set of heuristics, which are useful to cus-
tomise the search in complex problems like this.
The AmazonEFM is provided in FaMa text format (https://dl.dropbox.com/u/1019151/AWS.afm). Syntax and details about the format are available in the FaMa user guide (http://famats.googlecode.com/files/FaMa%20Manual.pdf). We have simplified the model by removing the content of Figure 4, that is, location and reserved instances. At-
tributes have been modelled as integer variables, and
cost variations (depending on features and attributes
selected) have been modelled as constraints. User
configuration, optimisation criteria and time limit are
specified using the FaMa programmatic interface.
5 CASE STUDY
In order to validate our configuration and analysis ap-
proach, we consider a case study involving the mi-
gration of computational resources at a University re-
search group interested in: (i) expanding their exist-
ing computational capability; (ii) using additional re-
sources to support a demand peak; (iii) understanding
the cost implications of using a Cloud environment
for hosting their computational infrastructure (with
reference to upgrading in-house resources). Fully un-
derstanding cost implications for (iii) is often a
difficult task in practice, as the cost associated with ad-
ministrative staff has to be considered over a partic-
Figure 5: Automated analysis support for AWS configuration.
Table 1: Computational infrastructure of the research group.
Name    | Processor          | RAM   | Disk   | OS      | DB    | Usage
Labs    | 2.8 GHz (1 core)   | 2 GB  | 20 GB  | Red Hat | MySQL | on demand
Clinker | 2.8 GHz (1 core)   | 4 GB  | 40 GB  | Red Hat | MySQL | on demand
Cluster | 2.33 GHz (8 cores) | 10 GB | 500 GB | Linux   | MySQL | on demand
ular time frame. This scenario has been chosen as it
is representative of a small to medium scale organisa-
tion, often operating under a tight budget, and repre-
sents on-demand computing requirements. Our inten-
tion is to check if our approach is useful in this context
and if AWS is a realistic choice for these cases.
The computing infrastructure is shown in Table 1.
Although we have four machines, we want to migrate
just two of them, the non-critical ones. These com-
puters (labs and clinker) are used for pre-production
tasks, so their usage is on-demand. Furthermore, the
group has requirements for using a cluster to execute
large experiments. Since the budget is limited, our
objective is to find the minimal cost solution that sat-
isfies these requirements.
Table 2 shows the configuration of the machines
over the AWS FM. Labs and Clinker both run the RedHat OS, while the cluster would require a
Linux distribution. To configure the computing ca-
pacity, a conversion to Amazon Compute Units is re-
quired. Dividing each desired CPU capacity by the
standard cu, we obtain the values in the table. RAM,
storage, and DB engine properties are translated di-
rectly. To estimate usage we calculate the hours per
month we need to make use of these resources (as fol-
lows): 40 staff working hours per week, divided by
168 hours per week, is approx. 24%. We use a value
of 30% to cover extraordinary events. For the cluster,
we estimate that the demand for experiments could
be about 10%. Finally, we have set IO operations per
month to 100 million.
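The estimates above reduce to a few lines of arithmetic, reproduced below; the ~1.1 GHz per Compute Unit baseline is our reading of the 1.0-1.2 GHz definition and therefore an assumption, not an AWS-published conversion factor.

    // Back-of-the-envelope estimates behind Table 2 (usage and cu conversion).
    public class CaseStudyEstimates {
        public static void main(String[] args) {
            // Usage: 40 staff working hours per week out of 168 hours in a week, ~24%,
            // rounded up to 30% to cover extraordinary events.
            double baseUsage = 40.0 / 168.0;
            System.out.printf("Base usage: %.0f%%%n", baseUsage * 100);

            // Compute units: desired GHz divided by an assumed ~1.1 GHz per EC2 Compute Unit.
            double ghzPerCu = 1.1;   // assumption derived from the 1.0-1.2 GHz 2007 Opteron definition
            System.out.printf("Labs/Clinker: %.1f cu (Table 2 uses 3 cu)%n", 2.8 / ghzPerCu);
            System.out.printf("Cluster: %.1f cu (Table 2 uses 16 cu)%n", 2.33 * 8 / ghzPerCu);
        }
    }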
Analysis results are shown in Table 3. The op-
timal cost per month, obtained after 308 seconds, is
118 + 152 + 148 = 418 $. Additional decisions also
need to be made, such as the selection of a High CPU
medium instance for labs, instead of a standard large
one. Although we require just 16 cu, we can only use
(as a minimum) an instance with 34 cu. A similar case
holds for storage capacity for EC2 instances.
Currently, the configuration process has to be
manually specified by a user; we do not provide a
user interface for interacting with the prototype. It is
useful to note that identifying the values for data out,
usage or IOPS needs prior evaluation by a user. Deter-
mining what the values for these should be is often a
non-trivial process.
5.1 Preliminary Performance Results
To evaluate the feasibility of our prototype for more
realistic (larger scale) scenarios, we calculate the
overall performance associated with undertaking the
MigratingtotheCloud-ASoftwareProductLinebasedAnalysis
423
Table 2: Machines configuration over AmazonEFM.
Name    | OS     | Instance.cu | Instance.ram | Instance.usage | Storage.size | Storage.IOPS
Labs    | RedHat | 3 cu        | 2 GB         | 30%            | 20 GB        | 200M
Clinker | RedHat | 3 cu        | 4 GB         | 30%            | 40 GB        | 200M
Cluster | Linux  | 16 cu       | 10 GB        | 10%            | 500 GB       | 300M
Table 3: Case study analysis results.
Name    | EC2 Instance  | EC2 OS  | EC2.cu | EC2.ram | EC2.storage | RDS instance | Cost ($)
Labs    | High CPU - M  | Red Hat | 5 cu   | 2 GB    | 350 GB      | Small        | 118
Clinker | Standard - L  | Red Hat | 5 cu   | 8 GB    | 850 GB      | Small        | 152
Cluster | Cluster - 4XL | Linux   | 34 cu  | 23 GB   | 1690 GB     | Small        | 148
analysis on the FM. Our experiment involves running
optimisation operations from one to four configura-
tions and measuring the associated time for finding
the optimal configuration(s). Experiments have been
executed on a Core 2 Duo 2.00 Ghz laptop, with 2GB
Ram and Windows 7 Business Edition OS.
Table 4 shows the preliminary performance re-
sults. As we can see, the execution time growth is
exponential for the selected optimisation operation.
For one configuration, the time is under 6000 ms, rising to around 32000 ms for two configurations and 370000 ms for three. However, identifying Valid
Partial Configuration shows a different, linear time
growth. Therefore, efforts must be focused on im-
proving the Optimisation performance.
6 RELATED WORK
Configuring and analysing Cloud platforms/providers
is continuing to receive significant attention from both
the research and business community. For instance,
CloudHarmony (http://cloudharmony.com/) is a startup which aims to obtain metrics about cloud provider performance, and provides a comparison framework for many service providers. PlanForCloud (http://www.planforcloud.com/) is another startup, fo-
cused on configuring and simulating cost of several
cloud platforms, like Amazon, Azure or Rackspace.
They provide interesting options, like creating elastic
demand patterns, and filtering by options like OS or
computing needs. Increasingly, there has also been a
recognition that Cloud performance can vary signif-
icantly over time (Iosup et al., 2011) based on the
workload currently running on a particular provider.
Several research efforts focus on optimising de-
ployment over a cloud infrastructure. (Clark et al.,
2012) propose an Intelligent Cloud Resource Alloca-
tion Service to evaluate the most suitable configura-
tion given consumer preferences. (Tsai et al., 2012)
considers a similar approach, choosing between dif-
ferent Cloud providers using data mining and trend
analysis techniques, and looking for minimising cost.
In related research, (Sundareswaran et al., 2012)
propose indexing and ranking Cloud providers using
a set of algorithms based on user preferences. (Ven-
ticinque et al., 2011) describe an approach to collect
Cloud resources from different providers that contin-
uously meet requirements of user applications. The
related work of (Borgetto et al., 2012) is oriented to
software reallocation in different Virtual Machines in
order to decrease energy consumption. Several on-
tologies have also been proposed in recent years for cloud service discovery and selection (Androcec
et al., 2012).
Applying SPL techniques to Cloud services has
also been considered; for instance, (Quinten et al.,
2012) propose using FMs to configure IaaS and PaaS,
and also use the AAFM to extract information from
the model. In a more specific work, (Schroeter et al.,
2012) use extended FMs to configure IaaS, PaaS and
SaaS, and also present a process to manage the con-
figuration of several stakeholders at the same time.
(Dougherty et al., 2012) also uses FMs to model IaaS,
but the goal in this case is reducing energy cost and
energy consumption, towards the development of a
“Green Cloud”. In a different way, (Cavalcante et al.,
2012) proposes the extension of traditional SPL with
Cloud computing aspects. Our work differs from
these approaches in two key ways: (i) we focus on
one specific provider – to better understand the range
of capabilities that this provider offers. We have cho-
sen AWS as it is the most widely used and config-
urable provider currently on the market; (ii) we focus
on using specialist analysis approaches such as the use
of SAT and CSP solvers in order to automatically
analyse the resulting feature model.
The optimisation operation has been the focus of
several research works related to AAFM. Recently,
(Roos-Frantz et al., 2012) propose the optimisation
of a radio frequency warner system using Orthog-
onal Variability Models (OVM), an alternative to
CLOSER2013-3rdInternationalConferenceonCloudComputingandServicesScience
424
Table 4: Average execution time of operations for 1, 2, 3 and 4 configurations (times in ms unless noted).
Operation    | 1 conf | 2 confs | 3 confs | 4 confs
Optimisation | 5615   | 31799   | 363578  | 1h30+
Valid Conf.  | 135    | 260     | 267     | 302
FMs. (Guo et al., 2011) approach the optimisation for
EFMs from a more general perspective, using genetic
algorithms.
7 CONCLUSIONS AND FUTURE
WORK
In this work, we have modelled several AWS services
using FMs. Modelling the configuration space of a
large Cloud provider like AWS provides a useful and
compact representation to express user preferences.
The resulting model still contains lots of information
and required data; users therefore need to be clear about their requirements and needs. So, although the AWS FM eases the configuration process, it is still necessary to have an in-depth understanding
about the infrastructure requirements to interpret the
analysis results. However, this kind of information is
close to the users' domain, and allows them to express their problem in a way more conducive to their own understanding of it, and not in terms of a specific
provider catalogue.
We have also proposed the use of several analy-
sis operations of the AAFM, some of them tailored to
assist the user in the configuration process. With the
support of these operations, user preferences (func-
tional and non-functional) can be validated, and used
to obtain suitable AWS configurations. From the
variability management perspective, we have demon-
strated the applicability of AAFM to Cloud services.
Finally, we have presented a prototype of the anal-
ysis operations, based on the FaMa tool, and a case study. Although the prototype and its performance are at a preliminary stage, additional work is being
carried out to improve their performance. The proto-
type has revealed the need for improving performance
by using pre-processing and heuristics to execute the
optimisation operation in a reasonable time.
Our approach could be extended and gen-
eralised to providers other than AWS, like
Rackspace/OpenStack or Microsoft Azure. Most
of the AWS FM structure is reusable, enabling us
to adapt some of the attributes and cost to other
providers. We believe there is a need to create
a FM of an abstract Cloud provider which has a
minimal core set of features and attributes common
across providers, and where existing cloud services
ontologies could be considered for a better result.
Subsequently, FMs can be specialised based on the
particular provider being considered. This would
also provide a useful basis for comparing between
providers. A friendly way to configure the AWS
FM is also required, by means of a Domain Specific
Language or, at a minimum, the development of a
user interface. In this sense, an integration with a
platform like www.planforcloud.com would be really
interesting. Metaheuristic techniques, such as those of (Guo et al., 2011), could also be used as a basis to
improve the performance in real-time scenarios. A
deeper experimentation is also required. Modelling
and analysing a small-to-medium company case, and
comparing performance and optimality of different
implementation techniques are mandatory tasks for
further work.
ACKNOWLEDGEMENTS
This work has been partially supported by the Eu-
ropean Commission (FEDER) and Spanish Govern-
ment under CICYT project SETI (TIN2009-07366)
and by the Andalusian Government under ISABEL
(TIC-2533) and THEOS (TIC-5906) projects.
REFERENCES
Amazon ec2 calculator. http://calculator.s3.amazonaws.com/
calc5.html. Last accessed: November 2012.
Cloud harmony. http://cloudharmony.com/. Last accessed:
November 2012.
Androcec, D., Vrcek, N., and Seva, J. (2012). Cloud com-
puting ontologies: A systematic review. In MOPAS
2012, The Third International Conference on Mod-
els and Ontology-based Design of Protocols, Archi-
tectures and Services.
Benavides, D., Ruiz-Cortés, A., and Trinidad, P. (2005). Automated reasoning on feature models. LNCS, Advanced Information Systems Engineering: 17th International Conference, CAiSE 2005.
Benavides, D., Segura, S., and Ruiz-Cortes, A. (2010). Au-
tomated analysis of feature models 20 years later: A
literature review. Information Systems.
Borgetto, D., Maurer, M., Da-Costa, G., Pierson, J.-M.,
and Brandic, I. (2012). Energy-efficient and sla-aware
management of iaas clouds. In Proceedings of the 3rd
International Conference on Future Energy Systems:
Where Energy, Computing and Communication Meet.
Cavalcante, E., Almeida, A., and Batista, T. (2012). Ex-
ploiting software product lines to develop cloud com-
puting applications. In Software Product Line Confer-
ence.
MigratingtotheCloud-ASoftwareProductLinebasedAnalysis
425
Clark, K., Warnier, M., and Brazier, F. (2012). An intelli-
gent cloud resource allocation service: Agent-based
automated cloud resource allocation using micro-
agreement. In CLOSER 2012 - Proceedings of the
2nd International Conference on Cloud Computing
and Services Science.
Clements, P. and Northrop, L. M. (2002). Software Product
Lines: Practices and Patterns. Addison-Wesley.
Dougherty, B., White, J., and Schmidt, D. C. (2012).
Model-driven auto-scaling of green cloud computing
infrastructure. Future Generation Computer Systems.
Guo, J., White, J., Wang, G., Li, J., and Wang, Y. (2011). A
genetic algorithm for optimized feature selection with
resource constraints in software product lines. J. Syst.
Softw.
Iosup, A., Yigitbasi, N., and Epema, D. H. J. (2011). On the
performance variability of production cloud services.
In Proceedings of CCGRID.
Kang, K., Cohen, S., Hess, J., Nowak, W., and Peterson, S.
(1990). Feature-Oriented Domain Analysis (FODA)
Feasibility Study.
Khajeh-Hosseini, A., Greenwood, D., Smith, J. W., and
Sommerville, I. (2012). The cloud adoption toolkit:
supporting cloud adoption decisions in the enterprise.
Softw., Pract. Exper.
Khajeh-Hosseini, A., Sommerville, I., Bogaerts, J., and
Teregowda, P. B. (2011). Decision support tools for
cloud migration in the enterprise. In IEEE CLOUD
Conference.
Quinten, C., Duchien, L., Heymans, P., Mouton, S., and
Charlier, E. (2012). Using feature modelling and
automations to select among cloud solutions. In
2012 3rd International Workshop on Product LinE Ap-
proaches in Software Engineering, PLEASE 2012 -
Proceedings.
Rabiser, R., Grünbacher, P., and Dhungana, D. (2010). Re-
quirements for product derivation support: Results
from a systematic literature review and an expert sur-
vey. Information and Software Technology.
Roos-Frantz, F., Benavides, D., Ruiz-Cortés, A., Heuer, A.,
and Lauenroth, K. (2012). Quality-aware analysis in
product line engineering with the orthogonal variabil-
ity model. Software Quality Journal.
Schroeter, J., Mucha, P., Muth, M., Jugel, K., and Lochau,
M. (2012). Dynamic configuration management of
cloud-based applications. In Proceedings of the 16th
International Software Product Line Conference - Vol-
ume 2.
Sundareswaran, S., Squicciarini, A., and Lin, D. (2012). A
brokerage-based approach for cloud service selection.
In Cloud Computing (CLOUD), 2012 IEEE 5th Inter-
national Conference on.
Thum, T., Kastner, C., Erdweg, S., and Siegmund, N.
(2011). Abstract features in feature modeling. In
Proceedings of the 2011 15th International Software
Product Line Conference.
Trinidad, P., Benavides, D., Ruiz-Cortés, A., Segura, S., and Jimenez, A. (2008). FaMa framework. In 12th Software
Product Lines Conference (SPLC).
Tsai, W.-T., Qi, G., and Chen, Y. (2012). A cost-effective
intelligent configuration model in cloud computing.
In Distributed Computing Systems Workshops (ICD-
CSW), 2012 32nd International Conference on.
Venticinque, S., Aversa, R., Di Martino, B., and Petcu, D.
(2011). Agent based cloud provisioning and man-
agement: Design and prototypal implementation. In
CLOSER 2011 - Proceedings of the 1st International
Conference on Cloud Computing and Services Sci-
ence.
White, J., Benavides, D., Schmidt, D., Trinidad, P.,
Dougherty, B., and Ruiz-Cortes, A. (2010). Au-
tomated diagnosis of feature model configurations.
Journal of Systems and Software.
CLOSER2013-3rdInternationalConferenceonCloudComputingandServicesScience
426